The Risk of Archive Extraction

Archives are often used to import data sets in web applications. In Java especially, archives such as JAR, WAR, or APK files are used to aggregate Java class files and resources into a single file. Vulnerabilities resulting from the insecure extraction of archives have been known for a long time. In 2018, Snyk disclosed multiple vulnerabilities of this kind in various software libraries under the name Zip Slip.
However, the problem still exists when developers implement their own extraction functionality. In recent months, RIPS Code Analysis found similar issues in popular Java software which led to Remote Code Execution (CVE-2019-3397, CVE-2019-12309).

The problem occurs when developers do not validate or sanitize the user input that is read from an archive. An attacker can prepare a malicious ZIP file whose entry names use the ../ notation to traverse out of the intended directory and drop a malicious executable file. The following listing shows a malicious ZIP file entry.

Malicious ZIP file

  Length      Date    Time    Name
---------  ---------- -----   ----
      133  2019-05-23 17:43   ../../../../../../../../../[WEBROOT_PATH]/zipslip.jsp
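
Entries like the one listed above can be spotted before anything is extracted. The following is a minimal sketch, not a complete defense: the class ZipEntryAudit and its methods isSuspicious and suspiciousEntries are hypothetical names chosen for illustration, and the check only looks for a leading slash and ".." path segments (it does not cover Windows drive letters, for example).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of any library.
final class ZipEntryAudit {

    // True if the entry name starts at the root or contains a ".." path segment.
    static boolean isSuspicious(String entryName) {
        String normalized = entryName.replace('\\', '/');
        if (normalized.startsWith("/")) {
            return true;
        }
        for (String segment : normalized.split("/")) {
            if (segment.equals("..")) {
                return true;
            }
        }
        return false;
    }

    // Returns the names of all suspicious entries in an in-memory ZIP archive.
    static List<String> suspiciousEntries(byte[] zipBytes) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (isSuspicious(entry.getName())) {
                    hits.add(entry.getName());
                }
            }
        }
        return hits;
    }
}
```

A scan like this is useful for logging or for rejecting an upload early, but the extraction code itself still has to validate the resolved path of every entry.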

If such a ZIP file is extracted with the following vulnerable code snippet, the JSP file zipslip.jsp is not extracted into /my/target/directory/ but dropped into the [WEBROOT_PATH] directory of the web server. The function extract iterates over all entries of the passed ZIP file. In this example, the user input is returned by ZipEntry.getName() and flows directly into the sensitive sink, the java.io.File constructor. At this point a File object is created with the parent directory /my/target/directory/ and the child path ../../../../../../../../../[WEBROOT_PATH]/zipslip.jsp, which resolves to [WEBROOT_PATH]/zipslip.jsp. The file is then written to the file system by the IOUtils.copy call.


Vulnerable Code Snippet

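import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import org.apache.commons.io.IOUtils;

public void extract(ZipFile zip) throws IOException {
    String toDir = "/my/target/directory/";
    Enumeration<? extends ZipEntry> entries = zip.entries();
    while (entries.hasMoreElements()) {
        ZipEntry zipEntry = entries.nextElement();

        // VULNERABLE: the attacker-controlled entry name is used without validation.
        File file = new File(toDir, zipEntry.getName());
        try (InputStream in = zip.getInputStream(zipEntry);
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(file.toPath()))) {
            // Writes the entry to wherever the resolved path points.
            IOUtils.copy(in, out);
        }
    }
}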
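
The path resolution described above can be reproduced in isolation, without any archive. The sketch below is only an illustration: the class TraversalDemo, its method resolve, and the entry name ../../../outside/zipslip.jsp are assumptions made for this example. The exact output depends on the platform and on symbolic links, so none is shown here.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical demo class; it only resolves paths and writes nothing to disk.
final class TraversalDemo {

    // Combines parent and child like the vulnerable snippet, then canonicalizes the result.
    static String resolve(String toDir, String entryName) throws IOException {
        return new File(toDir, entryName).getCanonicalPath();
    }

    public static void main(String[] args) throws IOException {
        String toDir = "/my/target/directory/";
        // Stays below the target directory.
        System.out.println(resolve(toDir, "docs/readme.txt"));
        // Each ".." segment removes one directory level, so this ends up outside toDir.
        System.out.println(resolve(toDir, "../../../outside/zipslip.jsp"));
    }
}
```

The second call shows why the parent argument of new File(parent, child) offers no protection: it is only a starting point for resolution, not a boundary.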

It is not only java.util.zip.ZipEntry that needs to be treated carefully: the popular Apache Commons Compress package org.apache.commons.compress.archivers also contains multiple ArchiveEntry classes which pose a security risk if handled incorrectly.

The following listing illustrates one way to fix this security issue. Before anything is written, the code checks whether the entry read from the ZIP file lies within the intended target directory: if the canonical path of the file does not start with the path of the target directory, a SecurityException is thrown. The target directory path has to end with a path separator for this prefix check to be reliable; otherwise a sibling directory such as /my/target/directory-evil would also pass. Note that this fix is only complete if the attacker does not control toDir.


Patched Code Snippet

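import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import org.apache.commons.io.IOUtils;

public void extract(ZipFile zip) throws IOException {
    // Canonical target directory with a trailing separator, so that a sibling
    // such as /my/target/directory-evil cannot pass the prefix check.
    String toDir = new File("/my/target/directory/").getCanonicalPath() + File.separator;
    Enumeration<? extends ZipEntry> entries = zip.entries();
    while (entries.hasMoreElements()) {
        ZipEntry zipEntry = entries.nextElement();

        File file = new File(toDir, zipEntry.getName());
        // Reject entries whose canonical path leaves the target directory.
        if (!file.getCanonicalPath().startsWith(toDir)) {
            throw new SecurityException("ZipEntry not within target directory!");
        }
        try (InputStream in = zip.getInputStream(zipEntry);
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(file.toPath()))) {
            IOUtils.copy(in, out);
        }
    }
}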
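
Because the same check applies to any entry type that exposes a name, including the ArchiveEntry classes mentioned earlier, it can be factored into a helper that takes only the entry name. The following is a sketch under stated assumptions rather than a definitive implementation: SafeUnzip, resolveInside, extract, and zipWith are hypothetical names, and the helper uses java.nio.file.Path.normalize(), which works purely lexically and, unlike getCanonicalPath(), does not resolve symbolic links.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class for illustration.
final class SafeUnzip {

    // Resolves entryName below targetDir, or throws if the result would escape it.
    static Path resolveInside(Path targetDir, String entryName) {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        // Path.startsWith compares whole path elements, so "directory-evil"
        // is not treated as being inside "directory".
        if (!resolved.startsWith(base)) {
            throw new SecurityException("Entry not within target directory: " + entryName);
        }
        return resolved;
    }

    // Extracts an in-memory ZIP archive, validating every entry before writing it.
    static void extract(byte[] zipBytes, Path targetDir) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path out = resolveInside(targetDir, entry.getName());
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zin, out); // fails if the file already exists
                }
            }
        }
    }

    // Test fixture: builds a ZIP archive with a single entry of the given name.
    static byte[] zipWith(String entryName, String content) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(buffer)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content.getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("unzip-demo").resolve("target");
        extract(zipWith("docs/readme.txt", "hello"), target);
        try {
            extract(zipWith("../zipslip.jsp", "payload"), target);
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the validation in one function that every extraction path has to call makes it harder to forget the check when support for another archive format is added. As with the listing above, the check is only meaningful when the attacker does not control the target directory.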

Summary

In this post, we explained the risk behind the extraction of untrusted archives. This issue often results in remote code execution, since arbitrary files can be written or overwritten with the permissions of the web server or the corresponding user. These bugs have been known for a long time, but they are still present, as the CVEs (CVE-2019-3397, CVE-2019-12309) showed. With this blog post we want to raise awareness of the security risk involved in handling untrusted archives.