Path Traversal

When it comes to the structure of web pages, they are often organized into directories. Typically, elements such as scripts, images, CSS files, and HTML files are stored in a hierarchical folder structure. This systematic arrangement allows for efficient management and easier access to different types of files that constitute a web page.

It’s common to refer to various elements within the folder structure, such as images, scripts, and other resources. However, if an attacker gains control over these references and the web application fails to validate the paths, they could potentially manipulate these references to gain access to files outside the designated web folder structure. This is known as a Path Traversal (or Directory Traversal) attack.

Imagine a web app that has an image and if we look into the html code we see that the image is retrieved by a get_image function:

  
<img src="/get_image?filename=my_image.jpg" alt="Image">

It gets a filename parameter with a image name. However, if the filename parameter is not properly validated, an attacker could manipulate it to traverse directories and access files outside of the intended directory structure appending ../.

The notation ../ is a relative path expression used to refer to the parent directory of the current directory in a file system. For example, suppose you have a file path /var/www/html/images/ and you want to move up one level to the html directory. You can use the ../ notation to do so, resulting in the path /var/www/html/../. This path simplifies to /var/www/, which represents the parent directory of the html directory.

For instance, if the attacker could submit this filename: /../../../../../../etc/passwd and get the passwd file instead of the image.

As usual, lets see some examples using labs from the Web Security Academy.

We have this web page that simulates a shopping web:

If we check the sourcecode, we see that the image is retrieved from the /image endpoint using filename=31.jpg as a parameter. We can try to do a path traversal attack.

Using BurpSuite (and validating that we don’t have a filter that hides img requests), we go to the HTTP history tab and search for Image request and send it to the repeater:

There, we modify the request and add several ../(since we don’t know how many directories we have to traverse to reach the root /, we add a a lot of them):

The response is the content of the /etc/passwd file:

Ok, but it is not always that easy. Most applications will implement defenses against this attack, but depending on what kind of validation they are doing, they can be bypassed.

Absolute path: If you can’t use ../ , try writing the absolute path without trying to “traverse”

Stripping: Sometimes, the application will delete/stripe the ../ substring. Striping it from this this string img/../../../etc/passwd will result in img/etc/passwd that may not exist. However, imagine that we provide this: ....// . In this string there is only one occurrence of ../, but if you stripe it, the result is again ../ . If the web app doesn’t notice this, we could bypass this restriction by providing something like: img/….//….//….//….//etc/passwd , and after striping, the resulting string would be: img/../../../../etc/passwd.
In this example we can see that the app is stripping the ../ string, because when including them before the picture name, the server is able to find the photo, implying that the strings where stripped.
However, if we use ….//….//….//….//etc/passwd:
Sometimes all occurrences of ../ will be deleted. You can try to use some encoding such as URL encoding (or even URL encoding the same string two times, since the web application may URL decode before searching for ../. In this example, I had to URL encode two times:
Specific start: Sometimes the web application may require the parameter to start with the specific path that it is expecting, such as /var/www/images, but this is not a protection at all, since with the path traversal we can “traverse” these directories. The only thing to bear in mind is that the payload will need to start with that specific string:
Specific end: On the other hand, if the application is retrieving a image, it may ensure that the parameter ends with .jpg, .png, or other image extensions. In such cases, we can use the a null byte (%00) to terminate the parameter with the expected extension but not affecting the path that will be executed.

The null byte (%00) can act as a string terminator in certain programming languages and systems, effectively truncating the input string.

How to avoid Path Traversal attacks

The best way to avoid path traversal attacks is to avoid using parameter that can be modified by the end user when accessing the file system. However, other strong mechanisms to prevent them are:

Strong Input Validation: By implementing strict filters and sanitization procedures, you can obstruct any attempt at injecting malicious path traversal characters.
Whitelisting: Implementing a stringent whitelist policy for permitted inputs can significantly reduce the risk.
Secure File Access Mechanisms: Establishing secure file access mechanisms is essential when preventing path traversal attacks. Implement robust access control policies, ensuring that files and directories are accessible only through designated, controlled routes.

Path Traversal

Path Traversal

Further Reading

Single-packet Attack

Race Conditions

XML External Entity