Description

Depending on your application, you may need to let anonymous users interact with your web services in ways that cause files to be read from or written to the filesystem. This is a textbook case where user input takes part in deciding which files are accessed.

One typical scenario is a web-server where an anonymous user performs a GET request such as:

http://server.tld/path/to/file.html

and your web-server will typically take a base-path, as the path to a document root, say:

C:\Web\docRoot

and combine the document root with the user-supplied path, in order to obtain:

C:\Web\docRoot\path\to\file.html

and be able to read the file.

Combining the local path to the document root and the user-supplied path is typically achieved with some sort of "path join" function depending on your application's API.

Nevertheless, nothing prevents the anonymous user from supplying a path such as:

http://server.tld/../../../../../../../../shadow-

such that given any document root path, for instance:

/var/www

the combined path would result in:

/var/www/../../../../../../../../shadow-

which is a semantically valid path for most common filesystems.

The result is that the path traversal allows any anonymous user to access any file that is readable by the user your web-server is running as. Furthermore, if your web-server is configured to provide (or implements) directory listings, then anonymous users could potentially probe your filesystem by listing the contents of directories that your web-server user has access to.
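
As a minimal sketch of the problem described above, the following Python snippet combines a document root with a request path by plain concatenation. The helper name naive_join is hypothetical, and posixpath.normpath is only used to show where the combined path ends up pointing; it is not part of any mitigation.

```python
import posixpath


def naive_join(doc_root: str, request_path: str) -> str:
    """Combine the document root and the request path with no validation."""
    return doc_root + request_path


# Path portion of http://server.tld/../../../../../../../../shadow-
request_path = "/../../../../../../../../shadow-"

combined = naive_join("/var/www", request_path)

# Collapse the ".." references lexically to see what the path refers to.
collapsed = posixpath.normpath(combined)

print(combined)
print(collapsed)
```

The collapsed path no longer starts with the document root, which is exactly the condition the mitigation has to detect.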

There are "worse" scenarios: for instance, when a path is combined with a username in order to obtain the path to a log file. In that case, users could write to arbitrary files.

Standard Mitigation

Given a document root, your application only needs to read from or write to files that are under the document root, so access to any file outside the document root does not make any sense.

As a basic example, consider the document root to be:

/var/www/website.tld

and that GET requests of the form:

http://website.tld/css/style.css
http://website.tld/html/errors/404.html

can be performed by anonymous users, and that the document root path has to be combined with the requested paths in order to obtain the filesystem paths:

/var/www/website.tld/css/style.css
/var/www/website.tld/html/errors/404.html

such that files can be read or written.

To prevent path injections of the form:

/var/www/website.tld/../../../../../etc/shadow-

the following standard mitigation steps have to be performed before attempting to read from the resulting combined path:

  1. Combine the document root path with the user-requested path (in most cases, this step can be performed through plain concatenation). Following the example, this results in the path /var/www/website.tld/../../../../../etc/shadow-.
  2. Take the resulting combined path, following the example /var/www/website.tld/../../../../../etc/shadow-, and use a real path resolver to determine the resolved path. The keyword to look for in the API is usually realpath: typically a function or method that takes a path such as /var/www/website.tld/../../../../../etc/shadow- as parameter and resolves it, in this case to /etc/shadow-, by collapsing the parent directory references (..) and following symbolic links.
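
The two steps above can be sketched in Python, where the real path resolver is os.path.realpath. This is only an illustration under some assumptions: the helper name resolve_request is hypothetical, and the directory tree is created inside a temporary sandbox directory, so the request climbs three levels (instead of five) to stay inside that sandbox.

```python
import os
import shutil
import tempfile


def resolve_request(doc_root: str, request_path: str) -> str:
    """Step 1: concatenate; step 2: resolve '..' and symbolic links."""
    combined = doc_root + request_path
    return os.path.realpath(combined)


# Throwaway tree standing in for the real filesystem (assumption).
sandbox = os.path.realpath(tempfile.mkdtemp())
doc_root = os.path.join(sandbox, "var", "www", "website.tld")
os.makedirs(os.path.join(doc_root, "css"))

# Climbs out of website.tld, www and var, then descends into etc.
resolved = resolve_request(doc_root, "/../../../etc/shadow-")

print(resolved)

# Remove the throwaway tree again.
shutil.rmtree(sandbox)
```

The resolved path points at etc/shadow- directly under the sandbox, outside the document root, even though the combined string started with the document root.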

You will now have the "real path" to the requested file:

/etc/shadow-

and you will also know your document root:

/var/www/website.tld

The next step in determining whether the requested file /etc/shadow- is a child of your document root would be to:

  1. Split (explode) the base path into an array of path parts (for instance, on the / character - although, it is much better to search the API for a path separator compile-time constant that is operating-system agnostic), in this case, the array would contain the following elements: var, www, website.tld.
  2. Split the "real path" to the requested file into an array of path parts, in this case the array would contain the elements etc, shadow-.

Finally, you will have to loop sequentially over the array of document root path parts and check that each element is equal to the element at the same index in the "real path" to the requested file. The loop runs until all the elements in the document root array have been exhausted. If, at any point during the loop, a path part from the document root array is not equal to the path part of the "real path" at the same index, or the "real path" array runs out of elements first, then the file is not a child of the document root and you should deny access.
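
The splitting and the comparison loop can be sketched as follows. The function names split_parts and is_child_of_root are hypothetical, and the explicit length check is an added assumption that guards against a "real path" shorter than the document root. The paths are the ones used in this section and are assumed to have been resolved already.

```python
import os
from typing import List


def split_parts(path: str, sep: str = os.sep) -> List[str]:
    """Split a path on the separator, dropping empty parts."""
    return [part for part in path.split(sep) if part]


def is_child_of_root(root_parts: List[str], real_parts: List[str]) -> bool:
    """Check that real_parts begins with every element of root_parts, in order."""
    # A path with fewer parts than the root cannot lie under the root.
    if len(real_parts) < len(root_parts):
        return False
    for index, root_part in enumerate(root_parts):
        if real_parts[index] != root_part:
            # First mismatch: the file is outside the document root.
            return False
    return True


root = split_parts("/var/www/website.tld", "/")
inside = split_parts("/var/www/website.tld/img/icon.png", "/")
outside = split_parts("/etc/shadow-", "/")

print(is_child_of_root(root, inside))
print(is_child_of_root(root, outside))
```

For this check to be meaningful, the document root itself should also be a resolved path, so that both arrays describe locations in the same form.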

Figurative Examples

The following examples reason only about the document root path and the "real" requested path, without any code.

Example

Consider the following document root:

/var/www/website.tld

and the following GET request performed by an anonymous user:

http://website.tld/css/../img/icon.png

Splitting the document root path into an array, you will obtain:

[ 'var', 'www', 'website.tld' ]

Combining the document root path with the requested path and retrieving the "real path", you will obtain the requested path:

[ 'var', 'www', 'website.tld', 'img', 'icon.png' ]

You now compare the document root path with the requested path, element-by-element until no more elements remain in the document root path:

  1. var is the same as var
  2. www is the same as www
  3. website.tld is the same as website.tld

Since all the elements of the document root path array have been exhausted, you can conclude that the file lies within your document root.

Counter-Example

Consider the following document root:

/var/www/website.tld

and the following GET request performed by an anonymous user:

http://website.tld/css/../img/../../../icon.png

Splitting the document root path into an array, you will obtain:

[ 'var', 'www', 'website.tld' ]

Combining the document root path with the requested path and retrieving the "real path", you will obtain the requested path:

[ 'var', 'icon.png' ]

Comparing the document root path with the "real" requested path:

  1. var is the same as var
  2. www is not the same as icon.png

You can now abort the loop: the requested file lies outside the document root, so you can deny access.

How To Mess It Up

A weaker check would be to perform a set-containment operation on the two arrays, by checking that all the elements of the document root path array:

[ 'var', 'www', 'website.tld' ]

are contained within the "real" resolved path array:

[ 'var', 'website.tld', 'www', 'icon.png' ]

Sets are by definition unordered, but in a filesystem hierarchy the order of the path parts matters. Notice that in this example all the elements of the document root path are contained within the "real" resolved path, and yet the "real" resolved path (/var/website.tld/www/icon.png) is still outside the document root.

Do not perform a set-based check: you need an ordered, element-by-element comparison between the document root path and the "real" resolved requested path!
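
To make the difference concrete, here is a small sketch that runs both checks on the arrays from this section. The function names flawed_set_check and prefix_check are hypothetical; the second one is a compact form of the element-by-element comparison, written with list slicing.

```python
from typing import List


def flawed_set_check(root_parts: List[str], real_parts: List[str]) -> bool:
    """Order-insensitive check: are all root parts present somewhere?"""
    return set(root_parts).issubset(real_parts)


def prefix_check(root_parts: List[str], real_parts: List[str]) -> bool:
    """Ordered check: does real_parts start with exactly root_parts?"""
    return real_parts[:len(root_parts)] == root_parts


root = ["var", "www", "website.tld"]
real = ["var", "website.tld", "www", "icon.png"]

# The set-based check wrongly accepts the path, the ordered check rejects it.
print(flawed_set_check(root, real))
print(prefix_check(root, real))
```

The slice comparison also rejects a "real" path that is shorter than the document root, since the slice then has fewer elements than the root array.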

security/mitigating_path_traversals_for_web_services.txt · Last modified: 2022/04/19 08:28 by 127.0.0.1

Wizardry and Steamworks

© 2025 Wizardry and Steamworks
