-
-
Save kousu/bf5610187b608d79d415b1436091ab2d to your computer and use it in GitHub Desktop.
import os.path | |
def sanitize_path(path): | |
""" | |
Sanitize a path against directory traversals | |
Based on https://stackoverflow.com/questions/13939120/sanitizing-a-file-path-in-python. | |
>>> sanitize_path('../test') | |
'test' | |
>>> sanitize_path('../../test') | |
'test' | |
>>> sanitize_path('../../abc/../test') | |
'test' | |
>>> sanitize_path('../../abc/../test/fixtures') | |
'test/fixtures' | |
>>> sanitize_path('../../abc/../.test/fixtures') | |
'.test/fixtures' | |
>>> sanitize_path('/test/foo') | |
'test/foo' | |
>>> sanitize_path('./test/bar') | |
'test/bar' | |
>>> sanitize_path('.test/baz') | |
'.test/baz' | |
>>> sanitize_path('qux') | |
'qux' | |
""" | |
# - pretending to chroot to the current directory | |
# - cancelling all redundant paths (/.. = /) | |
# - making the path relative | |
return os.path.relpath(os.path.join("/", path), "/") | |
if __name__ == '__main__': | |
import doctest | |
doctest.testmod() |
I don't know why this is so hard. The top StackOverflow answer says to use secure_filename()
but that only works on file_names_ -- it forces all '/
to _
-- which isn't my use case. When I wrote this I had in mind accessing HTML snippets or files in an upload folder, and for that I want to preserve some directory structure just not access to the whole system.
Reading the docs, one thing that resolve()
does is follow all symlinks. So my version is not safe against symlink poisoning. I assumed my apps don't make symlinks and no one else should be writing into their folder, but it's a risk. There's no way to protect against symlink poisoning without actually reading the disk.
To be safe against that I guess you need to take a different approach and check if you're in a given restricted folder. But that's tricky to get right as well:
sudo mkdir -p /var/www/app/uploads
>>> def is_safe(base, target):
... base = pathlib.Path(base)
... target = pathlib.Path(target)
...
...
... try:
... return (base/target.parent).resolve(strict=True).is_relative_to(base)
... except FileNotFoundError:
... return False # not safe, someone tried
>>> is_safe("/var/www/app/uploads", "../../../etc/passwd")
False
>>> is_safe("/var/www/app/uploads", "/etc/passwd")
False
>>> is_safe("/var/www/app/uploads", "some_nonexistent_file.txt")
True
>>> is_safe("/var/www/app/uploads", "somesubfolder/some_nonexistent_file.txt") # but it fails incorrectly for not-yet-created subfolders
False
I, thank you for the feedback here. My use case is also web apps. I did not face the issue you mention, but it is problematic indeed.
From the pathlib doc (emphasis mine) :
I suppose then one would like to write :
but the
PurePath
class does not implementresolve()
. I am also not familiar enough with the library to suggest a solution.