When you call suprocess.Popen, you can pass the command to execute in two ways:
- A string that corresponds to the commandline that you would run at the shell prompt
command = "yum whatprovides 'java = 1.8.0'"
- A list where each argument to the command is a separate entry:
command = ["yum", "whatprovides", "java = 1.8.0"]
When you pass a string to subprocess.Popen, the command has to either have no arguments or be passed through the shell
to be executed (via an argument to Popen, shell=True). That means that you have to be wary of any input to the command
that you don't control as those could contain characters interpreted by the shell. You end up having to quote those
characters:
command = "yum" + " whatprovides" + shlex_quote("java = 1.8.0")
shlex.quote
(pipes.quote
on python2) will quote 'java = 1.8.0'
so that it is a single argument after being parsed by the
shell. If you didn't quote it, the command line would be: "yum whatprovides java = 1.8.0"
and yum think that you gave it
four arguments and lookup what provides "java" and what provides "=" and what provides "1.8.0".
The security issue comes in if you forget to quote something that a malicious user could have created. For instance, say I do this:
files = os.listdir('/tmp/')
for filename in files:
if os.path.isfile(filename):
subprocess.Popen('cat /tmp/' + filename)
I would expect this to cat the contents of all the files in /tmp that I have permissions on. But what if a malicious
user creates a filename like this:
touch '/tmp/\;chmod\ -R\ a+rwx\ .'
Then my subprocess.Popen call will end up listing all of /tmp/ and then make everything in the directory I invoked the python script from readable, writable, and executable by anyone. Imagine if I ran this from my home directory: private things like my .ssh/ private keys would then be exposed.
Let's say you write a helper function for calling subprocess.Popen called utils.run_subprocess(). This function
takes a string for the command + arguments and then uses shlex.split()
internally to change that into a list before
passing to Popen. By doing it this way, we do not use shell=True
and therefore we do not open ourselves up to sa ;
or other shell special characters being interpreted by a shell and doing things we don't intend. We can pass exactly what we would type
on the command line to utils.run_subprocess() and it will do the right thing.
This seems straightforward when we do something like utils.run_subprocess("yum whatprovides 'java = 1.8.0'")
. However,
the drawback to this strategy is that there are other bugs (usually harder to exploit as a security vulnerability but
still sometimes possible) that are still easy to make when doing things this way. Consider this: when we type directly
into the shell, we seldom use shell variables. By contrast, when we are constructing our commands in python, we
frequently have to use variables. This difference is important because we have to remember to always quote the contents
of our variables when calling utils.run_subprocess()
. For instance, consider our listdir example again but with the
run_subprocess command:
files = os.listdir('/tmp')
for filename in files:
if os.path.isfile(filename):
utils.run_subprocess('cat /tmp/' + filename)
Since we're not using the shell, a malicious user could not easly get the program to execute a filename like ';chown -R[...]'
.
However, they could make a filename like this which might expose information we weren't intending:
touch '/tmp/empty .passwords'
If we ran this in our home directory and we had clear text passwords stored in ~/.passwords, then we'd end up running
the equivalent of this shell command:
$ cat /tmp/empty .passwords
... which would first cat the empty file and then cat the contents of the ~/.passwords file. If this information was logged to a world readable file, that could spell trouble for us.
The fix is easy to implement but also easy to forget to do:
try:
from shlex import quote as shlex_quote
except ImportError:
# Python 2
from pipes import quote as shlex_quote
files = os.listdir('/tmp')
for filename in files:
if os.path.isfile(filename):
utils.run_subprocess('cat /tmp/' + shlex_quote(filename))
This will run the equivalent of this shell command:
$ cat /tmp/'empty .passwords'
which is what we want to happen.
Because it is easy to forget the shlex.quote() step, though, I favour giving subprocess.Popen() a list to begin with rather than having the helper function parse the string. That way the programmer is doing the work of cutting their input into distinct arguments up front. They can't forget about having to properly delimit their arguments (by quoting) when using variables because the arguments must always be separated before calling subprocess.Popen:
files = os.listdir('/tmp')
for filename in files:
if os.path.isfile(filename):
subprocess.Popen(['cat', '/tmp', filename])
This is especially valid if you see that you have a lot of code which joins lists when calling subprocess or code which builds up an argument string by appending whole arguments onto a string:
command = ['yum', 'whatprovides', shlex.quote('java == 1.8.0')]
utils.run_subprocess(' '.join(command))
# [...]
command = 'yum'
if provides:
command += ' whatprovides'
for requirement in requires:
command += ' ' + shlex.quote(requirement)
utils.run_subprocess(command)
In both of these patterns, it's more natural to work with lists natively so converting to a string (and remembering to quote) before calling your helper (where it will e converted back into a list) is just extra work.