Calling subprocess.Popen with strings.

Background of security problem

When you call suprocess.Popen, you can pass the command to execute in two ways:

A string that corresponds to the commandline that you would run at the shell prompt command = "yum whatprovides 'java = 1.8.0'"
A list where each argument to the command is a separate entry: command = ["yum", "whatprovides", "java = 1.8.0"]

When you pass a string to subprocess.Popen, the command has to either have no arguments or be passed through the shell to be executed (via an argument to Popen, shell=True). That means that you have to be wary of any input to the command that you don't control as those could contain characters interpreted by the shell. You end up having to quote those characters: command = "yum" + " whatprovides" + shlex_quote("java = 1.8.0")

shlex.quote (pipes.quote on python2) will quote 'java = 1.8.0' so that it is a single argument after being parsed by the shell. If you didn't quote it, the command line would be: "yum whatprovides java = 1.8.0" and yum think that you gave it four arguments and lookup what provides "java" and what provides "=" and what provides "1.8.0".

The security issue comes in if you forget to quote something that a malicious user could have created. For instance, say I do this:

  files = os.listdir('/tmp/')
  for filename in files:
      if os.path.isfile(filename):
          subprocess.Popen('cat /tmp/' + filename)

I would expect this to cat the contents of all the files in /tmp that I have permissions on. But what if a malicious user creates a filename like this: touch '/tmp/\;chmod\ -R\ a+rwx\ .'

Then my subprocess.Popen call will end up listing all of /tmp/ and then make everything in the directory I invoked the python script from readable, writable, and executable by anyone. Imagine if I ran this from my home directory: private things like my .ssh/ private keys would then be exposed.

One partial mitigation

Let's say you write a helper function for calling subprocess.Popen called utils.run_subprocess(). This function takes a string for the command + arguments and then uses shlex.split() internally to change that into a list before passing to Popen. By doing it this way, we do not use shell=True and therefore we do not open ourselves up to sa ; or other shell special characters being interpreted by a shell and doing things we don't intend. We can pass exactly what we would type on the command line to utils.run_subprocess() and it will do the right thing.

This seems straightforward when we do something like utils.run_subprocess("yum whatprovides 'java = 1.8.0'"). However, the drawback to this strategy is that there are other bugs (usually harder to exploit as a security vulnerability but still sometimes possible) that are still easy to make when doing things this way. Consider this: when we type directly into the shell, we seldom use shell variables. By contrast, when we are constructing our commands in python, we frequently have to use variables. This difference is important because we have to remember to always quote the contents of our variables when calling utils.run_subprocess(). For instance, consider our listdir example again but with the run_subprocess command:

files = os.listdir('/tmp')
  for filename in files:
      if os.path.isfile(filename):
          utils.run_subprocess('cat /tmp/' + filename)

Since we're not using the shell, a malicious user could not easly get the program to execute a filename like ';chown -R[...]'. However, they could make a filename like this which might expose information we weren't intending:

touch '/tmp/empty .passwords'

If we ran this in our home directory and we had clear text passwords stored in ~/.passwords, then we'd end up running the equivalent of this shell command: $ cat /tmp/empty .passwords

... which would first cat the empty file and then cat the contents of the ~/.passwords file. If this information was logged to a world readable file, that could spell trouble for us.

The fix is easy to implement but also easy to forget to do:

try:
    from shlex import quote as shlex_quote
except ImportError:
    # Python 2
    from pipes import quote as shlex_quote

  files = os.listdir('/tmp')
  for filename in files:
      if os.path.isfile(filename):
          utils.run_subprocess('cat /tmp/' + shlex_quote(filename))

This will run the equivalent of this shell command: $ cat /tmp/'empty .passwords'

which is what we want to happen.

Recommendation to use subprocess.Popen with a list instead

Because it is easy to forget the shlex.quote() step, though, I favour giving subprocess.Popen() a list to begin with rather than having the helper function parse the string. That way the programmer is doing the work of cutting their input into distinct arguments up front. They can't forget about having to properly delimit their arguments (by quoting) when using variables because the arguments must always be separated before calling subprocess.Popen:

  files = os.listdir('/tmp')
  for filename in files:
      if os.path.isfile(filename):
          subprocess.Popen(['cat', '/tmp', filename])

This is especially valid if you see that you have a lot of code which joins lists when calling subprocess or code which builds up an argument string by appending whole arguments onto a string:

command = ['yum', 'whatprovides', shlex.quote('java == 1.8.0')]
utils.run_subprocess(' '.join(command))
# [...]
command = 'yum'
if provides:
    command += ' whatprovides'
    for requirement in requires:
        command += ' ' + shlex.quote(requirement)
utils.run_subprocess(command)

In both of these patterns, it's more natural to work with lists natively so converting to a string (and remembering to quote) before calling your helper (where it will e converted back into a list) is just extra work.

abadger/gist:ea234ee7a7238b343bcfd6537d6e92e7

Background of security problem

One partial mitigation

Recommendation to use subprocess.Popen with a list instead