Attach to a running Python process using gdb:
sudo gdb python <pid>
List the Python frames:
(gdb) py-bt
List Python local variables:
(gdb) py-locals
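When there is no interactive terminal handy, the same attach-and-backtrace step can be scripted. The sketch below is only one way to do it: the helper names `gdb_backtrace_command` and `dump_python_stack` are made up for illustration, and it assumes gdb's standard `-batch` and `-ex` options plus the Python gdb extensions being installed so that `py-bt` is available.

```python
import subprocess


def gdb_backtrace_command(pid, python_exe="python", use_sudo=True):
    """Build a gdb command line that attaches to pid, runs py-bt, and exits.

    Hypothetical helper: mirrors `sudo gdb python <pid>` followed by
    `py-bt`, but non-interactively via -batch and -ex.
    """
    command = ["gdb", "-batch", "-ex", "py-bt", python_exe, str(pid)]
    if use_sudo:
        command = ["sudo"] + command
    return command


def dump_python_stack(pid):
    """Run gdb against pid and return whatever it printed (as bytes)."""
    return subprocess.check_output(gdb_backtrace_command(pid))
```

Calling `dump_python_stack(12345)` would attach to process 12345 (a placeholder pid), print the Python backtrace, and detach again.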
My motivating example was an unknown bug in one of our systems that caused a process's CPU usage to permanently spike to 100%. Because the bug was tickled by a web request that never completed, we couldn't tell what was wrong as no log message for the request was ever written.
Later we discovered that it was a two-part failure: how we were calling pyPdf, and what pyPdf does if you give it a buffer that does not, in fact, contain a PDF. Here is a small program that demonstrates the problem (pyPdf==1.3):
from StringIO import StringIO
from pyPdf import PdfFileReader
PdfFileReader(StringIO('I\'m Not a PDF. BOOM!'))
In the talk, I first sketched out the problem, then ran this program and used the commands above to attach to it and view the execution state:
(gdb) py-bt #5 Frame 0x7f7c97173608, for file /home/dan/.envs/3a45b9a9208e0d61/local/lib/python2.7/site-packages/pyPdf/pdf.py, line 870, in readNextEndLine (self=<PdfFileReader(flattenedPages=None, resolvedObjects={}) at remote 0x7f7c9848fd10>, stream=<StringIO(softspace=0, buflist=[], pos=0, len=20, closed=False, buf="I'm Not a PDF. BOOM!") at remote 0x7f7c9717b7e8>, line='IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII...(truncated) line = x + line #8 Frame 0x201cea0, for file /home/dan/.envs/3a45b9a9208e0d61/local/lib/python2.7/site-packages/pyPdf/pdf.py, line 705, in read (self=<PdfFileReader(flattenedPages=None, resolvedObjects={}) at remote 0x7f7c9848fd10>, stream=<StringIO(softspace=0, buflist=[], pos=0, len=20, closed=False, buf="I'm Not a PDF. BOOM!") at remote 0x7f7c9717b7e8>, line='') line = self.readNextEndLine(stream) #11 Frame 0x7f7c97174210, for file /home/dan/.envs/3a45b9a9208e0d61/local/lib/python2.7/site-packages/pyPdf/pdf.py, line 374, in __init__ (self=<PdfFileReader(flattenedPages=None, resolvedObjects={}) at remote 0x7f7c9848fd10>, stream=<StringIO(softspace=0, buflist=[], pos=0, len=20, closed=False, buf="I'm Not a PDF. BOOM!") at remote 0x7f7c9717b7e8>) self.read(stream) #22 Frame 0x7f7c984ef208, for file /home/dan/source/sandbox/boom.py, line 4, in <module> () PdfFileReader(StringIO('I\'m Not a PDF. BOOM!')) (gdb)
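The backtrace hints at the mechanism: the stream position is stuck at 0 while `line = x + line` keeps prepending the first character of the buffer. The sketch below is a guess at that behaviour based only on what the backtrace shows; it is not a copy of pyPdf's code. The function name `read_line_backwards` and the `max_steps` cap are invented for the demo (the cap exists so the sketch terminates), and it uses Python 3's `io.BytesIO`, which clamps a relative seek that would go past the start of the buffer to offset 0.

```python
import io


def read_line_backwards(stream, max_steps=50):
    """Hypothetical backwards line reader that never sees a line ending.

    Reads one byte, steps back two, and prepends the byte to the line.
    Once the position reaches 0 the seek is clamped, so the first byte
    is read again on every further step.
    """
    line = b""
    stream.seek(-1, io.SEEK_END)          # start at the last byte
    for _ in range(max_steps):            # artificial cap for the demo
        x = stream.read(1)
        stream.seek(-2, io.SEEK_CUR)      # clamped to 0 at the start
        if x in (b"\n", b"\r"):
            break
        line = x + line
    return line


result = read_line_backwards(io.BytesIO(b"I'm Not a PDF. BOOM!"))
print(result)
```

Without the cap, a buffer containing no line ending would keep the loop spinning and the string growing, which matches a process pinned at 100% CPU.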
For Ubuntu systems, gdb and the Python↔gdb tools need to be installed:
sudo apt-get install gdb python2.7-dbg
Older versions of Ubuntu (< 13.10?) need a patch to the Python tools (see here for details: https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1241668):
wget -O - http://hg.python.org/cpython/raw-file/ef4636faf8bd/Tools/gdb/libpython.py | sudo tee /usr/lib/debug/usr/bin/python2.7-gdb.py
If you see messages like this when connecting to a process (note the "in ??" entries in the backtrace):
(gdb) py-bt #11 (unable to read python frame information) (gdb) bt #0 0x00007fa3df3b5ef6 in ?? () #1 0x000000000048c9a8 in PyFloat_FromString (v=<optimized out>, pend=<optimized out>) at ../Objects/floatobject.c:223 #2 0x0000000000536a69 in getc_unlocked (__fp=0x5c00e82af6) at /usr/include/x86_64-linux-gnu/bits/stdio.h:65 #3 Py_UniversalNewlineFgets (buf=0x5dd4f <error: Cannot access memory at address 0x5dd4f>, n=<optimized out>, stream=0x5c00e82af6, fobj='IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII') at ../Objects/fileobject.c:2750 #4 0x00007fa3de6dd5f0 in ?? () #5 0x0000000000e400a0 in ?? () #6 0x0000000000f38108 in ?? () #7 0x0000000000f22600 in ?? () #8 0x0000000000000002 in ?? () #9 0x0000000000f2284f in ?? () #10 0x000000000052e672 in PyErr_Occurred () at ../Python/errors.c:80 #11 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:2308 #12 0x0000000000000000 in ?? () (gdb)
it probably means that the Python executable you are using is out of sync with the system executable, so the installed debug symbols no longer match the binary that is actually running. This can happen, for example, if you have a virtualenv that was made prior to an update to the system Python.
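One rough way to test for this is to compare the interpreter inside the virtualenv with the system one byte for byte. This is a sketch under assumptions: the helper names are invented, the system interpreter path `/usr/bin/python2.7` is a guess for an Ubuntu box, and it presumes the virtualenv holds its own copy of the binary rather than a symlink to the system one.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries are not loaded all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(path_a, path_b):
    """True if both files have identical contents."""
    return file_digest(path_a) == file_digest(path_b)
```

Run from inside the virtualenv, `same_binary(sys.executable, "/usr/bin/python2.7")` returning False would suggest the copy is stale; recreating the virtualenv is the likely fix.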
Rick Copeland mentioned Pyrasite, which can do some crazy stuff with a running Python process: http://pyrasite.readthedocs.org/en/latest/
There's another class of solutions that require instrumenting your code beforehand, but that might provide a "friendlier" interface; see this SO thread for a discussion: http://stackoverflow.com/questions/132058/showing-the-stack-trace-from-a-running-python-application
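As a minimal illustration of the "instrument beforehand" approach, the sketch below installs a signal handler that prints the current Python stack when the process receives SIGUSR1. The function names and the choice of SIGUSR1 are assumptions made for this example, and it only works on Unix-like systems. Python runs signal handlers between bytecode instructions, so this can interrupt a pure-Python loop, but probably not code stuck inside a C extension.

```python
import signal
import sys
import traceback


def dump_stack(signum, frame):
    """Write the stack of the interrupted code to stderr."""
    sys.stderr.write("Signal %d received; current stack:\n" % signum)
    traceback.print_stack(frame, file=sys.stderr)


def install_stack_dumper(signum=signal.SIGUSR1):
    """Register dump_stack for signum; call once at program startup."""
    signal.signal(signum, dump_stack)


install_stack_dumper()
```

With this in place, `kill -USR1 <pid>` from another shell makes the process report where it currently is, without gdb or debug symbols.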
Collected references:
- basics, install: https://wiki.python.org/moin/DebuggingWithGdb
- Ubuntu bug & patch: https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1241668
- Command/theory details: http://fedoraproject.org/wiki/Features/EasierPythonDebugging