@csabahenk
Last active January 12, 2023 20:25
Debugging detached Python

## Introduction
  • What we'd like to do: debug Python software the customary way, i.e. break into pdb, the Python Debugger, and investigate the live code.
  • Why we can't do that: because of the way we run the code, the standard input is not a tty, and Pdb assumes interaction via a terminal. (Note: the other option would be to force the code to run in a terminal. That is worth exploring, but here we take a different route.)
  • What we will do: find alternative ways of debugging and introspection that do not rely on stdin. (Note: in our case, while stdin is problematic, we can easily see stdout. If that does not hold in your case but you'd still like to apply these techniques, replace the print statements with whatever logging mechanism is available to you.)

We use Python 2.7.

## Remote Pdb

A hack to get at a networked Pdb session, useful in the case when stdin is not a tty.

Place the attached rdb.py file (stolen from here, with some adjustments) somewhere on your $PYTHONPATH. You can then do import rdb; rdb.set_trace() just like with stock pdb. It prints the port on which the debug session is spawned, like PDB listening on 6902 (if you can't see stdout, you can try to find out the port with lsof(8) & co.). Then you can just telnet localhost 6902.

Issues:

  • no readline support (you can add it externally with rlwrap)
  • no permanent session. If you set a breakpoint and press c, the connection drops and the follow-up break will spawn on stdin, not on the network

However, it allegedly supports multiple sessions (i.e., if the program hits set_trace multiple times, a new rdb server will spawn for each); I haven't tried this.

Another take on remoting Pdb is Rpdb (thanks to Prasanth Pai for the hint). I found it is not perfect either; it has similar but slightly different issues. You can give it a try.
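
To make the mechanism concrete, here is a minimal sketch of the idea behind rdb.py and Rpdb (it is not the code of either): a Pdb instance whose stdin and stdout are a socket file instead of the terminal. The helper names open_listener and pdb_over_socket are made up for illustration, and the code is meant to run on both Python 2.7 and Python 3.

```python
import pdb
import socket


def open_listener(host="127.0.0.1", port=0):
    """Bind a listening TCP socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def pdb_over_socket(listener):
    """Wait for one client (e.g. telnet) and return a Pdb that talks to it."""
    client, _addr = listener.accept()  # blocks until somebody connects
    handle = client.makefile("rw")     # file-like wrapper around the socket
    # Passing stdin/stdout makes Pdb read commands from, and print to, the socket.
    debugger = pdb.Pdb(stdin=handle, stdout=handle)
    return debugger, client


# Hypothetical usage inside the program to be debugged:
#   listener = open_listener()
#   print("PDB listening on %d" % listener.getsockname()[1])
#   debugger, client = pdb_over_socket(listener)
#   debugger.set_trace()
```

The attached rdb.py does more than this: it searches for a free port starting from a base port and restores sys.stdin and sys.stdout when the session ends.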

## Stack printing

Put the following snippet into your code:

import threading,sys,traceback
def dumpstacks(signal=None, frame=None):
    id2name = dict([(th.ident, th.name) for th in threading.enumerate()])
    code = []
    for threadId, stack in sys._current_frames().items():
        code.append("\n# Thread: %s(%d)" % (id2name.get(threadId,""), threadId))
        for filename, lineno, name, line in traceback.extract_stack(stack):
            code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
            if line:
                code.append("  %s" % (line.strip()))
    print "\n".join(code)

then you can just call dumpstacks() to get a stack trace printed to stdout.
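
If stdout is not available to you, a variant that returns the text instead of printing it may be easier to route into whatever logging you have. This is a sketch rather than the original snippet; format_stacks is a name invented here, and it is written to behave the same on Python 2.7 and Python 3.

```python
import sys
import threading
import traceback


def format_stacks():
    """Return the stack of every live thread as one string."""
    id2name = dict((th.ident, th.name) for th in threading.enumerate())
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("# Thread: %s(%d)" % (id2name.get(thread_id, ""), thread_id))
        # format_stack yields ready-made 'File "...", line N, in f' entries
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)
```

The result can then be handed to a logger, for example logging.getLogger(__name__).warning(format_stacks()).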

Additionally, if you set

import signal
signal.signal(signal.SIGUSR1, dumpstacks)

somewhere in the main code path (i.e. code that gets called on program startup), you can get a stack trace at any point by sending SIGUSR1 to your program.
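
To check that a handler is wired up without leaving Python, the process can send the signal to itself. The sketch below uses a stand-in handler (record_signal, a hypothetical name) in place of dumpstacks so that it is self-contained. It assumes a Unix-like system, where SIGUSR1 exists, and that it runs in the main thread.

```python
import os
import signal
import time

received = []


def record_signal(signum, frame):
    """Stand-in for dumpstacks: just remember that the signal arrived."""
    received.append(signum)


# Handlers can only be installed from the main thread.
signal.signal(signal.SIGUSR1, record_signal)

# Same effect as running `kill -USR1 <pid>` from a shell.
os.kill(os.getpid(), signal.SIGUSR1)
time.sleep(0.1)  # give the interpreter a moment to run the Python-level handler
```

Replacing record_signal with dumpstacks gives the setup described above.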

The most convenient way to accomplish this is to use the sitecustomize/usercustomize feature of Python, which lets you specify code that is loaded in each Python program (unless you explicitly ask not to via the -S option of the interpreter), i.e. it's always in the main code path.

Just create a sitecustomize.py file with the above content in your Python site dir (installation and version dependent, something like /usr/lib/python2.7/site-packages/). Then SIGUSR1 stack printing will always be enabled, and in code you can get a stack dump with from sitecustomize import dumpstacks; dumpstacks().
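
Since the site directory differs between installations, it can help to ask the interpreter where it looks. Here is a short sketch using the standard site module; candidate_site_dirs is a made-up helper name, and some older virtualenv setups ship a site module without these functions.

```python
import site
import sys


def candidate_site_dirs():
    """Typical directories for sitecustomize.py / usercustomize.py."""
    dirs = list(site.getsitepackages())      # system-wide site dirs
    dirs.append(site.getusersitepackages())  # per-user site dir
    return dirs


for path in candidate_site_dirs():
    print(path)

# Shows whether a sitecustomize module was actually loaded at startup.
print("sitecustomize loaded: %s" % ("sitecustomize" in sys.modules))
```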

(Courtesy of.)

## Debugging Python with Gdb

I'll provide the instructions in two flavors:

  • Fedora (tested with 19)
  • general instructions

On Fedora, support for this feature is nicely built in. In general, you have to compile a suitable Python yourself and make some additional adjustments.

  • on Fedora:
    1. # yum install yum-utils
    2. # debuginfo-install python
  • in general: Python follows the standard autotools build procedure of ./configure && make && make install. Perform the build with one change: replace the plain make invocation with make OPT="-ggdb -O0". If you are performing the build through a package/build manager, make sure the build manager does not strip the binaries (eg. on Arch Linux, if you build using the python2 PKGBUILD, add '!strip' to the options array).

Note: on RHEL/CentOS, similarly to Fedora, a debuginfo package is available.

### Debugging Python: the basics

  • on Fedora: it just works as is. You run the Python script under Gdb (either gdb --args python <script> or gdb -p <pid-of-running-script>) and you'll have access to the py-* commands like py-bt to show a Python backtrace.
  • in general:
    1. Make a note of the location of the Python source tree.

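    2. Add the following to your ~/.gdbinit (the path given to sys.path.insert should be the Tools/gdb/ directory inside the source tree noted in step 1):

      define py-load
        python import sys; sys.path.insert(0, "/Tools/gdb/"); import libpython
      end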

    3. Run your Python script under Gdb, as discussed above. When you drop to the Gdb prompt for the first time, type py-load which will load the Python support routines. (Note: I tried to have them loaded automatically from ~/.gdbinit but then they did not work properly. Most likely they presuppose that the Python debug symbols are already available. If you load them only from the prompt, by that time this condition is fulfilled.)

Note: on RHEL/CentOS it seems that the py-* commands are not integrated into the build, so you have to follow the general instructions. You can get the Python source by fetching the SRPM (cf. yumdownloader(1)), or you can get libpython.py right from the source repository (direct download url).
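
For a process that is already running, the attach-and-backtrace steps can be scripted. The following sketch only builds and runs a gdb command line from Python. It assumes gdb is on PATH, that you are allowed to attach to the process, and that py-bt is available (directly on Fedora, or via the py-load command defined above). The function names are made up here.

```python
import subprocess


def gdb_py_bt_command(pid, use_py_load=False):
    """Build a gdb invocation that attaches to pid, prints a Python backtrace and exits."""
    cmd = ["gdb", "-p", str(pid), "-batch"]
    if use_py_load:
        # py-load is the user-defined command from ~/.gdbinit (general instructions)
        cmd += ["-ex", "py-load"]
    cmd += ["-ex", "py-bt"]
    return cmd


def python_backtrace(pid, use_py_load=False):
    """Run gdb and return whatever it printed (needs gdb and ptrace permission)."""
    return subprocess.check_output(gdb_py_bt_command(pid, use_py_load),
                                   stderr=subprocess.STDOUT)
```

Keeping the command construction separate from the execution makes it easy to print the command and run it by hand first.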

### The Gdb Python routines

This is an older mechanism that predates Python scripting support in Gdb -- a collection of routines written directly in Gdb's command language to extract information from the Python VM's internal data structures. They provide the py* commands (i.e., prefixed with "py" but without a hyphen, like pystack). They are considered deprecated, but are of interest to us for two reasons:

  • Their output contains less information. That can be advantageous if we want terse output that is easy on the eye.
  • If we want to add some convenience commands of our own, they serve as good reference.

The routines are included in the Python source repo as Misc/gdbinit (direct download url). To use them, download the file and either add its contents to ~/.gdbinit or keep it separate and pull it in with

source <path-to-downloaded-file>

### Debugging Python: beef it up

At this point we have basic introspection capabilities for the Python runtime, but we still can't do things that are considered basic for a debugger, most notably breaking and stepping. That's what we want to achieve.

#### breaking

Playing around, one can see that the C function that facilitates the invocation of Python functions is called PyEval_EvalFrameEx. Looking into the Gdb Python routines, we can see how to extract the function name and file from the parameters of PyEval_EvalFrameEx. Thus we can put together the following command:

define pybr
  if $argc == 1
    break PyEval_EvalFrameEx if strcmp((char *)(*(PyStringObject*)f.f_code.co_name).ob_sval, $arg0) == 0
  end
  if $argc == 2
    break PyEval_EvalFrameEx if strcmp((char *)(*(PyStringObject*)f.f_code.co_name).ob_sval, $arg0) == 0 && \
                                strcmp((char *)(*(PyStringObject*)f.f_code.co_filename).ob_sval, $arg1) == 0
  end
end
document pybr
  Python break
end

(This is, needless to say, suggested for inclusion in ~/.gdbinit or some other Gdb command file you would source.)

So the first argument of pybr is the function to break at; the second, optional argument is the name of the file that contains the function. Note that the arguments must be passed as strings and not as identifiers, for example pybr "GET", or pybr "GET" "monkeyserver.py". Another caveat is whether to use absolute or relative filenames -- that might depend on how the program was invoked. You can discover the actual file naming convention by checking the output of py-bt or pystack.
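
The two strings that pybr compares against are the code object's co_name and co_filename, which can also be inspected from Python itself. Here is a small sketch; the function GET is only a placeholder echoing the example above.

```python
import os


def GET():
    """Placeholder function standing in for a request handler."""
    return "ok"


code = GET.__code__
print(code.co_name)        # the string to pass as pybr's first argument
print(code.co_filename)    # the string pybr's second argument is compared with

# co_filename is stored as the interpreter saw the path when loading the file,
# so it may be relative or absolute depending on how the program was started.
print(os.path.isabs(code.co_filename))
```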

#### stepping

Given that hitting a Python function means hitting PyEval_EvalFrameEx in the C runtime, I suggest the following practice for stepping in Python code:

  1. when you want to start stepping, do break PyEval_EvalFrameEx (make a note of the index of this breakpoint)
  2. just hit c (continue) to step forward
  3. if you want the program to run freely again, disable this breakpoint with dis <index-of-breakpoint> and then c.
  4. if you want to step in Python again, enable the breakpoint by en <index-of-breakpoint>.

Practically (if no other automatic breakpoint setting interferes) you can add

break PyEval_EvalFrameEx
disable 1

to your ~/.gdbinit so that the PyEval_EvalFrameEx breakpoint will be of index 1 and disabled on start; and then you can enable Python-stepping by en 1, and disable it by dis 1.
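
To get a feel for how often such a breakpoint fires, the same granularity can be observed from inside Python. sys.settrace reports a 'call' event whenever a new Python frame starts executing, which on Python 2.7 is roughly when PyEval_EvalFrameEx is entered. This is an analogy for illustration, not a substitute for the Gdb workflow, and the function names are invented.

```python
import sys

entered = []


def record_calls(frame, event, arg):
    """Global trace function: note every Python function that is entered."""
    if event == "call":
        entered.append(frame.f_code.co_name)
    return None  # no per-line tracing inside the frame


def inner():
    return 1


def outer():
    return inner() + inner()


sys.settrace(record_calls)
try:
    outer()
finally:
    sys.settrace(None)

print(entered)
```

Each recorded name is a point where a "step" in the sense above would stop, which is why stepping this way moves from function entry to function entry rather than line by line.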

# -*- coding: utf-8 -*-
"""
    celery.contrib.rdb
    ==================

    Remote debugger for Celery tasks running in multiprocessing pool workers.
    Inspired by http://snippets.dzone.com/posts/show/7248

    **Usage**

    .. code-block:: python

        from celery.contrib import rdb
        from celery import task

        @task()
        def add(x, y):
            result = x + y
            rdb.set_trace()
            return result

    **Environment Variables**

    .. envvar:: CELERY_RDB_HOST

        Hostname to bind to.  Default is '127.0.0.1', which means the socket
        will only be accessible from the local host.

    .. envvar:: CELERY_RDB_PORT

        Base port to bind to.  Default is 6899.
        The debugger will try to find an available port starting from the
        base port.  The selected port will be logged by the worker.

"""
from __future__ import absolute_import, print_function

import errno
import os
import socket
import sys

from pdb import Pdb

####

from contextlib import contextmanager


def get_errno_name(n):
    """Get errno for string, e.g. ``ENOENT``."""
    if isinstance(n, basestring):
        return getattr(errno, n)
    return n


@contextmanager
def ignore_errno(*errnos, **kwargs):
    """Context manager to ignore specific POSIX error codes.

    Takes a list of error codes to ignore, which can be either
    the name of the code, or the code integer itself::

        >>> with ignore_errno('ENOENT'):
        ...     with open('foo', 'r') as r:
        ...         return r.read()

        >>> with ignore_errno(errno.ENOENT, errno.EPERM):
        ...     pass

    :keyword types: A tuple of exceptions to ignore (when the errno matches),
                    defaults to :exc:`Exception`.
    """
    types = kwargs.get('types') or (Exception, )
    errnos = [get_errno_name(errno) for errno in errnos]
    try:
        yield
    except types as exc:
        if not hasattr(exc, 'errno'):
            raise
        if exc.errno not in errnos:
            raise

####

default_port = 6899

CELERY_RDB_HOST = os.environ.get('CELERY_RDB_HOST') or '127.0.0.1'
CELERY_RDB_PORT = int(os.environ.get('CELERY_RDB_PORT') or default_port)

#: Holds the currently active debugger.
_current = [None]

_frame = getattr(sys, '_getframe')

NO_AVAILABLE_PORT = """\
{self.ident}: Couldn't find an available port.
Please specify one using the CELERY_RDB_PORT environment variable.
"""

BANNER = """\
{self.ident}: Please telnet into {self.host} {self.port}.
Type `exit` in session to continue.
{self.ident}: Waiting for client...
"""

SESSION_STARTED = '{self.ident}: Now in session with {self.remote_addr}.'
SESSION_ENDED = '{self.ident}: Session with {self.remote_addr} ended.'


class Rdb(Pdb):
    me = 'Remote Debugger'
    _prev_outs = None
    _sock = None

    def __init__(self, host=CELERY_RDB_HOST, port=CELERY_RDB_PORT,
                 port_search_limit=100, port_skew=+0, out=sys.stdout):
        self.active = True
        self.out = out

        self._prev_handles = sys.stdin, sys.stdout

        self._sock, this_port = self.get_avail_port(
            host, port, port_search_limit, port_skew,
        )
        self._sock.setblocking(1)
        self._sock.listen(1)
        self.ident = '{0}:{1}'.format(self.me, this_port)
        self.host = host
        self.port = this_port
        self.say(BANNER.format(self=self))

        self._client, address = self._sock.accept()
        self._client.setblocking(1)
        self.remote_addr = ':'.join(str(v) for v in address)
        self.say(SESSION_STARTED.format(self=self))
        self._handle = sys.stdin = sys.stdout = self._client.makefile('rw')
        Pdb.__init__(self, completekey='tab',
                     stdin=self._handle, stdout=self._handle)

    def get_avail_port(self, host, port, search_limit=100, skew=+0):
        this_port = None
        for i in range(search_limit):
            _sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            this_port = port + skew + i
            try:
                _sock.bind((host, this_port))
            except socket.error as exc:
                if exc.errno in [errno.EADDRINUSE, errno.EINVAL]:
                    continue
                raise
            else:
                print('PDB listening on %d' % this_port)
                return _sock, this_port
        else:
            raise Exception(NO_AVAILABLE_PORT.format(self=self))

    def say(self, m):
        print(m, file=self.out)

    def _close_session(self):
        self.stdin, self.stdout = sys.stdin, sys.stdout = self._prev_handles
        self._handle.close()
        self._client.close()
        self._sock.close()
        self.active = False
        self.say(SESSION_ENDED.format(self=self))

    def do_continue(self, arg):
        self._close_session()
        self.set_continue()
        return 1
    do_c = do_cont = do_continue

    def do_quit(self, arg):
        self._close_session()
        self.set_quit()
        return 1
    do_q = do_exit = do_quit

    def set_trace(self, frame=None):
        if frame is None:
            frame = _frame().f_back
        with ignore_errno(errno.ECONNRESET):
            Pdb.set_trace(self, frame)

    def set_quit(self):
        # this raises a BdbQuit exception that we are unable to catch.
        sys.settrace(None)


def debugger():
    """Returns the current debugger instance (if any),
    or creates a new one."""
    rdb = _current[0]
    if rdb is None or not rdb.active:
        rdb = _current[0] = Rdb()
    return rdb


def set_trace(frame=None):
    """Set breakpoint at current location, or a specified frame"""
    if frame is None:
        frame = _frame().f_back
    return debugger().set_trace(frame)