@wware
Last active January 6, 2020 15:28

Yet another Python debugger hack

Here is the problem I'm trying to solve. You have some code running in AWS or in some far-away server rack. Your only access to the machine in question is through something like SSH or telnet. The code is running a bunch of jobs as part of a CI/CD system. You would like to kick off a job where the code runs with RemotePdb so that you can step through it, examine variables, set breakpoints and all that, without disturbing any other jobs running at the same time.

You can't change the code. You don't have time to get a merge request approved, and it doesn't make sense to do a merge request to facilitate what might be a very brief one-time debugging session. You need some kind of hooks in your production code that make this remote debugging stuff feasible without a fresh push to your CI/CD stack.

I had the foresight to add a "back channel" JSON parameter to the test parameters in the CI/CD system, which goes to all the pieces of code I care about. This metadata is on a per-job basis: it only affects a targeted job while all the other simultaneously running jobs are unaffected.

Given a piece of back channel JSON like this:

{"debug": "scan", "target": "/path/to/foo.py:bar"}

the goal is that when we get to the "scan" step in the CI/CD job, we should set a breakpoint in file "foo.py" at the bar function. As we enter that function, we should kick off a RemotePdb session so that I can telnet in and do a debugging session with my code.

If there is no back channel JSON, or it asks for something other than debugging, the code should run without a debugger to avoid any performance hit.
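The gating logic described above can be sketched roughly as follows. This is only an illustration, not the code from the real CI/CD system: the name `maybe_enable_debugger`, its parameters, and the idea of passing the setup function in as `set_up` are all assumptions made for the example. The `debug` and `target` keys are the ones from the JSON shown earlier.

```python
import json


def maybe_enable_debugger(back_channel, current_step, set_up):
    """Hypothetical helper: call set_up(target) only for a targeted step.

    back_channel is the raw JSON string (or None) attached to this job,
    current_step is the name of the step about to run, and set_up is a
    callable such as set_up_remote_pdb. Returns True if set_up was called.
    """
    if not back_channel:
        # No back channel at all: run normally, no tracing overhead.
        return False
    try:
        params = json.loads(back_channel)
    except (TypeError, ValueError):
        # Malformed back channel should never break the job itself.
        return False
    if not isinstance(params, dict):
        return False
    if params.get("debug") != current_step:
        # The back channel is about some other step (or something else).
        return False
    target = params.get("target")
    if not target:
        return False
    set_up(target)
    return True
```

In the "scan" step this would be called as something like `maybe_enable_debugger(raw_json, "scan", set_up_remote_pdb)`; every other job and step falls through one of the early returns and never installs a trace function.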

In the system, the set_up_remote_pdb function is passed the target information from the JSON back channel.

#!/usr/bin/env python
import logging
import os
import re
import sys

from remote_pdb import RemotePdb

#################
# Toy call chain used to exercise the debugger hook.

def inner():
    print('before')
    print('during')
    print('after')

def middle():
    inner()

def outer():
    middle()

#################

def set_up_remote_pdb(target, port=4444):
    """
    Given a target of the form "/full/path/to/file.py:funcname",
    install a trace function that goes into a Remote PDB session
    when we enter that function. Remote PDB is useful when
    debugging processes running in AWS or on server racks.
    """
    try:
        m = re.match(r"^(.*):([^:]+)$", target)
        filename = m.group(1)
        funcname = m.group(2)
    except (AttributeError, TypeError):
        # No match (m is None) or target is not a string.
        logging.error("set_up_remote_pdb bad target: {0}".format(target))
        return

    # Remember any existing trace function so it can be restored on quit.
    previous = sys.gettrace()

    class RPDB(RemotePdb):
        def do_quit(self, arg):
            r = RemotePdb.do_quit(self, arg)
            sys.settrace(previous)
            return r

    def trace_calls(frame, event, arg):
        # Only function entry matters; match on function name and file.
        if event == 'call':
            co = frame.f_code
            if (funcname == co.co_name and
                    filename == os.path.realpath(co.co_filename)):
                # Listen on all interfaces so a remote telnet client can connect.
                RPDB("0.0.0.0", port).set_trace()

    sys.settrace(trace_calls)

def main():
    target = os.path.realpath(__file__) + ":inner"
    set_up_remote_pdb(target)
    outer()

if __name__ == "__main__":
    main()
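
The same call-matching idea can be tried out without a remote machine or a telnet session. The sketch below is a variation, not part of the script above: it wraps the `sys.settrace` hook in a context manager and calls an arbitrary callback instead of starting RemotePdb. The names `break_on_entry` and `on_hit` are made up for this example; in real use `on_hit` would be whatever starts the debugger session.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def break_on_entry(filename, funcname, on_hit):
    """Hypothetical helper: call on_hit(frame) when funcname in filename is entered.

    The previous trace function is restored when the with-block exits,
    so code outside the block runs without this hook.
    """
    wanted = os.path.realpath(filename)
    previous = sys.gettrace()

    def trace_calls(frame, event, arg):
        if event == "call":
            code = frame.f_code
            if (code.co_name == funcname and
                    os.path.realpath(code.co_filename) == wanted):
                on_hit(frame)
        # Returning None means no per-line tracing inside the new frame.
        return None

    sys.settrace(trace_calls)
    try:
        yield
    finally:
        sys.settrace(previous)
```

Wrapping a call such as `outer()` in `with break_on_entry(path, "inner", callback):` invokes the callback once per entry into `inner` for the duration of the block. One design difference from the script above: the hook is removed when the block ends rather than when the debugger quits.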