Have you ever had to debug a large cell in a Jupyter notebook? Below I share my experience on the subject. We'll review the classical methods for debugging notebooks, and finally I'll show how to set breakpoints in PyCharm for code being executed in a jupyter notebook, and benefit from the comfort of a real Python IDE for debugging.
Before I describe what PyCharm can do, let's quickly review the jupyter commands for debugging.
%pdb on is my favorite. It is a magic command that starts a debug shell on exceptions (deactivate this mode with %pdb off).
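To get a feeling for what a post-mortem debug shell has to work with, here is a minimal sketch in plain Python. It does not use %pdb itself: the helper innermost_frame_locals and the function divide are hypothetical names introduced for illustration. The helper walks the traceback down to the frame where the exception was raised and returns its local variables, which is roughly the state you land in when the debug shell opens.

```python
import sys


def innermost_frame_locals(func, *args):
    """Call func; if it raises, return the locals of the frame that raised."""
    try:
        func(*args)
    except Exception:
        tb = sys.exc_info()[2]
        # Follow the traceback chain down to the deepest frame.
        while tb.tb_next is not None:
            tb = tb.tb_next
        return dict(tb.tb_frame.f_locals)
    return None


def divide(a, b):
    ratio = a / b  # raises ZeroDivisionError when b == 0
    return ratio


crash_locals = innermost_frame_locals(divide, 1, 0)
print(crash_locals)  # {'a': 1, 'b': 0}
```

For an interactive shell on the same traceback outside of jupyter, the standard library offers pdb.post_mortem.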
IPython's set_trace is the jupyter version of the classical python import pdb; pdb.set_trace().
At the location of the desired breakpoint, insert
import IPython; IPython.core.debugger.set_trace()
T. Hoffmann proposes an elegant conditional breakpoint function:
from IPython.core.debugger import Pdb as CorePdb
import sys


# Note: this name shadows the built-in breakpoint() available since Python 3.7.
def breakpoint(condition=True):
    """
    Set a breakpoint at the location the function is called if `condition == True`.
    """
    if condition:
        debugger = CorePdb()
        frame = sys._getframe()
        # Stop in the caller's frame rather than inside this helper.
        debugger.set_trace(frame.f_back)
        return debugger


def add(a, b):
    breakpoint(type(a) != type(b))
    return a + b


add('a', 2)
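The key trick in the function above is frame.f_back, which points at the frame of the caller. The following non-interactive sketch shows what that frame contains; caller_info is a hypothetical helper written for this illustration, and add is a variant of the example function that returns the collected information instead of the sum.

```python
import sys


def caller_info():
    """Return the name and local variables of the function that called us."""
    frame = sys._getframe().f_back  # frame of the caller, not of caller_info
    return frame.f_code.co_name, dict(frame.f_locals)


def add(a, b):
    # At this point the caller's frame holds exactly the arguments a and b.
    return caller_info()


name, local_vars = add('a', 2)
print(name)        # add
print(local_vars)  # {'a': 'a', 'b': 2}
```

This is why the debugger opens inside add, where the mismatching arguments can be inspected, and not inside the breakpoint helper.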
Below the cell under debug, you get an input line where you can enter pdb commands.
The most useful commands are:
- q(uit) to exit the debugger and return to jupyter
- u(p) to go one frame up (in case you used %pdb on)
- n(ext), s(tep), r(eturn), c(ontinue) to execute next line, step into function, execute until end of current function, or continue execution
- p(rint) for printing the content of a variable.
A sample debug session looks like
Don't forget to quit the debugger, otherwise cells won't execute any more. If you forget, interrupt the kernel; in some cases you will recover a functional notebook.
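The pdb commands can also be tried out without typing anything, by feeding them to the standard library debugger from a string. This is only a sketch under a few assumptions: it uses plain pdb rather than the IPython debugger, and the command script and the add function are made up for the illustration. The debugger stops on the first line of add, prints a, executes the line with n, prints total, and continues.

```python
import io
import pdb


def add(a, b):
    total = a + b
    return total


# Commands the debugger will read, one per line: print, next, print, continue.
commands = io.StringIO("p a\nn\np total\nc\n")
output = io.StringIO()

# readrc=False keeps a personal .pdbrc file from interfering with the script.
debugger = pdb.Pdb(stdin=commands, stdout=output, readrc=False)
result = debugger.runcall(add, 1, 2)

transcript = output.getvalue()
print(transcript)
print(result)  # 3
```

Because the script ends with c, the debugger releases the function and the call returns normally, which is the scripted equivalent of not forgetting to quit.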
Now we head towards a more comfortable solution! It does require some configuration work the first time you use it, but from the second time on the operation will be very easy, believe me.
The following assumes you have PyCharm installed.
Identify the python that is used in your jupyter notebook with
import sys
sys.executable
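If several environments are installed, a couple of extra attributes help to pick the right interpreter in PyCharm. A small sketch, using only the standard sys module; the variable names are mine:

```python
import sys

interpreter_path = sys.executable        # full path of the running python binary
environment_root = sys.prefix            # root folder of the (virtual) environment
python_version = sys.version_info[:3]    # (major, minor, micro)

print(interpreter_path)
print(environment_root)
print(python_version)
```

The path printed first is the one to look for in PyCharm's interpreter settings.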
Now create a PyCharm project at the same location as your jupyter notebook. Configure the project to use the same interpreter as your jupyter:
PyCharm only allows you to set breakpoints in python modules and packages.
If you want to do step-by-step execution of a python package, please simply open the desired module in PyCharm, and skip to the next paragraph.
If you want to debug a function you wrote yourself in the notebook, please move the function to a .py file.
In this example I create a script.py file with content
def add(x, y):
    return x + y
We now load the autoreload extension with
%load_ext autoreload
%autoreload 2
The extension will make sure jupyter always uses the latest version of the script (useful when you fix the bug).
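To illustrate what reloading a module means, here is a rough sketch with the standard importlib.reload. It is not how autoreload is implemented, only the manual equivalent of one reload. The module name script_demo and the temporary folder are assumptions made to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a first, buggy version of the module to a temporary folder.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "script_demo.py"
module_file.write_text("def add(x, y):\n    return x - y  # bug\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import script_demo

before = script_demo.add(1, 2)

# Fix the bug on disk, then reload the module by hand.
module_file.write_text("def add(x, y):\n    return x + y  # bug fixed in the file\n")
importlib.reload(script_demo)

after = script_demo.add(1, 2)
print(before, after)  # -1 3
```

A manual reload does not update names that were imported with from script import add; the autoreload extension takes care of those as well, which is why it is the more convenient choice in a notebook.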
Finally we import the desired function with
from script import add
We identify the python process used by the notebook with
%connect_info
which returns, among other lines, one like
if you are local, you can connect with just:
jupyter <app> --existing kernel-d1b9a862-1f04-403b-82fb-5b820c0a0f89.json
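If the kernel file name is hard to match against the list of processes, the process id of the kernel can also be read directly from a notebook cell. A minimal sketch, assuming the attach dialog of your PyCharm version displays process ids:

```python
import os

# The notebook cell runs inside the kernel process, so this is the kernel's pid.
kernel_pid = os.getpid()
print(kernel_pid)
```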
Use the above information to attach the PyCharm debugger to the python process. Click on Run / Attach to Local Process in PyCharm's menu, and select the process identified by the kernel file:
We are ready for interactive and comfortable step-by-step debugging. In PyCharm, open the file (or package) where the breakpoint is desired, and right-click on the left border to add the breakpoint:
In jupyter, execute the cell that calls the function under debug:
As you see, the execution does not return (yet). PyCharm's breakpoint pauses execution, and offers a comfortable debugger and variable window:
I find the above very helpful for debugging and understanding the stack trace at specific code locations. But I would also love to
- catch all exceptions in PyCharm (and reproduce %pdb on)
- set PyCharm breakpoints directly in jupyter cells.
Please let me know if you have any idea on how to achieve this!