Copy nbflatten.py to somewhere on $PATH. Then, in the root of a git repository, run these commands:
echo "*.ipynb diff=ipynb" >> .gitattributes
git config diff.ipynb.textconv nbflatten.py
When you change a notebook and run git diff, you'll see the diff of flattened, simplified notebooks rather than the full JSON. This does lose some information (metadata, non-text output), but it makes it easier to see simple changes in the notebook.
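To make the idea of "flattening" concrete, here is a minimal sketch of a textconv-style filter. It is not the actual nbflatten.py: the function names `flatten` and `_text` and the `# --- ... ---` separators are made up for illustration, and it assumes an nbformat 4 notebook (a top-level "cells" list). git calls a textconv program with the path of the file to convert and reads the converted text from stdout.

```python
#!/usr/bin/env python3
"""Illustrative textconv-style notebook flattener (a sketch, not nbflatten.py)."""
import json
import sys


def _text(value):
    # nbformat stores multi-line strings either as one str or as a list of str.
    return value if isinstance(value, str) else "".join(value)


def flatten(nb):
    """Return a plain-text rendering of an nbformat 4 notebook dict."""
    lines = []
    for cell in nb.get("cells", []):
        # Hypothetical separator format, chosen only for this example.
        lines.append(f"# --- {cell.get('cell_type', 'unknown')} cell ---")
        lines.append(_text(cell.get("source", "")))
        for out in cell.get("outputs", []):
            # Keep only plain stream output; images and rich output are dropped.
            if out.get("output_type") == "stream":
                lines.append("# --- output ---")
                lines.append(_text(out.get("text", "")))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Expect exactly one argument: the path of the notebook to convert.
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8") as f:
            sys.stdout.write(flatten(json.load(f)))
```

Because the diff is computed on this text rather than on the raw JSON, a one-line change in a cell shows up as a one-line change in git diff.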
This doesn't help with merging conflicting changes in notebooks. For that, see nbdiff.org.
Just in case it might be useful to someone:
I have been a big fan of nbflatten.py since I discovered it, and have been using it extensively as a diff filter for git. However, I find it to be a bit slow, especially for repositories with many (large) notebooks. So I spent a bit of time writing a filter for jq which does the same thing, but is orders of magnitude faster.
The relevant section of my .gitconfig now looks like this:
I am typically not interested in the outputs when diffing notebooks, so the textconv filter here does not show them. However, I found it convenient to also show the notebook's metadata in the output, so everything not in "cells" is shown first under the header "Non-cell info". To hide that section, remove the part
(\"Non-cell info\"|banner),del(.cells),\"\",
from the filter.
I have also written a more "complete" script which shows the outputs in pretty much the same way as nbflatten.py. That can be found at https://gist.github.com/jfeist/cd00aa3b681092e1d5dc. If you download it and put it somewhere on your PATH, you can use
textconv = nbflatten.jq
instead.
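For readers who do not use jq, the output layout described above (everything outside "cells" first under a "Non-cell info" header, then the cell sources with outputs omitted) can be sketched in Python. This is only an illustration of that layout, not a translation of the jq filter: the function name `flatten_with_metadata` and the `=== ... ===` banner style are assumptions.

```python
import json


def flatten_with_metadata(nb):
    """Render non-cell info first, then cell sources; outputs are skipped."""
    # Everything except the "cells" list (metadata, nbformat version, ...).
    non_cell = {k: v for k, v in nb.items() if k != "cells"}
    parts = [
        "=== Non-cell info ===",  # hypothetical banner format
        json.dumps(non_cell, indent=2, sort_keys=True),
        "",
    ]
    for cell in nb.get("cells", []):
        src = cell.get("source", "")
        if not isinstance(src, str):
            src = "".join(src)  # nbformat may store source as a list of lines
        parts.append(f"=== {cell.get('cell_type', 'unknown')} cell ===")
        parts.append(src)
    return "\n".join(parts) + "\n"
```

Dropping the first three entries of `parts` would give the metadata-free variant, analogous to removing the "Non-cell info" part from the jq filter.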