Last active
March 8, 2025 17:27
Convert jupyter notebook to sphinx gallery notebook styled examples.
"""Convert jupyter notebook to sphinx gallery notebook styled examples.

Usage: python ipynb_to_gallery.py <notebook.ipynb>

Dependencies:
pypandoc: install using `pip install pypandoc`
"""
import pypandoc as pdoc
import json

def convert_ipynb_to_gallery(file_name):
    python_file = ""

    nb_dict = json.load(open(file_name))
    cells = nb_dict['cells']

    for i, cell in enumerate(cells):
        if i == 0:
            assert cell['cell_type'] == 'markdown', \
                'First cell has to be markdown'

            md_source = ''.join(cell['source'])
            rst_source = pdoc.convert_text(md_source, 'rst', 'md')
            python_file = '"""\n' + rst_source + '\n"""'
        else:
            if cell['cell_type'] == 'markdown':
                md_source = ''.join(cell['source'])
                rst_source = pdoc.convert_text(md_source, 'rst', 'md')
                commented_source = '\n'.join(['# ' + x for x in
                                              rst_source.split('\n')])
                python_file = python_file + '\n\n\n' + '#' * 70 + '\n' + \
                    commented_source
            elif cell['cell_type'] == 'code':
                source = ''.join(cell['source'])
                python_file = python_file + '\n' * 2 + source

    python_file = python_file.replace("\n%", "\n# %")
    open(file_name.replace('.ipynb', '.py'), 'w').write(python_file)

if __name__ == '__main__':
    import sys
    convert_ipynb_to_gallery(sys.argv[-1])
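To make the output shape concrete, here is a minimal sketch of what the script does to a markdown cell once it has been converted to reST. The helper name `comment_block` is hypothetical (the script does this inline), and the input is assumed to be text that pypandoc has already converted.

```python
def comment_block(rst_source):
    """Mimic the script's markdown branch: a 70-character '#' rule, then
    every line of the reST text prefixed with '# '."""
    separator = '#' * 70
    commented = '\n'.join('# ' + line for line in rst_source.split('\n'))
    return separator + '\n' + commented


# Assumed input: reST text as it would come back from the converter.
print(comment_block('Section title\n-------------\n\nSome text.'))
```

Blank lines in the reST come out as `# ` with a trailing space, exactly as in the script.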
Modified!
Really useful script. When using it, I found that the pdoc conversion of markdown adds "\r\n" at the end of every line of the file I tried.
This causes errors in the conversion, since the file writer then writes two new lines, splitting rst cells.
A quick fix was to add
    rst_source = rst_source.replace('\r', '')
after lines 23 and 28.
There might be a cleverer way to handle this when writing the file at the end.
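One way to sketch the newline fix above is a small helper applied to the converter output before it is appended to `python_file`. The name `normalize_newlines` is hypothetical and not part of the script; the sample string only stands in for what pypandoc returned on the affected machine.

```python
def normalize_newlines(text):
    # Collapse Windows (\r\n) and bare \r line endings to a plain \n.
    return text.replace('\r\n', '\n').replace('\r', '\n')


# Stand-in for converter output that carries Windows line endings.
rst_source = 'Title\r\n=====\r\n\r\nBody text.\r\n'
print(repr(normalize_newlines(rst_source)))
```

Unlike dropping every `\r`, this also keeps a line break where a bare `\r` was used. On the writing side, passing `newline='\n'` to `open()` should stop Python from translating `\n` on output, which may be the "cleverer way" hinted at, though that has not been tested against this script.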
Maybe replace
    nb_dict = json.load(open(file_name))
with
    with open(file_name) as f:
        nb_dict = json.load(f)
in line 14 and replace
    open(file_name.replace('.ipynb', '.py'), 'w').write(python_file)
with
    with open(file_name.replace('.ipynb', '.py'), 'w') as f:
        f.write(python_file)
in line 38.
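The same idea can be pulled into two small helpers, sketched below. The names `load_notebook` and `write_gallery_file` are hypothetical, and `encoding='utf-8'` is an assumption that is not in the original script. `Path.with_suffix` is used instead of `str.replace` so that only the final extension changes.

```python
import json
from pathlib import Path


def load_notebook(file_name):
    # The context manager closes the file even if json.load raises.
    with open(file_name, encoding='utf-8') as f:
        return json.load(f)


def write_gallery_file(file_name, python_file):
    # 'demo.ipynb' -> 'demo.py'; directories in the path are left untouched.
    out_path = Path(file_name).with_suffix('.py')
    with open(out_path, 'w', encoding='utf-8') as f:
        f.write(python_file)
    return out_path
```

With these, lines 14 and 38 of the script would become `nb_dict = load_notebook(file_name)` and `write_gallery_file(file_name, python_file)`.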
I had the opposite issue of @weber-s: my first cell was magic code, so as a quick fix I modified the first conditional block as
    if cell["cell_type"] == "markdown":
        md_source = "".join(cell["source"])
        rst_source = pdoc.convert_text(md_source, "rst", "md")
        python_file = '"""\n' + rst_source + '\n"""'
    elif cell["cell_type"] == "code":
        source = "".join(cell["source"])
        python_file = python_file + "\n" * 2 + source
    else:
        raise ValueError(f"Unsupported starting cell type {cell['cell_type']}.")
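A possible variation on this, sketched under assumptions: Sphinx-Gallery generally expects each example file to open with a docstring that holds a reST title, so when the first cell is code it may help to emit a placeholder docstring before the code. The helper `gallery_header`, its `fallback_title` parameter, and the `to_rst` hook are all hypothetical; the default `to_rst` is only a stand-in for `pdoc.convert_text(md, 'rst', 'md')`.

```python
def gallery_header(first_cell, fallback_title, to_rst=lambda md: md):
    """Return the docstring that opens the generated example file."""
    if first_cell['cell_type'] == 'markdown':
        body = to_rst(''.join(first_cell['source']))
    else:
        # No markdown to reuse, so build a minimal reST title instead.
        body = fallback_title + '\n' + '=' * len(fallback_title)
    return '"""\n' + body + '\n"""'


# A notebook that starts with a code cell, as in the case described above.
cell = {'cell_type': 'code',
        'source': ['%matplotlib inline\n', 'import numpy as np']}
print(gallery_header(cell, 'My notebook'))
```

The code cell itself would still have to be appended after the header, as in the `elif` branch above.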
Hi,
This script is very useful! However, lots of people now use magic commands such as
    %matplotlib inline
so the resulting file is not executable by regular Python. I suggest adding something like:
    python_file = python_file.replace("\n%", "\n# %")
just before the writing of the file, l. 35.
thanks again :)
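A slightly more defensive version of the magic-command handling could look like the sketch below. It is regex based, so it also catches a magic on the very first line, indented magics, and `!` shell escapes. The helper name `comment_out_magics` is hypothetical, and treating `!` lines the same way as `%` lines is an assumption that goes beyond the original suggestion.

```python
import re

# A line whose first non-blank character is '%' (magic) or '!' (shell escape).
MAGIC_LINE = re.compile(r'^([ \t]*)([%!])', flags=re.MULTILINE)


def comment_out_magics(source):
    # Keep the indentation and put '# ' in front of the magic itself.
    return MAGIC_LINE.sub(r'\1# \2', source)


cell_source = '%matplotlib inline\nimport os\n    !ls\nx = 5 % 2'
print(comment_out_magics(cell_source))
```

This is best applied to each code cell's source rather than to the whole file. It is still a heuristic: a continuation line or a line inside a multi-line string that happens to start with `%` would be commented out too, and the body of a cell magic such as `%%bash` would be left as uncommented non-Python text.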