Last active
October 26, 2022 08:03
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Adding doctests to the markdown-it lexer and the docutils renderer in the MyST stack."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" import markdown_it, docutils.nodes, myst_parser.docutils_renderer" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a lexing rule for markdown_it for doctests." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" def doctest(state, startLine, endLine, silent, *, offset=0, continuation=True):\n",
"     # Bounds of the first line's text, skipping its leading indentation.\n",
"     nextLine, start, maximum = startLine, state.bMarks[startLine] + state.tShift[startLine], state.eMarks[startLine]\n",
" \n",
"     if not state.src[start:maximum].startswith(\">>> \"): return False\n",
"     while nextLine < endLine:\n",
"         nextLine += 1\n",
"         start, maximum = state.bMarks[nextLine] + state.tShift[nextLine], state.eMarks[nextLine]\n",
"         if continuation:\n",
"             # `... ` lines continue the statement that follows the prompt.\n",
"             continuation = state.src[start:maximum].startswith(\"... \")\n",
"             if continuation: continue\n",
"         if state.src[start:maximum].strip():\n",
"             if state.src[start:maximum].startswith(\">>> \"):\n",
"                 # A new prompt starts the next doctest, so end this block one line earlier.\n",
"                 offset = 1\n",
"                 break\n",
"             continue  # an expected-output line\n",
"         break  # a blank line ends the block\n",
" \n",
"     old_parent, old_line_max = state.parentType, state.lineMax\n",
"     state.parentType, state.lineMax = \"container\", nextLine-offset\n",
" \n",
"     token = state.push(\"doctest\", \"code\", 0)\n",
"     token.content = state.src[state.bMarks[startLine] : state.eMarks[state.lineMax]]\n",
"     token.map = [startLine, state.lineMax]\n",
"     state.parentType, state.lineMax, state.line = old_parent, old_line_max, nextLine\n",
" \n",
"     return True"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a function that renders a doctest token as a docutils node. docutils already provides a `doctest_block` node, which the reStructuredText parser emits for doctest blocks."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" def render_doctest(self, token):\n",
"     node = docutils.nodes.doctest_block(''.join(token.content), ''.join(token.content))\n",
"     self.add_line_and_source_path(node, token)\n",
"     self.current_node.append(node)\n",
"\n", | |
" myst_parser.docutils_renderer.DocutilsRenderer.render_doctest = render_doctest" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Our markdown parser is only concerned with block-level tokens because we want to tangle code from the docutils document."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" md = markdown_it.MarkdownIt().disable('inline')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Insert the doctest rule before the `code` rule. This way doctests are recognized both with and without indentation."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" md.block.ruler.before(\"code\", \"doctest\", doctest, {\"alt\": []},)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Generate the markdown-it tokens."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
" tokens = md.parse(\"\"\"Testing\n", | |
" other taxes\n", | |
" \n", | |
" >>> 1\n", | |
" 10\n", | |
" >>> 2\n", | |
" asdf\n", | |
" \n", | |
" print\n", | |
" \"\"\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Convert the tokens to a docutils document." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[<document: <paragraph...><doctest_block...><doctest_block...><liter ...>,\n", | |
" <paragraph: >,\n", | |
" <doctest_block: <#text: ' >>> 1\\n 10'>>,\n", | |
" <#text: ' >>> 1\\n 10'>,\n", | |
" <doctest_block: <#text: ' >>> 2\\n ...'>>,\n", | |
" <#text: ' >>> 2\\n asdf\\n '>,\n", | |
" <literal_block: <#text: 'print\\n'>>,\n", | |
" <#text: 'print\\n'>]" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
" myst_parser.docutils_renderer.DocutilsRenderer(md).render(tokens,{}, markdown_it.utils.AttrDict()).traverse()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"https://gist.github.com/cfb55f41f5452ef33ec6fbb4e0bda991\n" | |
] | |
} | |
], | |
"source": [ | |
" if 10:\n",
"     !gist doctest-myst.ipynb -u cfb55f41f5452ef33ec6fbb4e0bda991"
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.7" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
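The line-scanning logic of the `doctest` rule in the notebook can be tried without markdown-it-py. The following is a minimal sketch in plain Python, not the notebook's implementation: `find_doctest_blocks` is a hypothetical helper that works on a list of lines rather than on the parser state, and, unlike the rule above, it excludes the blank line that terminates a block.

```python
def find_doctest_blocks(text):
    """Return (start, end) line ranges, end exclusive, one per doctest block.

    Hypothetical helper for illustration only; it mirrors the scanning
    idea of the notebook's rule but does not use markdown-it-py.
    """
    lines = text.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        # A block can only start on a line whose text begins with ">>> ".
        if not lines[i].lstrip().startswith(">>> "):
            i += 1
            continue
        start = i
        i += 1
        continuation = True
        while i < len(lines):
            stripped = lines[i].lstrip()
            if continuation:
                # "... " lines continue the statement after the prompt.
                continuation = stripped.startswith("... ")
                if continuation:
                    i += 1
                    continue
            if not stripped:
                break  # a blank line ends the block
            if stripped.startswith(">>> "):
                break  # a new prompt starts the next block
            i += 1  # an expected-output line
        blocks.append((start, i))
    return blocks


sample = "Testing\nother taxes\n\n>>> 1\n10\n>>> 2\nasdf\n\nprint\n"
print(find_doctest_blocks(sample))  # [(3, 5), (5, 7)]
```

Each `>>> ` prompt opens its own block, which matches the two `doctest_block` nodes in the notebook output above.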
the current implementation i'd suggest is in my pidgy project. i think it will work generically. this doctest rule is closer to the markdown-it-py conventions. feel free to borrow the implementation wherever you see fit. ping me if you have any tools you want tested out.
🤞 hopefully there is an implementation that can stay above the docutils layer.
Stumbled into this trying to find out what the MyST syntax was for the classic "Python REPL" style that doctest uses. Not sure if the above answers my question, but I'm in the process of adding MyST support to Sybil, which may be of interest...
@tonyfast Did anything more come of this? This could come in quite handy to a lot of folks.
I found a project at https://github.com/thisch/pytest-sphinx and am in the early phases of porting it over to the docutils / MyST API (from simple regex): https://github.com/thisch/pytest-sphinx/pull/31/files
P.S. I also posted to executablebooks/MyST-Parser#601 and Erotemic/xdoctest#68