@brabect1
Last active June 27, 2020 14:55
Creating pandoc filters with panflute for LaTeX #pandoc #panflute #latex

Pandoc LaTeX Filters with panflute

Panflute is a Python module for creating pandoc filters.

LaTeX Filters

A basic panflute filter that does nothing looks as follows:

import panflute as pf

def prepare(doc):
    pass

def finalize(doc):
    pass

def filter(elem, doc):
    pass

def main(doc=None):
    return pf.run_filter(filter, prepare=prepare, finalize=finalize, doc=doc)

if __name__ == "__main__":
    main()

To add to, change, or extend the output of the default pandoc LaTeX writer, you are likely to use some form of raw text that the writer passes on "as is".

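def filter(elem, doc):
    # is_admonition() is a helper defined in the full filter listing below.
    if is_admonition(elem):
        classes = list(elem.classes)
        # Wrap the Div in a LaTeX "<class>block" environment.
        return [
            pf.RawBlock(u"\\begin{" + classes[0] + u"block}", "tex"),
            elem,
            pf.RawBlock(u"\\end{" + classes[0] + u"block}", "tex"),
        ]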
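
To make the generated LaTeX concrete, here is a minimal sketch of the naming scheme, written without panflute so that it runs on its own. The helper names `admonition_class` and `block_markers` are hypothetical. Picking the first recognised class, rather than `classes[0]`, is an assumption made here so that a Div whose first class is unrelated still maps to a sensible environment.

```python
# Classes treated as admonitions (the same names the full filter checks for).
ADMONITION_CLASSES = ['admonition', 'note', 'warning', 'important', 'caution',
                      'tip', 'hint', 'error', 'attention', 'danger']


def admonition_class(classes):
    """Return the first class that names an admonition, or None."""
    for c in classes:
        if c in ADMONITION_CLASSES:
            return c
    return None


def block_markers(classes):
    """Return the (begin, end) raw LaTeX lines for a Div, or None."""
    name = admonition_class(classes)
    if name is None:
        return None
    return ("\\begin{" + name + "block}", "\\end{" + name + "block}")


print(block_markers(["note"]))
```

The resulting document only compiles if environments such as `noteblock` are defined in the LaTeX preamble.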

You may also need to add extra LaTeX code to the frontmatter and/or backmatter.

def finalize(doc):
    # Add header-includes if necessary
    metadata = doc.get_metadata(u'header-includes')
    if not metadata:
        doc.metadata[u"header-includes"] = pf.MetaList()

    # Convert header-includes to MetaList if necessary
    elif not isinstance(metadata, pf.MetaList):
        doc.metadata[u"header-includes"] = pf.MetaList(metadata)

    # u"..." rather than ur"..." so that the code also parses on Python 3.
    doc.metadata[u"header-includes"].content.append(
        pf.MetaInlines(
            pf.RawInline(u"\n\\hello\n", format="tex")
        )
    )
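
The same three steps (create the list if missing, wrap a single value in a list, append the raw LaTeX) can be sketched on pandoc's JSON representation of the metadata, where every value is a `{"t": ..., "c": ...}` object. This is plain Python rather than panflute code, and `add_header_include` is a hypothetical helper used only for illustration.

```python
import json


def add_header_include(meta, tex):
    """Append raw LaTeX to meta['header-includes'] in a pandoc JSON AST."""
    entry = {"t": "MetaInlines",
             "c": [{"t": "RawInline", "c": ["tex", tex]}]}
    current = meta.get("header-includes")
    if current is None:
        # No header-includes yet: start with an empty list.
        current = {"t": "MetaList", "c": []}
    elif current["t"] != "MetaList":
        # A single value: wrap it in a list so that it is preserved.
        current = {"t": "MetaList", "c": [current]}
    current["c"].append(entry)
    meta["header-includes"] = current
    return meta


meta = {"header-includes": {"t": "MetaInlines", "c": [{"t": "Str", "c": "x"}]}}
print(json.dumps(add_header_include(meta, "\\hello"), indent=2))
```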

Python 2.7

To install panflute you normally do something like:

sudo apt-get install python-pip
pip install panflute

Panflute primarily targets Python 3.x, although older releases also work with Python 2.7. The latest version installed by pip uses Python 3-only syntax (keyword-only arguments after *args) and fails to import under Python 2.7:

Traceback (most recent call last):
  File "panflute-examples/panflute-load.py", line 1, in <module>
    import panflute as pf
  File "/usr/local/lib/python2.7/dist-packages/panflute/__init__.py", line 9, in <module>
    from .containers import ListContainer, DictContainer
  File "/usr/local/lib/python2.7/dist-packages/panflute/containers.py", line 42
    def __init__(self, *args, oktypes=object, parent=None):
                                    ^
SyntaxError: invalid syntax

So, e.g., for Python 2.7.15 I pinned an older release with pip install panflute==1.9.3
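
If the same setup script has to serve both interpreters, the choice of version can be sketched as below. The function name `panflute_requirement` is hypothetical, and 1.9.3 is simply the release reported above to work on 2.7.15, not a verified compatibility boundary.

```python
import sys


def panflute_requirement(version_info=sys.version_info):
    """Return a pip requirement string suited to the given interpreter."""
    if version_info[0] < 3:
        # Assumption: 1.9.3 is the release that worked on Python 2.7.15.
        return "panflute==1.9.3"
    return "panflute"


print("pip install " + panflute_requirement())
```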

Pandoc Filters

The most generic way to use pandoc filters is:

pandoc -s input.txt -t json | \
  pandoc-citeproc | \
  pandoc -s -f json -t rst

The first command generates a JSON-formatted AST that pandoc filters, such as pandoc-citeproc, operate on. The last command turns the filtered JSON AST into the output format given by -t (here rst, for reStructuredText).
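
To show what a filter in the middle of such a pipeline sees, here is a minimal sketch that works on the JSON AST using only the standard library. Every element is a `{"t": ..., "c": ...}` object, and the document's body sits under the top-level `blocks` key. The names `walk` and `shout` are hypothetical, and the sample document is made up for illustration.

```python
import json


def walk(node, action):
    """Apply action to every {"t": ...} element, children first."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, action) for key, value in node.items()}
        if "t" in node:
            node = action(node)
    return node


def shout(elem):
    """Upper-case the text of Str elements; leave everything else alone."""
    if elem["t"] == "Str":
        return {"t": "Str", "c": elem["c"].upper()}
    return elem


# A tiny hand-written body: one paragraph containing the word "hello".
blocks = [{"t": "Para", "c": [{"t": "Str", "c": "hello"}]}]
print(json.dumps(walk(blocks, shout)))
```

In a real pipeline the document would be read with `json.load(sys.stdin)`, its `blocks` entry transformed, and the result written back with `json.dump(doc, sys.stdout)`. Panflute wraps this plumbing in `run_filter`.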

import panflute as pf
import sys

def prepare(doc):
    pass

def finalize(doc):
    # Add header-includes if necessary
    metadata = doc.get_metadata(u'header-includes')
    if not metadata:
        doc.metadata[u"header-includes"] = pf.MetaList()
    # Convert header-includes to MetaList if necessary
    elif not isinstance(metadata, pf.MetaList):
        doc.metadata[u"header-includes"] = pf.MetaList(metadata)
    # u"..." rather than ur"..." so that the code also parses on Python 3
    doc.metadata[u"header-includes"].content.append(
        pf.MetaInlines(
            pf.RawInline(u"\n\\hello\n", format="tex")
        )
    )

def filter(elem, doc):
    if elem.tag in ['Div']:
        classes = list(elem.classes)
        #sys.stderr.write(">>>atr:" + str(elem.attributes) + '\n')
        #sys.stderr.write(">>>cls:" + str(elem.classes) + '\n')
        #sys.stderr.write(">>>jsn:" + str(elem.to_json()) + '\n')
        if 'title' in classes and is_admonition(elem.parent):
            # throw away the admonition title as it would otherwise be typeset
            return []
        elif is_admonition(elem):
            return [
                pf.RawBlock(u"\\begin{" + classes[0] + u"block}", "tex"),
                elem,
                pf.RawBlock(u"\\end{" + classes[0] + u"block}", "tex"),
            ]

admonition_classes = ['admonition', 'note', 'warning', 'important', 'caution', 'tip', 'hint', 'error', 'attention', 'danger']

def is_admonition(elem):
    if not elem.tag == 'Div':
        return 0
    for c in set(elem.classes):
        if c in admonition_classes: return 1
    return 0

def main(doc=None):
    return pf.run_filter(filter, prepare=prepare, finalize=finalize, doc=doc)

if __name__ == "__main__":
    main()

import panflute as pf
import sys

def main():
    if (len(sys.argv) > 1):
        # Load a JSON AST from the given file and dump it back to stdout
        with open(sys.argv[1]) as f:
            doc = pf.load(f)
        pf.dump(doc)
    else:
        print("Usage: python <script> <json-file>")

if __name__ == "__main__":
    main()