#!/usr/bin/env python3
from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if '[[' in value:
            new_value = value.replace('[[', '')
            return Str(new_value)
        if ']]' in value:
            new_value = value.replace(']]', '')
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
I think it's this package, although I am not entirely sure how to install it properly 🤔
https://pypi.org/project/pandocfilters/
You can install it using pip, but I installed it with conda in an isolated environment instead, just in case.
conda create -n pandoc
conda activate pandoc
conda install -c conda-forge pandocfilters
Before any filtering is done, pandoc parses the markdown file into an abstract syntax tree (AST). I took a look at what the tree looks like for a simple markdown file with a single line: [[@citekey]]. The string is actually broken into three elements: [, [@citekey], and ]. So no string contains [[ or ]], which is why this pandocfilter script didn't work. Similarly for Lua filters, replacing [[ or ]] doesn't work.
Both the pandocfilter and the Lua filter would work if we replace [ and ] with ''.
Here is what the AST looks like
List of 3
|-pandoc-api-version:List of 4
| |-: int 1
| |-: int 22
| |-: int 2
| |-: int 1
|-meta : Named list()
|-blocks :List of 1
|-:List of 2
|-t: chr "Para"
|-c:List of 3
|-:List of 2
| |-t: chr "Str"
| |-c: chr "["
|-:List of 2
| |-t: chr "Cite"
| |-c:List of 2
| |-:List of 1
| | |-:List of 6
| | |-citationId : chr "citekey"
| | |-citationPrefix : list()
| | |-citationSuffix : list()
| | |-citationMode :List of 1
| | | |-t: chr "NormalCitation"
| | |-citationNoteNum: int 1
| | |-citationHash : int 0
| |-:List of 1
| |-:List of 2
| |-t: chr "Str"
| |-c: chr "[@citekey]"
|-:List of 2
|-t: chr "Str"
|-c: chr "]"
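To make the point about the tree concrete, here is a minimal sketch that walks a hand-written copy of the AST above and collects the Str values. The dict literal is transcribed from the dump (the api-version and meta fields are left out), and the helper name iter_str_values is made up for illustration; it is not part of pandocfilters.

```python
# Hand-transcribed copy of the "blocks" part of the AST dump above.
ast_blocks = [
    {"t": "Para", "c": [
        {"t": "Str", "c": "["},
        {"t": "Cite", "c": [
            [{
                "citationId": "citekey",
                "citationPrefix": [],
                "citationSuffix": [],
                "citationMode": {"t": "NormalCitation"},
                "citationNoteNum": 1,
                "citationHash": 0,
            }],
            [{"t": "Str", "c": "[@citekey]"}],
        ]},
        {"t": "Str", "c": "]"},
    ]}
]


def iter_str_values(node):
    """Yield the text of every Str node, in document order (hypothetical helper)."""
    if isinstance(node, dict):
        if node.get("t") == "Str":
            yield node["c"]
        else:
            for child in node.values():
                yield from iter_str_values(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_str_values(child)


str_values = list(iter_str_values(ast_blocks))
print(str_values)
# None of the strings contains a double bracket, so a filter that
# searches for '[[' or ']]' never fires on this input.
print(any("[[" in s or "]]" in s for s in str_values))
```

For a real document, the JSON that pandoc emits with -t json can be loaded with json.load and passed to the same walker.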
Thanks for this great insight @racng. I was able to make this change and get it to work, except I noticed that the back end of the link did not get filtered correctly.
It went from [[name]] to name]. I found it odd that it was able to replace [[ but only one of the ].
This is the modified code I am using:
#!/usr/bin/env python3
from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if '[' in value:
            new_value = value.replace('[', '')
            return Str(new_value)
        if ']' in value:
            new_value = value.replace(']', '')
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
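One thing worth noting about the script above: the function returns from the first branch as soon as it sees a [, so the ] branch never runs on a string that contains both kinds of bracket. Whether that fully explains the leftover ] is an assumption, but stripping both in one pass avoids the issue. The sketch below returns a plain dict in the same shape that pandocfilters.Str(...) builds, so it can be tried without the package installed; strip_brackets is a made-up helper name.

```python
def strip_brackets(value):
    """Remove every '[' and ']' from a string in a single pass."""
    return value.replace('[', '').replace(']', '')


def replace(key, value, format, meta):
    # Only Str nodes that contain a bracket are rewritten. Returning None
    # (the implicit default) tells the filter machinery to leave a node as-is.
    if key == 'Str' and ('[' in value or ']' in value):
        # Same shape as pandocfilters.Str(new_value): {'t': 'Str', 'c': ...}
        return {'t': 'Str', 'c': strip_brackets(value)}


print(replace('Str', '[[name]]', '', {}))
```

To use it as a real filter, pass replace to toJSONFilter as in the scripts above. Note that this also strips the brackets around ordinary citations such as [@citekey], which may or may not be what you want.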
Thanks for the original filter code @maybemkl! I was hitting the same problem as @aravindk100 in that the filter would not find the closing ]] characters, so I modified the script to take advantage of a newer Python 3.8 feature (the := assignment expression), which also greatly simplifies the code. Here's my version:
#!/usr/bin/env python3
from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if match := re.search('\[\[(.+)\]\]',value,re.IGNORECASE):
            new_value = match.group(1)
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
Thank you all, this is really helpful, especially when exporting linked notes from Obsidian through pandoc!
Apparently there was an invalid escape sequence. The regex pattern '\[\[(.+)\]\]' contains backslashes (\). In an ordinary Python string, \[ and \] are not valid escape sequences; Python keeps the backslash but emits a warning about the invalid escape sequence. A raw string (r"") tells Python not to process escape sequences, so \[ and \] reach the regex engine unchanged, where they match literal brackets.
This is the improved version (with the help of ChatGPT):
#!/usr/bin/env python3
from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if match := re.search(r'\[\[(.+)\]\]', value, re.IGNORECASE):
            new_value = match.group(1)
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
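A small caveat with the re.search approach, offered as a possible refinement rather than a correction: returning only match.group(1) discards anything else in the same string, for example a comma directly after the link. A sketch using re.sub keeps the surrounding text; WIKILINK and unwrap_wikilinks are names invented here.

```python
import re

# Non-greedy so that two links inside one string are unwrapped separately.
WIKILINK = re.compile(r'\[\[(.+?)\]\]')


def unwrap_wikilinks(text):
    """Replace each [[target]] with target, leaving other characters alone."""
    return WIKILINK.sub(r'\1', text)


# re.search + group(1) drops the trailing comma:
print(re.search(r'\[\[(.+)\]\]', '[[name]],').group(1))
# re.sub keeps it:
print(unwrap_wikilinks('[[name]],'))
```

Inside the filter, this would mean returning Str(unwrap_wikilinks(value)) whenever the pattern matches.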
Do I have to put the file somewhere specific? If I call it I get the following error:
Traceback (most recent call last):
  File "/Users/tim/Documents/Wissensmanagement/Pandoc/remove_links.py", line 3, in <module>
    from pandocfilters import toJSONFilter, Str
ImportError: No module named pandocfilters
Error running filter /Users/tim/Documents/Wissensmanagement/Pandoc/remove_links.py: Filter returned error status 1
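The traceback says the import failed, which usually points at the Python interpreter rather than the location of the file: the interpreter that runs the filter does not see a pandocfilters installation. A minimal diagnostic sketch, assuming Python 3 (module_available is a made-up helper name): run it with the same python3 that the filter's shebang line resolves to.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == '__main__':
    # Shows which interpreter is running and whether it can see the package.
    print('interpreter:', sys.executable)
    print('pandocfilters importable:', module_available('pandocfilters'))
```

If the second line prints False, installing pandocfilters for that exact interpreter (or activating the environment it was installed into before calling pandoc) is the likely fix.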