import re
import os
from dateutil.parser import parse

path = '' #insert file path to your vault here

### Convert date format in file content

for root, dirs, files in os.walk(path):
    files = [f for f in files if re.match(r'.*\.md', f)] # only keep files end with `.md`
    #TODO: could better ignore all dirs with `.` like `.git`

    for f in files:
        fullpath = (os.path.join(root, f))

        with open(fullpath, 'r') as f: #opens each .md file
            contents = f.read() #reads the contents
            #substitutes dates with the format [[April 20th, 2020]] for [[2020-04-20]]
            new_contents = re.sub(r'(?<=\[)[\w]+\s\d{1,2}\w{1,2},\s\d{4}(?=\])',
                lambda x: str(parse(x.group(0), ignoretz=True)).split(" ")[0], contents, flags=re.M)

        with open(fullpath, 'w') as f:
            f.write(new_contents) #writes the files with the new substitutions

### Convert daily notes names

for root, dirs, files in os.walk(path):
    files = [f for f in files if re.match(r'[\w]+\s\d{1,2}\w{1,2},\s\d{4}\.md', f)]

    for f in files:
        fullpath = (os.path.join(root, f))
        new_fullpath = re.sub(r'[\w]+\s\d{1,2}\w{1,2},\s\d{4}',
            lambda x: str(parse(x.group(0), ignoretz=True)).split(" ")[0], fullpath, flags=re.M)
        os.rename(fullpath, new_fullpath)
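One possible way to handle the TODO about skipping directories such as `.git` is to prune the `dirs` list that `os.walk` yields. This is only a sketch: `iter_markdown_files` is a hypothetical helper name and is not part of the script above, and it assumes that every directory whose name starts with `.` should be ignored.

```python
import os

def iter_markdown_files(vault_path):
    """Yield paths of .md files under vault_path, skipping dot-directories."""
    for root, dirs, files in os.walk(vault_path):
        # Editing dirs in place stops os.walk from descending into them.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for name in files:
            if name.endswith('.md'):
                yield os.path.join(root, name)
```

Both loops in the script could then iterate over `iter_markdown_files(path)` instead of calling `os.walk` directly.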
Hi Joilence! Thanks for making this script!! I'd like to convert to ISO format without dashes ('20210220' instead of '2021-02-20'), can you help me with that? Best, Koen
You have to adapt these parts for that:
re.sub(r'(?<=\[)[\w]+\s\d{1,2}\w{1,2},\s\d{4}(?=\])',
lambda x: str(parse(x.group(0), ignoretz=True)).split(" ")[0], contents, flags=re.M)
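For the dash-free format, one way to adapt that part is to format the parsed date with `strftime` instead of slicing its string form. A minimal sketch, assuming the same regex as the script; `DATE_LINK` and `convert_dates` are hypothetical names introduced here for illustration:

```python
import re
from dateutil.parser import parse

# Same pattern as in the script: a Roam-style date inside [[...]]
DATE_LINK = re.compile(r'(?<=\[)[\w]+\s\d{1,2}\w{1,2},\s\d{4}(?=\])')

def convert_dates(text, fmt='%Y%m%d'):
    """Replace dates such as 'February 20th, 2021' using the given strftime format."""
    return DATE_LINK.sub(
        lambda m: parse(m.group(0), ignoretz=True).strftime(fmt),
        text,
    )

print(convert_dates('See [[February 20th, 2021]]'))  # See [[20210220]]
```

The same replacement function would also need to be used in the second loop so that the daily note file names match the links.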
Worked very well - thank you very much
Hi! I'm getting the following error. I'm using a non-English Windows installation, if that matters. Any ideas on how to fix this? I am not very proficient in these things just yet, so maybe there's some newbie error behind it.
C:\Users\furemajo>python roam-obsidian-date-convert.py
Traceback (most recent call last):
  File "C:\Users\furemajo\roam-obsidian-date-convert.py", line 17, in <module>
    contents = f.read() #reads the contents
  File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2232: character maps to <undefined>
What have you tried so far? @jeom123
-> https://lmgtfy.app/#gsc.tab=0&gsc.q=UnicodeDecodeError%3A%20'charmap'%20codec%20can't%20decode%20byte%200x9d%20
You should try the first result. Change line 16 to:
with open(fullpath, 'r', encoding="utf8") as f:
> What have you tried so far? @jeom123
> -> https://lmgtfy.app/#gsc.tab=0&gsc.q=UnicodeDecodeError%3A%20'charmap'%20codec%20can't%20decode%20byte%200x9d%20
> You should try the first result. Change line 16 to:
> with open(fullpath, 'r', encoding="utf8") as f:
I tried a bunch of random stuff, but focused mainly on adding attributes to the thing that happens on line 17. When adding the encoding="utf8" on line 16, I get:
Traceback (most recent call last):
  File "C:\Users\furemajo\roam-obsidian-date-convert.py", line 17, in <module>
    contents = f.read() #reads the contents
  File "C:\Program Files\Python39\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 1558: invalid start byte
Looking at the files, a lot of dates seem to have changed inside the daily note files (but not the names of the files themselves). Also, all Scandinavian characters (I write in Danish in the daily notes) have turned into weird symbols/question marks.
One guess is that Roam has screwed up something when I have written in Scandinavian letters. When I open the original exports from Roam in Visual Studio Code, all åäö are highlighted, and when I hover over them, it says: "The character U+00e4 "ä" is not a basic ASCII character". This is before conversion.
To format the date, add .strftime('%Y.%m.%d') after the parse call, e.g.:
new_contents = re.sub(r'(?<=\[)[\w]+\s\d{1,2}\w{1,2},\s\d{4}(?=\])', lambda x: str(parse(x.group(0), ignoretz=True).strftime('%Y.%m.%d')).split(" ")[0], contents, flags=re.M)
I'm running this script within Obsidian using the "Execute Code" community plugin on a Mac, and I am getting:
name 're' is not defined
Seems like the import re line is not working?
Any ideas?
Hi Joilence,
I'm running this script within Obsidian using the "Execute Code" community plugin on Win11.
But I'm not getting the 'run' button.
Is there another way to use the script?
What might I be doing wrong?
@max-fedoseev just in case you haven't found an answer elsewhere. To run code inside Obsidian you need to create a code block in your note like the following:
```run-python
import re
import os
from dateutil.parser import parse
....
```
Make sure the indentation is correct as you see above
> What have you tried so far? @jeom123
> -> https://lmgtfy.app/#gsc.tab=0&gsc.q=UnicodeDecodeError%3A%20'charmap'%20codec%20can't%20decode%20byte%200x9d%20
> You should try the first result. Change line 16 to:
> with open(fullpath, 'r', encoding="utf8") as f:
In my case I found that the markdown files that Roam exported were actually encoded in ANSI.
So on line 17, AND ON LINE 22, rather than
with open(fullpath, 'r', encoding="utf8") as f:
I wrote
with open(fullpath, 'r', encoding="ANSI") as f:
and everything worked fine.
Apparently there are some special characters that are different between the two. I only checked the characters in two files, but in one it was an apostrophe that it didn't like and in another it was a hyphen.
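The two tracebacks earlier in this thread may point to a vault that mixes encodings: byte 0x9d is undefined in cp1252 but can occur inside UTF-8 text, while 0xf6 is "ö" in cp1252 and is not a valid UTF-8 start byte. A rough sketch of a per-file fallback, assuming the only encodings involved are UTF-8 and cp1252 (the usual "ANSI" code page on Western European Windows); `read_text_with_fallback` and `write_text_utf8` are hypothetical helper names, not part of the script:

```python
def read_text_with_fallback(path, encodings=('utf-8', 'cp1252')):
    """Return (text, encoding) using the first encoding that decodes the file."""
    with open(path, 'rb') as fh:
        raw = fh.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError(f'could not decode {path} with any of {encodings}')

def write_text_utf8(path, text):
    """Write text back as UTF-8 without translating line endings."""
    with open(path, 'w', encoding='utf-8', newline='') as fh:
        fh.write(text)
```

Writing with an explicit encoding matters as much as reading with one: the original script opens the file for writing with the platform default, which could explain characters turning into odd symbols. As far as I know, `encoding="ANSI"` is only accepted by Python on Windows, so naming the code page explicitly is likely more portable.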