@Pokechu22
Created December 8, 2022 19:09
Generate a list of revision URLs from a MediaWiki dump
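import sys

# Usage: python <this script> <dump.xml> <index.php URL>
with open(sys.argv[1], encoding='utf-8') as fin:
    lines = fin.readlines()
INDEX_URL = sys.argv[2]

# Prefix widths assume the standard MediaWiki XML export indentation:
# 4 spaces before a page's <title>, 6 spaces before a revision's <id>.
TITLE_PREFIX = '    <title>'
TITLE_SUFFIX = '</title>\n'
REV_PREFIX = '      <id>'
REV_SUFFIX = '</id>\n'

for line in lines:
    if line.startswith(TITLE_PREFIX):
        # Remember the current page title; MediaWiki URLs use underscores for spaces.
        cur_title = line[len(TITLE_PREFIX):-len(TITLE_SUFFIX)].replace(' ', '_')
    if line.startswith(REV_PREFIX):
        rev = line[len(REV_PREFIX):-len(REV_SUFFIX)]
        # One URL to view the revision, one to open its edit form.
        print(INDEX_URL + '?title=' + cur_title + '&oldid=' + rev)
        print(INDEX_URL + '?title=' + cur_title + '&action=edit&oldid=' + rev)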
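
A possible variant, not part of the original gist: the same line scan wrapped in a generator so it can be tried on a small inline sample. It makes two assumed additions: XML entities in titles are unescaped with html.unescape, and titles are percent-encoded with urllib.parse.quote so that characters such as & or ? cannot break the query string. The function name revision_urls, the sample lines, and the wiki.example URL are hypothetical, and the prefix widths again assume the standard export indentation.

```python
import html
from urllib.parse import quote

# Assumed indentation of the standard MediaWiki XML export.
TITLE_PREFIX = '    <title>'
TITLE_SUFFIX = '</title>\n'
REV_PREFIX = '      <id>'
REV_SUFFIX = '</id>\n'


def revision_urls(lines, index_url):
    """Yield a view URL and an edit URL for every revision id in `lines`."""
    cur_title = None
    for line in lines:
        if line.startswith(TITLE_PREFIX):
            raw = line[len(TITLE_PREFIX):-len(TITLE_SUFFIX)]
            # Assumption: titles in the dump are XML-escaped (e.g. &amp;).
            title = html.unescape(raw).replace(' ', '_')
            # Percent-encode everything outside the unreserved set.
            cur_title = quote(title, safe='')
        elif line.startswith(REV_PREFIX) and cur_title is not None:
            rev = line[len(REV_PREFIX):-len(REV_SUFFIX)]
            yield f'{index_url}?title={cur_title}&oldid={rev}'
            yield f'{index_url}?title={cur_title}&action=edit&oldid={rev}'


# Hypothetical sample: the page <id> (4 spaces) is skipped,
# only the revision <id> (6 spaces) produces URLs.
sample = [
    '  <page>\n',
    '    <title>Tom &amp; Jerry</title>\n',
    '    <id>7</id>\n',
    '    <revision>\n',
    '      <id>42</id>\n',
    '    </revision>\n',
    '  </page>\n',
]
urls = list(revision_urls(sample, 'https://wiki.example/index.php'))
for url in urls:
    print(url)
```

Passing an open file object instead of a list works the same way and avoids loading a large dump into memory at once.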