@stefanv
Created September 26, 2018 22:26
import os
import urllib.error
import urllib.request

# Base URL of the numpy-discussion pipermail archive.
archive = 'https://mail.python.org/pipermail/numpy-discussion'
months = ('January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December')
output = 'mirror'

# Create the output directory if it does not exist yet.
os.makedirs(output, exist_ok=True)

for year in range(2000, 2019):
    for month in months:
        thread = f'{archive}/{year}-{month}/thread.html'
        print(f'Fetching {thread}')
        try:
            with urllib.request.urlopen(thread) as response:
                html = response.read()
        except urllib.error.HTTPError as e:
            # A month with no archive page raises HTTPError; skip it and move on.
            print(f'Skipping {thread}: {e}')
            continue
        with open(f'{output}/{year}-{month}.html', 'wb') as f:
            f.write(html)
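
The loop above downloads every month again on each run. Below is a minimal sketch of one way to list the URL and output-path pairs up front, so that months already present in the mirror directory can be skipped on a rerun. The helper names `month_pages` and `missing_pages` are hypothetical and not part of the original gist; the archive URL layout and the `mirror` directory are the ones the script already uses.

```python
import os

ARCHIVE = 'https://mail.python.org/pipermail/numpy-discussion'
MONTHS = ('January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December')


def month_pages(first_year, last_year, archive=ARCHIVE, output='mirror'):
    """Yield (url, local_path) for each month from first_year to last_year inclusive.

    Hypothetical helper: it only builds strings and performs no network access.
    """
    for year in range(first_year, last_year + 1):
        for month in MONTHS:
            url = f'{archive}/{year}-{month}/thread.html'
            path = os.path.join(output, f'{year}-{month}.html')
            yield url, path


def missing_pages(pages):
    """Return only the (url, local_path) pairs whose local file does not exist yet."""
    return [(url, path) for url, path in pages if not os.path.exists(path)]
```

Something like `missing_pages(month_pages(2000, 2018))` would then give the pairs still to be fetched, which the download loop could iterate over instead of the full year and month ranges.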