@erfg12
Last active September 27, 2019 20:13
Download waybackmachine.org Websites

Method 1

  1. Get the Wayback Machine proxy software: https://github.com/STRML/wayback-machine-machine

  2. Configure the proxy software, set up the proxy in your OS settings, then run the proxy software.

  3. If you are on Windows, get Wget: https://eternallybored.org/misc/wget/

  4. Open a command/terminal window in the directory you want to download the site to, and run this command:

    wget -r -np -e use_proxy=yes -e http_proxy=127.0.0.1:4080 -k http://www.mywaybackmachinewebsite.com

NOTE: If you want, visit the website through the proxy first to make sure it is exactly what you want before downloading. Wget downloads the same pages your browser sees through the proxy.
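
The wget command above can be wrapped in a small function so the flags are documented and the command can be previewed before anything is downloaded. This is only a sketch: the function name `mirror_via_proxy` and the `PROXY` and `DRY_RUN` variables are made up for illustration, and the proxy address is the one used in the command above, so adjust it if your proxy listens elsewhere.

```shell
# Proxy address from the command above; override by exporting PROXY first.
PROXY="${PROXY:-127.0.0.1:4080}"

# Hypothetical helper: mirror a site through the local Wayback proxy.
# $1 = site URL. With DRY_RUN=1 the command is printed instead of run.
mirror_via_proxy() {
  # -r   recursive download
  # -np  never ascend to the parent directory
  # -e   run a .wgetrc-style command (here: enable and point at the proxy)
  # -k   convert links so the local copy can be browsed offline
  set -- wget -r -np -e use_proxy=yes -e "http_proxy=$PROXY" -k "$1"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (preview first, then run for real):
#   DRY_RUN=1; mirror_via_proxy http://www.mywaybackmachinewebsite.com
#   DRY_RUN=0; mirror_via_proxy http://www.mywaybackmachinewebsite.com
```

Previewing the command is a cheap way to confirm the proxy address and URL before a recursive download starts.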

Method 2

  1. Get wayback_machine_downloader: https://github.com/hartator/wayback-machine-downloader

  2. Browse the Wayback Machine file listing for the site (e.g. http://web.archive.org/web/*/http://battle.net/*) and note the from/to dates of the captures you want.

  3. Run wayback_machine_downloader with the from (-f) and to (-t) timestamps taken from the file listing to get all files in that range. For example:

    wayback_machine_downloader http://battle.net/* -f 19970101000000 -t 19971230000000 -a

NOTE: Wayback Machine timestamps use the YYYYMMDDhhmmss format. The asterisks are very important! If your shell expands wildcards itself (zsh, for example), wrap the URL in quotes.
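
Typing 14-digit timestamps by hand is error-prone, so a small helper can build them from a readable date. The sketch below is illustrative only: `wayback_ts` is a made-up name, it simply strips the separators from a `YYYY-MM-DD` date and an optional `hh:mm:ss` time, and it does not check that the date is valid.

```shell
# Hypothetical helper: build a YYYYMMDDhhmmss timestamp.
# $1 = date as YYYY-MM-DD
# $2 = optional time as hh:mm:ss (defaults to midnight)
wayback_ts() {
  # Keep only the digits of each argument, then join them.
  ts_date=$(printf '%s' "$1" | tr -cd '0-9')
  ts_time=$(printf '%s' "${2:-00:00:00}" | tr -cd '0-9')
  printf '%s%s\n' "$ts_date" "$ts_time"
}

# Usage with the command from step 3:
#   wayback_machine_downloader "http://battle.net/*" \
#     -f "$(wayback_ts 1997-01-01)" -t "$(wayback_ts 1997-12-30)" -a
```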
