
@tedsteinmann
Last active May 2, 2019 16:55

Website Backup

Getting Started:

To get started, you will need a computer capable of running bash, with wget installed.

How to back up a website

  1. Create a file named config.sh with the following content:
label=[enter a label for downloaded content]
url=[enter the base URL to download from]
domain_list=[enter a comma separated list of additional domains to download content from]
  2. Open a terminal prompt and run bash backup.sh
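
As an illustration, a config.sh for a hypothetical site might look like the sketch below. The label, URL, and domains are placeholder values, not taken from the original gist. Because backup.sh reads the file with source, values containing spaces or shell metacharacters should be quoted.

```shell
# config.sh (illustrative values only; replace them with your own)
label="example-site"
url="https://www.example.com/"
# Comma-separated with no spaces; the script passes this to wget's --domains option.
# Listing the base host here as well is a cautious assumption, not something
# the original instructions require.
domain_list="www.example.com,cdn.example.com"
```

With these values, the script would create a directory named example-site_ followed by the current date and time, and write the wget log to log.txt inside it.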

How to browse a downloaded website

Note that all downloaded content will be viewable, but depending on how the website was built, you may need to run a local web server to view it.

A simple Python web server can be started by running the following (this is the Python 2 form; on Python 3, use python3 -m http.server 8000 instead):

python -m SimpleHTTPServer 8000

and then navigating to the web server's address in a browser and selecting your directory. This will most likely be http://0.0.0.0:8000 (or http://localhost:8000).

The web server can be stopped by pressing Ctrl + C.
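
Since each run creates a new directory named after the label and a timestamp, it can help to locate the newest one before starting the server. The following is a minimal sketch; latest_backup is a hypothetical helper that is not part of the gist, and it assumes no other label in the same folder begins with the same prefix.

```shell
# latest_backup LABEL
# Hypothetical helper: prints the newest directory named LABEL_<timestamp>
# in the current directory, or returns non-zero if there is none.
# The function body runs in a subshell so its variables do not leak.
latest_backup() (
  newest=""
  # Globs expand in sorted order, and the timestamps are zero-padded
  # (YYYY-MM-DD_HHMMSS), so the last match is the most recent backup.
  for dir in "$1"_*/; do
    [ -d "$dir" ] || continue   # skip the literal pattern when nothing matches
    newest=${dir%/}
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Example use (commented out because the server runs until interrupted):
# cd "$(latest_backup example-site)" && python3 -m http.server 8000
```

The label example-site in the usage line is a placeholder; substitute the label from your own config.sh.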

# a gist to back up a public-facing website
# author tedsteinmann
source config.sh
# create a file called config.sh defining the following variables:
# --------
# label=[enter a label for downloaded content]
# url=[enter the base URL to download from]
# domain_list=[enter a comma separated list of additional domains to download content from]
# -------
# quote every expansion so labels and URLs containing spaces or special characters survive
filename="${label}_$(date +%Y-%m-%d_%H%M%S)"
mkdir "$filename"
echo "$filename"
# stop if the directory cannot be entered, so nothing is downloaded elsewhere
cd "$filename" || exit 1
wget "$url" -e robots=off --mirror --span-hosts --no-dns-cache \
  --base="$url" --domains="$domain_list" --convert-links --adjust-extension \
  --ignore-case --page-requisites --output-file=log.txt --show-progress \
  --header="Accept: text/html" \
  --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0"
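
The script assumes that config.sh defines all three variables. One way to fail early when a value is missing is sketched below; check_config is a hypothetical helper that is not part of the original script, and it only tests that each variable is non-empty, not that the URL or domains are valid.

```shell
# check_config
# Hypothetical helper: returns non-zero and reports on stderr if label, url,
# or domain_list is empty or unset. The body runs in a subshell so the
# helper's own variables do not overwrite anything in the calling script.
check_config() (
  status=0
  for name in label url domain_list; do
    # Look up the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "config.sh: '$name' is not set" >&2
      status=1
    fi
  done
  [ "$status" -eq 0 ]
)
```

If adopted, a call such as check_config || exit 1 could be placed directly after the source config.sh line, before the download directory is created.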