Created June 12, 2012 15:29 (gist dcosson/2918201)
Archive a website (in this case, tourbie.com)
# Just a note to myself on how to archive a website
mkdir tourbie_archive
cd tourbie_archive
wget --mirror -p -nH -e robots=off --convert-links http://tourbie.com
# --mirror mirrors the site (recursive download, infinite depth, with timestamping)
# -p downloads all the files needed to display each page (images, CSS, etc.)
# --convert-links rewrites links in the downloaded files so they work for local viewing
# -e robots=off optional, ignore robots.txt (on tourbie.com, I had disallowed the static files directory in robots.txt)
# -nH no host directories (won't put everything in a "tourbie.com" folder)
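
For reuse on other sites, the steps above could be wrapped in a small function. This is only a sketch: `archive_dir_name` and `archive_site` are hypothetical helper names, not part of the original note, and the "first host label plus `_archive`" naming rule is an assumption modeled on `tourbie_archive`.

```shell
# Hypothetical helper: derive an archive directory name from a URL,
# e.g. "http://tourbie.com" -> "tourbie_archive" (naming rule is an assumption).
archive_dir_name() {
    host=${1#*://}       # strip the scheme, if any
    host=${host%%/*}     # strip any path after the host
    host=${host#www.}    # strip a leading "www."
    printf '%s_archive\n' "${host%%.*}"   # keep only the first host label
}

# Hypothetical helper: mirror a site into its own directory,
# using the same wget flags as the command above.
archive_site() {
    dir=$(archive_dir_name "$1")
    # The subshell keeps the caller's working directory unchanged.
    mkdir -p "$dir" && ( cd "$dir" && wget --mirror -p -nH -e robots=off --convert-links "$1" )
}

# Usage:
#   archive_site http://tourbie.com
```

Running the download in a subshell means the caller stays in the original directory, so several sites can be archived one after another from the same place.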
Would you mind taking a look at the spoofers in https://gist.github.com/mullnerz/9fff80593d6b442d5c1b ?