@pmeinhardt
Created October 10, 2013 17:12
Download an entire page (including CSS, JS, images) for offline reading, archiving… using wget

If you ever need to download an entire website, perhaps for offline viewing, wget can do the job. For example:

$ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website.org --no-parent www.website.org/tutorials/html/

This command downloads the website www.website.org/tutorials/html/.

The options are:

  • --recursive: download the entire website
  • --domains website.org: don't follow links outside website.org
  • --no-parent: don't follow links outside the directory tutorials/html/
  • --page-requisites: get all the elements that compose the page (images, CSS, and so on)
  • --html-extension: save files with the .html extension (newer wget versions call this --adjust-extension)
  • --convert-links: convert links so that they work locally, offline
  • --restrict-file-names=windows: modify filenames so that they also work on Windows
  • --no-clobber: don't overwrite existing files (useful when an interrupted download is resumed)
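For reference, most of these options also have short forms, so the same command can be written more tersely. This is a sketch using the same placeholder domain as above; `--restrict-file-names` and `--domains` have no short equivalents:

```shell
# Short-flag equivalent of the long-option command above:
#   -r  = --recursive        -nc = --no-clobber
#   -p  = --page-requisites  -E  = --html-extension
#   -k  = --convert-links    -np = --no-parent
cmd="wget -r -nc -p -E -k -np --restrict-file-names=windows --domains website.org www.website.org/tutorials/html/"
echo "$cmd"   # printed rather than executed, to avoid hitting the network
```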

Source: http://www.linuxjournal.com/content/downloading-entire-web-site-wget


ienliven commented Aug 2, 2016

I had to add "-e robots=off" for it to work on this specific site. Thanks!
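The tweak from this comment prepends `-e robots=off`, which tells wget to ignore the site's robots.txt (use responsibly). A sketch against the same placeholder domain, shown as a string rather than executed since running it would hit the network:

```shell
# "-e robots=off" disables wget's robots.txt checks for sites that block crawlers.
cmd="wget -e robots=off --recursive --no-clobber --page-requisites \
--html-extension --convert-links --restrict-file-names=windows \
--domains website.org --no-parent www.website.org/tutorials/html/"
echo "$cmd"
```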

@tikendraw

Will it download CSS and JS?

@parcox

parcox commented May 8, 2025

Thanks, very useful
