@pmeinhardt
Created October 10, 2013 17:12
Download an entire page (including CSS, JS, images) for offline reading, archiving… using wget

If you ever need to download an entire website, perhaps for offline viewing, wget can do the job. For example:

$ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website.org --no-parent www.website.org/tutorials/html/

This command downloads the website www.website.org/tutorials/html/.

The options are:

  • --recursive: download the entire website
  • --domains website.org: don't follow links outside website.org
  • --no-parent: don't follow links outside the directory tutorials/html/
  • --page-requisites: get all the elements that compose the page (images, CSS, and so on)
  • --html-extension: save files with the .html extension (newer wget versions call this --adjust-extension)
  • --convert-links: convert links so that they work locally, offline
  • --restrict-file-names=windows: modify filenames so that they also work on Windows
  • --no-clobber: don't overwrite existing files (useful when an interrupted download is resumed)
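For reference, most of these options also have short forms, so the same command can be written more tersely. This is a sketch using the same placeholder domain as above; `--restrict-file-names` and `--domains` have no short equivalents:

```shell
# Short-flag equivalent of the long-option command above:
#   -r  = --recursive        -nc = --no-clobber
#   -p  = --page-requisites  -E  = --html-extension
#   -k  = --convert-links    -np = --no-parent
cmd="wget -r -nc -p -E -k -np --restrict-file-names=windows --domains website.org www.website.org/tutorials/html/"
echo "$cmd"   # printed rather than executed, to avoid hitting the network
```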

Source: http://www.linuxjournal.com/content/downloading-entire-web-site-wget


ienliven commented Aug 2, 2016

I had to add "-e robots=off" for it to work on this specific site. Thanks!
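The tweak from this comment prepends `-e robots=off`, which tells wget to ignore the site's robots.txt (use responsibly). A sketch against the same placeholder domain, shown as a string rather than executed since running it would hit the network:

```shell
# "-e robots=off" disables wget's robots.txt checks for sites that block crawlers.
cmd="wget -e robots=off --recursive --no-clobber --page-requisites \
--html-extension --convert-links --restrict-file-names=windows \
--domains website.org --no-parent www.website.org/tutorials/html/"
echo "$cmd"
```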

@tikendraw

Will it download CSS and JS?

@parcox

parcox commented May 8, 2025

Thanks, very useful
