[WGET Site Grab] Downloading an Entire Web Site with wget #wget
# Compact
wget -mkEpnp http://example.org
# m = mirror
# k = convert-links
# E = adjust-extension
# p = page-requisites
# np = no-parent
# -nc (no-clobber) is left out here: wget refuses to combine it with the timestamping (-N) that -m turns on.
# One liner
wget --recursive --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains example.org --no-parent --no-clobber example.org
# or (without --no-clobber, which wget rejects together with the -N implied by --mirror)
wget --mirror --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains example.org --no-parent example.org
# Explained
# Annotated for reading only: remove the trailing comments before running, because a backslash followed by a comment does not continue the line.
wget \
--mirror \ # Shortcut for -r -N -l inf --no-remove-listing: recursive, unlimited depth, with timestamping.
# or
--recursive \ # Follow links recursively (default depth is 5; add --level=inf for the whole site).
--page-requisites \ # Get everything needed to display each page (CSS/JS/images).
--adjust-extension \ # Add a suitable extension (.html or .css) to filenames based on their content type.
--span-hosts \ # Allow recursion to visit other hosts, e.g. for assets served from a subdomain.
--convert-links \ # Rewrite links so they work in the local static copy.
--restrict-file-names=windows \ # Escape characters that are not valid in Windows filenames.
--domains example.org \ # Only follow links on this domain and its subdomains.
--no-parent \ # Never ascend above the directory of the starting URL.
--no-clobber \ # Don't re-download files that already exist (for resuming an interrupted run). Use only with --recursive: wget rejects it alongside --mirror.
example.org/whatever/path # The URL to download.
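
The annotated listing above is written for reading rather than pasting, since a comment after a line-continuation backslash breaks the command. As a minimal sketch of one way to keep a comment next to each flag and still have something runnable, the script below builds the argument list one flag per line in a POSIX shell. The `DRY_RUN` switch, the step that derives `--domains` from the URL, and the script name used in the usage note are assumptions made for this example; they are not part of the original notes. It uses the `--mirror` variant and therefore omits `--no-clobber`.

```shell
#!/bin/sh
# Sketch only: assemble the wget command one flag per line so that each
# flag can keep its comment. DRY_RUN and the host derivation are
# illustrative assumptions, not wget features.
set -eu

url="${1:-example.org/whatever/path}"   # start URL; defaults to the placeholder from the notes

# Derive the value for --domains from the URL (a port or user@ part is not handled).
host="${url#*://}"    # drop a leading scheme such as http://, if any
host="${host%%/*}"    # drop the path, leaving only the host name

set --                                          # start from an empty argument list
set -- "$@" --mirror                            # -r -N -l inf --no-remove-listing
set -- "$@" --page-requisites                   # CSS, JS, images needed to render pages
set -- "$@" --adjust-extension                  # add .html/.css based on content type
set -- "$@" --span-hosts                        # allow other hosts (limited by --domains)
set -- "$@" --convert-links                     # rewrite links for local browsing
set -- "$@" --restrict-file-names=windows       # escape characters Windows rejects
set -- "$@" --domains="$host"                   # stay on this domain and its subdomains
set -- "$@" --no-parent                         # never ascend above the start directory
set -- "$@" "$url"                              # the URL goes last

if [ "${DRY_RUN:-1}" = 1 ]; then
  printf '%s\n' "wget $*"                       # print the command instead of running it
else
  wget "$@"
fi
```

Saved under a hypothetical name such as `mirror.sh`, running `sh mirror.sh` only prints the assembled command, while `DRY_RUN=0 sh mirror.sh http://example.org/whatever/path` would actually start the download. Because `--mirror` turns on timestamping, rerunning the script after an interruption skips files whose local copy is already up to date, as long as the server reports modification times.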