Last active
January 14, 2022 04:15
Gist for retrieving a full website using wget, because I always forget the options.
#
# Explanation:
# `--adjust-extension` (`-E`)
# Add the `.html` file extension to any files of type `application/xhtml+xml` or `text/html` that lack it.
# Add the `.css` file extension to any files of type `text/css` that lack it.
#
# `--convert-links` (`-k`)
# After the download completes, rewrite links in the saved documents so they work for local viewing
# (links to downloaded files become relative).
#
# `--level=inf` (`-l inf`)
# Descend an infinite number of levels. Already implied by `--mirror`.
#
# `--mirror` (`-m`)
# Mirror the source (download only "changed" files, based on timestamp).
# Shorthand for `-r -N -l inf --no-remove-listing`.
#
# `--no-parent` (`-np`)
# Do not ascend to the parent directory.
#
# `--page-requisites` (`-p`)
# Download everything a page needs to display properly (images, stylesheets, etc.).
#
# `--random-wait`
# Vary the delay between requests from (0.5 * `wait`) to (1.5 * `wait`) seconds.
#
# `--recursive` (`-r`)
# Recursively download the files. Already implied by `--mirror`.
#
# `--wait=1` (`-w 1`)
# Wait for 1 second between requests (randomised by `--random-wait`).
#
wget \
  --adjust-extension \
  --convert-links \
  --level=inf \
  --mirror \
  --no-parent \
  --page-requisites \
  --random-wait \
  --recursive \
  --wait=1 \
  http://example.com/
# The short version...
wget -E -k -l inf -m -np -p --random-wait -r -w 1 http://example.com/
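
For repeated use, the long-form command above could be wrapped in a small shell function so that only the URL (and optionally the base wait) has to be typed. This is a minimal sketch, not part of the original gist: the function name `mirror_site` and its second argument are hypothetical, and `--level=inf` and `--recursive` are left out on the assumption that `--mirror` already turns them on.

```shell
# Hypothetical helper (not from the original gist).
# Usage: mirror_site URL [WAIT_SECONDS]
mirror_site() {
  # Refuse to run without a URL.
  if [ -z "${1:-}" ]; then
    echo "usage: mirror_site URL [WAIT_SECONDS]" >&2
    return 2
  fi

  # Base wait defaults to 1 second, as in the command above.
  # --mirror implies --recursive and --level=inf, so they are omitted.
  wget \
    --adjust-extension \
    --convert-links \
    --mirror \
    --no-parent \
    --page-requisites \
    --random-wait \
    "--wait=${2:-1}" \
    "$1"
}
```

Calling `mirror_site http://example.com/ 2` would run the same mirror with a base wait of 2 seconds, which `--random-wait` then varies between 1 and 3 seconds.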