@jdmichaud
Created May 2, 2024 11:49
Mirror a site with wget
wget \
--mirror \
--convert-links \
--adjust-extension \
--page-requisites \
--no-parent \
--execute robots=off \
--user-agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0" \
<url>
@jdmichaud
Author

https://news.ycombinator.com/item?id=40496558

  wget-mirror() {
    wget --mirror --convert-links --adjust-extension --page-requisites \
    --no-parent --content-disposition --content-on-error \
    --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0" \
    --restrict-file-names="windows,nocontrol" -e robots=off --no-check-certificate \
    --no-hsts --retry-connrefused --retry-on-host-error --reject-regex=".*\/\/\/.*" "$1"
  }
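
Before pointing a recursive download at a real site, it can help to see the exact command that would run. The sketch below is one possible way to do that and is not part of the original gist: the function name `mirror_site` and the `DRY_RUN` variable are hypothetical, and it only uses flags already shown in the first command above. It rejects anything that is not an http(s) URL and, when `DRY_RUN=1`, prints the command instead of executing it.

```shell
# Hypothetical helper (not from the gist): validate the URL, then either
# print or run the basic mirror command from the top of this gist.
mirror_site() {
  url="$1"

  # Only accept http:// or https:// URLs; anything else is a usage error.
  case "$url" in
    http://*|https://*) ;;
    *)
      echo "usage: mirror_site <http(s) url>" >&2
      return 2
      ;;
  esac

  # Build the argument list once so the dry run and the real run match.
  set -- wget --mirror --convert-links --adjust-extension \
    --page-requisites --no-parent -e robots=off "$url"

  if [ "${DRY_RUN:-0}" = 1 ]; then
    # DRY_RUN is an assumed convention here: print the command only.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 mirror_site https://example.com/docs/` prints the command without touching the network; dropping `DRY_RUN=1` runs it. The extra flags from `wget-mirror` above could be added to the `set --` line in the same way.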
