@Rady
Created September 16, 2021 15:06
Use wget to download all PDF files and images from a list of URLs. Sometimes you need to ignore robots.txt (-e robots=off).
wget -e robots=off -r -l3 -nc -A pdf,htm,jpg -i all-urls-text-file.txt
# Put the URLs in all-urls-text-file.txt, one per line, like:
# http://www.target-site.com/url-1.htm
# http://www.target-site.com/url-2.htm
# http://www.target-site.com/url-3.htm
# ...
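
For readability, the same command can be spelled out with wget's long options. This is only a sketch: the function name fetch_all is a hypothetical helper that is not part of the original gist, and the three URLs are the placeholders from the notes above. In the short form, -e runs a .wgetrc-style command (here robots=off), -r turns on recursive retrieval, -l3 limits recursion to three levels, -nc skips files that already exist locally, -A restricts downloads to the listed suffixes, and -i reads the start URLs from a file.

```shell
# Write the URL list, one URL per line (placeholder URLs from the notes above).
url_file="all-urls-text-file.txt"
cat > "$url_file" <<'EOF'
http://www.target-site.com/url-1.htm
http://www.target-site.com/url-2.htm
http://www.target-site.com/url-3.htm
EOF

# fetch_all is a hypothetical wrapper; it takes the URL list file as $1.
fetch_all() {
  # --execute robots=off : same as -e robots=off (ignore robots.txt)
  # --recursive          : same as -r
  # --level=3            : same as -l3 (follow links at most 3 levels deep)
  # --no-clobber         : same as -nc (do not re-download existing files)
  # --accept             : same as -A (keep only these file suffixes)
  # --input-file         : same as -i (read start URLs from a file)
  wget --execute robots=off \
       --recursive \
       --level=3 \
       --no-clobber \
       --accept pdf,htm,jpg \
       --input-file="$1"
}

# Uncomment to start the download:
# fetch_all "$url_file"
```

Keeping htm in the accept list means the HTML pages themselves stay on disk after the crawl. Because of --no-clobber, rerunning the command after an interruption skips files that were already saved.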