@thangdc94
Last active December 19, 2017 15:49
Crawl webpage

Simple

wget -E -H -k -K -N -p -P newfolder http://example.com/something
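For readability, the same command can be spelled with wget's long option names. This is only a sketch: it builds the argument list and prints the resulting command, and the actual download line is left commented out.

```shell
set --                                      # start with an empty argument list
set -- "$@" --adjust-extension              # -E: save HTML/CSS with a matching .html/.css extension
set -- "$@" --span-hosts                    # -H: allow requests to other hosts (e.g. assets on a CDN)
set -- "$@" --convert-links                 # -k: rewrite links so the copy works offline
set -- "$@" --backup-converted              # -K: keep a .orig backup of each file before converting it
set -- "$@" --timestamping                  # -N: skip files that are not newer than the local copy
set -- "$@" --page-requisites               # -p: also fetch images, CSS and other files the page needs
set -- "$@" --directory-prefix=newfolder    # -P: save everything under newfolder/

# Print the command that would run.
echo "wget $* http://example.com/something"

# Uncomment to start the download:
# wget "$@" http://example.com/something
```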

Multiple

  1. A few websites
wget -E -H -k -K -N -p -P newfolder http://example.com/something http://example.com/stuff
  2. A lot of websites

Create list.txt with one URL per line:

http://example.com/something
http://example.com/stuff

Then pass the file to wget with -i:

wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt
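When the list is long, it may help to clean it before handing it to wget. The sketch below writes the two example URLs, drops blank lines and duplicates, and reports how many URLs remain. The file name list.clean.txt is an assumption made for illustration, and the download line is left commented out.

```shell
# Write the URL list, one URL per line (same URLs as above).
cat > list.txt <<'EOF'
http://example.com/something
http://example.com/stuff
EOF

# Drop blank lines and duplicates so no page is fetched twice.
# list.clean.txt is a hypothetical name chosen for this sketch.
grep -v '^[[:space:]]*$' list.txt | sort -u > list.clean.txt

# Count the remaining URLs (tr strips the padding some wc versions add).
url_count=$(wc -l < list.clean.txt | tr -d ' ')
echo "$url_count URLs queued"

# Uncomment to start the download:
# wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.clean.txt
```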

Custom header

wget -E -H -k -K -N -p --header="Cookie: wordpress_test_cookie=WP+Cookie+check; wordpress_logged_in_86a9106ae65537651a8e456835b316ab=admin%7C1513865967%7CxD5bOmR2pwFHOi53Tt0kWDuAWFvgmXKHRrt9MREG0ZE%7Cc19586752a226033a70791018ca9dd7bd4653c439b77809d6648f1405dfbdfe6; wp-settings-time-1=1513696481" -P addcar http://localhost/addcar/
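A long cookie string is easier to manage in a variable. The sketch below uses a shortened placeholder cookie value (an assumption, not a working session), builds the header once, and prints it. The wget line is left commented out.

```shell
# Placeholder cookie string (assumption): replace it with the value copied
# from your own logged-in browser session.
COOKIE='wordpress_test_cookie=WP+Cookie+check; wp-settings-time-1=1513696481'

# Build the header once so the wget line stays short.
HEADER="Cookie: ${COOKIE}"
printf '%s\n' "$HEADER"

# Uncomment to fetch the page with that header:
# wget -E -H -k -K -N -p --header="$HEADER" -P addcar http://localhost/addcar/
```

Keeping the cookie out of the command line also makes it easier to avoid pasting a live session token into shared notes.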