You can use this in situations like the following:
- Download all images from a website
- Download all videos from a website
- Download all PDF files from a website
For example, to download all PDF files:
$ wget -r -A.pdf http://url-to-webpage-with-pdfs/
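The same pattern should cover the image and video cases above; for instance, the following sketch accepts common image extensions (the extension list and the URL are placeholders you would replace):
$ wget -r -A jpg,jpeg,png,gif http://url-to-webpage-with-images/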
The following is the command to execute when you want to download a full website and make it available for local viewing (a filled-in example follows the option list below).
$ wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
- --mirror : turn on options suitable for mirroring.
- -p : download all files that are necessary to properly display a given HTML page.
- --convert-links : after the download, convert the links in the documents for local viewing.
- -P ./LOCAL-DIR : save all the files and directories to the specified directory.
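As a filled-in example, mirroring a site into a local directory called ./example-mirror might look like this (the URL and directory name are hypothetical placeholders):
$ wget --mirror -p --convert-links -P ./example-mirror https://example.com/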
First, store all the URLs you want to download in a text file:
$ cat > download-file-list.txt
URL1
URL2
URL3
URL4
Next, pass download-file-list.txt to wget as an argument using the -i option, as shown below.
$ wget -i download-file-list.txt
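If you would rather build the list non-interactively, a here-document works as well; the URLs below are placeholders, and the extra -c flag simply resumes any partially downloaded files on a re-run:
$ cat > download-file-list.txt <<'EOF'
http://example.com/file1.zip
http://example.com/file2.zip
EOF
$ wget -c -i download-file-list.txt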
This downloaded the entire website for me:
wget --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://site/path/
- If files are being skipped because of robots.txt rules (e.g. rules aimed at search engines), you also have to add:
-e robots=off
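If you are worried about hammering the server, the same command can be throttled with the standard --wait and --limit-rate options (--random-wait scales the --wait value, so giving it a base wait makes it effective); the values here are only examples:
wget --no-clobber --convert-links --random-wait --wait=2 --limit-rate=200k -r -p -E -e robots=off -U mozilla http://site/path/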
I was trying to download zip files linked from Omeka's themes page - pretty similar task. This worked for me:
wget -A zip -r -l 1 -nd http://omeka.org/add-ons/themes/
- -A: only accept zip files
- -r: recurse
- -l 1: one level deep (i.e., only files directly linked from this page)
- -nd: don't create a directory structure, just download all the files into this directory.
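A variant of the same idea, in case you also want tarballs or want to be sure the recursion never climbs above the starting directory (the extra extension is just an example; -np is --no-parent):
wget -A zip,tar.gz -r -l 1 -nd -np http://omeka.org/add-ons/themes/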
All the answers with -k, -K, -E etc. options probably haven't really understood the question, as those are for rewriting HTML pages to make a local structure, renaming .php files and so on. Not relevant.
To literally get all files except .html etc:
wget -R html,htm,php,asp,jsp,js,py,css -r -l 1 -nd http://yoursite.com
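On newer wget versions you could also combine this with --reject-regex if, say, you want to skip any URL containing a query string; the pattern below is only an illustration:
wget -R html,htm,php,asp,jsp,js,py,css --reject-regex '.*\?.*' -r -l 1 -nd http://yoursite.com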