Download all AWS whitepapers
wget -O w1.txt http://aws.amazon.com/whitepapers/ && for i in `awk -F'"' '$0=$2' w1.txt | grep pdf | grep -v http`; do wget http:$i ; done
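The one-liner packs several steps together. Here is a commented breakdown as a sketch; the here-doc sample stands in for the real page (whose markup has since changed), and `echo` replaces the actual `wget` download so the logic can be inspected offline:

```shell
# Sample of the old page markup: PDF links were protocol-relative ("//host/path").
cat > w1.txt <<'EOF'
<a href="//media.amazonwebservices.com/AWS_Overview.pdf">Overview</a>
<a href="http://example.com/not-local.pdf">External</a>
EOF

# awk -F'"' '$0=$2' splits each line on double quotes and keeps field 2,
# which is the href value. grep pdf keeps only PDF links; grep -v http
# drops absolute URLs, leaving the protocol-relative //... paths that
# the loop then completes with an "http:" prefix.
for i in $(awk -F'"' '$0=$2' w1.txt | grep pdf | grep -v http); do
  echo "would fetch: http:$i"   # the original gist runs: wget http:$i
done
```

Note that `$0=$2` is an awk assignment used as a pattern: it is truthy whenever field 2 is non-empty, so those lines are printed with `$0` already rewritten to the href.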
@austincloudguru

austincloudguru commented Feb 2, 2017

Thanks, this was extremely helpful. Some of the filenames now have spaces, and some links put target before href, so they don't awk right and get missed. Maybe something like:

wget -O w1.txt http://aws.amazon.com/whitepapers/ && for i in grep -o //[^[:space:]]*.pdf w1.txt|grep whitepaper|sed -e 's/ /%20/g'; do wget http:$i ; done

@mpursley

mpursley commented Sep 5, 2018

@austincloudguru.. I think you missed the backticks or '$()' around the grep... this works for me..

wget -O w1.txt http://aws.amazon.com/whitepapers/ && for i in $(grep -o //[^[:space:]]*.pdf w1.txt|grep whitepaper|sed -e 's/ /%20/g'); do wget http:$i ; done
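For the edge cases raised above (spaces in filenames, target before href), a sketch that handles both more defensively than a field-position awk might look like the following. This is my own variant, not from the thread: it anchors the match on `href="..."` so attribute order doesn't matter, and percent-encodes spaces afterwards. The here-doc sample is illustrative, and `echo` stands in for `wget`:

```shell
# Illustrative input: attribute order varies and one filename has a space.
cat > w1.txt <<'EOF'
<a target="_blank" href="//media.amazonwebservices.com/whitepapers/My Paper.pdf">A</a>
<a href="//media.amazonwebservices.com/whitepapers/Other.pdf" target="_blank">B</a>
EOF

# sed extracts only the href value regardless of surrounding attributes;
# the second sed encodes spaces so the resulting URL is fetchable.
sed -n 's/.*href="\(\/\/[^"]*whitepaper[^"]*\.pdf\)".*/\1/p' w1.txt \
  | sed -e 's/ /%20/g' \
  | while read -r i; do
      echo "would fetch: http:$i"   # the thread's version runs: wget http:$i
    done
```

The `grep -o` approach in the comment above works too; the `href="..."` anchor is just a belt-and-braces way to avoid matching a stray `//...pdf` substring elsewhere in a line.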

@mulatinho

Excellent sir :) 👍

@wang1209

@mpursley works for me (with --no-check-certificate added for the redirect to https):
wget -O w1.txt --no-check-certificate http://aws.amazon.com/whitepapers/ && for i in $(grep -o //[^[:space:]]*.pdf w1.txt|grep whitepaper|sed -e 's/ /%20/g'); do wget http:$i ; done

@chrisdlangton

No longer possible with wget/curl, as the PDF links are now lazy-loaded via JavaScript.
