Last active
April 12, 2016 11:31
Download a load of files from the web and push to an s3 bucket
#!/bin/bash
#
# DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
# Version 2, December 2004
# Copyright (C) 2004 Sam Hocevar <[email protected]>
# Everyone is permitted to copy and distribute verbatim or modified
# copies of this license document, and changing it is allowed as long
# as the name is changed.
# DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
# TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
# 0. You just DO WHAT THE FUCK YOU WANT TO.
#
# USAGE
#
# You need the s3cmd CLI tool for this script to work, so make sure it is installed and configured - http://s3tools.org/s3cmd
# Type `s3cmd ls`; if you can see your buckets, you're OK.
#
# The "img.dat" file that the while loop reads must contain nothing but a list of http:// links, one per line,
# e.g.
# http://example.com/imageOfCat.png
# http://example.com/photoOfASlipper.jpg
# http://example.com/pictureOfCatEnjoyingBeingInASlipper.pdf
#
# Holy Batman's spectacles - there's actually no reason why this has to be limited to images,
# but I've named it now, so I'm not changing it. Should work for any file type.
#
# Oh actually sure, I've renamed it, that didn't take long at all.
#
s3path=$1
[[ -z "$s3path" ]] && { echo "Where the hell am I supposed to put your stuff? I need a bucket path man" ; exit 1; }
# the file name is appended straight onto the path, so it should end with a slash (for example s3://my-bucket/some/dir/)
# read file line by line (file of http://place/where/some/images/be/at.png links)
while IFS= read -r line
do
    # skip blank lines
    [[ -z "$line" ]] && continue
    # reverse line, get 1st part of it (Delimiter as /, so you get gnp.ta) then re-reverse it to get file name
    file="$(printf '%s\n' "$line" | rev | cut -d'/' -f 1 | rev)"
    echo "sort this file : $file"
    # download the file, overwriting with file name (to stop at.png.1, at.png.2 etc); skip the upload if the download fails
    wget "$line" -O "$file" || { echo "download failed: $line" >&2; rm -f -- "$file"; continue; }
    # push to s3
    s3cmd put "$file" "$s3path$file"
    # get rid of the old files, just to be clean and tidy
    echo "clean up and remove old files"
    rm -- "$file"
done < img.dat
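
The script gets the file name by reversing the URL, cutting on the first `/`, and reversing it back. A possible alternative, sketched here rather than taken from the original, is plain shell parameter expansion, which needs no `rev` or `cut` and can also drop a trailing `?query` or `#fragment`. The function name `url_to_filename` and the query-string handling are assumptions added for illustration.

```shell
# Hypothetical helper (not part of the original script): derive a local
# file name from a URL using only shell parameter expansion.
url_to_filename() {
    local url=$1
    url=${url%%\#*}              # drop any #fragment
    url=${url%%\?*}              # drop any ?query string
    printf '%s\n' "${url##*/}"   # keep everything after the last slash
}

url_to_filename "http://example.com/imageOfCat.png"
url_to_filename "http://example.com/photoOfASlipper.jpg?size=large"
```

Inside the loop this would be used as `file="$(url_to_filename "$line")"`. A URL that ends in `/` yields an empty name, so a real script would want to check for that before calling wget.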