Created
August 30, 2011 00:09
Download all booths from a dailybooth user
#!/bin/bash
# Replace USERNAMEHERE with the username of the user whose booths you'd like to copy
# (there are 4 instances you need to find/replace).
: > out.csv  # start with an empty output file
CUR_PAGE=1
while true; do
  curl --silent "http://dailybooth.com/USERNAMEHERE/quilt/page/$CUR_PAGE" > page.html
  # Stop at the first page that contains no booth links.
  if ! grep "div><a href=\"/USERNAMEHERE/" page.html > /dev/null; then
    echo "Done"
    break
  else
    echo "Page $CUR_PAGE"
    for id in `perl -n -e '/\/USERNAMEHERE\/(\d+).*?\// && print "$1\n"' page.html`; do
      echo " $id"
      # One tab-separated row per booth: ID, timestamp, image URL, blurb.
      echo -en $id >> out.csv
      echo -en "\t" >> out.csv
      curl --silent "http://dailybooth.com/USERNAMEHERE/$id" > details.html
      cat details.html | perl -0777 -ne 'print "$1" while /<p class="when">\n {8}(.*?)\n {6}<\/p><p class="views">/gs' >> out.csv
      echo -en "\t" >> out.csv
      cat details.html | perl -0777 -ne 'print "$1" if /(http:\/\/cdn\d.dailybooth.com\/\d\/pictures\/large\/.*?)\"/' >> out.csv
      echo -en "\t" >> out.csv
      cat details.html | perl -0777 -ne 'print "$1" while /id="blurb">\n {4}(.*?)\n/gs' >> out.csv
      echo -en "\t" >> out.csv
      echo -en "\n" >> out.csv
    done
    CUR_PAGE=$(($CUR_PAGE+1))
  fi
done
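It will output the URLs to the images (and details) into out.csv, one tab-separated row per booth: booth ID, timestamp, image URL, blurb. The script itself doesn't download any images; you'll then need to download the URLs using curl or similar.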
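For the download step, something like the following might work. It is only a rough sketch, assuming out.csv has the tab-separated layout the script writes (booth ID in column 1, image URL in column 3). The function names, the `booths` output directory, and the `.jpg` extension are made up for illustration and are not part of the original script.

```shell
#!/bin/bash
# Sketch: download every image listed in out.csv (tab-separated).
# Assumption: column 1 is the booth ID and column 3 is the image URL.

# Print "ID<TAB>URL" for every row that has both an ID and a URL.
list_downloads() {
  awk -F '\t' '$1 != "" && $3 != "" { print $1 "\t" $3 }' "$1"
}

# Download each image into a directory (hypothetical default: booths/).
# Assumption: the images are JPEGs, so files are saved as ID.jpg.
download_all() {
  local csv="$1" dir="${2:-booths}"
  mkdir -p "$dir"
  list_downloads "$csv" | while IFS=$'\t' read -r id url; do
    echo "Downloading $id"
    curl --silent --fail --output "$dir/$id.jpg" "$url" \
      || echo "  failed: $url" >&2
  done
}

# Run only when executed directly with a CSV argument,
# e.g.: bash download.sh out.csv
if [ "${BASH_SOURCE[0]}" = "$0" ] && [ -n "$1" ]; then
  download_all "$1" "$2"
fi
```

Saved as, say, download.sh next to out.csv, it would be run as `bash download.sh out.csv`, and the images would end up in the `booths` directory. Rows where the scraper found no image URL are skipped.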
Alright, cool, found the file. I'll go looking for how to use it to do what you say.
It only outputs the booth ID numbers and not the actual URLs. How are we supposed to download the URLs themselves with this information? I've been researching it and it seems that cURL only provides the option of sequential downloads for similar URLs but no option to somehow input the IDs given in the csv file in order to get the booths.
I'm new, but how do I use this?
I can't figure out where it's supposed to save the booths, and I can't seem to find the files by searching. I have gotten it to run, but it doesn't seem to do anything except print out URL numbers and page numbers.