@guanix
Created February 26, 2011 20:59
#!/bin/sh
# FACTCHECK: Double-check you got your stuff right.
# Writes missing-hotlist.txt, the new hotlist of users you need to re-download.
# Run it in the script directory, it'll go down and smash the furniture properly.
: > missing-hotlist.txt
cd videos || exit 1
for each in *
do
  if [ -d "$each" ]
  then
    cd "$each" || continue
    echo "[.] $each...."
    for html in `find . -type f -name '*-*.html'`
    do
      number=`echo "$html" | sed 's/.*-//g' | sed 's/\.html//g'`
      if [ "$number" -gt 1000 ]
      then
        # No .flv and no .notitle marker for this video ID means we missed it.
        if [ ! -e *-$number.flv ] && [ ! -e "$number.notitle" ]
        then
          UTFSUCKSBALLS=1
        fi
      fi
    done
    if [ "$UTFSUCKSBALLS" = "1" ]
    then
      echo "$each" >> ../../missing-hotlist.txt
      echo " $each needs a redo."
      UTFSUCKSBALLS=
    fi
    cd ..
  fi
done
#!/bin/sh
#
# GRAPEFRUIT: Given a user, downloads all that user's videos.
# THIS IS THE AXEL VERSION. BEWARE: IT WILL EAT ALL OF YOUR BANDWIDTH LIKE COOKIE MONSTER, BUT BETTER.
USER=$1
PAGE=$2
# Path to axel
AXEL="$(which axel)"
if [ ! -x "$AXEL" ]; then
  echo "DUDE, MOTHERFUCKING INSTALL AXEL." >&2
  exit 1
fi
if [ ! -d videos ]
then
  mkdir videos
fi
if [ ! -d "videos/$USER" ]
then
  mkdir "videos/$USER"
fi
# Using the PAGE argument, know how many times we've been called.
if [ "$PAGE" ]
then
  KEEPINGUPAPPEARANCES=$(($PAGE + 1))
  echo "[!] $USER Page $KEEPINGUPAPPEARANCES."
  cd "videos/$USER" || exit 1
  VICTIM="$USER-$PAGE.html"
  # Yahoo's listing pages are offset in steps of 10 videos.
  SKIP=$(($PAGE * 10))
  wget -q -O "$VICTIM" "http://video.yahoo.com/mypage/video?s=$USER&o=$SKIP"
else
  echo "[!] $USER"
  cd "videos/$USER" || exit 1
  VICTIM="$USER.html"
  wget -q -O "$VICTIM" "http://video.yahoo.com/mypage/video?s=$USER"
  PAGE=0
fi
# Extract watch-page URLs from the listing page.
for fruitmeat in `grep '\.com/watch' "$VICTIM" | sed 's/.*href="//g' | cut -f1 -d'"' | sort -u`
do
  # CUTESY is "user-videonumber"; VNUM is just the numeric video ID.
  CUTESY="`echo $fruitmeat | sed 's/.*watch\///g' | sed 's/\//-/g'`"
  VNUM="`echo $CUTESY | tr "-" " " | awk '{print $2}'`"
  # Already have both the page and the video? Skip it entirely.
  if [ -f *-$VNUM.flv ] && [ -f *-$VNUM.html ]; then
    echo "EXISTS: `ls *-$VNUM.flv`"
    continue
  fi
  if [ -f "$VNUM.notitle" ]; then
    echo "PREVIOUS NOTITLE: $VNUM"
    continue
  fi
  wget -q -O "$CUTESY.html" "$fruitmeat"
  # BIND is not set in this script; if present in the environment it is passed through to youtube-dl.
  title="`python ../../youtube-dl -I $BIND -e $fruitmeat | tr \"/\" \"_\"`"
  if [ -z "$title" ]
  then
    echo "NOTITLE: $VNUM $title"
    touch "$VNUM.notitle"
    continue
  fi
  sleep 2
  url="`python ../../youtube-dl -I $BIND -g $fruitmeat`"
  if [ -z "$url" ]
  then
    echo "NOURL: $VNUM $title"
    touch "$VNUM.notitle"
    continue
  fi
  echo "$VNUM-$title"
  fileflv="$title-$VNUM.flv"
  filetmp="video-temp-$VNUM.tmp"
  # Download to a temp file; only rename to the final name once axel finishes cleanly.
  "$AXEL" -n 3 -a -o "$filetmp" "$url" && mv -v "$filetmp" "$fileflv"
done
# If there's a link for "Older", then call myself again, but this time add a page.
OLDIECHECK=`grep ">Older >" "$VICTIM" | grep href`
sleep 1
if [ "$OLDIECHECK" ]
then
  echo "Yep! Another page of this!"
  cd ../..
  PAGE=$(($PAGE + 1))
  # exec replaces this shell so paging doesn't stack up extra copies of sh.
  exec "$0" "$USER" "$PAGE"
fi

guanix commented Feb 26, 2011

Changes:

  • Don't run axel in the background.
  • Don't run youtube-dl at all if the .html and .flv files for a video both exist; this allows relatively fast resumes, because only the user's index page(s) are re-downloaded.
  • If youtube-dl cannot find a title or URL, touch a file to record this fact. The next factcheck run will see this and not consider it an error.
  • axel downloads to a temporary file and moves it to the final filename after the download finishes (see the sketch after this list). This saves the file in case something is terribly unescaped in the title/flv filename.
  • If there is more than one page, exec the next invocation of grapefruit so we don't have many copies of sh running.
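
A minimal standalone sketch of the marker-file and temp-then-rename patterns from the last two points, with the argument handling and final filename simplified for illustration (grapefruit itself names the file "$title-$VNUM.flv"):

#!/bin/sh
# Sketch only: assumes a numeric video ID in $1 and a direct flv URL in $2.
VNUM=$1
URL=$2

# A zero-byte "$VNUM.notitle" marker records a permanent failure, so later
# grapefruit runs and factcheck skip this ID instead of retrying it forever.
if [ -f "$VNUM.notitle" ]; then
  echo "skipping $VNUM: previously marked notitle"
  exit 0
fi

# Download into a temp file and only rename it once axel exits successfully,
# so an incomplete download or a badly escaped title never ends up under the final .flv name.
TMP="video-temp-$VNUM.tmp"
if axel -n 3 -a -o "$TMP" "$URL"; then
  mv -v "$TMP" "$VNUM.flv"
else
  echo "download failed for $VNUM, leaving $TMP for inspection" >&2
  exit 1
fi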
