@nonsleepr
Last active October 25, 2015 09:09
FutureLearn Video downloader
#!/bin/bash
#
# Usage:
# > futurelearn_dl.sh [email protected] password course-name week-id
# where *[email protected]* and *password* are your credentials,
# *course-name* is the course name from the URL,
# and *week-id* is the ID from the URL.
#
# E.g. to download all videos from the page https://www.futurelearn.com/courses/corpus-linguistics/todo/238
# run the following command:
# > futurelearn_dl.sh [email protected] password corpus-linguistics 238
#
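# All four arguments are required
if [ $# -ne 4 ]; then
echo "Usage: $0 email password course-name week-id" >&2
exit 1
fi
email=$1
password=$2
course=$3
weekid=$4
# Path suffix appended to the vzaar download URL in dlvid
HD=/hd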
# Pull the sign-in page (saving its cookies) and extract the auth token
authToken=$(curl -s -L -c cookies.txt 'https://www.futurelearn.com/sign-in' | \
grep -Po "(?<=authenticity_token\" value=\")([^\"]+)")
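# Find the vzaar video ID on a step page and download that video
function dlvid {
local vzid vzurl
# Take the first match in case the page contains more than one video ID
vzid=$(curl -s -b cookies.txt "$1" | grep -Po '(?<=video-)[0-9]+' | head -n 1)
[ -n "$vzid" ] || return 0
vzurl=https://view.vzaar.com/${vzid}/download${HD}
curl -O -J -L "$vzurl"
}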
# Post the credentials; --data-urlencode URL-encodes each value, including the auth token.
# The arguments are quoted so that spaces or special characters in a value survive intact.
curl -X POST -s -L -e 'https://www.futurelearn.com/sign-in' -c cookies.txt -b cookies.txt \
--data-urlencode "email=$email" \
--data-urlencode "password=$password" \
--data-urlencode "authenticity_token=$authToken" 'https://www.futurelearn.com/sign-in' > /dev/null
# Download the course week page, pick out the links to its video steps, and download each video
curl -s -L -b cookies.txt "https://www.futurelearn.com/courses/${course}/todo/${weekid}" | \
grep -B8 'headline.*video' | grep -o '/courses[^"]*' | \
while read -r line; do
url=https://www.futurelearn.com${line}/progress
dlvid "$url"
done
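
The token-extraction step near the top of the script relies on `grep -P`, which not every grep build supports (BSD grep on macOS, for instance, lacks it). As a rough sketch, and not part of the original gist, the same extraction could be written with `sed` and tried offline. The `extract_token` name and the sample markup are made up for illustration, and the real sign-in page may be laid out differently.

```shell
# Hypothetical helper: print the value of the first authenticity_token input
# read from stdin. Like the grep pattern in the script, it assumes the
# attributes appear in the order name="authenticity_token" value="...".
extract_token() {
  sed -n 's/.*authenticity_token" value="\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up sample markup, for illustration only.
sample='<input type="hidden" name="authenticity_token" value="abc123" />'
printf '%s\n' "$sample" | extract_token
```

In the script this would be used as `authToken=$(curl -s -L -c cookies.txt 'https://www.futurelearn.com/sign-in' | extract_token)`.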
@mjbright

@thelostelite It looks like you just don't have curl installed. You need to rerun the Cygwin setup.exe and select the curl package (in the 'Net' category).
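
A quick way to confirm which tools are missing before running the script is sketched below. The `have_cmd` helper is a hypothetical name, and the list of commands is just what the script above calls.

```shell
# Hypothetical helper: succeed if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report on each external tool the download script calls.
for cmd in curl grep head; do
  if have_cmd "$cmd"; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```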

@mjbright

OK, I couldn't put this off any longer ...

I scrapped the bash script (it still wasn't achieving login), although it is still there in old commits of the repo,
https://github.com/mjbright/futurelearn-dl

and we now have a Python 3 version.

The current status is that I'm successfully obtaining downloadable mp4 and pdf URLs.
By the time you read this it should be doing basic downloads ... and then I have to do something else on this Sunday ...

I hope this helps people.

I won't have much time to update this before December, but hope to evolve it.

The repo is here:
https://github.com/mjbright/futurelearn-dl

The biggest todo items once downloading is implemented are:

  • fixing the "occasional" Unicode errors (tricky)
  • adding proper command-line arguments
  • handling a week at a time
  • not repeating downloads

@mjbright

OK, I've published something usable (for me ... YMMV).

It downloads most mp4 and pdf files for a course.

It can download just one week, and it avoids downloading files which already exist
(it skips the download if the destination file exists ... so be careful if you move or rename files).
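
The skip-if-exists idea can be sketched in shell roughly as follows. This is only an illustration of the behaviour described above, not the code in the repo (which is Python), and `fetch_once` is a hypothetical name.

```shell
# Hypothetical helper: download URL ($1) to DEST ($2) unless DEST already exists.
fetch_once() {
  if [ -e "$2" ]; then
    echo "skip: $2"
    return 0
  fi
  # Download under a temporary name first, so an interrupted transfer
  # is not mistaken for a complete file on the next run.
  curl -s -f -L -o "$2.part" "$1" && mv "$2.part" "$2"
}
```

Because the check only looks at the destination path, a file that has been moved or renamed is downloaded again.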

There are still some unhandled Unicode errors, and it still needs proper command-line argument handling.

I'll stop spamming here now.
Follow the repo if you're interested.
https://github.com/mjbright/futurelearn-dl

NOTE: I won't have much time to look at issues until December, but please file issues anyway.

Functionality issues, or just comments on bad style, are more than welcome ...
