@pscollins
Created January 13, 2015 04:02
Scrape chalk
BASE_URL="https://chalk.uchicago.edu/webapps"
AUTH_URL="$BASE_URL/login/?action=relogin"
SCRAPE_URL="$BASE_URL/blackboard/content/launchLink.jsp?course_id=_137801_1&tool_id=_139_1&tool_type=TOOL&mode=view&mode=reset"
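# Log in to the server. This only needs to be done once; the session cookies are saved to cookies.txt.
# Double quotes are required so that $1 (user ID) and $2 (password) are expanded by the shell.
wget --keep-session-cookies --save-cookies cookies.txt --post-data "user_id=$1&password=$2" "$AUTH_URL"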
# Now grab the page or pages we care about.
wget --load-cookies cookies.txt -m "$SCRAPE_URL"
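
The login step passes the user ID and password to `--post-data` verbatim, so a password containing `&`, `=`, or a space would likely corrupt the form body. Below is a minimal sketch of percent-encoding the credentials first. The helper names `urlencode` and `build_post_data` are hypothetical (they are not part of the original script), and the sketch assumes ASCII input.

```shell
# Percent-encode a string for an application/x-www-form-urlencoded body.
# Hypothetical helper; runs in a subshell so its variables stay local.
urlencode() (
    LC_ALL=C          # make the a-z / A-Z ranges below ASCII-only
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}           # everything after the first character
        c=${s%"$rest"}        # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;                    # unreserved: keep as-is
            *) out=$out$(printf '%%%02X' "'$c") ;;            # everything else: %XX
        esac
        s=$rest
    done
    printf '%s' "$out"
)

# Build the login form body from a user ID and a password.
# The field names user_id and password are the ones used by the script above.
build_post_data() {
    printf 'user_id=%s&password=%s' "$(urlencode "$1")" "$(urlencode "$2")"
}
```

With these in place, the login line could pass `--post-data "$(build_post_data "$1" "$2")"` instead of interpolating `$1` and `$2` directly.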