The attached Python scripts take care of the following tedious work, so you can quickly get started on real work with the corpus:
- Respect the Twitter API rate limits and throttle API hits.
- Skip tweet IDs that have already been expanded, so you can resume tweet expansion after stopping midway.
- Parse the API response and store each field in the corresponding column of the sqlite3 database.
- Gracefully handle exceptions while acquiring tweets from the API.
- Wrap version 1.1 of the Twitter API.
- Start from a specified tweet ID, assuming the input file is sorted in increasing order of tweet ID.
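To illustrate how throttling, resuming, and error handling can fit together, here is a minimal sketch. It is not the code from the attached scripts: the `tweets` table layout, the function names `pending_ids` and `expand`, and the `fetch` callable (a stand-in for whatever performs the actual API request) are all assumptions made for the example. The 5-second default interval is a placeholder; set it from the rate limit that applies to your endpoint.

```python
import sqlite3
import time


def pending_ids(conn, tweet_ids, start_id=None):
    """Yield IDs at or after start_id that are not yet in the database.

    Assumes tweet_ids is sorted in increasing order, as the input file is.
    """
    for tweet_id in tweet_ids:
        if start_id is not None and tweet_id < start_id:
            continue
        row = conn.execute(
            "SELECT 1 FROM tweets WHERE id = ?", (tweet_id,)
        ).fetchone()
        if row is None:
            yield tweet_id


def expand(conn, tweet_ids, fetch, min_interval=5.0, start_id=None,
           sleep=time.sleep):
    """Fetch each pending tweet, store it, and pause between API hits.

    fetch is a hypothetical callable taking a tweet ID and returning its
    text; it may raise on deleted or protected tweets.
    Returns the number of tweets stored during this run.
    """
    # Hypothetical two-column schema; the real scripts store more fields.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT)"
    )
    stored = 0
    for tweet_id in pending_ids(conn, tweet_ids, start_id):
        try:
            text = fetch(tweet_id)
        except Exception as exc:
            # Log and move on so one bad tweet does not abort the whole run.
            print(f"skipping {tweet_id}: {exc}")
        else:
            conn.execute(
                "INSERT INTO tweets (id, text) VALUES (?, ?)", (tweet_id, text)
            )
            conn.commit()  # commit per tweet so an interrupted run loses nothing
            stored += 1
        sleep(min_interval)  # throttle: at most one request per interval
    return stored
```

Because every stored tweet is committed immediately and `pending_ids` checks the database before each request, rerunning `expand` on the same input only hits the API for IDs that are still missing. IDs whose fetch failed are not recorded in this sketch, so they are retried on the next run.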