I'll be working from these notes:
- Regular expressions (I used to use these notes in RWET)
- Strings and regular expressions (from a class I teach at Columbia, "Data and Databases"; scroll down to "regular expressions")
Source texts we'll use (download these to your working directory):
- Sea Rose by H.D.
- SOWPODS (a Scrabble dictionary)
- All of the subject lines from the EnronSent corpus (more on EnronSent)