| title | author | date |
| --- | --- | --- |
| A Gentle Introduction to Text Analysis and Topic Modeling | Shawn Graham | February 3rd, 2016 |
So much data has been made available online for historians: everything from court trials (The Old Bailey Online) to newspaper articles (Scissors and Paste), to all 71 volumes of the Jesuit Relations, to over 3.9 million tweets sent during the 2015 Canadian federal election (Ian Milligan & Nick Ruest).
How do we begin to deal with this data? We treat it the same way we treat all of our historical information: we consider its context and the patterns we find within it. Happily, we don't have to do this alone: we can 'not read' this information and see what patterns stand out. In a way, it's a bit like those 'magic-eye' cartoons the newspapers used to print. If you squinted the right way, p