Command-line gender analysis of New York Times articles, using wget, regex, and a highly simplified domain-specific methodology
This is a short proof of concept showing how to use pattern matching and batch downloading (plus a careful reading of New York Times style) to perform a quick gender analysis of New York Times articles by website section.
Sloppy analysis: Across virtually all of the nytimes.com section fronts, mentions of men outnumber mentions of women. The one exception is Weddings (roughly 200 "Ms." vs. 190 "Mr."). Previous explanations for this phenomenon have generally boiled down to this: more men are in the kinds of positions that get written about.
- See the full table of results here. Or look at the download log.
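
As a rough illustration of the pattern-matching step, here is a minimal Python sketch that counts courtesy titles in a piece of text. It is not the method used to produce the results above, only one way the idea could look. The function name `count_honorifics`, the two regular expressions, and the sample sentence are all assumptions made for this example, and it expects plain text rather than raw HTML.

```python
import re

# Assumption: a courtesy title followed by a capitalized name marks one mention.
# "Mr." must be followed by a period, so it never matches inside "Mrs."
MALE = re.compile(r"\bMr\.\s+[A-Z]")
# "Ms.", "Mrs." and "Miss" (no period) are all counted as mentions of women.
FEMALE = re.compile(r"\b(?:Ms|Mrs|Miss)\.?\s+[A-Z]")


def count_honorifics(text):
    """Return how many male and female courtesy titles appear in text."""
    return {
        "men": len(MALE.findall(text)),
        "women": len(FEMALE.findall(text)),
    }


# Hypothetical sample text, not taken from any real article.
sample = "Mr. Smith met Ms. Jones. Later, Mr. Smith left with Mr. Lee."
print(count_honorifics(sample))  # {'men': 3, 'women': 1}
```

Counting titles this way is crude: it misses people referred to without a courtesy title (for example "Dr." or a bare surname), and it counts every repeated mention of the same person.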