@neurosnap
Created May 4, 2015 14:51
Authors on Freep Article
# -*- coding: utf-8 -*-
from __future__ import print_function
import re
import requests
# Download the article web page
# timeout keeps the script from hanging indefinitely if the server never responds
r = requests.get('http://www.freep.com/story/life/advice/2015/05/04/mother-law-meant-selfish/26706999/', timeout=10)
print("Grabbing data from {}".format(r.url))
r.raise_for_status()
amy_pattern = re.compile('amy dickinson', re.IGNORECASE)
carolyn_pattern = re.compile('carolyn hax', re.IGNORECASE)
amys = amy_pattern.findall(r.text)
carolyns = carolyn_pattern.findall(r.text)
# Number of times each author's name appears in the raw HTML of the page
# (r.text is the full response body, markup included)
print("Amy Dickinson: {}".format(len(amys)))
print("Carolyn Hax: {}".format(len(carolyns)))
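
The counting step above can be pulled into a small helper so that more names can be checked without repeating the compile/findall pair. This is only a sketch: `count_mentions` and the `sample` string are hypothetical names introduced here, not part of the original script, and the sample text stands in for `r.text` so the snippet runs without a network connection.

```python
import re


def count_mentions(text, names):
    """Return {name: count} of case-insensitive occurrences of each name in text."""
    counts = {}
    for name in names:
        # re.escape keeps characters such as "." in a name from acting as regex syntax
        pattern = re.compile(re.escape(name), re.IGNORECASE)
        counts[name] = len(pattern.findall(text))
    return counts


# Hypothetical sample text standing in for r.text
sample = "By Carolyn Hax. CAROLYN HAX answers readers; carolyn hax is syndicated."
result = count_mentions(sample, ["Amy Dickinson", "Carolyn Hax"])
print(result)
```

With the real page, `count_mentions(r.text, [...])` would take the place of the two separate patterns.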
@neurosnap
Author

Results:

Amy Dickinson: 0
Carolyn Hax: 14
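
These counts come from the raw HTML, so a name that appears in metadata or inline scripts is counted along with the visible byline. The sketch below shows one possible way to count only the visible text, using the standard library's `html.parser` (Python 3, whereas the script above also supports Python 2). The `VisibleText` class, the `visible_text` helper, and `sample_html` are hypothetical illustrations; the actual markup of the article page is not known here.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Attribute values (e.g. <meta content="...">) never reach this method
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical page fragment: the name appears in metadata, a script, and the body
sample_html = (
    '<html><head><meta name="author" content="Carolyn Hax">'
    '<script>var byline = "Carolyn Hax";</script></head>'
    '<body><p>By Carolyn Hax</p></body></html>'
)

pattern = re.compile("carolyn hax", re.IGNORECASE)
raw_count = len(pattern.findall(sample_html))
visible_count = len(pattern.findall(visible_text(sample_html)))
print("raw: {}, visible: {}".format(raw_count, visible_count))
```

On the real page, passing `r.text` to `visible_text` before matching would show how many of the matches are in text a reader actually sees.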
