@dwillis
Created January 12, 2013 16:01
A Python script to parse Senate votes, by Aaron Swartz, 2005. Thank you, and may you find peace.
#Module to retrieve and parse votes from Senate web site, output to stdout
import urllib, re
f=open('senate03.txt','w')
sr = re.compile('<question>(.*?)</td>.*?<b> Vote Date: </b><.*?>(.*?)</td>.*?<b> Required For Majority: </b><td class="contenttext">(.*?)</td></td><td valign="top" class="contenttext"><b> Vote Result: </b><.*?>(.*?)</td></td>', re.S)
sr2 = re.compile('</span>\n<TABLE width="100%"(.*?)</TABLE>', re.S)
sr3 = re.compile('\s+(?:<br>)?\s*(?:</TD><TD class="contenttext" width="33%">|<td width="33%" class="contenttext">)?(.*?), <b>(.*?)</b>')
def parseSenateVote(congress, session, n):
    sn = str(n).zfill(5)
    c = urllib.urlopen("http://www.senate.gov/legislative/LIS/roll_call_lists/roll_call_vote_cfm.cfm?congress=" + `congress` + "&session=" + `session` + "&vote=" + sn).read()
    question, date, required, result = sr.findall(c)[0]
    question = question.replace("</question>", '').replace("\n", ' ').strip()
    print question, date, required, result
    c2 = sr2.findall(c)[0]
    nsum = 0
    for (name, vote) in sr3.findall(c2):
        print name, vote
        nsum += 1
    if nsum != 100: print "ERR: only", nsum, "votes!"
#Example: parseSenateVote(101, 1, 312)
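
The script above is Python 2 (backtick repr, print statements, urllib.urlopen). For readers on Python 3, here is a minimal sketch of its two core steps: building the roll-call URL with a zero-padded vote number, and pulling (name, vote) pairs out of a fragment with a regular expression. The names vote_url, parse_votes, PAIR and SAMPLE are hypothetical, the pattern is a simplified stand-in for sr3 rather than a port of it, and the sample senators and votes are invented. Nothing here contacts senate.gov, and whether the site still serves this markup is not checked.

```python
import re
from collections import Counter

# Base URL copied from the original script; not verified against the live site.
BASE = "http://www.senate.gov/legislative/LIS/roll_call_lists/roll_call_vote_cfm.cfm"

# Simplified stand-in for sr3 (an assumption): matches "Name (P-ST), <b>Vote</b>".
PAIR = re.compile(r"([^<>\n]+?), <b>(.*?)</b>")


def vote_url(congress, session, n):
    """Build the roll-call URL; the vote number is zero-padded to five digits."""
    return "%s?congress=%d&session=%d&vote=%s" % (
        BASE, congress, session, str(n).zfill(5))


def parse_votes(fragment, expected=100):
    """Return the (name, vote) pairs and a per-vote tally for one fragment."""
    pairs = [(name.strip(), vote) for name, vote in PAIR.findall(fragment)]
    if len(pairs) != expected:
        # Same sanity check as the original: a full Senate has 100 members.
        print("ERR: only", len(pairs), "votes!")
    return pairs, Counter(vote for _, vote in pairs)


# Invented fragment in the shape the pattern expects; not real vote data.
SAMPLE = (
    "Doe (D-XX), <b>Yea</b><br>\n"
    "Roe (R-YY), <b>Nay</b><br>\n"
    "Poe (I-ZZ), <b>Not Voting</b>"
)

pairs, totals = parse_votes(SAMPLE, expected=3)
for name, vote in pairs:
    print(name, vote)
print(vote_url(101, 1, 312))
```

Returning the pairs instead of only printing them makes the count check easy to test without fetching a page. A regex keeps the sketch close to the original approach, though an HTML parser would likely cope better with markup changes.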