@deepthawtz
Created August 5, 2009 22:22
#!/usr/bin/python
# Python 2 script: relies on urllib2 and BeautifulSoup 3.
import re
import urllib2

from BeautifulSoup import BeautifulSoup
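# Fetch the concerts landing page, which appears to redirect through a <meta> tag.
resp = urllib2.urlopen('http://kuumbwajazz.org/concerts')
html = resp.read()
soup = BeautifulSoup(html)
# Take the content attribute of the first <meta> tag that has one.
el = soup("meta", content=re.compile(".*"))[0]['content']
# Drop the first six characters, presumably a prefix such as "0;URL=".
redirect = el[6:]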
# Fetch the page the redirect points to.
resp = urllib2.urlopen('http://kuumbwajazz.org/concerts/' + redirect)
html = resp.read()
soup = BeautifulSoup(html)
# The candidate listings are the centered paragraphs.
junk = soup.findAll('p', align="center")
# it is *all* about the list comprehensions
alist = [x.contents for x in junk if not re.search(" ", str(x.contents))]
# Re-parse the filtered contents and keep only the links.
artists = BeautifulSoup(str(alist)).findAll("a")
# Write the extracted links out; the with block closes the file even on error.
with open('kuumbwa_calendar.html', 'w') as f:
    f.write(str(artists))
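
The script above finds its redirect target by slicing six characters off the first `<meta>` tag's `content` attribute. This breaks if the delay or prefix changes. A more tolerant version of that one step might look like the sketch below. It is written for Python 3 using only the standard library, so it is not a drop-in replacement for the Python 2 code. The names `MetaRefreshParser` and `find_meta_refresh` and the sample page `2009/august.html` are made up for illustration; the real markup on the site is not shown in the script.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # content usually looks like "0;URL=page.html"; match the URL part
        # instead of assuming a fixed six-character prefix.
        match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                          attr.get("content") or "", re.IGNORECASE)
        if match:
            self.target = match.group(1).strip()


def find_meta_refresh(html):
    """Return the meta-refresh target in html, or None if there is none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.target


# Hypothetical sample markup, not taken from the real site.
sample = '<html><head><META HTTP-EQUIV="Refresh" CONTENT="0;URL=2009/august.html"></head></html>'
target = find_meta_refresh(sample)
full_url = urljoin('http://kuumbwajazz.org/concerts/', target)
```

`urljoin` resolves the relative target against the base URL, which also copes with absolute targets that plain string concatenation would mangle.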