Created March 14, 2012 01:38
Scrape BeerAdvocate with Python, Requests, and BeautifulSoup
import requests
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup(requests.get("http://beeradvocate.com/search?q=ipa&qt=beer").content)

# Find the beer links
results = soup.findAll('div')

# How many beers did it find
beer_count = results[12].find('b').string.split(' ')[1]
beer_links = results[13].findAll('li')

links = []
for link in beer_links:
    links.append("http://beeradvocate.com%s" % (link.find('a')['href']))

for beer in links:
    soup = BeautifulSoup(requests.get(beer).content)
    # Beer name
    print soup.find('h1', {'class': 'norm'}).string
    # Beer rating
    print soup.find('span', {'class': 'BAscore_big'}).string
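
The script above finds the result list by position (`results[13]`) and builds absolute URLs with string formatting, so it stops working as soon as the page layout shifts. Below is a minimal sketch of a less position-dependent way to collect the links. It targets Python 3 and swaps BeautifulSoup for the standard library's `html.parser`, so it is not the gist's own method. The names `BeerLinkParser` and `extract_links`, and the `/beer/profile/` path prefix, are assumptions made for illustration and have not been checked against the live site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://beeradvocate.com"


class BeerLinkParser(HTMLParser):
    """Collect hrefs of <a> tags whose path starts with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep matching links only, skipping duplicates while preserving order.
        if href and href.startswith(self.prefix) and href not in self.hrefs:
            self.hrefs.append(href)


def extract_links(html, prefix="/beer/profile/", base=BASE_URL):
    """Return absolute URLs for every matching link in the HTML string.

    The default prefix is a hypothetical guess at how beer pages are routed.
    """
    parser = BeerLinkParser(prefix)
    parser.feed(html)
    return [urljoin(base, href) for href in parser.hrefs]
```

To use it with Requests, pass the decoded response body, for example `extract_links(requests.get(search_url).text)`, where `search_url` is the search URL from the script. Matching on the link path rather than on the index of a `div` means unrelated layout changes are less likely to break the scrape, though the prefix itself still has to match whatever the site actually serves.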