@blha303
Created February 27, 2014 13:16
Get all bropages entries and output to folder
import requests
from os import mkdir, sep
# BeautifulSoup 3 import (Python 2); with bs4 this would be: from bs4 import BeautifulSoup as Soup
from BeautifulSoup import BeautifulSoup as Soup
# Load the browse page
soup = Soup(requests.get("http://bropages.org/browse").text)
# Get table rows, excluding the header row
rows = soup.findAll('tr')[1:]
cmds = []
# Iterate over rows, read the command name from the first cell, keep each name once
for a in rows:
    out = a.find('td').text
    if out not in cmds:
        cmds.append(out)
# Make the output directory if it does not already exist
try:
    mkdir('bropages')
except OSError:
    pass
# Fetch the JSON for each command and save it to the directory
for cmd in cmds:
    try:
        with open('bropages{}{}.json'.format(sep, cmd), 'w') as f:
            f.write(requests.get('http://bropages.org/%s.json' % cmd).text)
        print("done " + cmd)
    except UnicodeError:  # skip entries whose text cannot be encoded
        continue
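
The duplicate check in the script scans the whole list once per row. A minimal sketch of an alternative, assuming the cell texts are plain strings: track the names already seen in a set and build the output path with os.path.join. The helper names unique_in_order and json_path are hypothetical and not part of the original gist.

```python
import os


def unique_in_order(names):
    """Return names with duplicates dropped, keeping first-seen order."""
    seen = set()
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


def json_path(directory, cmd):
    """Build the output path for one command inside the given directory."""
    return os.path.join(directory, cmd + ".json")
```

In the script, cmds could then be built as unique_in_order(a.find('td').text for a in rows), and the open() call could use json_path('bropages', cmd).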
matxpg commented Feb 27, 2014

Awesome!
