@sergiolucero
Last active January 22, 2019 18:04
scraping robos
import requests
from bs4 import BeautifulSoup

URL_BASE = 'http://www.autosrobadoschile.com/agregar-robo/9-automoviles?start=%d'
out = []
for start in range(20):  # first 400 theft reports: 20 pages of 20 rows each
    url = URL_BASE % (20 * start)
    print(url)
    page = requests.get(url)
    soup = BeautifulSoup(page.content, 'html.parser')
    # table cells expected to hold the license plates
    pcells = soup.find_all('td', {'class': 'tdcenter column_3 hidden-phone'})
    plates = [pcell.text for pcell in pcells]
    out += plates
print('found %d plates' % len(out))
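The loop above steps through the listing 20 rows at a time and concatenates the cell text from every page. Below is a minimal sketch of the same flow, split so that the paging and collecting logic can be tried without touching the network. The names `page_urls`, `collect_plates`, `PAGE_SIZE`, and the `fetch`/`extract` callables are hypothetical and not part of the gist. Stripping whitespace and dropping empty cells is an added assumption about what the page returns.

```python
from typing import Callable, Iterable, List

URL_BASE = 'http://www.autosrobadoschile.com/agregar-robo/9-automoviles?start=%d'
PAGE_SIZE = 20  # assumed rows per page, inferred from the 20*start step in the gist


def page_urls(n_pages: int, page_size: int = PAGE_SIZE) -> List[str]:
    """Build the listing URLs for the first n_pages pages."""
    return [URL_BASE % (page_size * page) for page in range(n_pages)]


def collect_plates(
    urls: Iterable[str],
    fetch: Callable[[str], object],
    extract: Callable[[object], Iterable[str]],
) -> List[str]:
    """Fetch each URL, extract its cell texts, and return one flat list.

    fetch   : url -> page content (hypothetical hook)
    extract : page content -> iterable of cell strings (hypothetical hook)
    """
    plates: List[str] = []
    for url in urls:
        cells = extract(fetch(url))
        # assumption: surrounding whitespace and empty cells are not wanted
        plates.extend(cell.strip() for cell in cells if cell.strip())
    return plates


# With the libraries used in the gist, the two hooks could look like this:
#   fetch = lambda url: requests.get(url, timeout=10).content
#   extract = lambda html: [td.text for td in BeautifulSoup(html, 'html.parser')
#                           .find_all('td', {'class': 'tdcenter column_3 hidden-phone'})]
#   out = collect_plates(page_urls(20), fetch, extract)
```

Passing `fetch` and `extract` in as arguments keeps the paging logic independent of `requests` and `BeautifulSoup`, so it can be checked against canned data before hitting the live site.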
@sergiolucero
Author

This could be written as a Python one-liner! LEARN MAP+FILTER!

out = [pcell.text for start in range(20) for pcell in BeautifulSoup(requests.get(URL_BASE % (20*start)).content, 'html.parser').find_all(XXX)]
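The comment points at `map` and `filter`, so here is a hedged sketch of that style. It assumes a hypothetical `cells_for_start` callable that returns the cell texts for one page offset (in the gist that would wrap the `requests.get` and `find_all` calls). `plates_via_map_filter` is likewise an illustrative name. Note that the built-in `sum` only flattens lists when given an explicit `[]` start value. `itertools.chain.from_iterable` does the same job lazily.

```python
from itertools import chain
from typing import Callable, Iterable, List


def plates_via_map_filter(
    starts: Iterable[int],
    cells_for_start: Callable[[int], Iterable[str]],
) -> List[str]:
    """Flatten per-page cell texts using map and filter instead of a loop."""
    per_page = map(cells_for_start, starts)        # one iterable of cells per page
    flat = chain.from_iterable(per_page)           # flatten pages into one stream
    stripped = map(str.strip, flat)                # trim whitespace around each cell
    return list(filter(None, stripped))            # drop cells that are now empty


# Flattening with sum also works, but needs a list as the start value:
#   sum([['a'], ['b']], [])  gives ['a', 'b']
#   sum([['a'], ['b']])      raises TypeError, because the default start is 0
```

For the gist's 20 pages, `starts` would be `range(0, 400, 20)`.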
