scraping_links.md

```
Forest Gregg
[email protected]
DataMade
http://datamade.us
```

Almost every website you go to is a view of some data that has been organized into tables. Web pages are fancy views of spreadsheets.

* [Tiers Fusion Table](https://www.google.com/fusiontables/data?docid=11PNEL-A6MFtYLLGvgtHqK7K1Pm4viKiK9IHY0tYf#rows:id=1)
* [CPS Tiers](http://cpstiers.opencityapps.org/)

This means that sometimes we can go the other way: we can turn websites back into tables of data. This is called web scraping.

* [Illinois State Board Elections](http://www.elections.il.gov/)
* [Campaign Committee Page](http://www.elections.il.gov/CampaignDisclosure/CommitteeDetail.aspx?id=4410)
* [Election Money](http://electionmoney.org/)
* [Current Cash Position](http://illinoiselectiondata.com/?p=265)
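
To make "turning a website back into a table" concrete, here is a minimal sketch using only Python's standard library. It assumes the data sits in a plain HTML `<table>`. The sample page, the committee names, and the helper names (`TableScraper`, `scrape_table`, `rows_to_csv`) are all made up for illustration and are not taken from any of the sites above. A real scraper would first download the page and would usually need site-specific handling.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample page; a real scraper would fetch the HTML instead.
SAMPLE_HTML = """
<table>
  <tr><th>Committee</th><th>Funds Available</th></tr>
  <tr><td>Example Committee A</td><td>1200.50</td></tr>
  <tr><td>Example Committee B</td><td>87.00</td></tr>
</table>
"""


def scrape_table(html):
    """Return the table in `html` as a list of rows (lists of strings)."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, ready to open in a spreadsheet."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = scrape_table(SAMPLE_HTML)
print(rows_to_csv(rows))
```

The parser only tracks rows and cells, which is enough for a simple table. Nested tables, merged cells, or data spread across several pages would need more work.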

Two reasons you might scrape:
- You tried asking, were ignored or rejected, and don't want to hire a lawyer.
- The data changes often and you need current data on an ongoing basis.

If neither of these applies, it's easier to just ask.

Reasons why you still might not scrape:
- It might be illegal. Specifically, scraping might violate a website's terms of service: a contract, which you implicitly agree to by interacting with the site, that limits how you can use it. The law here is often murky. It is *much* murkier for government websites. I never violate terms of service.
- It's expensive. It will typically cost $3-5K to hire someone to write a good scraper for a complicated site. This is usually just an upfront cost, so if you have an ongoing need for the data, it can still be attractive.