- https://docs.google.com/presentation/d/1pBi0I_I6NvXD_nRoypwaKx_SfUQe97CasD9fko9cWy4/edit#slide=id.g2bf82efe8_039
- http://www.flickr.com/
- http://congress.api.sunlightfoundation.com/
- http://sunlightlabs.github.io/congress/
- http://konklone.io/json/
Before we begin:
- Do you have a recent version of Firefox or Chrome?
- Do you have a JSON viewer installed? Get that done now.
Okay:
- We say "Open Data", we're at its Day, but what is it really?
- If you've sent a spreadsheet instead of a doc, you get why data is good
- We're going to go a level up from spreadsheets, and then come back down
- We'll stay all in-browser, no code or terminal required
- How are URLs laid out? What's behind each piece?
- Briefly: domains, DNS, protocols
- Originally, URLs were just ways to reference files and folders
- Go to flickr.com
- Search for ukraine, http://www.flickr.com/search/?q=ukraine
- Move to 'recent', https://www.flickr.com/search/?q=ukraine&s=rec
- Advanced search - inc. screenshots, only CC content, https://www.flickr.com/search/?q=ukraine&l=cc&ss=0&ct=3&mt=all&w=all&adv=1
The important thing here is keys and values.
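For anyone curious what "keys and values" looks like outside the browser, here is a minimal sketch that pulls apart the advanced-search URL above using Python's standard library. Nothing here is specific to Flickr; the meaning of each key (`l`, `ct`, and so on) is Flickr's own business and is not decoded here.

```python
from urllib.parse import urlsplit, parse_qs

# The advanced-search URL from the walkthrough above.
url = "https://www.flickr.com/search/?q=ukraine&l=cc&ss=0&ct=3&mt=all&w=all&adv=1"

parts = urlsplit(url)
print("protocol:", parts.scheme)
print("domain:  ", parts.netloc)
print("path:    ", parts.path)

# Everything after the "?" is a list of key=value pairs joined by "&".
# parse_qs returns a dict mapping each key to a list of its values.
params = parse_qs(parts.query)
for key, values in params.items():
    print(f"{key} = {values[0]}")
```

Changing a value in the address bar and hitting Enter does the same thing by hand.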
Okay, neat, but this is a little arcane.
Well, these URLs aren't designed for you to understand; they only exist to power a little advanced search page. Now we're going to look at an API.
APIs: the most overloaded, overused word in all of government and open data right now. I don't even want to tell you what it means (okay), because it means nothing.
On the web, APIs are just URL patterns that lead you to data instead of a web page.
This may sound surprising, but API URLs are designed to be much more understandable to humans than website URLs are. When all you have are URLs and data, and you can't use any bolding or images, your words have to be very clear.
You've seen data at a URL if you've ever peeked at an RSS feed - and if you've ever hit View Source, then you've seen that web pages themselves actually are pretty data-like.
Could you have a CSV API? Absolutely. But that's pretty rare. The main data format used to be XML, but nowadays it's JSON.
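To make "data at a URL" concrete, here is a small sketch of what a JSON response looks like once a program reads it. The text below is made-up sample data, not real API output, and the field names (`count`, `first_name`, `state`) are assumptions chosen for illustration.

```python
import json

# Hypothetical JSON text, shaped like something an API might return.
text = """
{
  "count": 2,
  "results": [
    {"first_name": "Jane", "state": "NY"},
    {"first_name": "John", "state": "OH"}
  ]
}
"""

data = json.loads(text)

# The outer layer is key/value pairs, just like a URL's query string.
print(list(data.keys()))

# "results" holds an array (a list); each item is its own set of key/value pairs.
for person in data["results"]:
    print(person["first_name"], person["state"])
```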
Congress API:
- Docs, an intro: http://sunlightlabs.github.io/congress/
- Root: http://congress.api.sunlightfoundation.com
- let's talk about JSON, this is the simplest form
- key and value pairs, just like URLs (slides)
- Let's read a bit about how the API works
- /legislators, okay
- error? okay...
- Ah, we need an API key
- opendataday key
- Okay, /legislators
- fold up 'results', look at what we have
- go over array, a list (slides)
- Export that to CSV!
- konklone.io/json/
- now you have a spreadsheet of 20 members of Congress
- Let's learn how to make this even better
- let's limit the fields to what we want
- let's drop the pagination
- new spreadsheet: every member of Congress
- so useful! so useful we already provide this in bulk
- (bulk downloads are great)
- Let's take it one step further and ask a real question, one you're not going to find answered in bulk
- /votes, let's read
- operators
- nesting
- what fields do we want?
- make that CSV
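The walkthrough above stays in the browser, but the same steps can be sketched in a few lines of Python. Treat this as an illustration under assumptions, not the API's documented behavior: the parameter names (`apikey`, `fields`, `per_page`), the `__gte` operator suffix, and every field name are guesses for the sake of the example, and the sample records are invented. No network request is made; the sketch only builds URLs and turns a `results` array into CSV, flattening nested fields into dotted column names.

```python
import csv
import io
from urllib.parse import urlencode

BASE = "http://congress.api.sunlightfoundation.com"

def build_url(endpoint, **params):
    """Join an endpoint and key/value pairs into one URL (commas left readable)."""
    return f"{BASE}/{endpoint}?{urlencode(params, safe=',')}"

# Assumed parameter names: limit the fields, drop the pagination.
legislators_url = build_url(
    "legislators",
    apikey="YOUR_KEY",  # placeholder, not a real key
    fields="first_name,last_name,state",
    per_page="all",
)

# Assumed operator syntax: "field__gte" meaning "greater than or equal to".
votes_url = build_url("votes", apikey="YOUR_KEY", voted_at__gte="2014-01-01")

def flatten(record, prefix=""):
    """Turn nested objects into one flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

def results_to_csv(results):
    """Convert a list of JSON objects into CSV text, one row per object."""
    rows = [flatten(r) for r in results]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Invented sample standing in for the "results" array of a /votes response.
sample_results = [
    {"roll_id": "h1-2014", "result": "Passed",
     "breakdown": {"total": {"Yea": 250, "Nay": 170}}},
    {"roll_id": "s2-2014", "result": "Failed",
     "breakdown": {"total": {"Yea": 40, "Nay": 58}}},
]

csv_text = results_to_csv(sample_results)
print(legislators_url)
print(votes_url)
print(csv_text)
```

This is roughly what a JSON-to-CSV converter does for you: each key becomes a column, and nesting becomes a longer column name.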
You can certainly do a lot more with data by writing code in a programming language (and Shannon will be teaching a Python class later!).
But hopefully this demystifies some words for you: URLs, JSON, and APIs don't take a computer science degree to understand. They are patterns, meant for both humans and computers to understand.
Understanding how this stuff fits together will help you see how the web works, appreciate the value in the tools people make, and maybe even know when someone is feeding you a line.
Above all, I want you to leave knowing what to Google for - half my job is Googling!
Thanks!