@BrianHicks
Created January 29, 2014 23:16
{
"metadata": {
"name": "Bike Racks"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": "# Bike Racks\n\nWe're going to try to find clusters of bike racks in Cincinnati. Why? Because John asked for something cool.\n\nFirst we need to grab the data. Fortunately, Socrata has this!"
},
{
"cell_type": "code",
"collapsed": false,
"input": "import requests\nimport csv\n\nracks = requests.get('https://cincinnati.demo.socrata.com/api/views/wi79-n3c6/rows.json?accessType=DOWNLOAD').json()",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": "The data is in a bit of a weird format, but at least it's deserializable. The dictionary comprehension below matches each location name with its coordinates."
},
{
"cell_type": "code",
"collapsed": false,
"input": "racks = {rack[9]: (float(rack[10][1]), float(rack[10][2])) for rack in racks['data']}\nprint racks.items()[0]",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "(u'Cincinnati Commerce Center', (39.102596310765705, -84.51330853175256))\n"
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Next we'll need to define a grouping function. I tend to abuse `collections.Counter` for this sort of thing, rather than iterating over lists. This just rounds both components of each coordinate to a given number of decimal places and returns a count of the resulting groups."
},
{
"cell_type": "code",
"collapsed": false,
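"input": "from collections import Counter\n\ndef get_grouping_at(values, precision):\n    # Round each (lat, lng) pair to `precision` decimal places and count\n    # how many points land on each rounded coordinate.\n    return Counter(\n        (round(lat, precision), round(lng, precision))\n        for lat, lng\n        in values\n    )",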
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
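"source": "Now that we have our grouping function, we can see what the grouping looks like at a range of precisions. As you can see, the group count levels off at 200 from 5 digits of precision onward, so we'll go for 2 to get medium-sized areas without being huge. Since the precision is a number of decimal places, the scale is logarithmic: each extra digit shrinks a cell by a factor of 10 per side. At 2 digits a cell is 0.01 degrees on a side, which is roughly 1.1 km north to south."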
},
{
"cell_type": "code",
"collapsed": false,
"input": "for digits in range(0, 10):\n    groups = get_grouping_at(racks.values(), digits)\n    print \"{}:\\t{} groups\".format(digits, len(groups))",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "0:\t2 groups\n1:\t7 groups\n2:\t68 groups\n3:\t165 groups\n4:\t198 groups\n5:\t200 groups\n6:\t200 groups\n7:\t200 groups\n8:\t200 groups\n9:\t200 groups\n"
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Last, we'll list the ten largest groups and the number of bike racks each one contains."
},
{
"cell_type": "code",
"collapsed": false,
"input": "precision = 2\ngroups = get_grouping_at(racks.values(), precision)\nfor coords, count in groups.most_common(10):\n    print \"The area roughly centered at {coords[0]:0<5}, {coords[1]:0<6} has {count} bike racks.\".format(\n        coords=coords,\n        count=count\n    )\n\nprint \"Overall, there are {} groups\".format(len(groups))",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "The area roughly centered at 39.11, -84.50 has 52 bike racks.\nThe area roughly centered at 39.11, -84.51 has 43 bike racks.\nThe area roughly centered at 39.10, -84.51 has 25 bike racks.\nThe area roughly centered at 39.10, -84.52 has 10 bike racks.\nThe area roughly centered at 39.11, -84.52 has 8 bike racks.\nThe area roughly centered at 39.13, -84.52 has 7 bike racks.\nThe area roughly centered at 39.14, -84.51 has 6 bike racks.\nThe area roughly centered at 39.16, -84.54 has 6 bike racks.\nThe area roughly centered at 39.13, -84.51 has 5 bike racks.\nThe area roughly centered at 39.15, -84.43 has 5 bike racks.\nOverall, there are 68 groups\n"
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": "That's about it. Of course, we could have used numpy/scipy/matplotlib to do something visual like heatmaps, but this is a quick and dirty way of getting at the data."
}
],
"metadata": {}
}
]
}