Skip to content

Instantly share code, notes, and snippets.

@powersa
Last active March 17, 2017 05:18
Show Gist options
  • Save powersa/0766e52dfb1d40098e0d6ebe9b29b243 to your computer and use it in GitHub Desktop.
Save powersa/0766e52dfb1d40098e0d6ebe9b29b243 to your computer and use it in GitHub Desktop.
Import Seattle Transit data into Neo4j

Grab data from data portal: http://www5.kingcounty.gov/gisdataportal/

Explore in QGIS

Export busstop.shp as geojson: busstop-layer.geojson

Generate routes.txt:

jq -r '.features | .[] | .properties | .ROUTES | split(",") | .[]' busstop-layer.geojson | sort -u >routes.txt

https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/8b387f9d9a51e776f32660008077aacfc6bcd165/routes.txt

Generate bus-stops.txt:

jq -r '.features | .[] | .properties | (.BUSSTOP_ID | tostring)+"\t"+(.ON_STREET)+"\t"+(.CROSS_STRE)' busstop-layer.geojson | sort -u >bus-stops.txt

https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/8670b8b899f6b9144c6dd2d888af0d99892eb62a/bus-stops.txt

Launch Neo4j

Import stops:

LOAD CSV FROM "https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/8670b8b899f6b9144c6dd2d888af0d99892eb62a/bus-stops.txt" AS row FIELDTERMINATOR '\t'
CREATE (:Stop {id: toInteger(row[0]), onStreet:row[1], crossStreet: row[2]});

Import routes:

LOAD CSV FROM "https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/8b387f9d9a51e776f32660008077aacfc6bcd165/routes.txt" AS row FIELDTERMINATOR '\t'
CREATE (:Route {id: toInteger(row[0])});

Generate relationships

jq -r '.features | .[] | .properties | ((.BUSSTOP_ID | tostring)+"\t"+(.ROUTES | split(",") | .[]))' busstop-layer.geojson | sort -u >stop-route-relationships.txt

https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/03ae9b2ceeb935fcbdf9dddeb5a72a9468c5a8f5/stop-route-relationships.txt

Import relationships:

USING PERIODIC COMMIT 500
LOAD CSV FROM "https://gist.githubusercontent.com/powersa/0766e52dfb1d40098e0d6ebe9b29b243/raw/5d9c00154405fe1dab78e05204fd9c406e127350/stop-route-relationships.txt" AS row FIELDTERMINATOR '\t' 
MATCH (stop:Stop { id: toInteger(row[0])}),(route:Route { id: toInteger(row[1])})
CREATE (route)-[:STOPS]->(stop)

Select some data, look at the graph:

MATCH (n { id: 26 }) RETURN n
LOAD CSV FROM "https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/8b387f9d9a51e776f32660008077aacfc6bcd165/routes.txt" AS row FIELDTERMINATOR '\t'
CREATE (:Route {id: toInteger(row[0])});
USING PERIODIC COMMIT 500
LOAD CSV FROM "https://gist.githubusercontent.com/powersa/8a0c72365fd0f99693d2d411d4967da6/raw/49ce8e591d3ad45103097608171cfcc302c46f72/bus-stop-bus-relationships.txt" AS row FIELDTERMINATOR '\t' 
MATCH (one:Route { id: toInteger(row[0])}),(two:Route { id: toInteger(row[1])})
CREATE (one)-[:stops {stop_id: row[2], stop_name: row[3]}]->(two)

This is better: http://metro.kingcounty.gov/GTFS/

Maybe get realtime something with http://tripplanner.kingcounty.gov/hiwire?.a=iNextBusResults&StopId=27050&StopName=Latona%20Ave%20Ne%20@%20Ne%2045th%20St%20(NB)

Would be cool to measure how actual times differ from the schedule.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment