Last active
May 20, 2016 20:42
RFC for ljwolf's GSOC
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Clarifying a Core Data Model for PySAL\n", | |
"\n", | |
"### tl; dr\n", | |
"\n", | |
"The big idea is this:\n", | |
"\n", | |
"PySAL's labelled array interface must look native to those who have Pandas, but invisible to those who do not. In the end, PySAL will expose a *labelled array interface*, using pandas `dataframes` & weights, and an *unlabelled array interface*, using numpy `ndarrays` & weights like we do now. What a no-GDAL python GIS/Geocomputation module (what I'm calling NOGR) will be clearer when the labelled array interface is decided.\n", | |
"\n", | |
"In short, I'm suggesting three deliverables needed for a filled-out tabular data model for PySAL:\n", | |
"\n", | |
"- `pdio`: try to read any datatype and return a df+geo \n", | |
" - Pandas tools for PySAL, the core of the labelled array API.\n", | |
" - Functions for working with tabular data, like:\n", | |
" - lag_spatial(W, Series) or lag_spatial(columns, dataframe=dataframe)\n", | |
" - centroids(dataframe)\n", | |
" - Extends & improves upon current pdio module to support all current PySAL readers\n", | |
" - When no PySAL reader is supported, call safely to fiona, then pandas.\n", | |
" - Not mandatory for a PySAL install, but \"fully supported\" as a native way to interact with PySAL.\n", | |
"- `geocomp`: tabular GIS for PySAL\n", | |
" - conversion interface between pdio's df+geo and geopandas, resolving longstanding issue #474\n", | |
" - extensions of shapely_ext to tabular GIS operations (dissolve, spatial join, etc), targeting feature-completeness of GIS operations in `geopandas`\n", | |
" - NOGR pure python/cython GIS\n", | |
"- work in `core`\n", | |
" - make unlabelled array api consistent (shaping, typing)\n", | |
" - add correct, safe calls into tabular handlers in pdio.\n", | |
" - incorporate soft-dependency paty-pandas code, safely enabled only when pda is importable\n", | |
" - finish the polymorphic weights constructors project Serge & I worked on last cycle, targeting the df+geocolumn. \n", | |
"\n", | |
"### Clarifying the GSOC proposal\n", | |
"\n", | |
"From my perspective, I see my proposal for GSOC as containing two oblique foci. My hope in drafting this notebook for my fellow authors is to make this clear.\n", | |
"\n", | |
"I think at this point, many people who use & develop PySAL would like to see better tooling for Pandas$\\leftrightarrow$PySAL interaction. This was started by Luc, Dani, & myself in the `pysal.pdio` contrib module. At its most basic, `pdio` reads/writes `dbf`s using PySAL IO. In addition, all shapes in a shapefile are read in and stuffed into a column of the dataframe. Unlike Geopandas, this occurs on a standard Pandas dataframe, rather than constructing a special subclass. \n", | |
"\n", | |
"In addition, interfaces between shapely and PySAL exist via the [`shapely_ext`](http://pysal.geodacenter.org/1.3/users/tutorials/shapely.html) contrib module. The shapely extension provides an interface between pysal and shapely using geointerface at the individual polygon level.\n", | |
"\n", | |
"My GSOC proposal's main point was that **I'd like to see PySAL gain a labelled array interface**, and that a specification for this interface *should* make it easy to do GIS, regardless of source. How we want this done remains an open question. I think there are really two aims in my proposal that I'd like to come to a consensus on about proirities:\n", | |
"\n", | |
"The directions are what I'd like to call:\n", | |
"- **NOGR**: A pure python/cython GIS\n", | |
"- **Pandas as Idiom**: A Tabular data idiom for PySAL\n", | |
"\n", | |
"These concern a common cycle that goes something like this:\n", | |
"\n", | |
"1. PySAL targets minimal dependencies from the scientific python stack.\n", | |
"2. Spatial Analysis sometimes has tabular operations & GIS as a prerequisite.\n", | |
"3. GeoPandas+Shapely/Pandas are the standard for GIS/tabular in Python.\n", | |
"\n", | |
"The **NOGR** project is really providing a method of solving **2** within the constraints of **1**.\n", | |
"\n", | |
"The **Pandas as Idiom** direction leverages **3** within the constraints of **1** to provide a consistent labelled array interface.\n", | |
"\n", | |
"These largely oblique to each other: \n", | |
"\n", | |
"- Pure Python GIS would not, by itself, define a tabular data model for PySAL\n", | |
"\n", | |
"- A tabular data model would make it **much** easier to interface with current GIS tools in python, but still means that users who need GIS would need GDAL. \n", | |
"\n", | |
"Thus, I'd like the mentors' & broader PySAL community's perspectives on how whether & how urgently a NOGR GIS package is needed before commencing my GSOC work. This feedback will inform and direct what parts of the proposal get emphasized. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### What we already have:" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"Just to be clear, though, I'd like to show exactly how useful the shapely extension is. I don't really see it mentioned a lot, so I'd like to show what's already possible using just it and the `pdio`-style PySAL + Pandas Dataframes:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"import pysal as ps\n", | |
"from pysal.contrib import shapely_ext as psh\n", | |
"import shapely.geometry as sh\n", | |
"import shapely.ops as GIS\n", | |
"import numpy as np" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"data = ps.pdio.read_files(ps.examples.get_path('columbus.shp'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Basic GIS:" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We have wrappers for almost all of the GIS operations in Shapely. They all take/recieve PySAL shapes, but use shapely for the actual operations:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 36, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"{'BaseGeometry',\n", | |
" 'CollectionOperator',\n", | |
" 'Point',\n", | |
" 'ValidateOp',\n", | |
" 'asLineString',\n", | |
" 'asMultiLineString',\n", | |
" 'asShape',\n", | |
" 'byref',\n", | |
" 'c_double',\n", | |
" 'c_void_p',\n", | |
" 'geom_factory',\n", | |
" 'izip',\n", | |
" 'lgeos',\n", | |
" 'linemerge',\n", | |
" 'nearest_points',\n", | |
" 'operator',\n", | |
" 'polygonize',\n", | |
" 'polygonize_full',\n", | |
" 'snap',\n", | |
" 'sys',\n", | |
" 'transform',\n", | |
" 'triangulate',\n", | |
" 'validate'}" | |
] | |
}, | |
"execution_count": 36, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"(set([x for x in dir(GIS) if not x.startswith('_')]).difference(dir(psh)))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can also use most shapely methods as standalone functions:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 54, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"null = geom.Polygon()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 89, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['almost_equals',\n", | |
" 'area',\n", | |
" 'boundary',\n", | |
" 'bounds',\n", | |
" 'buffer',\n", | |
" 'centroid',\n", | |
" 'contains',\n", | |
" 'convex_hull',\n", | |
" 'crosses',\n", | |
" 'difference',\n", | |
" 'disjoint',\n", | |
" 'distance',\n", | |
" 'envelope',\n", | |
" 'equals',\n", | |
" 'equals_exact',\n", | |
" 'has_z',\n", | |
" 'interpolate',\n", | |
" 'intersection',\n", | |
" 'intersects',\n", | |
" 'is_empty',\n", | |
" 'is_ring',\n", | |
" 'is_simple',\n", | |
" 'is_valid',\n", | |
" 'length',\n", | |
" 'overlaps',\n", | |
" 'project',\n", | |
" 'relate',\n", | |
" 'representative_point',\n", | |
" 'simplify',\n", | |
" 'symmetric_difference',\n", | |
" 'to_wkb',\n", | |
" 'to_wkt',\n", | |
" 'touches',\n", | |
" 'union',\n", | |
" 'within']" | |
] | |
}, | |
"execution_count": 89, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"[x for x in dir(null) if (not x.startswith('_') and x in dir(psh))]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 73, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"#merge\n", | |
"full_CO = psh.cascaded_union(data.geometry.values.tolist())" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Since Shapely Polygons have a visual `__repr__`, I'll be casting to shapely to quickly show the shapes. But, they're still python shapes:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 74, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"<pysal.cg.shapes.Polygon at 0x7f4a34285f28>" | |
] | |
}, | |
"execution_count": 74, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"full_CO" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 75, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/svg+xml": [ | |
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"100.0\" height=\"100.0\" viewBox=\"5.658406486511231 10.572129001617432 5.8455143165588375 4.3868212890625\" preserveAspectRatio=\"xMinYMin meet\"><g transform=\"matrix(1,0,0,-1,0,25.531079292297363)\"><path fill-rule=\"evenodd\" fill=\"#66cc99\" stroke=\"#555555\" stroke-width=\"0.11691028633117675\" opacity=\"0.6\" d=\"M 6.6781768798828125,11.447600364685059 L 6.6491851806640625,11.399640083312988 L 6.647164821624756,11.344690322875977 L 6.639161109924316,11.315719604492188 L 6.522275924682617,11.318730354309082 L 6.520257949829102,11.269769668579102 L 6.5162482261657715,11.236800193786621 L 6.542212009429932,11.209819793701172 L 6.549192905426025,11.1808500289917 L 6.55515718460083,11.10791015625 L 6.534173011779785,11.096920013427734 L 6.537120819091797,10.97803020477295 L 6.220462799072266,11.057000160217285 L 6.088609218597412,11.097979545593262 L 6.088623046875,11.132949829101562 L 6.084632873535156,11.148940086364746 L 6.085638046264648,11.160920143127441 L 6.035704135894775,11.202890396118164 L 5.9847540855407715,11.203900337219238 L 5.948803901672363,11.239870071411133 L 5.96979284286499,11.263850212097168 L 5.970815181732178,11.31779956817627 L 5.87490701675415,11.313819885253906 L 6.181705951690674,11.553569793701172 L 6.394561767578125,11.710399627685547 L 6.456532001495361,11.781330108642578 L 6.674379825592041,11.931170463562012 L 6.7633137702941895,11.981120109558105 L 6.942171096801758,12.0560302734375 L 7.004114151000977,12.067009925842285 L 7.078045845031738,12.076990127563477 L 7.125000953674316,12.078980445861816 L 7.185831069946289,11.84823989868164 L 7.317720890045166,11.856149673461914 L 7.508541107177734,11.872119903564453 L 7.4845499992370605,11.838150024414062 L 7.463559150695801,11.80918025970459 L 7.469532012939453,11.757220268249512 L 7.493495941162109,11.728249549865723 L 7.492452144622803,11.619339942932129 L 
7.688279151916504,11.66327953338623 L 7.717282772064209,11.741209983825684 L 7.580474853515625,11.882100105285645 L 7.637434005737305,11.917059898376465 L 7.616511821746826,12.057939529418945 L 7.40372896194458,12.080949783325195 L 7.404763221740723,12.164870262145996 L 7.323862075805664,12.21183967590332 L 7.19602108001709,12.295780181884766 L 7.135107040405273,12.359729766845703 L 7.108152866363525,12.406700134277344 L 7.092213153839111,12.51261043548584 L 7.0613298416137695,12.725419998168945 L 7.143249988555908,12.724410057067871 L 7.155218124389648,12.674460411071777 L 7.173169136047363,12.598520278930664 L 7.206113815307617,12.544560432434082 L 7.27003812789917,12.510580062866211 L 7.341958045959473,12.486599922180176 L 7.429862976074219,12.46660041809082 L 7.459833145141602,12.463600158691406 L 7.481781005859375,12.389659881591797 L 7.498770236968994,12.403650283813477 L 7.516755104064941,12.409640312194824 L 7.548725128173828,12.412630081176758 L 7.5597147941589355,12.414629936218262 L 7.55869197845459,12.356679916381836 L 7.569672107696533,12.334699630737305 L 7.616611003875732,12.29872989654541 L 7.6695709228515625,12.327690124511719 L 7.721512794494629,12.307709693908691 L 7.708548069000244,12.36266040802002 L 7.639626979827881,12.389639854431152 L 7.646636009216309,12.430609703063965 L 7.842434883117676,12.405599594116211 L 8.131139755249023,12.373600006103516 L 8.137151718139648,12.416560173034668 L 8.14915943145752,12.462510108947754 L 8.160175323486328,12.528459548950195 L 8.155200958251953,12.5794095993042 L 8.147226333618164,12.619379997253418 L 8.148235321044922,12.645350456237793 L 8.13826847076416,12.703310012817383 L 8.138282775878906,12.737279891967773 L 8.122319221496582,12.787229537963867 L 8.062442779541016,12.944100379943848 L 8.052460670471191,12.963089942932129 L 8.052496910095215,13.052009582519531 L 8.048531532287598,13.129940032958984 L 8.032543182373047,13.117950439453125 L 8.01356029510498,13.114959716796875 L 
7.989583969116211,13.115960121154785 L 7.962619781494141,13.137940406799316 L 7.923679828643799,13.191900253295898 L 7.898725986480713,13.243860244750977 L 7.88774299621582,13.25784969329834 L 7.871758937835693,13.25885009765625 L 7.868794918060303,13.338780403137207 L 7.866809844970703,13.371749877929688 L 7.851837158203125,13.40073013305664 L 7.844857215881348,13.432700157165527 L 7.8408918380737305,13.50862979888916 L 7.824925899505615,13.552599906921387 L 7.803965091705322,13.596559524536133 L 7.801973819732666,13.61553955078125 L 7.90187406539917,13.608539581298828 L 7.90288782119751,13.644510269165039 L 7.9967942237854,13.63949966430664 L 7.998781204223633,13.614520072937012 L 8.037740707397461,13.60752010345459 L 8.062715530395508,13.604519844055176 L 8.072694778442383,13.58053970336914 L 8.115653038024902,13.579540252685547 L 8.130637168884277,13.576539993286133 L 8.1974515914917,13.575770378112793 L 8.201574325561523,13.591509819030762 L 8.201583862304688,13.614489555358887 L 8.19859504699707,13.635479927062988 L 8.233589172363281,13.703410148620605 L 8.229602813720703,13.72739028930664 L 8.22661304473877,13.744379997253418 L 8.215643882751465,13.794329643249512 L 8.198686599731445,13.858280181884766 L 8.16972541809082,13.883259773254395 L 8.12777042388916,13.89225959777832 L 8.093802452087402,13.891260147094727 L 8.063838005065918,13.90526008605957 L 8.044872283935547,13.943220138549805 L 8.037888526916504,13.96720027923584 L 7.999115943908691,14.02457046508789 L 7.99936580657959,14.0349702835083 L 8.003013610839844,14.187020301818848 L 7.950088977813721,14.243969917297363 L 8.111939430236816,14.263930320739746 L 8.147891998291016,14.232959747314453 L 8.181855201721191,14.225959777832031 L 8.20982837677002,14.226949691772461 L 8.252790451049805,14.236940383911133 L 8.282757759094238,14.229940414428711 L 8.330711364746094,14.229940414428711 L 8.383658409118652,14.228930473327637 L 8.444600105285645,14.228919982910156 L 8.544504165649414,14.23490047454834 L 
8.624129295349121,14.236980438232422 L 8.559700012207031,14.742449760437012 L 8.809452056884766,14.734430313110352 L 8.808412551879883,14.636520385742188 L 8.919304847717285,14.638500213623047 L 9.087138175964355,14.63049030303955 L 9.09996509552002,14.244830131530762 L 9.015047073364258,14.241840362548828 L 9.008951187133789,13.995059967041016 L 9.008928298950195,13.9381103515625 L 9.34359073638916,13.913080215454102 L 9.351485252380371,13.67529010772705 L 9.298501014709473,13.589380264282227 L 9.310471534729004,13.54541015625 L 9.401384353637695,13.550399780273438 L 9.43441104888916,13.694270133972168 L 9.605246543884277,13.698240280151367 L 9.651198387145996,13.692239761352539 L 9.687166213989258,13.697230339050293 L 9.686145782470703,13.645279884338379 L 9.845992088317871,13.652250289916992 L 10.050789833068848,13.650230407714844 L 10.103719711303711,13.603260040283203 L 10.175629615783691,13.565290451049805 L 10.1806001663208,13.482359886169434 L 10.16759967803955,13.471369743347168 L 10.153610229492188,13.454389572143555 L 10.1356201171875,13.439399719238281 L 10.119629859924316,13.429409980773926 L 10.121600151062012,13.344490051269531 L 10.096619606018066,13.342490196228027 L 10.085630416870117,13.333499908447266 L 10.05265998840332,13.33650016784668 L 10.027669906616211,13.298540115356445 L 9.950613975524902,12.983830451965332 L 10.004560470581055,12.976829528808594 L 10.082509994506836,13.033769607543945 L 10.092499732971191,13.052749633789062 L 10.126489639282227,13.090709686279297 L 10.182459831237793,13.157649993896484 L 10.267419815063477,13.254549980163574 L 10.289389610290527,13.247550010681152 L 10.316360473632812,13.244549751281738 L 10.33234977722168,13.23954963684082 L 10.368260383605957,13.246430397033691 L 10.39428997039795,13.246540069580078 L 10.417269706726074,13.247540473937988 L 10.450240135192871,13.251529693603516 L 10.4752197265625,13.261520385742188 L 10.4752197265625,13.272509574890137 L 10.534159660339355,13.271499633789062 L 
10.566129684448242,13.271499633789062 L 10.603090286254883,13.26550006866455 L 10.632060050964355,13.263489723205566 L 10.639049530029297,13.24450969696045 L 10.640040397644043,13.21953010559082 L 10.649020195007324,13.199549674987793 L 10.646010398864746,13.176569938659668 L 10.64700984954834,13.15758991241455 L 10.645999908447266,13.144599914550781 L 10.637999534606934,13.129610061645508 L 10.63899040222168,13.104630470275879 L 10.631979942321777,13.070659637451172 L 10.62697982788086,13.051679611206055 L 10.627969741821289,13.026700019836426 L 10.631959915161133,13.008720397949219 L 10.633950233459473,12.983739852905273 L 10.62893009185791,12.945779800415039 L 10.639909744262695,12.918800354003906 L 10.637900352478027,12.890819549560547 L 10.645890235900879,12.865839958190918 L 10.647870063781738,12.842860221862793 L 10.649680137634277,12.830180168151855 L 10.70281982421875,12.838859558105469 L 10.709790229797363,12.774909973144531 L 10.664819717407227,12.762929916381836 L 10.6697998046875,12.703980445861816 L 10.667770385742188,12.648030281066895 L 10.776670455932617,12.652009963989258 L 10.88755989074707,12.644009590148926 L 10.888489723205566,12.49213981628418 L 10.884440422058105,12.34926986694336 L 10.884380340576172,12.19340991973877 L 10.876319885253906,12.037540435791016 L 10.886269569396973,11.946619987487793 L 10.915240287780762,11.941619873046875 L 10.932220458984375,11.938619613647461 L 10.963190078735352,11.940620422363281 L 10.983169555664062,11.936619758605957 L 10.995160102844238,11.937620162963867 L 11.045080184936523,11.853679656982422 L 11.112970352172852,11.761759757995605 L 11.204830169677734,11.638850212097168 L 11.126910209655762,11.633870124816895 L 11.126890182495117,11.59589958190918 L 11.138870239257812,11.56991958618164 L 11.139849662780762,11.530960083007812 L 11.141839981079102,11.505979537963867 L 11.12285041809082,11.492989540100098 L 11.122790336608887,11.336130142211914 L 11.122209548950195,11.315879821777344 L 
11.120759963989258,11.265190124511719 L 11.119729995727539,11.199250221252441 L 11.123700141906738,11.127309799194336 L 11.14465045928955,11.05636978149414 L 11.181599617004395,11.025400161743164 L 11.221540451049805,10.978429794311523 L 11.256489753723145,10.922479629516602 L 11.275449752807617,10.888500213623047 L 11.287420272827148,10.84253978729248 L 11.286399841308594,10.790590286254883 L 11.013669967651367,10.788629531860352 L 10.963720321655273,10.799619674682617 L 10.88379955291748,10.80762004852295 L 10.781900405883789,10.812629699707031 L 10.708979606628418,10.819640159606934 L 10.694000244140625,10.833629608154297 L 10.649069786071777,10.890580177307129 L 10.616109848022461,10.925559997558594 L 10.588159561157227,10.978509902954102 L 10.592169761657715,10.994500160217285 L 10.691129684448242,11.140359878540039 L 10.687170028686523,11.2222900390625 L 10.613240242004395,11.216300010681152 L 10.595250129699707,11.213310241699219 L 10.555290222167969,11.22029972076416 L 10.480369567871094,11.221309661865234 L 10.421429634094238,11.23231029510498 L 10.364489555358887,11.254300117492676 L 10.343520164489746,11.272279739379883 L 10.329560279846191,11.328240394592285 L 10.320590019226074,11.369199752807617 L 10.279640197753906,11.403180122375488 L 10.226710319519043,11.45613956451416 L 10.168800354003906,11.523090362548828 L 10.109880447387695,11.5900297164917 L 10.07094955444336,11.67197036743164 L 10.05659008026123,11.74390983581543 L 9.917131423950195,11.737930297851562 L 9.918045997619629,11.531109809875488 L 9.963891983032227,11.268340110778809 L 9.778072357177734,11.265359878540039 L 9.776052474975586,11.211409568786621 L 9.779011726379395,11.118490219116211 L 9.774969100952148,11.007590293884277 L 9.775935173034668,10.92866039276123 L 9.78189754486084,10.853730201721191 L 9.095559120178223,10.828829765319824 L 8.85079574584961,10.821869850158691 L 8.815848350524902,10.867839813232422 L 8.789886474609375,10.899809837341309 L 
8.7659273147583,10.938779830932617 L 8.752955436706543,10.978750228881836 L 8.747973442077637,11.010720252990723 L 8.753978729248047,11.035699844360352 L 8.750991821289062,11.060669898986816 L 8.622117042541504,11.059690475463867 L 8.631120681762695,11.089659690856934 L 8.634133338928223,11.126629829406738 L 8.63216495513916,11.199569702148438 L 8.665151596069336,11.246520042419434 L 8.68313980102539,11.2575101852417 L 8.69713020324707,11.267499923706055 L 8.67520809173584,11.405380249023438 L 8.77213191986084,11.450329780578613 L 8.713337898254395,11.810020446777344 L 8.618428230285645,11.80504035949707 L 8.508537292480469,11.809040069580078 L 8.386432647705078,11.785490036010742 L 8.37365436553955,11.774089813232422 L 8.282732009887695,11.74413013458252 L 8.256792068481445,11.828060150146484 L 8.103923797607422,11.785120010375977 L 8.120859146118164,11.670220375061035 L 7.733179092407227,11.527389526367188 L 7.710183143615723,11.482439994812012 L 7.674202919006348,11.446470260620117 L 7.624241828918457,11.421500205993652 L 7.562289237976074,11.388540267944336 L 7.4683637619018555,11.349579811096191 L 7.398374080657959,11.329950332641602 L 7.404403209686279,11.29263973236084 L 7.38238000869751,11.18474006652832 L 7.359377861022949,11.12479019165039 L 7.327386856079102,11.071849822998047 L 7.275406837463379,10.997920036315918 L 6.82789421081543,11.116869926452637 L 6.879881858825684,11.211779594421387 L 6.81195592880249,11.22877025604248 L 6.83095121383667,11.262740135192871 L 6.771010875701904,11.26574993133545 L 6.7540388107299805,11.292719841003418 L 6.7860212326049805,11.325690269470215 L 6.778051853179932,11.380640029907227 L 6.758076190948486,11.392629623413086 L 6.758334159851074,11.462329864501953 L 6.6781768798828125,11.447600364685059 z\" /></g></svg>" | |
], | |
"text/plain": [ | |
"<shapely.geometry.polygon.PolygonAdapter at 0x7f4a342852e8>" | |
] | |
}, | |
"execution_count": 75, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"sh.asShape(full_CO)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 76, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"False\n", | |
"True\n" | |
] | |
} | |
], | |
"source": [ | |
"print(psh.contains(full_CO, geom.Point(0,0)))\n", | |
"print(psh.contains(full_CO, data.geometry[0]))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 77, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"west_CO = (data[data.geometry.apply(lambda x: x.centroid[0] < 9)])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 78, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/svg+xml": [ | |
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"100.0\" height=\"100.0\" viewBox=\"8.524526634216308 10.659851112365722 2.8916720581054705 3.477017326354982\" preserveAspectRatio=\"xMinYMin meet\"><g transform=\"matrix(1,0,0,-1,0,24.796719551086426)\"><path fill-rule=\"evenodd\" fill=\"#66cc99\" stroke=\"#555555\" stroke-width=\"0.06954034652709964\" opacity=\"0.6\" d=\"M 9.008951187133789,13.995059967041016 L 9.008928298950195,13.9381103515625 L 9.34359073638916,13.913080215454102 L 9.351485252380371,13.67529010772705 L 9.298501014709473,13.589380264282227 L 9.310471534729004,13.54541015625 L 9.401384353637695,13.550399780273438 L 9.43441104888916,13.694270133972168 L 9.605246543884277,13.698240280151367 L 9.651198387145996,13.692239761352539 L 9.687166213989258,13.697230339050293 L 9.686145782470703,13.645279884338379 L 9.845992088317871,13.652250289916992 L 10.050789833068848,13.650230407714844 L 10.103719711303711,13.603260040283203 L 10.175629615783691,13.565290451049805 L 10.1806001663208,13.482359886169434 L 10.16759967803955,13.471369743347168 L 10.153610229492188,13.454389572143555 L 10.1356201171875,13.439399719238281 L 10.119629859924316,13.429409980773926 L 10.121600151062012,13.344490051269531 L 10.096619606018066,13.342490196228027 L 10.085630416870117,13.333499908447266 L 10.05265998840332,13.33650016784668 L 10.027669906616211,13.298540115356445 L 9.950613975524902,12.983830451965332 L 10.004560470581055,12.976829528808594 L 10.082509994506836,13.033769607543945 L 10.092499732971191,13.052749633789062 L 10.126489639282227,13.090709686279297 L 10.182459831237793,13.157649993896484 L 10.267419815063477,13.254549980163574 L 10.289389610290527,13.247550010681152 L 10.316360473632812,13.244549751281738 L 10.33234977722168,13.23954963684082 L 10.368260383605957,13.246430397033691 L 10.39428997039795,13.246540069580078 L 10.417269706726074,13.247540473937988 L 10.450240135192871,13.251529693603516 L 
10.4752197265625,13.261520385742188 L 10.4752197265625,13.272509574890137 L 10.534159660339355,13.271499633789062 L 10.566129684448242,13.271499633789062 L 10.603090286254883,13.26550006866455 L 10.632060050964355,13.263489723205566 L 10.639049530029297,13.24450969696045 L 10.640040397644043,13.21953010559082 L 10.649020195007324,13.199549674987793 L 10.646010398864746,13.176569938659668 L 10.64700984954834,13.15758991241455 L 10.645999908447266,13.144599914550781 L 10.637999534606934,13.129610061645508 L 10.63899040222168,13.104630470275879 L 10.631979942321777,13.070659637451172 L 10.62697982788086,13.051679611206055 L 10.627969741821289,13.026700019836426 L 10.631959915161133,13.008720397949219 L 10.633950233459473,12.983739852905273 L 10.62893009185791,12.945779800415039 L 10.639909744262695,12.918800354003906 L 10.637900352478027,12.890819549560547 L 10.645890235900879,12.865839958190918 L 10.647870063781738,12.842860221862793 L 10.649680137634277,12.830180168151855 L 10.70281982421875,12.838859558105469 L 10.709790229797363,12.774909973144531 L 10.664819717407227,12.762929916381836 L 10.6697998046875,12.703980445861816 L 10.667770385742188,12.648030281066895 L 10.776670455932617,12.652009963989258 L 10.88755989074707,12.644009590148926 L 10.888489723205566,12.49213981628418 L 10.884440422058105,12.34926986694336 L 10.884380340576172,12.19340991973877 L 10.876319885253906,12.037540435791016 L 10.886269569396973,11.946619987487793 L 10.915240287780762,11.941619873046875 L 10.932220458984375,11.938619613647461 L 10.963190078735352,11.940620422363281 L 10.983169555664062,11.936619758605957 L 10.995160102844238,11.937620162963867 L 11.045080184936523,11.853679656982422 L 11.112970352172852,11.761759757995605 L 11.204830169677734,11.638850212097168 L 11.126910209655762,11.633870124816895 L 11.126890182495117,11.59589958190918 L 11.138870239257812,11.56991958618164 L 11.139849662780762,11.530960083007812 L 11.141839981079102,11.505979537963867 L 
11.12285041809082,11.492989540100098 L 11.122790336608887,11.336130142211914 L 11.122209548950195,11.315879821777344 L 11.120759963989258,11.265190124511719 L 11.119729995727539,11.199250221252441 L 11.123700141906738,11.127309799194336 L 11.14465045928955,11.05636978149414 L 11.181599617004395,11.025400161743164 L 11.221540451049805,10.978429794311523 L 11.256489753723145,10.922479629516602 L 11.275449752807617,10.888500213623047 L 11.287420272827148,10.84253978729248 L 11.286399841308594,10.790590286254883 L 11.013669967651367,10.788629531860352 L 10.963720321655273,10.799619674682617 L 10.88379955291748,10.80762004852295 L 10.781900405883789,10.812629699707031 L 10.708979606628418,10.819640159606934 L 10.694000244140625,10.833629608154297 L 10.649069786071777,10.890580177307129 L 10.616109848022461,10.925559997558594 L 10.588159561157227,10.978509902954102 L 10.592169761657715,10.994500160217285 L 10.691129684448242,11.140359878540039 L 10.687170028686523,11.2222900390625 L 10.613240242004395,11.216300010681152 L 10.595250129699707,11.213310241699219 L 10.555290222167969,11.22029972076416 L 10.480369567871094,11.221309661865234 L 10.421429634094238,11.23231029510498 L 10.364489555358887,11.254300117492676 L 10.343520164489746,11.272279739379883 L 10.329560279846191,11.328240394592285 L 10.320590019226074,11.369199752807617 L 10.279640197753906,11.403180122375488 L 10.226710319519043,11.45613956451416 L 10.168800354003906,11.523090362548828 L 10.109880447387695,11.5900297164917 L 10.07094955444336,11.67197036743164 L 10.05659008026123,11.74390983581543 L 9.917131423950195,11.737930297851562 L 9.918045997619629,11.531109809875488 L 9.963891983032227,11.268340110778809 L 9.778072357177734,11.265359878540039 L 9.776052474975586,11.211409568786621 L 9.779011726379395,11.118490219116211 L 9.774969100952148,11.007590293884277 L 9.775935173034668,10.92866039276123 L 9.78189754486084,10.853730201721191 L 9.095559120178223,10.828829765319824 L 
9.10042953491211,10.899089813232422 L 9.135557174682617,10.917750358581543 L 9.146570205688477,10.976699829101562 L 9.178548812866211,10.999670028686523 L 9.175572395324707,11.048629760742188 L 9.174186706542969,11.060310363769531 L 9.171592712402344,11.09158992767334 L 9.093674659729004,11.10359001159668 L 9.098838806152344,11.115389823913574 L 9.107674598693848,11.135560035705566 L 9.111679077148438,11.157540321350098 L 9.10374927520752,11.308409690856934 L 9.082836151123047,11.456330299377441 L 9.09585952758789,11.559189796447754 L 9.085843086242676,11.639080047607422 L 9.084967613220215,11.792989730834961 L 9.090069770812988,11.873970031738281 L 9.093652725219727,11.930830001831055 L 9.098085403442383,12.025779724121094 L 9.100104331970215,12.063579559326172 L 9.103628158569336,12.128439903259277 L 9.115089416503906,12.158659934997559 L 9.131059646606445,12.182160377502441 L 9.129522323608398,12.284629821777344 L 9.127169609069824,12.441459655761719 L 9.124981880187988,12.587260246276855 L 9.124277114868164,12.63424015045166 L 9.146787643432617,12.658740043640137 L 9.166311264038086,12.679980278015137 L 9.187246322631836,12.7091703414917 L 9.206208229064941,12.75883960723877 L 9.213532447814941,12.778030395507812 L 9.220885276794434,12.805720329284668 L 9.233386993408203,12.86400032043457 L 8.943572998046875,12.86221981048584 L 8.757728576660156,12.861089706420898 L 8.733969688415527,13.116339683532715 L 8.685274124145508,13.639519691467285 L 8.677577018737793,13.722209930419922 L 8.666550636291504,13.861700057983398 L 8.662188529968262,13.909899711608887 L 8.653305053710938,14.008090019226074 L 8.818140029907227,14.002050399780273 L 9.008951187133789,13.995059967041016 z\" /></g></svg>" | |
], | |
"text/plain": [ | |
"<shapely.geometry.polygon.PolygonAdapter at 0x7f4a342948d0>" | |
] | |
}, | |
"execution_count": 78, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"diff = psh.difference(full_CO, psh.unary_union(west_CO.geometry.tolist()))\n", | |
"sh.asShape(diff)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 79, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"<function pysal.contrib.shapely_ext.buffer>" | |
] | |
}, | |
"execution_count": 79, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"psh.buffer" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 80, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>AREA</th>\n", | |
" <th>PERIMETER</th>\n", | |
" <th>COLUMBUS_</th>\n", | |
" <th>COLUMBUS_I</th>\n", | |
" <th>POLYID</th>\n", | |
" <th>NEIG</th>\n", | |
" <th>HOVAL</th>\n", | |
" <th>INC</th>\n", | |
" <th>CRIME</th>\n", | |
" <th>OPEN</th>\n", | |
" <th>...</th>\n", | |
" <th>DISCBD</th>\n", | |
" <th>X</th>\n", | |
" <th>Y</th>\n", | |
" <th>NSA</th>\n", | |
" <th>NSB</th>\n", | |
" <th>EW</th>\n", | |
" <th>CP</th>\n", | |
" <th>THOUS</th>\n", | |
" <th>NEIGNO</th>\n", | |
" <th>geometry</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>0.309441</td>\n", | |
" <td>2.440629</td>\n", | |
" <td>2</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>5</td>\n", | |
" <td>80.467003</td>\n", | |
" <td>19.531</td>\n", | |
" <td>15.725980</td>\n", | |
" <td>2.850747</td>\n", | |
" <td>...</td>\n", | |
" <td>5.03</td>\n", | |
" <td>38.799999</td>\n", | |
" <td>44.070000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1005.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>0.259329</td>\n", | |
" <td>2.236939</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>44.567001</td>\n", | |
" <td>21.232</td>\n", | |
" <td>18.801754</td>\n", | |
" <td>5.296720</td>\n", | |
" <td>...</td>\n", | |
" <td>4.27</td>\n", | |
" <td>35.619999</td>\n", | |
" <td>42.380001</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1001.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>0.192468</td>\n", | |
" <td>2.187547</td>\n", | |
" <td>4</td>\n", | |
" <td>6</td>\n", | |
" <td>3</td>\n", | |
" <td>6</td>\n", | |
" <td>26.350000</td>\n", | |
" <td>15.956</td>\n", | |
" <td>30.626781</td>\n", | |
" <td>4.534649</td>\n", | |
" <td>...</td>\n", | |
" <td>3.89</td>\n", | |
" <td>39.820000</td>\n", | |
" <td>41.180000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1006.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>0.083841</td>\n", | |
" <td>1.427635</td>\n", | |
" <td>5</td>\n", | |
" <td>2</td>\n", | |
" <td>4</td>\n", | |
" <td>2</td>\n", | |
" <td>33.200001</td>\n", | |
" <td>4.477</td>\n", | |
" <td>32.387760</td>\n", | |
" <td>0.394427</td>\n", | |
" <td>...</td>\n", | |
" <td>3.70</td>\n", | |
" <td>36.500000</td>\n", | |
" <td>40.520000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1002.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>0.488888</td>\n", | |
" <td>2.997133</td>\n", | |
" <td>6</td>\n", | |
" <td>7</td>\n", | |
" <td>5</td>\n", | |
" <td>7</td>\n", | |
" <td>23.225000</td>\n", | |
" <td>11.252</td>\n", | |
" <td>50.731510</td>\n", | |
" <td>0.405664</td>\n", | |
" <td>...</td>\n", | |
" <td>2.83</td>\n", | |
" <td>40.009998</td>\n", | |
" <td>38.000000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1007.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>6</th>\n", | |
" <td>0.257084</td>\n", | |
" <td>2.554577</td>\n", | |
" <td>8</td>\n", | |
" <td>4</td>\n", | |
" <td>7</td>\n", | |
" <td>4</td>\n", | |
" <td>75.000000</td>\n", | |
" <td>8.438</td>\n", | |
" <td>0.178269</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>...</td>\n", | |
" <td>2.74</td>\n", | |
" <td>33.360001</td>\n", | |
" <td>38.410000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1004.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3536b...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7</th>\n", | |
" <td>0.204954</td>\n", | |
" <td>2.139524</td>\n", | |
" <td>9</td>\n", | |
" <td>3</td>\n", | |
" <td>8</td>\n", | |
" <td>3</td>\n", | |
" <td>37.125000</td>\n", | |
" <td>11.337</td>\n", | |
" <td>38.425858</td>\n", | |
" <td>3.483478</td>\n", | |
" <td>...</td>\n", | |
" <td>2.89</td>\n", | |
" <td>36.709999</td>\n", | |
" <td>38.709999</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>1000.0</td>\n", | |
" <td>1003.0</td>\n", | |
" <td><pysal.cg.shapes.Polygon object at 0x7f4a3530c...</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>7 rows × 21 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" AREA PERIMETER COLUMBUS_ COLUMBUS_I POLYID NEIG HOVAL \\\n", | |
"0 0.309441 2.440629 2 5 1 5 80.467003 \n", | |
"1 0.259329 2.236939 3 1 2 1 44.567001 \n", | |
"2 0.192468 2.187547 4 6 3 6 26.350000 \n", | |
"3 0.083841 1.427635 5 2 4 2 33.200001 \n", | |
"4 0.488888 2.997133 6 7 5 7 23.225000 \n", | |
"6 0.257084 2.554577 8 4 7 4 75.000000 \n", | |
"7 0.204954 2.139524 9 3 8 3 37.125000 \n", | |
"\n", | |
" INC CRIME OPEN \\\n", | |
"0 19.531 15.725980 2.850747 \n", | |
"1 21.232 18.801754 5.296720 \n", | |
"2 15.956 30.626781 4.534649 \n", | |
"3 4.477 32.387760 0.394427 \n", | |
"4 11.252 50.731510 0.405664 \n", | |
"6 8.438 0.178269 0.000000 \n", | |
"7 11.337 38.425858 3.483478 \n", | |
"\n", | |
" ... DISCBD X \\\n", | |
"0 ... 5.03 38.799999 \n", | |
"1 ... 4.27 35.619999 \n", | |
"2 ... 3.89 39.820000 \n", | |
"3 ... 3.70 36.500000 \n", | |
"4 ... 2.83 40.009998 \n", | |
"6 ... 2.74 33.360001 \n", | |
"7 ... 2.89 36.709999 \n", | |
"\n", | |
" Y NSA NSB EW CP THOUS NEIGNO \\\n", | |
"0 44.070000 1.0 1.0 1.0 0.0 1000.0 1005.0 \n", | |
"1 42.380001 1.0 1.0 0.0 0.0 1000.0 1001.0 \n", | |
"2 41.180000 1.0 1.0 1.0 0.0 1000.0 1006.0 \n", | |
"3 40.520000 1.0 1.0 0.0 0.0 1000.0 1002.0 \n", | |
"4 38.000000 1.0 1.0 1.0 0.0 1000.0 1007.0 \n", | |
"6 38.410000 1.0 1.0 0.0 0.0 1000.0 1004.0 \n", | |
"7 38.709999 1.0 1.0 0.0 0.0 1000.0 1003.0 \n", | |
"\n", | |
" geometry \n", | |
"0 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"1 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"2 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"3 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"4 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"6 <pysal.cg.shapes.Polygon object at 0x7f4a3536b... \n", | |
"7 <pysal.cg.shapes.Polygon object at 0x7f4a3530c... \n", | |
"\n", | |
"[7 rows x 21 columns]" | |
] | |
}, | |
"execution_count": 80, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"buffer_query = psh.buffer(data.geometry[1], .6)\n", | |
"data[data.geometry.apply(lambda x: psh.intersects(buffer_query, x))]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Note that none of this raw shapely/pandas stuff uses any spatial indexing, which is [something Geopandas uses](https://github.com/geopandas/geopandas/blob/77b9b3b7e87456bb75f569725aa38285ea4e35d0/geopandas/sindex.py). " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 81, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"US = ps.pdio.read_files(ps.examples.get_path('NAT.shp'))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 82, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"states = US.groupby('STATE_NAME').geometry.apply(psh.unary_union)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 83, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"STATE_NAME\n", | |
"Alabama <pysal.cg.shapes.Polygon object at 0x7f4a5d9e2...\n", | |
"Arizona <pysal.cg.shapes.Polygon object at 0x7f4a351d1...\n", | |
"Arkansas <pysal.cg.shapes.Polygon object at 0x7f4a3519b...\n", | |
"California <pysal.cg.shapes.Polygon object at 0x7f4a351d1...\n", | |
"Colorado <pysal.cg.shapes.Polygon object at 0x7f4a351ea...\n", | |
"Name: geometry, dtype: object" | |
] | |
}, | |
"execution_count": 83, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"states.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 84, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"STATE_NAME\n", | |
"Alabama Valid Geometry\n", | |
"Arizona Valid Geometry\n", | |
"Arkansas Valid Geometry\n", | |
"California Valid Geometry\n", | |
"Colorado Valid Geometry\n", | |
"Connecticut Valid Geometry\n", | |
"Delaware Valid Geometry\n", | |
"District of Columbia Valid Geometry\n", | |
"Florida Valid Geometry\n", | |
"Georgia Valid Geometry\n", | |
"Idaho Valid Geometry\n", | |
"Illinois Valid Geometry\n", | |
"Indiana Valid Geometry\n", | |
"Iowa Valid Geometry\n", | |
"Kansas Valid Geometry\n", | |
"Kentucky Valid Geometry\n", | |
"Louisiana Valid Geometry\n", | |
"Maine Valid Geometry\n", | |
"Maryland Valid Geometry\n", | |
"Massachusetts Valid Geometry\n", | |
"Michigan Valid Geometry\n", | |
"Minnesota Valid Geometry\n", | |
"Mississippi Valid Geometry\n", | |
"Missouri Valid Geometry\n", | |
"Montana Valid Geometry\n", | |
"Nebraska Valid Geometry\n", | |
"Nevada Valid Geometry\n", | |
"New Hampshire Valid Geometry\n", | |
"New Jersey Valid Geometry\n", | |
"New Mexico Valid Geometry\n", | |
"New York Valid Geometry\n", | |
"North Carolina Valid Geometry\n", | |
"North Dakota Valid Geometry\n", | |
"Ohio Valid Geometry\n", | |
"Oklahoma Valid Geometry\n", | |
"Oregon Valid Geometry\n", | |
"Pennsylvania Valid Geometry\n", | |
"Rhode Island Valid Geometry\n", | |
"South Carolina Valid Geometry\n", | |
"South Dakota Valid Geometry\n", | |
"Tennessee Valid Geometry\n", | |
"Texas Valid Geometry\n", | |
"Utah Valid Geometry\n", | |
"Vermont Valid Geometry\n", | |
"Virginia Valid Geometry\n", | |
"Washington Valid Geometry\n", | |
"West Virginia Valid Geometry\n", | |
"Wisconsin Valid Geometry\n", | |
"Wyoming Valid Geometry\n", | |
"Name: geometry, dtype: object" | |
] | |
}, | |
"execution_count": 84, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"states.apply(lambda x: GIS.validate(sh.asShape(x)))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Weights" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 85, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"from functools import partial" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This is a very naive way to construct spatial weights, since it doesn't use any prefiltering or spatial query and doesn't recognize symmetry in touches. So, don't use this on something big, like `NAT.shp`, because it will take forever. \n", | |
"\n", | |
"But, for the purposes of illustration, Queen Weights can be constructed using chained method calls on tabular data. This pattern is how most of the Queen Weights functions in PostGIS I've seen work, using [array queries](http://www.postgresql.org/docs/current/static/arrays.html). \n", | |
"\n", | |
"Since, in PostGIS, the `ST_Touches` naturally uses a [GIST](http://postgis.net/docs/manual-1.3/ch03.html), it does a similar kind of binning reduction we use in PySAL. The following does not, but still yields Queen weights:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 86, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"Wt = (data\n", | |
" .geometry #for all geometries\n", | |
" .apply(lambda x:\n", | |
" data\n", | |
" .geometry #go back through geometry\n", | |
" .apply(partial(psh.touches, x))) # check if any touch current x, partial to ensure closure\n", | |
" .values.astype(int)) #cast bools to ints\n", | |
"Wps = ps.queen_from_shapefile(ps.examples.get_path('columbus.shp')).full()[0]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 87, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"np.testing.assert_allclose(Wt, Wps)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# NOGR" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So, in my vision, a **NOGR** focus would be to make the above code run without the lines:\n", | |
"```python\n", | |
"import shapely.geom as geom\n", | |
"import shapely.ops as GIS\n", | |
"from pysal.contrib import shapely_ext as psh\n", | |
"```\n", | |
"\n", | |
"In short, this means implementing a `GIS` module using only python/cython that can do what `shapely.ops` does on `shapely.geometry` objects. I would probably target replicating most of the functions provided in sections [8.5, 8.6, and 8.9 of the PostGIS documentation](http://postgis.net/docs/reference.html#Geometry_Accessors), ensuring they work when supplied with a dataframe with a column of known spatial type (i.e. containing shapely polygons (geopandas), pysal shapes (`pdio`), or `object.__geointerface__` for all objects).\n", | |
"\n", | |
"Ultimately, returning to the cycle of discussion mentioned above, this would reflect PySAL prioritizing point 2 to achieve point 1:\n", | |
"\n", | |
"> Need: **(1)** PySAL needs its own tabular/GIS Module with minimal dependencies\n", | |
"\n", | |
"> Reason: **(2)** Spatial Analysis often requires tabular/GIS operations\n", | |
"\n", | |
"This is where the proposal that suggests novel python/cython implementations of required algorithms from [De Berg's *Computational Geometry*](http://www.amazon.com/Computational-Geometry-Applications-Mark-Berg/dp/3540779736) or [Xiao's *GIS Algorithms*](http://www.amazon.com/Algorithms-Advances-Geographic-Information-Technology/dp/1446274330/ref=sr_1_1?s=books&ie=UTF8&qid=1463684492&sr=1-1&keywords=ningchuan+xiao) be the focus of the project. But, implementing a fast NOGR GIS library would be the core here. If there were time at the end, the datamodel could be extended to use the GIS operations provided here." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
}, | |
"source": [ | |
"# A Tabular Data Model" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The **Pandas as Idiom** direction is centrally about PySAL's interface with Pandas dataframes with geometric columns (or Geopandas Geodataframes) as if they were another native datamodel for PySAL. That is, if I prioritize in this direction, I would be attempting to make the boundary between PySAL, Pandas, & GeoPandas/Shapely a little smoother, **without** introducing new hard dependencies. This focuses on leveraging against packages in **3** within the constraints of **1**:\n", | |
"\n", | |
"> Need: **(1)** PySAL needs to abstract its datastructures from its data representations/IO framework\n", | |
"\n", | |
"> Reason: **(3)** It is unclear how to leverage existing Tabular & GIS tools alongside of PySAL\n", | |
"\n", | |
"It may end up looking like an answer to [Issue #474: Add contrib module to explore interface with geopandas](https://github.com/pysal/pysal/issues/474). But, instead of an exploration it would be an instantiation, would support **both** the PySAL geometry-based `pdio` & the Shapely-geometry Geopandas.\n", | |
"\n", | |
"Instead of focusing on pure python/cython implementations of core GIS operations, this would focus on making PySAL's interfaces become more polymorphic. It would reduce the gap between PySAL & Geopandas/Shapely, while keeping a bright line between what uses GDAL and what does not. I imagine this happening through extensive use of [optional import](https://github.com/ljwolf/pysal/blob/dev/pysal/spreg/optional_imports.ipynb) strategies.\n", | |
"\n", | |
"In it, I see three components:\n", | |
"\n", | |
"1. An adapter module in contrib, `geotable`, that does a few things:\n", | |
" - manages transitions between Geopandas dataframes and `pdio` dataframes\n", | |
" - minimizes the distance between GDAL-dependent libraries and PySAL saftely & clearly\n", | |
" - extends `shapely_ext` to perform common tabular operations\n", | |
"\n", | |
"2. A contrib module, `pda`. This would:\n", | |
" - subsume current `pdio` code\n", | |
" - extend the `pdio` strategy to use PySAL's full IO suite\n", | |
" - If datatype isn't supported, loudly call into `geotable`'s GeoPandas interface & fail safe if unavailable\n", | |
" - contain the necessary bridge code for PySAL to use pandas/patsy for any purpose\n", | |
"\n", | |
"3. In core:\n", | |
" - expose consistent labelled array interfaces across core modules\n", | |
" - enforce consistent *unlabelled* array rules across core modules (i.e. resolve shaping & array/list inconsistencies)\n", | |
" - finish & merge polymorphic weights constructors for arbitrary shape iterables" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# In my view" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"I think\n", | |
"- the tabular data model is more important\n", | |
"- GIS is best left to Shapely & GeoPandas\n", | |
"- If **NOGR** is necessary, it will be easier to build it as a drop-in replacement for the shapely hooks in `geotable` than it would be to build it first." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.5.1" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |