redo: Combine Several Fields into One.
A sort of continuation of GIS with Python, Shapely, and Fiona.
Combine several fields into one with Python, Shapely, and Fiona! You'll also want GDAL around.
First, let's start out with data. I'll use Natural Earth Countries 1:50m. Download that file and move it to a directory.
This assumes you're using a terminal. On a Mac, that's Terminal.app.
Unzip that file. Then find out about its fields with ogrinfo. The -al flag indicates all layers, and -so means show a summary only - don't show every individual feature in the file.
$ ogrinfo -so -al ne_50m_admin_0_countries.shp
INFO: Open of `ne_50m_admin_0_countries.shp'
using driver `ESRI Shapefile' successful.
Layer name: ne_50m_admin_0_countries
Geometry: 3D Polygon
Feature Count: 241
Extent: (-180.000000, -89.998926) - (180.000000, 83.599609)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_84",6378137.0,298.257223563]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]]
scalerank: Integer (4.0)
featurecla: String (30.0)
labelrank: Real (16.6)
sovereignt: String (254.0)
sov_a3: String (254.0)
adm0_dif: Real (16.6)
level: Real (16.6)
type: String (254.0)
admin: String (254.0)
adm0_a3: String (254.0)
# truncated, there's lots more
First let's get comfortable with the layout of Fiona - check out copy.py.
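If you don't have copy.py in front of you, here's a rough sketch of what a plain Fiona copy script tends to look like. This is my guess at the general shape, not the actual contents of copy.py: the function name copy_shapefile and the output filename in the usage comment are made up for illustration.

```python
def copy_shapefile(src_path, dst_path):
    """Copy every feature from one file to another, unchanged."""
    # Imported here so the sketch can be read and loaded without Fiona installed.
    import fiona

    with fiona.open(src_path) as src:
        # The schema describes the geometry type and the attribute fields.
        schema = src.schema.copy()

        # Reuse the source driver and coordinate reference system for the output.
        with fiona.open(dst_path, 'w', driver=src.driver,
                        crs=src.crs, schema=schema) as dst:
            # The output loop: read each feature and write it straight back out.
            for poly in src:
                dst.write(poly)


# Hypothetical usage (the output name is an assumption):
# copy_shapefile('ne_50m_admin_0_countries.shp',
#                'ne_50m_admin_0_countries_copy.shp')
```

The two places worth remembering are the schema and the output loop, because those are where any change to the fields has to happen.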
Once you're comfortable there, check out combining_fields.py. It differs from copy.py by only two lines. The first is
schema['properties']['a3_sov'] = 'str'
This says that in the new schema (layout, or database format), there's a new field called a3_sov and it's a string.
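To see what that schema line does without touching any files, here's a small sketch on a plain dictionary shaped like a Fiona schema. The helper name add_string_field and the trimmed-down source_schema are assumptions for illustration; the real schema has many more fields.

```python
import copy


def add_string_field(schema, name):
    """Return a copy of a Fiona-style schema with one extra string field."""
    new_schema = copy.deepcopy(schema)  # leave the source schema untouched
    new_schema['properties'][name] = 'str'
    return new_schema


# A cut-down stand-in for the Natural Earth schema (assumption: only two fields shown).
source_schema = {
    'geometry': 'Polygon',
    'properties': {'sovereignt': 'str:254', 'sov_a3': 'str:254'},
}

schema = add_string_field(source_schema, 'a3_sov')
print(sorted(schema['properties']))  # ['a3_sov', 'sov_a3', 'sovereignt']
```

Copying first is a design choice rather than a requirement: it keeps the schema you read from the source separate from the one you write with.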
Then, in the output loop,
poly['properties']['a3_sov'] = poly['properties']['sov_a3'] + poly['properties']['sovereignt']
This creates the new field by pulling the sov_a3 and sovereignt fields and combining them with +. Running the script will create a new file called ne_50m_admin_0_countries_a3_sov.shp.
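Here's the combining step on its own, as a sketch that works on a plain dictionary of properties. The function name add_a3_sov and the example values are made up; they are not taken from the Natural Earth data.

```python
def add_a3_sov(properties):
    """Return a copy of a feature's properties with the combined a3_sov field."""
    combined = dict(properties)
    # + on two strings joins them end to end, with nothing in between.
    combined['a3_sov'] = properties['sov_a3'] + properties['sovereignt']
    return combined


# Invented values, for illustration only.
example = {'sov_a3': 'ABC', 'sovereignt': 'Example Land'}
print(add_a3_sov(example)['a3_sov'])  # ABCExample Land
```

Note that + doesn't add a separator, so if you want a space or a dash between the two values you'd have to put one in yourself.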
afaik, shapefiles do not contain multiple layers; it's OGR that adds this abstraction so that it can support data sources with multiple layers (e.g. multiple tables in a PostGIS database). You can replace your first commands with
which lists the info of all of the shapefile's layers (i.e. the only one).