@jcrist
Last active July 8, 2023 06:03
A simple implementation of GeoJSON using msgspec
"""
A simple implementation of GeoJSON (RFC 7946) using msgspec
(https://jcristharif.com/msgspec/) for parsing and validation.
The `loads` and `dumps` methods work like normal `json.loads`/`json.dumps`,
but:
- Will result in high-level GeoJSON types
- Will error nicely if a field is missing or the wrong type
- Will fill in default values for optional fields
- Decodes much faster than the stdlib json
- Integrates well with editor tooling like mypy or pyright
"""
from __future__ import annotations

import msgspec

Position = tuple[float, float]

class Point(msgspec.Struct, tag=True):
    coordinates: Position

class MultiPoint(msgspec.Struct, tag=True):
    coordinates: list[Position]

class LineString(msgspec.Struct, tag=True):
    coordinates: list[Position]

class MultiLineString(msgspec.Struct, tag=True):
    coordinates: list[list[Position]]

class Polygon(msgspec.Struct, tag=True):
    coordinates: list[list[Position]]

class MultiPolygon(msgspec.Struct, tag=True):
    coordinates: list[list[list[Position]]]

class GeometryCollection(msgspec.Struct, tag=True):
    geometries: list[Geometry]

Geometry = (
    Point
    | MultiPoint
    | LineString
    | MultiLineString
    | Polygon
    | MultiPolygon
    | GeometryCollection
)

class Feature(msgspec.Struct, tag=True):
    geometry: Geometry | None = None
    properties: dict | None = None
    id: str | int | None = None

class FeatureCollection(msgspec.Struct, tag=True):
    features: list[Feature]

GeoJSON = Geometry | Feature | FeatureCollection

loads = msgspec.json.Decoder(GeoJSON).decode
dumps = msgspec.json.Encoder().encode

jcrist commented Apr 25, 2022

Example usage & quick benchmark:

In [1]: import msgspec_geojson

In [2]: with open("canada.json", "rb") as f:
   ...:     data = f.read()  # canada as a FeatureCollection

In [3]: canada = msgspec_geojson.loads(data)

In [4]: type(canada)  # loaded as high-level, validated object
Out[4]: msgspec_geojson.FeatureCollection

In [5]: import orjson, json, geojson

In [6]: %timeit msgspec_geojson.loads(data)  # benchmark msgspec
6.46 ms ± 57.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [7]: %timeit orjson.loads(data)  # benchmark orjson
9.57 ms ± 19.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [8]: %timeit json.loads(data)  # benchmark stdlib json
30 ms ± 59.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [9]: %timeit geojson.loads(data)  # benchmark geojson
94.1 ms ± 318 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
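
For anyone wanting to rerun this outside IPython, here is a rough sketch using only the standard-library `timeit` module. The `bench` helper and the tiny synthetic payload are assumptions made for illustration (`canada.json` is not bundled here), and the helper reports the best run rather than the mean ± std. dev. that `%timeit` prints, so the numbers are not directly comparable to those above.

```python
import json
import timeit


def bench(func, payload, repeat=5, number=20):
    """Return the best per-call time in seconds for func(payload)."""
    timer = timeit.Timer(lambda: func(payload))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Small synthetic FeatureCollection standing in for canada.json.
data = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [float(i), float(-i)]},
                "properties": {"index": i},
            }
            for i in range(1000)
        ],
    }
).encode()

# Time the stdlib decoder; any other callable taking bytes can be swapped in,
# for example msgspec_geojson.loads or orjson.loads if they are installed.
print(f"json.loads: {bench(json.loads, data) * 1e6:.1f} µs per call")
```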
