- Moved from trial to adopt - You're probably already familiar, but recent under-the-hood upgrades to IO (pyogrio over fiona) and geometry functionality (shapely 2.0) are both aimed at better vectorized operations.
- In practice, things are more streamlined, and you potentially spend less of your life trying to load a 20 GB GeoJSON into a DataFrame.
- pyogrio is optimized for bulk reading/writing of spatial vector data
Pyogrio is fast because it uses pre-compiled bindings for GDAL/OGR to read and write the data records in bulk. This approach avoids multiple steps of converting to and from Python data types within Python, so performance becomes primarily limited by the underlying I/O speed of data source drivers in GDAL/OGR. We have seen >5-10x speedups reading files and >5-20x speedups writing files compared to using row-per-row approaches (e.g. Fiona).