@elipousson
Created October 11, 2024 20:59
library(microbenchmark)
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
# pak::pkg_install('geoarrow/[email protected]')
# https://github.com/geoarrow/geoarrow-r/issues/28
library(geoarrow)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source 
#>   `/Users/elipousson/Library/R/arm64/4.4/library/sf/shape/nc.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS:  NAD27

sf_obj <- rep(list(nc), 1000) |>
  purrr::list_rbind() |>
  sf::st_as_sf()

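# NOTE: list() evaluates both braced blocks immediately, once each, so
# microbenchmark() receives the resulting sf objects instead of unevaluated
# expressions. That is why the layer is written and read only once in the
# output below and why the recorded timings are effectively zero.
gpkg_parquet_comparison <- microbenchmark(
  list = list(
    {
      # Write to and read from GeoPackage
      tf_gpkg <- tempfile(fileext = ".gpkg")
      st_write(sf_obj, tf_gpkg)
      st_read(tf_gpkg)
    },
    {
      # Write to and read from parquet
      tf_parquet <- tempfile(fileext = ".parquet")
      write_geoparquet(sf_obj, tf_parquet)
      read_geoparquet_sf(tf_parquet)
    }
  ),
  times = 2L
)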
#> Writing layer `fileb74e5178979e' to data source 
#>   `/var/folders/3f/50m42dx1333_dfqb5772j6_40000gn/T//RtmptuFwkN/fileb74e5178979e.gpkg' using driver `GPKG'
#> Writing 100000 features with 14 fields and geometry type Multi Polygon.
#> Reading layer `fileb74e5178979e' from data source 
#>   `/private/var/folders/3f/50m42dx1333_dfqb5772j6_40000gn/T/RtmptuFwkN/fileb74e5178979e.gpkg' 
#>   using driver `GPKG'
#> Simple feature collection with 100000 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS:  NAD27
#> Warning in microbenchmark(list = list({: less accurate nanosecond times to
#> avoid potential integer overflows
#> Warning in microbenchmark(list = list({: Could not measure a positive execution
#> time for one evaluation.

ggplot2::autoplot(gpkg_parquet_comparison)
#> Warning in ggplot2::scale_y_log10(name = y_label): log-10 transformation
#> introduced infinite values.
#> Warning: Removed 4 rows containing non-finite outside the scale range
#> (`stat_ydensity()`).
#> Warning in max(data$density, na.rm = TRUE): no non-missing arguments to max;
#> returning -Inf
#> Warning: Computation failed in `stat_ydensity()`.
#> Caused by error in `$<-.data.frame`:
#> ! replacement has 1 row, data has 0
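
# A possible way to get usable timings from this comparison (a sketch under
# stated assumptions, not run as part of this reprex). The warnings above
# suggest microbenchmark() had nothing to time, which is consistent with the
# blocks having been evaluated before the call. Passing each block as a named
# argument hands microbenchmark() the unevaluated expression instead.
# Assumptions: `sf_obj` and the packages loaded above are reused, including
# the pinned geoarrow version that provides write_geoparquet() and
# read_geoparquet_sf(). The labels `gpkg` and `parquet`, the object name
# `gpkg_parquet_timed`, and quiet = TRUE are choices made for this sketch.
gpkg_parquet_timed <- microbenchmark(
  gpkg = {
    # Write to and read from GeoPackage, suppressing sf's progress messages
    tf_gpkg <- tempfile(fileext = ".gpkg")
    st_write(sf_obj, tf_gpkg, quiet = TRUE)
    st_read(tf_gpkg, quiet = TRUE)
  },
  parquet = {
    # Write to and read from parquet
    tf_parquet <- tempfile(fileext = ".parquet")
    write_geoparquet(sf_obj, tf_parquet)
    read_geoparquet_sf(tf_parquet)
  },
  times = 2L
)

# Each expression should now contribute `times` positive measurements
summary(gpkg_parquet_timed)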

Created on 2024-10-11 with reprex v2.1.1