@jlehtoma
Created October 28, 2016 08:59
Testing different variants of accessing WFS data in R
library(gdalUtils)
library(rgdal)
library(rwfs)
library(sp)
dsn_espoo <- "WFS:http://kartat.espoo.fi/TeklaOgcWeb/WFS.ashx?service=WFS&REQUEST=GetCapabilities&maxFeatures=100000"
dsn_vantaa <- "WFS:http://gis.vantaa.fi/geoserver/wfs"
# Variant 1 ----------------------------------------------------------------
start_time1 <- Sys.time()
dsn_hel <- "WFS:http://kartta.hel.fi/ws/geoserver/avoindata/wfs?version=1.1.0&REQUEST=GetCapabilities&maxFeatures=100000"
gdalUtils::ogrinfo(dsn_hel, so = FALSE)
# Read point data
# NOTE 1: Using ogr2ogr assumes that it is available on the system (the
# original issue with rwfs).
# NOTE 2: Using ogr2ogr is fast. A preliminary test showed that, with the
# implementations given here, Variant 1 (ogr2ogr) is 2x faster than
# Variant 2 (stream directly), which in turn is 2x faster than Variant 3
# (download and read). In the last case, disambiguating FIDs probably takes
# an unnecessary amount of time.
# NOTE 3: Using f = "GML" fails; using f = "ESRI Shapefile" mangles
# field names.
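temp_folder <- "data/temp_helsinki_rakennukset_rekisteripisteet"
# Make sure the parent folder exists before ogr2ogr writes into it
dir.create(dirname(temp_folder), showWarnings = FALSE, recursive = TRUE)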
gdalUtils::ogr2ogr(dsn_hel, temp_folder,
                   "avoindata:Rakennukset_rekisteripisteet",
                   f = "MapInfo File")
# NOTE: macOS Finder displays ":" in file names as "/", so the files are
# shown as avoindata/Rakennukset instead of avoindata:Rakennukset in Finder
gis_file <- file.path(temp_folder, "avoindata:Rakennukset_rekisteripisteet.tab")
layer1 <- rgdal::ogrListLayers(gis_file)
helsinki_rakennukset_rekisteripisteet <- rgdal::readOGR(gis_file,
                                                        layer = layer1,
                                                        stringsAsFactors = FALSE)
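# Force the unit to seconds; a plain difference of Sys.time() values picks its
# unit automatically (e.g. minutes), which would make the message wrong
stop_time1 <- difftime(Sys.time(), start_time1, units = "secs")
message("First variant took ", round(as.numeric(stop_time1), 1), " seconds")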
# Variant 2 ---------------------------------------------------------------
# Stream directly
start_time2 <- Sys.time()
# First, get layers
# NOTE: streaming seems to be extremely volatile and error-prone.
dsn_hel1 <- "WFS:http://kartta.hel.fi/ws/geoserver/avoindata/wfs?version=1.1.0&REQUEST=GetCapabilities"
hel_layers <- rgdal::ogrListLayers(dsn_hel1)
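# Read data from the WFS. At the time of writing, layer
# "avoindata:Rakennukset_rekisteripisteet" is hel_layers[16]; the index depends
# on the order in which the server lists its layers.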
hel_rp1 <- rgdal::readOGR(dsn_hel1, layer = hel_layers[16])
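# Force the unit to seconds so that it matches the message text
stop_time2 <- difftime(Sys.time(), start_time2, units = "secs")
message("Second variant took ", round(as.numeric(stop_time2), 1), " seconds")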
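# Sketch (not part of the original test): the hard-coded index 16 breaks if the
# server adds or reorders layers. Looking the layer up by name is one possible
# way around that. find_layer() is a hypothetical helper introduced here for
# illustration only; it is not part of rgdal or rwfs.
find_layer <- function(layers, name) {
  # match() returns the position of the first exact match, or NA if none
  idx <- match(name, layers)
  if (is.na(idx)) {
    stop("Layer not found: ", name)
  }
  layers[[idx]]
}
# Possible usage, assuming hel_layers was fetched as above:
# hel_rp1 <- rgdal::readOGR(
#   dsn_hel1,
#   layer = find_layer(hel_layers, "avoindata:Rakennukset_rekisteripisteet")
# )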
# Variant 3 ---------------------------------------------------------------
# Download and read
start_time3 <- Sys.time()
# Get data
dsn_hel_baseurl <- "http://kartta.hel.fi/ws/geoserver/avoindata/wfs"
dsn_hel2 <- paste0(dsn_hel_baseurl, "?version=1.1.0&request=GetFeature&typeName=",
                   hel_layers[16])
dest_file <- file.path(temp_folder, paste0(hel_layers[16], ".gml"))
download.file(dsn_hel2, dest_file, method = "internal", quiet = TRUE)
# NOTE: in the downloaded GML file the layer name is altered:
# "avoindata:Rakennukset_rekisteripisteet" -> "Rakennukset_rekisteripisteet"
hel_rp2 <- rgdal::readOGR(dest_file, layer = gsub("avoindata:", "", hel_layers[16]),
                          disambiguateFIDs = TRUE)
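# Force the unit to seconds so that it matches the message text
stop_time3 <- difftime(Sys.time(), start_time3, units = "secs")
message("Third variant took ", round(as.numeric(stop_time3), 1), " seconds")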
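# Sketch (not part of the original test): the three variants repeat the same
# start/stop timing pattern. One possible way to factor it out is a small
# wrapper around base R's system.time(), which always reports elapsed time in
# seconds. time_variant() is a hypothetical helper introduced here for
# illustration only.
time_variant <- function(label, expr) {
  # expr is evaluated lazily inside system.time(), so the work is what gets timed
  elapsed <- system.time(result <- expr)[["elapsed"]]
  message(label, " took ", round(elapsed, 1), " seconds")
  invisible(result)
}
# Possible usage, assuming dsn_hel1 and hel_layers are defined as above:
# hel_rp1 <- time_variant("Second variant",
#                         rgdal::readOGR(dsn_hel1, layer = hel_layers[16]))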