Skip to content

Instantly share code, notes, and snippets.

@obrl-soil
Last active September 7, 2017 22:41
Show Gist options
  • Save obrl-soil/d311c36d0461851aa82869408f2e368e to your computer and use it in GitHub Desktop.
Save obrl-soil/d311c36d0461851aa82869408f2e368e to your computer and use it in GitHub Desktop.
library(sf)
library(tidyverse)
setwd('C:/data/prepair_test')
# a file with known problems, also at https://github.com/obrl-soil/bits-n-pieces/tree/master/r2p
tester <- read_sf('C:/data/r2p/cat_v_pretty.gpkg')
# I used the QGIS topology checker tool to find invalid geometries. Row 130 is no good:
plot(st_geometry(tester[130, ]))
# test if prepair can fix it
bad_130 <- st_as_text(st_geometry(tester[130, ]))
fixed_130 <- system2('prepair.exe', args = c('--wkt', paste0('"', bad_130, '"')), stdout = TRUE)
# cool! re-sf the output:
fixed_130 <- st_as_sf(data.frame('ID' = 130, 'geometry' = fixed_130), wkt = 'geometry',
agr = 'constant', crs = st_crs(tester)$proj4string)
plot(st_geometry(fixed_130)) # looks the same, good good
# If I replace the geometry for that row, the resulting file should have one less invalid geometry
tester_fixed <- tester
st_geometry(tester_fixed[130, ]) <- st_geometry(fixed_130[1, ])
identical(st_geometry(tester[130, ]),
st_geometry(tester_fixed[130, ])) # FALSE, mission accomplished
tester_fixed <- st_cast(tester_fixed, 'MULTIPOLYGON') # since the replacement polygon is this type
st_write(tester_fixed, 'C:/data/prepair_test/tester_fixed_130.gpkg', delete_dsn = TRUE)
# spoiler: it does! now for the other problem polygons...
# I need a function to fix all the problem geometries in the above file:
fixer_upper <- function(bad_sfobj = NULL) {
message(paste0(Sys.time(), ': prepair iteration in progress...'))
g <- st_geometry(bad_sfobj)
new_g <- lapply(seq_along(g), function(gx) {
gxt <- st_as_text(g[gx])
# hahaha this is so terrible but some wkts are too big to paste into cmd:
fixed <- if(nchar(gxt) > 7500) {
write(gxt, file = 'C:/DATA/prepair_test/prepair_temp.txt')
f <- system2('prepair.exe',
args = c('-f', 'C:/DATA/prepair_test/prepair_temp.txt'),
stdout = TRUE)
do.call('paste0', as.list(f))
} else {
system2('prepair.exe', args = c('--wkt', paste0('"', gxt, '"')), stdout = TRUE)
}
gx_fix <- st_as_sf(data.frame('ID' = gx, 'geometry' = fixed), wkt = 'geometry',
agr = 'constant', crs = st_crs(bad_sfobj)$proj4string)
# for testing - 5-10 geoms/sec, probs 90% of time is wasted on write/read above
print(paste0(gx, ' complete'))
gx_fix
})
message(paste0(Sys.time(), ': assembling repaired geometries...'))
new_g <- do.call('rbind', new_g) # secondary bottleneck, ~1 minute for 3600 rows
st_geometry(bad_sfobj) <- st_geometry(new_g)
message(paste0(Sys.time(), ': ...repair complete'))
bad_sfobj <- st_cast(bad_sfobj, 'MULTIPOLYGON')
}
fixed_up <- fixer_upper(tester)
st_write(fixed_up, 'C:/data/prepair_test/fixed_up.shp', delete_dsn = TRUE)
# the output passes the QGIS topo check for invalid geometries \o/
@hugoledoux
Copy link

I tried your file that failed: it's fine here. Under macOS.

It's either something with the precision libraries and Windows; or the fact that your WKT conversion is not giving your enough precision. I suggest you try to read a SHP directly and output a SHP too, instead of converting to text.

In any ways, the fixed one is there:
https://www.dropbox.com/sh/gun9s55ptx3rudk/AADHbAbQbgbNrkZvgB2zfm_ea?dl=0

@obrl-soil
Copy link
Author

obrl-soil commented Sep 6, 2017

I think it may be a Windows issue then @hugoledoux, on the command line prepair.exe --ogr bad_poly_37.shp still causes a crash.

Edit: Whoops, turns out I'd grabbed the old build by mistake. The new build works perfectly, although I did have to copy over libmpfr-4.dll and libgmp-10.dll from the older version first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment