Skip to content

Instantly share code, notes, and snippets.

View mdsumner's full-sized avatar

Michael Sumner mdsumner

View GitHub Profile

finally, do the basic thing - this is like one line of basic generic config and code in GDAL ...

import fsspec
import xarray as xr
import zarr

pawsey_s3_config = {
     "anon": True,
     "client_kwargs": {
f <- c("/rdsi/PUBLIC/raad/data/data.marine.copernicus.eu/SEALEVEL_GLO_PHY_L4_NRT_008_046/cmems_obs-sl_glo_phy-ssh_nrt_allsat-l4-duacs-0.25deg_P1D_202311/2022/01/nrt_global_allsat_phy_l4_20220111_20220111.nc", 
  "/rdsi/PUBLIC/raad/data/data.marine.copernicus.eu/SEALEVEL_GLO_PHY_L4_NRT_008_046/cmems_obs-sl_glo_phy-ssh_nrt_allsat-l4-duacs-0.25deg_P1D_202311/2022/01/nrt_global_allsat_phy_l4_20220112_20220118.nc",
"/rdsi/PUBLIC/raad/data/data.marine.copernicus.eu/SEALEVEL_GLO_PHY_L4_NRT_008_046/cmems_obs-sl_glo_phy-ssh_nrt_allsat-l4-duacs-0.25deg_P1D_202311/2022/01/nrt_global_allsat_phy_l4_20220113_20220116.nc")

fs::file_size(f)
#9.05M 7.49M 9.03M

This local example in the Usage for VirtualiZarr is (still) broken:

https://virtualizarr.readthedocs.io/en/latest/usage.html#__tabbed_1_7

I personally find it impossible to remember all this rigmarole (what's wrong with a file path?), so I need a reference example that actually works:

## local file 'data.nc' in your wd e.g. 
# wget -O data.nc https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/198109/oisst-avhrr-v02r01.19810901.nc
+proj=stere +a=6378137.0 +b=6356752.3 +lat_0=-90 +lon_0=64 219383.7 309876.1 2473114 2944041

Bare minimum capability and library alignment for using GDAL R and Python together, using mdim or classic raster:

GDAL R and Python alignment

Single GDAL, single PROJ, single GEOS, R packages and Python packages linking the same libraries, reticulate bridging cleanly between them, modern numpy, modern Python, modern R, current GDAL master available or latest release.

library(terra)
dsn <- "/vsicurl/https://projects.pawsey.org.au/idea-gebco-tif/GEBCO_2025.tif"

ex <- c(-1, 1, -1, 1) * pi/2 * 6378137

## generate a template spilhaus 25km grid and project the COG to that
r &lt;- project(rast(dsn), rast(ext(ex), res = 25000, crs = "+proj=spilhaus"), by_util = TRUE)
Sys.setenv("AWS_NO_SIGN_REQUEST" = "YES")

dsn <- "/vsis3/us-west-2.opendata.source.coop/tge-labs/aef/v1/annual/2023/17N/xfbc3s1k0nrye0ub4-0000008192-0000000000.tiff"
r <- rast(sprintf("vrt://%s?ovr=3", dsn))[[1]]
par(mfrow = c(2, 1))
plot(r)

plot(r)
imdsn &lt;- "WMTS:https://services.arcgisonline.com/arcgis/rest/services/World_Imagery/MapServer/WMTS/1.0.0/WMTSCapabilities.xml,layer=World_Imagery"
/vsicurl/https://projects.pawsey.org.au/idea-gebco-tif/GEBCO_2025.tif

or s3

--endpoint-url=https://projects.pawsey.org.au
s3://idea-gebco-tif/GEBCO_2025.tif

New chunk read in https://github.com/hypertidy/zaro

library(zaro)
store <- zaro("virtualizarr://https://raw.githubusercontent.com/mdsumner/virtualized/refs/heads/main/remote/ocean_temp_2023.parq")
#> [zaro] opening VirtualiZarr Parquet reference store: https://raw.githubusercontent.com/mdsumner/virtualized/refs/heads/main/remote/ocean_temp_2023.parq
meta <- attr(zaro_meta(store), "consolidated")$temp
#> [zaro] found .zmetadata (Zarr V2 consolidated)
#&gt; [zaro] 11 arrays: Time, Time_bnds, average_DT, average_T1, average_T2, nv, st_edges_ocean, st_ocean, temp, xt_ocean, yt_ocean
# library(sedonadb)
# 
# # Works with files
# fgb <- system.file("files/natural-earth_cities.fgb", package = "sedonadb")
# sd_read_sf(fgb)

## worked a while already with GDAL
library(lazysf)
url <- "/vsicurl/https://github.com/apache/sedona-db/raw/refs/heads/main/r/sedonadb/inst/files/natural-earth_cities_geo.parquet"