Skip to content

Instantly share code, notes, and snippets.

@mdsumner
Created March 17, 2025 10:03
Show Gist options
  • Save mdsumner/3f6c2722c3ce594356b5b6c23fa605c9 to your computer and use it in GitHub Desktop.
Save mdsumner/3f6c2722c3ce594356b5b6c23fa605c9 to your computer and use it in GitHub Desktop.

On this site is a lot of ZARR datasets, ARCO (analysis ready cloud optimized).

https://catalog.leap.columbia.edu/

There are example codes that show how to open in xarray. We can do this is in R by loading up the GDAL api via the Python osgeo.gdal package.

The new version of reticulate makes this easy to do.

reticulate::py_require("gdal")
gdal <- reticulate::import("osgeo.gdal")

Now this ZARR store is just a URL but we need some special syntax to use it in GDAL, essentially declare as ZARR, and carefully quote around the url with "/vsicurl/".

https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/pangeo-forge/AGDC-feedstock/AGCD.zarr

We use gdal MULTIDIM mode to open it as a group of n-dimensional arrays.

dsn <- "ZARR:\"/vsicurl/https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/pangeo-forge/AGDC-feedstock/AGCD.zarr\""
ds <- gdal$OpenEx(dsn, gdal$OF_MULTIDIM_RASTER)

We need the root group and then can ask what's in there.

rg$GetMDArrayNames()
##[1] "lat"        "lon"        "time"       "precip"     "tmax"       "tmin"       "vapourpres"

Let's go for tmax, and find out what it is.

tmax <- rg$OpenMDArrayFromFullname("/tmax")
ivar <- lapply(tmax$GetDimensions(), \(.x) .x$GetIndexingVariable()$ReadAsArray())
lapply(ivar, range)
[[1]]
[1] 44194.38 62090.38

[[2]]
[1] -44.5 -10.0

[[3]]
[1] 112.00 156.25

It's (probably) time, lat, lon.

(the metadata is there but we save that for another time)

Get one slice in time.

t1 <- tmax$GetView("[100,:,:]")$ReadAsArray()  ## time step 101 (all lats, all lons)
image(ivar[[3]], ivar[[2]], t(t1), asp = 1/cos(mean(ivar[[2]]) * pi/180), col = hcl.colors(56))

image

The values look ok, no offset scale to do.

range(t1)
[1]  8.865234 39.677734
tmax$GetNoDataValue()
[1] NaN
tmax$GetOffset()  ## None
tmax$GetScale()   ## None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment