A real Rust library for Zarr, including virtualization support, Icechunk integration and v3.
GDAL has its own internal Zarr v2 and v3 implementation; it works, but:
- we also have the flip between GDAL classic and multidim mode
- classic mode is 2D with bands (unrolled from higher dimensions when present, and no 1D vars)
- multidim is truly n-dimensional, but doesn't have the same reprojection power as 2D mode
- not exercised much, I personally found pretty fundamental bugs (fixed immediately but could do with more eyes on it)
- doesn't know about virtualization, but see https://lists.osgeo.org/pipermail/gdal-dev/2024-July/059256.html
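To make the "unrolled into bands" idea concrete, here is a minimal sketch of the index arithmetic, assuming the last two dimensions are y and x and that the extra dimensions are flattened in row-major order. The function names and the ordering are assumptions for illustration; GDAL's actual band order may differ.

```python
from math import prod


def band_count(shape):
    """Number of 2D bands needed for an n-D array whose last two dims are (y, x)."""
    return prod(shape[:-2])  # prod of an empty tuple is 1, so a 2D array is one band


def band_to_index(band, extra_shape):
    """Map a 1-based band number back to indices along the non-spatial dims.

    Assumes row-major flattening (last extra dimension varies fastest).
    """
    if not 1 <= band <= prod(extra_shape):
        raise ValueError("band out of range")
    rem = band - 1
    idx = []
    for size in reversed(extra_shape):
        idx.append(rem % size)
        rem //= size
    return tuple(reversed(idx))


# Hypothetical array with dims (time=12, level=4, y=180, x=360)
shape = (12, 4, 180, 360)
n_bands = band_count(shape)
first = band_to_index(1, shape[:-2])
last = band_to_index(n_bands, shape[:-2])
```

The point is that classic mode loses the dimension structure: the reader has to carry the (time, level) bookkeeping separately, which multidim mode keeps for you.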
There are several existing partial pathways for Zarr in R:
- {sf} (for {stars}) has GDAL bindings and some support for multidim mode, not much attention (Carl Boettiger and me)
- {gdalraster} is a real API for GDAL but only for classic mode for now (I have started on multidim support for that)
- {pizzarr} an R-only package (no C), David Blodgett is a big supporter and has written netcdf-like wrappers
- {Rarr} on Bioconductor (a sibling repository to CRAN)
- nczarr: we can use {RNetCDF} or {ncdf4}, but cross-platform support is patchy and unstable; {tidync} and {stars} (read_ncdf) will already work this way without any changes (WIP, let's explore, but see https://gist.github.com/mdsumner/492d2a98bffc6de5974a96f50a0b75f2)
{pizzarr} and {Rarr} have these compression tools (with some limitations on settings):
- zlib/gzip
- bzip2
- blosc
- LZMA
- LZ4
- Zstd
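As a rough illustration of what a codec does to a chunk, here is a sketch that round-trips one chunk through the three codecs above that ship in the Python standard library (zlib, bzip2, LZMA). Blosc, LZ4 and Zstd are left out because they need extension packages. The chunk contents and the `roundtrip` helper are made up for the example, and this says nothing about how {pizzarr} or {Rarr} wire their codecs.

```python
import bz2
import lzma
import zlib

# Codec name -> (encode, decode), standard library only.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def roundtrip(chunk: bytes) -> dict:
    """Compress and decompress a chunk with each codec; return compressed sizes."""
    sizes = {}
    for name, (encode, decode) in CODECS.items():
        packed = encode(chunk)
        if decode(packed) != chunk:
            raise RuntimeError(f"{name} did not round-trip")
        sizes[name] = len(packed)
    return sizes


# Hypothetical 16 KiB chunk with a repeating byte pattern.
chunk = bytes(range(256)) * 64
sizes = roundtrip(chunk)
```

Each codec is just a bytes-to-bytes function pair, which is why the tools can be scattered across native code and extension packages.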
A fundamental issue in R is its narrow type support: Byte (raw), Int32 (integer and logical), and Float64 (numeric); the external package {bit64} provides Int64.
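A small sketch of why the narrow type set matters: most Zarr integer types widen safely into Int32 or Float64, but Int64 does not fit in Float64 once values pass 2^53. The `R_STORAGE` table is a hypothetical mapping written for this note, not something taken from any of the packages above.

```python
# Hypothetical lookup: Zarr data type name -> R storage type that can hold it.
R_STORAGE = {
    "bool": "logical",
    "uint8": "raw",
    "int8": "integer",
    "int16": "integer",
    "uint16": "integer",
    "int32": "integer",
    "uint32": "double",   # exceeds Int32, but every value is exact in Float64
    "float32": "double",
    "float64": "double",
    "int64": "bit64::integer64",  # needs the external {bit64} package
}


def fits_float64_exactly(value: int) -> bool:
    """True if an integer survives a round trip through a Float64."""
    return int(float(value)) == value


ok = fits_float64_exactly(2**53)           # largest power of two before gaps appear
lossy = fits_float64_exactly(2**53 + 1)    # rounds to a neighbouring value
int64_max = fits_float64_exactly(2**63 - 1)
```

So reading Int64 coordinates or IDs as plain numeric silently changes large values, which is the case {bit64} exists for.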
None of the R packages support virtualization (kerchunk, VirtualiZarr), but the nczarr approach must support some of it; technically OPeNDAP is dmr++, so it can't be too far off. The biggest gap in netcdf is being able to read remote stores, and having it built to support that.
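For context, virtualization here boils down to a table of references: each chunk key points at a byte range inside some existing file instead of a separate chunk object. Below is a simplified, kerchunk-like sketch, assuming references of the form `[url, offset, length]`. The `mem://file.nc` URL, the `open_source` helper and the payload are all invented for the example, and an in-memory buffer stands in for a remote store.

```python
import io

# Fake "remote file": a header followed by two 10-byte chunks.
FILES = {"mem://file.nc": b"HEADER" + b"chunk-zero" + b"chunk-one!"}

# Simplified reference table: chunk key -> [url, byte offset, byte length].
REFS = {
    "temp/0": ["mem://file.nc", 6, 10],
    "temp/1": ["mem://file.nc", 16, 10],
}


def open_source(url):
    """Stand-in for a range-capable reader; returns a seekable file object."""
    return io.BytesIO(FILES[url])


def read_chunk(refs, key):
    """Resolve a chunk key to bytes by reading only the referenced byte range."""
    url, offset, length = refs[key]
    with open_source(url) as f:
        f.seek(offset)
        return f.read(length)


chunk0 = read_chunk(REFS, "temp/0")
chunk1 = read_chunk(REFS, "temp/1")
```

The reader never parses the original file format, it only needs ranged reads, which is why remote-store support is the real gap.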
R itself just got native Zstd compression: https://github.com/wch/r-source/commit/7e16093f2c107d4965e0ebfaeea50865062df54d
I need to look at how pizzarr and Rarr do it, but generally the compression tools are scattered: some native, some in extension packages.
We could go a long way on Zarr with R itself, but it would be like the Python landscape, and it's probably time to push behind something fundamental like Rust-zarrs.