@mdsumner
Last active February 17, 2025 23:55

with stars (uses GDAL MULTIDIM)

## note: the /vsicurl/ path has to be inner-quoted inside the DSN string (I think the ZARR driver needs some attention here)
dsn <- "ZARR:\"/vsicurl/https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1\""
library(stars)
read_mdim(dsn, proxy = TRUE, bounds = FALSE, variable = "analysed_sst")
stars_proxy object with 1 attribute in 1 file(s):
$analysed_sst
[1] "[...]/zarr-v1"

dimension(s):
     from    to     offset  delta refsys                                    values x/y
lon     1 36000         NA     NA     NA [-179.995,-179.985),...,[179.995,180.005) [x]
lat     1 17999         NA     NA     NA     [-89.995,-89.985),...,[89.985,89.995) [y]
time    1  6443 2002-06-01 1 days   Date                                      NULL    
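
The inner quoting can also be built without hand-escaping, by using single quotes for the outer R string. A minimal sketch: `zarr_dsn()` is a hypothetical helper made up for illustration, not part of stars or GDAL.

```r
## hypothetical helper: wrap a URL in the ZARR:"/vsicurl/<url>" form used above
zarr_dsn <- function(url) {
  ## single-quoted R string, so the inner double quotes need no backslashes
  sprintf('ZARR:"/vsicurl/%s"', url)
}

dsn2 <- zarr_dsn("https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1")

## compare against the hand-escaped string from the top of the gist
identical(dsn2, "ZARR:\"/vsicurl/https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1\"")
```

The resulting string can be passed to read_mdim() in the same way as dsn above.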


@mdsumner
mdsumner commented Feb 17, 2025

with pizzarr, installed via remotes::install_github("keller-mark/pizzarr")

url <- "https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1"
z <- pizzarr::HttpStore$new(url)
z
<HttpStore>
  Inherits from: <Store>
  Public:
    clone: function (deep = FALSE)
    close: function ()
    contains_item: function (item)
    get_cache_time_seconds: function ()
    get_consolidated_metadata: function ()
    get_item: function (item)
    initialize: function (url, options = NA, headers = NA)
    is_erasable: function ()
    is_listable: function ()
    is_readable: function ()
    is_writeable: function ()
    listdir: function ()
    metadata_class: Metadata2, R6
    rename: function (src_path, dst_path)
    rmdir: function (path)
    set_cache_time_seconds: function (seconds)
    set_item: function (key, value)
  Private:
    base_path: zarr-v1
    cache_enabled: TRUE
    cache_time_seconds: 3600
    client: HttpClient, R6
    domain: https://mur-sst.s3.us-west-2.amazonaws.com
    erasable: TRUE
    get_zmetadata: function ()
    headers: NA
    listable: TRUE
    listdir_from_keys: function (path)
    make_request: function (item)
    make_request_memoized: function (key)
    memoize_make_request: function ()
    options: NA
    readable: TRUE
    rename_from_keys: function ()
    rmdir_from_keys: function (path)
    store_version: 2
    url: https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1
    writeable: TRUE
    zmetadata: list

@mdsumner

with pizzarr, then

z$listdir()
[1] "analysed_sst"     "analysis_error"   "lat"              "lon"
[5] "mask"             "sea_ice_fraction" "time"


## pizzarr is not attached above, so call with the namespace prefix
g <- pizzarr::zarr_open_group(z)
g$get_attrs()$to_list()
$Conventions
[1] "CF-1.7"

$Metadata_Conventions
[1] "Unidata Observation Dataset v1.0"

$acknowledgment
[1] "Please acknowledge the use of these data with the following statement:  These data were provided by JPL under support by NASA MEaSUREs program."

$cdm_data_type
[1] "grid"

$comment
[1] "MUR = \"Multi-scale Ultra-high Resolution\""

$creator_email
[1] "[email protected]"

$creator_name
[1] "JPL MUR SST project"

$creator_url
[1] "http://mur.jpl.nasa.gov"

$date_created
[1] "20200124T010755Z"

$easternmost_longitude
[1] 180

$file_quality_level
[1] 3
...

@mdsumner

to actually get data with BLOSC encoding we need Rarr

##BiocManager::install('Rarr')

so then

sst <- g$get_item("analysed_sst")
sst$get_shape()
# [1]  6443 17999 36000
sst$get_compressor()
<BloscCodec>
  Inherits from: <Codec>
  Public:
    blocksize: 0
    clevel: 5
    clone: function (deep = FALSE)
    cname: lz4
    decode: function (buf, zarr_arr)
    encode: function (buf, zarr_arr)
    get_config: function ()
    initialize: function (cname = "lz4", clevel = 5, shuffle = TRUE, blocksize = NA,
    shuffle: 1


## don't get this yet
##sst$get_item(list(slice(1), slice(1, 10), slice(1:10)))
