@mdsumner
Last active February 11, 2025 04:16

A small list of global datasets on Pawsey object storage.

'NSIDC_SEAICE_PS_S25km' and 'NSIDC_SEAICE_PS_N25km' are the 25km sea ice concentrations (south and north polar stereographic grids), from 1978 to the present.

'SEALEVEL_GLO_PHY_L4' is Copernicus altimetry, global longlat since 1993 (ssh, u/v currents).

'oisst-avhrr-v02r01' is OISST 25km global longlat SST since 1982.

Each dataset is a virtual Zarr store referencing NetCDF objects on Pawsey storage, with the references stored as "kerchunk Parquet".

Can be run with this big messy docker image:

docker run --rm -ti ghcr.io/mdsumner/gdal-builds:rocker-gdal-dev-python bash
python
import xarray

## public bucket on Pawsey endpoint
so = {"endpoint_url": "https://projects.pawsey.org.au",  "anon": True}

datasetlist = ['SEALEVEL_GLO_PHY_L4', 'NSIDC_SEAICE_PS_S25km', 'NSIDC_SEAICE_PS_N25km', 'oisst-avhrr-v02r01']

i = 0

ds = xarray.open_dataset(f's3://vzarr/{datasetlist[i]}.parquet', engine = "kerchunk", chunks = {}, storage_options={"target_options": so, "remote_options": so})


print(ds)
@mdsumner
Author

We have to change this now: it works if "target_options" is removed from "storage_options", leaving only "remote_options".

import xarray

## public bucket on Pawsey endpoint
so = {"endpoint_url": "https://projects.pawsey.org.au",  "anon": True}

datasetlist = ['SEALEVEL_GLO_PHY_L4', 'NSIDC_SEAICE_PS_S25km', 'NSIDC_SEAICE_PS_N25km', 'oisst-avhrr-v02r01']

i = 0

ds = xarray.open_dataset(f's3://vzarr/{datasetlist[i]}.parquet', engine = "kerchunk", chunks = {}, storage_options={"remote_options": so})
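
To avoid indexing `datasetlist` by position, the updated call can be wrapped in a small helper. This is only a sketch: the names `DATASETS`, `PAWSEY_OPTIONS`, `reference_url` and `open_virtual_dataset` are made up for illustration and are not part of the gist. It assumes the "kerchunk" engine is available to xarray, as it is in the docker image above.

```python
# Dataset names and endpoint copied from the gist.
DATASETS = [
    "SEALEVEL_GLO_PHY_L4",
    "NSIDC_SEAICE_PS_S25km",
    "NSIDC_SEAICE_PS_N25km",
    "oisst-avhrr-v02r01",
]

# Public bucket on the Pawsey endpoint (anonymous access).
PAWSEY_OPTIONS = {"endpoint_url": "https://projects.pawsey.org.au", "anon": True}


def reference_url(name, bucket="vzarr"):
    """Build the s3:// URL of the kerchunk Parquet reference for one dataset."""
    if name not in DATASETS:
        raise ValueError(f"unknown dataset {name!r}, expected one of {DATASETS}")
    return f"s3://{bucket}/{name}.parquet"


def open_virtual_dataset(name):
    """Open one dataset lazily, using the updated call (only "remote_options")."""
    import xarray  # imported here so reference_url() works without xarray installed

    return xarray.open_dataset(
        reference_url(name),
        engine="kerchunk",
        chunks={},
        storage_options={"remote_options": PAWSEY_OPTIONS},
    )
```

With network access to the Pawsey endpoint, `print(open_virtual_dataset("oisst-avhrr-v02r01"))` should then behave like the `print(ds)` step in the original snippet.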
