Michael Sumner mdsumner

  • Integrated Digital East Antarctica, Australian Antarctic Division
  • Hobart, Australia

Two datasets are covered here: the Reynolds 0.25° OISST and the NSIDC 25 km southern polar sea ice concentration. Both are current for their collections and are updated daily as new data are published.

They are synchronized by {bowerbird} to disk and object storage, and delivered to R users via {raadtools} on OpenStack VMs or at HQ.

Creating VirtualiZarr references for these datasets is an incremental step towards opening up raadtools to the public and to cross-language use.

import xarray

## public bucket on Pawsey endpoint

R code for copying this cool toot: https://en.osm.town/@koriander/113459280926992782

src <- "/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM0.zip"
sql <- "SELECT shapeGroup FROM geoBoundariesCGAZ_ADM0 WHERE shapeGroup IN ('ATA')"
library(terra)
laea <- project(vect(src), "+proj=laea +lat_0=90")

ant <- project(vect(src, query = sql), "+proj=laea +lat_0=-90")
mdsumner / vtiff.md
Last active November 10, 2024 03:50

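from virtualizarr import open_virtual_dataset

def open_virtual(filepath, creds):
    # Open one file as a virtual dataset, loading only the coordinate-like variables
    ds = open_virtual_dataset(
        filepath,
        indexes={},
        loadable_variables=['x', 'y', 'time', 'crs'],
        decode_times=True,
        reader_options={'storage_options': creds},
    )
    return ds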

zarr-developers/VirtualiZarr#291

library(terra)

options(parallelly.fork.enable = TRUE, future.rng.onMisuse = "ignore")
library(furrr)



topo <- project(rast(sds::gebco()), rast(), by_util = TRUE)
plot(topo)

Read Parquet from S3 in Python:

import pyarrow.parquet as pq
from pyarrow import fs

aws_credentials = {"endpoint_url": "https://projects.pawsey.org.au",  "anon": True}

# anonymous=True so pyarrow does not look for AWS credentials on a public bucket
s3 = fs.S3FileSystem(endpoint_override="projects.pawsey.org.au", anonymous=True)
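
The snippet above describes the same endpoint in two dialects: the `endpoint_url` and `anon` keys are the fsspec/s3fs style, while `pyarrow.fs.S3FileSystem` takes `endpoint_override`, `scheme` and `anonymous`. A minimal sketch of translating one into the other, assuming only those two keys matter; the helper name `to_pyarrow_s3_kwargs` is hypothetical and not part of either library:

```python
from urllib.parse import urlparse


def to_pyarrow_s3_kwargs(storage_options):
    """Translate fsspec-style S3 options into pyarrow.fs.S3FileSystem kwargs.

    Hypothetical helper: only 'endpoint_url' and 'anon' are handled.
    """
    kwargs = {"anonymous": bool(storage_options.get("anon", False))}
    endpoint = storage_options.get("endpoint_url")
    if endpoint:
        parts = urlparse(endpoint)
        if parts.netloc:
            # pyarrow takes host[:port] and the scheme as separate arguments
            kwargs["endpoint_override"] = parts.netloc
            kwargs["scheme"] = parts.scheme
        else:
            # no scheme given, pass the host through unchanged
            kwargs["endpoint_override"] = endpoint
    return kwargs


aws_credentials = {"endpoint_url": "https://projects.pawsey.org.au", "anon": True}
s3_kwargs = to_pyarrow_s3_kwargs(aws_credentials)
print(s3_kwargs)
```

With pyarrow installed, `fs.S3FileSystem(**s3_kwargs)` should then give an anonymous filesystem for the endpoint, so the credentials only need to be written once.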


NetCDF files sit in the public bucket idea-10.7289-v5sq8xb5/ on the Pawsey endpoint, but with the VirtualiZarr references in Parquet we don't need to read the NetCDF directly:

import xarray

## public bucket on Pawsey endpoint
so = {"endpoint_url": "https://projects.pawsey.org.au",  "anon": True}



Using terra to reproject GEBCO to HEALPix

s <- "/vsicurl/https://gebco2023.s3.valeria.science/gebco_2023_land_cog.tif"
library(terra)
A <- 20037508.342789244
target <- rast(ext(c(-1, 1, -.5, .5) * A), res = 25000, crs = "+proj=healpix")

r <- project(rast(s), target, by_util = TRUE)
plot(r)
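
The constant `A` appears to be half the WGS84 equatorial circumference (pi times 6378137 m), the same value that bounds Web Mercator; that reading is an assumption, since the snippet does not say where the number comes from. A small Python sketch checking the arithmetic and the implied grid size at 25000 m resolution (how terra rounds to whole cells is not modelled here):

```python
import math

WGS84_A = 6378137.0      # WGS84 semi-major axis in metres (assumed origin of A)
A = 20037508.342789244   # constant copied from the R snippet

half_circumference = math.pi * WGS84_A
res = 25000.0            # target cell size in metres

# extent used for the target grid: c(-1, 1, -.5, .5) * A
xmin, xmax, ymin, ymax = (-A, A, -0.5 * A, 0.5 * A)

# fractional cell counts before any rounding to whole cells
cols = (xmax - xmin) / res
rows = (ymax - ymin) / res

print(abs(half_circumference - A) < 1e-6)
print(cols, rows)
```

The 2:1 extent matches the overall footprint of the HEALPix projection, which spans the full equatorial circumference in x and half of it in y.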