#that for me needs env vars set:
#COPERNICUSMARINE_SERVICE_USERNAME
#COPERNICUSMARINE_SERVICE_PASSWORD
#that is able to obtain the "Authorization Bearer: <token>", which fwiw can be used directly with any GDAL tool
## via GDAL_HTTP_HEADERS or GDAL_HTTP_HEADER_FILE (and we can use multidim, coming soon to R)
## so you don't have to use the packages stars' interface (lots of options):
#https://github.com/pepijn-devries/CopernicusMarine/blob/a006052d343b09a52e0b9ff8e7aadd998ad9000f/R/cms_login.r#L25-L33
<VRTDataset> | |
<Group name="/"> | |
<Dimension name="Time" type="TEMPORAL" size="5479" indexingVariable="Time" /> | |
<Dimension name="st_ocean" type="VERTICAL" direction="DOWN" size="51" indexingVariable="st_ocean" /> | |
<Dimension name="xt_ocean" type="HORIZONTAL_X" size="3600" indexingVariable="xt_ocean" /> | |
<Dimension name="yt_ocean" type="HORIZONTAL_Y" size="1500" indexingVariable="yt_ocean" /> | |
<Array name="Time"> | |
<DataType>Float64</DataType> | |
<DimensionRef ref="Time" /> | |
<RegularlySpacedValues start="11323.5" increment="1" /> |
new gdal mdim mosaic, with 180 Bluelink daily ocean_temp netcdf files from Thredds
## obtain a list of Thredds urls (we previously cached, we grep out daily_ocean_temp, 180 NetCDFs)
files <- arrow::read_parquet("https://github.com/mdsumner/dryrun/releases/download/latest/BRAN-netcdf-2023-netcdf.parquet") |>
dplyr::filter(stringr::str_detect(url, ".*daily.*temp.*\\.nc$"))
## generate gdal mdim mosaic command and run, a few minutes
maybe fix for terra::stretch(histeq = TRUE)
stretch_hist <- function(x, ...) {
## stretch as if all the pixels were in the same band (not memory safe)
rv <- terra::stretch(terra::rast(matrix(terra::values(x))), histeq = TRUE)
## set the values to the input, then stretch to 0,255
terra::stretch(terra::setValues(x, c(terra::values(rv))), histeq = FALSE)
}
Just an example that was mentioned, I've been playing with vrtility today looking at other parts of the world too and learning how it works.
A lot to say, this is a pretty big area for Sentinel (~180x220km), which is 5561, 4483 pixels at 40m resolution (so reduce the size of extend_x, extend_y (in degrees), and move the POINT for where you want the centre. Can go as fine as 10m resolution.
You don't really need median calc done if there isn't a lot of cloud, and we don't necessarily need mask either, date can be as fine as a two-day window (but there's only imagery every few days, depending).
library(vrtility)
# La Pampa
x0 <- c(xmin = -65.5, ymin = -39.5, xmax = -62, ymax = -34)
This url has a netcdf:
"https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/198110/oisst-avhrr-v02r01.19811002.nc"
it has correct georeferencing (bbox 0, -90, 360, 90) but doesn't have the crs set ("EPSG:4326" is appropriate)
it also has multiple variables, which traditionally requires this verbose "subdataset syntax" for GDAL:
terra::vect("/vsicurl/https://github.com/mdsumner/geoboundaries/releases/download/latest/geoBoundariesCGAZ_ADM0.parquet")
class : SpatVector
geometry : polygons
dimensions : 218, 3 (geometries, attributes)
extent : -180, 180, -90, 83.63339 (xmin, xmax, ymin, ymax)
source : geoBoundariesCGAZ_ADM0.parquet
/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM0.zip
/vsizip//vsicurl/listdata.thelist.tas.gov.au/opendata/data/LIST_PARCELS_HOBART.zip/list_parcels_hobart.shp
/vsizip//vsicurl/https://github.com/AustralianAntarcticDivision/rema.proc/raw/refs/heads/master/01_rock_classification/Medium_resolution_vector_polygons_of_Antarctic_rock_outcrop_-_VERSION_7.3.zip/Medium resolution vector polygons of Antarctic rock outcrop - VERSION 7.3/add_rock_outcrop_medium_res_polygon_v7.3.gpkg
Please, I think I'm doing something wrong here. With open_virtual_mfdataset( , parallel = False, )
this runs in about 2 minutes.
With open_virtual_mfdataset( , parallel = ThreadPoolExecutor, , )
it takes the same length of time (system with 32 cpus).
import xarray as xr
from obstore.store import from_url
import time
from virtualizarr import open_virtual_dataset, open_virtual_mfdataset