@mdsumner
Last active July 1, 2025 05:22

This works to load, but we can't sel() or isel() it sensibly. Any ideas?

import virtualizarr
#virtualizarr.__version__
#'1.3.3.dev81+ga5d04d7'

from obstore.store import HTTPStore
from virtualizarr.parsers import HDFParser

parser = HDFParser()
store = HTTPStore(url="https://thredds.nci.org.au")

nc = [
    'https://thredds.nci.org.au/thredds/fileServer/gb6/BRAN/BRAN2020/month/ocean_temp_mth_2019_05.nc',
    'https://thredds.nci.org.au/thredds/fileServer/gb6/BRAN/BRAN2020/month/ocean_temp_mth_2019_06.nc',
]
ds = virtualizarr.open_virtual_mfdataset(
    nc, object_store=store, parser=parser,
    drop_variables=["average_DT", "Time_bounds", "average_T1", "average_T2", "st_edges_ocean", "nv"])

ds.isel(yt_ocean = slice(0, 2))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workenv/lib/python3.12/site-packages/xarray/core/dataset.py", line 2778, in isel
    var = var.isel(var_indexers)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/workenv/lib/python3.12/site-packages/xarray/core/variable.py", line 1045, in isel
    return self[key]
           ~~~~^^^^^
  File "/workenv/lib/python3.12/site-packages/xarray/core/variable.py", line 791, in __getitem__
    data = indexing.apply_indexer(indexable, indexer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workenv/lib/python3.12/site-packages/xarray/core/indexing.py", line 1038, in apply_indexer
    return indexable[indexer]
           ~~~~~~~~~^^^^^^^^^
  File "/workenv/lib/python3.12/site-packages/xarray/core/indexing.py", line 1564, in __getitem__
    return array[key]
           ~~~~~^^^^^
  File "/workenv/lib/python3.12/site-packages/virtualizarr/manifests/array.py", line 261, in __getitem__
    indexer = _possibly_expand_trailing_ellipsis(indexer, self.ndim)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workenv/lib/python3.12/site-packages/virtualizarr/manifests/array.py", line 366, in _possibly_expand_trailing_ellipsis
    raise ValueError(
ValueError: Invalid indexer for array. Indexer length must be less than or equal to the number of dimensions in the array, but indexer=(slice(None, None, None), slice(None, None, None), slice(0, 2, None), slice(None, None, None), Ellipsis) has length 5 and array has 4 dimensions.
If concatenating using xarray, ensure all non-coordinate data variables to be concatenated include the concatenation dimension, or consider passing `data_vars='minimal'` and `coords='minimal'` to the xarray combining function.
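
To make the length check in that message concrete, here is a minimal pure-Python sketch. It is not VirtualiZarr's code: `expand_trailing_ellipsis` is a hypothetical stand-in written only from the wording of the error, and it assumes the trailing Ellipsis is counted as part of the indexer length. Under that assumption the reported indexer (four explicit slices plus Ellipsis) is rejected for a 4-dimensional array, while an indexer with three explicit slices plus Ellipsis is expanded to four slices.

```python
def expand_trailing_ellipsis(indexer, ndim):
    """Hypothetical stand-in for the check named in the traceback.

    Assumption: the Ellipsis counts towards the indexer length, as the
    wording of the error message suggests.
    """
    if len(indexer) > ndim:
        raise ValueError(
            f"indexer has length {len(indexer)} and array has {ndim} dimensions"
        )
    if indexer and indexer[-1] is Ellipsis:
        # Replace the trailing Ellipsis with as many full slices as are missing.
        explicit = indexer[:-1]
        return explicit + (slice(None),) * (ndim - len(explicit))
    return indexer


# The indexer from the traceback: four explicit slices followed by Ellipsis.
reported = (slice(None), slice(None), slice(0, 2), slice(None), Ellipsis)
try:
    expand_trailing_ellipsis(reported, ndim=4)
    outcome = "accepted"
except ValueError:
    outcome = "rejected"

# One explicit slice fewer: the Ellipsis is expanded to a full slice.
shorter = (slice(None), slice(None), slice(0, 2), Ellipsis)
expanded = expand_trailing_ellipsis(shorter, ndim=4)
```

If that reading is right, the failing variable may already have all four dimensions, since NumPy itself accepts a trailing Ellipsis that expands to zero axes. That would be worth checking before changing the concatenation options.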

mdsumner commented Jul 1, 2025

Or with dask:

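import dask
import xarray as xr

# nc, store and parser are reused from the first snippet; passing
# object_store and parser here is assumed to be needed, as it is there
dask.config.set(num_workers=24, scheduler="processes")
drop = ["average_DT", "Time_bounds", "average_T1", "average_T2", "st_edges_ocean", "nv"]
lds = [
    dask.delayed(virtualizarr.open_virtual_dataset)(
        xnc, object_store=store, parser=parser, drop_variables=drop)
    for xnc in nc
]
xr_concat_kwargs = {
    "coords": "minimal",
    "compat": "override",
    "data_vars": "minimal",
}

# dask.compute returns a tuple with one entry per argument,
# so vd[0] is the list of opened virtual datasets
vd = dask.compute(lds)

ds = xr.concat(vd[0], dim="Time", **xr_concat_kwargs)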
