@kandersolar
Created August 27, 2021 21:29
timing comparisons of reading hdf5 and netcdf4 data with various python libraries
@mikofski

Thanks for this. OK, I didn't realize that the layout of LinkeTurbidities.h5 was already sane: it's a plain HDF5 dataset, not something written by pandas. I can read it with tables, as the existing code does, or with h5py just as easily. I always assumed we were using tables because of pandas, but now I realize that pandas isn't used here at all.

import pvlib
import pathlib
import tables
import h5py

# get linke turbidity hdf5 file
# note: it is sanely organized, not created using pandas
# note: pandas isn't used at all, no need for tables
# pvlib.__file__ points at pvlib/__init__.py, so take .parent to get the package directory
pvlib_path = pathlib.Path(pvlib.__file__).parent
tl_file = pvlib_path / 'data' / 'LinkeTurbidities.h5'

# using tables, looks the same as h5py
tl_tables = tables.open_file(tl_file)

# only difference is that tables uses dot notation
tl_tables.root.LinkeTurbidity[10, 5, :]
# array([38, 38, 38, 38, 40, 41, 42, 42, 40, 39, 38, 38], dtype=uint8)

alldata = tl_tables.root.LinkeTurbidity[:, :, :]  # it's a numpy array
alldata.shape
# (2160, 4320, 12)

alldata.dtype
# dtype('uint8')

tl_tables.close()  # must close if a context manager is not used

# use h5py, a Pythonic interface to the HDF5 library
tl_h5 = h5py.File(tl_file, 'r')  # open read-only
# main difference is that h5py uses keys (or paths) as indices instead of dot notation,
# dict-style, similar to field access on a numpy structured array
# note root, "/", is assumed, unless using a path as a key
tl_h5['LinkeTurbidity'][10, 5, :]
# array([38, 38, 38, 38, 40, 41, 42, 42, 40, 39, 38, 38], dtype=uint8)

tl_h5.close()  # same as tables: close explicitly if a context manager is not used

So it should be easy to just drop tables and use h5py instead, no problemo.
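For what it's worth, here is a rough sketch of what an h5py-only reader could look like if it used a context manager, so the file is closed automatically. This is not the pvlib implementation: `read_monthly_values` and `demo_turbidity.h5` are hypothetical names, and the tiny (2, 3, 12) array is a made-up stand-in for the real (2160, 4320, 12) dataset so the snippet runs without pvlib installed.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in for LinkeTurbidities.h5: same dataset name and dtype,
# but a tiny (2, 3, 12) grid filled with made-up values.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_turbidity.h5')
demo_data = np.arange(2 * 3 * 12, dtype=np.uint8).reshape(2, 3, 12)

with h5py.File(demo_path, 'w') as f:
    f.create_dataset('LinkeTurbidity', data=demo_data)


def read_monthly_values(path, lat_idx, lon_idx):
    """Return the 12 monthly values at one grid cell.

    Slicing the h5py dataset reads only the requested cell from disk,
    and the with-block closes the file even if indexing raises.
    """
    with h5py.File(path, 'r') as f:
        return f['LinkeTurbidity'][lat_idx, lon_idx, :]


monthly = read_monthly_values(demo_path, 1, 2)
print(monthly.shape, monthly.dtype)
```

The slice comes back as a regular numpy array, so nothing downstream needs to know whether tables or h5py did the reading.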
