Keywords: Cosmo_rea6 DWD wind-data python numpy
Purpose:
DWD, the German Meteorological Service,
has hourly wind data for Europe on a 6 x 6 km grid, from 1995 to August 2019,
at heights of 10, 100 .. 200 m.
These are in 2 gigabyte files like
https://opendata.dwd.de/climate_environment/REA/COSMO_REA6/converted/hourly/2D/WS_100m.2D.201908.nc4
-- windspeed data at 100m for 2019 August.
Cosmopy is a bare-bones package to
- clip out small areas of these files, e.g. Bayern, to numpy .npz files
- regrid the staggered Cosmo grid to a WGS84 uniform lat-lon grid, e.g. 0.05° x 0.05°.
Its design point: small programs with lots of print statements and verbose= options.
Of course, if you're happy with a particular package, be happy.
But if you value clarity over raw speed, like small solid packages, and like numpy, read on.
Files are generated by 3 shell scripts, wget-cosmo, clip-cosmo and regrid-cosmo,
which call python; see their help text, clip-cosmo -h etc.
After downloading with wget-cosmo, clip the gigabyte files to the few percent you want to look at.
Then, if need be, regrid to a uniform WGS84 grid.
Example files in this flow:
wget-cosmo: $cosmodata/raw/WS_150m.2D.201401.nc4 -- monthly 01 .. 12, 1.5 GB each
clip-cosmo: $cosmodata/BY-clip/WS_150m.2D.2014.npz -- numpy savez: flexible, can memmap
regrid-cosmo: $cosmodata/BY-grid/WS_150m.2D.2014.npz -- 0.05° x 0.05° grid to nearest cosmo
Here BY is an area name; see "Parameters" below.
If I understand correctly,
the data in opendata.dwd.de/climate_environment/REA/COSMO_REA6/.../*.nc4
is on a staggered grid, about 6 x 6 km.
The 2d RLAT RLON in those files are in an odd rotated coordinate system
which as far as I know has no CRS / no proj string.
Ignore them, and use only the RLAT RLON in
.../constant/COSMO_REA6_CONST_withOUTsponge.nc .
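A minimal sketch of that step, assuming the constants file has 2-d variables named RLAT and RLON
(check with ncdump -h; netCDF4 here is just one of several readers that work):

    # read the lat/lon of each staggered gridpoint from the constants file
    from netCDF4 import Dataset
    import numpy as np

    const = Dataset("COSMO_REA6_CONST_withOUTsponge.nc")
    rlat = np.asarray(const["RLAT"][:])   # 2-d latitude, one value per gridpoint
    rlon = np.asarray(const["RLON"][:])   # 2-d longitude
    print(rlat.shape, rlat.min(), rlat.max(), rlon.min(), rlon.max())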
To clip a particular N S W E box out of the gigabyte COSMO_REA6 files, use straight numpy,
see clip2d.py. For example, 48° .. 48.5° N x 11.25° .. 12° E:
mask: 83 points in (10, 10)
[[0 0 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 1]
[0 1 1 1 1 1 1 1 1 1]
[0 1 1 1 0 0 0 0 0 0]]
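A rough sketch of the idea with a toy regular grid (clip2d.py is the real code;
on the real staggered grid the box mask is ragged, like the 0/1 map above):

    # boolean-mask an N S W E box, then slice out the smallest enclosing rectangle
    import numpy as np

    def clip_box(rlat, rlon, south, north, west, east):
        """mask of points inside the box, and the enclosing j, k slices"""
        mask = (south <= rlat) & (rlat <= north) & (west <= rlon) & (rlon <= east)
        jj, kk = np.where(mask)
        return mask, slice(jj.min(), jj.max() + 1), slice(kk.min(), kk.max() + 1)

    # toy grid; real rlat rlon come from the constants file above
    rlat, rlon = np.meshgrid(np.linspace(47.5, 49, 28), np.linspace(11, 13, 37),
                             indexing="ij")
    mask, js, ks = clip_box(rlat, rlon, 48, 48.5, 11.25, 12)
    print("%d points in %s" % (mask[js, ks].sum(), mask[js, ks].shape))
    # windbox = wind_speed[:, js, ks]   # clip a whole (time, lat, lon) array the same way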
To find the j,k with RLAT[j,k], RLON[j,k] nearest, say, 48.137 N, 11.575 E,
use a KDTree; see kdtree.py and near_latlon.py.
"Nearest" means nearest in the Cosmo staggered grid, in the Euclidean metric (cos-shrink ?)
For example,
wgslat wgslon cosmoj cosmok cosmolat cosmolon
48 12 378 443 47.973 11.969
Then wind_speed[:,378,443]
in a (720, 824, 848) (time, lat, lon) array
is the timeseries at the Cosmo gridpoint nearest 48° N, 12° E.
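A sketch of that lookup with scipy's cKDTree (kdtree.py / near_latlon.py are the real code;
the toy grid below just stands in for the (824, 848) RLAT RLON):

    # flatten the 2-d staggered grid to (npoints, 2) [lat, lon], build a KD-tree,
    # then query the nearest gridpoint to a given WGS84 lat, lon
    import numpy as np
    from scipy.spatial import cKDTree

    def nearest_jk(tree, gridshape, lat, lon):
        """j, k of the staggered gridpoint nearest (lat, lon), Euclidean in degrees"""
        dist, flat = tree.query([lat, lon])
        return np.unravel_index(flat, gridshape)

    rlat, rlon = np.meshgrid(np.linspace(44, 56, 824), np.linspace(2, 18, 848),
                             indexing="ij")
    tree = cKDTree(np.column_stack([rlat.ravel(), rlon.ravel()]))
    j, k = nearest_jk(tree, rlat.shape, 48.137, 11.575)
    print(j, k, rlat[j, k], rlon[j, k])
    # timeseries = wind_speed[:, j, k]   # hourly windspeed there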
Power from windmills is highly sensitive to peak windspeeds.
Interpolating timeseries usually smooths them and lowers peaks, which would bias windpower estimates -- don't want to do that.
For example, consider two hourly timeseries at points 6 km apart, with 6 km/h wind:
6 0 6 0 ... m/s: some windpower
0 6 0 6 ...
3 3 3 3 ... interpolate, average: 0 windpower
(Wind turbines generate no electricity below about 3 m/s).
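The same toy example in numpy (the 3 m/s cut-in is a rough number, an assumption):

    import numpy as np

    a = np.array([6., 0, 6, 0])     # m/s at point A
    b = np.array([0., 6, 0, 6])     # m/s at point B, 6 km away
    avg = (a + b) / 2               # "interpolate, average": 3 3 3 3
    cutin = 3.0                     # ~no power below about 3 m/s
    print(np.count_nonzero(a > cutin), np.count_nonzero(avg > cutin))   # 2 0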
I regrid the Cosmo staggered grid to WGS84 with a KDTree, with each timeseries unchanged:
WGS84 gridpoint ⟶ the nearest staggered gridpoint ⟶ the windspeed timeseries there.
Yes, nearest-neighbor is blocky, and several points on the uniform WGS84 grid
will get the same Cosmo timeseries.
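A sketch of that mapping (regrid-cosmo is the real code; the grid bounds, step and names below are placeholders):

    # nearest-neighbor regrid: each point of a uniform WGS84 grid gets the
    # unchanged timeseries of its nearest Cosmo staggered gridpoint
    import numpy as np
    from scipy.spatial import cKDTree

    def regrid_nearest(wind, rlat, rlon, south, north, west, east, step=0.05):
        """wind: (time, nj, nk) on the staggered grid -> (time, nlat, nlon) uniform"""
        glat = np.arange(south, north + step / 2, step)
        glon = np.arange(west, east + step / 2, step)
        glon2, glat2 = np.meshgrid(glon, glat)            # (nlat, nlon)
        tree = cKDTree(np.column_stack([rlat.ravel(), rlon.ravel()]))
        _, flat = tree.query(np.column_stack([glat2.ravel(), glon2.ravel()]))
        j, k = np.unravel_index(flat, rlat.shape)
        return wind[:, j, k].reshape(-1, len(glat), len(glon)), glat, glon

    # toy check with random data
    rlat, rlon = np.meshgrid(np.linspace(47, 50, 50), np.linspace(10, 14, 60),
                             indexing="ij")
    wind = np.random.rand(24, 50, 60)                     # (time, nj, nk)
    out, glat, glon = regrid_nearest(wind, rlat, rlon, 48, 48.5, 11.25, 12)
    print(out.shape)                                      # (24, 11, 16) at 0.05°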
Comments welcome.
Netcdf4 files are afaik hdf5 files, which can be really messy.
h5stat .../cosmo/...nc4
(from the HDF5 command-line tools), or walking the file with h5py,
shows a multilevel tree with compressed blocks --
complicated and slow.
Xarray, dask, zarr etc. can't even see the complicated hdf5 layout way down below,
so they cannot possibly speed it up.
What I really want is a clean path to numpy -- solid,
with a well-defined algebra of primitives (slicing, views, memmap).
A file tree of .npz files < 1 GB is, for me,
simpler and faster than messing with xarray etc.; ymmv.
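For the record, the whole .npz path in a few lines (file and key names are just examples):

    import numpy as np

    wind = np.random.rand(720, 83).astype(np.float32)    # stand-in for one clipped month
    np.savez("WS_150m.2D.201401.npz", wind_speed=wind)
    npz = np.load("WS_150m.2D.201401.npz")
    print(npz.files)                  # ['wind_speed']
    w = npz["wind_speed"]             # plain numpy array, ready to slice
    print(w.shape, w.dtype)           # (720, 83) float32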
See
numpy indexes small amounts of data 1000x faster than xarray (2019)
Cyrille Rossant, Moving away from HDF5 (2016)
my little testbench to time various paths numpy <--> xarray <--> read/write
Choice of software depends on what you really want to do, what you know / who you know, and learning time.
Last but not least, it's a matter of taste:
Less is more
More is more
cheers
— denis-bz-py t-online.de updated 2024-04-23 April