@pelson
Created January 3, 2017 05:28
A demonstration of making a rich MARS client interface using Iris
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question**: Is it possible to define our own \"lazy\" Iris data source, similar to that provided by Iris' OPeNDAP interface?\n",
"\n",
"**Answer**: There are a few ways to solve this. Unless there is an underlying capability to fetch *metadata* without fetching the actual *data payload*, the easiest approach would be for a client to hard-code knowledge of what is in the data-store.\n",
"\n",
"In the following example, I show the creation of a \"mars_data\" function. Given a forecast reference time, it returns a list of cubes that may be queried from mars. Knowledge of *what* is available is hard-coded, but the retrieval of the data-payload is entirely dependent upon the data requested. In addition, I show that once cubes are constructed we see the benefit from Iris' lazy loading capabilities, allowing a client (such as Iris) to do the subsetting in a far richer way than can be offered through a single CLI."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's get hold of the packages we need..."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import datetime\n",
"import textwrap\n",
"\n",
"import numpy as np\n",
"\n",
"import iris\n",
"import biggus\n",
"import cf_units"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next define a \"Fetcher\" that can be used to lazily fetch data from our client (MARS). This is similar to the pattern presented in https://pelson.github.io/2013/massive_virtual_arrays_with_biggus/."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class MarsFetcher(object):\n",
" def __init__(self, grib_param, yx_resolution, forecast_rt, times, levels):\n",
" self._param = grib_param\n",
" self._times = np.asarray(times)\n",
" self._levels = np.asarray(levels)\n",
" self._forecast_rt = forecast_rt\n",
" \n",
" # The required attributes of a \"concrete\" array, according\n",
" # to biggus, are 'ndim', 'dtype', and 'shape'.\n",
" self.ndim = 4\n",
" self.dtype = np.float32\n",
" self.shape = (len(self._times), len(self._levels)) + tuple(yx_resolution)\n",
" \n",
" def __getitem__(self, keys):\n",
" # Implement the getitem/slicing interface which is the interface needed for biggus to create numpy arrays.\n",
" keys = biggus._init._full_keys(keys, self.ndim)\n",
"\n",
" # Just make the numbers up for now.\n",
" result = np.empty(self.shape)[keys]\n",
" \n",
" \n",
" # But we could do whatever we like... including calling out to the MARS client with this request...\n",
" \n",
" # Subset the times as requested.\n",
" steps = self._times[keys[0]]\n",
" if not isinstance(keys[0], slice):\n",
" steps = [steps]\n",
"\n",
" # Subset the levels as requested.\n",
" levels = self._levels[keys[1]]\n",
" if not isinstance(keys[1], slice):\n",
" levels = [levels]\n",
" \n",
" request = textwrap.dedent(\"\"\"\n",
" retrieve,\n",
" class=od,\n",
" date={forecast_rt:%Y-%m-%d},\n",
" expver=1,\n",
" levelist={levels},\n",
" levtype=pl,\n",
" param={param},\n",
" step={steps},\n",
" stream=oper,\n",
" time={forecast_rt:%H:%M:%S},\n",
" type=fc\n",
" \"\"\").strip().format(levels='/'.join(map(str, levels)),\n",
" param=self._param,\n",
" steps='/'.join(map(str, steps)),\n",
" forecast_rt=self._forecast_rt)\n",
"\n",
" # Print the request, so that we can see what would happen if we really implemented a client.\n",
" print(request)\n",
" \n",
" return result"
]
},
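{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an aside, the \"concrete array\" contract that biggus expects is tiny: any object exposing ``ndim``, ``dtype``, ``shape`` and ``__getitem__`` can be wrapped in a ``biggus.NumpyArrayAdapter``. A minimal sketch (the class name ``LoggingArray`` is purely illustrative), wrapping a real in-memory array so that we can watch which slices get requested:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class LoggingArray(object):\n",
"    # Minimal \"concrete\" array: defers to a real ndarray, but reports\n",
"    # every slice that biggus asks for.\n",
"    def __init__(self, data):\n",
"        self._data = np.asarray(data)\n",
"        self.ndim = self._data.ndim\n",
"        self.dtype = self._data.dtype\n",
"        self.shape = self._data.shape\n",
"\n",
"    def __getitem__(self, keys):\n",
"        print('Fetching {!r}'.format(keys))\n",
"        return self._data[keys]\n",
"\n",
"lazy = biggus.NumpyArrayAdapter(LoggingArray(np.arange(12).reshape(3, 4)))\n",
"# Only realising the result (ndarray) triggers the fetch.\n",
"print(lazy[1:, :2].ndarray())"
]
},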
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, produce cubes based on the expectation of what is available."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def mars_data(forecast_reference=None):\n",
" \"\"\"\n",
" Generates a list of cubes that can be accessed from MARS.\n",
" \n",
" Parameters\n",
" ----------\n",
" forecast_reference\n",
" A datetime\n",
" \n",
" \"\"\"\n",
" if forecast_reference is None:\n",
" now_m12 = datetime.datetime.now() - datetime.timedelta(hours=12)\n",
" forecast_reference = datetime.datetime(now_m12.year, now_m12.month, now_m12.day,\n",
" 0 if now_m12.hour <= 12 else 12)\n",
"\n",
" forecast_periods = [0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60,\n",
" 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93, 96, 99, 102, 105, 108, 111, 114, 117, 120]\n",
" levels = [1000, 950, 925, 900, 850, 800, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10, 7, 5, 3, 2, 1]\n",
"\n",
" t_unit = cf_units.Unit('hours since 2000-01-01')\n",
" times = [t_unit.date2num(forecast_reference) + hour\n",
" for hour in forecast_periods]\n",
" \n",
" cubes = iris.cube.CubeList()\n",
" for std_name, param in (['air_temperature', '130.128'],\n",
" ['geopotential_height', '129.128']):\n",
" template = iris.cube.Cube(standard_name=std_name, attributes={'Source': 'My MARS dataset'},\n",
" data=biggus.NumpyArrayAdapter(MarsFetcher(param, (769, 1024),\n",
" forecast_reference,\n",
" forecast_periods, levels)))\n",
" template.add_dim_coord(iris.coords.DimCoord(points=np.linspace(-180, 180, 1024),\n",
" standard_name='longitude'), 3)\n",
" template.add_dim_coord(iris.coords.DimCoord(points=np.linspace(-90, 90, 769),\n",
" standard_name='latitude'), 2)\n",
" template.add_dim_coord(iris.coords.DimCoord(points=levels,\n",
" long_name='pressure', units='hPa'), 1)\n",
" template.add_dim_coord(iris.coords.DimCoord(points=times,\n",
" standard_name='time',\n",
" units=t_unit),\n",
" 0)\n",
" template.add_aux_coord(iris.coords.DimCoord(points=forecast_periods,\n",
" standard_name='forecast_period'),\n",
" 0)\n",
" template.add_aux_coord(iris.coords.DimCoord(points=[t_unit.date2num(forecast_reference)],\n",
" standard_name='forecast_reference_time', units=t_unit)) \n",
" cubes.append(template)\n",
" return cubes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, calling this function produces a number of cubes: "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0: air_temperature / (unknown) (time: 41; pressure: 25; latitude: 769; longitude: 1024)\n",
"1: geopotential_height / (unknown) (time: 41; pressure: 25; latitude: 769; longitude: 1024)\n",
"----\n",
"air_temperature / (unknown) (time: 41; pressure: 25; latitude: 769; longitude: 1024)\n",
" Dimension coordinates:\n",
" time x - - -\n",
" pressure - x - -\n",
" latitude - - x -\n",
" longitude - - - x\n",
" Auxiliary coordinates:\n",
" forecast_period x - - -\n",
" Scalar coordinates:\n",
" forecast_reference_time: 2017-01-02 12:00:00\n",
" Attributes:\n",
" Source: My MARS dataset\n"
]
}
],
"source": [
"cubes = mars_data()\n",
"print(cubes)\n",
"[temp] = cubes.extract('air_temperature')\n",
"print('----')\n",
"print(temp)"
]
},
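{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that constructing the cubes has not issued a single MARS request yet - no \"retrieve\" block has been printed. The cube's data is still lazy at this point (assuming an Iris version that provides ``has_lazy_data``):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# True until something forces the data payload to be realised.\n",
"print(temp.has_lazy_data())"
]
},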
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Whilst the MarsFetcher that we have implemented currently only returns empty data,\n",
"we can see the MARS queries that it would take to subset the data as requested.\n",
"\n",
"First, the full cube..."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"retrieve,\n",
"class=od,\n",
"date=2017-01-02,\n",
"expver=1,\n",
"levelist=1000/950/925/900/850/800/700/600/500/400/300/250/200/150/100/70/50/30/20/10/7/5/3/2/1,\n",
"levtype=pl,\n",
"param=130.128,\n",
"step=0/3/6/9/12/15/18/21/24/27/30/33/36/39/42/45/48/51/54/57/60/63/66/69/72/75/78/81/84/87/90/93/96/99/102/105/108/111/114/117/120,\n",
"stream=oper,\n",
"time=12:00:00,\n",
"type=fc\n"
]
}
],
"source": [
"data = temp[...].data"
]
},
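{
"cell_type": "markdown",
"metadata": {},
"source": [
"Indexing on its own also stays lazy: slicing a cube merely builds a new lazy cube, and nothing is fetched until its ``.data`` is touched."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# No request is printed here - slicing alone does not fetch any data.\n",
"subset = temp[::2, :4]\n",
"print(subset.summary(shorten=True))"
]
},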
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now a subset of levels and timesteps..."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"retrieve,\n",
"class=od,\n",
"date=2017-01-02,\n",
"expver=1,\n",
"levelist=1000/950/925/900,\n",
"levtype=pl,\n",
"param=130.128,\n",
"step=0/6/12/18/24/30/36/42/48/54/60/66/72/78/84/90/96/102/108/114/120,\n",
"stream=oper,\n",
"time=12:00:00,\n",
"type=fc\n"
]
}
],
"source": [
"data = temp[::2, :4].data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And just to show it works as expected, standard Iris constraints also work..."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"retrieve,\n",
"class=od,\n",
"date=2017-01-02,\n",
"expver=1,\n",
"levelist=1000,\n",
"levtype=pl,\n",
"param=130.128,\n",
"step=6/9/12/15/18/21/24,\n",
"stream=oper,\n",
"time=12:00:00,\n",
"type=fc\n"
]
}
],
"source": [
"data = temp.extract(iris.Constraint(pressure=1000,\n",
" forecast_period=lambda c: 5 < c.point < 25)).data"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:pre-proc]",
"language": "python",
"name": "conda-env-pre-proc-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}