These are my settings for Visual Studio Code, synced via the Settings Sync extension.
"""Collection of utilities for downloading and processing MOS output. | |
""" | |
from collections import OrderedDict | |
import datetime | |
import itertools | |
import os | |
import re | |
import subprocess |
from pangeo_forge_recipes.patterns import pattern_from_file_sequence, FilePattern, ConcatDim | |
from pangeo_forge_recipes.recipes import HDFReferenceRecipe | |
from pathlib import Path | |
from dask.diagnostics import ProgressBar | |
# NOTE(darothen): use PR fsspec/kerchunk#204 | |
# from kerchunk.grib2 import scan_grib | |
import intake | |
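# --- Hedged sketch (not part of the original module): one plausible way the
# imports above fit together, assuming pangeo-forge-recipes 0.x. The URL
# template and the "time" concat dimension are hypothetical stand-ins.
def _example_reference_recipe():
    # Hypothetical list of monthly MOS files to concatenate along "time".
    urls = [f"https://example.com/mos/mos_2020{m:02d}.nc" for m in range(1, 13)]
    pattern = pattern_from_file_sequence(urls, concat_dim="time")
    # HDFReferenceRecipe scans each file (via kerchunk) and builds a reference
    # filesystem that fsspec/intake can open as a single virtual dataset.
    recipe = HDFReferenceRecipe(pattern)
    with ProgressBar():
        recipe.to_dask().compute()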
## Local
from itertools import product

import numpy as np

INV_SQRT_3 = 1.0 / np.sqrt(3.0)
ASIN_INV_SQRT_3 = np.arcsin(INV_SQRT_3)

def gaussian_bell(xs, ys, xc=0., yc=0., xsigma=1., ysigma=1.):
    """Compute a 2D Gaussian with asymmetric standard deviations and
    an arbitrary center (xc, yc)."""
    # NOTE: the body was missing here; an unnormalized Gaussian (peak = 1
    # at the center) is an assumed reconstruction.
    return np.exp(-((xs - xc)**2 / (2 * xsigma**2) + (ys - yc)**2 / (2 * ysigma**2)))
sample_data.nc filter=lfs diff=lfs merge=lfs -text
This is a simple example of using Snakemake to automate a basic data-processing pipeline.
Makefiles and GNU Make are awesome for many reasons, and it's forgivable for any scientist working with data processing pipelines to use them throughout their projects. But Makefiles, while feature-rich, are not really an ideal tool for automating complex data processing pipelines. If, by some chance, your analyses simply require you to collect different data, process them with identical procedures, collate the results, and produce a plot, then sure - Makefiles will do. But in analyzing climate model output, I've found that I have to resort to a lot of quirky hacks to fit this sort of workflow model.
A perfect example is the analysis of hierarchical climate model output. It's quite common to run a climate model multiple times in a factorial design, changing 2-3 parameters (say, an emissions dataset and a parameterization in the model). While you can pigeon-hole linear da
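To make that factorial layout concrete, a minimal Snakefile can enumerate every parameter combination with expand(); the emissions datasets, scheme names, file paths, and the analyze.py script below are all hypothetical stand-ins, not the actual pipeline:

# Snakefile -- hedged sketch of a factorial experiment layout.
EMISSIONS = ["low", "high"]
SCHEMES = ["schemeA", "schemeB"]

rule all:
    input:
        # One plot per (emissions, scheme) combination.
        expand("plots/{emis}_{scheme}.png", emis=EMISSIONS, scheme=SCHEMES)

rule plot:
    input:
        "output/{emis}_{scheme}.nc"
    output:
        "plots/{emis}_{scheme}.png"
    shell:
        "python analyze.py {input} {output}"

Because the wildcards encode the parameter values, adding a new emissions dataset or scheme is a one-line change to the lists at the top, and Snakemake rebuilds only the missing combinations.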