@brews
brews / README.md
Last active July 9, 2025 23:10
stcaf prototype run with multiple containers. Jobs orchestrated with docker compose.

Example

Warning

The data setup downloads roughly 25 GB from the internet. This can take an hour or longer, depending on your internet connection. Be sure you have adequate storage and bandwidth.

Download the scripts into an empty directory and run:

# Sets up a ./data directory with input and output subdirectories.
sh ./setup_data.sh
@brews
brews / isabelle_backfill.py
Last active December 12, 2023 18:48
Fills in implied directory blobs for a GCS bucket mounted with GCSfuse without using the --implicit-dirs option.
"""
Fills in implied directory blobs for a GCS bucket mounted with GCSfuse without using the --implicit-dirs option.
If you tell it to look in `mygcsbucket` for prefix `path/to/files/to/read/in/gcsfuse/`, it will
ensure implied dirs nested in gs://mygcsbucket/path/to/files/to/read/in/gcsfuse/ get covered.
It can handle 10 - 100k directories in under 30 minutes if you run it from Cloud Shell.
"""
import logging
from pathlib import Path
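
For context on what "implied directories" means: a GCS object named `a/b/c.txt` can exist without any `a/` or `a/b/` placeholder objects, and GCSfuse only lists those directories when placeholders exist or implicit directories are enabled. The following is a minimal sketch of that idea, not the gist's code. It works on a plain list of object names and makes no GCS calls; the function name `implied_dirs` and the sample names are hypothetical, and excluding the prefix itself from the result is an assumption.

```python
from pathlib import PurePosixPath


def implied_dirs(blob_names, prefix=""):
    """Return sorted directory placeholder names (ending in "/") implied by blob_names.

    Only directories nested under `prefix` are reported; the prefix itself is skipped.
    """
    dirs = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        # Every parent of an object path is an implied directory.
        for parent in PurePosixPath(name).parents:
            if str(parent) == ".":
                continue
            candidate = str(parent) + "/"
            if candidate.startswith(prefix) and candidate != prefix:
                dirs.add(candidate)
    return sorted(dirs)


names = ["path/to/a/b/file.txt", "path/to/c/file.txt", "other/x.txt"]
print(implied_dirs(names, prefix="path/to/"))
```

Subtracting the names that already exist in the bucket from this result would give the placeholders still missing.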
@brews
brews / used_containers.sh
Last active December 1, 2023 21:07
Parse GCP logging to get a JSON list of container images used in the past 60 days in a GKE JupyterHub deployment.
#!/usr/bin/env bash
# Parse GCP logging to print sample JSON list of container images used in the
# past 60 days in a GKE singleuser-server JupyterHub deployment.
# Be sure to set PROJECT_ID and have jq installed.
set -e
PROJECT_ID="gcpprojectid"
@brews
brews / trigger_zenodo_upload.py
Last active August 16, 2023 20:24
Trigger a Zenodo DOI for past GitHub Releases on a public repo.
"""
Trigger a Zenodo DOI for past GitHub Releases on a public repo.
Modified from @medley56 https://github.com/zenodo/zenodo/issues/1463#issuecomment-1468932469 on 2023-08-16.
"""
import requests
# Fill these in...
@brews
brews / lon360to180.py
Last active May 3, 2023 23:30
Convert 0 - 360 longitude to -180 - 180 longitude or -180 - 180 to 0 - 360 longitude in Python
def lon360to180(lon):
    return (lon + 180.0) % 360.0 - 180.0
# The inverse going from -180 - 180 to 0 - 360 is
def lon180to360(lon):
    return lon % 360.0
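
A quick check of the two conversions, with the definitions repeated so the snippet runs on its own. The sample longitudes are arbitrary. Note that 180 maps to -180, so the output range of `lon360to180` is the half-open interval [-180, 180).

```python
def lon360to180(lon):
    return (lon + 180.0) % 360.0 - 180.0


def lon180to360(lon):
    return lon % 360.0


# Convert a few sample longitudes and confirm the round trip.
for lon in (0.0, 90.0, 180.0, 270.0, 359.0):
    converted = lon360to180(lon)
    print(lon, "->", converted, "->", lon180to360(converted))
```

Because both functions use only arithmetic operators, they should also accept NumPy arrays unchanged.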
@brews
brews / cf_convention_example.txt
Created August 25, 2022 23:29
Example metadata in a random CMIP6 simulation data file following CF-conventions.
# Example CF-convention attrs/metadata
{'Conventions': 'CF-1.7 CMIP-6.2',
'activity_id': 'CMIP',
'branch_method': 'standard',
'branch_time_in_child': 0.0,
'branch_time_in_parent': 0.0,
'cmor_version': '3.4.0',
'creation_date': '2019-11-09T02:07:38Z',
'data_specs_version': '01.00.30',
'experiment': 'all-forcing simulation of the recent past',
@brews
brews / conda_pkg_size.sh
Last active April 7, 2022 21:43
Get the size of packages from within active conda/miniconda/anaconda/mamba environment.
grep '"size":' ${CONDA_PREFIX}/conda-meta/*.json | sort -k3rn | sed 's/.*conda-meta\///g' | column -t
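
The same idea can be sketched in Python for cases where the output needs further processing. This is an illustration, not part of the gist: the function name `conda_pkg_sizes` is hypothetical, and it assumes each `conda-meta/*.json` record stores its package size under a top-level `"size"` key, which is what the grep above matches.

```python
import json
import os
from pathlib import Path


def conda_pkg_sizes(prefix):
    """Return (size, metadata_file_stem) pairs for an environment, largest first."""
    sizes = []
    for meta in Path(prefix, "conda-meta").glob("*.json"):
        record = json.loads(meta.read_text())
        # Skip records that carry no top-level "size" field.
        if "size" in record:
            sizes.append((record["size"], meta.stem))
    return sorted(sizes, reverse=True)


# CONDA_PREFIX is set by conda when an environment is active.
prefix = os.environ.get("CONDA_PREFIX")
if prefix:
    for size, name in conda_pkg_sizes(prefix):
        print(f"{size:>12}  {name}")
```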
@brews
brews / papermill-test-workflow.yaml
Last active February 12, 2022 00:38
Simple Jupyter notebook with a "parameters"-tagged cell for running with papermill. Includes an Argo Workflow that runs the notebook on a JupyterHub container image.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: papermill-test-
spec:
  entrypoint: main
  templates:
  - name: main
    inputs:
@brews
brews / postprocess_combine_example.py
Created February 8, 2022 21:25
Rough Argo Workflow to load daily PRISM variables, clip to CA bounding box, and output to single Zarr store.
import xarray as xr
tmin_url = "~/Downloads/tmin-20220208/annual.zarr"
tmax_url = "~/Downloads/tmax-20220208/annual.zarr"
tmean_url = "~/Downloads/tmean-20220208/annual.zarr"
out_url = "gs://fakebucketname/prism-ca-20220208.zarr"
tmin = xr.open_zarr(tmin_url).drop_vars("crs")
tmax = xr.open_zarr(tmax_url).drop_vars("crs")
@brews
brews / dask-internal-process-demo.yaml
Last active January 20, 2022 01:58
Example Argo Workflow creating a process-based dask.distributed job with a LocalCluster, internal to the pod's node.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dask-internal-process-demo-
spec:
  entrypoint: dask
  activeDeadlineSeconds: 1800  # Safety first, kids!
  templates:
  - name: dask
    script: