@ntabris
Created January 31, 2025 23:07
dask_from_batch_example.py
### GET DASK CLIENT ###
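import os
import coiled
# Look up the cluster by the name given in the COILED_CLUSTER_NAME
# environment variable (os.environ.get returns None if it is unset).
cluster = coiled.Cluster(os.environ.get("COILED_CLUSTER_NAME"))
# Enable adaptive scaling so the number of workers follows the workload.
cluster.adapt()
# Get a Dask client connected to this cluster.
client = cluster.get_client()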
print("dask client:", client)
### USER CODE FOLLOWS ###
import dask
# generate random timeseries of data
df = dask.datasets.timeseries(
    "1980", "2005", partition_freq="2w"
).persist()
# perform a groupby with an aggregation
result = df.groupby("name").aggregate({"x": "sum", "y": "max"}).compute()
print(result)