Jacob Tomlinson jacobtomlinson

jacobtomlinson / .gitignore
Created November 14, 2024 10:37
Global gitignore
### Apple Specific ###
# ignore OS X hidden meta files
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
jacobtomlinson / helloworld.py
Created October 31, 2024 11:01
Say hello to computers all around the world
import contextlib
import codecs
import subprocess
import pandas as pd
# Load list of global nameservers and country code information
print("Loading data sources...")
nameservers = pd.read_csv("https://public-dns.info/nameservers.csv")
countries = pd.read_csv("https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/raw/refs/heads/master/all/all.csv")
jacobtomlinson / README.md
Last active June 13, 2024 09:43
Run dask-cuda on a SLURM HPC

If you already use LocalCUDACluster on a single node, you can scale your work out across a SLURM-based HPC cluster with a few small tweaks.

First install the Dask Runners package. (Note: this is a prototype that will be merged into dask-jobqueue in the future.)

pip install git+https://github.com/jacobtomlinson/dask-hpc-runner.git

Then replace LocalCUDACluster with the SLURMRunner class.

| station | mean_temp |
| --- | --- |
| Abha | 18.0 |
| Abidjan | 26.0 |
| Abéché | 29.4 |
| Accra | 26.4 |
| Addis Ababa | 16.0 |
| Adelaide | 17.3 |
| Aden | 29.1 |
| Ahvaz | 25.4 |
| Albuquerque | 14.0 |
jacobtomlinson / run.py
Created November 2, 2023 17:21
Databricks run
import os
import subprocess
import time
import socket
DB_IS_DRIVER = os.getenv('DB_IS_DRIVER')
DB_DRIVER_IP = os.getenv('DB_DRIVER_IP')
if DB_IS_DRIVER == "TRUE":
    print("This node is the Dask scheduler.")
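The preview of run.py stops at the driver check. As a minimal sketch of how that check could be packaged for reuse, the helpers below decide a node's role from the same two environment variables. The function names `node_role` and `scheduler_address` are hypothetical, the "worker" fallback for non-driver nodes is an assumption about the rest of the script, and 8786 is Dask's default scheduler port rather than a value taken from the gist.

```python
import os


def node_role(env=None):
    """Return "scheduler" on the driver node and "worker" on any other node.

    Hypothetical helper: only the DB_IS_DRIVER == "TRUE" comparison comes
    from the original script; the "worker" fallback is an assumption.
    """
    env = os.environ if env is None else env
    return "scheduler" if env.get("DB_IS_DRIVER") == "TRUE" else "worker"


def scheduler_address(env=None, port=8786):
    """Build a scheduler address from DB_DRIVER_IP.

    Hypothetical helper: 8786 is Dask's default scheduler port, assumed here.
    Raises KeyError if DB_DRIVER_IP is not set.
    """
    env = os.environ if env is None else env
    return f"tcp://{env['DB_DRIVER_IP']}:{port}"
```

Passing a plain dict instead of reading `os.environ` directly keeps the logic easy to check off-cluster, for example `node_role({"DB_IS_DRIVER": "TRUE"})`.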
jacobtomlinson / beam_k8s.py
Last active May 12, 2023 13:42
Apache Beam Dask Limitation MRE
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.runners.dask.dask_runner import DaskRunner
from dask.distributed import Client, performance_report
class NoopDoFn(beam.DoFn):
    def process(self, item):
        import time
        time.sleep(0.1)
jacobtomlinson / beam_k8s.py
Created May 11, 2023 13:41
Apache Beam Dask Limitation MRE
import warnings
import time
from contextlib import contextmanager
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.runners.dask.dask_runner import DaskRunner
from dask.distributed import Client
from distributed.versions import VersionMismatchWarning
jacobtomlinson / notebook.yaml
Last active February 2, 2023 17:40
Kubernetes manifest to launch a Jupyter Notebook running RAPIDS ready for use with the Dask Operator
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jovyan
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jovyan
rules:
jacobtomlinson / example.yaml
Created November 7, 2022 14:34
Example DaskCluster resource
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: demo
spec:
  worker:
    replicas: 2
    spec:
      containers:
      - name: worker
jacobtomlinson / README.md
Last active October 28, 2022 11:41
Recording textual demos with vhs

Install vhs.

$ conda create -n textual-demo python=3.10 -y

$ conda activate textual-demo

$ pip install textual

$ vhs < demo.tape