Steve Farrell sparticlesteve

@sparticlesteve
sparticlesteve / job.sh
Created May 15, 2025 03:40
working example for Noel
#!/bin/bash
#SBATCH -C gpu
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-node=4
export IMAGE=nvcr.io/nvidia/pytorch:24.06-py3
export MASTER_ADDR=$(hostname)
export MASTER_PORT=29507
export OMP_NUM_THREADS=2
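The preview above stops before the launch command, so how the workers are actually started is not shown. As a rough sketch only, a Python process launched inside this job could pick up the exported rendezvous settings as follows. The helper name `rendezvous_from_env`, the fallback values, and the use of `SLURM_PROCID` / `SLURM_NTASKS` for rank and world size are assumptions for illustration, not part of the original script.

```python
import os


def rendezvous_from_env(env=None):
    """Collect rendezvous settings from environment variables.

    Hypothetical helper: MASTER_ADDR and MASTER_PORT are the variables
    exported by the job script; taking rank and world size from Slurm's
    per-task variables is an assumption about how the job is launched.
    """
    env = os.environ if env is None else env
    addr = env.get("MASTER_ADDR", "127.0.0.1")      # fallback is an assumption
    port = int(env.get("MASTER_PORT", "29500"))     # fallback is an assumption
    rank = int(env.get("SLURM_PROCID", "0"))        # set by srun for each task
    world_size = int(env.get("SLURM_NTASKS", "1"))  # total tasks in the step
    return {
        "init_method": f"tcp://{addr}:{port}",
        "rank": rank,
        "world_size": world_size,
    }


if __name__ == "__main__":
    print(rendezvous_from_env())
```

The returned values have the shape that `torch.distributed.init_process_group` accepts for `init_method`, `rank`, and `world_size`, but the call itself is left out so the sketch runs without PyTorch.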
@sparticlesteve
sparticlesteve / start_mlflow.sh
Created December 6, 2024 19:03
Helper script for starting and viewing MLflow server from NERSC JupyterHub
#!/bin/bash
# Usage:
# - Go to https://jupyter.nersc.gov and start a server (e.g. login node)
# - Start a terminal from the launcher
# - Load your environment with mlflow installed and run this script:
# conda activate <my environment>
# ./start_mlflow.sh
# - Wait a few seconds for the server to start, then open the URL that is printed by the script.
import tensorflow_datasets as tfds
import tensorflow as tf
import os
# Download the dataset
tfds.disable_progress_bar()
datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)
mnist_train, mnist_test = datasets['train'], datasets['test']
$ tree -s cosmoUniverse_2019_05_4parE_tf_small
cosmoUniverse_2019_05_4parE_tf_small
|-- [       4096]  train
|   |-- [   16777287]  univ_ics_2019-03_a10000668_000.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_001.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_002.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_003.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_004.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_005.tfrecord
|   |-- [   16777287]  univ_ics_2019-03_a10000668_006.tfrecord
@sparticlesteve
sparticlesteve / gclstm_example.py
Created September 3, 2020 18:37
Modified graph conv LSTM example showing graph sequence data
import torch
import random
import numpy as np
import networkx as nx
import torch.nn.functional as F
from torch_geometric_temporal.nn.recurrent import GConvLSTM
def create_mock_data(number_of_nodes, edge_per_node, in_channels):
"""
@sparticlesteve
sparticlesteve / split_climate_data.sh
Created May 4, 2020 18:39
Script for splitting the climate benchmark dataset (until we have a better procedure)
#!/bin/bash
# Config
inDir=/global/cscratch1/sd/amahesh/gb_data/All-Hist
nTrain=65536
nValid=8192
nTest=8192
nTotal=$((nTrain + nValid + nTest))
# Shuffle and select all files we need
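The rest of the shell script is cut off in this preview. The shuffle-and-select step it announces could look roughly like the Python sketch below, which reuses the counts from the config block. The function name `split_files`, the fixed `seed`, and the example file names are hypothetical; the original script may do this differently.

```python
import random


def split_files(files, n_train=65536, n_valid=8192, n_test=8192, seed=0):
    """Shuffle a file list and cut it into train/valid/test subsets.

    The default counts mirror nTrain, nValid and nTest from the config;
    the seed is an assumption added so the split is reproducible.
    """
    n_total = n_train + n_valid + n_test
    if len(files) < n_total:
        raise ValueError(f"need {n_total} files, found {len(files)}")
    shuffled = list(files)                 # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)
    selected = shuffled[:n_total]          # keep only the files we need
    train = selected[:n_train]
    valid = selected[n_train:n_train + n_valid]
    test = selected[n_train + n_valid:]
    return train, valid, test


if __name__ == "__main__":
    # Small made-up file list to show the shape of the result.
    names = [f"sample_{i:03d}.h5" for i in range(20)]
    train, valid, test = split_files(names, n_train=8, n_valid=4, n_test=4)
    print(len(train), len(valid), len(test))
```

Slicing one shuffled list keeps the three subsets disjoint by construction, which is the property that matters for a benchmark split.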
import torch
import torch.distributed as dist
# Configuration
ranks_per_node = 8
shape = 2**17
dtype = torch.float32
# Initialize
dist.init_process_group(backend='mpi')
import torch
import torch.distributed as dist
# Configuration
ranks_per_node = 8
shape = 2**16 # fails if 2**17
dtype = torch.float32
# Initialize
dist.init_process_group(backend='mpi')
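For context on the `2**16` versus `2**17` comment, the arithmetic below works out the buffer sizes involved. It assumes `shape` is the element count of a one-dimensional `float32` tensor, which the truncated preview does not confirm, and it says nothing about why the larger size fails.

```python
BYTES_PER_FLOAT32 = 4  # torch.float32 uses 4 bytes per element


def buffer_bytes(num_elements, bytes_per_element=BYTES_PER_FLOAT32):
    """Size in bytes of a dense buffer with the given element count."""
    return num_elements * bytes_per_element


if __name__ == "__main__":
    for exponent in (16, 17):
        size = buffer_bytes(2 ** exponent)
        print(f"2**{exponent} elements: {size} bytes ({size // 1024} KiB)")
```

Under that assumption the working case is a 256 KiB buffer per rank and the failing case is 512 KiB.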
$ srun -n 8 -c 10 -u -l python test_ddp.py --backend mpi
3: Initialized rank 3 local-rank 3 size 8
1: Initialized rank 1 local-rank 1 size 8
5: Initialized rank 5 local-rank 5 size 8
7: Initialized rank 7 local-rank 7 size 8
2: Initialized rank 2 local-rank 2 size 8
4: Initialized rank 4 local-rank 4 size 8
6: Initialized rank 6 local-rank 6 size 8
0: Initialized rank 0 local-rank 0 size 8
3: Generating a batch of data
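Because `srun -l` prefixes each line with the task number and the ranks report in arbitrary order, it can be handy to check mechanically that every rank initialized. The sketch below parses lines in the format shown above; the function names and the regular expression are illustrative and not part of the original test script.

```python
import re

# Matches lines such as "3: Initialized rank 3 local-rank 3 size 8".
INIT_LINE = re.compile(
    r"^(\d+): Initialized rank (\d+) local-rank (\d+) size (\d+)$"
)


def initialized_ranks(lines):
    """Return the set of ranks that printed an 'Initialized' line."""
    ranks = set()
    for line in lines:
        match = INIT_LINE.match(line.strip())
        if match:
            ranks.add(int(match.group(2)))
    return ranks


def all_ranks_initialized(lines, world_size):
    """True if ranks 0..world_size-1 all reported initialization."""
    return initialized_ranks(lines) == set(range(world_size))


if __name__ == "__main__":
    log = [
        "3: Initialized rank 3 local-rank 3 size 8",
        "1: Initialized rank 1 local-rank 1 size 8",
        "3: Generating a batch of data",
    ]
    print(sorted(initialized_ranks(log)))
```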