surya00060 / gautschi.sub
Last active July 14, 2025 18:19
SLURM job submission script for Gautschi Cluster
#!/bin/bash -l
#SBATCH -p cocosys                    # Partition is fixed to cocosys
#SBATCH -A cocosys                    # Account is also fixed to cocosys
#SBATCH --nodes=1                     # Number of nodes
#SBATCH --time=12:00:00               # Maximum time limit
#SBATCH --job-name=job_name           # Job name
#SBATCH --gpus-per-node=1             # Number of GPUs per node
#SBATCH --cpus-per-gpu=14             # Number of CPUs per GPU; requires a 14:1 ratio of CPUs to GPUs
#SBATCH --output=job_output_gautschi.out   # Job output file
surya00060 / gilbreth.job
Last active August 25, 2025 20:59
SLURM job submission on Gilbreth Cluster
#!/bin/bash -l
#SBATCH --nodes=1                  # Number of nodes
#SBATCH --ntasks-per-node=24       # Number of CPUs per node; each node has 96 CPUs
#SBATCH --gres=gpu:1               # Number of GPUs; each node has 4 H100s
#SBATCH --partition=araghu         # Always set to araghu
#SBATCH --mem=240G                 # Amount of memory required; maximum is 2 TB
#SBATCH --time=2-00:00:00          # Maximum time limit
#SBATCH -A araghu                  # Account name: araghu or araghu-scale, whichever you have access to
#SBATCH -J job_name                # Job name
#SBATCH --output=job_output.out    # Job output file
surya00060 / demo.py
Last active February 23, 2022 17:28
import time
import torch

print(f"Number of Threads = {torch.get_num_threads()}")

#######
## The parameters below can be controlled.
## But when they are set to their minimum values, no difference is noticeable across different numbers of threads.
numChannels = 32  ## Can be reduced down to 1
inputSize = 14    ## Can be reduced down to 3
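
The preview cuts off before the benchmark itself. As a rough sketch of the experiment the comments describe (the layer shape, iteration count, and thread counts here are assumptions, not the gist's actual values), timing a convolution under different thread budgets could look like this:

import time
import torch

numChannels = 32   # hypothetical value, matching the preview above
inputSize = 14     # hypothetical value, matching the preview above

conv = torch.nn.Conv2d(numChannels, numChannels, kernel_size=3, padding=1)
x = torch.randn(1, numChannels, inputSize, inputSize)

for threads in (1, 2, 4):
    torch.set_num_threads(threads)
    with torch.no_grad():
        conv(x)  # warm-up so one-time initialization does not skew the timing
        start = time.time()
        for _ in range(100):
            conv(x)
    print(f"{threads} threads: {time.time() - start:.4f} s")

With the larger default sizes the per-iteration time should drop as the thread count grows; at the minimum sizes the work is too small for the extra threads to matter, which is the point the comments make.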
surya00060 / gist:422feb0acbcc54db697590cd08d00193
Created December 14, 2021 19:11 — forked from yaroslavvb/gist:b73ff35424dd7ab762234620cf583aac
Example of restricting part of a graph to run on a single core
# try running a CPU-intensive test on two devices
import tensorflow as tf
import time

def matmul_op():
    """Multiply two matrices together"""
    n = 2000
    a = tf.ones((n, n), dtype=tf.float32)
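
The forked preview ends before the device placement and timing. A minimal sketch of the same idea with the TF2 API, not the forked gist's original TF1 approach, is to pin TensorFlow's thread pools to a single thread before any op runs, so the matmul effectively executes on one core (the matrix size follows the preview; everything else is an assumption):

import time
import tensorflow as tf

# Must be called before the first op executes.
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)

n = 2000
a = tf.ones((n, n), dtype=tf.float32)

start = time.time()
b = tf.matmul(a, a)
_ = b.numpy()  # force eager execution to finish before stopping the clock
print(f"single-thread matmul: {time.time() - start:.3f} s")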
surya00060 / parallel.py
Last active July 14, 2021 21:24
A code snippet to demonstrate inconsistent operator scheduling. Some code written by Jake Stevens.
import tensorflow as tf

def device_mapped_call_factory(layer, mapping, model):
    def device_mapped_call(inp, *args, **kwargs):
        with tf.device(layer.mapping):
            ret = layer.orig_call(inp, *args, **kwargs)
        return ret
    return device_mapped_call

def device_map_model(model, mappings):
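
The preview stops at the signature of device_map_model. A plausible completion, consistent with the factory above but hypothetical rather than the snippet's actual body, would record each layer's original call and swap in the device-scoped wrapper:

def device_map_model(model, mappings):
    # Hypothetical sketch: assumes one mapping (a device string) per layer.
    for layer, mapping in zip(model.layers, mappings):
        layer.mapping = mapping          # device string read inside the wrapper
        layer.orig_call = layer.call     # keep the original call for the wrapper to invoke
        layer.call = device_mapped_call_factory(layer, mapping, model)
    return model

Because each layer's call is then wrapped in its own tf.device scope, ops from different layers can be pinned to different devices, which is presumably how the snippet surfaces the inconsistent operator scheduling mentioned in its description.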