surya00060 / gautschi.sub
Last active July 14, 2025 18:19
SLURM job submission script for Gautschi Cluster
#!/bin/bash -l
#SBATCH -p cocosys                    # Partition is fixed to cocosys
#SBATCH -A cocosys                    # Account is also fixed to cocosys
#SBATCH --nodes=1                     # Number of nodes
#SBATCH --time=12:00:00               # Maximum time limit
#SBATCH --job-name=job_name           # Job name
#SBATCH --gpus-per-node=1             # Number of GPUs per node
#SBATCH --cpus-per-gpu=14             # Number of CPUs per GPU; requires a 14:1 ratio of CPUs to GPUs
#SBATCH --output=job_output_gautschi.out   # Job output file
surya00060 / gilbreth.job
Last active August 25, 2025 20:59
SLURM job submission on Gilbreth Cluster
#!/bin/bash -l
#SBATCH --nodes=1                  # Number of nodes
#SBATCH --ntasks-per-node=24       # Number of CPUs per node; each node has 96 CPUs
#SBATCH --gres=gpu:1               # Number of GPUs; each node has 4 H100s
#SBATCH --partition=araghu         # Always set to araghu
#SBATCH --mem=240G                 # Amount of memory required; maximum is 2 TB
#SBATCH --time=2-00:00:00          # Maximum time limit
#SBATCH -A araghu                  # Account name: araghu or araghu-scale, whichever you have access to
#SBATCH -J job_name                # Job name
#SBATCH --output=job_output.out    # Job output file
surya00060 / demo.py
Last active February 23, 2022 17:28
import time
import torch

print(f"Number of Threads = {torch.get_num_threads()}")

#######
## The parameters below can be controlled.
## But when they are set to their minimum values, no difference is noticeable across different numbers of threads.
numChannels = 32  ## Can be reduced down to 1
inputSize = 14    ## Can be reduced down to 3
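
The preview cuts off before the benchmark itself. As a rough sketch of the experiment the comments describe (the layer shape, iteration count, and thread counts here are assumptions, not the gist's actual values), timing a convolution under different thread budgets could look like this:

import time
import torch

numChannels = 32   # hypothetical value, matching the preview above
inputSize = 14     # hypothetical value, matching the preview above

conv = torch.nn.Conv2d(numChannels, numChannels, kernel_size=3, padding=1)
x = torch.randn(1, numChannels, inputSize, inputSize)

for threads in (1, 2, 4):
    torch.set_num_threads(threads)
    with torch.no_grad():
        conv(x)  # warm-up so one-time initialization does not skew the timing
        start = time.time()
        for _ in range(100):
            conv(x)
    print(f"{threads} threads: {time.time() - start:.4f} s")

With the larger default sizes the per-iteration time should drop as the thread count grows; at the minimum sizes the work is too small for the extra threads to matter, which is the point the comments make.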
surya00060 / gist:422feb0acbcc54db697590cd08d00193
Created December 14, 2021 19:11 — forked from yaroslavvb/gist:b73ff35424dd7ab762234620cf583aac
Example of restricting part of a graph to run on a single core
# try running a CPU-intensive test on two devices
import tensorflow as tf
import time

def matmul_op():
    """Multiply two matrices together"""
    n = 2000
    a = tf.ones((n, n), dtype=tf.float32)
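
The forked preview ends before the device placement and timing. A minimal sketch of the same idea with the TF2 API, not the forked gist's original TF1 approach, is to pin TensorFlow's thread pools to a single thread before any op runs, so the matmul effectively executes on one core (the matrix size follows the preview; everything else is an assumption):

import time
import tensorflow as tf

# Must be called before the first op executes.
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)

n = 2000
a = tf.ones((n, n), dtype=tf.float32)

start = time.time()
b = tf.matmul(a, a)
_ = b.numpy()  # force eager execution to finish before stopping the clock
print(f"single-thread matmul: {time.time() - start:.3f} s")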
surya00060 / parallel.py
Last active July 14, 2021 21:24
A code snippet to demonstrate inconsistent operator scheduling. Some code written by Jake Stevens.
import tensorflow as tf

def device_mapped_call_factory(layer, mapping, model):
    def device_mapped_call(inp, *args, **kwargs):
        with tf.device(layer.mapping):
            ret = layer.orig_call(inp, *args, **kwargs)
        return ret
    return device_mapped_call

def device_map_model(model, mappings):
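
The preview stops at the signature of device_map_model. A plausible completion, consistent with the factory above but hypothetical rather than the snippet's actual body, would record each layer's original call and swap in the device-scoped wrapper:

def device_map_model(model, mappings):
    # Hypothetical sketch: assumes one mapping (a device string) per layer.
    for layer, mapping in zip(model.layers, mappings):
        layer.mapping = mapping          # device string read inside the wrapper
        layer.orig_call = layer.call     # keep the original call for the wrapper to invoke
        layer.call = device_mapped_call_factory(layer, mapping, model)
    return model

Because each layer's call is then wrapped in its own tf.device scope, ops from different layers can be pinned to different devices, which is presumably how the snippet surfaces the inconsistent operator scheduling mentioned in its description.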