#!/bin/bash -l
#SBATCH -p cocosys  # Partition is fixed to cocosys
#SBATCH -A cocosys  # Account is also fixed to cocosys
#SBATCH --nodes=1  # Number of nodes
#SBATCH --time=12:00:00  # Maximum time limit
#SBATCH --job-name=job_name
#SBATCH --gpus-per-node=1  # Number of GPUs per node
#SBATCH --cpus-per-gpu=14  # Number of CPUs per GPU; a 14:1 CPU-to-GPU ratio is required
#SBATCH --output=job_output_gautschi.out  # Job output file
#!/bin/bash -l
#SBATCH --nodes=1  # Number of nodes
#SBATCH --ntasks-per-node=24  # Number of tasks per node (one CPU each by default); a node has 96 CPUs
#SBATCH --gres=gpu:1  # Number of GPUs; each node has 4 H100s
#SBATCH --partition=araghu  # Always set to araghu
#SBATCH --mem=240G  # Memory required; the maximum is 2 TB
#SBATCH --time=2-00:00:00  # Maximum time limit (2 days)
#SBATCH -A araghu  # Account (queue) name: araghu or araghu-scale, whichever you have access to
#SBATCH -J job_name
#SBATCH --output=job_output.out  # Job output file
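
Both job scripts above request a fixed number of CPUs, and a Python job usually wants its thread count to match that request. The helper below is a minimal sketch of one way to read the allocation from the environment. It assumes Slurm exports `SLURM_CPUS_PER_GPU`, `SLURM_CPUS_PER_TASK`, or `SLURM_CPUS_ON_NODE` for the job (which ones are set depends on the options used), and the function name `slurm_cpu_budget` is hypothetical, not part of these scripts.

```python
import os


def slurm_cpu_budget(env=None, default=1):
    """Return the CPU count found in Slurm's environment, or `default`.

    Hypothetical helper: the variables are checked in order and the first
    one that holds a positive integer wins.
    """
    env = os.environ if env is None else env
    for name in ("SLURM_CPUS_PER_GPU", "SLURM_CPUS_PER_TASK", "SLURM_CPUS_ON_NODE"):
        value = env.get(name, "")
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Not running under Slurm, or none of the variables is set.
    return default


if __name__ == "__main__":
    print(f"CPU budget for this job = {slurm_cpu_budget()}")
```

The result can then be passed to a call such as `torch.set_num_threads`, so the script does not oversubscribe the CPUs the job was granted.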
import time
import torch

print(f"Number of Threads = {torch.get_num_threads()}")

#######
## These parameters are adjustable.
## At their minimum values, however, there is no noticeable difference between different thread counts.
numChannels = 32  ## Can be reduced down to 1
inputSize = 14  ## Can be reduced down to 3
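
The rest of this script is not shown here. As a rough illustration of the comparison the comments describe, the sketch below times a single convolution under several thread counts. Everything in it is an assumption rather than the original code: the function name `time_conv`, the 3x3 kernel, the batch size of 1, and the repeat count.

```python
import time

import torch


def time_conv(num_threads, num_channels=32, input_size=14, repeats=50):
    """Average seconds per forward pass of one Conv2d layer (illustrative only)."""
    torch.set_num_threads(num_threads)
    # Assumed workload: a 3x3 convolution, so input_size must be at least 3.
    conv = torch.nn.Conv2d(num_channels, num_channels, kernel_size=3)
    x = torch.randn(1, num_channels, input_size, input_size)
    with torch.no_grad():
        conv(x)  # warm-up pass, excluded from the timing
        start = time.perf_counter()
        for _ in range(repeats):
            conv(x)
        elapsed = time.perf_counter() - start
    return elapsed / repeats


if __name__ == "__main__":
    for threads in (1, 2, 4):
        print(f"threads={threads}: {time_conv(threads) * 1e6:.1f} us per pass")
```

With small values for `num_channels` and `input_size`, the work per pass is tiny, so threading overhead can hide any speed-up; larger values give the extra threads something to do.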
# Try running a CPU-intensive test on two devices
import tensorflow as tf
import time

def matmul_op():
    """Multiply two matrices together"""
    n = 2000
    a = tf.ones((n, n), dtype=tf.float32)
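import tensorflow as tf

def device_mapped_call_factory(layer, mapping, model):
    # Build a replacement call that runs the layer's original call on its mapped device.
    # Note: this reads layer.mapping and layer.orig_call (presumably attached elsewhere),
    # not the `mapping` argument.
    def device_mapped_call(inp, *args, **kwargs):
        with tf.device(layer.mapping):
            ret = layer.orig_call(inp, *args, **kwargs)
        return ret
    return device_mapped_call

def device_map_model(model, mappings):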