@ntraft
Last active June 24, 2025 15:37
An example Slurm sbatch script (which also includes GPU usage)
#!/bin/bash
# Specify a partition.
#SBATCH --partition=bdgpu
# Request physical nodes (usually 1).
#SBATCH --nodes=1
# Request tasks (usually 1).
#SBATCH --ntasks=1
# Request processor cores (only needed if your program uses multithreading/multiprocessing).
#SBATCH --cpus-per-task=3
# Request GPUs (delete if not needed).
#SBATCH --gpus-per-task=1
# Specify memory.
#SBATCH --mem=10G
# Maximum time limit of 10 minutes.
# Format: D-HH:MM:SS. Leading zeroes can be omitted, but included here for explanation.
#SBATCH --time=0-00:10:00
echo 'Running in:' $(pwd)
source ~/.bash_profile
# Exit immediately if any command returns a non-zero exit code.
set -e
# Treat a failure anywhere in a pipeline as a failure of the whole pipeline.
set -o pipefail
# Everything above this point is recommended to include in ALL sbatch scripts.
# Everything below this point is specific to this example script.
echo 'Value of CUDA_LAUNCH_BLOCKING:' $CUDA_LAUNCH_BLOCKING
# If NVIDIA tools are present, then print device driver info.
if command -v nvidia-smi 1>/dev/null 2>&1; then
    nvidia-smi
fi
echo 'Activating conda environment...'
#conda activate deep
conda activate deep-amd
python -c '
import os

NUM_CORES = os.cpu_count()
if hasattr(os, "sched_getaffinity"):
    # This function is only available on certain platforms. When running under Slurm,
    # it reports the true number of cores this job has been allocated.
    NUM_CORES = len(os.sched_getaffinity(0))
print(f"Cores available: {NUM_CORES}")

import torch
print(f"GPU available: {torch.cuda.is_available()}")'
echo 'End of test.'
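To run a script like this, save it to a file and hand it to `sbatch`. A typical workflow might look like the following (the filename `example.sbatch` and the job ID `12345678` are illustrative, not from the gist):

```shell
# Submit the batch script; sbatch prints the assigned job ID.
sbatch example.sbatch

# Watch the job while it is pending or running.
squeue -u $USER

# Follow the job's output file (slurm-<jobid>.out by default, in the submit directory).
tail -f slurm-12345678.out

# After the job finishes, compare elapsed time and peak memory against what was requested.
sacct -j 12345678 --format=JobID,State,Elapsed,MaxRSS
```

Checking `sacct` after a run is a good habit: it tells you whether the `--mem` and `--time` requests in the script's header were sized sensibly.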
ntraft commented Jan 31, 2024

Example output on VACC BlackDiamond:

Running in: /gpfs1/home/n/t/ntraft/scratch/Development
Value of CUDA_LAUNCH_BLOCKING:
Activating conda environment...
Cores available: 3
GPU available: True
End of test.

Running on DeepGreen would instead print out NVIDIA driver information (from `nvidia-smi`), since that cluster has NVIDIA GPUs rather than AMD ones.
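The `Cores available: 3` line comes from the affinity check inside the script. That logic can be pulled out into a standalone sketch that needs neither Slurm nor PyTorch:

```python
import os

def available_cores():
    """Return the number of CPU cores this process may actually use."""
    if hasattr(os, "sched_getaffinity"):
        # Linux-only: reflects the process affinity mask, so under Slurm it
        # matches the --cpus-per-task allocation rather than the whole node.
        return len(os.sched_getaffinity(0))
    # Fallback (e.g. macOS/Windows): total cores on the machine, which can
    # overcount what a Slurm job was actually given.
    return os.cpu_count()

print(f"Cores available: {available_cores()}")
```

This is why the example prints 3 rather than the node's full core count: the job requested `--cpus-per-task=3`, and Slurm restricts the process's affinity mask accordingly.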
