@bryan-lunt
Last active September 19, 2021 05:42
Example of submitting multiple job steps within a single SLURM batch file; the steps execute in parallel when possible.
#!/bin/bash
#SBATCH --time=04:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --exclusive
#SBATCH --job-name subjob-sched-test
#SBATCH -p secondary
module load openmpi/4.1.0-gcc-7.2.0-pmi2 gcc cmake
## If not started with SLURM, figure out where we are relative to the build directory
#####Snippet from: http://stackoverflow.com/questions/59895/
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
#####end snippet
#If SLURM_SUBMIT_DIR is not set, we are not running under SLURM; fall back to the directory containing this script.
SLURM_SUBMIT_DIR=${SLURM_SUBMIT_DIR:-${SCRIPT_DIR}}
#Move to the directory the user was in when they ran `sbatch`
cd "${SLURM_SUBMIT_DIR}" #assumed to be the source tree
#Set up the test script via a heredoc; this lets the whole example live in one file. The \$ escapes make the variables expand when each step runs, not when the heredoc is written.
cat << EOF > ./test.bash
#!/bin/bash
echo -n \${SLURM_STEP_ID} \${SLURM_CPUS_PER_TASK} \${SLURM_STEP_NODELIST} \$(date "+%s") " "
sleep 60
echo \$(date "+%s")
EOF
chmod 755 ./test.bash
#Submit 12 single-core, 6 dual-core, 3 quad-core, 2 six-core, and 1 twelve-core subjobs, all at once, then let SLURM schedule them within this allocation
for SIZE in 1 2 4 6 12
do
    for REP in $(seq $((12/${SIZE})))
    do
        srun --mpi=pmi2 --exclusive --ntasks-per-core 1 --mem=10M --ntasks 1 --cpus-per-task ${SIZE} ./test.bash > ./bar.s${SIZE}.r${REP} &
    done
    #Uncomment this wait to make all replicates of a given size finish before the next size is submitted; that may improve scheduling here.
    #For other workloads you might not want that, but here the execution times are known and each wave of subjobs packs the node exactly.
    #wait
done
wait
echo "Doneish?"
@bryan-lunt (Author)

Sadly, it seems that setting core affinity with --cpu-bind=cores breaks the parallel scheduling again. I guess that flag is only useful for MPI runs under srun, then.
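
For reference, a sketch of the variant that presumably breaks it; the exact flag placement is an assumption, since the gist does not show the failing command:

# Assumed failing variant: same step submission as above, plus core affinity
srun --mpi=pmi2 --exclusive --cpu-bind=cores --ntasks-per-core 1 --mem=10M --ntasks 1 --cpus-per-task ${SIZE} ./test.bash > ./bar.s${SIZE}.r${REP} &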

@bryan-lunt (Author)

Obviously, for a real workload you will need to request more memory than the 10M per step used here.
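
A sketch of how the memory request might be raised; the 1G figure is illustrative only and should be sized to the real workload:

# Per-step memory request (illustrative value):
srun --mpi=pmi2 --exclusive --ntasks-per-core 1 --mem=1G --ntasks 1 --cpus-per-task ${SIZE} ./test.bash > ./bar.s${SIZE}.r${REP} &
# Alternatively, in the batch header, SLURM treats --mem=0 as a request for all of the node's memory:
#SBATCH --mem=0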
