If you are already using LocalCUDACluster on a single node, you can scale your work out to a SLURM-based HPC system with a few small tweaks.
First install the Dask Runners package. (Note: this is a prototype and will be merged into dask-jobqueue in the future.)
pip install git+https://github.com/jacobtomlinson/dask-hpc-runner.git
Then replace LocalCUDACluster with the SlurmRunner class.
from dask_hpc_runner import SlurmRunner
# Tell the SLURM Runner to use the Dask CUDA worker class
cluster = SlurmRunner(worker_class="dask_cuda.CUDAWorker")
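Once the runner is up you can connect a Dask Client to it as usual. Below is a minimal sketch, assuming the runner object can be passed straight to Client like other Dask cluster objects; the array computation is just a placeholder to check the workers respond.
from dask.distributed import Client
import dask.array as da
from dask_hpc_runner import SlurmRunner

# Start the runner with Dask CUDA workers (as above)
cluster = SlurmRunner(worker_class="dask_cuda.CUDAWorker")

# Connect a client (assumes the runner behaves like a cluster object here)
client = Client(cluster)

# Placeholder computation to confirm the workers are reachable
x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print(x.sum().compute())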
Then run your code on a SLURM system, for example with four tasks:
srun -n4 python code.py
That's it!
If you run this script outside of SLURM it will raise a RuntimeError, so if you want to make your code more flexible you could catch this exception and fall back to creating a LocalCUDACluster.
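Here is a rough sketch of that fallback pattern, assuming the RuntimeError is raised when the runner is constructed outside of a SLURM job:
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask_hpc_runner import SlurmRunner

try:
    # Inside a SLURM job this starts a scheduler and CUDA workers
    cluster = SlurmRunner(worker_class="dask_cuda.CUDAWorker")
except RuntimeError:
    # Outside SLURM fall back to a single-node GPU cluster
    cluster = LocalCUDACluster()

client = Client(cluster)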
See example.py for a complete example.