@willirath
Last active December 11, 2023 14:15
Dask-Jobqueue SLURMCluster with Singularity
@vsoch

vsoch commented Jul 31, 2020

haha okay, I haven't gotten my allocation yet so that's why I haven't seen that yet :)

@willirath
Author

willirath commented Jul 31, 2020 via email

@thomasarsouze

Hi @willirath.
Thanks a lot for this example, that's really something I want to make work. I've been trying to reproduce it on a small local cluster that uses SLURM.

So:

  1. I've created an image of the container: singularity build --remote esm-vfc-stacks_latest.sif docker://esmvfc/esm-vfc-stacks:latest
  2. then run it: singularity run esm-vfc-stacks_latest.sif jupyter lab --no-browser --ip=login0.frioul
  3. Running your steps inside the notebook: pip install dask_jobqueue, overriding sbatch, squeue, scancel. The resulting job script looks correct:
#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -e /home/arsouze/pangeo/tests/logs/dask-worker-%J.err
#SBATCH -o /home/arsouze/pangeo/tests/logs/dask-worker-%J.out
#SBATCH -n 1
#SBATCH --cpus-per-task=40
#SBATCH --mem=159G
#SBATCH -t 00:20:00
#SBATCH -C quad,cache
export I_MPI_DOMAIN=auto
export I_MPI_PIN_RESPECT_CPUSET=0
module load intel intelmpi
singularity run /home/arsouze/pangeo/esm-vfc-stacks_latest.sif python -m distributed.cli.dask_worker tcp://195.83.183.81:36584 --nthreads 10 --nprocs 4 --memory-limit 42.50GB --name name --nanny --death-timeout 60 --local-directory $SCRATCHDIR --host ${SLURMD_NODENAME}.frioul

but when I try to scale I have the following error message:

Task exception was never retrieved
future: <Task finished coro=<_wrap_awaitable() done, defined at /srv/conda/envs/notebook/lib/python3.7/asyncio/tasks.py:623> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 255\nCommand:\nsbatch /tmp/tmplbeliirt.sh\nstdout:\n\nstderr:\n\n')>
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.7/asyncio/tasks.py", line 630, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/deploy/spec.py", line 50, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_jobqueue/core.py", line 310, in start
    out = await self._submit_job(fn)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_jobqueue/core.py", line 293, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_jobqueue/core.py", line 393, in _call
    "stderr:\n{}\n".format(proc.returncode, cmd_str, out, err)
RuntimeError: Command exited with non-zero exit code.
Exit code: 255
Command:
sbatch /tmp/tmplbeliirt.sh
stdout:

stderr:

and the SLURM job doesn't get submitted...
At this point, I'm not clear on what is failing in the workflow. Do you have any hints?
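Since the RuntimeError above reports an empty stdout and stderr, one way to narrow this down is to run the submit command by hand and surface whatever the scheduler actually prints. A minimal sketch (`try_submit` is a hypothetical helper, not part of dask_jobqueue; it mirrors what dask_jobqueue's `_call` does with `shlex.split` plus the script filename):

```python
import shlex
import subprocess

def try_submit(script_path, submit_command="sbatch"):
    """Run the scheduler's submit command directly and show its output."""
    proc = subprocess.run(
        shlex.split(submit_command) + [script_path],
        capture_output=True,
        text=True,
    )
    print("exit code:", proc.returncode)
    print("stdout:", proc.stdout)
    print("stderr:", proc.stderr)
    return proc.returncode
```

Calling this on the temp file from the traceback, from the same environment the cluster object runs in, should reveal the error message that the empty stdout/stderr in the RuntimeError is hiding.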

@willirath
Author

willirath commented Dec 14, 2020

I'd debug this with manual SLURM commands:

  1. Can you use the sbatch-over-SSH workaround? I'd check by trying whether

ssh $(hostname) -q -t ". /etc/profile && squeue [maybe more args]"

gives the same output as the equivalent squeue command run directly on the host machine.

  2. Then, the question is whether the job script that the SLURMCluster running within the container writes to some tmp location is readable on the host machine.

  3. Finally, can you submit the job script from the tmp file with sbatch via SSH?

(edited: Add third point)
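For reference, the sbatch-over-SSH workaround boils down to putting wrapper scripts for sbatch, squeue, and scancel on the PATH inside the container, each forwarding its call to the host via SSH. A sketch of generating such wrappers (the ~/bin location and the exact profile sourcing are assumptions; adjust for your system):

```python
import stat
from pathlib import Path

# Template for a wrapper that forwards a scheduler command to the host via SSH.
SSH_WRAPPER = """#!/usr/bin/env bash
ssh $(hostname) -q -t ". /etc/profile && {command} $@"
"""

def write_ssh_wrapper(command, bindir="~/bin"):
    """Create an executable wrapper that runs `command` on the host machine."""
    bindir = Path(bindir).expanduser()
    bindir.mkdir(parents=True, exist_ok=True)
    wrapper = bindir / command
    wrapper.write_text(SSH_WRAPPER.format(command=command))
    # Make the wrapper executable for the owner.
    wrapper.chmod(wrapper.stat().st_mode | stat.S_IXUSR)
    return wrapper
```

Running `write_ssh_wrapper(cmd)` for each of "sbatch", "squeue", and "scancel", and then prepending the wrapper directory to `$PATH` inside the container, makes dask_jobqueue's scheduler calls transparently run on the host.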

@willirath
Author

  4. Can you SSH back into the host machine? The error could stem from ssh $(hostname) not being possible (at least not without a password). You'd need to set up SSH keys for this.

@thomasarsouze

Thanks for your quick answer. Indeed, due to a local security feature set up by our admins, ssh $(hostname) is refused! Will resume once they remove that.

@kathoef

kathoef commented Mar 9, 2021

We have come up with a little convenience tool that provides a structured way of bind-mounting the host system's SLURM libraries into a Singularity container session, which enables the batch scheduler commands inside the container. This approach sidesteps the SSH restrictions that system administrators might have set up (we also use such an HPC system, which motivated this development).

All you need is a system-specific "configuration file" (which basically takes a one-time exploratory session with a few strace commands to isolate the necessary batch scheduler shared libraries and configuration files). Make sure you have read the compatibility section, though, as there are a few limitations: https://github.com/ExaESM-WP4/Batch-scheduler-Singularity-bindings
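Conceptually, the approach amounts to assembling a Singularity --bind argument from a per-system list of scheduler binaries, configuration files, and shared libraries, and passing it to the container runtime. A sketch with purely illustrative paths (the real list is whatever the strace session reveals; the Munge socket entry is an assumption about typical SLURM setups):

```python
# Illustrative host paths only; the actual set is system specific and is
# discovered with strace, as described above.
slurm_binds = [
    "/usr/bin/sbatch",
    "/usr/bin/squeue",
    "/usr/bin/scancel",
    "/etc/slurm",        # scheduler configuration
    "/usr/lib64/slurm",  # scheduler shared libraries
    "/var/run/munge",    # authentication socket (assumed typical location)
]

# Build the singularity invocation with all host paths bind-mounted.
command = [
    "singularity", "exec",
    "--bind", ",".join(slurm_binds),
    "esm-vfc-stacks_latest.sif",
    "sbatch", "--version",
]
print(" ".join(command))
```

If the bind list is complete, sbatch run this way talks to the host's scheduler exactly as it would outside the container.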

/cc @vsoch and @thomasarsouze

@poplarShift

This is great, thank you so much for this! I only needed minor tweaks to account for differences in my cluster and the Singularity version.
