Use apt to install the necessary packages:
sudo apt install -y slurm-wlm slurm-wlm-doc
Load file:///usr/share/doc/slurm-wlm/html/configurator.html in a browser (or file://wsl%24/Ubuntu/usr/share/doc/slurm-wlm/html/configurator.html on WSL2), and:
- Set your machine's hostname in SlurmctldHost and NodeName.
- Set CPUs as appropriate, and optionally Sockets, CoresPerSocket, and ThreadsPerCore. Use the lscpu command to find what you have.
- Set RealMemory to the number of megabytes you want to allocate to Slurm jobs.
- Set StateSaveLocation to /var/spool/slurm-llnl.
- Set ProctrackType to linuxproc because processes are less likely to escape Slurm control on a single machine config.
- Make sure SelectType is set to Cons_res, and set SelectTypeParameters to CR_Core_Memory.
- Set JobAcctGatherType to Linux to gather resource use per job, and set AccountingStorageType to FileTxt.
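To make the CPU fields above less error-prone, the lscpu output can be turned into the values the configurator asks for. The following is only a sketch: lscpu_to_slurm is a hypothetical helper name, and it assumes lscpu prints the English labels "Socket(s)", "Core(s) per socket" and "Thread(s) per core" (run it as LC_ALL=C lscpu if your locale translates them).

```shell
# Hypothetical helper (not part of Slurm): read `lscpu` output on stdin and
# print the topology values in the form the configurator fields expect.
lscpu_to_slurm() {
  awk -F': *' '
    /^ *Socket\(s\):/          { sockets = $2 }
    /^ *Core\(s\) per socket:/ { cores   = $2 }
    /^ *Thread\(s\) per core:/ { threads = $2 }
    END {
      # CPUs is the product of the three topology values.
      printf "Sockets=%d CoresPerSocket=%d ThreadsPerCore=%d CPUs=%d\n",
             sockets, cores, threads, sockets * cores * threads
    }'
}

# Usage:
#   LC_ALL=C lscpu | lscpu_to_slurm
```

Once the packages are installed, running slurmd -C should also print the hardware configuration as Slurm itself detects it, which is a useful cross-check against these numbers.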
Hit Submit, and save the resulting text into /etc/slurm-llnl/slurm.conf, i.e. the configuration file referred to in /lib/systemd/system/slurmctld.service and /lib/systemd/system/slurmd.service.
Load /etc/slurm-llnl/slurm.conf in a text editor, uncomment DefMemPerCPU, and set it to 8192 or whatever number of megabytes you want each job to request if not explicitly requested using --mem during job submission. Read the docs and edit other defaults as you see fit.
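If you would rather script the DefMemPerCPU edit than do it by hand, something like the following may work. It is a sketch under assumptions: set_def_mem_per_cpu is a hypothetical helper name, the generated file is assumed to contain a line of the form "#DefMemPerCPU=..." (or an already active "DefMemPerCPU=..."), and sed -i -E is assumed to be GNU sed as shipped with Ubuntu.

```shell
# Hypothetical helper: uncomment DefMemPerCPU in a slurm.conf and set its value.
#   $1 = path to the slurm.conf to edit
#   $2 = megabytes each job gets per CPU when --mem is not given
set_def_mem_per_cpu() {
  conf=$1
  mb=$2
  # Matches both a commented "#DefMemPerCPU=0" and an active "DefMemPerCPU=..." line.
  sed -i -E "s/^#?[[:space:]]*DefMemPerCPU=.*/DefMemPerCPU=${mb}/" "$conf"
  # Show the resulting line so the change can be checked by eye.
  grep '^DefMemPerCPU=' "$conf"
}

# Usage (on a working copy, then install the copy with sudo cp):
#   set_def_mem_per_cpu ./slurm.conf 8192
```

Editing a copy first keeps a broken substitution from landing directly in the file the services read.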
Create /var/spool/slurm-llnl and /var/log/slurm_jobacct.log, then set ownership appropriately:
sudo mkdir -p /var/spool/slurm-llnl
sudo touch /var/log/slurm_jobacct.log
sudo chown slurm:slurm /var/spool/slurm-llnl /var/log/slurm_jobacct.log
Install mailutils so that Slurm won't complain about /bin/mail missing:
sudo apt install -y mailutils
Make sure munge is installed and running, and a munge.key was created with user-only read-only permissions, owned by munge:munge:
sudo service munge start
sudo ls -l /etc/munge/munge.key
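The ls -l output has to be read by eye; a small check can compare mode and ownership against what you expect instead. This is a sketch only: expect_perms is a hypothetical helper name, it relies on GNU stat (-c), and the expected value "400 munge:munge" is my reading of "user-only read-only, owned by munge:munge" above.

```shell
# Hypothetical helper: compare a path's octal mode and owner with an expected value.
#   $1 = path to check
#   $2 = expected "<octal mode> <user>:<group>", e.g. "400 munge:munge"
expect_perms() {
  actual=$(stat -c '%a %U:%G' "$1") || return 1
  if [ "$actual" = "$2" ]; then
    echo "OK: $1 is $actual"
  else
    echo "MISMATCH: $1 is $actual, expected $2" >&2
    return 1
  fi
}

# Usage (from a root shell such as `sudo -s`, since /etc/munge is usually
# not readable by ordinary users):
#   expect_perms /etc/munge/munge.key "400 munge:munge"
```

The same helper can be pointed at /var/spool/slurm-llnl or /var/log/slurm_jobacct.log to confirm the chown from the earlier step took effect.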
Start services slurmctld and slurmd:
sudo service slurmd start
sudo service slurmctld start
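Once both services are started, it is worth confirming that the node daemon and the controller can actually talk to each other. A minimal smoke test, assuming the standard Slurm client tools sinfo, scontrol and srun were installed by the slurm-wlm package (slurm_smoke_test is a hypothetical helper name):

```shell
# Hypothetical smoke test: run a few standard Slurm client commands in order
# and stop at the first one that fails.
slurm_smoke_test() {
  for tool in sinfo scontrol srun; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; return 1; }
  done
  sinfo || return 1            # list partitions and node state
  scontrol ping || return 1    # ask whether slurmctld is responding
  srun -n1 hostname            # run a trivial one-task job
}

# Usage:
#   slurm_smoke_test
```

If scontrol ping reports the controller as down, looking at the slurmctld service status and log before resubmitting anything is probably the quickest next step.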
Hi,
Thanks, this worked successfully, but when I check the status like this:
sandeep@sandeep-VirtualBox:~$ systemctl status slurmd.service
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2022-04-30 22:50:47 IST; 9min ago
Docs: man:slurmd(8)
Process: 12931 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 12933 (slurmd)
Tasks: 1
Memory: 4.1M
CGroup: /system.slice/slurmd.service
└─12933 /usr/sbin/slurmd
Apr 30 22:50:47 sandeep-VirtualBox systemd[1]: Starting Slurm node daemon...
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12931]: Node reconfigured socket/core boundaries SocketsPerBoard=4:1(hw) CoresPe>
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12931]: Message aggregation disabled
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12931]: CPU frequency setting not configured for this node
Apr 30 22:50:47 sandeep-VirtualBox systemd[1]: slurmd.service: Can't open PID file /run/slurmd.pid (yet?) after start: Operation not permitted
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12933]: slurmd version 19.05.5 started
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12933]: slurmd started on Sat, 30 Apr 2022 22:50:47 +0530
Apr 30 22:50:47 sandeep-VirtualBox systemd[1]: Started Slurm node daemon.
Apr 30 22:50:47 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12933]: CPUs=4 Boards=1 Sockets=1 Cores=4 Threads=1 Memory=7951 TmpDisk=82909 Up>
Apr 30 22:50:56 sandeep-VirtualBox slurmd-sandeep-VirtualBox[12933]: error: Unable to register: Unable to contact slurm controller (connect f>
At the end I got a "Can't open PID file" error and "Unable to contact slurm controller". Can you help with what to do about this?