Skip to content

Instantly share code, notes, and snippets.

@asmateus
asmateus / instructions.md
Last active October 24, 2024 06:53
Simple Slurm configuration in Debian based systems

Slurm Configuration Debian based Cluster

Here I will describe a simple configuration of the slurm management tool for launching jobs in a really simplistic cluster. I will assume the following configuration: a main node (for me it is an Arch Linux distribution) and 3 compute nodes (for me compute nodes are Debian VMs). I also assume there is ping access between the nodes and some sort of mechanism for you to know the IP of each node at all times (most basic should be a local NAT with static IPs)

Basic Structure

Slurm management tool work on a set of nodes, one of which is considered the master node, and has the slurmctld daemon running; all other compute nodes have the slurmd daemon. All communications are authenticated via the munge service and all nodes need to share the same authentication key. Slurm by default holds a journal of activities in a directory configured in the slurm.conf file, however a Database management system can be set. All in all what we will try to do is:

  • Install `munge