This guide will help you set up a Spot-based cluster that uses multiple instance types. To accomplish this, we make a few assumptions:
a. Instances share the same number of vCPUs; in this case they all have 96 vCPUs:
c5.24xlarge
r5.24xlarge
m5.24xlarge
c5.metal
r5.metal
m5.metal
b. All instances are launched in a single AZ
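The first assumption can be checked before touching the cluster. The snippet below is a minimal sketch: the `VCPUS` table is hardcoded from the list above, and `check_uniform_vcpus` is a hypothetical helper name, not part of ParallelCluster.

```python
# vCPU counts for the instance types listed above (all 96, per the guide).
VCPUS = {
    "c5.24xlarge": 96,
    "r5.24xlarge": 96,
    "m5.24xlarge": 96,
    "c5.metal": 96,
    "r5.metal": 96,
    "m5.metal": 96,
}


def check_uniform_vcpus(instance_types, vcpus=VCPUS):
    """Return the shared vCPU count, or raise if the types disagree."""
    counts = {vcpus[t] for t in instance_types}
    if len(counts) != 1:
        raise ValueError(f"instance types differ in vCPU count: {sorted(counts)}")
    return counts.pop()


# Iterating a dict yields its keys, so this checks every listed type.
print(check_uniform_vcpus(VCPUS))
```

To consider another instance type, add it to the table first; the check raises a `ValueError` if its vCPU count does not match the rest.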
- Create a cluster. I used the following config:
[global]
cluster_template = multi-instance
update_check = true
sanity_check = true
[aws]
aws_region_name = us-east-1
[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
[cluster multi-instance]
base_os = alinux2
key_name = amzn2
vpc_settings = us-east-1
scheduler = slurm
master_instance_type = c5.2xlarge
compute_instance_type = c5.24xlarge
initial_queue_size = 0
max_queue_size = 200
maintain_initial_size = true
disable_hyperthreading = true
fsx_settings = fsx
cluster_type = spot
[fsx fsx]
shared_dir = /fsx
storage_capacity = 1200
[vpc us-east-1]
vpc_id = vpc-b5d7e3cc
master_subnet_id = subnet-5eda8e04
- Go to ASG Console > parallelcluster-[cluster_name] > Details > Note the name of the Launch Template, e.g. ComputeServerLaunchTemplate_ncE8QrsA7HW9
- Go to Launch Templates Console > find the template whose name you noted above and click Modify template (Create new version)
- On the Edit screen, click the Instance type drop-down and select Don't include in launch template.
Under Advanced details, uncheck Request Spot instances.
Save the new version.
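The console steps so far (look up the launch template, then save a version without the instance type and the Spot request) could also be scripted. This is a minimal, untested sketch assuming `boto3` is installed and credentials are configured. The helper names are hypothetical, and the `parallelcluster-<cluster_name>` prefix match is an assumption based on the console path above. The data returned by `describe_launch_template_versions` may need further adjustment before `create_launch_template_version` accepts it.

```python
import copy


def find_launch_template_name(groups, cluster_name):
    """Return the launch template name of the cluster's Auto Scaling group.

    `groups` is the "AutoScalingGroups" list from describe_auto_scaling_groups.
    """
    prefix = f"parallelcluster-{cluster_name}"  # assumed naming convention
    for group in groups:
        if group["AutoScalingGroupName"].startswith(prefix):
            return group["LaunchTemplate"]["LaunchTemplateName"]
    raise LookupError(f"no Auto Scaling group starts with {prefix!r}")


def strip_for_mixed_instances(template_data):
    """Copy launch template data without the instance type and Spot request."""
    data = copy.deepcopy(template_data)
    data.pop("InstanceType", None)           # "Don't include in launch template"
    data.pop("InstanceMarketOptions", None)  # uncheck "Request Spot instances"
    return data


def create_stripped_version(cluster_name):
    """Untested sketch of the AWS calls; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    asg = boto3.client("autoscaling")
    # Only the first page of results is read here; large accounts need pagination.
    groups = asg.describe_auto_scaling_groups()["AutoScalingGroups"]
    name = find_launch_template_name(groups, cluster_name)

    ec2 = boto3.client("ec2")
    versions = ec2.describe_launch_template_versions(
        LaunchTemplateName=name, Versions=["$Latest"]
    )
    current = versions["LaunchTemplateVersions"][0]["LaunchTemplateData"]
    return ec2.create_launch_template_version(
        LaunchTemplateName=name,
        LaunchTemplateData=strip_for_mixed_instances(current),
    )
```

The two pure helpers can be exercised offline against saved API responses; only `create_stripped_version` talks to AWS.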
- Go back to ASG Console > parallelcluster-[cluster_name] > Edit > Select Launch Template Version > Select 2
Under fleet composition, select Combine purchase options and instances. Select the Instance Types (all with the same number of vCPUs) that you want to use.
Click Save.
Go back to ASG Console > parallelcluster-[cluster_name] > Edit > Set Desired to 200 instances:
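The last two console edits (switching the group to a mixed fleet on launch template version 2, then setting the desired capacity) correspond to a single `update_auto_scaling_group` request. The sketch below only builds the request arguments. `mixed_instances_update` is a hypothetical helper, the group and template names in the usage lines are placeholders, and the all-Spot `InstancesDistribution` is an assumption chosen to match `cluster_type = spot`, since the guide does not say which purchase split was selected.

```python
def mixed_instances_update(asg_name, template_name, instance_types,
                           template_version="2", desired=200):
    """Build keyword arguments for autoscaling.update_auto_scaling_group."""
    return {
        "AutoScalingGroupName": asg_name,
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": template_name,
                    "Version": template_version,
                },
                # One override per instance type the fleet may launch.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            # Assumption: no On-Demand capacity, i.e. an all-Spot fleet.
            "InstancesDistribution": {
                "OnDemandBaseCapacity": 0,
                "OnDemandPercentageAboveBaseCapacity": 0,
            },
        },
        "DesiredCapacity": desired,
    }


kwargs = mixed_instances_update(
    "parallelcluster-demo-ComputeFleet-EXAMPLE",   # placeholder group name
    "ComputeServerLaunchTemplate_EXAMPLE",         # placeholder template name
    ["c5.24xlarge", "r5.24xlarge", "m5.24xlarge"],
)
print(kwargs["MixedInstancesPolicy"]["LaunchTemplate"]["Overrides"])
# To apply: boto3.client("autoscaling").update_auto_scaling_group(**kwargs)
```

The desired capacity of 200 matches `max_queue_size = 200` in the config; a request above the group's maximum size would be rejected.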
Grab a cup of ☕️ as it scales
🚀









@sean-smith - Is there an equivalent to this in later versions of parallelcluster? I had to roll back to an older version to get pcluster to generate an autoscaling group.