Tainting and Labeling Kubernetes Nodes to Run Special Workloads - A quick guide that is finally NOT confusing
All right folks, I intend to keep this one short and that's what I will do. I mean, it's supposed to be easy, but the official documentation (1, 2) makes it unnecessarily confusing. So I think maybe I can help fill in the gap.
I will be using one of our business requirements at Buffer as the example for this blog post.
So, we need a few nodes that are dedicated to running cronjobs, and nothing else. At the same time, we want to make sure the cronjobs are scheduled to these nodes, and nowhere else. This means we need 2 things:
- Tainted nodes that don't take other workloads
- Workloads that only go to the destination nodes
Now, let's start with the nodes, then move on to the workload.
Since the requirement breaks down into 2 aspects (see above), there are 2 things we will need to specify for the node(s). As always, kops is my weapon of choice.
In kops you can do this with:
kops edit ig <INSTANCE GROUP IN INTEREST>
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: steven.k8s.com
  name: frequent-cronjob-nodes
spec:
  image: kope.io/k8s-1.13-debian-stretch
  machineType: m4.xlarge
  maxSize: 2
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: frequent-cronjob-nodes
  role: Node
  subnets:
  - us-east-1b
  - us-east-1c
  taints:
  - dedicated=frequent-cronjob-nodes:NoSchedule
The taint prevents other workloads from being scheduled onto these nodes. It's achieved by these 2 lines:
taints:
- dedicated=frequent-cronjob-nodes:NoSchedule
The label helps a specialized workload locate the nodes. It's achieved by these 2 lines:
nodeLabels:
  kops.k8s.io/instancegroup: frequent-cronjob-nodes
I know there are people out there who don't use kops. If you are one of them, here are 2 commands to help:
kubectl taint nodes <NODE IN INTEREST> dedicated=frequent-cronjob-nodes:NoSchedule
kubectl label nodes <NODE IN INTEREST> kops.k8s.io/instancegroup=frequent-cronjob-nodes
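The taint string used above packs three pieces into one value: a key, an optional value, and an effect, written as key=value:effect. As a rough illustration only (this is not how kubectl or kops parse it internally, and the function name parse_taint is made up for this sketch), here is a small Python snippet that pulls the string apart so you can see which part is which:

```python
from typing import NamedTuple, Optional

# The three taint effects Kubernetes defines.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


class Taint(NamedTuple):
    key: str
    value: Optional[str]  # None when the taint is written as key:effect
    effect: str


def parse_taint(spec: str) -> Taint:
    """Split 'key=value:effect' (or 'key:effect') into its parts.

    Hypothetical helper for illustration; not part of any Kubernetes tool.
    """
    # The effect is whatever follows the last colon.
    key_value, sep, effect = spec.rpartition(":")
    if not sep or effect not in VALID_EFFECTS:
        raise ValueError(f"missing or unknown effect in {spec!r}")
    # The value is optional: without '=' the taint only has a key.
    key, eq, value = key_value.partition("=")
    if not key:
        raise ValueError(f"missing key in {spec!r}")
    return Taint(key, value if eq else None, effect)


taint = parse_taint("dedicated=frequent-cronjob-nodes:NoSchedule")
print(taint.key, taint.value, taint.effect)
```

The key and value are what the workload's toleration has to match later on, and the effect (NoSchedule here) is what keeps non-tolerating pods off the node.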
Similar to the nodes, we will need to do 2 things to the deployment/cronjob YAML file. I'm including the complete YAML so you don't have to piece it together from fragments.
# batch/v1beta1 suits older clusters; use batch/v1 on Kubernetes 1.21+
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  namespace: dev
  name: steven-cron
  labels:
    app: steven-cron
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            kops.k8s.io/instancegroup: frequent-cronjob-nodes
          tolerations:
          - key: dedicated
            value: frequent-cronjob-nodes
            operator: "Equal"
            effect: NoSchedule
          containers:
          - name: steven-cron
            image: buffer/steven-cron
            command: ["php", "./src/Crons/index.php"]
          # Job pods must set restartPolicy to OnFailure or Never
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: buffer
The toleration makes sure the workload can be scheduled onto the tainted nodes. It's achieved by these lines:
tolerations:
- key: dedicated
  value: frequent-cronjob-nodes
  operator: "Equal"
  effect: NoSchedule
The nodeSelector makes sure the workload is scheduled only onto the specified nodes. It's achieved by these 2 lines:
nodeSelector:
  kops.k8s.io/instancegroup: frequent-cronjob-nodes
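If you are wondering why the workload needs both the toleration and the nodeSelector, a toy model may help. The Python sketch below is a heavy simplification of my own, not the real scheduler: it only understands the "Equal" operator and the NoSchedule effect, and the Node and Pod classes, the can_schedule function, and the "nodes" instance group name for the general-purpose node are all made up for illustration.

```python
from dataclasses import dataclass, field

# Toy model only: handles the "Equal" operator and the NoSchedule effect.
# The real scheduler also weighs resources, affinity, and much more.


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)
    taints: list = field(default_factory=list)  # (key, value, effect) tuples


@dataclass
class Pod:
    name: str
    node_selector: dict = field(default_factory=dict)
    tolerations: list = field(default_factory=list)  # (key, value, effect) tuples


def can_schedule(pod: Pod, node: Node) -> bool:
    # nodeSelector: every requested label must be on the node with the same value.
    selector_ok = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    # Taints: every NoSchedule taint on the node needs a matching toleration.
    taints_ok = all(t in pod.tolerations for t in node.taints if t[2] == "NoSchedule")
    return selector_ok and taints_ok


LABEL = "kops.k8s.io/instancegroup"
TAINT = ("dedicated", "frequent-cronjob-nodes", "NoSchedule")

cron_node = Node("cron-node", {LABEL: "frequent-cronjob-nodes"}, [TAINT])
general_node = Node("general-node", {LABEL: "nodes"})  # hypothetical group name

pods = [
    Pod("selector+toleration", {LABEL: "frequent-cronjob-nodes"}, [TAINT]),
    Pod("toleration-only", {}, [TAINT]),
    Pod("selector-only", {LABEL: "frequent-cronjob-nodes"}, []),
    Pod("neither"),
]


def placements(pod: Pod) -> list:
    """Names of the nodes this pod could land on in the toy model."""
    return [n.name for n in (cron_node, general_node) if can_schedule(pod, n)]


for pod in pods:
    print(f"{pod.name:20} -> {placements(pod) or 'unschedulable'}")
```

In this model, a pod with only the toleration is allowed on the dedicated node but can still end up on the general one, and a pod with only the nodeSelector targets the dedicated node but gets turned away by the taint. Only the pod with both lands on the cronjob node and nowhere else, which is the behaviour we were after.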
This is it. We can now rest assured that the right workload will go to the right nodes. In this way we can start building specialized node groups for specialized workloads, say GPU nodes for machine learning or memory-intensive nodes for local caching.
I hope this helps in any way. Until next time, please feel free to hit me up on Twitter should you have any questions.