WARNING: This hasn't been tested extensively outside of my environment. Your mileage may vary.
Assumptions:
- Any security group creation or modification that CAPA performs, other than what is specifically flagged below, is acceptable, including the brief disruption that can occur when a security group is modified
- This is valid as of CAPA 2.0.2 and may not work with newer versions (for example, the steps were different pre-2.x, where it was easier to import even the VPC itself)
Importing the existing EKS cluster into CAPA (using a BYO VPC):
- Make sure `AWSManagedControlPlane.spec.eksClusterName` matches the EKS cluster name (a sketch of the resulting spec follows this list)
- Optionally set `AWSManagedControlPlane.spec.network.securityGroupOverrides.controlplane` to match the security group you have on the EKS control plane. If you have extra security groups, I haven't been able to figure out how to import those into CAPA, but they stay attached to the EKS cluster and are simply ignored by CAPA
- Set the VPC information according to the BYO VPC specs: https://cluster-api-aws.sigs.k8s.io/topics/bring-your-own-aws-infrastructure.html#configuring-the-awscluster-specification
- Determine whether you need to set `AWSManagedControlPlane.spec.vpcCni.disabled` based on what you have installed on your cluster
- Make sure the AWS resources have the required tags, as described at https://cluster-api-aws.sigs.k8s.io/topics/bring-your-own-aws-infrastructure.html#tagging-aws-resources
  - Set the tag `kubernetes.io/cluster/<clusterName>` to `owned` or `shared` (as appropriate) on the VPC, subnet, and route table resources
  - Set the tag `kubernetes.io/cluster/<clusterName>` to `owned` on the EKS cluster
  - Set the tags `kubernetes.io/role/internal-elb` and `kubernetes.io/role/elb` on the appropriate subnets
  - Set the tag `sigs.k8s.io/cluster-api-provider-aws/cluster/<clusterName>` to `owned` on the EKS cluster
  - Set the tag `sigs.k8s.io/cluster-api-provider-aws/role` to `common` on the EKS cluster
- Make sure the credentials/IAM role that CAPA runs as has access to the EKS cluster so it can manage things like the CNI and/or `iamAuthenticatorConfig` (via the `aws-auth` ConfigMap)
- If you have an OIDC provider attached, either detach it before applying the YAML manifest or set `AWSManagedControlPlane.spec.associateOIDCProvider: false` (I haven't been able to figure out why CAPA doesn't detect that it's already attached)
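Putting the settings above together, a minimal sketch of the `AWSManagedControlPlane` might look like the following. The owning `Cluster` object is omitted, and the names, region, VPC/subnet/security group IDs, and IAM role mapping are placeholders; double-check field names and values against the CRDs installed by your CAPA version.

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: AWSManagedControlPlane
metadata:
  name: my-cluster-control-plane          # placeholder name
  namespace: default
spec:
  eksClusterName: my-existing-eks-cluster # must match the existing EKS cluster name
  region: us-east-1
  associateOIDCProvider: false            # or detach the existing OIDC provider before applying
  network:
    vpc:
      id: vpc-0123456789abcdef0           # the BYO VPC
    subnets:
      - id: subnet-0123456789abcdef0
      - id: subnet-0123456789abcdef1
    securityGroupOverrides:
      controlplane: sg-0123456789abcdef0  # the existing EKS control plane security group
  # Uncomment if CAPA should not manage the VPC CNI (verify the exact field
  # name against the CRD in your management cluster):
  # vpcCni:
  #   disable: true
  iamAuthenticatorConfig:
    mapRoles:                             # keep the legacy node role mapped in aws-auth
      - rolearn: arn:aws:iam::111122223333:role/legacy-node-role   # placeholder ARN
        username: "system:node:{{EC2PrivateDNSName}}"
        groups:
          - system:bootstrappers
          - system:nodes
```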
Caution: If you are running `kube-proxy` via your legacy code/install and set `AWSManagedControlPlane.spec.kubeProxy.disabled` to `true`, CAPA will uninstall the `kube-proxy` DaemonSet.
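For reference, the relevant portion of the spec is just a flag. Depending on the CAPA version the field may be spelled `disable` rather than `disabled`, so verify against the CRD in your management cluster:

```yaml
spec:
  kubeProxy:
    # When true, CAPA will not manage kube-proxy and will remove an existing
    # kube-proxy DaemonSet, so leave this unset/false while your legacy
    # install is still responsible for kube-proxy.
    disable: true
```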
At this point you are running/managing the EKS cluster via CAPA but the compute nodes are still running/connected using the non-CAPA system.
Migrating the workloads to CAPA-managed compute tiers:
- Create new compute tiers using `MachineDeployment` or `AWSManagedMachinePool` and size them appropriately (see the sketch after this list)
- Cordon the old compute tiers
- If you are using AutoScalingGroups, add the tag `k8s.io/cluster-autoscaler/node-template/taint/managed-by` = `legacy:NoSchedule` to the ASGs (or whatever taint you want to use to tell the cluster-autoscaler that the old nodes will carry a taint)
- Taint the old compute tiers with the above taint. This ensures the cluster-autoscaler knows that any nodes started from these ASGs will come up with the taint, so it won't try to scale them up
- Drain the old compute tier nodes
- You may be able to rely on the cluster-autoscaler to automatically remove the old nodes; if not, delete them from the cluster and terminate the instances
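As a starting point for the new compute tiers, here is a minimal sketch of an EKS managed node group expressed as a `MachinePool` plus `AWSManagedMachinePool` pair (the `MachineDeployment` route mentioned above works too). Names, namespace, replica counts, and instance type are placeholders, and the `MachinePool` feature gate must be enabled in the management cluster.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: capa-pool-1                       # placeholder name
  namespace: default
spec:
  clusterName: my-cluster                 # the CAPI Cluster that owns the imported EKS cluster
  replicas: 3
  template:
    spec:
      clusterName: my-cluster
      bootstrap:
        dataSecretName: ""                # EKS managed node groups need no bootstrap data
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSManagedMachinePool
        name: capa-pool-1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedMachinePool
metadata:
  name: capa-pool-1
  namespace: default
spec:
  instanceType: m5.large                  # size for the workloads being migrated
  scaling:
    minSize: 1
    maxSize: 6
```

Once these nodes have joined and are Ready, the cordon/taint/drain steps above can proceed against the old tiers.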
Now all compute nodes are managed via CAPA
fyi: https://kccnceu2022.sched.com/event/yttp/how-to-migrate-700-kubernetes-clusters-to-cluster-api-with-zero-downtime-tobias-giese-sean-schneeweiss-mercedes-benz-tech-innovation