I recently migrated a web service from ECS Fargate to EKS with self-managed nodes, mostly to see whether I could actually save money by doing so.
With EKS, you pay a flat fee of $0.10 per hour just to run the Kubernetes control plane for a cluster. The savings have to come from the worker nodes: with self-managed nodes, you can pick the exact instance types you want for your cluster instead of paying Fargate's per-vCPU and per-GB rates.
There are a lot of puddles to step in and footguns to run into along the way, and I'm hoping this will help the next person make their move.
This web service exposes an Astro app on the internet. The server portion of the app also has permissions to interact with an S3 bucket. Pretty simple. I also had a log group and some WAF logic in my version of these, but I ripped those out for the sake of simplicity.
Here's what a very simple load-balanced web service might look like in ECS Fargate. This is the simplest I could make it as a "starter". You might add logging or some WAF logic as a next step.
import { Duration, Stack } from 'aws-cdk-lib'
import { Certificate } from 'aws-cdk-lib/aws-certificatemanager'
import { Repository } from 'aws-cdk-lib/aws-ecr'
import {
Cluster,
ContainerImage,
FargateTaskDefinition,
Protocol,
} from 'aws-cdk-lib/aws-ecs'
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns'
import { Effect, PolicyStatement } from 'aws-cdk-lib/aws-iam'
import { Bucket } from 'aws-cdk-lib/aws-s3'
export default class FargateStack extends Stack {
constructor(scope, id, props) {
super(scope, id, props)
const bucket = Bucket.fromBucketName(this, 'MyCoolBucket', 'my-cool-bucket')
const certificate = Certificate.fromCertificateArn(
this,
'MyCoolCertificate',
'arn:aws:acm:region:account:certificate/xxxx-xxxx-xxxx',
)
const repository = Repository.fromRepositoryName(
this,
'MyCoolRepo',
'my-cool-repo',
)
const cluster = new Cluster(this, 'MyCoolCluster', {
clusterName: 'my-cool-cluster',
})
const taskDefinition = new FargateTaskDefinition(
this,
'MyCoolTaskDefinition',
{
cpu: 256, // minimum is 256
memoryLimitMiB: 512, // minimum is 512
},
)
taskDefinition.addToTaskRolePolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: ['s3:PutObject', 's3:GetObject', 's3:ListBucket'],
resources: [bucket.arnForObjects('*'), bucket.bucketArn],
}),
)
const container = taskDefinition.addContainer('MyCoolContainer', {
image: ContainerImage.fromEcrRepository(repository, 'latest'),
})
container.addPortMappings({
containerPort: 4321,
hostPort: 4321,
protocol: Protocol.TCP,
})
const service = new ApplicationLoadBalancedFargateService(
this,
'MyCoolService',
{
certificate,
cluster,
desiredCount: 1,
healthCheckGracePeriod: Duration.minutes(3),
publicLoadBalancer: true,
redirectHTTP: true,
serviceName: 'my-cool-service',
taskDefinition,
},
)
}
}
Here is, for all intents and purposes, the same load-balanced web service in EKS. There are some subtle but important differences to dive into, but if one were to migrate from ECS to EKS, these templates might be a good place to start.
import { Stack } from 'aws-cdk-lib'
import {
AlbControllerVersion,
Cluster,
KubernetesVersion,
MachineImageType,
} from 'aws-cdk-lib/aws-eks'
import { KubectlV31Layer } from '@aws-cdk/lambda-layer-kubectl-v31'
import {
InstanceClass,
InstanceSize,
InstanceType,
Port,
} from 'aws-cdk-lib/aws-ec2'
import { Effect, PolicyStatement, Role, User } from 'aws-cdk-lib/aws-iam'
import { Repository } from 'aws-cdk-lib/aws-ecr'
import { Bucket } from 'aws-cdk-lib/aws-s3'
import { Certificate } from 'aws-cdk-lib/aws-certificatemanager'
export default class EksStack extends Stack {
constructor(scope, id, props) {
super(scope, id, props)
const bucket = Bucket.fromBucketName(this, 'MyCoolBucket', 'my-cool-bucket')
const certificate = Certificate.fromCertificateArn(
this,
'MyCoolCertificate',
'arn:aws:acm:region:account:certificate/xxxx-xxxx-xxxx',
)
const repository = Repository.fromRepositoryName(
this,
'MyCoolRepo',
'my-cool-repo',
)
const cluster = new Cluster(this, 'MyCoolCluster', {
albController: {
version: AlbControllerVersion.V2_8_2,
},
clusterName: 'my-cool-cluster',
defaultCapacity: 0,
kubectlLayer: new KubectlV31Layer(this, 'kubectl'),
version: KubernetesVersion.V1_31,
})
const user = User.fromUserName(this, 'MyCoolConsoleUser', 'myusername')
cluster.awsAuth.addUserMapping(user, {
  username: user.userName,
  groups: ['system:masters'],
})
cluster.awsAuth.addRoleMapping(
Role.fromRoleArn(
this,
'MyCoolRoleMapping',
'arn:aws:iam::account:role/my-cool-cicd-deployment-role',
),
{
username: 'github-actions',
groups: ['system:masters'],
},
)
const nodeGroup = cluster.addAutoScalingGroupCapacity(
'MyCoolSelfManagedNodeGroup',
{
autoScalingGroupName: 'my-cool-self-managed-node-group',
instanceType: InstanceType.of(InstanceClass.T3, InstanceSize.SMALL),
minCapacity: 1,
maxCapacity: 3,
machineImageType: MachineImageType.AMAZON_LINUX_2,
},
)
nodeGroup.connections.allowFromAnyIpv4(
Port.tcpRange(30000, 32767),
'Allow load balancer traffic to NodePort range',
)
const appServiceAccount = cluster.addServiceAccount(
'MyCoolServiceAccount',
{
name: 'my-cool-app-service-account',
namespace: 'default',
},
)
appServiceAccount.addToPrincipalPolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: ['s3:PutObject', 's3:GetObject', 's3:ListBucket'],
resources: [bucket.arnForObjects('*'), bucket.bucketArn],
}),
)
const timestamp = new Date().getTime().toString()
const appDeployment = {
apiVersion: 'apps/v1',
kind: 'Deployment',
metadata: {
name: 'my-cool-deployment',
annotations: { deployed_at: timestamp },
},
spec: {
replicas: 1,
selector: { matchLabels: { app: 'my-cool-app' } },
template: {
metadata: { labels: { app: 'my-cool-app' } },
spec: {
serviceAccountName: 'my-cool-app-service-account',
containers: [
{
name: 'my-cool-container',
image: `${repository.repositoryUri}:latest`,
ports: [{ containerPort: 4321 }],
},
],
},
},
},
}
const appService = {
apiVersion: 'v1',
kind: 'Service',
metadata: {
name: 'my-cool-service',
annotations: { deployed_at: timestamp },
},
spec: {
type: 'ClusterIP',
selector: { app: 'my-cool-app' },
ports: [
{
port: 80,
targetPort: 4321,
protocol: 'TCP',
},
],
},
}
const albIngress = {
apiVersion: 'networking.k8s.io/v1',
kind: 'Ingress',
metadata: {
name: 'my-cool-alb-ingress',
annotations: {
'kubernetes.io/ingress.class': 'alb',
'alb.ingress.kubernetes.io/listen-ports': '[{"HTTPS":443}]',
'alb.ingress.kubernetes.io/certificate-arn':
certificate.certificateArn,
'alb.ingress.kubernetes.io/scheme': 'internet-facing',
'alb.ingress.kubernetes.io/target-type': 'ip',
deployed_at: timestamp,
},
},
spec: {
ingressClassName: 'alb',
tls: [
{
hosts: ['mycooldomain.com'],
},
],
rules: [
{
host: 'mycooldomain.com',
http: {
paths: [
{
path: '/*',
pathType: 'ImplementationSpecific',
backend: {
service: {
name: 'my-cool-service',
port: { number: 80 },
},
},
},
],
},
},
],
},
}
cluster.addManifest('MyCoolDeployment', appDeployment)
cluster.addManifest('MyCoolService', appService)
cluster.addManifest('MyCoolIngress', albIngress)
}
}
Both stacks make references to these resources in the exact same way. There's no difference here.
const bucket = Bucket.fromBucketName(this, 'MyCoolBucket', 'my-cool-bucket')
const certificate = Certificate.fromCertificateArn(
this,
'MyCoolCertificate',
'arn:aws:acm:region:account:certificate/xxxx-xxxx-xxxx',
)
const repository = Repository.fromRepositoryName(
this,
'MyCoolRepo',
'my-cool-repo',
)
Creating the cluster in ECS is short and sweet. You really just need to give it a name, and BAM! You've got a cluster.
const cluster = new Cluster(this, 'MyCoolCluster', {
clusterName: 'my-cool-cluster',
})
There are many ways to initialize an EKS cluster, but I've found this to be the easiest for the load-balanced web service scenario, even though it looks quite a bit more complicated than the ECS example above.
The albController configuration installs the AWS Load Balancer Controller on the cluster.
One of the challenges of working with EKS is that it has its own way of handling ingress/egress and networking. The AWS Load Balancer Controller provides an easy way to create a load balancer that bridges the two systems (Kubernetes and AWS).
The controller has various published versions, and you need to provide one in the configuration. V2_8_2 was the latest version at the time of writing.
You can also install the controller with the standalone AlbController construct, but it's nice that the Cluster construct provides a convenient way to configure it.
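If you did go the standalone route, a minimal sketch (assuming a cluster created without the albController prop) might look like this:
import { AlbController, AlbControllerVersion } from 'aws-cdk-lib/aws-eks'

// Installs the AWS Load Balancer Controller onto an existing cluster --
// equivalent to passing the albController prop to the Cluster construct.
new AlbController(this, 'MyCoolAlbController', {
  cluster,
  version: AlbControllerVersion.V2_8_2,
})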
Because I want self-managed nodes (which are cheaper and more flexible, at the cost of more operational overhead), I need to set defaultCapacity to 0. If this isn't done, the cluster automatically comes with an AWS-managed node group of m5.large nodes, which may be quite a bit more expensive than you're looking for.
Setting defaultCapacity to 0 essentially tells the cluster, "don't worry about node groups when creating the cluster -- I'm going to take care of that myself."
The Kubernetes system itself has a version, and you need to set version
for the cluster. At the time of writing V1_31
was the latest available version within the aws-cdk-lib
JavaScript package.
Finally, kubectlLayer is a required property that takes a kubectl Lambda layer construct (here, KubectlV31Layer). The layer bundles kubectl and Helm so that CDK can talk to your cluster. You'll see Lambda functions created as part of this cluster; they're invoked whenever manifests are applied or kubectl/Helm commands run against the cluster, and their logs are a useful place to dig when a deployment misbehaves.
const cluster = new Cluster(this, 'MyCoolCluster', {
albController: {
version: AlbControllerVersion.V2_8_2,
},
clusterName: 'my-cool-cluster',
defaultCapacity: 0,
kubectlLayer: new KubectlV31Layer(this, 'kubectl'),
version: KubernetesVersion.V1_31,
})
In order to create a Fargate web service on ECS, we can use the FargateTaskDefinition
construct. This creates a task definition for Fargate. In EKS, we'll be creating a launch template in place of a task definition.
The bare minimum for cpu is 256 CPU units (0.25 vCPU), and the bare minimum for memoryLimitMiB is 512 MiB. Roughly speaking, that's in the same ballpark as a t4g.nano instance.
const taskDefinition = new FargateTaskDefinition(
this,
'MyCoolTaskDefinition',
{
cpu: 256, // minimum is 256
memoryLimitMiB: 512, // minimum is 512
},
)
In the example below, I show that adding permissions for the web service is really just a matter of granting permissions to the task role. At its simplest, this can be done with the addToTaskRolePolicy helper.
taskDefinition.addToTaskRolePolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: ['s3:PutObject', 's3:GetObject', 's3:ListBucket'],
resources: [bucket.arnForObjects('*'), bucket.bucketArn],
}),
)
You can define what containers are a part of the task definition with the addContainer
helper. Here, adding a container to the task definition is as simple as providing an image
to be used for the container.
const container = taskDefinition.addContainer('MyCoolContainer', {
image: ContainerImage.fromEcrRepository(repository, 'latest'),
})
You can update the port mappings for a container using the addPortMappings
helper. (4321 is the default port used by Astro).
container.addPortMappings({
containerPort: 4321,
hostPort: 4321,
protocol: Protocol.TCP,
})
VERY IMPORTANT NOTE: When creating launch templates with CDK (which is what happens under the hood when creating the auto scaling group in the EKS stack), you need to set the generateLaunchTemplateInsteadOfLaunchConfig feature flag to true in your cdk.json file. Otherwise, CloudFormation will try to create a launch configuration instead, which is deprecated, and your stack will likely fail to deploy. You set this flag in the context of your cdk.json file:
"context": {
"@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true
}
With the EKS stack, you can add self-managed nodes with the addAutoScalingGroupCapacity
helper. Here you can set the instance type and min and max capacity.
For machineImageType
, you should probably use AMAZON_LINUX_2
from the MachineImageType
enum exported from the eks module of CDK. Optionally, BOTTLEROCKET
on the same enum is a slimmer machine image made by AWS that has only the bare essentials to run your container, but I wouldn't start there if this is your first rodeo.
const nodeGroup = cluster.addAutoScalingGroupCapacity(
'MyCoolSelfManagedNodeGroup',
{
autoScalingGroupName: 'my-cool-self-managed-node-group',
instanceType: InstanceType.of(InstanceClass.T3, InstanceSize.SMALL),
minCapacity: 1,
maxCapacity: 3,
machineImageType: MachineImageType.AMAZON_LINUX_2,
},
)
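If you do want to try Bottlerocket at some point, the only change to this sketch (as far as I can tell -- I haven't battle-tested it) is the machineImageType value:
cluster.addAutoScalingGroupCapacity('MyCoolBottlerocketNodeGroup', {
  autoScalingGroupName: 'my-cool-bottlerocket-node-group',
  instanceType: InstanceType.of(InstanceClass.T3, InstanceSize.SMALL),
  minCapacity: 1,
  maxCapacity: 3,
  // Bottlerocket: AWS's slimmed-down, container-only OS mentioned above
  machineImageType: MachineImageType.BOTTLEROCKET,
})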
You can adjust the node group's security group rules with its connections helpers. The setting below allows load balancer traffic to reach the nodes on the Kubernetes NodePort range (30000-32767).
To be honest, I'm not sure this setting is strictly necessary -- the AWS Load Balancer Controller may wire up the required security group rules on its own -- but it doesn't hurt to have it in there.
nodeGroup.connections.allowFromAnyIpv4(
Port.tcpRange(30000, 32767),
'Allow load balancer traffic to NodePort range',
)
To replicate the S3 permissions that we see in the ECS example, we can create a service account. Later on, we'll provide this service account to containers within our Kubernetes deployment. So, it's a little more complicated and requires a bit more setup, but the result will be the same.
const appServiceAccount = cluster.addServiceAccount(
'MyCoolServiceAccount',
{
name: 'my-cool-app-service-account',
namespace: 'default',
},
)
appServiceAccount.addToPrincipalPolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: ['s3:PutObject', 's3:GetObject', 's3:ListBucket'],
resources: [bucket.arnForObjects('*'), bucket.bucketArn],
}),
)
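One nice thing: the application code doesn't care which of the two setups you use. Whether the credentials come from the ECS task role or from this service account (IRSA), the AWS SDK picks them up through its default credential provider chain. A rough sketch of the app side, assuming the SDK v3 S3 client and a made-up object key:
// In the app itself (not the CDK stack) -- no explicit credentials needed
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

const s3 = new S3Client({})

export async function getCoolObject() {
  // 'some-key' is a placeholder purely for illustration
  return s3.send(
    new GetObjectCommand({ Bucket: 'my-cool-bucket', Key: 'some-key' }),
  )
}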
This is the part where ECS really makes it easy. We can create an application load-balanced Fargate service using the ApplicationLoadBalancedFargateService construct. With the params below, we tell the service to use the taskDefinition and to terminate HTTPS with the given certificate. It's a public web service, so we set publicLoadBalancer to true. The rest of these properties are fairly self-explanatory.
const service = new ApplicationLoadBalancedFargateService(
this,
'MyCoolService',
{
certificate,
cluster,
desiredCount: 1,
healthCheckGracePeriod: Duration.minutes(3),
publicLoadBalancer: true,
redirectHTTP: true,
serviceName: 'my-cool-service',
taskDefinition,
},
)
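Since healthCheckGracePeriod showed up above, it's worth noting that you can also customize the target group's health check itself if your app answers on a specific path. A small sketch, where '/health' is just a hypothetical endpoint:
// Optional: point the ALB health check at a dedicated endpoint
service.targetGroup.configureHealthCheck({
  path: '/health', // hypothetical -- use whatever your app actually serves
})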
This is the challenging part for EKS. We need to set up a few different manifests. A Kubernetes manifest is a JSON/YAML configuration that describes a resource. (You can start to see how using EKS is very much like running an infra system within an infra system.)
We need to create a Deployment
manifest, a Service
manifest, and an Ingress
manifest.
In the deployment manifest, we describe the deployment: which containers we want, which image each container uses, and (optionally) the service account the pods should run as. This is where the previously created service account comes into play, giving our containers their S3 permissions.
The pods carry the label app: my-cool-app. In our Service manifest, a selector for that label forwards internal traffic from port 80 to port 4321, which is where our Astro app is listening.
const appDeployment = {
apiVersion: 'apps/v1',
kind: 'Deployment',
metadata: {
name: 'my-cool-deployment',
annotations: { deployed_at: timestamp },
},
spec: {
replicas: 1,
selector: { matchLabels: { app: 'my-cool-app' } },
template: {
metadata: { labels: { app: 'my-cool-app' } },
spec: {
serviceAccountName: 'my-cool-app-service-account',
containers: [
{
name: 'my-cool-container',
image: `${repository.repositoryUri}:latest`,
ports: [{ containerPort: 4321 }],
},
],
},
},
},
}
The Service manifest creates a stable internal endpoint that routes traffic to the deployment's pods. All incoming traffic on port 80 is routed to port 4321, which is where the Astro app is being served.
const appService = {
apiVersion: 'v1',
kind: 'Service',
metadata: {
name: 'my-cool-service',
annotations: { deployed_at: timestamp },
},
spec: {
type: 'ClusterIP',
selector: { app: 'my-cool-app' },
ports: [
{
port: 80,
targetPort: 4321,
protocol: 'TCP',
},
],
},
}
Finally, the Ingress manifest tells the ALB controller to actually create an application load balancer. The load balancer listens for traffic on port 443 (HTTPS) and serves the domain covered by the given certificate.
There are three layers of port interaction here:
- The application load balancer listens on port 443 for traffic to the mycooldomain.com domain.
- The ClusterIP service forwards internal traffic from port 80 to port 4321.
- The deployment's container exposes port 4321 and receives that traffic.
const albIngress = {
apiVersion: 'networking.k8s.io/v1',
kind: 'Ingress',
metadata: {
name: 'my-cool-alb-ingress',
annotations: {
'kubernetes.io/ingress.class': 'alb',
'alb.ingress.kubernetes.io/listen-ports': '[{"HTTPS":443}]',
'alb.ingress.kubernetes.io/certificate-arn':
certificate.certificateArn,
'alb.ingress.kubernetes.io/scheme': 'internet-facing',
'alb.ingress.kubernetes.io/target-type': 'ip',
deployed_at: timestamp,
},
},
spec: {
ingressClassName: 'alb',
tls: [
{
hosts: ['mycooldomain.com'],
},
],
rules: [
{
host: 'mycooldomain.com',
http: {
paths: [
{
path: '/*',
pathType: 'ImplementationSpecific',
backend: {
service: {
name: 'my-cool-service',
port: { number: 80 },
},
},
},
],
},
},
],
},
}
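The ALB health check can also be tuned here, via extra annotations merged into the ingress's metadata.annotations object (the '/health' path below is just a hypothetical example):
// Optional: controller annotations for tuning the ALB health check
const healthCheckAnnotations = {
  'alb.ingress.kubernetes.io/healthcheck-path': '/health',
  'alb.ingress.kubernetes.io/success-codes': '200',
}
// ...spread these into the albIngress metadata.annotations above if needed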
Finally, we add these manifests to the cluster:
cluster.addManifest('MyCoolDeployment', appDeployment)
cluster.addManifest('MyCoolService', appService)
cluster.addManifest('MyCoolIngress', albIngress)
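One thing to be aware of: addManifest returns a construct, so if you ever hit ordering problems (say, the deployment being applied before its service account exists), you can capture the return value and declare the dependency explicitly. A sketch of what that could look like in place of the bare MyCoolDeployment call above:
// Make sure the service account exists before the deployment that references it
const deploymentManifest = cluster.addManifest('MyCoolDeployment', appDeployment)
deploymentManifest.node.addDependency(appServiceAccount)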
You might have noticed that I added this to the annotations of every manifest:
const timestamp = new Date().getTime().toString()
...
{
annotations: {
...,
deployed_at: timestamp
}
}
This makes each manifest different on every cdk deploy, which gives Kubernetes a clue that a CDK deployment has occurred. Is this necessary? Maybe not, but the point is that you'll need mechanisms to reconcile the state that exists in both CloudFormation (CDK) and Kubernetes.
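An alternative I'd consider (not shown in the stacks above) is deploying an explicit image tag instead of latest, so the manifest naturally changes whenever the image does. A rough sketch, assuming your CI pipeline exports a hypothetical IMAGE_TAG environment variable:
// Hypothetical: read the tag your CI pushed, falling back to 'latest'
const imageTag = process.env.IMAGE_TAG ?? 'latest'

// ...then, in the deployment manifest's container spec:
// image: `${repository.repositoryUri}:${imageTag}`,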
There are a few odds and ends in the EKS stack that are worth mentioning. You don't need an equivalent setup in the ECS stack, but these are additional niceties that can (and probably should) be added to the EKS stack.
While working with the EKS cluster, either in the AWS console or from the CLI, you'll probably want your IAM user or role to be in the system:masters group. The snippet below gives a particular user administrative access to the cluster.
const user = User.fromUserName(this, 'MyCoolConsoleUser', 'myusername')
cluster.awsAuth.addUserMapping(user, {
username: user.userName,
groups: ['system:masters'],
})
Also, you're likely deploying to this EKS stack from some CICD workflow. You'll want to add the particular role that is used for CICD to the system:masters
group.
cluster.awsAuth.addRoleMapping(
Role.fromRoleArn(
this,
'MyCoolRoleMapping',
'arn:aws:iam::account:role/my-cool-cicd-deployment-role',
),
{
username: 'github-actions',
groups: ['system:masters'],
},
)
I hope this provides a solid start to working with either EKS or ECS in CDK. There's still a lot that could be discussed. When working with Kubernetes on AWS, you're grappling with a system within a system, and you'll have to reconcile the state and networking models of both.
This guide is meant to get you up and running; it's not even slightly exhaustive. I'm hoping to write more on this topic in the future, based on more feedback and further experience.