- Generalizing to your own Kubernetes cluster
- A note on non-standard `kops` and `nodeup` binaries
- Installing required binaries
- Setting up "permanent" resources
- Creating the cluster
- Administering the cluster
- Other operations on the cluster
  - Updating your local state
  - Deleting the cluster
  - Stopping the cluster
  - Restarting the cluster
- Setting up the developer sandbox
  - Enabling RBAC
  - Creating and locking down the `dev` namespace
  - Creating users in the developer sandbox
  - (Optional) Push the developer self-registration application
- Appendix
  - `system-access.yml`
  - `dev-access.yml`
  - `dev-kube-config.yml`
  - `nodeport-ingress.tf`
## Generalizing to your own Kubernetes cluster

While the instructions below are specific to operating the "kubernetes.click" cluster, you can use them to operate your own cluster on some other domain. If you purchase the domain through Amazon Route 53 in your own AWS account, then all of the below should apply, replacing "kubernetes.click" with whatever your domain is. If you want to use a domain you've purchased outside of Route 53, and/or a subdomain of a domain you own, that is possible too; just make sure:
- You have a Route 53 Hosted Zone for the (sub)domain you want to use
- You copy the appropriate NS record (see the sketch after this list):
  - For a subdomain, copy the NS record from the Route 53 Hosted Zone for that subdomain into an NS record in the parent domain's zone, pointing to the subdomain
  - For a root domain purchased outside of Route 53, copy the NS record from the Route 53 Hosted Zone for that domain into an NS record at your domain registrar
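If you're not sure which values to copy, you can read them out of Route 53 with the AWS CLI. A minimal sketch, assuming a hypothetical subdomain "dev.example.com" and Hosted Zone ID "Z1EXAMPLE"; substitute your own:

# Find the Hosted Zone ID for the (sub)domain
aws route53 list-hosted-zones-by-name --dns-name dev.example.com

# Print its NS record; these name server values are what you copy into the
# parent zone (or into an NS record at your registrar)
aws route53 list-resource-record-sets \
  --hosted-zone-id Z1EXAMPLE \
  --query "ResourceRecordSets[?Type=='NS']"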
## A note on non-standard `kops` and `nodeup` binaries

In order to:

- get the Kubernetes APIs for role-based access control to work,
- restrict the port range for NodePort Services, and
- ensure the `kubedns` addon can always get scheduled on the master (despite the lack of a rescheduler),

the following instructions refer to a `kops` binary for Mac and a `nodeup` binary for Linux built from this SHA. They are available for download from a public S3 bucket, and their S3 URLs are referenced in the instructions below.
## Installing required binaries

wget https://s3.amazonaws.com/amitkgupta-kops-binaries/kops
chmod +x kops
mv kops /usr/local/bin/kops
## Setting up "permanent" resources

You may wish to create and destroy the cluster many times, but there are certain resources, such as an IAM user, S3 buckets, and DNS resources, that you will likely only want to create or purchase once and reuse every time.
- Create an IAM user with the "AdministratorAccess" policy; we'll call it "kubernetes.click-admin"
- Create an S3 bucket where `kops` will store state; we'll call it "kops-state-bucket.kubernetes.click"
- Create an S3 bucket where `terraform` will store state; we'll call it "terraform-state-bucket.kubernetes.click"
- Edit the bucket policy for both buckets to give access to our IAM user; each bucket's policy should look something like this:
{
  "Version": "2012-10-17",
  "Id": "BUCKET_NAME",
  "Statement": [
    {
      "Sid": "RANDOM_SID_1",
      "Effect": "Allow",
      "Principal": {
        "AWS": "IAM_USER_ARN"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "RANDOM_SID_2",
      "Effect": "Allow",
      "Principal": {
        "AWS": "IAM_USER_ARN"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    }
  ]
}
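For example, you could attach the policy with the AWS CLI; a sketch assuming the JSON above, with real values filled in, is saved as bucket-policy.json:

# Repeat for each of the two buckets
aws s3api put-bucket-policy \
  --bucket kops-state-bucket.kubernetes.click \
  --policy file://bucket-policy.json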
- Purchase the "kubernetes.click" domain from AWS through Route 53; this will automatically create a Route 53 Hosted Zone called "kubernetes.click" for you
## Creating the cluster

- Export AWS environment variables with credentials of the "kubernetes.click-admin" IAM user:
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1
- Create an SSH keypair for later access to cluster VMs:
ssh-keygen -f ~/.ssh/id_rsa_kubernetes.click -N ""
- Create the `kops` cluster configuration:
kops create cluster \
  --cloud=aws \
  --dns-zone=kubernetes.click \
  --kubernetes-version 1.5.1 \
  --master-size=t2.medium \
  --master-zones=us-east-1a \
  --node-count=2 \
  --node-size=t2.medium \
  --ssh-public-key=$HOME/.ssh/id_rsa_kubernetes.click.pub \
  --zones=us-east-1a,us-east-1b \
  --name=kubernetes.click \
  --state=s3://kops-state-bucket.kubernetes.click
Note: we need to jump ahead to Kubernetes version 1.5.1 due to a known issue with RBAC.
- Generate the inputs for `terraform`:
NODEUP_URL=https://s3.amazonaws.com/amitkgupta-kops-binaries/nodeup kops update cluster \
  --state=s3://kops-state-bucket.kubernetes.click \
  --name=kubernetes.click \
  --target=terraform \
  --out=.
- Configure `terraform` to use remote state:
terraform remote config \
  -backend=s3 \
  -backend-config="bucket=terraform-state-bucket.kubernetes.click" \
  -backend-config="key=terraform.tfstate" \
  -backend-config="region=us-east-1"
- Modify the generated `terraform` inputs to disable load balancer services:
sed -i.bak 's/elasticloadbalancing:\*/elasticloadbalancing:DescribeLoadBalancers/' data/aws_iam_role_policy_masters.*_policy
Note: we still need to allow `elasticloadbalancing:DescribeLoadBalancers` even for `NodePort` services due to a known issue.
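A quick sanity check that the `sed` took effect; the only elasticloadbalancing action remaining in the policy should be DescribeLoadBalancers:

grep -o "elasticloadbalancing:[A-Za-z]*" data/aws_iam_role_policy_masters.*_policy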
- Modify the generated `terraform` inputs to associate an ingress ELB for NodePort services with the Autoscaling Group for nodes:
grep load_balancers kubernetes.tf || sed -i.bak '/resource "aws_autoscaling_group" "nodes/a\
load_balancers = ["${aws_elb.nodeport-elb-kubernetes-click.name}"]
' kubernetes.tf
- Copy the `nodeport-ingress.tf` template to your working directory
- From the generated `terraform` inputs, extract the subnets for the Autoscaling Group for nodes and insert that data into `nodeport-ingress.tf`:
NODEPORT_ELB_SUBNETS=$(sed -e '/resource "aws_autoscaling_group" "node/,/vpc_zone_identifier/!d' kubernetes.tf | tail -n1 | cut -f2 -d=)
sed -i.bak "s/subnets.*/subnets = ${NODEPORT_ELB_SUBNETS}/" nodeport-ingress.tf
- Create the cluster:
terraform apply
## Administering the cluster

`kubectl` should just work.
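That's because `kops` sets up your local kubeconfig when it generates the cluster configuration. A couple of quick checks should confirm the cluster is healthy:

kubectl get nodes
kubectl get pods --namespace=kube-system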
## Other operations on the cluster

You should make sure your local state is up-to-date before performing other operations on the cluster.

### Updating your local state

These steps will not perform operations on the cluster; they will just update your local state (they are collected into a single sketch after the list):
- Export AWS environment variables as when creating the cluster
- Generate up-to-date inputs for `terraform` as when creating the cluster, using `kops update cluster`
- Configure `terraform` to use the same remote state as when creating the cluster, using `terraform remote config`
- Modify the generated `terraform` inputs to disable load balancer services as when creating the cluster, using `sed`
- Modify the generated `terraform` inputs to associate an ingress ELB for NodePort services with the Autoscaling Group for nodes as when creating the cluster, using `sed`
- Make sure you have the `nodeport-ingress.tf` template in your working directory as when creating the cluster
- Insert the subnets for the Autoscaling Group for nodes into `nodeport-ingress.tf` as when creating the cluster, using `sed`
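Since these are exactly the commands from cluster creation, they can be collected into one refresh script. A sketch under the same assumptions as above (AWS credentials exported, the same bucket names, and nodeport-ingress.tf already in the working directory):

#!/usr/bin/env bash
set -e

# Regenerate terraform inputs from the kops state
NODEUP_URL=https://s3.amazonaws.com/amitkgupta-kops-binaries/nodeup kops update cluster \
  --state=s3://kops-state-bucket.kubernetes.click \
  --name=kubernetes.click \
  --target=terraform \
  --out=.

# Point terraform at the shared remote state
terraform remote config \
  -backend=s3 \
  -backend-config="bucket=terraform-state-bucket.kubernetes.click" \
  -backend-config="key=terraform.tfstate" \
  -backend-config="region=us-east-1"

# Re-apply the local modifications described above
sed -i.bak 's/elasticloadbalancing:\*/elasticloadbalancing:DescribeLoadBalancers/' data/aws_iam_role_policy_masters.*_policy
grep load_balancers kubernetes.tf || sed -i.bak '/resource "aws_autoscaling_group" "nodes/a\
load_balancers = ["${aws_elb.nodeport-elb-kubernetes-click.name}"]
' kubernetes.tf
NODEPORT_ELB_SUBNETS=$(sed -e '/resource "aws_autoscaling_group" "node/,/vpc_zone_identifier/!d' kubernetes.tf | tail -n1 | cut -f2 -d=)
sed -i.bak "s/subnets.*/subnets = ${NODEPORT_ELB_SUBNETS}/" nodeport-ingress.tf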
### Deleting the cluster

- Make sure your local state is up-to-date
- Delete the cluster:
terraform destroy -force
- Delete the `kops` cluster configuration:
kops delete cluster --name=kubernetes.click --state=s3://kops-state-bucket.kubernetes.click --yes
- Delete the SSH keypair:
rm ~/.ssh/id_rsa_kubernetes.click*
### Stopping the cluster

- Make sure your local state is up-to-date
- Modify the generated `terraform` inputs to set all Autoscaling Groups to 0:
sed -i.bak -E "s/(min_size|max_size).*/\1 = 0/" kubernetes.tf
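Before applying, you can confirm the edit took; every min_size and max_size in kubernetes.tf should now read 0:

grep -E "(min_size|max_size)" kubernetes.tf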
- Update the cluster:
terraform apply
### Restarting the cluster

- Make sure your local state is up-to-date
- Update the cluster:
terraform apply
## Setting up the developer sandbox

These instructions will create a `dev` namespace and leverage role-based access control (RBAC) in Kubernetes to ensure developers only have access to that namespace.
### Enabling RBAC

- Create the RBAC resources to ensure system components work in RBAC mode:

kubectl create -f system-access.yml

(see `system-access.yml` in the Appendix)

- Restrict the cluster to RBAC authorization only:
kops edit cluster --state=s3://kops-state-bucket.kubernetes.click
# set spec.kubeAPIServer.authorizationMode: RBAC
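After the edit, the relevant portion of the cluster spec should look something like this:

spec:
  kubeAPIServer:
    authorizationMode: RBAC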
### Creating and locking down the `dev` namespace

- Create the namespace and associated RBAC resources:

kubectl create -f dev-access.yml

(see `dev-access.yml` in the Appendix)
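To verify, the namespace and the RBAC resources defined in `dev-access.yml` should now exist:

kubectl get namespace dev
kubectl get role,rolebinding --namespace=dev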
### Creating users in the developer sandbox

- Download the `kops`-generated CA certificate and signing key from S3:
s3://kops-state-bucket.kubernetes.click/kubernetes.click/pki/private/ca/*.key
s3://kops-state-bucket.kubernetes.click/kubernetes.click/pki/issued/ca/*.crt
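For example, with the AWS CLI (this drops the files into the current directory; the openssl commands below reference ~/Downloads, so adjust paths to taste):

aws s3 cp --recursive s3://kops-state-bucket.kubernetes.click/kubernetes.click/pki/private/ca/ .
aws s3 cp --recursive s3://kops-state-bucket.kubernetes.click/kubernetes.click/pki/issued/ca/ .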
- Generate a client key:
openssl genrsa -out client-key.pem 2048
- Generate a CSR:
openssl req -new \
  -key client-key.pem \
  -out client-csr.pem \
  -subj "/[email protected]/O=dev"

Note: Kubernetes takes the certificate's Common Name (CN) as the username and its Organization (O) as the group, so O=dev is what places this user in the "dev" group bound in dev-access.yml.
- Generate a client certificate:
openssl x509 -req \
  -in client-csr.pem \
  -CA ~/Downloads/*.crt \
  -CAkey ~/Downloads/*.key \
  -CAcreateserial \
  -out client-crt.pem \
  -days 10000
- Base64-encode the client key, client certificate, and CA certificate, and populate those values in `dev-kube-config.yml` (see the sketch after this list)
- Distribute the populated `dev-kube-config.yml` file to your developers
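A sketch of the base64 step, plus a quick sanity check of the resulting kubeconfig. It assumes the file names from the previous steps and the macOS `base64` (on Linux, `base64 -w0` avoids the `tr` pipeline):

# Values for dev-kube-config.yml (strip newlines so each fits on one line)
base64 < client-crt.pem | tr -d '\n'      # BASE64_ENCODED_CLIENT_CERTIFICATE
base64 < client-key.pem | tr -d '\n'      # BASE64_ENCODED_CLIENT_KEY
base64 < ~/Downloads/*.crt | tr -d '\n'   # BASE64_ENCODED_CA_CERTIFICATE

# The populated config should work in the dev namespace, and nowhere else
kubectl --kubeconfig=dev-kube-config.yml get pods
kubectl --kubeconfig=dev-kube-config.yml get pods --namespace=kube-system   # should be denied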
### (Optional) Push the developer self-registration application

- Create a Route 53 CNAME record pointing `register.kubernetes.click` to `kubernetesclick-developer-registration.cfapps.io`
- Create a GitHub OAuth application with:
  - Homepage URL: https://register.kubernetes.click
  - Authorization callback URL: https://register.kubernetes.click/github_callback
- Push the self-registration application to Pivotal Web Services (sketched below)
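A minimal sketch of the push, assuming you have the application source checked out and a Pivotal Web Services account (the app name here is inferred from the cfapps.io hostname above):

cf login -a api.run.pivotal.io
cf push kubernetesclick-developer-registration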
## Appendix

### `system-access.yml`

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: system:node--kubelet
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
- kind: User
  name: kubelet
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: cluster-admin--kube-system:default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: system:node-proxier--kube-proxy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node-proxier
subjects:
- kind: User
  name: kube-proxy
### `dev-access.yml`

kind: Namespace
apiVersion: v1
metadata:
  name: dev
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  namespace: dev
  name: dev-all
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: dev-role-dev-all-members
  namespace: dev
subjects:
- kind: Group
  name: dev
- kind: Group
  name: system:serviceaccounts:dev
roleRef:
  kind: Role
  name: dev-all
  apiGroup: "rbac.authorization.k8s.io"
### `dev-kube-config.yml`

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: BASE64_ENCODED_CA_CERTIFICATE
    server: https://api.kubernetes.click
  name: kubernetes.click
contexts:
- context:
    cluster: kubernetes.click
    namespace: dev
    user: [email protected]
  name: kubernetes.click-dev
current-context: kubernetes.click-dev
kind: Config
preferences: {}
users:
- name: [email protected]
  user:
    client-certificate-data: BASE64_ENCODED_CLIENT_CERTIFICATE
    client-key-data: BASE64_ENCODED_CLIENT_KEY
data "aws_route53_zone" "zone-kubernetes-click" {
name = "kubernetes.click"
}
resource "aws_route53_record" "dev-a-record-kubernetes-click" {
zone_id = "${data.aws_route53_zone.zone-kubernetes-click.zone_id}"
name = "dev.kubernetes.click"
type = "CNAME"
ttl = "300"
records = ["${aws_elb.nodeport-elb-kubernetes-click.dns_name}"]
}
resource "aws_security_group" "ingress-sg-nodeport-elb-kubernetes-click" {
name = "ingress-sg.nodeport-elb.kubernetes.click"
ingress {
from_port = 30000
to_port = 30099
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
vpc_id = "${aws_vpc.kubernetes-click.id}"
}
resource "aws_elb" "nodeport-elb-kubernetes-click" {
name = "nodeport-elb-kubernetes-click"
security_groups = ["${aws_security_group.nodes-kubernetes-click.id}", "${aws_security_group.ingress-sg-nodeport-elb-kubernetes-click.id}"]
subnets
health_check {
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 3
target = "TCP:22"
interval = 30
}
  listener {
    instance_port     = 30000
    instance_protocol = "tcp"
    lb_port           = 30000
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30001
    instance_protocol = "tcp"
    lb_port           = 30001
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30002
    instance_protocol = "tcp"
    lb_port           = 30002
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30003
    instance_protocol = "tcp"
    lb_port           = 30003
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30004
    instance_protocol = "tcp"
    lb_port           = 30004
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30005
    instance_protocol = "tcp"
    lb_port           = 30005
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30006
    instance_protocol = "tcp"
    lb_port           = 30006
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30007
    instance_protocol = "tcp"
    lb_port           = 30007
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30008
    instance_protocol = "tcp"
    lb_port           = 30008
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30009
    instance_protocol = "tcp"
    lb_port           = 30009
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30010
    instance_protocol = "tcp"
    lb_port           = 30010
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30011
    instance_protocol = "tcp"
    lb_port           = 30011
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30012
    instance_protocol = "tcp"
    lb_port           = 30012
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30013
    instance_protocol = "tcp"
    lb_port           = 30013
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30014
    instance_protocol = "tcp"
    lb_port           = 30014
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30015
    instance_protocol = "tcp"
    lb_port           = 30015
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30016
    instance_protocol = "tcp"
    lb_port           = 30016
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30017
    instance_protocol = "tcp"
    lb_port           = 30017
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30018
    instance_protocol = "tcp"
    lb_port           = 30018
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30019
    instance_protocol = "tcp"
    lb_port           = 30019
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30020
    instance_protocol = "tcp"
    lb_port           = 30020
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30021
    instance_protocol = "tcp"
    lb_port           = 30021
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30022
    instance_protocol = "tcp"
    lb_port           = 30022
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30023
    instance_protocol = "tcp"
    lb_port           = 30023
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30024
    instance_protocol = "tcp"
    lb_port           = 30024
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30025
    instance_protocol = "tcp"
    lb_port           = 30025
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30026
    instance_protocol = "tcp"
    lb_port           = 30026
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30027
    instance_protocol = "tcp"
    lb_port           = 30027
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30028
    instance_protocol = "tcp"
    lb_port           = 30028
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30029
    instance_protocol = "tcp"
    lb_port           = 30029
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30030
    instance_protocol = "tcp"
    lb_port           = 30030
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30031
    instance_protocol = "tcp"
    lb_port           = 30031
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30032
    instance_protocol = "tcp"
    lb_port           = 30032
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30033
    instance_protocol = "tcp"
    lb_port           = 30033
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30034
    instance_protocol = "tcp"
    lb_port           = 30034
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30035
    instance_protocol = "tcp"
    lb_port           = 30035
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30036
    instance_protocol = "tcp"
    lb_port           = 30036
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30037
    instance_protocol = "tcp"
    lb_port           = 30037
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30038
    instance_protocol = "tcp"
    lb_port           = 30038
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30039
    instance_protocol = "tcp"
    lb_port           = 30039
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30040
    instance_protocol = "tcp"
    lb_port           = 30040
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30041
    instance_protocol = "tcp"
    lb_port           = 30041
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30042
    instance_protocol = "tcp"
    lb_port           = 30042
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30043
    instance_protocol = "tcp"
    lb_port           = 30043
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30044
    instance_protocol = "tcp"
    lb_port           = 30044
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30045
    instance_protocol = "tcp"
    lb_port           = 30045
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30046
    instance_protocol = "tcp"
    lb_port           = 30046
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30047
    instance_protocol = "tcp"
    lb_port           = 30047
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30048
    instance_protocol = "tcp"
    lb_port           = 30048
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30049
    instance_protocol = "tcp"
    lb_port           = 30049
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30050
    instance_protocol = "tcp"
    lb_port           = 30050
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30051
    instance_protocol = "tcp"
    lb_port           = 30051
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30052
    instance_protocol = "tcp"
    lb_port           = 30052
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30053
    instance_protocol = "tcp"
    lb_port           = 30053
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30054
    instance_protocol = "tcp"
    lb_port           = 30054
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30055
    instance_protocol = "tcp"
    lb_port           = 30055
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30056
    instance_protocol = "tcp"
    lb_port           = 30056
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30057
    instance_protocol = "tcp"
    lb_port           = 30057
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30058
    instance_protocol = "tcp"
    lb_port           = 30058
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30059
    instance_protocol = "tcp"
    lb_port           = 30059
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30060
    instance_protocol = "tcp"
    lb_port           = 30060
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30061
    instance_protocol = "tcp"
    lb_port           = 30061
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30062
    instance_protocol = "tcp"
    lb_port           = 30062
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30063
    instance_protocol = "tcp"
    lb_port           = 30063
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30064
    instance_protocol = "tcp"
    lb_port           = 30064
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30065
    instance_protocol = "tcp"
    lb_port           = 30065
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30066
    instance_protocol = "tcp"
    lb_port           = 30066
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30067
    instance_protocol = "tcp"
    lb_port           = 30067
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30068
    instance_protocol = "tcp"
    lb_port           = 30068
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30069
    instance_protocol = "tcp"
    lb_port           = 30069
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30070
    instance_protocol = "tcp"
    lb_port           = 30070
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30071
    instance_protocol = "tcp"
    lb_port           = 30071
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30072
    instance_protocol = "tcp"
    lb_port           = 30072
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30073
    instance_protocol = "tcp"
    lb_port           = 30073
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30074
    instance_protocol = "tcp"
    lb_port           = 30074
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30075
    instance_protocol = "tcp"
    lb_port           = 30075
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30076
    instance_protocol = "tcp"
    lb_port           = 30076
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30077
    instance_protocol = "tcp"
    lb_port           = 30077
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30078
    instance_protocol = "tcp"
    lb_port           = 30078
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30079
    instance_protocol = "tcp"
    lb_port           = 30079
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30080
    instance_protocol = "tcp"
    lb_port           = 30080
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30081
    instance_protocol = "tcp"
    lb_port           = 30081
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30082
    instance_protocol = "tcp"
    lb_port           = 30082
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30083
    instance_protocol = "tcp"
    lb_port           = 30083
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30084
    instance_protocol = "tcp"
    lb_port           = 30084
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30085
    instance_protocol = "tcp"
    lb_port           = 30085
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30086
    instance_protocol = "tcp"
    lb_port           = 30086
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30087
    instance_protocol = "tcp"
    lb_port           = 30087
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30088
    instance_protocol = "tcp"
    lb_port           = 30088
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30089
    instance_protocol = "tcp"
    lb_port           = 30089
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30090
    instance_protocol = "tcp"
    lb_port           = 30090
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30091
    instance_protocol = "tcp"
    lb_port           = 30091
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30092
    instance_protocol = "tcp"
    lb_port           = 30092
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30093
    instance_protocol = "tcp"
    lb_port           = 30093
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30094
    instance_protocol = "tcp"
    lb_port           = 30094
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30095
    instance_protocol = "tcp"
    lb_port           = 30095
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30096
    instance_protocol = "tcp"
    lb_port           = 30096
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30097
    instance_protocol = "tcp"
    lb_port           = 30097
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30098
    instance_protocol = "tcp"
    lb_port           = 30098
    lb_protocol       = "tcp"
  }
  listener {
    instance_port     = 30099
    instance_protocol = "tcp"
    lb_port           = 30099
    lb_protocol       = "tcp"
  }
}