These are notes for my Kubernetes/Prometheus workshop. The notes for the Prometheus introduction workshop can be found here.
The first part of this workshop is taken from episode 001 of the excellent TGI Kubernetes series by Heptio.
The demo runs on AWS.
Before the demo, do the following:
Set up K8S cluster as in heptio/aws-quickstart:
docker run --rm -t -i fstab/aws-cli
aws configure
AWS Access Key ID [None]: ...
AWS Secret Access Key [None]: ...
Default region name [None]: eu-central-1
Default output format [None]:
export STACK=fabian-k8s-test-stack
export TEMPLATEPATH=https://s3.amazonaws.com/quickstart-reference/heptio/latest/templates/kubernetes-cluster-with-new-vpc.template
export AZ=eu-central-1a
export INGRESS=0.0.0.0/0
export KEYNAME=fabian-k8s-test-keys
aws cloudformation create-stack --stack-name $STACK --template-body $TEMPLATEPATH --capabilities CAPABILITY_NAMED_IAM --parameters ParameterKey=AvailabilityZone,ParameterValue=$AZ ParameterKey=AdminIngressLocation,ParameterValue=$INGRESS ParameterKey=KeyName,ParameterValue=$KEYNAME ParameterKey=K8sNodeCapacity,ParameterValue=4
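Stack creation takes a while. Optionally, in the same shell (with $STACK still set), wait until it is done and check the status:
aws cloudformation wait stack-create-complete --stack-name $STACK
aws cloudformation describe-stacks --stack-name $STACK --query 'Stacks[0].StackStatus'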
SSH environment
export SSH_KEY=~/.ssh/fabian-k8s-test-keys.pem
export BASTION=... # public domain name of Bastion host, as taken from the AWS Web console
export MASTER=... # internal domain name of the K8S master, as taken from the AWS Web console
Test if SSH to K8S master works
ssh -i $SSH_KEY -o ProxyCommand="ssh -i $SSH_KEY ubuntu@$BASTION -W %h:%p" ubuntu@$MASTER
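Optionally, the ProxyCommand can be put into ~/.ssh/config so that plain ssh and scp work against the master (the host aliases k8s-bastion and k8s-master are made up for this sketch):
Host k8s-bastion
  HostName <public bastion domain name>
  User ubuntu
  IdentityFile ~/.ssh/fabian-k8s-test-keys.pem
Host k8s-master
  HostName <internal master domain name>
  User ubuntu
  IdentityFile ~/.ssh/fabian-k8s-test-keys.pem
  ProxyCommand ssh -W %h:%p k8s-bastion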
Get kubeconfig
mkdir k8stmp
cd k8stmp
scp -i $SSH_KEY -o ProxyCommand="ssh -i \"${SSH_KEY}\" ubuntu@$BASTION -W %h:%p" ubuntu@$MASTER:~/kubeconfig ./kubeconfig
Get kubectl
curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/darwin/amd64/kubectl
chmod 755 kubectl
Configure kubectl
export KUBECONFIG=./kubeconfig
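To check that kubectl can reach the cluster:
./kubectl version
./kubectl cluster-info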
Download and extract helm
from github.com/kubernetes/helm, then copy the helm executable to the current directory
Before we can use helm, we need to create the corresponding service accounts (see http://jayunit100.blogspot.de/2017/07/helm-on.html):
- Create file helm.yaml with the service account definition:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: helm
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: helm
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: helm
  namespace: kube-system
- Create the service account:
./kubectl create -f helm.yaml
- Init helm with that service account:
rm -rf ~/.helm && ./helm init --service-account helm
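To check that the tiller pod came up and that the client and server versions match:
./kubectl get pods --namespace=kube-system | grep tiller
./helm version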
If something goes wrong, reset the service account and helm config with
rm -rf ~/.helm
./kubectl delete deployment tiller-deploy --namespace=kube-system
./kubectl delete service tiller-deploy --namespace=kube-system
- Part 1:
- Look at Kubernetes from a user (software developer) perspective
- Understand what the implications are for monitoring and why simple Nagios checks are not a good fit
- Part 2:
- Run Prometheus in Kubernetes
- See some features like service discovery, labels, ...
- Part 3:
- Where to go from here?
- Three layers: Application Monitoring / Platform (Kubernetes) Monitoring / Infrastructure (VM) Monitoring
Examples taken from episode 001 of the TGI Kubernetes series on Youtube.
./kubectl get nodes
(master might not be marked because of #61)
./kubectl describe nodes <name>
(with <name> replaced with a name from the ./kubectl get nodes output)
./kubectl get pods
(empty)
./kubectl run --generator=run-pod/v1 --image=gcr.io/kuar-demo/kuard-amd64:1 kuard
./kubectl get pods
(shows kuard)
./kubectl port-forward kuard 8080:8080
(then go to http://localhost:8080)
- Explain how there is normally one Docker container per pod, unless you need multiple containers that share the same namespaces, for example the network namespace (see the example pod below)
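A minimal sketch of a pod with two containers sharing the pod's network namespace (the names and the busybox sidecar are just an illustration, not part of the demo):
apiVersion: v1
kind: Pod
metadata:
  name: two-containers
spec:
  containers:
  - name: app
    image: gcr.io/kuar-demo/kuard-amd64:1
  - name: sidecar
    image: busybox
    # the sidecar reaches the app on localhost, e.g. wget -O- localhost:8080
    command: ["sh", "-c", "sleep 3600"]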
- Show the ./kubectl run ... command with the --dry-run -o yaml option, explain Kubernetes' REST API (saving and maintaining the YAML gives you a declarative way of interacting with K8S, while the kubectl command gives you an imperative way).
./kubectl get pods --namespace=kube-system
(Each pod runs in a namespace. The default namespace is called default, so ./kubectl get pods is the same as ./kubectl get pods --namespace=default. The namespace kube-system contains the pods managing Kubernetes itself, just like Linux ps ax shows system processes managing Linux itself)
./kubectl get pods --all-namespaces
./kubectl delete pod kuard
./kubectl run --image=gcr.io/kuar-demo/kuard-amd64:1 kuard --replicas=5
(also with --dry-run -o yaml): This creates a deployment instead of a pod. This is the default, so we don't need the --generator... option.
-> Mention that you would normally use some labels with --labels="key1=value1,key2=value2,..."
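Roughly what the --dry-run -o yaml output looks like for this run command (an abridged sketch; the exact apiVersion and defaulted fields depend on the kubectl version):
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: kuard
  labels:
    run: kuard
spec:
  replicas: 5
  template:
    metadata:
      labels:
        run: kuard
    spec:
      containers:
      - name: kuard
        image: gcr.io/kuar-demo/kuard-amd64:1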
./kubectl get pods
./kubectl get deployments -o wide
./kubectl expose deployment kuard --type=LoadBalancer --port=80 --target-port=8080
(also with --dry-run -o yaml): ./kubectl port-forward works only for single pods. The expose command creates a service. The service can be used to expose deployments: it finds the relevant IP addresses of the pods for the deployment based on labels and creates a virtual IP to expose the deployment. The load balancer is created as an AWS ELB pointing to the virtual IP. Kubernetes also has support for other load balancers, like F5.
./kubectl get service kuard -o wide
- Go to the ELB address shown as output. Hit reload from time to time to show it's redirected to different pods.
./kubectl get endpoints kuard -o yaml
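The service created by expose corresponds roughly to this YAML (a sketch; the selector comes from the run=kuard label that kubectl run put on the pods):
apiVersion: v1
kind: Service
metadata:
  name: kuard
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    run: kuard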
- Window 1
export KUBECONFIG=./kubeconfig
watch -n 0.5 ./kubectl get pods
- Window 2:
while true ; do curl -s a4c4fce20930a11e795fa021eb30cd11-152347899.eu-central-1.elb.amazonaws.com/env/api | jq .env.HOSTNAME ; sleep 0.1 ; done
(replace the hostname with the ELB hostname from ./kubectl get service kuard -o wide)
- Window 3: The following commands edit the deployment yaml on the fly.
export KUBECONFIG=./kubeconfig
./kubectl scale deployment kuard --replicas=10
(scale up)
./kubectl set image deployment kuard kuard=gcr.io/kuar-demo/kuard-amd64:2
(update)
./kubectl rollout undo deployment kuard
(undo update)
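Optionally, the rollout can be watched and inspected while doing this:
./kubectl rollout status deployment kuard
./kubectl rollout history deployment kuard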
Clean up
./kubectl delete service kuard
./kubectl delete deployment kuard
- No fixed IPs or hostnames
- No fixed number of pods
- Monitoring should:
- Have automatic service discovery
- Make use of K8S labels
- Have alerts based on statistical values (example: less than 70% pods available for service A for more than 5 minutes)
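A sketch of what such an alert could look like as a Prometheus rule, assuming kube-state-metrics is scraped (metric and label names depend on the exporter version, and the YAML rule format shown here is the Prometheus 2.x syntax):
groups:
- name: kuard
  rules:
  - alert: KuardTooFewPodsAvailable
    expr: kube_deployment_status_replicas_available{deployment="kuard"} / kube_deployment_spec_replicas{deployment="kuard"} < 0.7
    for: 5m
    annotations:
      summary: less than 70% of the kuard pods have been available for 5 minutes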
Helm is a package manager for pre-configured Kubernetes deployment YAMLs. We are going to use helm to deploy a prometheus/alertmanager/node_exporter/... setup.
Use helm to install Prometheus:
./helm search
(the charts come from https://github.com/kubernetes/charts)
./helm search prometheus
./helm install stable/prometheus --set server.persistentVolume.enabled=false --set alertmanager.persistentVolume.enabled=false --set pushgateway.enabled=false --set rbac.create=true
./helm list
./kubectl get pods -o wide
./kubectl port-forward inky-pika-prometheus-server-1024141831-7zf28 9090
(replace with the prometheus server pod name from ./kubectl get pods)
./kubectl logs inky-pika-prometheus-server-1024141831-7zf28 prometheus-server
(replace with the prometheus server pod name from ./kubectl get pods)
What can we see on http://localhost:9090
- Configuration specifies service discovery, targets are discovered automatically
- K8S Labels are maintained
- Metrics of internal building blocks like etcd are available
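A couple of example queries for the expression browser (up always works; the cAdvisor metric assumes the chart's default scrape config picks up the kubelet/cAdvisor targets):
sum(up) by (job)
topk(10, container_memory_usage_bytes)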
Debugging
./kubectl logs inky-pika-prometheus-server-1024141831-7zf28 prometheus-server
(replace with the prometheus server pod name from ./kubectl get pods)
./kubectl exec -it inky-pika-prometheus-server-1024141831-7zf28 -c prometheus-server ash
(replace with the prometheus server pod name from ./kubectl get pods)
Daemon Sets
./kubectl get daemonsets
(the node_exporter runs as a daemon set on all nodes. Run with --all-namespaces to view system daemon sets like the calico overlay network or etcd)
UI
- Kubernetes comes with a built-in dashboard deployment. Run
./kubectl proxy
and view the UI on http://localhost:8001/ui
Clean up
./helm delete silly-meerkat
(replace silly-meerkat with the name from ./helm list)
Three Layers of Monitoring
- Infrastructure (Bare Metal, VMWare machines, AWS instances, ...)
- Platform (Kubernetes itself, including etcd, calico, components of the K8S master, ...)
- Application
Use a DaemonSet to install the node_exporter on each node. When monitoring a node through a Docker container, make sure to get metrics from the host and not from the container.
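A sketch of what such a DaemonSet can look like (the helm chart above already creates one; this hand-written version only illustrates the host mounts, and the --path.* flags are for newer node_exporter versions, older ones use -collector.procfs / -collector.sysfs):
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: node-exporter
        image: prom/node-exporter
        args:
        # read host metrics via the mounted host filesystems, not the container's own
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        ports:
        - containerPort: 9100
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys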
If all monitoring layers are covered from within Kubernetes, what do you do if the cluster fails?
Clean up the AWS resources after the workshop:
aws cloudformation delete-stack --stack-name $STACK