Created
June 9, 2025 09:55
Lightboard notes from 'Monitoring Kubernetes' ⚡️ Enlightning & 🌩️Thunder episodes
🌩️Thunder Recap: https://youtu.be/AwwhHW4Ev38
⚡️Enlightning Long Form: https://youtu.be/LRXU-cj6CDA
Observability is about better understanding what is happening in a system
✶ fix problems
✶ find issues
✶ build confidence
Common data types in observability
✶ logging
• structured vs unstructured
• events in time
• mostly text
✶ metrics
• measuring events over time
• timestamp + numerical value + label
→ these signals come together in a DASHBOARD (visualization)
✶ traces
• tracking an event across locations
✶ profiles
• tracking trends across locations
➤ An older app might write logs to a logfile
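
To make "structured vs unstructured" concrete, here is a minimal Python sketch. Both log lines and their field names (`order_id`, `retries`) are made up for illustration. The point is only that a structured record can be read by field name, while free text needs a pattern that breaks when the wording changes.

```python
import json
import re

# Hypothetical log lines, for illustration only.
unstructured = "2025-06-09 09:55:01 ERROR payment failed for order 1234 after 3 retries"
structured = (
    '{"ts": "2025-06-09T09:55:01Z", "level": "error", '
    '"msg": "payment failed", "order_id": 1234, "retries": 3}'
)


def order_id_from_text(line):
    """Pull the order id out of free text; fragile if the message wording changes."""
    match = re.search(r"order (\d+)", line)
    return int(match.group(1)) if match else None


def order_id_from_json(line):
    """Read the same value by field name from a structured (JSON) record."""
    return json.loads(line)["order_id"]


print(order_id_from_text(unstructured))
print(order_id_from_json(structured))
```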
Monitoring Kubernetes
is understanding what is happening in a K8s cluster
✶ If you can understand Kubernetes performance,
→ you can improve K8s performance
→ The goal is optimization
• Monitoring K8s is helpful
✶ for bug fixing & right-sizing (efficiency) ✶ cost ✶ energy ✶ app performance
Efficiency in Kubernetes
✶ avoid wasted infrastructure
✶ avoid too-full infrastructure
✶ optimize cost
✶ optimize energy use
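
One way to picture right-sizing is to compare what a workload uses against what it requested. The sketch below is a rough illustration; the function name and the 30% / 90% thresholds are assumptions chosen for the example, not values from the episodes.

```python
def classify_utilization(usage, requested, low=0.3, high=0.9):
    """Label a workload by the ratio of used to requested resources.

    The low/high thresholds are hypothetical defaults for illustration.
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    ratio = usage / requested
    if ratio < low:
        return "wasted"      # paying for capacity that sits idle
    if ratio > high:
        return "too full"    # little headroom left for spikes
    return "ok"


# CPU cores used vs. cores requested (made-up numbers).
print(classify_utilization(0.1, 1.0))
print(classify_utilization(0.95, 1.0))
print(classify_utilization(0.5, 1.0))
```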
Where do we get logs in Kubernetes?
📍 Control Plane
• API Server
📍 Worker
• LOGS
→ events are accessed via API Server
→ probably stored in etcd
✶ Pod logs are stored in directories on the host machines
✶ kubelet/node logs are also stored in separate directories on host machines
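
As a sketch of what those on-host pod logs look like: on nodes with a CRI runtime such as containerd or CRI-O, container log files usually sit under `/var/log/pods/` and each line has the form `<timestamp> <stream> <tag> <message>`, where the tag is `F` for a full line or `P` for a partial one. This assumes that runtime format (Docker's json-file driver differs), and the sample line is made up.

```python
from typing import NamedTuple


class LogLine(NamedTuple):
    timestamp: str
    stream: str     # "stdout" or "stderr"
    partial: bool   # True when the runtime split a long line
    message: str


def parse_cri_line(line):
    """Split one CRI-format log line into its four fields."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) == 3:          # the message can be empty
        parts.append("")
    timestamp, stream, tag, message = parts
    return LogLine(timestamp, stream, tag == "P", message)


# Hypothetical line as it might appear in a pod's 0.log file.
sample = "2025-06-09T09:55:01.123456789Z stdout F GET /healthz 200"
print(parse_cri_line(sample))
```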
How do we capture logs?
✶ Install a daemonset (more common)
• gets all 3 types of logs (events, pod logs, node logs)
✶ Another way: sidecar
• get logs directly from app
• no node logs
• possibly high overhead
• more flexible
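
The daemonset approach could be sketched as below: one collector pod per node, with the host's `/var/log` mounted read-only. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f` accepts. The names `log-collector` and `logging` and the image are placeholders, not a real collector; a real setup would use an actual agent and its own config.

```python
import json


def log_collector_daemonset(image):
    """Build a minimal DaemonSet manifest that mounts the node's /var/log."""
    labels = {"app": "log-collector"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "log-collector", "namespace": "logging"},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "collector",
                        "image": image,
                        "volumeMounts": [{
                            "name": "varlog",
                            "mountPath": "/var/log",
                            "readOnly": True,
                        }],
                    }],
                    # hostPath exposes the node's own log directories to the pod.
                    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
                },
            },
        },
    }


# Placeholder image name, for illustration only.
manifest = log_collector_daemonset("example.com/log-collector:latest")
print(json.dumps(manifest, indent=2))
```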
How do we capture metrics?
3 methods are built in to K8s:
✶ Control plane components have a common set of control plane metrics
✶ each kubelet has metrics
• state of kubelets
• state of pods kubelet is managing
✶ cAdvisor = container metrics
→ ex: resource usage, network traffic
📝 each method stores metrics info in memory and then hands it over when collected (when asked!)
→ scrape interval → how often
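
The "store in memory, hand over when asked" idea can be sketched with a tiny counter that renders itself in a Prometheus-style text format. This is a simplified illustration, not how the Kubernetes components are actually implemented; the `Counter` class, the `scrape` helper, and the metric name are invented for the example.

```python
import time


class Counter:
    """Keeps running totals in memory, keyed by label set."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.values = {}

    def inc(self, amount=1.0, **labels):
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self):
        """Hand over the current values as text; nothing is pushed anywhere."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            series = f"{self.name}{{{label_text}}}" if label_text else self.name
            lines.append(f"{series} {value}")
        return "\n".join(lines) + "\n"


def scrape(counter, now=None):
    """What a collector does once per scrape interval: ask, then timestamp the reply."""
    return (time.time() if now is None else now, counter.render())


handled = Counter("http_requests_total", "Requests handled.")
handled.inc(method="GET")
handled.inc(method="GET")
handled.inc(method="POST")

timestamp, body = scrape(handled)
print(timestamp)
print(body)
```

Each scraped sample ends up as the timestamp + numerical value + label triple described earlier; the scrape interval only decides how often `scrape` runs.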
What types of metrics are commonly collected?
✶ Node exporter → captures all node health
ex: ✶ GPU ✶ CPU ✶ mem ✶ disk space
• gets installed as a daemonset
✶ Kube State Metrics (KSM)
→ captures actual state vs desired state over time
ex: ✶ one pod in a replicaset keeps crashing
✶ OpenCost
→ captures cost of running infrastructure
ex: ✶ knows exact cost of scaling up a cluster in real-time
✶ Kepler
→ captures energy use of a workload
ex: ✶ determine which processor is more energy efficient
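
The "actual state vs desired state" idea from Kube State Metrics can be sketched as a comparison of two series. The metric names below follow KSM's deployment metrics; the sample values, the deployment names, and the `degraded_deployments` helper are made up for illustration.

```python
# (metric name, labels, value) tuples, as they might look after a scrape.
samples = [
    ("kube_deployment_spec_replicas", {"deployment": "web"}, 3),
    ("kube_deployment_status_replicas_available", {"deployment": "web"}, 2),
    ("kube_deployment_spec_replicas", {"deployment": "api"}, 2),
    ("kube_deployment_status_replicas_available", {"deployment": "api"}, 2),
]


def degraded_deployments(samples):
    """Return deployments whose available replicas are below the desired count."""
    desired, available = {}, {}
    for name, labels, value in samples:
        if name == "kube_deployment_spec_replicas":
            desired[labels["deployment"]] = value
        elif name == "kube_deployment_status_replicas_available":
            available[labels["deployment"]] = value
    return sorted(d for d, want in desired.items() if available.get(d, 0) < want)


print(degraded_deployments(samples))
```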
TRACES + PROFILES
✶ most interesting to APP Devs
✶ Kube-API emits traces, APP Devs ARE INTERESTED
✶ APP Devs instrument APPs to generate traces, FOR APP ONLY
→ But, APP observability is an important piece of a holistic view
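
"Tracking an event across locations" boils down to every hop sharing one trace id while recording its own span. The sketch below shows that relationship with plain dataclasses; it is not a real tracing SDK, and the `Span` type, the helper functions, and the service names are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    service: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])


def start_trace(name, service):
    """First span of a request: mints a new trace id."""
    return Span(name, service, trace_id=uuid.uuid4().hex)


def child_of(parent, name, service):
    """A downstream hop: same trace id, parent link to the caller's span."""
    return Span(name, service, trace_id=parent.trace_id, parent_id=parent.span_id)


root = start_trace("GET /checkout", "frontend")
payment = child_of(root, "charge card", "payments")
print(root)
print(payment)
```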
Storing Telemetry Data
Purpose-built backends to store + query data quickly + cost-effectively
✶ logs = huge ✶ metrics = compressed
VISUALIZATION →
Helps us see & understand all this data we’re collecting
→ QUERIES + CALCULATIONS
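
A typical calculation behind a dashboard panel is turning raw counter samples into a per-second rate. The sketch below shows the arithmetic only; it is a simplification that ignores counter resets, which real query engines handle, and the sample data is made up.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v1 - v0) / (t1 - t0)


# Three scrapes, 30 seconds apart (hypothetical values).
scrapes = [(0, 100), (30, 160), (60, 220)]
print(per_second_rate(scrapes))
```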