Lightboard notes from 'Monitoring Kubernetes' ⚡️ Enlightning & 🌩️Thunder episodes
🌩️Thunder Recap: https://youtu.be/AwwhHW4Ev38
⚡️Enlightning Long Form: https://youtu.be/LRXU-cj6CDA
Observability is about better understanding what is happening in a system
✶ fix problems
✶ find issues
✶ build confidence
Common data types in observability
✶ logging
• structured vs unstructured
• events in time
• mostly text
✶ metrics
• measuring events over time
• time stamp + numerical value + label
→ these signals come together in a DASHBOARD (visualization)
✶ traces
• tracking an event across locations
✶ profiles
• tracking trends across locations
➤ An older app might write logs to a logfile
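A minimal sketch of the data types above, to make "structured vs unstructured" and "time stamp + numerical value + label" concrete. The field names, the metric name, and the MetricSample class are hypothetical, chosen only for illustration.

```python
import json
from dataclasses import dataclass, field

# Unstructured log: free-form text; a reader has to parse it with a regex.
unstructured = "2024-01-01T12:00:00Z ERROR payment failed for order 42"

# Structured log: the same event as key/value fields (JSON here),
# which makes filtering on a field such as "level" straightforward.
structured = json.dumps({
    "ts": "2024-01-01T12:00:00Z",
    "level": "ERROR",
    "msg": "payment failed",
    "order_id": 42,
})


@dataclass
class MetricSample:
    """One metric data point: timestamp + numerical value + labels."""
    name: str
    timestamp: float          # seconds since the Unix epoch
    value: float
    labels: dict = field(default_factory=dict)


# Hypothetical sample; the name and labels are made up for this sketch.
sample = MetricSample(
    name="http_requests_total",
    timestamp=1704110400.0,
    value=1027.0,
    labels={"method": "GET", "code": "200"},
)

# Reading the structured log back gives direct access to each field.
event = json.loads(structured)
```

With the structured form, `event["level"]` is available directly, while the unstructured line needs pattern matching to recover the same field.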
Monitoring Kubernetes
is understanding what is happening in a K8s cluster
✶ If you can understand Kubernetes performance,
→ you can improve K8s performance
→ The goal is optimization
• Monitoring K8s is helpful
✶ for bug fixing & right-sizing (efficiency) ✶ cost ✶ energy ✶ app performance
Efficiency in Kubernetes
✶ avoid wasted infrastructure
✶ avoid too-full infrastructure
✶ optimize cost
✶ optimize energy use
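One way to picture right-sizing is to compare what a workload actually uses with what it requested. This is a sketch under assumptions: the 30% and 90% thresholds and the function names are hypothetical, not values from the episode.

```python
def utilization(used: float, requested: float) -> float:
    """Return the fraction of the requested resource that is actually used."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    return used / requested


def classify(used: float, requested: float,
             low: float = 0.3, high: float = 0.9) -> str:
    """Label a workload by utilization.

    The low/high thresholds are illustrative assumptions only.
    """
    ratio = utilization(used, requested)
    if ratio < low:
        return "over-provisioned"   # wasted infrastructure
    if ratio > high:
        return "too full"           # little headroom left
    return "right-sized"
```

For example, `classify(0.1, 1.0)` flags a pod that uses a tenth of a CPU core it requested in full as over-provisioned.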
Where do we get logs in Kubernetes?
📍 Control Plane
• API Server
📍 Worker
• LOGS
→ events are accessed via API Server
→ probably stored in etcd
✶ Pod logs are stored in directories on the host machines
✶ kubelet/node logs also stored in separate directories on host machines
How do we capture logs?
✶ Install a daemonset
• gets all 3 types of logs (events, pod logs, kubelet/node logs)
(more common ←)
✶ Another way: sidecar
• get logs directly from app
• no node logs
• possibly high overhead
• more flexible
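A sketch of what a log collector on the node might do with one line of a pod log file. The notes do not name a log format; this assumes the CRI logging format used by runtimes such as containerd and CRI-O, where each line is "timestamp stream tag message". The names CriLogLine and parse_cri_line are hypothetical.

```python
from typing import NamedTuple


class CriLogLine(NamedTuple):
    timestamp: str   # RFC 3339 timestamp written by the container runtime
    stream: str      # "stdout" or "stderr"
    tag: str         # "F" for a full line, "P" for a partial line
    message: str     # the text the application wrote


def parse_cri_line(line: str) -> CriLogLine:
    """Split one CRI-format log line into its four fields."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) < 3:
        raise ValueError(f"not a CRI log line: {line!r}")
    # An empty message leaves only three parts after the split.
    message = parts[3] if len(parts) > 3 else ""
    return CriLogLine(parts[0], parts[1], parts[2], message)
```

A collector running as a daemonset would read these files from the host's log directories and forward the parsed fields to a log backend.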
How do we capture metrics?
3 methods are built in to K8s:
✶ Control plane components have a set of common control plane metrics
✶ Each kubelet has metrics
• state of kubelets
• state of pods kubelet is managing
✶ cAdvisor = container metrics
→ ex: resource usage, network traffic
📝 each method stores metrics info in memory and then hands it over when collected (when asked!)
→ scrape interval → how often
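The "stored in memory, handed over when asked" model can be sketched as a counter that only produces output when a scraper calls it. This assumes the Prometheus text exposition format for the output; the class name InMemoryCounter and the metric name in the usage note are hypothetical.

```python
class InMemoryCounter:
    """Keeps counter values in memory and renders them only on request."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values = {}   # maps a sorted tuple of label pairs to a float

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the counter for one label combination."""
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        """Return the current values as text; called once per scrape."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"
```

Nothing leaves the process until `render()` is called, so the scrape interval decides how often a collector sees fresh values, for example after `counter.inc(pod="web-1")` on a counter named `pod_restarts_total`.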
What types of metrics are commonly collected?
✶ Node exporter → captures all node health
ex: ✶ GPU ✶ CPU ✶ mem ✶ disk space
• gets installed as a daemonset
✶ Kube State Metrics (KSM)
→ captures actual state vs desired state over time
ex: ✶ one pod in a replicaset keeps crashing
✶ OpenCost
→ captures cost of running infrastructure
ex: ✶ knows exact cost of scaling up a cluster in real-time
✶ Kepler
→ captures energy use of a workload
ex: ✶ determine which processor is more energy efficient
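The "actual state vs desired state" idea from Kube State Metrics can be sketched as a comparison of two numbers per workload. The input dictionaries stand in for values a real setup would read from metrics such as kube_replicaset_spec_replicas and kube_replicaset_status_ready_replicas; the function name and the sample data are hypothetical.

```python
def find_drift(desired: dict, ready: dict) -> dict:
    """Return workloads whose ready replica count differs from the desired count.

    desired: workload name -> replicas requested in the spec
    ready:   workload name -> replicas currently ready
    The result maps each drifting workload to a (desired, ready) pair.
    """
    drift = {}
    for name, want in desired.items():
        have = ready.get(name, 0)   # a missing entry counts as zero ready
        if have != want:
            drift[name] = (want, have)
    return drift
```

A replicaset with one pod that keeps crashing would show up here as, say, `(3, 2)`, and tracking that pair over time shows how long the gap persists.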
TRACES + PROFILES
✶ most interesting to APP Devs
✶ Kube-API emits traces, APP Devs ARE INTERESTED
✶ APP Devs instrument APPs to generate traces, FOR APP ONLY
→ But, APP observability is an important piece of a holistic view
Storing Telemetry Data
Purpose-built backends store and query telemetry quickly + cost-effectively
✶ logs = huge ✶ metrics = compressed
VISUALIZATION →
Helps us see & understand all this data we’re collecting
→ QUERIES + CALCULATIONS
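A typical calculation behind a dashboard panel is turning raw counter samples into a per-second rate. This is a simplified sketch: it assumes the samples are sorted by time and ignores counter resets, which real query engines handle. The function name is hypothetical.

```python
def per_second_rate(samples: list) -> float:
    """Average per-second increase of a counter over a window.

    samples: list of (timestamp_seconds, counter_value) pairs, sorted by time.
    Counter resets are not handled in this sketch.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time range")
    return (v1 - v0) / (t1 - t0)
```

Given samples scraped every 30 seconds, such as `[(0, 100), (30, 130), (60, 160)]`, the function reports the average growth per second across the window.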