wiggitywhitney · June 9, 2025 09:55
 🌩️Thunder Recap: https://youtu.be/AwwhHW4Ev38
 ⚡️Enlightning Long Form: https://youtu.be/LRXU-cj6CDA


 Observability is about better understanding what is happening in a system
 ✶ fix problems
 ✶ find issues
 ✶ build confidence

 Common data types in observability
 ✶ logging
 • structured vs unstructured
 • events in time
 • mostly text
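 A minimal sketch of the structured vs unstructured difference, using only the Python standard library. The event, the field names (`event`, `user`, `latency_ms`) and the user `alice` are made up for illustration.

```python
import json
from datetime import datetime, timezone


def unstructured_log(user: str, latency_ms: int) -> str:
    # Free-form text: easy for a human to read, hard to query reliably.
    return f"user {user} logged in, took {latency_ms}ms"


def structured_log(user: str, latency_ms: int, ts: datetime) -> str:
    # The same event as key/value fields, so a backend can filter on them.
    return json.dumps({
        "ts": ts.isoformat(),
        "event": "login",
        "user": user,
        "latency_ms": latency_ms,
    })


ts = datetime(2025, 6, 9, 9, 55, tzinfo=timezone.utc)
plain = unstructured_log("alice", 42)
record = structured_log("alice", 42, ts)
print(plain)
print(record)
```

 With the structured form, a question like "all logins slower than 40ms" becomes a field comparison instead of a text search.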

 ✶ metrics
 • measuring events over time
 • timestamp + numerical value + label
 → these signals come together in a DASHBOARD (visualization)
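 To make "timestamp + numerical value + label" concrete, here is a small sketch that holds one metric sample and prints it in the style of the Prometheus text format. The `Sample` class, the metric name and the label values are illustrative assumptions, not something from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    name: str
    value: float
    timestamp_ms: int
    labels: dict = field(default_factory=dict)

    def to_text(self) -> str:
        # name{label="value",...} value timestamp
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.name}{{{label_str}}} {self.value} {self.timestamp_ms}"


sample = Sample("http_requests_total", 1027, 1395066363000,
                {"method": "post", "code": "200"})
line = sample.to_text()
print(line)
```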

 ✶ traces
 • tracking an event across locations
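 One way to picture "tracking an event across locations": every hop records a span, and all spans for one request share a trace ID. The sketch below is a simplification (one child per span) and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


spans = [
    Span("t1", "a", None, "frontend", 120.0),
    Span("t1", "b", "a", "checkout", 80.0),
    Span("t1", "c", "b", "database", 35.0),
]


def path(spans: list, trace_id: str) -> list:
    # Follow parent -> child links starting from the root span
    # (assumes each span has at most one child).
    by_parent = {s.parent_id: s for s in spans if s.trace_id == trace_id}
    hop, services = by_parent.get(None), []
    while hop is not None:
        services.append(hop.service)
        hop = by_parent.get(hop.span_id)
    return services


route = path(spans, "t1")
print(route)
```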

 ✶ profiles
 • tracking trends across locations

 ➤ An older app might write logs to a logfile

 Monitoring Kubernetes
 is understanding what is happening in a K8s cluster
 ✶ If you can understand Kubernetes performance,
 → you can improve K8s performance

 → The goal is optimization
 • Monitoring K8s is helpful
 ✶ for bug fixing & right-sizing (efficiency) ✶ cost ✶ energy ✶ app performance

 Efficiency in Kubernetes
 ✶ avoid wasted infrastructure
 ✶ avoid too-full infrastructure
 ✶ optimize cost
 ✶ optimize energy use
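 A rough sketch of a right-sizing check: compare what a workload actually uses against what it requested. The thresholds (30% and 90%) and the function name are assumptions picked for illustration, not recommended values.

```python
def classify(usage_cores: float, requested_cores: float,
             low: float = 0.3, high: float = 0.9) -> str:
    if requested_cores <= 0:
        raise ValueError("requested_cores must be positive")
    utilization = usage_cores / requested_cores
    if utilization < low:
        return "wasted"      # paying for capacity that sits idle
    if utilization > high:
        return "too full"    # little headroom left for spikes
    return "ok"


print(classify(0.1, 1.0))
print(classify(0.95, 1.0))
print(classify(0.5, 1.0))
```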

 Where do we get logs in Kubernetes?
 📍 Control Plane
 • API Server

 📍 Worker
 • LOGS

 → events are accessed via API Server
 → stored in etcd by default

 ✶ Pod logs are stored in directories on the host machines
 ✶ kubelet/node logs are also stored in separate directories on host machines
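 On nodes with a CRI container runtime, pod log files typically live under `/var/log/pods/` and each line follows the pattern `<timestamp> <stream> <P|F> <message>`. The parser below is a minimal sketch under that assumption; the sample line is invented.

```python
def parse_cri_line(line: str) -> dict:
    # Split into at most 4 parts so spaces inside the message survive.
    ts, stream, tag, message = line.rstrip("\n").split(" ", 3)
    return {
        "time": ts,
        "stream": stream,          # stdout or stderr
        "partial": tag == "P",     # P = partial line, F = full line
        "message": message,
    }


entry = parse_cri_line(
    "2025-06-09T09:55:00.123456789Z stdout F hello from the app\n"
)
print(entry)
```

 A log collector running on the node would read these files and forward the parsed entries.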

 How do we capture logs?
 ✶ Install a daemonset
 • gets all 3 types of logs
 (more common ←)

 ✶ Another way: sidecar
 • get logs directly from app
 • no node logs
 • possibly high overhead
 • more flexible

 How do we capture metrics?
 3 methods are built into K8s:
 ✶ Control plane components have a common set of control plane metrics
 ✶ each kubelet has metrics
 • state of kubelets
 • state of pods kubelet is managing

 ✶ cAdvisor = container metrics
 → ex: resource usage, network traffic

 📝 each method stores metrics info in memory and then hands it over when collected (when asked!)
 → scrape interval → how often
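 The "keep it in memory, hand it over when asked" idea can be simulated without any network. In this sketch the counter name, the 15 second scrape interval and the one-request-per-second load are all assumed values.

```python
class InMemoryCounter:
    """Holds the current value in memory until somebody asks for it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        self.value += amount

    def render(self) -> str:
        # What a collector would receive when it scrapes this target.
        return f"{self.name} {self.value}"


requests = InMemoryCounter("app_requests_total")
scrape_interval_s = 15  # assumed value
scraped = []
for second in range(1, 46):
    requests.inc()                       # one request per second
    if second % scrape_interval_s == 0:  # the collector asks every 15s
        scraped.append((second, requests.render()))

print(scraped)
```

 Nothing is recorded between scrapes, so a shorter interval gives finer detail at the price of more data.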

 What types of metrics are commonly collected?
 ✶ Node exporter → captures all node health
 ex: ✶ GPU ✶ CPU ✶ mem ✶ disk space
 • gets installed as a daemonset

 ✶ Kube State Metrics (KSM)
 → captures actual state vs desired state over time
 ex: ✶ one pod in a replicaset keeps crashing
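 A sketch of the desired vs actual comparison. The two metric names are ones kube-state-metrics exposes for Deployments; the sample values and the helper function are hypothetical.

```python
def replica_gap(samples: dict) -> int:
    # Desired replicas minus replicas that are actually available.
    desired = samples["kube_deployment_spec_replicas"]
    available = samples["kube_deployment_status_replicas_available"]
    return int(desired - available)


samples = {
    "kube_deployment_spec_replicas": 3,
    "kube_deployment_status_replicas_available": 2,
}
gap = replica_gap(samples)
print(gap)
```

 A gap that stays above zero over time is the kind of signal that points at a pod that keeps crashing.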

 ✶ OpenCost
 → captures cost of running infrastructure
 ex: ✶ knows exact cost of scaling up a cluster in real-time
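 To illustrate the kind of arithmetic behind a scale-up cost, here is a simplified sketch. It is not OpenCost's actual pricing model, and the per-core and per-GiB hourly prices are invented.

```python
def hourly_cost(cpu_cores: float, mem_gib: float,
                cpu_price: float = 0.03, mem_price: float = 0.004) -> float:
    # Assumed prices: dollars per core-hour and per GiB-hour.
    return cpu_cores * cpu_price + mem_gib * mem_price


before = hourly_cost(8, 32)
after = hourly_cost(12, 48)
extra_per_hour = after - before
print(round(extra_per_hour, 3))
```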

 ✶ Kepler
 → captures energy use of a workload
 ex: ✶ determine which processor is more energy efficient
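 Energy is power multiplied by time, so a processor that draws more power can still win if it finishes sooner. The sketch below compares two hypothetical processors running the same workload; the wattages and durations are made up, and it does not use Kepler itself.

```python
def energy_joules(avg_power_watts: float, duration_s: float) -> float:
    return avg_power_watts * duration_s


runs = {
    "cpu_a": energy_joules(65.0, 120.0),  # lower power, slower
    "cpu_b": energy_joules(90.0, 70.0),   # higher power, faster
}
most_efficient = min(runs, key=runs.get)
print(runs, most_efficient)
```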

 TRACES + PROFILES
 ✶ most interesting to APP Devs
 ✶ Kube-API emits traces, APP Devs ARE INTERESTED
 ✶ APP Devs instrument APPs to generate traces, FOR APP ONLY
 → But, APP observability is an important piece of a holistic view

 Storing Telemetry Data
 Purpose-built backends store and query telemetry quickly + cost-effectively
 ✶ logs = huge ✶ metrics = compressed
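 One reason metrics compress well: samples arrive at a regular interval, so storing the change in the gap between timestamps yields mostly zeros. This is a sketch of that general idea (delta-of-delta encoding), not a description of any particular backend.

```python
def delta_of_delta(timestamps: list) -> list:
    # First the gaps between samples, then the change in those gaps.
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


regular = [0, 15, 30, 45, 60]  # scraped every 15 seconds
encoded = delta_of_delta(regular)
print(encoded)
```

 Log lines are mostly free text, so they do not shrink the same way.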

 VISUALIZATION →
 Helps us see & understand all this data we’re collecting
 → QUERIES + CALCULATIONS
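 A typical calculation behind a dashboard panel turns a counter that only goes up into a per-second rate. This is a minimal sketch with invented samples; it ignores counter resets, which real query engines handle.

```python
def per_second_rate(samples: list) -> float:
    # samples: (timestamp_seconds, counter_value) pairs, oldest first.
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing time")
    return (v1 - v0) / (t1 - t0)


samples = [(0, 100), (30, 160), (60, 220)]
rate = per_second_rate(samples)
print(rate)
```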