Below is a “menu” of tactics you can mix-and-match depending on whether your 6-node, 48-vCPU cluster lives on-prem or in the cloud. I’ve ordered them roughly from quickest wins to bigger-picture moves.
- Set realistic CPU `requests` (what the scheduler reserves) and leave `limits` unset or higher than `requests` so Pods can “burst” when you hit a spike.
  - Over-estimated requests can leave most of the cluster idle (up to 90 % in bad cases) while the scheduler still treats the nodes as “full.”
- Use `kubectl top` (a built-in command that needs metrics-server) for a point-in-time view, or Grafana dashboards for history, to discover the p95 CPU demand per Pod and calibrate.
  - Once tuned, keep roughly 30 % head-room above that p95; that alone may absorb many spikes.
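
To make the calibration concrete, here is a minimal Python sketch of the arithmetic: take per-Pod CPU samples, find the p95, add head-room, and build a `resources` stanza. Everything in it is an assumption for illustration: the helper names (`percentile`, `suggest_cpu_request`, `resources_stanza`), the nearest-rank percentile method, the rounding to 10m, and the sample values, which are made up rather than measured. In practice the samples would come from whatever your dashboards export.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based nearest rank: smallest rank covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def suggest_cpu_request(samples_millicores, headroom=0.30, round_to=10):
    """p95 of the samples plus head-room, rounded up to a multiple of round_to."""
    p95 = percentile(samples_millicores, 95)
    raw = p95 * (1 + headroom)
    return int(math.ceil(raw / round_to) * round_to)


def resources_stanza(request_millicores, burst_factor=None):
    """Build a dict shaped like a container's `resources` field.

    With burst_factor=None the CPU limit is left unset so the Pod can burst;
    otherwise the limit is set to request * burst_factor (hypothetical knob).
    """
    resources = {"requests": {"cpu": f"{request_millicores}m"}}
    if burst_factor is not None:
        limit = int(request_millicores * burst_factor)
        resources["limits"] = {"cpu": f"{limit}m"}
    return resources


# Made-up CPU samples in millicores for one Pod, including one spike.
samples = [120, 135, 110, 140, 150, 125, 130, 145, 160, 155,
           115, 138, 142, 128, 133, 148, 152, 137, 122, 900]

request = suggest_cpu_request(samples)
print(request, resources_stanza(request))
```

With these invented samples the p95 is 160m, so the suggested request is 210m, and the single 900m spike is left to be absorbed by bursting rather than baked into the request. Calling `resources_stanza(request, burst_factor=2)` instead would add a 420m CPU limit for clusters where unset limits are not acceptable.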