In production environments on a budget, it's hard to provision enough capacity to absorb the worst possible usage spikes. When a bad combination of memory spikes hits, performance degrades quickly and the node eventually becomes unresponsive for an unacceptably long time.
Under low-memory conditions, a node kills processes based on their cgroup memory limits and OOM scores, and/or reclaims page cache via kswapd. When that still doesn't free enough memory to operate normally, the node becomes unresponsive, usually with kswapd thrashing.
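For context, these signals can be inspected directly on a node. The snippet below is a minimal sketch that reads the kernel's MemAvailable estimate, the pod-level cgroup memory limit and usage, and per-process OOM scores; the cgroup path assumes cgroup v2 with the systemd driver, and `my-app` is a placeholder process name.

```sh
# Kernel's estimate of memory available before heavy reclaim/OOM kicks in (kB)
grep MemAvailable /proc/meminfo

# Memory limit and current usage at the kubepods cgroup (cgroup v2, systemd driver)
cat /sys/fs/cgroup/kubepods.slice/memory.max
cat /sys/fs/cgroup/kubepods.slice/memory.current

# Score the kernel uses to pick an OOM victim (higher = killed first)
for pid in $(pgrep -f my-app); do
  echo "pid=$pid oom_score=$(cat /proc/$pid/oom_score)"
done
```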
To mitigate this problem, you could set up swap on each node, but that's not an option if your system has to maintain low latency. If your production environment runs in a high-availability configuration, it's better to kill and restart applications than to serve with very high latency.
To head off severe memory starvation, you can deploy this user-land OOM trigger script, which fires before available memory drops too low to keep the node healthy:
```sh
kubectl apply -f oom-killer.yml
```
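The manifest itself isn't reproduced here; as a rough sketch (not the actual contents of oom-killer.yml), a DaemonSet like the one below would run a small privileged loop on every node that watches MemAvailable and, once it drops below a threshold, writes `f` to /proc/sysrq-trigger to have the kernel invoke its OOM killer. The name, image, threshold, and polling interval are all illustrative.

```yaml
# Illustrative sketch only -- not the actual contents of oom-killer.yml.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: oom-trigger
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: oom-trigger
  template:
    metadata:
      labels:
        app: oom-trigger
    spec:
      containers:
      - name: oom-trigger
        image: busybox:1.36          # any image with a POSIX shell would do
        securityContext:
          privileged: true           # needed so /proc/sysrq-trigger is writable
        command:
        - /bin/sh
        - -c
        - |
          # Poll the kernel's estimate of available memory and, when it drops
          # below the threshold, ask the kernel to run its OOM killer.
          THRESHOLD_KB=262144        # ~256 MiB; tune per node size
          while true; do
            avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
            if [ "$avail_kb" -lt "$THRESHOLD_KB" ]; then
              echo "MemAvailable=${avail_kb}kB below threshold, invoking OOM killer"
              echo f > /proc/sysrq-trigger
              sleep 30               # give the kernel time to reclaim before re-checking
            fi
            sleep 5
          done
```

Triggering the kernel OOM killer this way (sysrq `f`) kills the single task with the highest OOM score rather than letting the whole node stall, which is exactly the trade-off the high-availability argument above relies on.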