systemd-oomd was enabled by default in Fedora 33, and more recently in Ubuntu 22.04. After trying some large software builds on these distros, I ran into some unexpected problems.
Overall, I think this is a good thing for most users, but I need to tune the defaults for cases where I'm willing to give up some responsiveness for reliability under heavy load.
- Don't run a huge build with too many threads inside Gnome Terminal. Each tab/window runs in its own cgroup, and if it consumes too much memory/swap or keeps memory pressure too high, the whole cgroup gets killed. You can't see what was going on at the moment of the kill unless you were tee'ing the output to a file (see the sketch after the next item).
- Podman also creates one cgroup per container, but at least you may be able to get some clues from
podman container ls -a
and the container's logs:
podman logs <container>
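For the terminal case, the simplest workaround I've found is to capture the build output as it runs so there's something to inspect after a kill; the -j count and log name here are just placeholders:

make -j8 2>&1 | tee build.log

Newer systemd also documents a ManagedOOMPreference= setting (avoid/omit) that looks like it could keep a unit from being picked as the kill candidate, but I haven't verified that here.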
I tried raising the max memory on my Hyper-V VM, and the VM would consume more memory, but the build's cgroup was still killed. I don't know whether systemd-oomd was recalculating available memory based on what hv_balloon "added" (actually released back to the guest), or whether it was simply too slow to respond. Either way, I had better luck setting the minimum RAM equal to the maximum, making it effectively a static allocation. More experimentation here may be helpful.
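To see what the guest is actually working with while the balloon changes, watching total memory alongside the kernel's PSI numbers (which is what systemd-oomd keys off of) is a quick sanity check:

watch -n1 'free -m; cat /proc/pressure/memory'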
- See what was killed:
journalctl -u systemd-oomd
- Current status, including a helpful list of slices:
oomctl
- Change the threshold for kills (an example override is sketched below):
sudo systemctl edit [email protected]
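For reference, this is the kind of drop-in I'm experimenting with via that edit command; the 80% figure is just my current guess, not a recommendation:

[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=80%

The override lands under /etc/systemd/system/[email protected]/, and I log out and back in (or reboot) to be sure it takes effect.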
My initial guess is that turning off zram swap will make things a bit more predictable. If the system is under a temporary memory usage spike, it seems better to have real disk-backed swap than to hold back physical memory and compress pages that will only be needed for a short time.
sudo dnf remove zram-generator-defaults
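After removing the package and rebooting, it's easy to confirm the compressed swap device is gone:

swapon --show
zramctl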
I'd like to carve that swap out of the existing btrfs volume (e.g., as a swap file), but I'm not sure that's achievable. I may need to just add another vdisk to the VM and use an old-school swap partition. A possible swap-file approach is sketched below.
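If the swap-file route works out, this is roughly the sequence I'd try, subject to the usual btrfs swap-file caveats (kernel 5.0+, the file must be NOCOW and uncompressed, on a single device, and not snapshotted); the path and size are placeholders:

truncate -s 0 /swapfile
chattr +C /swapfile
fallocate -l 8G /swapfile
chmod 0600 /swapfile
mkswap /swapfile
swapon /swapfile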