@2E0PGS
Last active May 29, 2024 13:55
Fixing khugepaged CPU usage VMware Workstation

If you run VMware Workstation 11 or above, you may encounter high CPU usage from the khugepaged process on Ubuntu 15.04+.

The fix is to disable transparent hugepages. Ubuntu seems to have them enabled by default.

You can check the current status on your system by running:

cat /sys/kernel/mm/transparent_hugepage/enabled

cat /sys/kernel/mm/transparent_hugepage/defrag

Fedora outputs: always [madvise] never, while Ubuntu outputs: [always] madvise never

Fedora seems not to be affected, but I haven't tested it myself.

So I suggest not using madvise and just disabling it entirely.

To disable it run the following commands as root:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

echo never > /sys/kernel/mm/transparent_hugepage/defrag
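Note: if you are not in a root shell, plain sudo will not apply to the shell redirection in the commands above; a common workaround is to pipe through tee instead:

echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag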

That will only disable it for the current session.

To make it persistent across reboots, I suggest adding this to your rc.local:

# Fix for VMware Workstation 11+ khugepaged.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Ensure this goes above the line:

exit 0
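For reference, a complete minimal /etc/rc.local might look like the sketch below (the shebang can vary by distro, and the file must be executable, e.g. chmod +x /etc/rc.local):

#!/bin/sh -e
# Fix for VMware Workstation 11+ khugepaged.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
exit 0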

More info and references:

@ajgringo619

I ran into this as well on EndeavourOS with the 5.15 zen kernel. After switching to the 5.10 LTS kernel the problem went away. I will definitely try this latest fix as my overall system performance was better with zen; thank you!

@Nantris

Nantris commented Jan 10, 2022

Ubuntu doesn't use rc.local anymore. What's the correct way to implement these modifications without an rc.local file?

@ckuhtz

ckuhtz commented Jan 10, 2022

Script launched via systemd. Pretty straightforward to set up. Google can help.
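For anyone looking for a concrete starting point, here is a minimal oneshot unit as a sketch; the unit name disable-thp.service and its description are my own choice, not a standard:

# /etc/systemd/system/disable-thp.service
[Unit]
Description=Disable transparent hugepages (VMware khugepaged fix)

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target

Then enable it with: sudo systemctl daemon-reload && sudo systemctl enable --now disable-thp.service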

@Nantris

Nantris commented Jan 10, 2022

All I've found are suggestions to create an rc.local file and service. Is that really the best way? If so, why was the rc.local file removed after Ubuntu 16.x?

@neves-0

neves-0 commented Jan 10, 2022

The best way is to add vm.compaction_proactiveness=0 to the /etc/sysctl.conf file, as said by @gene-olson, and reboot (or run "sysctl -p" to apply the change without rebooting).
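As a quick sketch of those two steps:

echo 'vm.compaction_proactiveness=0' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p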

@ckuhtz

ckuhtz commented Jan 10, 2022

@neves-0 is right, that's the most efficient way, although I personally sometimes have additional logic around these things and keep them as sysctl calls in a script for that reason.

@slapbox you need to ask Linux folks why systemd :-)..

@ajgringo619

Testing went well with the Arch linux-zen kernel; thanks again.

For Arch users, create a .conf file in /etc/sysctl.d/.
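For example (the file name 99-vmware-compaction.conf is arbitrary):

echo 'vm.compaction_proactiveness=0' | sudo tee /etc/sysctl.d/99-vmware-compaction.conf
sudo sysctl --system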

@shuhaowu

shuhaowu commented Mar 2, 2022

Setting vm.compaction_proactiveness to 0 doesn't solve the problem for me...

@xeyownt

xeyownt commented May 25, 2022

Same for me.

None of the above works for me. I did the transparent_hugepage things and compaction_proactiveness, both directly, in /etc/, and in the VMware *.vmx file. Checking the various variables from /proc and /sys returns the expected values. Still, kcompactd0 kicks in. It is slightly less severe than it used to be, but it still freezes the guest to death.

I'm thinking about recompiling the kernel and disabling that kcompactd0 thing altogether.

Host Debian 5.16.18-1 (2022-03-29) x86_64 GNU/Linux, 16GB RAM.
VMWare 16.2.3 build-19376536
Guest Windows 10 (64-bit), 8GB RAM allocated.

@gene-olson

When I first set only:
vm.compaction_proactiveness=0
back in December, it appeared to be a complete fix, just like a number of previous (but different) fixes did.

However I now see some regular pauses, but much less serious ones. I might lock up occasionally for 10-30 seconds, but generally the problem clears after a few iterations and the system is fine again. It's an annoyance, for sure, but it doesn't block my work.

In case it lends any light on the problem, I am using a 12-core AMD with 64 GB of memory on Ubuntu 20.04. I have seen this problem in Windows VMs with 16 GB of virtual RAM, and 4 or 6 virtual CPUs.

I fear that this fix, like all the others, will gradually become less effective, finally making VMWare Workstation unusable. Previously this problem became so bad I temporarily switched to VirtualBox, and never saw the problem there.

@exeq89

exeq89 commented Jun 3, 2022

I can confirm what gene-olson says.
I started getting freezes even with compaction_proactiveness set to 0.

@lijx10

lijx10 commented Jun 8, 2022

Confirmed that the Win10 guest freezes and kcompactd0 kicks in with compaction_proactiveness=0.

Tested with:
Ubuntu 22.04 kernel 5.15.35 / 5.16.20
VMWare 16.2.3 / 16.2.0 / 16.1.2
Win10 Guest, 4GB / 6GB memory allocated

@exeq89

exeq89 commented Jun 12, 2022

I have lowered the memory for 3D graphics and it's better.
My config is:
Ryzen 5 5600G / 32 GB RAM
Guest is Win10 with 6 CPUs and 16 GB RAM.
Lowered graphics to 256 MB.

@clapbr

clapbr commented Aug 1, 2022

I have lowered the memory for 3D graphics and it's better. My config is: Ryzen 5 5600G / 32 GB RAM. Guest is Win10 with 6 CPUs and 16 GB RAM. Lowered graphics to 256 MB.

Same here; reduced to 1 GB and it's fine now.

@thebahadir

thebahadir commented Nov 18, 2022

I've been dealing with this issue for months on Debian 11. None of the mentioned methods worked for me, but today I solved it: what was described here helped.

@msizanoen1

msizanoen1 commented Feb 2, 2023

These three lines as root should fully disable kernel memory defragmentation:

echo never > /sys/kernel/mm/transparent_hugepage/defrag
sysctl -w vm.compaction_proactiveness=0
sysctl -w vm.extfrag_threshold=1000

Note that this will greatly increase memory fragmentation and therefore memory pressure, as compaction is fully disabled, and it should be reverted when VMware is not in use.

How to revert:

sysctl -w vm.compaction_proactiveness=20
sysctl -w vm.extfrag_threshold=500
echo always > /sys/kernel/mm/transparent_hugepage/defrag
sysctl -w vm.compact_memory=1
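Since these settings are meant to be toggled around VMware sessions, a small wrapper script can save some typing. A sketch using the exact values above (the script name vmware-defrag.sh is illustrative; run as root):

#!/bin/sh
# vmware-defrag.sh -- toggle kernel memory compaction around VMware sessions.
# Usage: vmware-defrag.sh off   (before starting VMware)
#        vmware-defrag.sh on    (after VMware exits)
case "$1" in
off)
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    sysctl -w vm.compaction_proactiveness=0
    sysctl -w vm.extfrag_threshold=1000
    ;;
on)
    sysctl -w vm.compaction_proactiveness=20
    sysctl -w vm.extfrag_threshold=500
    echo always > /sys/kernel/mm/transparent_hugepage/defrag
    sysctl -w vm.compact_memory=1
    ;;
*)
    echo "Usage: $0 on|off" >&2
    exit 1
    ;;
esac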

@shuhaowu

shuhaowu commented Jan 11, 2024

I tried the above options by @msizanoen1. Indeed the VM is now usable. However, it is still slow and I see that kswapd0 now occasionally will peg 1 CPU at 100%, despite the fact that I have no swap enabled. This process is only running when I run a VMware VM, which suggests this is somehow linked...

@msizanoen1

I tried the above options by @msizanoen1. Indeed the VM is now usable. However, it is still slow and I see that kswapd0 now occasionally will peg 1 CPU at 100%, despite the fact that I have no swap enabled. This process is only running when I run a VMware VM, which suggests this is somehow linked...

It's likely that not using swap was the cause of kswapd consuming 100%. Generally it's not recommended to run a Linux system without some kind of swap, and this might be especially true when running with memory defragmentation disabled.
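If you want to try adding swap without repartitioning, a swapfile is a quick option; a sketch (the 8G size is illustrative):

sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab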

@eudocimus

Try setting /proc/sys/vm/compaction_proactiveness to 1. The thing is, you need to compact eventually. What you want to avoid is a war between VMware and the kernel. This will obviously happen if you compact too eagerly, which is the default. But if you compact too lazily, for example by not being proactive at all, you will run into a situation where you must do it reactively, with the same bad result. It seems everyone has been missing the proactiveness part. Setting it to 1 has now worked for me for some time, at least weeks, if not months.
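To try this suggestion, something like the following should work (the sysctl.d file name is arbitrary):

sudo sysctl -w vm.compaction_proactiveness=1
echo 'vm.compaction_proactiveness=1' | sudo tee /etc/sysctl.d/99-compaction.conf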

@msizanoen1

msizanoen1 commented May 29, 2024

Try setting /proc/sys/vm/compaction_proactiveness to 1. The thing is, you need to compact eventually. What you want to avoid is a war between VMware and the kernel. This will obviously happen if you compact too eagerly, which is the default. But if you compact too lazily, for example by not being proactive at all, you will run into a situation where you must do it reactively, with the same bad result. It seems everyone has been missing the proactiveness part. Setting it to 1 has now worked for me for some time, at least weeks, if not months.

AFAIK (and through my own testing and reading of the kernel source) setting vm.extfrag_threshold=1000 and disabling transparent hugepage defrag will prevent the kernel from ever compacting memory, reactively or not, and will cause it to fall back to swapping pages out of memory and/or invoking the OOM killer instead.

vm.compaction_proactiveness and /sys/kernel/mm/transparent_hugepage/defrag control different aspects of proactive memory compaction, while vm.extfrag_threshold controls reactive memory compaction (e.g. when the kernel needs to allocate a large chunk of contiguous memory). Setting vm.extfrag_threshold to 1000 disables reactive memory compaction.
