The problem is described here. The quick fix:
echo "options vhost max_mem_regions=512" > /etc/modprobe.d/vhost.conf
rmmod vhost_net
rmmod vhost
modprobe vhost_net
cat /sys/module/vhost/parameters/max_mem_regions
# 512
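To avoid eyeballing sysfs by hand, the same check can be wrapped in a small sketch; the 512 target mirrors the modprobe.d line above, and the sysfs path is the standard location for module parameters:

```shell
#!/usr/bin/env bash
# Sketch: confirm vhost picked up the new parameter after the module reload.
EXPECTED=512
PARAM=/sys/module/vhost/parameters/max_mem_regions
if [ -r "$PARAM" ]; then
    CURRENT=$(cat "$PARAM")
    if [ "$CURRENT" -eq "$EXPECTED" ]; then
        echo "OK: max_mem_regions=${CURRENT}"
    else
        echo "WARN: max_mem_regions=${CURRENT}, expected ${EXPECTED} (reload vhost_net/vhost)"
    fi
else
    echo "vhost module is not loaded"
fi
```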
Problem description is here. The situation becomes critical when you use clustered file storage (Gluster/Ceph) and have dozens of disk drives attached to a VM: inside the VMs you may notice strange block-device locks, high CPU load, etc.
Check open file limits:
# soft open-files limit for the current shell and the processes it spawns
root@hv05-htz-nbg1:~# ulimit -n
1024
# allocated file handles, allocated-but-free handles, and the system-wide maximum:
root@hv05-htz-nbg1:~# cat /proc/sys/fs/file-nr
12192 0 9223372036854775807
# get maximum allowed number of open files
root@hv05-htz-nbg1:~# cat /proc/sys/fs/file-max
9223372036854775807
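Per proc(5), the three file-nr fields are: allocated file handles, allocated-but-unused handles, and the system-wide maximum. A tiny sketch that turns them into remaining headroom:

```shell
#!/usr/bin/env bash
# /proc/sys/fs/file-nr: allocated handles, allocated-but-free, system-wide max
read -r ALLOCATED FREE MAX < /proc/sys/fs/file-nr
echo "allocated=${ALLOCATED} free=${FREE} max=${MAX} headroom=$((MAX - ALLOCATED))"
```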
How to check the limits per KVM process:
#!/usr/bin/env bash
# For every running KVM process, print: open fds / soft limit / hard limit
for PID in $(pgrep -f /usr/bin/kvm); do
    SOFT_LIMIT=$(awk '/Max open files/ { print $4 }' /proc/${PID}/limits 2>/dev/null)
    HARD_LIMIT=$(awk '/Max open files/ { print $5 }' /proc/${PID}/limits 2>/dev/null)
    echo "PID ${PID} opened files: $(ls -1 /proc/${PID}/fd 2>/dev/null | wc -l)/${SOFT_LIMIT}/${HARD_LIMIT}"
done
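The same "Max open files" parsing works for any PID you can read; applied to the current process via /proc/self, it reduces to a one-liner:

```shell
#!/usr/bin/env bash
# Soft (field 4) and hard (field 5) limits from the kernel's own view.
awk '/Max open files/ { print "soft=" $4 " hard=" $5 }' /proc/self/limits
```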
Fix:
-
/etc/sysctl.d/90-rs-proxmox.conf:
# Default: 1048576
fs.nr_open = 2097152
# Default: 8388608
fs.inotify.max_queued_events = 8388608
# Default: 65536
fs.inotify.max_user_instances = 1048576
# Default: 4194304
fs.inotify.max_user_watches = 4194304
# Default: 262144
vm.max_map_count = 262144
Keep in mind that fs.nr_open should be greater than or equal to the nofile value you are going to set via limits.conf. If not, you'll get errors like:
Dec 23 03:10:01 hv04-htz-nbg1 CRON[94421]: pam_limits(cron:session): Could not set limit for 'nofile' to soft=1048576, hard=2097152: Operation not permitted; uid=0,euid=0
Dec 23 03:17:01 hv04-htz-nbg1 CRON[111402]: pam_limits(cron:session): Could not set limit for 'nofile' to soft=1048576, hard=2097152: Operation not permitted; uid=0,euid=0
Dec 23 03:18:35 hv04-htz-nbg1 sshd[115352]: pam_limits(sshd:session): Could not set limit for 'nofile' to soft=1048576, hard=2097152: Operation not permitted; uid=0,euid=0
Dec 23 03:18:35 hv04-htz-nbg1 sshd[115359]: pam_limits(sshd:session): Could not set limit for 'nofile' to soft=1048576, hard=2097152: Operation not permitted; uid=0,euid=0
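You can pre-check that relation before touching limits.conf; a sketch assuming the planned hard limit of 2097152 used throughout this post:

```shell
#!/usr/bin/env bash
# PLANNED_HARD is the hard nofile you intend to put into limits.conf.
PLANNED_HARD=2097152
NR_OPEN=$(cat /proc/sys/fs/nr_open)
if [ "$NR_OPEN" -ge "$PLANNED_HARD" ]; then
    echo "OK: fs.nr_open=${NR_OPEN} >= ${PLANNED_HARD}"
else
    echo "Raise fs.nr_open first (currently ${NR_OPEN})"
fi
```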
-
/etc/security/limits.d/90-rs-proxmox.conf:
* soft nofile 1048576
* hard nofile 2097152
root soft nofile 1048576
root hard nofile 2097152
* soft memlock 1048576
* hard memlock 2097152
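pam_limits applies these values only when a new session starts, so verify from a fresh login rather than the shell you edited the file in:

```shell
#!/usr/bin/env bash
# Effective nofile limits of the current session; existing sessions keep
# their old limits after /etc/security/limits.d/* changes.
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
```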
-
mkdir -p /etc/systemd/system/pvedaemon.service.d && touch /etc/systemd/system/pvedaemon.service.d/limits.conf
Add the content of /etc/systemd/system/pvedaemon.service.d/limits.conf:
[Service]
LimitNOFILE=infinity
LimitMEMLOCK=infinity
LimitNPROC=infinity
TasksMax=infinity
-
mkdir -p /etc/systemd/system/pve-guests.service.d && touch /etc/systemd/system/pve-guests.service.d/limits.conf
Add the content of /etc/systemd/system/pve-guests.service.d/limits.conf:
[Service]
LimitNOFILE=infinity
LimitMEMLOCK=infinity
LimitNPROC=infinity
TasksMax=infinity
-
mkdir -p /etc/systemd/system/pve-ha-lrm.service.d && touch /etc/systemd/system/pve-ha-lrm.service.d/limits.conf
Add the content of /etc/systemd/system/pve-ha-lrm.service.d/limits.conf:
[Service]
LimitNOFILE=infinity
LimitMEMLOCK=infinity
LimitNPROC=infinity
TasksMax=infinity
-
Restart processes:
systemctl daemon-reload
systemctl restart pvedaemon.service pve-ha-lrm.service
-
By design, you are not allowed to restart pve-guests.service manually, so you might need to reboot the hypervisor. Alternatively, you can change the limits of already running processes:
#!/usr/bin/env bash
for PID in $(pgrep -f /usr/bin/kvm); do
SOFT_LIMIT="1048576"
HARD_LIMIT="2097152"
echo "Changing the limits for PID ${PID}"
prlimit --nofile=${SOFT_LIMIT}:${HARD_LIMIT} --pid ${PID}
done
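prlimit comes from util-linux and can both read and set limits. A small sketch demonstrating it on the current shell: read NOFILE, then raise the soft limit to the hard limit, which needs no extra privileges (unlike lifting the hard limit of the kvm processes above, which requires root):

```shell
#!/usr/bin/env bash
# Before: show the current NOFILE soft/hard pair for this shell.
prlimit --nofile --pid $$
# Raise soft up to the existing hard limit (always permitted).
HARD=$(prlimit --nofile --output HARD --noheadings --pid $$)
prlimit --nofile="${HARD}:${HARD}" --pid $$
# After: soft and hard should now match.
prlimit --nofile --pid $$
```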