This assumes you already have OVS and DPDK installed on your system. First, clear out any stale OVS state, recreate the database, and bring up the daemons:
sudo mkdir -p /var/run/openvswitch
sudo killall ovsdb-server ovs-vswitchd
sudo rm -f /var/run/openvswitch/vhost-user*
sudo rm -f /etc/openvswitch/conf.db
export DB_SOCK=/var/run/openvswitch/db.sock
sudo -E ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
sudo -E ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach
sudo -E ovs-vsctl --no-wait init
sudo -E ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xf
sudo -E ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024
sudo -E ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
sudo sysctl -w vm.nr_hugepages=8192
sudo mount -t hugetlbfs -o pagesize=2048k none /dev/hugepages
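It's worth confirming the pages were actually reserved before starting ovs-vswitchd, since DPDK needs them at initialization:
cat /proc/meminfo | grep ^Hug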
sudo -E ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log
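Confirm that both ovsdb-server and ovs-vswitchd came up: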
ps -ae | grep ovs
Create the bridge and the vhost-user endpoints to be used by our VM(s):
sudo ovs-vsctl add-br ovsbr -- set bridge ovsbr datapath_type=netdev
sudo ovs-vsctl add-port ovsbr vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser
sudo ovs-vsctl add-port ovsbr vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuser
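As a quick sanity check, ovs-vsctl show should now list the ovsbr bridge with both dpdkvhostuser ports (exact output will vary with your setup):
sudo ovs-vsctl show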
I end up pulling a couple of copies of the Clear Containers image and the vmlinuz kernel into a local scratch directory. Both come from /usr/share/cc-oci-runtime/:
sudo cp /usr/share/cc-oci-runtime/clear-13280-containers.img 1-clear-13280-containers.img
sudo cp /usr/share/cc-oci-runtime/clear-13280-containers.img 2-clear-13280-containers.img
sudo cp /usr/share/cc-oci-runtime/vmlinuz-4.9.4-53.container .
Launch the VM as follows (this instance uses the vhost-user2 port and the second image copy):
sudo qemu-lite-system-x86_64 \
-machine pc,accel=kvm,kernel_irqchip,nvdimm \
-cpu host \
-m 2G,maxmem=5G,slots=2 \
-smp 2 \
-no-user-config \
-nodefaults \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=discard \
-kernel vmlinuz-4.9.4-53.container \
-append 'reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1 init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false' \
-device virtio-serial-pci,id=virtio-serial0 \
-chardev stdio,id=charconsole0 \
-device virtconsole,chardev=charconsole0,id=console0 \
-nographic \
-object memory-backend-file,id=mem0,mem-path=./2-clear-13280-containers.img,size=235929600 \
-device nvdimm,memdev=mem0,id=nv0 \
-no-reboot \
-chardev socket,id=char1,path=/run/openvswitch/vhost-user2 \
-netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
-device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet1 \
-object memory-backend-file,id=mem,size=2G,mem-path=/dev/hugepages,share=on,prealloc=on \
-numa node,memdev=mem
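If you want to bring up a second VM against the other port and image copy created above, the command is essentially the same; as a sketch, you'd swap in lines like the following (the MAC address here is just an example, pick any unique one):
-object memory-backend-file,id=mem0,mem-path=./1-clear-13280-containers.img,size=235929600 \
-chardev socket,id=char1,path=/run/openvswitch/vhost-user1 \
-device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1 \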
You'll see the number of free hugepages decrease by the amount reserved for the guest (2G in this example, i.e. 1024 of the 2 MB pages) once the VM is started. You can check this by monitoring the hugepage counters:
$ cat /proc/meminfo | grep ^Hug
HugePages_Total: 8192
HugePages_Free: 3584
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Once you shut down the VM (via shutdown inside the guest, or Ctrl-C), you'll see that the memory is still consumed even though QEMU has exited.
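A quick way to confirm this (assuming the same 2 MB pages as above) is to check that no qemu process remains while the free-page count has not recovered:
ps -ae | grep qemu
cat /proc/meminfo | grep ^Hug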
Interestingly, it turns out that if I store the clear-containers image within a 9pfs-mounted directory, I see this memory issue. If I store the image in a normal directory on my host, the issue goes away.
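If you're unsure whether your scratch directory actually sits on a 9p mount, findmnt (from a reasonably recent util-linux) can report the filesystem type of the containing mount:
findmnt -T .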