@Drallas
Last active October 25, 2024 05:52

Mount Volumes into Proxmox VMs with Virtio-fs

Part of collection: Hyper-converged Homelab with Proxmox

Virtio-fs is a shared file system that lets virtual machines access a directory tree on the host. Unlike existing approaches, it is designed to offer local file system semantics and performance. The new Rust-based virtiofsd daemon that Proxmox 8 uses is receiving the most attention for new feature development.

Performance is very good (in testing, almost the same as on the Proxmox host).

VM Migration is not possible yet, but it's being worked on!

Architecture

(Architecture diagram screenshot)

Why?

Since I have a Proxmox High Availability cluster with Ceph, I like to mount CephFS POSIX-compliant directories into my VMs. I have been playing around with LXC containers and bind mounts, and even successfully set up Docker Swarm in LXC containers. Unfortunately, that is not a recommended configuration and comes with some trade-offs and cumbersome configuration settings.

This write-up explains how to Create Erasure Coded CephFS Pools to store volumes that can then be mounted into a VM via virtiofs.


Install virtiofsd

| This procedure has been tested with Ubuntu Server 22.04 and Debian 12!

Proxmox 8 nodes don't have virtiofsd installed by default, so the first step is to install it.

apt install virtiofsd -y

# Check the version
/usr/lib/kvm/virtiofsd --version
virtiofsd backend 1.7.0

virtiofsd 1.7.0 has many issues (hangs after rebooting the VM, superblock errors, etc.); versions 1.7.2 and 1.8.0 seem to work much better and can be found on the virtio-fs releases page. But be careful: these packages are not considered stable and are not even in unstable on the Debian Package Tracker.
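Since the buggy 1.7.0 release is exactly what the distribution ships, a provisioning script can guard against it. A minimal sketch, with the version string hard-coded from the output above; on a real host you would substitute `$(/usr/lib/kvm/virtiofsd --version | awk '{print $NF}')`:

```shell
#!/bin/sh
# Warn when the installed virtiofsd is older than the first known-good release.
# version_lt A B: succeeds (exit 0) when version A sorts strictly before B.
version_lt() {
    [ "$1" = "$2" ] && return 1
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

installed="1.7.0"   # substitute: /usr/lib/kvm/virtiofsd --version | awk '{print $NF}'
if version_lt "$installed" "1.7.2"; then
    echo "virtiofsd $installed is known to be buggy; consider 1.7.2 or newer." >&2
fi
```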


Add the hookscript to the VM

Still on the Proxmox host!

Get the Hookscript files and copy them to /var/lib/vz/snippets, and make virtiofs_hook.pl executable.

Or use the get-hookscript.sh script to download the script files automatically to /var/lib/vz/snippets.

cd ~/
sudo sh -c "wget https://raw.githubusercontent.com/Drallas/Virtio-fs-Hookscript/main/get_hook_script.sh"
sudo chmod +x ~/get_hook_script.sh
./get_hook_script.sh

Modify the conf file

To set the VMID and the folders that a VM needs to mount, open the virtiofs_hook.conf file.

sudo nano /var/lib/vz/snippets/virtiofs_hook.conf
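The exact syntax is documented in the shipped file itself; a purely hypothetical example (VMIDs and paths are placeholders, not from the original) might map each VMID to the host folders it should share:

```
# /var/lib/vz/snippets/virtiofs_hook.conf  (illustrative only; check the
# comments in the downloaded file for the authoritative format)
100: /mnt/pve/cephfs/multimedia, /mnt/pve/cephfs/docker
101: /mnt/pve/cephfs/backup
```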

Set the Hookscript to an applicable VM

Set the hookscript to a VM.

qm set <vmid> --hookscript local:snippets/virtiofs_hook.pl

That's it; once it's added to the VM, the script does its magic on VM boot:

  • Adds the correct args section for virtiofsd: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node........
  • Creates the sockets needed for the folders.
  • Cleans up on VM shutdown.

Start / Stop VM

The VM can now be started and the hookscript takes care of the virtiofsd part.

qm start <vmid>

Check

Check the virtiofsd processes with ps aux | grep virtiofsd, or systemctl | grep virtiofsd for the systemd services.

If all is good, a virtiofsd process and matching systemd unit is listed for each configured share.


Mount inside VM

A Linux kernel newer than 5.4 inside the VM supports virtio-fs natively.

Mounting is in the format: mount -t virtiofs <tag> <local-mount-point>

To find the tag

On the Proxmox host, execute qm config <vmid> --current and look for the tag=<vmid>-<appname> inside the args section: args: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/run/virtiofsd/xxx-docker.sock -device vhost-user-fs-pci,chardev=char1,tag=<vmid>-<appname>
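Picking the tag out of that long args line by eye is error-prone. A sketch that extracts every tag= field; the sample args string below is a hypothetical stand-in for what `qm config <vmid> --current | grep '^args:'` would print:

```shell
#!/bin/sh
# Extract virtiofs tags from a VM's args line.
# Sample input; on the host use: args="$(qm config <vmid> --current | grep '^args:')"
args='args: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/run/virtiofsd/103-docker.sock -device vhost-user-fs-pci,chardev=char1,tag=103-docker'

# Split on spaces and commas, then keep only the tag=... fields.
printf '%s\n' "$args" | tr ' ,' '\n\n' | sed -n 's/^tag=//p'
```

Each printed line is a tag you can hand straight to mount -t virtiofs.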

# Create a directory
sudo mkdir -p /srv/cephfs-mounts/<foldername>

# Mount the folder
sudo mount -t virtiofs mnt_pve_cephfs_multimedia /srv/cephfs-mounts/<foldername>

# Add them to /etc/fstab
sudo nano /etc/fstab

# Mounts for virtiofs
# The nofail option prevents the system from hanging if the mount fails!
<vmid>-<appname>  /srv/cephfs-mounts/<foldername>  virtiofs  defaults,nofail  0  0
 
# Mount everything from fstab
sudo systemctl daemon-reload && sudo mount -a

# Verify
ls -lah /srv/cephfs-mounts/<vmid>-<appname>
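When several shares go into /etc/fstab, a tiny helper keeps the entries uniform. A sketch; the tag and mount point below are illustrative placeholders, not values from the original:

```shell
#!/bin/sh
# Emit an /etc/fstab line for a virtiofs share, matching the format above.
fstab_line() {
    tag="$1"; mountpoint="$2"
    printf '%s  %s  virtiofs  defaults,nofail  0  0\n' "$tag" "$mountpoint"
}

fstab_line 103-docker /srv/cephfs-mounts/docker
# → 103-docker  /srv/cephfs-mounts/docker  virtiofs  defaults,nofail  0  0
```

Append its output to /etc/fstab (as root) and run mount -a as shown above.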

Issues

  1. New VMs tend to throw a 'superblock' error on first boot:
mount: /srv/cephfs-mounts/download: wrong fs type, bad option, bad superblock on mnt_pve_cephfs_multimedia, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

To solve this, I power off the VM with sudo /sbin/shutdown -HP now and then start it again from the host with qm start <vmid>; everything should mount fine now.

  2. Adding an extra volume also throws a 'superblock' error.
qm stop <vmid>
sudo nano /etc/pve/qemu-server/<vmid>.conf
# Remove the args entry
args: -object memory-backend-memfd,id=mem,size=4096M,share=on..
qm start <vmid>

Now the volumes all have a superblock error; I power off the VM with sudo /sbin/shutdown -HP now and then start it again from the host with qm start <vmid>; everything should mount fine again.


Cleanup

To remove Virtio-fs from a VM and from the host:

nano /etc/pve/qemu-server/xxx.conf

# Remove the following lines
hookscript: local:snippets/virtiofs_hook.pl
args: -object memory-backend-memfd,id=mem,size=4096M,share=on..

Disable each virtiofsd-xxx service; replace xxx with the correct values, or use a * wildcard to remove them all at once.

systemctl disable virtiofsd-xxx
sudo systemctl reset-failed virtiofsd-xxx

This should be enough, but if references persist:

# Remove leftover sockets and services.
rm -rf /etc/systemd/system/virtiofsd-xxx
rm -rf /etc/systemd/system/xxx.scope.requires/
rmdir /sys/fs/cgroup/system.slice/'system-virtiofsd\xxx' 

If needed, reboot the host to make sure all references are purged from the system state.
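Before disabling or deleting anything, it helps to see exactly which unit files the hookscript left behind. A sketch that only lists candidates (it deletes nothing); the directory is passed as an argument so you can point it at /etc/systemd/system on the host:

```shell
#!/bin/sh
# List leftover virtiofsd unit files/directories so they can be reviewed
# before running systemctl disable / rm on each one.
list_virtiofsd_units() {
    dir="$1"
    for f in "$dir"/virtiofsd-*; do
        [ -e "$f" ] && basename "$f"
    done
}

# On the host: list_virtiofsd_units /etc/systemd/system
```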


Links


get-hook script.sh moved to: https://github.com/Drallas/Virtio-fs-Hookscript/blob/main/Script/get-hook%20script.sh (retained to keep its commit history)
virtiofs-hook.pl moved to: https://github.com/Drallas/Virtio-fs-Hookscript/blob/main/Script/virtiofs-hook.pl
@scottmeup

Any idea why the tags wouldn't show up when running qm config <vmid> --current?

This is the qm config --current output

args: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem
balloon: 2048
boot: order=scsi0;ide2;net0
cores: 1
cpu: x86-64-v2-AES
hookscript: local:snippets/virtiofs_hook.pl
ide2: local:iso/debian-12.5.0-amd64-netinst.iso,media=cdrom,size=629M
memory: 4096
meta: creation-qemu=8.1.5,ctime=1709037614
name: Debian-Public-Lan
net0: virtio=BC:24:11:9E:CB:50,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-103-disk-0,iothread=1,size=50G
scsihw: virtio-scsi-single
smbios1: uuid=9f2e888c-a50c-4631-b3b5-07fcfbd146a5
sockets: 1
startup: order=3
vmgenid: 3f09825f-f97a-43e1-83ad-ddc226bde4b6

This is the proxmox startup output

GUEST HOOK: 103 pre-start
103 is starting, doing preparations.
Creating directory: /run/virtiofsd/
attempting to install unit virtiofsd-103-common...
DIRECTORY DOES EXIST!
attempting to install unit virtiofsd-103-103...
DIRECTORY DOES EXIST!
-object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/103-common.sock -device vhost-user-fs-pci,chardev=char0,tag=103-common -chardev socket,id=char1,path=/run/virtiofsd/103-103.sock -device vhost-user-fs-pci,chardev=char1,tag=103-103
Appending virtiofs arguments to VM args.
GUEST HOOK: 103 post-start
103 started successfully.
Removing virtiofs arguments from VM args.
conf->args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/103-common.sock -device vhost-user-fs-pci,chardev=char0,tag=103-common -chardev socket,id=char1,path=/run/virtiofsd/103-103.sock -device vhost-user-fs-pci,chardev=char1,tag=103-103
vfs_args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/103-common.sock -device vhost-user-fs-pci,chardev=char0,tag=103-common -chardev socket,id=char1,path=/run/virtiofsd/103-103.sock -device vhost-user-fs-pci,chardev=char1,tag=103-103
-object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=memconf->args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem
TASK OK

@Drallas
Author

Drallas commented Aug 14, 2024

@scottmeup See issues, that's all I know so far.

@GamerBene19

GamerBene19 commented Sep 21, 2024

Any idea why the tags wouldn't show up when running qm config <vmid> --current?

Tags can't be longer than 36 chars (36 bytes). Perhaps your paths are longer than that? In this case qemu simply discards them iirc.

Edit: Just saw your tag in the output above, so the length is not the issue. Leaving this up as it might be helpful for others

@RyanChewKengYang

Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

@GamerBene19

Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

Did the virtiofs daemon on the host crash? Iirc the vm just hangs if it exits while the VM is still running.

@RyanChewKengYang

Did the virtiofs daemon on the host crash? Iirc the vm just hangs if it exits while the VM is still running.

No the daemon did not. We have 4 VMs on the server and all of them have virtiofs shares mapped. Only one VM hangs while the rest were still able to access files on the virtiofs share.

@Drallas
Author

Drallas commented Oct 7, 2024

Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

I have never experienced this behaviour in my Homelab, running an active Docker Swarm (with multiple Docker Volumes) on top of this! I assume all your packages are up to date!?

@RyanChewKengYang

I have never experienced this behaviour in my Homelab, running an active Docker Swarm (with multiple Docker Volumes) on top of this! I assume all your packages are up to date!?

Yes all packages up to date with the latest proxmox kernel. We run kubernetes on each VM and each pod maps to the virtiofs mount on the VM and the workloads that we run are very iops intensive. Initially we got good results with virtiofs with 60GB/s reads and 12GB/s writes on dRAID-1 but the hanging seems to have gotten worse. I managed to attempt a bandaid fix by increasing the queue size to the maximum of 1024 which seemed to help for a while but the hanging came back after a while.
