Mount Volumes into Proxmox VMs with Virtio-fs

Part of collection: Hyper-converged Homelab with Proxmox

15-04-2025: Proxmox has now added this to the GUI; see Using VirtioFS backed by CephFS for bind mounts for how to use it.

Virtio-fs is a shared file system that lets virtual machines access a directory tree on the host. Unlike existing approaches, it is designed to offer local file system semantics and performance. The new virtiofsd-rs Rust daemon that Proxmox 8 uses is receiving the most attention for new feature development.

Performance is very good (in testing, almost the same as on the Proxmox host).

VM Migration is not possible yet, but it's being worked on!

Architecture

(Architecture diagram: screenshot, 2023-09-27)

Why?

Since I have a Proxmox High Availability cluster with Ceph, I like to mount CephFS POSIX-compliant directories into my VMs. I have been playing around with LXC containers and bind mounts, and even successfully set up Docker Swarm in LXC containers. Unfortunately, this is not a recommended configuration and comes with trade-offs and cumbersome configuration settings.

This write-up explains how to Create Erasure Coded CephFS Pools to store volumes that can then be mounted into a VM via virtiofs.


Install virtiofsd

| This procedure has been tested with Ubuntu Server 22.04 and Debian 12!

Proxmox 8 nodes don't have virtiofsd installed by default, so the first step is to install it.

apt install virtiofsd -y

# Check the version
/usr/lib/kvm/virtiofsd --version
virtiofsd backend 1.7.0

virtiofsd 1.7.0 has many issues (hangs after rebooting the VM, superblock errors, etc.); versions 1.7.2 and 1.8.0 seem to work much better and can be found on the virtio-fs releases page. But be careful: this package is not considered stable, and is not even in unstable on the Debian Package Tracker.


Add the hookscript to the VM

Still on the Proxmox host!

Get the hookscript files and copy them to /var/lib/vz/snippets, and make virtiofs_hook.pl executable.

Or use the get-hookscript.sh script to download the script files automatically to /var/lib/vz/snippets.

cd ~/
sudo wget https://raw.githubusercontent.com/Drallas/Virtio-fs-Hookscript/main/get_hook_script.sh
sudo chmod +x ~/get_hook_script.sh

Modify the conf file

To set the VMID and the folders that a VM needs to mount, open the virtiofs_hook.conf file.

sudo nano /var/lib/vz/snippets/virtiofs_hook.conf
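As a sketch of what the file might contain: the format observed later in this gist is one "<vmid>: <host-path>" entry per line. The VM ids and paths below are made-up examples, and a temp file stands in for /var/lib/vz/snippets/virtiofs_hook.conf so the sketch runs anywhere; check the hookscript repository for the exact format.

```shell
# Hypothetical conf entries: VM 101 shares a docker folder, VM 102 a
# multimedia folder. A mktemp path stands in for the real conf file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
101: /mnt/pve/cephfs/docker
102: /mnt/pve/cephfs/multimedia
EOF
cat "$conf"
```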

Set the Hookscript to an applicable VM

Attach the hookscript to the VM:

qm set <vmid> --hookscript local:snippets/virtiofs_hook.pl

That's it. Once it's added to the VM, the script does its magic on VM boot:

  • Adding the correct args section for virtiofsd to the VM configuration: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node........
  • Creating the sockets that are needed for the folders.
  • Cleaning up on VM shutdown.

Start / Stop VM

The VM can now be started and the hookscript takes care of the virtiofsd part.

qm start <vmid>

Check

Check the virtiofsd processes with ps aux | grep virtiofsd, or the systemd services with systemctl | grep virtiofsd.

If all is good, each share shows up as a running virtiofsd process and an active systemd service (screenshot, 2023-09-23).


Mount inside VM

A Linux kernel >5.4 inside the VM supports Virtio-fs natively.

Mounting is in the format: mount -t virtiofs <tag> <local-mount-point>

To find the tag

On the Proxmox host, execute qm config <vmid> --current and look for the tag=<vmid>-<appname> entry inside the args section: args: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/run/virtiofsd/xxx-docker.sock -device vhost-user-fs-pci,chardev=char1,tag=<vmid>-<appname>
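To avoid eyeballing the long args line, the tag can be extracted with grep. The args string below is a hypothetical sample in the same shape qm prints; on a real host you would pipe qm config <vmid> --current into the same grep.

```shell
# Sample args line (assumption: VM 101 with a "docker" share).
args='-object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char1,path=/run/virtiofsd/101-docker.sock -device vhost-user-fs-pci,chardev=char1,tag=101-docker'

# Pull out every tag=... entry and strip the "tag=" prefix.
echo "$args" | grep -o 'tag=[^ ,]*' | cut -d= -f2   # prints 101-docker
```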

# Create a directory
sudo mkdir -p /srv/cephfs-mounts/<foldername>

# Mount the folder
sudo mount -t virtiofs <vmid>-<appname> /srv/cephfs-mounts/<foldername>

# Add them to /etc/fstab
sudo nano /etc/fstab

# Mounts for virtiofs
# The nofail option prevents the system from hanging if the mount fails!
<vmid>-<appname>  /srv/cephfs-mounts/<foldername>  virtiofs  defaults,nofail  0  0
 
# Mount everything from fstab
sudo systemctl daemon-reload && sudo mount -a

# Verify
ls -lah /srv/cephfs-mounts/<vmid>-<appname>
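The fstab step above can be scripted so it is safe to re-run: append the entry only if it is not already present. The tag and mount point are hypothetical, and a temp file stands in for /etc/fstab so the sketch runs without root.

```shell
# Stand-in for /etc/fstab (assumption, for a root-free dry run).
FSTAB="$(mktemp)"
tag="101-docker"                     # hypothetical <vmid>-<appname> tag
mnt="/srv/cephfs-mounts/docker"      # hypothetical mount point
line="$tag  $mnt  virtiofs  defaults,nofail  0  0"

# Append only when the exact line is missing; a second run is a no-op.
grep -qF "$line" "$FSTAB" || echo "$line" >> "$FSTAB"
grep -qF "$line" "$FSTAB" || echo "$line" >> "$FSTAB"

grep -c "$tag" "$FSTAB"   # prints 1: the entry was added exactly once
```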

Issues

  1. New VMs tend to throw a 'superblock' error on first boot:
mount: /srv/cephfs-mounts/download: wrong fs type, bad option, bad superblock on mnt_pve_cephfs_multimedia, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

To solve this, power off the VM with sudo /sbin/shutdown -HP now and then start it again from the host with qm start <vmid>; everything should mount fine now.

  2. Adding an extra volume also throws a 'superblock' error.
qm stop <vmid>
sudo nano /etc/pve/qemu-server/<vmid>.conf
# Remove the args entry
args: -object memory-backend-memfd,id=mem,size=4096M,share=on..
qm start <vmid>

Now the volumes all have a superblock error; power off the VM with sudo /sbin/shutdown -HP now and then start it again from the host with qm start <vmid>; everything should mount fine again.


Cleanup

To remove Virtio-fs from a VM and from the host:

nano /etc/pve/qemu-server/xxx.conf

# Remove the following lines
hookscript: local:snippets/virtiofs_hook.pl
args: -object memory-backend-memfd,id=mem,size=4096M,share=on..

Disable each virtiofsd-xxx service; replace xxx with the correct values, or use a * wildcard to remove them all at once.

systemctl disable virtiofsd-xxx
sudo systemctl reset-failed virtiofsd-xxx

This should be enough, but if the references persist:

# Remove leftover sockets and services.
rm -rf /etc/systemd/system/virtiofsd-xxx
rm -rf /etc/systemd/system/xxx.scope.requires/
rmdir /sys/fs/cgroup/system.slice/'system-virtiofsd\xxx' 

If needed, reboot the host to make sure all references are purged from the system state.
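The leftover-removal steps above can be wrapped in a small loop. This is a sketch only: a scratch directory stands in for /etc/systemd/system so it runs without root, and the unit names follow the virtiofsd-<vmid>-<appname> pattern used in this gist.

```shell
# Stand-in for /etc/systemd/system (assumption, for a safe dry run).
UNIT_DIR="$(mktemp -d)"
VMID=101   # hypothetical VM id

# Fake the leftovers that can remain after removing the hookscript.
mkdir -p "$UNIT_DIR/virtiofsd-$VMID-docker"
mkdir -p "$UNIT_DIR/$VMID.scope.requires"

# Remove every leftover unit and scope directory for this VMID.
for path in "$UNIT_DIR/virtiofsd-$VMID-"* "$UNIT_DIR/$VMID.scope.requires"; do
  [ -e "$path" ] || continue
  echo "removing $path"
  rm -rf "$path"
done
```

On a real host you would point UNIT_DIR at /etc/systemd/system and follow up with systemctl daemon-reload.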


Links

  • get-hookscript.sh moved to: https://github.com/Drallas/Virtio-fs-Hookscript/blob/main/Script/get-hook%20script.sh (the original file is retained to keep its commit history)
  • virtiofs-hook.pl moved to: https://github.com/Drallas/Virtio-fs-Hookscript/blob/main/Script/virtiofs-hook.pl
@GamerBene19 commented Sep 21, 2024

> Any idea why the tags wouldn't show up when running qm config <vmid> --current?

Tags can't be longer than 36 chars (36 bytes). Perhaps your paths are longer than that? In this case qemu simply discards them iirc.

Edit: Just saw your tag in the output above, so the length is not the issue. Leaving this up as it might be helpful for others

@ryan-kengyang-alpha

Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

@GamerBene19

> Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

Did the virtiofs daemon on the host crash? Iirc the vm just hangs if it exits while the VM is still running.

@ryan-kengyang-alpha

> Did the virtiofs daemon on the host crash? Iirc the vm just hangs if it exits while the VM is still running.

No, the daemon did not. We have 4 VMs on the server, and all of them have virtiofs shares mapped. Only one VM hangs, while the rest were still able to access files on the virtiofs share.

@Drallas (author) commented Oct 7, 2024

> Hi Drallas, thanks for the script and the very well written writeup! I am using it right now in production but we ran into an issue where virtual machines would get stuck trying to read from the mounted directory. Upon investigating, I was not able to even run ls in the mounted directory either. Only a hard reboot fixed the issue. Do you have any idea what's going on?

I have never experienced this behaviour in my Homelab, running an active Docker Swarm (with multiple Docker Volumes) on top of this! I assume all your packages are up to date!?

@ryan-kengyang-alpha

> I have never experienced this behaviour in my Homelab, running an active Docker Swarm (with multiple Docker Volumes) on top of this! I assume all your packages are up to date!?

Yes, all packages are up to date with the latest Proxmox kernel. We run Kubernetes on each VM, each pod maps to the virtiofs mount on the VM, and the workloads we run are very IOPS-intensive. Initially we got good results with virtiofs, with 60 GB/s reads and 12 GB/s writes on dRAID-1, but the hanging seems to have gotten worse. I managed a band-aid fix by increasing the queue size to the maximum of 1024, which seemed to help for a while, but the hanging came back.

@cpslowik commented Jan 27, 2025

Has anyone else had prolonged shutdowns due to VMs using the virtiofs mount hanging? It seems that when I do a whole-system reboot, the virtiofsd systemd service gets stopped before the VM; then, when the VM (Debian in my case) is trying to sync block devices, it hangs until it eventually times out. When I shut down the VM normally it proceeds without issue; only when I do a full PVE reboot does something seem to terminate the virtiofsd service.

UPDATE: I found this thread, which seemed to be a similar issue with NFS running on the PVE host. I changed the generated systemd service files and updated the hookscript to add Before=pve-guests.service in the [Unit] section, which seems to have solved the issue and allows the VM to shut down cleanly!
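[Editor's note: a minimal sketch of that ordering change as a systemd drop-in. The unit name is a hypothetical example; the hookscript in this gist generates the unit files, so the same line can instead be added to the generated [Unit] section.]

```ini
# Hypothetical drop-in, e.g. /etc/systemd/system/virtiofsd-101-docker.service.d/override.conf
# Ensures virtiofsd is stopped only after the Proxmox guests have shut down.
[Unit]
Before=pve-guests.service
```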

@Blumlaut commented Feb 10, 2025

> Any idea why the tags wouldn't show up when running qm config <vmid> --current?

I'm experiencing a similar issue as @scottmeup currently. I tried everything in a VM running Proxmox and everything worked fine there; now on my actual production instance I can't get virtiofs to work properly. The hookscript is installed and seemingly working fine:

GUEST HOOK: 102 pre-start
102 is starting, doing preparations.
Creating directory: /run/virtiofsd/
attempting to install unit virtiofsd-102-nextcloud...
DIRECTORY DOES EXIST!
-object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-nextcloud/102-nextcloud.sock -device vhost-user-fs-pci,chardev=char0,tag=102-nextcloud
Appending virtiofs arguments to VM args.
GUEST HOOK: 102 post-start
102 started successfully.
Removing virtiofs arguments from VM args.
conf->args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-nextcloud/102-nextcloud.sock -device vhost-user-fs-pci,chardev=char0,tag=102-nextcloud
vfs_args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/102-nextcloud/102-nextcloud.sock -device vhost-user-fs-pci,chardev=char0,tag=102-nextcloud
-object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=memconf->args = -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem
TASK OK

qm config 102 --current shows no tag:

agent: 1
args: -object memory-backend-memfd,id=mem,size=4096M,share=on -numa node,memdev=mem
boot: order=scsi0;net0
cores: 6
cpu: x86-64-v2-AES
description: https%3A//192.168.178.102%3A9443
hookscript: local:snippets/virtiofs_hook.pl
memory: 4096
meta: creation-qemu=9.0.2,ctime=1738925409
name: cloud
net0: virtio=BC:24:11:37:D0:BC,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-102-disk-0,iothread=1,size=20G
scsihw: virtio-scsi-single
smbios1: uuid=9f44a43b-55a7-4fa1-b89c-c8caa3ebb4aa
sockets: 1
vmgenid: 05783bc7-c899-4afc-b01d-024ca0cee538

my .conf is very basic:

102: /LEGOSHIBA/shares/nextcloud/

the service seems to be OK too:

 [email protected] - virtiofsd filesystem share at  /LEGOSHIBA/shares/nextcloud/ for VM 102
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; preset: enabled)
     Active: active (running) since Mon 2025-02-10 23:50:56 CET; 4min 38s ago
   Main PID: 6765 (virtiofsd)
      Tasks: 3 (limit: 37581)
     Memory: 844.0K
        CPU: 3ms
     CGroup: /system.slice/system-virtiofsd\x2d102\x2dnextcloud.slice/[email protected]
             ├─6765 /usr/libexec/virtiofsd --log-level debug --socket-path /run/virtiofsd/102-nextcloud.sock --shared-dir /LEGOSHIBA/shares/nextcloud/ --cache=auto --announce-submounts --inode-file-handles=mandatory
             └─6768 /usr/libexec/virtiofsd --log-level debug --socket-path /run/virtiofsd/102-nextcloud.sock --shared-dir /LEGOSHIBA/shares/nextcloud/ --cache=auto --announce-submounts --inode-file-handles=mandatory

Feb 10 23:50:56 pve systemd[1]: Started [email protected] - virtiofsd filesystem share at  /LEGOSHIBA/shares/nextcloud/ for VM 102.
Feb 10 23:50:56 pve virtiofsd[6768]: [2025-02-10T22:50:56Z DEBUG virtiofsd::passthrough::mount_fd] Creating MountFd: mount_id=620, mount_fd=10
Feb 10 23:50:56 pve virtiofsd[6768]: [2025-02-10T22:50:56Z DEBUG virtiofsd::passthrough::mount_fd] Dropping MountFd: mount_id=620, mount_fd=10
Feb 10 23:50:56 pve virtiofsd[6768]: [2025-02-10T22:50:56Z INFO  virtiofsd] Waiting for vhost-user socket connection...

attempting to mount it anyway on the VM leads to an error:

virtio-fs: tag <102-nextcloud> not found

VM is running Debian Bookworm, kernel 6.1.0-31-amd64

Has anyone else encountered this before? I'm running out of ideas 😕

Edit: weirdly enough, creating a new VM and doing the exact same thing yields it working fine, huh??

@MafiaInc

I figured the hookscript doesn't add all args changes to my VM, and therefore the virtiofs ID(s) were missing and not found in my VM. I was able to fix this by applying the following patch. If you don't want to apply the patch, you can still figure out the full list of arguments (the output of the hookscript correctly displays them on pre-start) and manually edit your VM config, then comment out the lines containing "PVE::QemuConfig->write_config" in the hookscript and leave it only to handle the systemd units for virtiofsd on the host.

@ONesterkin commented Feb 22, 2025

Worked this out. TL;DR: NUMA should be disabled in the guest VM's processor settings.

Symptoms below.
I have 2 VMs that share the same virtiofs drives. One starts perfectly (if we work around the superblock issue as you wrote above), but the other one fails to start:

  • initial start: all OK, but with the superblock problem
  • shutdown and then start again: the VM fails to start completely, with the following:
generating cloud-init ISO
GUEST HOOK: 203 pre-start
203 is starting, doing preparations.
Creating directory: /run/virtiofsd/
attempting to install unit virtiofsd-203-oleg...
DIRECTORY DOES EXIST!
attempting to install unit virtiofsd-203-natalia...
DIRECTORY DOES EXIST!
-object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/203-oleg.sock -device vhost-user-fs-pci,chardev=char0,tag=203-oleg -chardev socket,id=char1,path=/run/virtiofsd/203-natalia.sock -device vhost-user-fs-pci,chardev=char1,tag=203-natalia
Appending virtiofs arguments to VM args.
kvm: total memory for NUMA nodes (0x180000000) should equal RAM size (0x80000000)
TASK ERROR: start failed: QEMU exited with code 1

Some additional observations.

  • Doesn't seem to be related to the order of VM start. If I just try to start 203 without 201, it's all the same story.
  • Cleaning up "args" section of qemu config helps to overcome the issue with VM startup, but then I have to shutdown/start to resolve superblock problem, everything is back there.
  • trying to get qm config 203 --current for cases when it's running - all is ok, args section is full args: -object memory-backend-memfd,id=mem,size=2048M,share=on -numa node,memdev=mem -chardev socket,id=char0,path=/run/virtiofsd/203-oleg.sock -device vhost-user-fs-pci,chardev=char0,tag=203-oleg -chardev socket,id=char1,path=/run/virtiofsd/203-natalia.sock -device vhost-user-fs-pci,chardev=char1,tag=203-natalia
  • despite vm ids, args sections for both vms are identical
  • I see the virtiofsd service failing to start for VM 203, but only when I shut it down with the superblock problem: [email protected] loaded failed failed virtiofsd filesystem share at /spool/oleg for VM 203
