@kiler129
Last active November 1, 2024 11:43
Definitely prevent stubborn devices from being bound by the host driver in a PCI passthrough scenario

Scenario

You're running KVM-based virtualization. You want to do PCI/PCIe passthrough of some device. You don't want it to attach to the host OS at all.

Your device looks like this:

00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)
	Subsystem: Hewlett-Packard Company 6 Series/C200 Series Chipset Family 6 port Desktop SATA AHCI Controller [103c:330d]
	Kernel driver in use: ahci
	Kernel modules: ahci

Problem

Usually the solutions are simple:

  1. If only one device lists a given module under Kernel modules (e.g. nvidiafb), you can blacklist that module by adding blacklist nvidiafb to /etc/modprobe.d/some-file.conf
  2. If multiple devices share the module and they're normal devices, you just add options vfio-pci ids=8086:1c02 to some file in /etc/modprobe.d/ (make sure to use the vendor:device ID shown in [...] and not the PCI address 00:1f.2)
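
To avoid mixing up the two identifiers, the vendor:device ID can be pulled out of the lspci -nn device line mechanically. A minimal sketch, assuming the ID is the last bracketed xxxx:xxxx pair on that line; the function name pci_id_from_lspci_line is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the vendor:device ID from one `lspci -nn` device line.
pci_id_from_lspci_line() {
    # The greedy leading .* makes sed pick the LAST [xxxx:xxxx] pair on the line;
    # the class code such as [0106] has no colon and is therefore skipped.
    printf '%s\n' "$1" |
        sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)].*/\1/p'
}

line='00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)'
pci_id_from_lspci_line "$line"    # prints 8086:1c02
```

Pass only the device line itself; the Subsystem line carries a different ID ([103c:330d] above) that vfio-pci's ids= option does not use.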

However, these will not work if your device is handled by something loaded very very VERY early... like a driver for your second SATA controller.

  1. You cannot blacklist ahci (as in the example here) because that would stop all controllers from working (= no boot volume)
  2. You cannot use modprobe.d to set options because vfio-pci loads waaaaay too late.

Solution

There are two prerequisites:

  1. vfio-pci must be available before rootfs is attached
  2. vfio-pci must load before ahci loads

The first is simple:

  • add vfio-pci to /etc/initramfs-tools/modules
  • update initramfs: update-initramfs -u -k $(uname -r)
  • Proxmox on UEFI: if you're using Proxmox 7 booted using UEFI mode you also need to run proxmox-boot-tool refresh
  • this places the module name in the initramfs image (in /conf/modules inside the ramdisk)
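
The first bullet can be made safe to repeat, so that running it twice does not add a duplicate entry. A minimal sketch; the helper name ensure_module_listed is hypothetical, and the real file path is shown only in the usage comment:

```shell
#!/bin/sh
# Hypothetical helper: append MODULE to FILE unless an identical line already exists.
ensure_module_listed() {
    file=$1
    module=$2
    # grep -x: match the whole line, -F: literal string, -q: no output
    grep -qxF "$module" "$file" 2>/dev/null || printf '%s\n' "$module" >> "$file"
}

# Usage on a real host (as root), followed by the initramfs rebuild from the list above:
#   ensure_module_listed /etc/initramfs-tools/modules vfio-pci
#   update-initramfs -u -k "$(uname -r)"
```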

The second is more complicated:

  • entry in /etc/initramfs-tools/modules will load vfio-pci before the rootfs is mounted
  • however, /conf/modules from the ramdisk is loaded only after some scripts have run (see /init in the ramdisk)
  • these scripts (scripts/init-top/) load some drivers... and udev... and udev loads ahci
  • solution:
    • create /usr/share/initramfs-tools/scripts/init-top/load_vfio-pci with
      #!/bin/sh
      modprobe vfio-pci ids=8086:1c02
    • chmod +x /usr/share/initramfs-tools/scripts/init-top/load_vfio-pci
    • edit /usr/share/initramfs-tools/scripts/init-top/udev and change PREREQS="" to PREREQS="load_vfio-pci"
  • update initramfs: update-initramfs -u -k $(uname -r)
  • Proxmox on UEFI: if you're using Proxmox 7 booted using UEFI mode you also need to run proxmox-boot-tool refresh
  • note: this will not work if placed in "standard place" (/etc/initramfs-tools/scripts...) as dependencies are not cross-directory and /usr/share comes first
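
The two-line script above also runs modprobe when initramfs-tools invokes it with the prereqs argument to work out script ordering. Boot scripts conventionally answer that call and exit early. A hedged sketch of a generator that writes the hook with that conventional header; the function name write_vfio_hook and its arguments are illustrative, not part of initramfs-tools:

```shell
#!/bin/sh
# Hypothetical helper: write an init-top hook to TARGET that binds IDS to vfio-pci.
write_vfio_hook() {
    target=$1
    ids=$2
    # Unquoted heredoc: $ids is expanded now, \$ defers expansion to boot time.
    cat > "$target" <<EOF
#!/bin/sh
# When called with "prereqs", only report dependencies (none) and stop,
# so nothing is loaded while the image is being built.
PREREQ=""
prereqs() { echo "\$PREREQ"; }
case "\$1" in
    prereqs) prereqs; exit 0 ;;
esac

modprobe vfio-pci ids=$ids
EOF
    chmod +x "$target"
}

# Usage on a real host (as root):
#   write_vfio_hook /usr/share/initramfs-tools/scripts/init-top/load_vfio-pci 8086:1c02
```

The udev script still needs its PREREQS changed as described in the list; this sketch only covers the hook file itself.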

Verify

Without the mod:

# lspci -knn
...
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)
	Subsystem: Hewlett-Packard Company 6 Series/C200 Series Chipset Family 6 port Desktop SATA AHCI Controller [103c:330d]
	Kernel driver in use: ahci
	Kernel modules: ahci

With the mod:

# lspci -knn
...
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)
	Subsystem: Hewlett-Packard Company 6 Series/C200 Series Chipset Family 6 port Desktop SATA AHCI Controller [103c:330d]
	Kernel driver in use: vfio-pci
	Kernel modules: ahci
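
The same check can be scripted instead of read by eye. A minimal sketch that filters lspci -k style output for one slot; the helper name driver_for_slot is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` style output on stdin and print the
# driver bound to the device at the given slot (for example 00:1f.2).
driver_for_slot() {
    awk -v slot="$1" '
        {
            first = substr($0, 1, 1)
            # Device header lines start in column one; detail lines are indented.
            if (first != "" && first != " " && first != "\t") current = $1
        }
        current == slot && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ ]*/, "")
            print
            exit
        }
    '
}

# Usage on a real host: lspci -knn | driver_for_slot 00:1f.2
```

An empty result means no driver is bound to that slot, or the slot was not found in the input.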
@mattventura

Thank you, this works great. Do you know if there's any way of getting it to load prior to udev without modifying the udev script? It seems that my package manager reverted the changes.

@kiler129
Author

@mattventura: In that case I would put a hook in your package manager to re-apply this change. Since you intend to take a device away from AHCI, you practically need vfio-pci to load as soon as possible, before the root fs is remounted.
@suminus: for proxmox 7 with e.g. UEFI you need to run proxmox-boot-tool refresh as per proxmox docs.
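
Whatever mechanism triggers the hook, the change it applies should be idempotent so it can run after every package operation. A minimal sketch of such a patch step, assuming the udev script still contains the line PREREQS="" as described in the gist; the function name patch_udev_prereqs is made up, and wiring it into a specific package manager is left out:

```shell
#!/bin/sh
# Hypothetical helper: make SCRIPT declare load_vfio-pci as a prerequisite.
# Succeeds if the line is present afterwards, fails if the script has no
# PREREQS="" line to patch (for example after an upstream change).
patch_udev_prereqs() {
    script=$1
    wanted='PREREQS="load_vfio-pci"'
    # Already patched: nothing to do.
    if grep -qxF "$wanted" "$script"; then return 0; fi
    tmp=$(mktemp) || return 1
    # Write through a temp file, then overwrite in place to keep mode and owner.
    sed 's/^PREREQS=""$/PREREQS="load_vfio-pci"/' "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    rm -f "$tmp"
    grep -qxF "$wanted" "$script"
}

# Usage on a real host (as root), then rebuild the initramfs:
#   patch_udev_prereqs /usr/share/initramfs-tools/scripts/init-top/udev
```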

@wzyboy

wzyboy commented Jul 5, 2023

I upgraded my PVE installation today, after which SATA controller passthrough stopped working and all SATA disks appeared in lsblk on the host machine.

Previously I enabled passthrough with kernel parameters:

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on vfio-pci.ids=8086:a352

This method stopped working after the upgrade.

I solved the issue by adjusting the loading order of modules.

Just add a line to a file in /etc/modprobe.d/:

softdep ahci pre: vfio-pci

According to modprobe.d(5), this line makes vfio-pci load before ahci.
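
A sketch of generating such a file, assuming the device IDs are kept in modprobe.d next to the softdep line rather than on the kernel command line (the comment above only adds the softdep line). The function name vfio_softdep_conf and the vfio.conf file name are illustrative:

```shell
#!/bin/sh
# Hypothetical helper: print modprobe.d lines that hand IDS to vfio-pci and
# ask modprobe to load vfio-pci before each listed module.
vfio_softdep_conf() {
    ids=$1
    shift
    printf 'options vfio-pci ids=%s\n' "$ids"
    for mod in "$@"; do
        printf 'softdep %s pre: vfio-pci\n' "$mod"
    done
}

# Usage on a real host (as root), followed by an initramfs update and reboot:
#   vfio_softdep_conf 8086:a352 ahci > /etc/modprobe.d/vfio.conf
```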

After updating the initramfs and rebooting, lsblk no longer shows the SATA disks, and lspci shows that the SATA controller is now handled by vfio-pci.

Reference: https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF#Loading_vfio-pci_early

@mwpastore

mwpastore commented Nov 13, 2023

I use @wzyboy's softdep solution. The other advantage is that you don't have to add anything to /etc/initramfs-tools/modules; when the initramfs scripts see the softdep they'll automatically pull in vfio_pci (and its dependencies).

The other useful tool for getting out of sticky vfio_pci binding situations is driverctl, although that kicks in a bit later in the boot process so it's less helpful for the specific issue addressed in this gist.

@b0wtie

b0wtie commented Sep 8, 2024

Hey there,

I just wanted to leave a very big and kind THANK YOU. My Proxmox server (currently running 8.2.4) hosts a virtualized TrueNAS Scale VM, and the host kept trying to import the zpool that belongs to TrueNAS.
I thought it would be easy to prevent Proxmox from seeing the passed-through SATA controller by adding it to vfio.conf, but softdep did not work (it kept asking me to import the pool manually at the initramfs prompt). Only this gist did the trick.
