This guide walks through enabling NVIDIA GPUDirect Storage (GDS) so a system can perform direct DMA transfers between NVMe devices and GPU VRAM.
The minimum success criteria:
- nvidia_fs.ko loads successfully
- gdscheck reports GDS operational
- Benchmark tools such as gdsio run successfully
This guide is written for operators managing:
- Proxmox or other hypervisors
- Ubuntu-based guests (recommended baseline)
- Environments where GPU and NVMe are passed through to a VM
GPUDirect Storage requires:
- GPU accessible inside the OS
- NVMe accessible as a raw block device
- NVIDIA driver + CUDA installed
- nvidia-fs kernel module loaded
- Filesystem and mount options compatible with direct I/O
In virtualised environments, both GPU and NVMe must be PCI passthrough devices, not emulated disks.
This section is the most important. Most failures originate here.
In BIOS:
- Enable VT-d / AMD-Vi
- Enable SR-IOV if available
- Enable Above 4G decoding
On the host kernel:
For Intel:
intel_iommu=on iommu=pt
For AMD:
amd_iommu=on iommu=pt
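These flags are kernel command-line parameters. On a GRUB-booted host (Proxmox VE included) they can be applied along these lines (a sketch; adjust for your bootloader, Intel shown):

```shell
# Add the flags to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, e.g.
#   GRUB_CMDLINE_LINux_DEFAULT="quiet intel_iommu=on iommu=pt"
# then regenerate the boot configuration:
update-grub
# Proxmox hosts that boot via systemd-boot (typical for ZFS installs)
# instead edit /etc/kernel/cmdline and run:
#   proxmox-boot-tool refresh
```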
Reboot after applying changes.
On the Proxmox host:
lspci | grep -E "NVIDIA|Non-Volatile"
Record PCI addresses such as:
0000:65:00.0 GPU
0000:5e:00.0 NVMe
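The domain-prefixed addresses come from lspci -D (plain lspci omits the 0000: domain). The filtering can be scripted; the sample below stands in for real lspci -D output, and the device names are illustrative:

```shell
# Filter GPU and NVMe controller addresses out of lspci output.
# Replace the sample string with: lspci -D
sample='0000:65:00.0 3D controller: NVIDIA Corporation Device 2330
0000:5e:00.0 Non-Volatile memory controller: Samsung Electronics NVMe SSD Controller'
printf '%s\n' "$sample" | awk '/NVIDIA|Non-Volatile/ {print $1}'
# prints: 0000:65:00.0 and 0000:5e:00.0
```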
GPUDirect Storage performs best when each GPU and its NVMe device share:
- the same PCIe root complex, or
- the same PCIe switch
lspci -tv
You want GPU and NVMe to appear under the same upstream bridge where possible.
Example of a good layout:
-[0000:5d]-+-00.0 PCI bridge
+-00.1 NVIDIA GPU
+-00.2 NVMe controller
Avoid layouts where:
- GPU is under one root complex
- NVMe is under another CPU socket
Those paths cross inter-socket links and reduce performance.
If the GPU driver is loaded on the host:
nvidia-smi topo -m
Key indicators:
- PIX = same PCIe switch (ideal)
- PXB = multiple switches (acceptable)
- SYS = across CPU sockets (not ideal)
Aim for PIX or PXB relationships between GPU and NVMe.
Example output from an 8-GPU host:
root@greenthread-h100:~# nvidia-smi topo -m
GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 NIC0 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV12 PHB PHB PHB PHB PHB PHB PHB 0-251 0 N/A
GPU1 NV12 X PHB PHB PHB PHB PHB PHB PHB 0-251 0 N/A
GPU2 PHB PHB X NV12 PHB PHB PHB PHB PHB 0-251 0 N/A
GPU3 PHB PHB NV12 X PHB PHB PHB PHB PHB 0-251 0 N/A
GPU4 PHB PHB PHB PHB X NV12 PHB PHB PHB 0-251 0 N/A
GPU5 PHB PHB PHB PHB NV12 X PHB PHB PHB 0-251 0 N/A
GPU6 PHB PHB PHB PHB PHB PHB X NV12 PHB 0-251 0 N/A
GPU7 PHB PHB PHB PHB PHB PHB NV12 X PHB 0-251 0 N/A
NIC0 PHB PHB PHB PHB PHB PHB PHB PHB X
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
Check IOMMU groups:
find /sys/kernel/iommu_groups/ -type l
Requirements:
- GPU functions grouped correctly
- NVMe controller isolated or safely passthrough-able
If devices share groups with critical host devices, motherboard slot placement may need adjustment.
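The raw find output is hard to read. A small loop prints each group number alongside its devices, which makes shared groups obvious (a sketch; POSIX sh):

```shell
# Readable listing of IOMMU groups: one line per device, prefixed with
# its group number. An absent directory means the IOMMU is disabled.
if [ -d /sys/kernel/iommu_groups ]; then
  for dev in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$dev" ] || continue
    group=${dev%/devices/*}      # strip the /devices/<address> suffix
    echo "group ${group##*/}: ${dev##*/}"
  done | sort -V
else
  echo "no IOMMU groups found (IOMMU disabled or not supported)"
fi
```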
For each VM:
Pass through:
- GPU
- GPU audio function (if present)
- NVMe controller (not a virtual disk)
Recommended VM settings:
- Machine type: q35
- CPU type: host
- PCIe enabled
- Ballooning disabled (for benchmarking)
- Hugepages optional but recommended
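On Proxmox, the settings above map onto qm commands roughly as follows (a sketch; VM ID 100 and the PCI addresses are placeholders taken from the earlier example):

```shell
qm set 100 --machine q35 --cpu host --balloon 0
qm set 100 --hostpci0 0000:65:00.0,pcie=1   # GPU passthrough
qm set 100 --hostpci1 0000:5e:00.0,pcie=1   # NVMe controller passthrough
```

If the GPU is the VM's primary display, adding x-vga=1 to the hostpci0 line may also be needed.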
Ubuntu 22.04 or 24.04 is a reliable baseline.
After booting the VM, confirm hardware visibility:
nvidia-smi
lsblk
lspci
Both GPU and NVMe should be visible.
Example:
apt update
apt install nvidia-driver-590
Reboot and verify:
nvidia-smi
Example (assumes the NVIDIA CUDA apt repository is configured):
apt install cuda
Confirm:
nvcc --version
If available via repository:
apt install nvidia-fs gds-tools
Load module:
modprobe nvidia-fs
Verify:
lsmod | grep nvidia_fs
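Beyond lsmod, the loaded module exposes runtime counters via procfs (path per NVIDIA's GDS tooling), which is a stronger signal that it is functional:

```shell
# Show the first few nvidia-fs counters if the module is active.
if [ -r /proc/driver/nvidia-fs/stats ]; then
  head -n 5 /proc/driver/nvidia-fs/stats
else
  echo "nvidia-fs stats not available; is the module loaded?"
fi
```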
If packages are unavailable or kernel mismatched:
Install prerequisites:
apt install build-essential linux-headers-$(uname -r)
Build:
git clone https://github.com/NVIDIA/gds-nvidia-fs.git
cd gds-nvidia-fs/src
make
insmod nvidia-fs.ko
Verify:
lsmod | grep nvidia_fs
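A module loaded with insmod disappears at reboot. It can be installed persistently like this (a sketch; paths assume the build directory above, and the module must be rebuilt after each kernel upgrade):

```shell
# Install the freshly built module into the running kernel's module
# tree and register it for automatic load at boot.
mkdir -p /lib/modules/"$(uname -r)"/extra
cp nvidia-fs.ko /lib/modules/"$(uname -r)"/extra/
depmod -a
echo nvidia-fs > /etc/modules-load.d/nvidia-fs.conf
modprobe nvidia-fs
```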
Recommended:
- Local NVMe device
- EXT4 or XFS
- Mounted normally (no network filesystem during initial testing)
Create a test mount (warning: mkfs destroys any existing data on the device):
mkfs.ext4 /dev/nvme0n1
mkdir -p /mnt/nvme
mount /dev/nvme0n1 /mnt/nvme
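To persist the mount across reboots, an fstab entry can be added (a sketch; device name from above). The ext4 default journaling mode, data=ordered, is the mode GDS supports; data=journal is not.

```shell
# Append a persistent mount entry; "defaults" keeps data=ordered,
# the ext4 journaling mode GDS expects.
grep -q '^/dev/nvme0n1 ' /etc/fstab || \
  echo '/dev/nvme0n1 /mnt/nvme ext4 defaults 0 2' >> /etc/fstab
```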
gdscheck -v
Expected indicators:
- nvidia_fs loaded
- GPU detected
- Compatible filesystem detected
Example:
gdsio -f /mnt/nvme/testfile -d 0 -s 1G
This performs a direct storage-to-GPU transfer; -d 0 selects GPU 0 and -s 1G sets the file size.
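Beyond a single run, sweeping the per-IO size shows where throughput saturates. A sketch, assuming gdsio from gds-tools is on PATH and /mnt/nvme is the mount created earlier (-w workers, -i per-IO size, -x 0 the GDS transfer type, -I 1 a write test):

```shell
# Sweep gdsio over several I/O sizes; skip gracefully if the tool
# is not installed yet.
if command -v gdsio >/dev/null 2>&1; then
  for io in 64K 256K 1M 4M; do
    echo "== io size $io =="
    gdsio -f /mnt/nvme/testfile -d 0 -w 4 -s 1G -i "$io" -x 0 -I 1
  done
else
  echo "gdsio not found; install gds-tools first"
fi
```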
To confirm topology is correct:
nvidia-smi topo -m
Ensure:
- GPU ↔ NVMe path is PIX or PXB
- Not SYS if possible
If SYS appears, consider:
- Moving NVMe to a different slot
- Moving GPU to a different slot
- Using a PCIe switch backplane
Check:
dmesg | grep nvidia
Typical causes:
- Kernel headers missing
- Driver mismatch
- Secure Boot enabled
Possible causes:
- NVMe not passthrough
- Unsupported filesystem
- Module not loaded
Common causes:
- GPU and NVMe on different NUMA nodes
- Virtual disk instead of raw NVMe
- Incorrect PCIe slot topology
A configuration that consistently works:
Host:
- Proxmox VE 8.x
- IOMMU enabled
- GPU and NVMe on same PCIe root complex
Guest:
- Ubuntu 22.04 or 24.04
- NVIDIA driver 535–590
- CUDA 12.x
- nvidia-fs loaded
Storage:
- Local NVMe
- EXT4
Hypervisor:
- IOMMU enabled
- GPU passthrough working
- NVMe passthrough working
- GPU and NVMe share PCIe path
Guest:
- NVIDIA driver installed
- CUDA installed
- nvidia_fs module loaded
- gdscheck passes
Benchmark:
- gdsio runs successfully
- gt-benchy runs with GDS support
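The guest-side items in this checklist can be spot-checked with a quick script (a sketch; tool names are the ones used earlier in this guide and are assumed to be on PATH once installed):

```shell
# Report PASS/FAIL for each guest-side requirement.
check() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
  fi
}
check "NVIDIA driver responds"   nvidia-smi
check "CUDA toolkit installed"   nvcc --version
check "nvidia_fs module loaded"  sh -c 'lsmod | grep -q nvidia_fs'
check "gds-tools present"        command -v gdscheck
```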