On your Proxmox host, run the following to see exactly where the amdgpu driver is being loaded from:
modinfo amdgpu | grep filename
If it says /lib/modules/7.x.x-x-pve/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko, this is the native Proxmox driver. It is safe to remove the DKMS packages if they are installed; if they are not, skip to point #4.
If it says /updates/dkms/amdgpu.ko, then the DKMS module is what actually loaded (this should be rare). In that case, do the following:
- Remove the Ubuntu DKMS and ROCm apt packages
apt purge -y amdgpu-dkms 'rocm*' amdgpu-install
apt autoremove -y --purge
- Clean up the leftover directories and apt sources
rm -rf /opt/rocm* /opt/amdgpu* /etc/apt/sources.list.d/amdgpu.list /etc/apt/sources.list.d/rocm.list
- Rebuild the boot image (initramfs) so the broken DKMS module isn't accidentally queued for the next boot (on Proxmox this is normally done with update-initramfs)
update-initramfs -u -k all
- Verify
dmesg | grep amdgpu
ls -l /dev/kfd
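The driver check from the top and the verification above can be rolled into one small script. This is only a sketch: classify_amdgpu_path is a hypothetical helper name, and the two path patterns are taken from the example paths earlier in this guide, so other layouts fall through to "unknown".

```shell
#!/usr/bin/env bash
# Sketch: report where amdgpu is loaded from and whether /dev/kfd exists.
# classify_amdgpu_path is a hypothetical helper, not part of any package.

classify_amdgpu_path() {
    # $1: module path as printed by `modinfo -n amdgpu`
    case "$1" in
        */updates/dkms/*)                      echo "dkms" ;;
        */kernel/drivers/gpu/drm/amd/amdgpu/*) echo "in-tree" ;;
        *)                                     echo "unknown" ;;
    esac
}

# Only query the system if modinfo is available.
if command -v modinfo >/dev/null 2>&1; then
    path="$(modinfo -n amdgpu 2>/dev/null)"
    echo "amdgpu module: ${path:-not found} ($(classify_amdgpu_path "$path"))"
    # /dev/kfd is the compute device node that ROCm needs.
    if [ -e /dev/kfd ]; then
        echo "/dev/kfd present"
    else
        echo "/dev/kfd missing"
    fi
fi
```

A result of "in-tree" matches the native Proxmox driver case; "dkms" means the cleanup steps above still apply.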
- Download the specific gfx906 nightly build
cd /tmp
wget https://therock-nightly-tarball.s3.amazonaws.com/therock-dist-linux-gfx906-7.14.0a20260612.tar.gz
- Extract to /opt
mkdir -p /opt/therock
tar -xzf therock-dist-linux-gfx906-7.14.0a20260612.tar.gz -C /opt/therock
- Add to system path permanently
echo 'export PATH=/opt/therock/bin:$PATH' > /etc/profile.d/therock.sh
echo 'export LD_LIBRARY_PATH=/opt/therock/lib:/opt/therock/llvm/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}' >> /etc/profile.d/therock.sh
source /etc/profile.d/therock.sh
- Verify
rocm-smi --showproductname
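Since nightly builds change daily, it may help to parameterise the download and extract steps above. The sketch below is an assumption-heavy convenience: therock_tarball, therock_url and install_therock are hypothetical helper names, and the file naming pattern is inferred only from the single URL used above, so check that the build you want actually exists before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: build the TheRock nightly tarball name/URL from target + version.
# All function names here are hypothetical; the naming pattern is inferred
# from the one URL shown in this guide.

therock_tarball() {
    # $1: GPU target (e.g. gfx906), $2: build version string
    echo "therock-dist-linux-$1-$2.tar.gz"
}

therock_url() {
    echo "https://therock-nightly-tarball.s3.amazonaws.com/$(therock_tarball "$1" "$2")"
}

install_therock() {
    # Downloads to /tmp and unpacks into /opt/therock; run as root.
    local target="$1" version="$2" tarball
    tarball="$(therock_tarball "$target" "$version")"
    wget -O "/tmp/$tarball" "$(therock_url "$target" "$version")" || return 1
    mkdir -p /opt/therock
    tar -xzf "/tmp/$tarball" -C /opt/therock
}

# Example (not run automatically):
#   install_therock gfx906 7.14.0a20260612
```

Nothing is downloaded until install_therock is called explicitly, so the file can be sourced safely to inspect the URL first with therock_url.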
(Optional) A lazy way to give an LXC container access to models on a host volume that is bind-mounted into it:
chmod 755 /srv/storage/ai/models/*
chown -R 100000:100000 /srv/storage/ai/models/*
For example, models downloaded with HuggingFaceModelDownloader, which can be installed with:
bash <(curl -sSL https://g.bodaay.io/hfd) install
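The 100000:100000 owner in the chown above is what root (UID/GID 0) inside an unprivileged container looks like from the host, assuming the default ID mapping offset of 100000. A minimal sketch of that arithmetic, with host_id as a hypothetical helper name, in case the files should belong to a non-root user in the container instead:

```shell
#!/usr/bin/env bash
# Sketch: translate a container UID/GID to the host-side ID.
# Assumes the default unprivileged mapping offset (100000); a custom
# lxc.idmap in the container config would change this. host_id is hypothetical.

IDMAP_OFFSET="${IDMAP_OFFSET:-100000}"

host_id() {
    # $1: UID or GID as seen inside the container
    echo $(( IDMAP_OFFSET + $1 ))
}

echo "container root (0) -> host $(host_id 0)"
echo "container uid 1000 -> host $(host_id 1000)"
```

With the default offset, a user with UID 1000 inside the container would need the files owned by 101000 on the host.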