This guide outlines the procedure for debugging NVIDIA graphics cards on a Talos Linux node. Because Talos Linux is immutable and locked down, you cannot install drivers directly on the host or follow the usual debugging steps. Instead, we rely on Kubernetes features and NVIDIA's containerized drivers.
Procedure:
- Run the Debug Pod: Execute the following kubectl command to launch a privileged debug pod on the target Talos Linux node. This pod will contain the NVIDIA driver binaries.