@mb00g
Last active July 22, 2024 13:08

Set the GPU Device to vGPU Mode Using the vSphere Host Graphics Setting

A GPU card can be configured in one of two modes: vSGA (shared virtual graphics) or vGPU. For compute workloads such as machine learning or high-performance computing, configure the NVIDIA card in vGPU mode. To enable vGPU mode, run this command in the ESXi host shell:

esxcli graphics host set --default-type SharedPassthru

Check the Host Graphics Settings

[root@esxi-tesla:~] esxcli graphics host get
   Default Graphics Type: SharedPassthru
   Shared Passthru Assignment Policy: Performance
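
The check above can also be scripted. The following is a minimal sketch, assuming the output format shown above; check_graphics_type is a hypothetical helper name, not part of esxcli. It reads the text printed by esxcli graphics host get on standard input and reports whether the default type is SharedPassthru.

```shell
#!/bin/sh
# Hypothetical helper (not part of esxcli): reads the output of
# `esxcli graphics host get` on stdin and checks the default graphics type.
check_graphics_type() {
    # Keep only the value after "Default Graphics Type:".
    gtype=$(sed -n 's/^ *Default Graphics Type: *//p' | tr -d '\r')
    if [ "$gtype" = "SharedPassthru" ]; then
        echo "ok: vGPU mode (SharedPassthru)"
    else
        echo "not in vGPU mode: '${gtype}'"
        return 1
    fi
}

# Intended use on the ESXi host:
#   esxcli graphics host get | check_graphics_type
```

The function returns a non-zero status when the host is not in vGPU mode, so it can gate later steps in a script.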

Install the NVIDIA vGPU Manager VIB into the ESXi Hypervisor

The NVIDIA Virtual GPU Manager for VMware vSphere ESXi is distributed as a vSphere Installation Bundle (VIB) file, which you can download from the NVIDIA site.

Copy the NVIDIA Virtual GPU Manager VIB file to the ESXi host (using SCP)

Put the ESXi host into maintenance mode.

$ esxcli system maintenanceMode set -e true

Run the esxcli command to install the NVIDIA Virtual GPU Manager from the VIB file.

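$ esxcli software vib install -v directory/NVIDIA**.vib

directory is the absolute path to the directory that contains the VIB file. esxcli requires an absolute path here, even if the VIB file is in the current working directory.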

Exit maintenance mode.

$ esxcli system maintenanceMode set -e false

Reboot the ESXi host.

$ reboot
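
The install steps above can be sketched as one script. This is an illustration under stated assumptions, not a tested installer: install_vgpu_manager, run, and the DRY_RUN variable are hypothetical names introduced here. With DRY_RUN=1 the commands are printed instead of executed, which makes it possible to review the sequence before running it on a host.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: runs the four steps from the text in order.
# Argument: absolute path to the VIB file.
install_vgpu_manager() {
    vib_path=$1
    # esxcli expects an absolute path for -v, so reject relative ones early.
    case $vib_path in
        /*) ;;
        *) echo "error: VIB path must be absolute: $vib_path" >&2; return 1 ;;
    esac
    run esxcli system maintenanceMode set -e true  || return 1
    run esxcli software vib install -v "$vib_path" || return 1
    run esxcli system maintenanceMode set -e false || return 1
    run reboot
}

# Example (prints the commands only):
#   DRY_RUN=1 install_vgpu_manager /path/to/the.vib
```

Stopping at the first failing step avoids rebooting a host on which the VIB did not install.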

Verify that the NVIDIA kernel driver can successfully communicate with the physical GPUs in your system by running the nvidia-smi command without any options.

$ nvidia-smi

If successful, the nvidia-smi command lists all the GPUs in your system.

[root@esxi-tesla:~] nvidia-smi
Wed Jan 22 04:48:03 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.43       Driver Version: 440.43       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:65:00.0 Off |                  Off |
| N/A   52C    P8    18W /  70W |   4118MiB / 16383MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
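
For scripted checks, the table is awkward to parse. A possible alternative, sketched here, is the CSV query mode of nvidia-smi (--query-gpu with --format=csv,noheader,nounits). list_gpus is a hypothetical helper that reads lines of the form "name, memory in MiB" on standard input and fails if no GPU is reported.

```shell
#!/bin/sh
# Hypothetical helper: reads CSV lines "name, memory_mib" on stdin,
# prints one summary line per GPU, and fails if there are none.
list_gpus() {
    count=0
    while IFS=, read -r name mem; do
        [ -n "$name" ] || continue
        mem=$(echo "$mem" | tr -d ' ')   # drop the space after the comma
        echo "GPU $count: $name (${mem} MiB)"
        count=$((count + 1))
    done
    if [ "$count" -eq 0 ]; then
        echo "no GPUs reported" >&2
        return 1
    fi
}

# Intended use on the ESXi host:
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits | list_gpus
```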

Choosing the vGPU Profile for the Virtual Machine

One example vGPU profile for the VM is grid_t4-8q. This profile allows the VM to use at most 8 GB of the physical GPU's memory (16 GB in total). The grid_t4-4q profile allows at most 4 GB.
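
The arithmetic behind the profile choice can be sketched as follows. vgpus_per_gpu is a hypothetical helper that takes the frame buffer size from the profile name and divides the physical GPU memory by it. It assumes every vGPU on the physical GPU uses the same profile and that the size in the name is a whole number of GB.

```shell
#!/bin/sh
# Hypothetical helper. Arguments: profile name, physical GPU memory in GB.
# Prints how many vGPUs of that profile fit on one physical GPU.
vgpus_per_gpu() {
    profile=$1
    total_gb=$2
    # Take the digits between the last "-" and the trailing series letter,
    # for example grid_t4-8q -> 8.
    size_gb=$(echo "$profile" | sed -n 's/^.*-\([0-9][0-9]*\)[a-zA-Z]$/\1/p')
    if [ -z "$size_gb" ]; then
        echo "cannot parse profile: $profile" >&2
        return 1
    fi
    echo $((total_gb / size_gb))
}

vgpus_per_gpu grid_t4-8q 16   # prints 2
vgpus_per_gpu grid_t4-4q 16   # prints 4
```

A larger profile gives each VM more GPU memory but lets fewer VMs share the card.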

VM Guest Operating System vGPU Driver Installation

Ensure that developer tools such as gcc are installed by running the following commands:

sudo apt update
sudo apt upgrade
sudo apt install build-essential
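
Before running the driver installer, it can help to confirm that the build tools are on the PATH. A minimal sketch follows; missing_tools is a hypothetical helper, and checking gcc and make is an assumption about what the installer needs.

```shell
#!/bin/sh
# Hypothetical helper: prints the names of the given tools that are not
# found on PATH. Empty output means all of them were found.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left by the concatenation above.
    echo "${missing# }"
}

# Intended use inside the VM:
#   missing_tools gcc make
```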

Download the .run file for the NVIDIA vGPU Linux guest VM driver from the NVIDIA site.

NOTE:

This is a special driver that comes with the NVIDIA vGPU software – it is not a stock NVIDIA driver that is found outside of that product.

Copy the NVIDIA vGPU Linux driver package (for example, the NVIDIA-Linux-x86_64-440.43-grid.run file) into the Linux VM's file system, then run the installer as root:

sudo sh ./NVIDIA-Linux-x86_64-440.43-grid.run

After the installer finishes, verify the driver by running nvidia-smi:

ubuntu@ubuntu-tesla:~$ nvidia-smi
Wed Jan 22 12:05:05 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.43       Driver Version: 440.43       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID T4-4Q          On   | 00000000:02:04.0 Off |                  N/A |
| N/A   N/A    P8    N/A /  N/A |    272MiB /  4064MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
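
To confirm that the guest sees the profile that was assigned to the VM, the reported GPU name can be compared with the profile name. This is a sketch that assumes the naming shown in the output above, where profile grid_t4-4q appears as "GRID T4-4Q"; other driver versions may name the device differently. expected_gpu_name and check_guest_gpu are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helper: map a profile name to the GPU name shown in the
# guest, assuming the grid_t4-4q -> "GRID T4-4Q" convention seen above.
expected_gpu_name() {
    echo "$1" | tr 'a-z_' 'A-Z '
}

# Hypothetical helper. Arguments: profile name, GPU name reported in the guest.
check_guest_gpu() {
    profile=$1
    reported=$2
    expected=$(expected_gpu_name "$profile")
    if [ "$reported" = "$expected" ]; then
        echo "ok: $reported"
    else
        echo "mismatch: expected '$expected', got '$reported'"
        return 1
    fi
}

# Intended use inside the VM:
#   check_guest_gpu grid_t4-4q "$(nvidia-smi --query-gpu=name --format=csv,noheader)"
```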