
@gsoykan
Created November 4, 2023 11:08
Setting the CUDA version on an HPC cluster

Q:

I am working on a cluster that provides a module for cuda/12.3, and I want my CUDA setup to use that path.

A:

If you're working on a cluster that provides a module for CUDA 12.3 and you want to use that version in your environment, follow these steps to load the CUDA module and set the appropriate environment variables:

  1. First, you should log in to your cluster and open a terminal.

  2. Use the module load command to load the CUDA module for version 12.3. The command might look something like this, but the exact module name and syntax can vary depending on your cluster's configuration:

module load cuda/12.3
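
If you are unsure of the exact module name, the standard Environment Modules/Lmod commands can list what the cluster provides and confirm what is loaded. This is a sketch; cuda/12.3 is assumed to be the name your cluster uses:

# List the CUDA modules available on this cluster (names and versions vary per site)
module avail cuda
# After loading, confirm cuda/12.3 appears among the active modules
module list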
  3. After loading the CUDA module, set environment variables so that your Python or PyTorch installation uses the correct CUDA version. Typically, you need to set the PATH, LD_LIBRARY_PATH, and CUDA_HOME variables. You can do this in your shell session with the export command. Here's an example:
export PATH=/path/to/cuda/12.3/bin:$PATH
export LD_LIBRARY_PATH=/path/to/cuda/12.3/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/path/to/cuda/12.3

Replace /path/to/cuda/12.3 with the actual path to the CUDA module on your cluster.
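
One way to find that path (a sketch, assuming an Environment Modules/Lmod setup) is to inspect what the module itself sets, since many CUDA modules already export CUDA_HOME or an equivalent variable:

# Print the paths and environment variables the module sets, including the install prefix
module show cuda/12.3
# Many CUDA modules already export CUDA_HOME (or CUDA_ROOT); check before setting it yourself
echo "$CUDA_HOME"
# Alternatively, locate nvcc after loading; stripping the trailing /bin/nvcc gives the CUDA root
which nvcc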

  4. After setting the environment variables, you can install or run PyTorch within this environment. A PyTorch build or install done here should pick up the CUDA version specified by these variables and use CUDA 12.3 for GPU acceleration.
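
As a quick sanity check (a sketch, assuming PyTorch is already installed in the active environment), confirm which toolkit the shell sees and which CUDA version PyTorch was built against:

# CUDA compiler visible on the PATH after loading the module
nvcc --version
# CUDA version PyTorch was built with, and whether it can see a GPU on this node
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"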

It's important to note that the exact commands and paths may vary depending on your cluster's configuration and the specific module system in use. You should consult your cluster's documentation or contact your cluster administrators for any cluster-specific instructions or module names.

By following these steps, you'll configure your environment to use CUDA 12.3, and PyTorch should be able to utilize this version for GPU computations.
