Install llama-cpp-python in WSL2 (Jul 2024, Ubuntu 24.04)
# Might be needed from a fresh install
sudo apt update
sudo apt upgrade
sudo apt install gcc
# Might be needed, per: https://docs.nvidia.com/cuda/wsl-user-guide/index.html#cuda-support-for-wsl-2
sudo apt-key del 7fa2af80
# From https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.5.1/local_installers/cuda-repo-wsl-ubuntu-12-5-local_12.5.1-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-5-local_12.5.1-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-5-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5
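# Optional sanity check (my addition, not in the original gist): confirm the
# toolkit is on disk and the WSL2 GPU passthrough works before building anything
/usr/local/cuda-12/bin/nvcc --version
nvidia-smi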
# If needed
sudo apt install python3.12-venv
python3 -m venv ./ai_env
source ./ai_env/bin/activate
# Install and build llama-cpp-python with CUDA support
CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=all-major" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
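A quick way to check that the wheel was actually compiled with CUDA support (a sketch of mine, not from the original gist; llama_supports_gpu_offload is the binding name in recent llama-cpp-python releases, so adjust if your version differs):

python -c "from llama_cpp import llama_supports_gpu_offload; print('GPU offload:', llama_supports_gpu_offload())"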
After installing CUDA toolkit 12.5 with the script above and activating my virtual environment, I also had to do the following to finally make it work on my install in April 2025 (some of my arguments might be redundant):
export CC=/usr/bin/gcc-12
export CXX=/usr/bin/g++-12
export CUDAHOSTCXX=/usr/bin/g++-12
CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=all-major" CC=gcc-12 CXX=g++-12 FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
Thanks, I'd struggled to get it working until I came across your steps.