Matthias Kreier (kreier)
llama.cpp with CUDA support on the Nvidia Jetson Nano
It is possible to compile and run a recent llama.cpp with gcc 8.5 and nvcc 10.2 (the latest CUDA compiler Nvidia supports for the 2019 Jetson Nano), with inference on the GPU enabled.
Setup Guide for llama.cpp on Nvidia Jetson Nano 4GB
This is a full account of the steps I ran to get llama.cpp running on the Nvidia Jetson Nano 2GB. It combines several fixes and tutorials, whose contributions are referenced at the bottom of this README.
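As a rough orientation before the detailed steps, the overall shape of the build looks like this. This is a hedged sketch: the exported paths, the gcc-8 pinning, and the `LLAMA_CUBLAS=1` make flag are assumptions based on a typical JetPack 4 install and the spring-2024 Makefile-based build of llama.cpp, not the exact commands from this gist.

```shell
# Assumed JetPack 4 layout: CUDA 10.2 lives under /usr/local/cuda-10.2,
# and gcc-8/g++-8 are installed (nvcc 10.2 rejects newer host compilers).
export PATH=/usr/local/cuda-10.2/bin:$PATH
export CC=gcc-8
export CXX=g++-8

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Makefile-era build with cuBLAS/CUDA enabled; the three Makefile changes
# this gist describes are applied before this step.
make LLAMA_CUBLAS=1 -j4
```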
Remark 2025-01-21: This gist is from April 2024. The current version of llama.cpp should compile on the Jetson Nano out of the box. Alternatively you can run ollama directly on the Jetson Nano; it just works. But inference is done only on the CPU; the GPU is not utilized, and probably never will be. See ollama issue 4140 regarding JetPack 4, CUDA 10.2 and gcc-11.
Note 2025-04-07: This gist no longer delivers GPU acceleration. The three changes to the Makefile let it compile in just 7 minutes, and the created main and llama-bench binaries do work, just not on the GPU. As soon as the parameter --n-gpu-layers 1 is passed, the system crashes with GGML_ASSERT: ggml-cuda.cu:255: !"CUDA error".
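For illustration, the crash is triggered by any invocation that offloads layers to the GPU. The model path below is a placeholder, not a file this gist ships:

```shell
# CPU-only inference works with the compiled binary:
./main -m models/model.gguf -p "Hello"

# Offloading even a single layer to the GPU triggers the crash described above
# (GGML_ASSERT: ggml-cuda.cu:255: !"CUDA error"):
./main -m models/model.gguf -p "Hello" --n-gpu-layers 1
```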