@nine1one
Last active January 5, 2026 08:17
# (NOT YET TESTED COMPLETELY) Run local AI models on your Samsung Galaxy Note 20 Ultra 5G with Snapdragon 865+ (Adreno 650 GPU) via Termux.

ivonblog.com/en-us/posts...



1. Install Termux and Termux:API

Install Termux from F-Droid or GitHub. (Davide Fornelli)

Then install the Termux:API add-on app (also from F-Droid), followed by its command-line package: (Davide Fornelli)

pkg install termux-api

2. Set Up Termux Storage Access

Grant Termux access to your device's storage: (GitHub)

termux-setup-storage
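Once granted, Termux exposes shared storage through symlinks under ~/storage (dcim, downloads, shared, and so on); a quick way to confirm the permission took effect:

```shell
# List the storage symlinks Termux creates after termux-setup-storage.
# Outside Termux (or before granting the permission) the directory won't exist.
ls ~/storage 2>/dev/null || echo "storage not linked yet -- run termux-setup-storage and grant the permission"
```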

3. Install Dependencies

Update packages and install the required dependencies: (GitHub)

pkg update && pkg upgrade
pkg install git clang cmake make python python-dev libomp wget
pkg install vulkan-tools
pkg install termux-api
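Before moving on, it is worth confirming the toolchain actually landed on PATH; a minimal POSIX-sh check (the tool list mirrors the packages above):

```shell
# Report which of the build tools installed above are available.
for tool in git clang cmake make python wget; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "MISSING: $tool -- rerun pkg install $tool"
  fi
done
```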

4. Clone and Build llama.cpp

Clone the llama.cpp repository and build it. (Mendhak Code) Note that recent llama.cpp versions build with CMake only (the Makefile has been removed); if make fails, use cmake -B build followed by cmake --build build -j$(nproc), which places the binaries under build/bin/.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make clean
make -j$(nproc)

5. Download a Model

Download a quantized GGUF model (e.g., a 4-bit 7B build, roughly 4 GB) from a trusted source and place it in the llama.cpp directory. The Note 20 Ultra's 12 GB of RAM handles 7B quantizations comfortably.
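A failed or redirected download often leaves an HTML error page where the model should be. Valid GGUF files begin with the 4-byte ASCII magic "GGUF", so a quick sanity check (model.gguf is a placeholder for whatever file you downloaded):

```shell
# Check the GGUF magic bytes at the start of the downloaded file.
if [ -f model.gguf ] && [ "$(head -c 4 model.gguf)" = "GGUF" ]; then
  echo "looks like a valid GGUF file"
else
  echo "not a GGUF file -- re-check the download URL"
fi
```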


6. Run the Model

Execute the model using the built binary (named main in older builds; llama-cli, under build/bin/, in current ones): (Mendhak Code)

./main -m model.gguf -p "Hello, AI!"
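In practice you will want a thread count matched to the phone's cores and a token limit. A hypothetical wrapper sketch (run-llm.sh, the BIN path, and the defaults are assumptions; -t, -n, and -p are standard llama.cpp CLI flags, but adjust BIN to match your build):

```shell
#!/bin/sh
# run-llm.sh -- hypothetical convenience wrapper around the llama.cpp CLI.
BIN=./main                 # or build/bin/llama-cli on newer builds
MODEL=model.gguf
THREADS=$(nproc)           # all cores; big.LITTLE phones sometimes do better with fewer
PROMPT=${1:-"Hello, AI!"}

if [ -x "$BIN" ]; then
  "$BIN" -m "$MODEL" -t "$THREADS" -n 128 -p "$PROMPT"
else
  echo "binary not found at $BIN -- build llama.cpp first"
fi
```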

7. Enable GPU Acceleration (Optional)

To utilize GPU acceleration: (Wolfchip Electronics)

  1. Install Vulkan Drivers:

    Ensure Vulkan drivers are installed on your device. Most Snapdragon phones ship them out of the box.

  2. Build with Vulkan Support:

    Rebuild with the Vulkan backend enabled. The flag name depends on the llama.cpp version: older Makefile builds used LLAMA_VULKAN=1, and current CMake builds use -DGGML_VULKAN=ON. (USE_VULKAN is not a flag llama.cpp recognizes.)

    make clean
    make LLAMA_VULKAN=1 -j$(nproc)
  3. Run with Vulkan:

    Execute the model with GPU offload. The -ngl (--n-gpu-layers) flag controls how many layers are offloaded to the GPU; 33 covers every layer of a typical 7B model:

    ./main -m model.gguf -p "Hello, GPU!" -ngl 33
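To confirm the device actually exposes Vulkan before rebuilding, vulkaninfo (from the vulkan-tools package installed in step 3) is the usual check; on this phone the GPU should report as Adreno 650. A sketch with a fallback for when the tool is missing:

```shell
# Summarize the Vulkan devices visible to this environment.
if command -v vulkaninfo >/dev/null 2>&1; then
  vulkaninfo --summary 2>/dev/null | head -n 40
else
  echo "vulkaninfo not found -- pkg install vulkan-tools"
fi
```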

Resources


This setup allows you to run local AI models on your Android device using Termux, optionally leveraging the Adreno 650 GPU for Vulkan-accelerated inference.
