@nine1one
Last active January 5, 2026 08:17
# (NOT YET TESTED COMPLETELY) Run local AI models on your Samsung Galaxy Note 20 Ultra 5G with Snapdragon 865+ (Adreno 650 GPU) via Termux.

ivonblog.com/en-us/posts...



1. Install Termux and Termux:API

Install Termux from F-Droid or GitHub. (Davide Fornelli)

Then install the Termux:API add-on app (also from F-Droid), followed by its command-line package: (Davide Fornelli)

pkg install termux-api

2. Set Up Termux Storage Access

Grant Termux access to your device's storage: (GitHub)

termux-setup-storage
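Once granted, Termux exposes shared storage through symlinks under ~/storage (dcim, downloads, shared, and so on); a quick way to confirm the permission took effect:

```shell
# List the storage symlinks Termux creates after termux-setup-storage.
# Outside Termux (or before granting the permission) the directory won't exist.
ls ~/storage 2>/dev/null || echo "storage not linked yet -- run termux-setup-storage and grant the permission"
```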

3. Install Dependencies

Update packages and install the required dependencies: (GitHub)

pkg update && pkg upgrade
pkg install git clang cmake make python python-dev libomp wget
pkg install vulkan-tools
pkg install termux-api
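Before moving on, it is worth confirming the toolchain actually landed on PATH; a minimal POSIX-sh check (the tool list mirrors the packages above):

```shell
# Report which of the build tools installed above are available.
for tool in git clang cmake make python wget; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "MISSING: $tool -- rerun pkg install $tool"
  fi
done
```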

4. Clone and Build llama.cpp

Clone the llama.cpp repository and build it. (Mendhak Code) Note that recent llama.cpp versions build with CMake only (the Makefile has been removed); if make fails, use cmake -B build followed by cmake --build build -j$(nproc), which places the binaries under build/bin/.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make clean
make -j$(nproc)

5. Download a Model

Download a quantized GGUF model (e.g., a 4-bit 7B build, roughly 4 GB) from a trusted source and place it in the llama.cpp directory. The Note 20 Ultra's 12 GB of RAM handles 7B quantizations comfortably.
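A failed or redirected download often leaves an HTML error page where the model should be. Valid GGUF files begin with the 4-byte ASCII magic "GGUF", so a quick sanity check (model.gguf is a placeholder for whatever file you downloaded):

```shell
# Check the GGUF magic bytes at the start of the downloaded file.
if [ -f model.gguf ] && [ "$(head -c 4 model.gguf)" = "GGUF" ]; then
  echo "looks like a valid GGUF file"
else
  echo "not a GGUF file -- re-check the download URL"
fi
```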


6. Run the Model

Execute the model using the built binary (named main in older builds; llama-cli, under build/bin/, in current ones): (Mendhak Code)

./main -m model.gguf -p "Hello, AI!"
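In practice you will want a thread count matched to the phone's cores and a token limit. A hypothetical wrapper sketch (run-llm.sh, the BIN path, and the defaults are assumptions; -t, -n, and -p are standard llama.cpp CLI flags, but adjust BIN to match your build):

```shell
#!/bin/sh
# run-llm.sh -- hypothetical convenience wrapper around the llama.cpp CLI.
BIN=./main                 # or build/bin/llama-cli on newer builds
MODEL=model.gguf
THREADS=$(nproc)           # all cores; big.LITTLE phones sometimes do better with fewer
PROMPT=${1:-"Hello, AI!"}

if [ -x "$BIN" ]; then
  "$BIN" -m "$MODEL" -t "$THREADS" -n 128 -p "$PROMPT"
else
  echo "binary not found at $BIN -- build llama.cpp first"
fi
```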

7. Enable GPU Acceleration (Optional)

To utilize GPU acceleration: (Wolfchip Electronics)

  1. Install Vulkan Drivers:

    Ensure Vulkan drivers are installed on your device. Most Snapdragon phones ship them out of the box.

  2. Build with Vulkan Support:

    Rebuild with the Vulkan backend enabled. The flag name depends on the llama.cpp version: older Makefile builds used LLAMA_VULKAN=1, and current CMake builds use -DGGML_VULKAN=ON. (USE_VULKAN is not a flag llama.cpp recognizes.)

    make clean
    make LLAMA_VULKAN=1 -j$(nproc)
  3. Run with Vulkan:

    Execute the model with GPU offload. The -ngl (--n-gpu-layers) flag controls how many layers are offloaded to the GPU; 33 covers every layer of a typical 7B model:

    ./main -m model.gguf -p "Hello, GPU!" -ngl 33
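To confirm the device actually exposes Vulkan before rebuilding, vulkaninfo (from the vulkan-tools package installed in step 3) is the usual check; on this phone the GPU should report as Adreno 650. A sketch with a fallback for when the tool is missing:

```shell
# Summarize the Vulkan devices visible to this environment.
if command -v vulkaninfo >/dev/null 2>&1; then
  vulkaninfo --summary 2>/dev/null | head -n 40
else
  echo "vulkaninfo not found -- pkg install vulkan-tools"
fi
```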

Resources


This setup allows you to run local AI models on your Android device using Termux, optionally leveraging the Adreno 650 GPU for Vulkan-accelerated inference.
