Marco Craveiro mcraveiro

mcraveiro / Notes.txt
Created May 23, 2026 17:29
Set up llama.cpp with MTP support for my 2090 NVIDIA card
# Get the latest code from https://github.com/ggml-org/llama.cpp
# then build with CUDA support
CUDACXX=/usr/local/cuda/bin/nvcc cmake -DGGML_CUDA=ON --preset x64-linux-gcc-release
cd build-x64-linux-gcc-release
ninja -j3
# Note that we point to the Unsloth MTP Q2 model: https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF?show_file_info=Qwen3.6-27B-UD-IQ2_XXS.gguf
# and enable MTP. Experiment with n-max locally (values 1-3).
# no-mmproj: disables non-text modalities so the model fits in my 11 GB of VRAM
./llama-server -hf unsloth/Qwen3.6-27B-GGUF:UD-IQ2_XXS \
mcraveiro / Screenshot from 2019-09-23 08-24-16.png
Last active September 23, 2019 07:32
Cascadia Code font is fairly usable on Emacs