This is a practical deployment note for running Qwen3.6-35B-A3B on a single RTX 5060 Ti 16GB with llama.cpp, including multimodal input.
Useful Hugging Face sources:
- Distilled GGUF family used for this deployment:
  https://huggingface.co/reedmayhew/UD-2.0-Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF