swyxio / gist:324fc884061bf20e97a2ecbe59bae34a
Last active May 4, 2026 16:07
r/LocalLLaMA + r/LocalLLM + r/SillyTavernAI preferred models list - Apr 2026

| Model | Size/Class | Format | Hosted Provider | Best Local Path | Notes |
| --- | --- | --- | --- | --- | --- |
| Huihui Gemma 4 E2B Abliterated v2 | E2B | GGUF | No | Ollama / llama.cpp | Gemma 4 MoE with ~2B active params. Multimodal (image+text in, text out). Abliterated for reduced refusal. Lightweight enough to run fast, but MoE active-param sizing means quality punches above its weight class. |
| Huihui Gemma 4 E4B Abliterated | E4B | GGUF | No | Ollama / llama.cpp | Same Gemma 4 MoE family as E2B but with ~4B active params. Multimodal. Better quality ceiling than E2B at the cost of more compute per token. |
| SultrySilicon V2 | 7B | GGUF | No | Ollama / llama.cpp | Roleplay-focused 7B model. Smallest in the set. Good for quick creative/RP sanity checks, not for reasoning or instruction-fol