@ubergarm
Last active May 4, 2025 16:13
Visualize importance score statistics for three Qwen3-30B-A3B llama-imatrix files.
  1. Used @EAddario's PR ggml-org/llama.cpp#12718 to generate the imatrix statistics.
  2. These were the imatrix data files used; they appear in each mosaic top to bottom in this order: bartowski, ubergarm, unsloth.
  3. Similar to https://huggingface.co/ikawrakow/Qwen3-30B-A3B, but I didn't use the 128k unsloth one and I didn't have ik's to run.

See the attached images below, generated using some python/matplotlib/imagemagick scripts vibe coded using ubergarm/Qwen3-30B-A3B-mix-IQ3_K. You can click them to load them larger; they are not too big at 100 dpi. You may need to shift-reload the page before clicking on them, as I possibly attached them while this gist was still being edited in private mode before making it public.
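(The actual plotting scripts aren't attached here; the following is a minimal illustrative sketch of just the plotting step, assuming the per-tensor statistics were first dumped to CSV files. The file names and column layout below are made up for illustration, not the real script.)

#!/usr/bin/env python3
# Rough sketch of the plotting step only. Assumes per-tensor importance
# statistics were dumped to CSV files named <source>_imatrix_stats.csv
# with columns: tensor,layer,mean_importance (hypothetical layout).
import matplotlib.pyplot as plt
import pandas as pd

SOURCES = ["bartowski", "ubergarm", "unsloth"]  # top-to-bottom order in each mosaic
TENSOR = "ffn_down_exps"                        # one mosaic image per tensor type

fig, axes = plt.subplots(len(SOURCES), 1, figsize=(12, 9), sharex=True, dpi=100)
for ax, source in zip(axes, SOURCES):
    df = pd.read_csv(f"{source}_imatrix_stats.csv")
    sub = df[df["tensor"].str.contains(TENSOR)].sort_values("layer")
    ax.bar(sub["layer"], sub["mean_importance"])
    ax.set_title(f"{TENSOR} ({source})")
    ax.set_ylabel("mean importance")
axes[-1].set_xlabel("layer")
fig.tight_layout()
fig.savefig(f"{TENSOR}_mosaic.png")

A separate imagemagick montage step (as mentioned above) can also stitch individually rendered per-source images into one mosaic instead of using subplots.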

Attached mosaics, one per tensor type:

  • attn_q_mosaic
  • attn_k_mosaic
  • attn_v_mosaic
  • attn_output_mosaic
  • ffn_gate_inp_mosaic
  • ffn_down_exps_mosaic
  • ffn_gate_exps_mosaic
  • ffn_up_exps_mosaic
  • output_mosaic (only ubergarm had the non-repeating output tensor, probably because I used ik's fork to make the imatrix; I arbitrarily mapped it to layer "99", and the graph x-axis shows decimals there, but ignore that.)

@bartowski1182

bartowski1182 commented May 4, 2025

The ffn gate and up experts for mine (assuming mine is on the bottom) and unsloth's are very strange in relation to each other :o

@ubergarm

ubergarm commented May 4, 2025

I updated the gist to list the imatrix sources in order top to bottom: bartowski, ubergarm, unsloth. If you click to enlarge an image, they are labeled in the individual subtitles.

Though interestingly, Dan (unsloth) seems to still be making more/new imatrix files for the Qwen3-235B/30B MoE models using longer context lengths than the default -c 512 that I'm using.

Going by this recent note here on the updated unsloth imatrix methodology, and given they have access to a 640GB VRAM machine, that is enough to calculate the imatrix on the 438G Qwen3-235B-A22B-BF16.

  1. In regards to PPL and KLD - yes KLD is better - but using our imatrix for these numbers is not correct - I used the chat template of the model itself and run imatrix on approx 6K to 12K context lengths, whilst I think the norm is to use 512 context length - comparing our imatrix is now not apples to apples anymore.

So presumably the command JohannesGaessler is asking for here might be something like:

# -f: text containing the chat template of the model
# --ctx-size: default is 512; the note above suggests unsloth is using 6144 - 12288
# --batch-size / --ubatch-size: probably something bigger than these defaults in an attempt to speed up on 8xH100s?
./build/bin/llama-imatrix \
    -m Qwen3-235B-A22B-GGUF/Qwen3-235B-A22B-BF16-00001-of-00011.gguf \
    -f unsloth_calibration_Qwen3-235B-A22B.txt \
    -o Qwen3-235B-A22B-GGUF/imatrix_unsloth.dat \
    --ctx-size 12288 \
    --batch-size 2048 \
    --ubatch-size 512 \
    -ngl 99 \
    --threads 1

tbh I'm not sure how changing context from the default of 512 to say 8k or 12k will affect PPL, KLD, benchmarks, and actual daily use for folks.
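(Aside, in case it helps frame the comparison: the KLD number here is the mean KL divergence between the baseline and quantized models' next-token distributions, averaged over token positions; llama.cpp's perplexity tooling computes it from saved baseline logits. A tiny toy sketch of the quantity itself, with random stand-in logits, not the actual tooling:)

# Toy sketch of the KLD metric: mean KL(P_base || P_quant) over token positions.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kld(base_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    """base_logits, quant_logits: shape (n_tokens, n_vocab)."""
    p = softmax(base_logits)
    q = softmax(quant_logits)
    kld_per_token = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(kld_per_token.mean())

# random stand-in logits just to show the call shape
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 32))
quant = base + rng.normal(scale=0.05, size=base.shape)  # pretend quantization noise
print(f"mean KLD: {mean_kld(base, quant):.6f}")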

@bartowski1182

Ah, the length could also be related, yes. I had experimented a little with using different context lengths, but it's tricky because of the way the entries are stored; compilade may have a better explanation, but it can't just be done 1:1.

@ubergarm

ubergarm commented May 4, 2025

Cross-referencing my comment that might shed some more light on potential effects: ggml-org/llama.cpp#13199 (comment)

fwiw, my methodology:

  1. llama-sweep-bench stuff is all in logs here
  2. My imatrix command is as follows, running on the ik_llama.cpp fork. Note I currently don't have enough VRAM+RAM to use bf16 as the base for the 235B, but I do use the bf16 for the 30B.
./build/bin/llama-imatrix \
    --verbosity 1 \
    --layer-similarity \
    -m /mnt/raid/models/ubergarm/Qwen3-235B-A22B-GGUF/Qwen3-235B-A22B-Q8_0.gguf \
    -f calibration_data_v5_rc.txt \
    -o /mnt/raid/models/ubergarm/Qwen3-235B-A22B-GGUF/imatrix-Qwen3-235B-A22B.dat \
    --ctx-size 512 \
    -ngl 34 \
    --threads 24
