@AmosLewis
Created September 6, 2025 00:25
iree
commit 33548616294b02b60467d9c7b68e494a85c7b17f (HEAD -> main, origin/main, origin/HEAD)
Author: Vivian Zhang <[email protected]>
Date:   Fri Sep 5 13:09:33 2025 -0700

shark-ai
commit 9c173373eb1db0c2be523580a261c28e8115ad52 (HEAD -> main, origin/main, origin/HEAD)
Author: Alex Vasile <[email protected]>
Date:   Fri Sep 5 18:59:59 2025 -0400

./run_offline.sh --shortfin-config shortfin_405b_config_fp4.json

AmosLewis commented Sep 6, 2025

python -m sharktank.tools.run_llm_vmfb \
--prompt "Where is the capital of the US?" \
--irpa /shark-dev/llama3.1/405b/instruct/weights/fp4/fp4_2025_07_10_fn.irpa \
--vmfb /sharedfile/f4/2500/405b/pp1/out/f4_bs4_ds4.iree0905.shark0905_dfc.vmfb  \
--config /sharedfile/f4/2500/405b/pp1/out/f4_bs4_ds4.iree0905.shark0905_dfc.json \
--tokenizer /shark-dev/llama3.1/405b/instruct/weights/fp4/tokenizer.json \
--tokenizer_config /shark-dev/llama3.1/405b/instruct/weights/fp4/tokenizer_config.json \
--steps 2 \
--kv-cache-dtype float8_e4m3fn
