@ochafik
Last active April 26, 2024 14:21

Pending PR

https://github.com/ggerganov/llama.cpp/compare/master...ochafik:llama.cpp:model-args?expand=1

Improve usability of --model-url & related flags

Fixes ggml-org/llama.cpp#6887

  • --model is now inferred from --model-url / -mu or --hf-file / -hff if set (it still defaults to models/7B/gguf-model-f16.gguf otherwise). Downloading different URLs will no longer overwrite previous downloads.

  • URL model downloads now write a .json companion metadata file (instead of the previous separate .etag & .lastModified files). The file also records the URL itself, which documents the exact origin of each model & prevents accidental overwrites of files.

  • Log the etag / last-modified changes that cause re-downloads

  • Incidentally, enable the defaulting of --hf-file to --model on server (as was done on main)
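
The path inference described in the first bullet can be sketched as follows. This is an illustrative Python sketch, not the PR's C++ code: the helper name `infer_model_path`, the `models_dir` parameter, and the precedence of the URL over the HF file are assumptions; only the default path and the `models/` directory come from the description and examples here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Default quoted in the description above.
DEFAULT_MODEL_PATH = "models/7B/gguf-model-f16.gguf"


def infer_model_path(model_url=None, hf_file=None, models_dir="models"):
    """Hypothetical helper: derive a local model path from the download source.

    Assumption: the local file name is the last path segment of the URL
    (query string and fragment ignored) or of the HF file name.
    """
    if model_url:
        # urlsplit separates the path from any "?query" or "#fragment".
        name = PurePosixPath(urlsplit(model_url).path).name
    elif hf_file:
        name = PurePosixPath(hf_file).name
    else:
        # Neither source is set: fall back to the documented default.
        return DEFAULT_MODEL_PATH
    return f"{models_dir}/{name}"
```

Because the local name follows the remote file name, two different URLs map to two different files, which is what keeps a new download from overwriting an earlier one.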

make clean && make -j LLAMA_CURL=1 main server

./main -p Test -n 100 -mu https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf

./main -p Test -n 100 -hfr NousResearch/Meta-Llama-3-8B-Instruct-GGUF -hff Meta-Llama-3-8B-Instruct-Q4_K_M.gguf

ls models/
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf.json
# Phi-3-mini-4k-instruct-q4.gguf
# Phi-3-mini-4k-instruct-q4.gguf.json

cat models/Phi-3-mini-4k-instruct-q4.gguf.json
# {
#     "url": "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf",
#     "etag": "\"b83ce18f1e735d825aa3402db6dae311-145\"",
#     "lastModified": "Thu, 25 Apr 2024 21:26:15 GMT"
# }
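
To illustrate how a companion file like the one above could drive the re-download decision and its log message, here is a minimal Python sketch. The field names `url`, `etag` and `lastModified` come from the example output; the function names and the comparison logic are assumptions, and the PR's actual C++ implementation may differ.

```python
import json
from pathlib import Path


def metadata_path(model_path):
    # The companion file sits next to the model: <model>.gguf.json
    return Path(str(model_path) + ".json")


def read_metadata(model_path):
    """Return the stored metadata dict, or None if no companion file exists."""
    path = metadata_path(model_path)
    if not path.exists():
        return None
    return json.loads(path.read_text())


def write_metadata(model_path, url, etag, last_modified):
    """Record the origin and validators of a finished download."""
    meta = {"url": url, "etag": etag, "lastModified": last_modified}
    metadata_path(model_path).write_text(json.dumps(meta, indent=4))


def changed_fields(stored, url, etag, last_modified):
    """Hypothetical check: list the fields whose stored value differs from
    what the server reports. An empty list means the local copy is current."""
    remote = {"url": url, "etag": etag, "lastModified": last_modified}
    if stored is None:
        return list(remote)  # nothing recorded yet: everything counts as changed
    return [key for key, value in remote.items() if stored.get(key) != value]
```

A caller could log the returned field names before re-downloading, and treat a changed `url` as a sign that the local file came from a different origin.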