https://github.com/ggerganov/llama.cpp/compare/master...ochafik:llama.cpp:model-args?expand=1
Fixes ggml-org/llama.cpp#6887
- `--model` is now inferred from `--model-url`/`-mu` or `--hf-file`/`-hff` if set (it still defaults to `models/7B/gguf-model-f16.gguf` otherwise). Downloading different URLs will no longer overwrite previous downloads.
- URL model downloads now write a `.json` companion metadata file (instead of the previous separate `.etag` & `.lastModified` files). This file also contains the URL itself, which is useful for remembering the exact origin of a model and prevents accidental overwrites of files.
- Log when etag / last-modified changes cause a re-download.
- Incidentally, enable the defaulting of `--hf-file` to `--model` on `server` (as was already done on `main`).
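The naming and companion-file behavior described above can be sketched roughly as follows. This is a minimal Python sketch of the logic, not the actual C++ implementation in llama.cpp; all function names here are illustrative:

```python
import json
import os
from urllib.parse import urlparse

def model_path_for(url: str, models_dir: str = "models") -> str:
    """Infer the local --model path from a download URL (basename of the URL path),
    so that different URLs map to different local files."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(models_dir, name)

def metadata_path(model_path: str) -> str:
    """The companion metadata file sits next to the model as <model>.json."""
    return model_path + ".json"

def write_metadata(model_path: str, url: str, etag: str, last_modified: str) -> None:
    """Record origin URL plus etag / lastModified in one JSON file."""
    with open(metadata_path(model_path), "w") as f:
        json.dump({"url": url, "etag": etag, "lastModified": last_modified}, f, indent=2)

def read_metadata(model_path: str) -> dict:
    """Return the stored metadata, or an empty dict if none exists yet."""
    try:
        with open(metadata_path(model_path)) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

Because the local filename is derived from the URL's basename, two models downloaded from different URLs land in different files instead of clobbering each other.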
```bash
make clean && make -j LLAMA_CURL=1 main server
./main -p Test -n 100 -mu https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf
./main -p Test -n 100 -hfr NousResearch/Meta-Llama-3-8B-Instruct-GGUF -hff Meta-Llama-3-8B-Instruct-Q4_K_M.gguf
ls models/
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf.json
# Phi-3-mini-4k-instruct-q4.gguf
# Phi-3-mini-4k-instruct-q4.gguf.json
```
```bash
cat models/Phi-3-mini-4k-instruct-q4.gguf.json
# {
#   "url": "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf",
#   "etag": "\"b83ce18f1e735d825aa3402db6dae311-145\"",
#   "lastModified": "Thu, 25 Apr 2024 21:26:15 GMT"
# }
```
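The re-download decision that the logging bullet refers to can be sketched like this — a hypothetical Python sketch of the check, assuming the server's current `ETag` and `Last-Modified` headers have already been fetched; it is not the actual C++ code:

```python
def needs_redownload(local_meta: dict, remote_etag: str, remote_last_modified: str) -> bool:
    """Decide whether to re-download: True when no companion metadata exists yet,
    or when the server's ETag / Last-Modified differs from what was stored."""
    if not local_meta:
        return True  # first download: no companion .json on disk
    if local_meta.get("etag") and local_meta["etag"] != remote_etag:
        # this is where the new logging about etag changes would fire
        print(f"etag changed: {local_meta['etag']} -> {remote_etag}; re-downloading")
        return True
    if local_meta.get("lastModified") and local_meta["lastModified"] != remote_last_modified:
        print("lastModified changed; re-downloading")
        return True
    return False
```

Keeping both fields in one JSON file (rather than separate `.etag` and `.lastModified` files) means this check reads a single companion file per model.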