The LLaMA model weights may be converted from Hugging Face PyTorch format back to GGML in two steps:
- download the model from decapoda-research/llama-7b-hf and save the weights in PyTorch `.pth` format
- use the ggerganov/llama.cpp script `convert-pth-to-ggml.py` to convert the PyTorch `.pth` weights to GGML
This process results in a GGML model with float16 (`fp16`) precision.
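The first step, saving the weights as a PyTorch `.pth` file, amounts to serializing the model's state dict with `torch.save`. The sketch below uses a toy module in place of the real llama-7b-hf model (which would normally be loaded via `transformers`), and the output filename is illustrative:

```python
import torch
import torch.nn as nn

# Toy module standing in for the real LLaMA model; in practice the weights
# would be loaded from decapoda-research/llama-7b-hf with the transformers
# library before saving.
model = nn.Linear(4, 4)

# Save the weights in PyTorch's native .pth format, the input expected by
# llama.cpp's convert-pth-to-ggml.py script (filename is illustrative).
torch.save(model.state_dict(), "consolidated.00.pth")

# Sanity check: reload the file and inspect the parameter names.
state = torch.load("consolidated.00.pth")
print(sorted(state.keys()))  # ['bias', 'weight']
```

The resulting `.pth` file is then passed to `convert-pth-to-ggml.py` in the llama.cpp repository, which reads the state dict and writes the GGML file.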