The following instructions were used to get Facebook's Llama-2 up and running on Ubuntu 22.04 (70B model) and an M1 MacBook Air (7B model).
The instructions are divided into 2 parts:
- Part 1: Download the models from Facebook's repo: https://github.com/facebookresearch/llama
- Part 2: Use the llama.cpp repo to convert the model and run inference: https://github.com/ggerganov/llama.cpp
Important if you are trying to work with the 70B model and have 500 GB or less of free space