
@darth-veitcher
Created April 11, 2023 08:29
GPT4All on a Mac

High-level instructions for getting GPT4All working on macOS with llama.cpp

See nomic-ai/gpt4all for the canonical source.

Environment

  • This walkthrough assumes you have created a folder called ~/GPT4All. Adjust the following commands as necessary for your own environment.
  • It's highly advised that you use a dedicated Python virtual environment. A conda config is included below for simplicity. Install it with conda env create -f conda-macos-arm64.yaml and then activate it with conda activate gpt4all.
# file: conda-macos-arm64.yaml
name: gpt4all
channels:
  - apple
  - conda-forge
  - huggingface
dependencies:
  - python>3.9,<3.11 # pin to 3.9 or 3.10 for now
  - tensorflow-deps
  - pip
  - onnxruntime
  - transformers
  - pip:
      # Apple Silicon
      # see: https://developer.apple.com/metal/tensorflow-plugin/
      - tensorflow-macos
      - tensorflow-metal # see TF issue https://stackoverflow.com/a/75973297/322358
      # Use nightly build for Tensorflow with --pre (preview)
      - --pre
      - --prefer-binary
      - --extra-index-url https://download.pytorch.org/whl/nightly/cpu
      # - --extra-index-url https://download.pytorch.org/whl/torch_stable.html
      # - --extra-index-url https://download.pytorch.org/whl/cu116
      - --trusted-host https://download.pytorch.org
      - torch
      - torchvision
      - numpy
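Once the environment is active, a quick sanity check can confirm the key packages import cleanly. This is a minimal sketch; the helper below is hypothetical and not part of the original walkthrough:

```python
# Hypothetical helper: report which of the environment's key packages
# can be imported. Run inside the activated `gpt4all` conda env.
import importlib


def check_imports(*names: str) -> dict:
    """Try to import each named package; map name -> 'ok' or 'missing'."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except ImportError:
            results[name] = "missing"
    return results


print(check_imports("torch", "torchvision", "transformers", "onnxruntime"))
```

If anything reports `missing`, re-run the pip section of the conda env before continuing.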

Download the 4-bit Quantized Model

Put the downloaded file into ~/GPT4All/input

Available sources for this:

Safe Version:

Unsafe Version: (This model had all refusal-to-answer responses removed from training.)
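Before converting, it's worth verifying the download is intact. A minimal sketch that computes the file's SHA-256 so you can compare it against a published checksum (the path assumes the folder layout from this walkthrough):

```python
# Sketch: compute the SHA-256 of the downloaded weights so you can compare
# it against a published checksum before converting.
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB models don't fill RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


model = Path.home() / "GPT4All/input/gpt4all-lora-quantized.bin"
if model.exists():
    print(f"{sha256_of(model)}  {model.name}")
else:
    print(f"not downloaded yet: {model}")
```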

Get the Original LLaMA models

Put the downloaded files into ~/GPT4All/LLaMA

As detailed in the official facebookresearch/llama repository pull request.

To download the checkpoints and tokenizer, fill in this Google form or, if you want to save their bandwidth, use this BitTorrent link: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA

Convert GPT4All model

Install the prerequisites and ensure the folder structure exists.

python -m pip install pyllamacpp
mkdir -p ~/GPT4All/{input,output}

Convert the input model to the llama.cpp format

pyllamacpp-convert-gpt4all \
  ~/GPT4All/input/gpt4all-lora-quantized.bin \
  ~/GPT4All/LLaMA/tokenizer.model \
  ~/GPT4All/output/gpt4all-lora-q-converted.bin
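Once conversion finishes, the output file can be loaded with pyllamacpp's Python API. A minimal sketch, assuming the folder layout above; note that the constructor keyword and the generate() signature have varied between pyllamacpp releases, so check help(Model) for your installed version:

```python
# Hedged sketch of loading the converted model via pyllamacpp.
from pathlib import Path

MODEL = Path.home() / "GPT4All/output/gpt4all-lora-q-converted.bin"


def run_prompt(prompt: str) -> None:
    if not MODEL.exists():
        print(f"model not found at {MODEL}; run the conversion step first")
        return
    # Import lazily so the existence check above runs even without pyllamacpp.
    from pyllamacpp.model import Model
    model = Model(model_path=str(MODEL))
    # Newer pyllamacpp releases yield tokens from generate(); older ones
    # take a new_text_callback argument instead -- adjust for your version.
    for token in model.generate(prompt, n_predict=64):
        print(token, end="", flush=True)


run_prompt("Name three uses for a llama:")
```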