This is a guide to setting up looming with a local LLM on your computer. Loom is the name of a technique for using an LLM base model, such as GPT-2, to read, write, and explore generated text by branching into multiple continuations.
To loom you can use the following software stack:
- Obsidian with Loomsidian plugin
- llama.cpp's llama-server running GPT-2
Install Obsidian from https://obsidian.md/ and create a vault.
Clone the Loomsidian fork into your Obsidian plugins folder, which is the subdirectory of your vault named .obsidian/plugins:
git clone https://github.com/rain-1/loom
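For example, assuming your vault lives at ~/vaults/loom-vault (a hypothetical path; substitute your own), the steps look like this:

```
# Hypothetical vault path; replace with the location of your vault.
cd ~/vaults/loom-vault
mkdir -p .obsidian/plugins
cd .obsidian/plugins
git clone https://github.com/rain-1/loom
```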
You need npm to build this. The version in the plugin repos does not currently support llama.cpp, so you must build this fork yourself.
- Options 1 and 2 in this guide did not work for me: https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-22-04. I recommend Option 3, which uses nvm. I installed node v22.9.0 and this worked. A build sketch follows below.
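As a sketch of the build, assuming the fork uses the standard Obsidian plugin build scripts (npm install and npm run build; check the repository's README for the exact commands):

```
# After installing nvm per Option 3 of the guide above:
nvm install 22.9.0
nvm use 22.9.0

# Build the plugin inside the cloned folder (hypothetical vault path).
cd ~/vaults/loom-vault/.obsidian/plugins/loom
npm install
npm run build
```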
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
Do not worry about graphics card acceleration or anything like that yet. Just get started with GPT-2 on CPU.
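Note that recent llama.cpp revisions have replaced the Makefile build with CMake. If make fails or warns that it is deprecated, the CMake equivalent is:

```
cmake -B build
cmake --build build --config Release
# Binaries such as llama-server end up in build/bin/ instead of the repo root.
```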
You can download quantized GGUF versions of GPT2-xl from Hugging Face. I recommend gpt2-xl.Q4_K_S.gguf. Save it into llama.cpp/models/.
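As a sketch, assuming you found a Hugging Face repository named someuser/gpt2-xl-GGUF (a placeholder id, not a real repository), the download could look like this with the huggingface_hub CLI:

```
pip install -U huggingface_hub
# Placeholder repo id; substitute the repository you actually found.
huggingface-cli download someuser/gpt2-xl-GGUF gpt2-xl.Q4_K_S.gguf --local-dir ./models
```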
You can now launch the API endpoint locally on your computer with ./llama-server --host 0.0.0.0 -m ./models/gpt2-xl.Q4_K_S.gguf (adjust the filename if you downloaded a different quantization). You can test it in the browser by loading http://localhost:8080.
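You can also exercise the server's completion endpoint directly from the command line; this assumes the default port 8080:

```
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The loom weaves", "n_predict": 32, "temperature": 0.8}'
```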
You need to configure the Loomsidian plugin to have a gpt2 profile with the following options: