# a machine with a GPU would be nice. makes it faster.
# install Ollama - a runner for the models
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
sudo useradd -r -s /bin/false -m -d /usr/share/ollama ollama
# create service unit file (sudo is needed to write under /usr/lib)
sudo tee /usr/lib/systemd/system/ollama.service > /dev/null <<EOF
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"

[Install]
WantedBy=default.target
EOF
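If you later want to change settings without editing the packaged unit file, a systemd drop-in override keeps your customizations separate and survives upgrades. A minimal sketch (`OLLAMA_MODELS` is a documented Ollama environment variable; the `/data/ollama/models` path below is purely illustrative):

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# create it with: sudo systemctl edit ollama
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
# store pulled models on a disk with more space (illustrative path)
Environment="OLLAMA_MODELS=/data/ollama/models"
```

After editing, run `sudo systemctl daemon-reload && sudo systemctl restart ollama` for the override to take effect.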
# reload the systemd daemon and enable ollama
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama

# testing - confirm something is listening on the default port
sudo lsof -i :11434

# or stop the service and run the server manually in the foreground
sudo systemctl stop ollama
export OLLAMA_HOST=0.0.0.0
ollama serve
# run the llama3.1 model
ollama run llama3.1

# restart the ollama runner
sudo systemctl restart ollama

# watch the logs
journalctl -xeu ollama

# watch NVIDIA GPU utilization
watch -n 1 nvidia-smi
# DeepSeek Coder is trained from scratch on 87% code and 13% natural language in English and Chinese. Each model is pre-trained on 2 trillion tokens.

### Models available
ollama run deepseek-coder
ollama run deepseek-coder:6.7b
ollama run deepseek-coder:33b
### API access to your own machine
Example using curl:

curl -X POST http://your-machine-ip-address:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?"
}'
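By default `/api/generate` streams newline-delimited JSON chunks. To get one complete JSON object instead, set `stream` to false. A hedged sketch of such a payload (`stream` and `options` are part of the Ollama generate API; the option values here are illustrative, not recommendations):

```json
{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "num_ctx": 4096
  }
}
```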
# Enchanted app on the iOS App Store - point it at:
http://your-machine-ip-address:11434

# zed editor integrates with a personal ollama instance
# vscode editor integration with a personal ollama instance
# obsidian notes integration via the khoj plugin connected to a personal ollama instance
prompt example

You are an expert note-making AI for Obsidian who specializes in the Linking Your Thinking (LYT) strategy. The following is a transcription of a recording of someone talking aloud, or of people in a conversation. There may be a lot of random things said, given the fluidity of conversation or thought process and the microphone's ability to pick up all audio. Give me detailed notes in markdown on what was said, in the most easy-to-understand, detailed, and conceptual format. Include any helpful information that can conceptualize the notes further or enhance the ideas, and then summarize what was said. Do not mention "the speaker" anywhere in your response. The notes you write should be written as if I were writing them. Finally, be sure to end with code for a mermaid chart that shows an enlightening concept map combining both the transcription and the information you added to it. The following is the transcribed audio:
I run the llama3.1 model on an NVIDIA GPU machine and then run open-webui on another machine using docker compose. That way I get a locally running inference engine at my fingertips.

pair it with an instance of open-webui
docker compose up -d
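The open-webui side can be sketched as a compose file, written with the same heredoc pattern used for the service unit above. This is a minimal sketch, not a definitive setup: the image name, internal port 8080, and the `OLLAMA_BASE_URL` variable follow the Open WebUI documentation, and `your-machine-ip-address` is a placeholder you must replace with the address of the machine running ollama.

```shell
# write a minimal docker-compose.yml for open-webui (sketch; verify against the Open WebUI docs)
tee docker-compose.yml > /dev/null <<'EOF'
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                  # web UI served on host port 3000
    environment:
      # point at the machine running ollama (placeholder address)
      - OLLAMA_BASE_URL=http://your-machine-ip-address:11434
    volumes:
      - open-webui:/app/backend/data # persist chats and settings
    restart: always
volumes:
  open-webui:
EOF
# then bring it up with: docker compose up -d
```

The named volume keeps chat history across container upgrades; pulling a newer `:main` image and re-running `docker compose up -d` preserves your data.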