# Run the Ollama server with GPU access (no persistence; models are re-downloaded each run)
docker run --rm --gpus all -p 11434:11434 ollama/ollama:latest
# With persistent storage of models in a named Docker volume
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 ollama/ollama:latest
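
# (Optional sanity check, a sketch assuming the NVIDIA Container Toolkit is
# installed: this should print the GPU from inside a throwaway container)
docker run --rm --gpus all ubuntu nvidia-smi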
# Optional: run the Open WebUI front end (host.docker.internal lets it reach Ollama on the host)
docker run -d -p 8080:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
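# (Optional: tail the WebUI logs to confirm it started; "open-webui" is the
# container name set by --name above)
docker logs -f open-webui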
# Open the WebUI in a browser, then pull a model through the Ollama API
firefox http://localhost:8080
curl -X POST http://localhost:11434/api/pull -H "Content-Type: application/json" -d '{"model": "llama2"}'
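# Confirm the pull finished by listing the models the server has locally
curl http://localhost:11434/api/tags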

# Ask a question with the helper script below (save it as llama.sh and make it executable)
chmod +x llama.sh
./llama.sh "tell me why life is important"
#!/bin/bash
# llama.sh - send a question to the local Ollama server via its
# OpenAI-compatible chat completions endpoint.

# Check if a question was provided
if [ "$#" -lt 1 ]; then
    echo "Usage: $0 \"Your question here\""
    exit 1
fi

# Get the question from the first command-line argument
QUESTION="$1"

# Execute the curl command with the provided question spliced into the JSON
# payload (note: this splice breaks if the question itself contains double quotes)
time curl -X POST http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "messages": [
            {"role": "user", "content": "'"${QUESTION}"'"}
        ],
        "temperature": 0.7,
        "max_tokens": 1000
    }'
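
# Alternative payload construction, a sketch assuming jq is installed: jq escapes
# quotes and newlines in the question that the raw string splice above chokes on.
PAYLOAD=$(jq -n --arg q "$QUESTION" \
    '{model: "llama2", messages: [{role: "user", content: $q}], temperature: 0.7, max_tokens: 1000}')
time curl -X POST http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"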