Ollama cheat sheets and tips
# Taken from Ollama Discord users
Ollama Cheat Sheet: Use Cases and Commands
Here is a cheat sheet of Ollama commands and their corresponding use cases, compiled from Ollama Discord discussions.
Basic Commands
ollama run [model_name]: This command starts an interactive session with a specific model. For example, ollama run llama2 starts a conversation with the Llama 2 7B model.
ollama pull [model_name]: Use this to download a model from the Ollama registry. Example: ollama pull llama2-uncensored downloads the uncensored variant of Llama 2.
ollama list: Lists all the models you have downloaded locally.
ollama rm [model_name]: This command removes a specified model from your local machine.
ollama create [new_model_name] -f [modelfile_path]: Creates a new custom model from a Modelfile. This command is often used to specify a different system prompt or to package a fine-tuned model.
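A minimal end-to-end session using the commands above (llama2 is just an illustrative choice; any registry model works the same way):

    # download a model, chat with it, then inspect and clean up
    ollama pull llama2
    ollama run llama2
    ollama list
    ollama rm llama2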
Use Cases and Examples
Chatbots: Ollama can be used to create advanced chatbots. The Discord bot 'discollama' is an example of a chatbot built on Ollama.
Language Translation: Ollama models have the ability to translate text between languages.
Text Summarization: Users can leverage Ollama to condense long articles or documents into concise summaries.
Content Generation: Ollama can generate text based on a provided prompt.
Code Generation: You can use Ollama to generate code in various programming languages. For example, prefixing a request with "no prose, only output valid [language]" tends to produce code without surrounding explanation; a runnable sketch appears after this list. A user is also developing a code auto-completion tool that integrates with Ollama.
Voice Assistants: Ollama's capabilities extend to building advanced voice assistants that are more interactive than typical voice assistants.
Named Entity Recognition (NER): A user reported success in using Ollama for NER tasks.
Document Parsing: The sources show a method for parsing documents (like Markdown files) by feeding the document content into the SYSTEM prompt; a sketch appears after this list. However, this approach has limitations with larger documents due to the context window size.
Fine-tuning with LoRA: Several users experimented with fine-tuning Ollama models using LoRA (Low-Rank Adaptation). Currently, Ollama doesn't directly support the fine-tuning process, but users can create Modelfiles to package and run their fine-tuned models. Upcoming features may simplify the fine-tuning and model sharing workflow.
Integrating Embeddings for Context: Users can use Ollama's embeddings API to incorporate external information into their prompts. This involves generating embeddings for additional context, storing them (often in a vector database), and then retrieving them during a conversation to provide the model with more information; a curl sketch follows this list.
Function Calling: Ollama is being explored for use with models that support function calling. This feature allows models to execute specific functions based on the user's input.
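For the code-generation use case, a one-liner sketch (codellama is an illustrative model choice; ollama run also accepts a prompt as a command-line argument for non-interactive use):

    # the instruction suppresses explanatory prose around the code
    ollama run codellama "No prose, only output valid Python. Write a function that reverses a string."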
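For document parsing, a minimal sketch that wraps a Markdown file in a Modelfile's SYSTEM prompt (file and model names are illustrative, and the document must fit within the context window):

    # build a Modelfile whose SYSTEM prompt is the document itself
    {
      echo 'FROM llama2'
      echo 'SYSTEM """'
      cat notes.md
      echo '"""'
    } > Modelfile
    ollama create notes-assistant -f Modelfile
    ollama run notes-assistant "Summarize this document in three bullet points."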
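For the embeddings workflow, a curl sketch against the REST API (the model name is illustrative, and storing and retrieving vectors from a database is elided):

    # returns a JSON object with an "embedding" array of floats;
    # store it, then retrieve similar chunks at query time
    curl http://localhost:11434/api/embeddings \
      -d '{"model": "nomic-embed-text", "prompt": "Ollama models are customized with Modelfiles."}'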
Modelfile Customization
The Modelfile is a key component for customizing models in Ollama. Here are some ways you can customize it:
FROM [model_path]: Specifies the base model to use. This can be a path to a local model file or a model name from the Ollama registry.
TEMPLATE [template_string]: Defines the prompt template. The template uses Go templating syntax to inject variables like the user prompt and system message.
SYSTEM [system_message]: Sets the system message that guides the model's behavior.
PARAMETER [parameter_name] [parameter_value]: Allows you to modify model parameters like temperature and context window size.
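Putting these directives together, a minimal Modelfile sketch (the base model, system message, and parameter values are illustrative, and real chat templates are usually more elaborate than the one shown):

    FROM llama2
    SYSTEM """You are a concise assistant. Answer in at most two sentences."""
    TEMPLATE """{{ .System }} {{ .Prompt }}"""
    PARAMETER temperature 0.2
    PARAMETER num_ctx 4096

Build and run it with:

    ollama create concise-llama -f Modelfile
    ollama run concise-llama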
Note: documentation for features like embeddings and LoRA integration within the Modelfile is still under development.
API Usage
Ollama offers a REST API for programmatic interaction. While this cheat sheet focuses on command-line usage, a few points about the API are worth noting:
Maintaining Conversation History: The API's /api/generate endpoint accepts a context parameter, an array of tokens returned by the previous response. Feeding this array back on the next request lets the model maintain conversation history.
Token Streaming: By default, the API returns responses as a stream of JSON objects, each containing a token (a word or word fragment).
Session Management (Upcoming): The developers are working on introducing "sessions" to simplify conversation management, potentially making the context parameter management automatic.
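A sketch of maintaining history across two generate calls (jq is assumed here for extracting the context array; setting "stream": false returns a single JSON object instead of a token stream):

    # first turn: save the context tokens the API returns
    CTX=$(curl -s http://localhost:11434/api/generate \
      -d '{"model": "llama2", "prompt": "My name is Ada.", "stream": false}' \
      | jq -c '.context')
    # second turn: feed the context back so the model remembers the first turn
    curl -s http://localhost:11434/api/generate \
      -d '{"model": "llama2", "prompt": "What is my name?", "stream": false, "context": '"$CTX"'}'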
This cheat sheet provides a starting point for exploring Ollama. For in-depth information, refer to the official documentation and the examples provided in the Ollama GitHub repository.