Figure 1: Remote development setup with MacStudio as server and VS Code integration
A step-by-step guide to creating a powerful, private AI coding assistant using Ollama and Continue extension in VS Code, with remote server capabilities.
As developers, we're always looking for ways to enhance our coding workflow while maintaining privacy and control over our tools. In this guide, I'll show you how to set up a powerful, locally-hosted AI coding assistant using Ollama models and the Continue extension for VS Code, with a unique twist: running it on a remote server while accessing it from any client machine.
Here's why this setup is worth the effort:
- Privacy: All AI operations run on your hardware
- Performance: Utilize powerful server hardware for AI processing
- Flexibility: Access from any client machine
- Cost-effective: Free, open-source solution
- Customizable: Choose and configure your own models
Our setup consists of two main components:
- Server (MacStudio):
  - VS Code Server
  - Ollama server running the AI models
  - Handles all AI computations
- Client (MacBook Air or any machine):
  - VS Code with a remote tunnel connection
  - Continue extension
  - Connects to the server for AI assistance
To follow along, you'll need:
- Server machine (MacStudio in our case) with:
  - VS Code Server installed
  - Ollama installed
  - Sufficient RAM (32GB+ recommended)
  - GPU (optional but recommended)
- Client machine with:
  - VS Code installed
  - Continue extension installed
  - Network access to the server
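Before going further, it helps to confirm the basics are in place on the server. Here's a quick sanity check, assuming a macOS server with the VS Code CLI on your PATH (adjust for your platform):

```bash
# Check available RAM and confirm the required CLIs are installed (macOS)
sysctl -n hw.memsize | awk '{printf "RAM: %.0f GB\n", $1/1073741824}'
ollama --version   # is the Ollama CLI present?
code --version     # is the VS Code CLI present?
```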
First, let's configure the server components:
```bash
# Install the Continue extension into VS Code Server
code-server --install-extension continue.continue

# Install Ollama (the install script targets Linux; on macOS,
# use Homebrew or download the app from https://ollama.com)
curl -fsSL https://ollama.com/install.sh | sh
# macOS alternative:
brew install ollama

# Configure Ollama for remote access, e.g. in .zshrc
# (applies when you start `ollama serve` from a shell; for the
# macOS app, use: launchctl setenv OLLAMA_HOST "0.0.0.0")
export OLLAMA_HOST=0.0.0.0

# Pull the required models
ollama pull llama3.3:70b
ollama pull codestral
ollama pull nomic-embed-text
ollama pull linux6200/bge-reranker-v2-m3
```
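Once the models are pulled, verify that Ollama is reachable over the network before touching the client. A minimal check, assuming your server's LAN IP is 192.168.1.50 (a placeholder; substitute your own):

```bash
# List installed models over the network; the JSON response should
# include llama3.3:70b, codestral, nomic-embed-text, and the reranker
curl http://192.168.1.50:11434/api/tags
```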
On your client machine:
Figure 2: VS Code environment with Continue extension and AI code suggestions
- Install VS Code
- Install the Continue extension (continue.dev)
- Set up the VS Code tunnel:
  - Open the Command Palette (Cmd + Shift + P)
  - Run "Remote Tunnels: New"
  - Follow the authentication process
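If you prefer the terminal, the same tunnel can be started from the server with the VS Code CLI; the tunnel name below is just an example:

```bash
# Run on the server; you'll be prompted to authenticate
code tunnel --name macstudio-dev
# The client can then connect through the Remote Explorer in VS Code,
# or in a browser at https://vscode.dev/tunnel/macstudio-dev
```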
Create or update Continue's config.json (typically ~/.continue/config.json):
```json
{
  "models": [
    {
      "title": "Llama 3.3 70B",
      "provider": "ollama",
      "model": "llama3.3:70b",
      "baseUrl": "http://[SERVER-IP]:11434",
      "apiBase": "http://[SERVER-IP]:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Code Completion",
    "provider": "ollama",
    "model": "codestral",
    "baseUrl": "http://[SERVER-IP]:11434",
    "apiBase": "http://[SERVER-IP]:11434"
  },
  "embeddingModel": {
    "title": "Text Embedding",
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://[SERVER-IP]:11434",
    "apiBase": "http://[SERVER-IP]:11434"
  },
  "rerankingModel": {
    "title": "Reranking",
    "provider": "ollama",
    "model": "linux6200/bge-reranker-v2-m3",
    "baseUrl": "http://[SERVER-IP]:11434",
    "apiBase": "http://[SERVER-IP]:11434"
  }
}
```
Remember to replace [SERVER-IP] with your actual server IP address in all configuration examples.
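With the config in place, you can smoke-test the chat model directly against Ollama's API, independent of Continue; this separates server problems from extension problems. Again using 192.168.1.50 as a placeholder IP:

```bash
curl http://192.168.1.50:11434/api/generate -d '{
  "model": "llama3.3:70b",
  "prompt": "Write a Python one-liner that reverses a string.",
  "stream": false
}'
```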
Critical step for macOS users (client):
- Open System Settings
- Navigate to Privacy & Security > Local Network
- Find Visual Studio Code
- Toggle ON local network access
Our setup uses specific models for different tasks:
- Chat model: llama3.3:70b
  - Comprehensive code understanding
  - Natural language interaction
  - Complex problem-solving
- Autocompletion: codestral
  - Efficient fill-in-the-middle completions
  - Context-aware suggestions
  - Fast response times
- Embedding: nomic-embed-text
  - Semantic code search
  - Similar code identification
  - Context understanding
- Reranking: bge-reranker-v2-m3
  - Improved search results
  - Better context matching
  - Enhanced relevance sorting
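To exercise the embedding model outside of Continue, you can call Ollama's embeddings endpoint directly; the snippet below assumes the same placeholder server IP:

```bash
# Returns a JSON object with an "embedding" array of floats
curl http://192.168.1.50:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "def binary_search(arr, target):"
}'
```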
Continue provides a robust tools system that enhances AI capabilities through custom function calling. Here's how different models handle tool integration:
- File Operations: Create, modify, and manage files
- Git Integration: Generate commit messages, analyze diffs
- Terminal Commands: Execute and manage CLI operations
- Search Codebase: Find relevant code snippets
- Workspace Management: Handle files and directories
- Claude models:
  - Native function calling with structured outputs
  - Support for complex multi-step reasoning
  - Can chain multiple tools together. Example:
```json
{
  "tools": {
    "terminal": {
      "description": "Run terminal commands",
      "parameters": {
        "command": "string"
      }
    }
  }
}
```
- Ollama models:
  - System prompt-based tool invocation
  - Supports basic command execution
  - Works best with clear, single-step instructions. Example:
```json
{
  "systemPrompt": "You can run commands using /cmd {command}",
  "tools": {
    "cmd": {
      "description": "Execute terminal command"
    }
  }
}
```
- Custom tool integration: you can add your own tools by extending Continue's configuration. Because this example includes a handler function, it belongs in a programmatic config file (config.ts) rather than plain JSON:
```typescript
{
  customTools: [
    {
      name: "database",
      description: "Query the database",
      parameters: {
        query: "string",
        database: "string"
      },
      handler: async (params) => {
        // Custom implementation
      }
    }
  ]
}
```
Best practices for tools:
- Security:
  - Always validate tool inputs
  - Use permission prompts for sensitive operations (see the sketch after this list)
  - Limit tool access scope
- Performance:
  - Cache frequent tool results
  - Use async operations for long-running tasks
  - Implement proper error handling
- Integration:
  - Keep tool descriptions clear and specific
  - Test tools with different model behaviors
  - Document custom tool implementations
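To illustrate the permission-prompt advice above, here is a minimal shell sketch; run_with_confirmation is a hypothetical helper, not part of Continue or Ollama:

```bash
# Ask before executing a model-suggested command
run_with_confirmation() {
  printf 'Model wants to run: %s\nProceed? [y/N] ' "$1"
  read -r answer
  if [ "$answer" = "y" ]; then
    eval "$1"
  else
    echo "Skipped."
  fi
}

run_with_confirmation "ls -la"
```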
Common issues and solutions:
- Remote Connection Errors (Issue #1090):
  - Problem: Continue fails to connect to the remote Ollama server
  - Solution: Add apiBase to the model configuration:

```json
{
  "models": [{
    "title": "Llama 3.3 70B",
    "provider": "ollama",
    "model": "llama3.3:70b",
    "baseUrl": "http://[SERVER-IP]:11434",
    "apiBase": "http://[SERVER-IP]:11434" // Add this line
  }]
}
```

  - Remember to add this for all Ollama models in your config
- EHOSTUNREACH Error on macOS (Issue #1145):
  - Problem: Error message: "connect EHOSTUNREACH [SERVER-IP]:11434"
  - Cause: macOS security settings blocking VS Code's local network access
  - Solution:
    - Open System Settings
    - Navigate to Privacy & Security > Local Network
    - Find Visual Studio Code
    - Toggle ON local network access
  - Note: this step is mandatory on recent macOS versions
- General Connection Issues:
  - Verify Ollama is running: `curl http://[SERVER-IP]:11434/api/tags`
  - Check firewall settings: allow port 11434
  - For lower-level checks, see the reachability commands after this list
- Model Loading Issues:
  - Verify model installation: `ollama list`
  - Check server resources
  - Enable detailed Ollama logging:

```bash
# Stop any running Ollama instance first
pkill ollama
# Start Ollama with debug logging
OLLAMA_DEBUG=1 ollama serve
```

  - Monitor the Ollama logs: `tail -f ~/.ollama/logs/server.log`
- Debugging Steps:
  - Enable Continue debug logs in settings.json:

```json
{
  "continue.enableDebugLogs": true,
  "continue.showLogInConsole": true
}
```

  - Check the VS Code Developer Tools (Help > Toggle Developer Tools)
  - Verify network requests in the Developer Tools Network tab
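For the connection problems above, two low-level reachability checks quickly tell you whether the issue is the network or the software; 192.168.1.50 is again a placeholder for your server's IP:

```bash
# On the server: is anything listening on Ollama's port?
lsof -iTCP:11434 -sTCP:LISTEN

# From the client: can we open a TCP connection at all?
nc -vz 192.168.1.50 11434
```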
This setup provides a powerful, private AI coding assistant that combines the best of local processing with remote accessibility. The Continue extension with Ollama offers a flexible, customizable solution that can enhance your coding workflow while maintaining control over your data and models.
- Continue Documentation (https://docs.continue.dev)
- Ollama GitHub (https://github.com/ollama/ollama/blob/main/README.md#quickstart)
- VS Code Remote Development (https://code.visualstudio.com/docs/remote/remote-overview)
Tags: AI, Development, VS Code, Ollama, Continue, Remote Development