coderplay/cline-agent.md

Last active July 25, 2025 07:26


Cline Agent System Architecture & Core Workflow

Overview

Cline is an AI agent built as a VSCode extension that provides coding assistance through a layered architecture. It implements a dual-mode (Plan/Act) agent loop with persistent state management, tool execution, and integration with multiple AI providers.

Core Architecture Components

1. Extension Layer (Entry Point)

Location: src/extension.ts
Responsibility: VSCode extension lifecycle management, activation/deactivation, command registration
Key Functions:
- activate(): Initializes the extension, sets up webview providers, registers commands
- deactivate(): Cleans up resources, disposes webview instances
- URI handling for authentication callbacks

2. Webview Layer (UI Frontend)

Location: webview-ui/src/
Technology: React-based webview using TypeScript
Responsibility: User interface rendering, state management, user interaction handling
Key Components:
- App.tsx: Main application component with view routing
- ChatView: Primary chat interface
- ExtensionStateContext: React context for state management
- gRPC client for backend communication

3. Controller Layer (State Management)

Location: src/core/controller/index.ts
Responsibility: Central state management, task coordination, API configuration
Key Functions:
- initTask(): Initializes new task instances
- handleWebviewMessage(): Processes messages from webview
- togglePlanActModeWithChatSettings(): Manages Plan/Act mode switching
- State persistence across VSCode sessions

4. Task Layer (Agent Core)

Location: src/core/task/index.ts
Responsibility: Core agent logic, LLM interaction, tool execution
Key Classes:
- Task: Main agent instance managing conversation flow
- TaskState: Manages task lifecycle state
- MessageStateHandler: Handles message persistence

5. Tool Execution Layer

Location: src/core/task/ToolExecutor.ts
Responsibility: Tool discovery, validation, and execution
Supported Tools:
- File operations (read, write, edit)
- Terminal command execution
- Browser automation
- MCP server integration
- Git operations

6. API Provider Layer

Location: src/api/
Responsibility: LLM provider abstraction and management
Supported Providers:
- Anthropic (Claude)
- OpenRouter
- OpenAI
- AWS Bedrock
- Local models (Ollama, LM Studio)

7. Context Management Layer

Location: src/core/context/
Responsibility: Conversation context management, token optimization
Key Components:
- ContextManager: Handles context window management
- FileContextTracker: Tracks file modifications
- ModelContextTracker: Tracks model usage

8. Storage Layer

Location: src/core/storage/
Responsibility: Persistent storage for tasks, history, and configuration
Storage Types:
- Global state (VSCode ExtensionContext.globalState)
- Workspace state (VSCode ExtensionContext.workspaceState)
- Secrets (VSCode ExtensionContext.secrets)
- File-based task storage

System Architecture Diagram

graph TD
    subgraph VSCode Extension Host
        subgraph Extension Layer
            EXT[Extension Entry Point]
            CMD[Command Registry]
            URI[URI Handler]
        end
        
        subgraph Webview Layer
            WV[Webview Provider]
            REACT[React App]
            GRPC[gRPC Client]
        end
        
        subgraph Controller Layer
            CTRL[Controller]
            STATE[State Manager]
            CONFIG[Config Manager]
        end
        
        subgraph Task Layer
            TASK[Task Instance]
            TSTATE[Task State]
            MSG[Message Handler]
        end
        
        subgraph Tool Layer
            TOOLS[Tool Executor]
            TERM[Terminal Manager]
            BROWSER[Browser Session]
            MCP[MCP Hub]
        end
        
        subgraph API Layer
            API[API Handler]
            PROVIDERS[Provider Factory]
            STREAM[Stream Manager]
        end
        
        subgraph Context Layer
            CTX[Context Manager]
            FILECTX[File Context]
            MODELCTX[Model Context]
        end
        
        subgraph Storage Layer
            GLOBAL[Global Storage]
            WORKSPACE[Workspace Storage]
            SECRETS[Secrets Storage]
            TASKSTORAGE[Task Storage]
        end
    end
    
    EXT --> WV
    EXT --> CMD
    EXT --> URI
    
    WV --> REACT
    REACT --> GRPC
    GRPC --> CTRL
    
    CTRL --> STATE
    CTRL --> CONFIG
    CTRL --> TASK
    
    TASK --> TSTATE
    TASK --> MSG
    TASK --> TOOLS
    
    TOOLS --> TERM
    TOOLS --> BROWSER
    TOOLS --> MCP
    
    TASK --> API
    API --> PROVIDERS
    API --> STREAM
    
    TASK --> CTX
    CTX --> FILECTX
    CTX --> MODELCTX
    
    CTRL --> GLOBAL
    CTRL --> WORKSPACE
    CTRL --> SECRETS
    TASK --> TASKSTORAGE

Core Agent Loop (Main Workflow)

1. Initialization Flow

// Extension activation
activate() → WebviewProvider.create() → Controller() → Task()

// Task initialization
initTask() → new Task() → startTask() → initiateTaskLoop()

2. Agent Loop Pseudocode

class Task {
    async initiateTaskLoop(userContent: UserContent) {
        let nextUserContent = userContent
        
        while (!this.taskState.abort) {
            // 1. Prepare context and environment
            const [processedContent, environmentDetails] = await this.loadContext(nextUserContent)
            
            // 2. Make API request to LLM
            const stream = await this.attemptApiRequest(previousApiReqIndex)
            
            // 3. Process streaming response
            for await (const chunk of stream) {
                switch (chunk.type) {
                    case "text":
                        await this.presentAssistantMessage()
                        break
                    case "tool_use":
                        await this.toolExecutor.executeTool(toolBlock)
                        break
                }
            }
            
            // 4. Handle tool results and continue
            if (didEndLoop) {
                break
            } else {
                nextUserContent = await this.prepareNextUserContent()
            }
        }
    }
}
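
The loop above can be exercised end to end with a scripted stand-in for the LLM. What follows is a minimal sketch under stated assumptions, not Cline's implementation: `Block`, `ModelTurn`, `ToolFn`, and `runAgentLoop` are hypothetical names, the history is reduced to plain strings, and the loop is synchronous, whereas the real loop is asynchronous and streams its response.

```typescript
// Hypothetical, simplified types for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; params: Record<string, string> }

type ModelTurn = (history: string[]) => Block[]
type ToolFn = (params: Record<string, string>) => string

function runAgentLoop(
  userContent: string,
  callModel: ModelTurn,
  tools: Record<string, ToolFn>,
  maxTurns = 10,
): string[] {
  const history: string[] = [`user: ${userContent}`]
  for (let turn = 0; turn < maxTurns; turn++) {
    // One "API request": the model sees the history and returns content blocks.
    const blocks = callModel(history)
    let usedTool = false
    let completed = false
    for (const block of blocks) {
      if (block.type === "text") {
        history.push(`assistant: ${block.text}`)
        continue
      }
      if (block.name === "attempt_completion") {
        history.push(`done: ${block.params.result}`)
        completed = true
        break
      }
      // Run the tool and feed its result back as the next turn's input.
      const tool = tools[block.name]
      const result = tool ? tool(block.params) : `Unknown tool: ${block.name}`
      history.push(`tool(${block.name}): ${result}`)
      usedTool = true
    }
    if (completed) break
    if (!usedTool) history.push("user: no tool was used; call a tool or attempt_completion")
  }
  return history
}

// Scripted model: reads a file on the first turn, then completes.
const scripted: ModelTurn = (history) =>
  history.some((h) => h.indexOf("tool(read_file)") === 0)
    ? [{ type: "tool_use", name: "attempt_completion", params: { result: "summarized" } }]
    : [
        { type: "text", text: "Reading the file first." },
        { type: "tool_use", name: "read_file", params: { path: "README.md" } },
      ]

const transcript = runAgentLoop("Summarize README.md", scripted, {
  read_file: (p) => `contents of ${p.path}`,
})
```

The `maxTurns` cap is an addition of this sketch; it only guards the example against a scripted model that never completes.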

3. Message Processing Flow

sequenceDiagram
    participant User
    participant Webview
    participant Controller
    participant Task
    participant LLM
    participant Tools
    
    User->>Webview: Input message
    Webview->>Controller: send message
    Controller->>Task: initTask()
    Task->>Task: prepare context
    Task->>LLM: API request
    LLM-->>Task: streaming response
    
    loop For each content block
        Task->>Task: parse content
        alt Text content
            Task->>Webview: display text
        else Tool use
            Task->>Tools: execute tool
            Tools-->>Task: tool result
            Task->>LLM: send tool result
        end
    end
    
    LLM-->>Task: completion
    Task->>Webview: display completion

Communication Patterns

1. Webview ↔ Extension Communication

Protocol: gRPC over VSCode's postMessage API
Message Types: Defined in /proto/ directory
State Synchronization: Real-time bidirectional updates
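
Request/response calls over a message-passing channel generally need an envelope with a correlation id so that replies can be matched to callers. The sketch below illustrates that pattern only. The envelope fields, `RpcServer`, `RpcClient`, and the `StateService`/`getMode` names are hypothetical and are not taken from Cline's proto definitions.

```typescript
// Hypothetical envelopes; the real message shapes are defined in the proto files.
interface RpcRequest { requestId: string; service: string; method: string; payload: unknown }
interface RpcResponse { requestId: string; payload?: unknown; error?: string }

type Handler = (payload: unknown) => unknown

// Extension side: routes a request envelope to a registered handler.
class RpcServer {
  private handlers = new Map<string, Handler>()

  register(service: string, method: string, handler: Handler): void {
    this.handlers.set(`${service}/${method}`, handler)
  }

  handle(req: RpcRequest): RpcResponse {
    const key = `${req.service}/${req.method}`
    const handler = this.handlers.get(key)
    if (!handler) return { requestId: req.requestId, error: `Unknown method ${key}` }
    try {
      return { requestId: req.requestId, payload: handler(req.payload) }
    } catch (err) {
      return { requestId: req.requestId, error: String(err) }
    }
  }
}

// Webview side: tags each request with an id and matches responses to callbacks.
class RpcClient {
  private pending = new Map<string, (res: RpcResponse) => void>()
  private nextId = 0
  private post: (req: RpcRequest) => void

  constructor(post: (req: RpcRequest) => void) {
    this.post = post
  }

  call(service: string, method: string, payload: unknown, onResponse: (res: RpcResponse) => void): void {
    const requestId = String(this.nextId++)
    this.pending.set(requestId, onResponse)
    this.post({ requestId, service, method, payload })
  }

  receive(res: RpcResponse): void {
    const callback = this.pending.get(res.requestId)
    if (!callback) return
    this.pending.delete(res.requestId)
    callback(res)
  }
}

// Wire both ends together directly; in a webview this hop would go through postMessage.
const server = new RpcServer()
server.register("StateService", "getMode", () => "plan")
const client: RpcClient = new RpcClient((req) => client.receive(server.handle(req)))

const responses: RpcResponse[] = []
client.call("StateService", "getMode", {}, (res) => responses.push(res))
client.call("StateService", "missing", null, (res) => responses.push(res))
```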

2. Extension ↔ LLM Communication

Protocol: REST API with streaming support
Providers: Multiple AI provider support via factory pattern
Error Handling: Automatic retry, context window management
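
The factory pattern mentioned above can be sketched as a registry that maps a provider id to a constructor function. This is an illustration under assumptions: `ProviderId`, `ApiConfiguration`, `ApiHandler`, `stubHandler`, and `createApiHandler` are hypothetical names, and the handlers are stubs that echo text instead of calling a provider.

```typescript
// Hypothetical provider ids, loosely following the provider list in this document.
type ProviderId = "anthropic" | "openrouter" | "openai" | "bedrock" | "ollama"

interface ApiConfiguration { provider: ProviderId; modelId: string }

// Hypothetical handler interface; a real one would stream chunks asynchronously.
interface ApiHandler {
  readonly providerName: ProviderId
  createMessage(systemPrompt: string, userText: string): string[]
}

// Stub handler: returns fixed chunks so the example runs without network access.
function stubHandler(providerName: ProviderId, modelId: string): ApiHandler {
  return {
    providerName,
    createMessage: (_systemPrompt, userText) => [`[${providerName}:${modelId}] `, userText],
  }
}

// The registry: one factory function per provider id.
const handlerFactories: Record<ProviderId, (modelId: string) => ApiHandler> = {
  anthropic: (m) => stubHandler("anthropic", m),
  openrouter: (m) => stubHandler("openrouter", m),
  openai: (m) => stubHandler("openai", m),
  bedrock: (m) => stubHandler("bedrock", m),
  ollama: (m) => stubHandler("ollama", m),
}

// Callers depend only on ApiHandler, never on a concrete provider.
function createApiHandler(config: ApiConfiguration): ApiHandler {
  return handlerFactories[config.provider](config.modelId)
}

const handler = createApiHandler({ provider: "ollama", modelId: "example-model" })
const chunks = handler.createMessage("You are a coding agent.", "hello")
```

Typing the registry as `Record<ProviderId, ...>` makes the compiler reject a provider id that has no factory.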

3. Tool Execution Flow

// Tool execution sequence
Task.presentAssistantMessage() → ToolExecutor.executeTool() → 
Specific Tool Implementation → Tool Result → Task.recursivelyMakeClineRequests()

Plan/Act Mode Architecture

Plan Mode

Purpose: Information gathering, planning, discussion
Tools Available: plan_mode_respond (conversational only)
User Interaction: Clarifying questions, plan refinement

Act Mode

Purpose: Task execution, tool usage, implementation
Tools Available: All tools except plan_mode_respond
User Interaction: Tool approval, result review
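
The tool split between the two modes, as described above, can be expressed as a filter over the tool list. This is a minimal sketch: `toolsForMode` and `isToolAllowed` are hypothetical helpers, and `ALL_TOOLS` is only a subset of the tool names for brevity.

```typescript
type Mode = "plan" | "act"

// A subset of tool names, chosen for the example.
const ALL_TOOLS = [
  "read_file",
  "write_to_file",
  "execute_command",
  "plan_mode_respond",
  "attempt_completion",
] as const
type ToolName = (typeof ALL_TOOLS)[number]

// Plan mode exposes only plan_mode_respond; Act mode exposes everything else.
function toolsForMode(mode: Mode): ToolName[] {
  return mode === "plan"
    ? ALL_TOOLS.filter((t) => t === "plan_mode_respond")
    : ALL_TOOLS.filter((t) => t !== "plan_mode_respond")
}

function isToolAllowed(mode: Mode, tool: ToolName): boolean {
  return toolsForMode(mode).indexOf(tool) !== -1
}

const planTools = toolsForMode("plan")
const actTools = toolsForMode("act")
```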

Mode Switching

// Mode transition flow
togglePlanActModeWithChatSettings() → 
updateGlobalState("mode") → 
update API configuration → 
Task.chatSettings update → 
continue with new mode

State Management Architecture

1. Global State

- API configurations
- User preferences
- MCP server configurations
- Task history

2. Task State

- Conversation history
- Tool execution state
- Checkpoint information
- Context window usage

3. Webview State

- UI preferences
- Current view (chat/settings/history)
- Input state
- Message display state

Error Handling & Recovery

1. API Errors

- Automatic retry with exponential backoff
- Context window management with truncation
- Provider-specific error handling
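
Retry with exponential backoff can be sketched as follows. The base delay of 1000 ms, the 30000 ms cap, and the names `backoffDelayMs` and `retryWithBackoff` are assumptions chosen for the example, not values taken from Cline.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt))
}

// Calls `fn` until it succeeds or `maxRetries` retries have been used up.
// `sleep` is injected so callers and tests control how waiting happens.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= maxRetries) throw err
      await sleep(backoffDelayMs(attempt))
    }
  }
}

// Usage: a request that fails twice, then succeeds. The fake sleep records delays.
const waits: number[] = []
let calls = 0
const outcome = retryWithBackoff(
  async () => {
    calls++
    if (calls < 3) throw new Error("rate limited")
    return "ok"
  },
  5,
  async (ms) => {
    waits.push(ms)
  },
)
```

In production the injected `sleep` would typically wrap `setTimeout` in a promise, and many implementations add random jitter to the delay so that concurrent clients do not retry in lockstep.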

2. Tool Errors

- Tool validation before execution
- User approval for sensitive operations
- Rollback capabilities via checkpoints

3. State Recovery

- Task resumption from history
- Checkpoint restoration
- Conversation history recovery

Security & Privacy

1. Data Protection

- API keys stored in VSCode secrets storage
- Sensitive data never logged
- Local-only processing by default

2. Access Control

- .clineignore file support
- User approval for file operations
- Configurable auto-approval settings

3. Audit Trail

- Complete task history
- Token usage tracking
- Checkpoint-based change tracking

Performance Optimizations

1. Context Management

- Intelligent conversation truncation
- Token usage monitoring
- Context window optimization
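
One common way to truncate a conversation is to keep the first message (the original task) and drop the oldest exchanges after it until the history fits a token budget. The sketch below shows that strategy under assumptions: the four-characters-per-token estimate, `estimateTokens`, and `truncateConversation` are illustrative and do not describe ContextManager's actual algorithm.

```typescript
interface Message { role: "user" | "assistant"; content: string }

// Rough heuristic assumed for illustration: about 4 characters per token.
function estimateTokens(messages: Message[]): number {
  let chars = 0
  for (const m of messages) chars += m.content.length
  return Math.ceil(chars / 4)
}

// Keeps the first message and removes the oldest assistant/user pair after it
// until the estimate fits the budget. Removing pairs preserves role alternation.
function truncateConversation(messages: Message[], maxTokens: number): Message[] {
  const kept = messages.slice()
  while (estimateTokens(kept) > maxTokens && kept.length > 3) {
    kept.splice(1, 2)
  }
  return kept
}

// Five messages of 40 characters each: about 50 tokens in total.
const conversation: Message[] = [0, 1, 2, 3, 4].map(
  (i): Message => ({
    role: i % 2 === 0 ? "user" : "assistant",
    content: `m${i}`.concat("x".repeat(38)),
  }),
)
const trimmed = truncateConversation(conversation, 30)
```

With a budget of 30 tokens, one pair is dropped, leaving the task message and the most recent exchange.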

2. Caching

- API response caching
- File content caching
- Model metadata caching

3. Streaming

- Real-time message streaming
- Partial content updates
- Efficient state synchronization

This architecture enables Cline to function as a sophisticated AI agent capable of complex software development tasks while maintaining user control, security, and performance.

cline-tool.md

Cline Tool Handling System

Overview

Cline implements a tool calling system that lets the AI agent interact with the file system, terminal, browser, and external services. The system supports 19 built-in tools (listed under Tool Structure) plus MCP (Model Context Protocol) server integration, with error handling and user approval mechanisms.

Tool Definitions

Tool Structure

Each tool is defined with the following structure:

interface ToolUse {
  type: "tool_use"
  name: ToolUseName
  params: Partial<Record<ToolParamName, string>>
  partial: boolean
}

// Tool names (19 built-in tools)
const toolUseNames = [
  "execute_command",           // Execute shell commands
  "read_file",                 // Read file contents
  "write_to_file",             // Create new files
  "replace_in_file",           // Edit existing files
  "search_files",              // Search files with regex
  "list_files",                // List directory contents
  "list_code_definition_names", // Parse code definitions
  "browser_action",            // Browser automation
  "use_mcp_tool",              // MCP tool execution
  "access_mcp_resource",       // MCP resource access
  "ask_followup_question",     // Ask clarifying questions
  "plan_mode_respond",         // Plan mode responses
  "load_mcp_documentation",    // Load MCP documentation
  "attempt_completion",        // Task completion
  "new_task",                  // Create new tasks
  "condense",                  // Context condensation
  "report_bug",                // Bug reporting
  "new_rule",                  // Create new rules
  "web_fetch"                  // Web content fetching
] as const

// Parameter names
const toolParamNames = [
  "command", "requires_approval", "path", "content", "diff", "regex",
  "file_pattern", "recursive", "action", "url", "coordinate", "text",
  "server_name", "tool_name", "arguments", "uri", "question", "options",
  "response", "result", "context", "title", "what_happened", "steps_to_reproduce",
  "api_request_output", "additional_context"
] as const

Tool Parameter Validation

Each tool has specific required parameters:

| Tool | Required Parameters | Description |
| --- | --- | --- |
| `execute_command` | `command`, `requires_approval` | Shell command to execute |
| `read_file` | `path` | File path to read |
| `write_to_file` | `path`, `content` | File path and content to write |
| `replace_in_file` | `path`, `diff` | File path and diff for replacement |
| `search_files` | `path`, `regex` | Directory path and search regex |
| `list_files` | `path` | Directory path to list |
| `list_code_definition_names` | `path` | Directory path for code parsing |
| `browser_action` | `action` | Browser action type |
| `use_mcp_tool` | `server_name`, `tool_name` | MCP server and tool names |
| `access_mcp_resource` | `server_name`, `uri` | MCP server and resource URI |
| `ask_followup_question` | `question` | Question to ask user |
| `plan_mode_respond` | `response` | Response for plan mode |
| `attempt_completion` | `result` | Completion result text |
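
The required-parameter table above maps naturally onto a lookup that a validator can check before execution. The sketch below is illustrative: `REQUIRED_PARAMS` (a subset of the table) and `missingParams` are hypothetical names, and the `ToolUseBlock` interface is a local copy of the `ToolUse` shape with plain string keys.

```typescript
// Local, simplified copy of the ToolUse shape.
interface ToolUseBlock {
  type: "tool_use"
  name: string
  params: Record<string, string | undefined>
  partial: boolean
}

// Required parameters per tool, copied from a subset of the table above.
const REQUIRED_PARAMS: Record<string, readonly string[]> = {
  execute_command: ["command", "requires_approval"],
  read_file: ["path"],
  write_to_file: ["path", "content"],
  replace_in_file: ["path", "diff"],
  search_files: ["path", "regex"],
  use_mcp_tool: ["server_name", "tool_name"],
  attempt_completion: ["result"],
}

// Returns the required parameters that are absent or empty for this tool call.
// Tools not present in the lookup are treated as having no required parameters.
function missingParams(block: ToolUseBlock): string[] {
  const required = REQUIRED_PARAMS[block.name] || []
  return required.filter((param) => !block.params[param])
}

const incomplete = missingParams({
  type: "tool_use",
  name: "execute_command",
  params: { command: "npm install" },
  partial: false,
})
const complete = missingParams({
  type: "tool_use",
  name: "read_file",
  params: { path: "src/extension.ts" },
  partial: false,
})
```

A caller would report each missing name back to the model as a tool error rather than executing the tool.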

Tool Call Detection & Parsing

Parsing Process

The system uses three parsing algorithms (V1, V2, V3) to detect tool calls in LLM responses:

// Parsing flow
LLM Response → parseAssistantMessageV2/V3() → ToolUse[] → ToolExecutor.executeTool()

// Example parsing
"<execute_command>\n<command>npm install</command>\n<requires_approval>true</requires_approval>\n</execute_command>"
↓
{
  type: "tool_use",
  name: "execute_command",
  params: {
    command: "npm install",
    requires_approval: "true"
  },
  partial: false
}

Streaming Support

The parser handles partial tool calls during streaming:

- partial: true indicates incomplete tool data
- Parameters are accumulated as they arrive
- Tool execution waits for partial: false
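
To make the parsing and `partial` behavior concrete, here is a deliberately small parser for a single XML-like tool call. It is a sketch, not one of Cline's V1/V2/V3 parsers: `parseToolUse` is a hypothetical function, it knows only a few tool and parameter names, and it ignores cases the real parsers must handle, such as parameter values that themselves contain tags.

```typescript
// Small subsets of the tool and parameter names, for the example only.
const TOOL_NAMES = ["execute_command", "read_file", "write_to_file"] as const
const PARAM_NAMES = ["command", "requires_approval", "path", "content"] as const

interface ParsedToolUse {
  type: "tool_use"
  name: string
  params: Record<string, string>
  partial: boolean
}

function parseToolUse(text: string): ParsedToolUse | undefined {
  for (const name of TOOL_NAMES) {
    const open = `<${name}>`
    const start = text.indexOf(open)
    if (start === -1) continue

    // The call is complete only once the closing tool tag has arrived.
    const end = text.indexOf(`</${name}>`, start)
    const body = text.slice(start + open.length, end === -1 ? undefined : end)

    const params: Record<string, string> = {}
    for (const param of PARAM_NAMES) {
      const paramOpen = `<${param}>`
      const paramStart = body.indexOf(paramOpen)
      if (paramStart === -1) continue
      const paramEnd = body.indexOf(`</${param}>`, paramStart)
      // An unclosed parameter keeps whatever has streamed in so far.
      params[param] = body
        .slice(paramStart + paramOpen.length, paramEnd === -1 ? undefined : paramEnd)
        .trim()
    }
    return { type: "tool_use", name, params, partial: end === -1 }
  }
  return undefined
}

const full = parseToolUse(
  "<execute_command>\n<command>npm install</command>\n<requires_approval>true</requires_approval>\n</execute_command>",
)
const streaming = parseToolUse("<execute_command>\n<command>npm inst")
```

Re-running the parser on the growing buffer after each chunk yields the same block with more parameter text each time, and `partial` becomes `false` once the closing tag is seen.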

Tool Execution Flow

Successful Tool Execution Sequence

sequenceDiagram
    participant LLM
    participant Parser
    participant ToolExecutor
    participant AutoApprove
    participant User
    participant ToolImpl
    participant Context
    
    LLM->>Parser: Tool use XML
    Parser->>ToolExecutor: ToolUse object
    ToolExecutor->>AutoApprove: shouldAutoApproveTool(toolName, path?)
    alt Auto-approved
        AutoApprove-->>ToolExecutor: true
        ToolExecutor->>ToolImpl: Execute tool
        ToolImpl-->>ToolExecutor: Tool result
        ToolExecutor->>Context: pushToolResult(result)
    else Requires approval
        AutoApprove-->>ToolExecutor: false
        ToolExecutor->>User: ask("tool", message)
        User-->>ToolExecutor: approval + optional feedback
        alt Approved
            ToolExecutor->>ToolImpl: Execute tool
            ToolImpl-->>ToolExecutor: Tool result
            ToolExecutor->>Context: pushToolResult(result)
        else Rejected
            User-->>ToolExecutor: rejection
            ToolExecutor->>Context: pushToolResult("Tool denied")
        end
    end
    ToolExecutor->>Context: saveCheckpoint()
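
The approval branch of the sequence above reduces to a short decision function. In this sketch the dependencies are injected so the flow runs without a UI. `ApprovalDeps` and `runToolWithApproval` are hypothetical names, the calls are synchronous for brevity, and checkpointing is left out.

```typescript
// Hypothetical dependencies standing in for the auto-approver, the user prompt,
// and the concrete tool implementation.
interface ApprovalDeps {
  shouldAutoApprove(toolName: string): boolean
  askUser(toolName: string, message: string): boolean
  execute(toolName: string, params: Record<string, string>): string
}

function runToolWithApproval(
  toolName: string,
  params: Record<string, string>,
  deps: ApprovalDeps,
): string {
  // The user is asked only when the tool is not auto-approved.
  const approved = deps.shouldAutoApprove(toolName) || deps.askUser(toolName, `Allow ${toolName}?`)
  if (!approved) return "Tool denied"
  return deps.execute(toolName, params)
}

// Example wiring: reads are auto-approved, and the simulated user rejects everything else.
const log: string[] = []
const deps: ApprovalDeps = {
  shouldAutoApprove: (tool) => tool === "read_file",
  askUser: (tool) => {
    log.push(`asked:${tool}`)
    return false
  },
  execute: (tool, params) => {
    log.push(`ran:${tool}`)
    return `ok ${params.path}`
  },
}

const readResult = runToolWithApproval("read_file", { path: "a.ts" }, deps)
const commandResult = runToolWithApproval("execute_command", { command: "rm -rf build" }, deps)
```

In both branches the returned string is what would be pushed back into the conversation as the tool result.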

Error Handling Flow

sequenceDiagram
    participant ToolExecutor
    participant ErrorHandler
    participant User
    participant Context
    
    ToolExecutor->>ToolExecutor: Execute tool
    alt Tool execution error
        ToolExecutor->>ErrorHandler: handleError(action, error, block)
        ErrorHandler->>User: say("error", errorMessage)
        ErrorHandler->>Context: pushToolResult(formatResponse.toolError(error))
        ErrorHandler->>Context: saveCheckpoint()
    else Parameter validation error
        ToolExecutor->>User: sayAndCreateMissingParamError(toolName, paramName)
        ToolExecutor->>Context: pushToolResult(missingParamError)
        ToolExecutor->>Context: saveCheckpoint()
    else User rejection
        ToolExecutor->>Context: pushToolResult("Tool denied")
        ToolExecutor->>Context: saveCheckpoint()
    end

Tool Implementation Details

1. File Operations

- read_file: Reads file content with .clineignore filtering
- write_to_file: Creates new files with content validation
- replace_in_file: Uses diff-based editing with streaming JSON replacement for Claude 4 models

2. Terminal Operations

- execute_command: Executes shell commands with approval system
- Supports auto-approval for "safe" commands
- Real-time output streaming with chunked buffering

3. Browser Operations

- browser_action: Puppeteer-based browser automation
- Actions: launch, click, type, scroll, close
- Screenshot and console log capture

4. Search Operations

- search_files: Regex-based file search using ripgrep
- list_files: Directory listing with recursive option
- list_code_definition_names: Tree-sitter based code parsing

5. MCP Integration

- use_mcp_tool: Execute MCP server tools
- access_mcp_resource: Access MCP server resources
- Real-time notification handling

6. Context Management

- ask_followup_question: Interactive user questions
- plan_mode_respond: Plan mode responses
- condense: Context window management
- attempt_completion: Task completion handling

Error Handling Mechanisms

1. Parameter Validation

// Missing parameter detection
if (!relPath) {
    this.taskState.consecutiveMistakeCount++
    this.pushToolResult(await this.sayAndCreateMissingParamError("read_file", "path"), block)
    await this.saveCheckpoint()
    break
}

2. Execution Error Handling

private handleError = async (action: string, error: Error, block: ToolUse) => {
    const errorString = `Error ${action}: ${JSON.stringify(serializeError(error))}`
    await this.say("error", `Error ${action}:\n${error.message}`)
    this.pushToolResult(formatResponse.toolError(errorString), block)
}

3. User Rejection Handling

private askApproval = async (type: ClineAsk, block: ToolUse, message: string) => {
    const { response, text, images, files } = await this.ask(type, message, false)
    if (response !== "yesButtonClicked") {
        this.pushToolResult(formatResponse.toolDenied(), block)
        return false
    }
    return true
}

4. Auto-Approval System

private shouldAutoApproveTool(toolName: ToolUseName): boolean | [boolean, boolean] {
    return this.autoApprover.shouldAutoApproveTool(toolName)
}

Tool Result Formatting

Success Results

// File operations
this.pushToolResult(content, block)

// Command execution
this.pushToolResult(`Command executed.\n${result}`, block)

// Browser actions
this.pushToolResult(
    formatResponse.toolResult(
        `Browser action completed.\nLogs: ${logs}`,
        [screenshot]
    ),
    block
)

Error Results

// Parameter errors
formatResponse.toolError(formatResponse.missingToolParameterError(paramName))

// Execution errors
formatResponse.toolError(`Error reading file: ${error.message}`)

// User rejection
formatResponse.toolDenied()

Integration Points

1. LLM Integration

- Tools are presented to the LLM via the system prompt
- The LLM generates XML-like tool calls
- The parser extracts tool calls from the response

2. State Management

- Tool results are added to conversation history
- Checkpoints are created after tool execution
- Context is updated with tool results

3. User Interface

- Real-time tool execution updates
- Approval dialogs for sensitive operations
- Error notifications and recovery options

4. Security

- .clineignore file filtering
- Path validation and sanitization
- Command approval system
- MCP server access control

This comprehensive tool handling system enables Cline to safely and effectively interact with the development environment while maintaining user control and security.