@Donavan
Last active October 5, 2025 16:30
Human / Agent collaboration is crucial

The latest client for Agent C has been developed exclusively by agents in a manner that I myself explicitly discourage.

My #1 rule is that agent instructions should be tuned by experts in the task the agents are being built for. I am most definitely NOT a TypeScript / React developer, and I'm not at all good at CSS.

My #2 rule is that the "driver" knows enough about the task being performed to be able to head off mistakes and provide technical guidance to the agents. As I said, that's NOT me.

However, I am the architect behind this framework and have built MANY clients for it that were not web clients. While nowhere near as effective as an actual TypeScript / React dev would have been, I've been able to work the agents through many of their difficulties.

When things look ALMOST like a duck and they "quock" instead of "quack" it's easy to confuse agents

While there are many chat apps, and even many "chat with agents" apps out there, none of them have an event stream like the one in Agent C. But to an agent, this looks SO CLOSE to all those other apps / frameworks that they often make false assumptions.

One of the things that makes the Agent C Framework special is not just that it's fully event driven, but that our agent tools not only allow the agent to send things to the UI, they allow tools to fully participate in the event stream as well. The tool agents use to rename sessions talks to the same RealtimeBridge that the client does; it even sends the exact same control event to rename the chat session as the client. Tools can render media in the chat, send error/warning/info messages, etc. And then there are the delegation tools, which likely result in thousands of events being generated as they funnel all of the session events for the delegated agent to the UI to display as well.

While the agents had produced an app far superior to the old client, it was clear to me that fundamental mistakes had been made in processing the event stream, and that these were leading to many issues with the handling of tool calls. In order to correct these sorts of fundamental issues it is CRITICAL that one ensures that the agents actually understand the task at hand.

It's not enough to simply hand them new requirements and say "go fix it"; you're risking yet another misunderstanding. My process for this is the same sort of thing I'd do with any junior dev working on their first complex assignment, just with way more grunt work for them.

Starting at the lowest level agent:

  1. Give them the new requirements, and task them with coming up with a detailed design for what needs to be changed, how, and why.
  2. Review, really review, their design.
  3. Provide feedback on design, have them refine.
  4. Repeat 2 & 3 as needed.

Work your way downstream

  1. Give the next package dev the same requirements along with the design from the previous agent and task them with refining the design to ensure accuracy for how their package needs to change.
  2. Review, really review, their design.
  3. Provide feedback on design, have them refine.
  4. Repeat 2 & 3 as needed.

Fixing the event stream handling.

As I said above, there had been fundamental mistakes made in the handling of the event stream. They did not become apparent until many of the more advanced events were being rendered in the chat. The notifications that a tool call was about to be made were being "orphaned" and left up long after the tool calls had completed. This made me realize both what mistake had been made and why it had been made.

  • I had laid out the sequence of events involved in a tool call to them: tool_select -> tool_call_active -> tool_call_complete.
  • I told them what they should do in response to each event.
  • I did NOT tell them explicitly, "you almost certainly will receive other events in between"
  • I did not tell them explicitly "you will receive events from sub sessions" and all that entails

Because I did not do those things, and in spite of the fact that our events have fields like parent_session_id in them, the agents saw what looked like every other chat with agents app on the planet and wrote incorrect code.

Step 1: Give them new requirements

My opening message to the agent explained the issues I was seeing in the chat display, and how they made me suspect there was a fundamental misunderstanding in how events were being handled. I explained the importance of tracking the session id for chat content all the way down to the UI so it could be used to display the tool calls properly. I explained how things could go wrong in ways that would cause the client to never be sent an event, leading to orphaned tool calls, and gave them a simple way to add safety nets to ensure that didn't happen. I spent several hours both writing explanations and gathering example data.

Step 2: Review

Looking at their plan, it was clear that there were still some misunderstandings involved. They were still fixated on the sequence of events and treating things too linearly. The prompt I gave reframed things for them, allowing them to both correct and simplify their design:

That’s CLOSE, but it has a critical flaw I think…

Backward attachment should be the norm but it's a little more complicated. The vast majority of the time an agent will start their turn with a message or a thought, followed by tool calls. Then another response, then more tool calls etc.

  • Because each message or thought that precedes the tool calls explains why the tool calls are being made, they should be attached to that.

  • In some cases, not super common but common enough to warrant attention, the interaction will END with tool calls, so again attaching them to the prior message makes sense.

  • On very rare occasions, the agent will start the interaction with one or more tool calls and THEN say something, in which case it makes sense to attach them to the next thought / message.

It’s that last case that we need to handle... The way you have the flow documented isn't quite right, or at least it's lacking specificity. I'm going to try and clarify below because it's largely my fault for not adding more events so things were more concrete...

First, as far as tool calls go, the only event that REALLY matters is that last tool_call with active set to false. The other ones are strictly for UX purposes so that the user doesn't think things are hung when an agent is writing out a long file or something.

Instead of thinking of this as a series of events at ALL, it's really more:

tool_select comes in

  • grab the session_id, the id of the tool call and the name of the tool
  • update the UI to show that the agent is preparing to use the tool.

tool_call with active set to true

  • grab the session_id, the id of the tool call and the name of the tool
  • update the UI to show that the agent is using the tool. (by matching session_id and tool use id)

tool_call (active=false) → Tool completed

  • update the UI to remove the "agent is using" notification (by matching session_id and tool use id)
  • find PREVIOUS assistant/thought message (by matching session_id)
    • attach tool call results to it
    • EMIT message-updated event
  • If there isn't a previous assistant/thought message then hold on to it
    • When the next assistant thought / message comes in for that session id, if we have tool call results we've been holding on to, attach the results to that message.

The end result

The file below contains the final plan for this work. Once implemented, it will correct the issues we have now and ensure that we're on good footing for the events to follow. I was able to correct the fundamental issues with the original implementation only because I am the definitive expert on this event stream and know every event, when certain events are or are not possible, etc., and even then only because I've finally gained enough knowledge of the TypeScript code in the core package to at least follow it. An ACTUAL TypeScript developer with access to me in order to ask questions would have gotten it right the first time.

Tool Call Event Handling - Core Fixes Plan

Date: 2025-10-05

Owner: Eve (Event Stream Specialist)

Status: READY FOR REVIEW


Executive Summary

The tool call notification and display system has fundamental architectural issues in core that require fixes across multiple layers. The primary problems are:

  1. ToolCallManager lacks session tracking - Cannot distinguish tool calls from different sessions (main vs sub-sessions)
  2. No cleanup mechanisms for orphaned notifications - Tools stay "active" after interactions end
  3. Tool calls attached at wrong time - Need to attach to PREVIOUS message (normal case) or buffer for NEXT message (rare case)
  4. Resumed sessions skip tool calls - Explicitly commented out with "skip for now"

These are NOT primarily UI issues. The core event processing logic needs significant fixes.

The Core Flow (Simplified)

Tip

Stop thinking about tool events as a sequence. Only the tool_call(active=false) event matters for results.

When tool_call (active=false) arrives:

1. Remove "Agent is using X" notification (match by session_id + tool_call_id)

2. Try to find PREVIOUS assistant/thought message in SAME session_id
   ✅ Found: Attach tool calls/results → emit message-updated event
   ❌ Not found: Buffer tool calls for this session_id

3. When NEXT assistant/thought arrives for this session_id:
   - Check if we have buffered tool calls
   - If yes, attach them to this message

Why this works:

  • Normal case (99% of time): Agent says something → uses tools → attach to previous message
  • Rare case: Interaction starts with tools → buffer → attach to next message
  • Session isolation: Buffering is per-session, so sub-sessions don't interfere
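
To make the attach-or-buffer rule concrete, here is a minimal, self-contained sketch. It is not the Agent C implementation: `ToolResultAttacher`, `ChatMessage`, and `ToolCallResult` are hypothetical names invented for illustration, and treating a user message as the boundary of the current interaction is an assumption of this sketch rather than something the plan specifies.

```typescript
// Hypothetical, simplified shapes; the real Agent C types carry more fields.
interface ToolCallResult {
  toolCallId: string;
  content: string;
}

interface ChatMessage {
  sessionId: string;
  role: 'user' | 'assistant' | 'thought';
  text: string;
  toolResults: ToolCallResult[];
}

class ToolResultAttacher {
  readonly messages: ChatMessage[] = [];
  // Completed tool calls still waiting for a message, keyed by session_id.
  private readonly pending = new Map<string, ToolCallResult[]>();

  /** Handle tool_call (active=false). Returns true if attached backward. */
  onToolCallComplete(sessionId: string, results: ToolCallResult[]): boolean {
    // Walk newest-first, looking only at messages from this session.
    const newestFirst = [...this.messages].reverse();
    for (const msg of newestFirst) {
      if (msg.sessionId !== sessionId) continue;
      // Assumption: a user message marks the start of the current
      // interaction, so anything older belongs to an earlier turn.
      if (msg.role === 'user') break;
      msg.toolResults.push(...results); // append, never overwrite
      return true;
    }
    // Rare case: nothing to attach to yet, so buffer per session.
    const buffered = this.pending.get(sessionId) ?? [];
    buffered.push(...results);
    this.pending.set(sessionId, buffered);
    return false;
  }

  /** Handle a finalized message; assistant/thought messages drain the buffer. */
  onMessage(sessionId: string, role: ChatMessage['role'], text: string): ChatMessage {
    const msg: ChatMessage = { sessionId, role, text, toolResults: [] };
    if (role !== 'user') {
      msg.toolResults.push(...(this.pending.get(sessionId) ?? []));
      this.pending.delete(sessionId);
    }
    this.messages.push(msg);
    return msg;
  }
}

// Usage: the rare case, where the interaction starts with a tool call.
const attacher = new ToolResultAttacher();
attacher.onMessage('main', 'user', 'Check the file, please.');
attacher.onToolCallComplete('main', [{ toolCallId: 'toolu_a', content: 'file contents' }]);
const reply = attacher.onMessage('main', 'assistant', 'Here is what I found.');
console.log(reply.toolResults.length); // 1
```

Because the buffer is keyed by session, a completion arriving from a sub-session never lands on a main-session message, and vice versa.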

Critical Understanding: Event Stream Nature

Important

The Agent C event stream is HIGHLY INTERRUPTIBLE by design.

Between any two tool events (tool_select_delta → tool_call active → tool_call complete), the following can and WILL occur:

  • System messages from tools
  • Render media events from tools
  • Delegation tool sub-session events (potentially thousands)
  • Other agent interactions
  • Session events from different sessions

This is BY DESIGN and EXPECTED. The current implementation assumes sequential tool events without interruption - this is fundamentally wrong.

Tool Event Lifecycle - Simplified

Tip

Key Insight: tool_select_delta and tool_call(active=true) are ONLY for UX notifications. The tool_call(active=false) event is what actually matters for displaying tool results.

Event 1: tool_select_delta (UX only)

{
  "type": "tool_select_delta",
  "session_id": "edgar-special",
  "tool_calls": [{"id": "toolu_01...", "name": "workspace_read"}]
}

Action: Show "Agent is preparing to use workspace_read" notification

  • Match by: session_id + tool_call.id
  • Update UI notification only

Event 2: tool_call (active=true) (UX only)

{
  "type": "tool_call",
  "session_id": "edgar-special",
  "active": true,
  "tool_calls": [{"id": "toolu_01...", "name": "workspace_read", "input": "{...}"}]
}

Action: Update to "Agent is using workspace_read" notification

  • Match by: session_id + tool_call.id
  • Update existing notification only

Event 3: tool_call (active=false) ⚠️ THIS IS THE IMPORTANT ONE

{
  "type": "tool_call",
  "session_id": "edgar-special",
  "active": false,
  "tool_calls": [{"id": "toolu_01...", "input": "{...}"}],
  "tool_results": [{"tool_use_id": "toolu_01...", "content": "..."}]
}

Actions (in order):

  1. Remove notification: Clear "Agent is using" UI (match by session_id + tool_call.id)
  2. Try backward attachment: Find PREVIOUS assistant/thought message in SAME session_id
    • If found: Attach tool calls/results to it + emit message-updated event
    • If NOT found: Buffer the tool calls (rare case - interaction starts with tools)
  3. Next message: When next assistant/thought arrives for this session_id, attach any buffered tool calls

Between ANY of these events: System messages, render media, delegation sub-sessions, etc. can and WILL occur. This is BY DESIGN.
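
The three payloads above can be modeled as a small discriminated union, which lets the compiler enforce that only `type` and `active` decide the phase. This is a sketch inferred from the JSON examples only; the interface names, the `ToolPhase` type, and the `toolPhase` helper are assumptions for illustration and may not match the real core package types.

```typescript
// Shapes inferred from the example payloads above; real events may carry more fields.
interface ToolCallRef {
  id: string;
  name?: string;   // absent in the active=false example above
  input?: string;
}

interface ToolResultRef {
  tool_use_id: string;
  content: string;
}

interface ToolSelectDeltaEvent {
  type: 'tool_select_delta';
  session_id: string;
  tool_calls: ToolCallRef[];
}

interface ToolCallEvent {
  type: 'tool_call';
  session_id: string;
  active: boolean;
  tool_calls: ToolCallRef[];
  tool_results?: ToolResultRef[];
}

type ToolEvent = ToolSelectDeltaEvent | ToolCallEvent;
type ToolPhase = 'preparing' | 'executing' | 'complete';

/** Classify a tool event without relying on what arrived before it. */
function toolPhase(event: ToolEvent): ToolPhase {
  if (event.type === 'tool_select_delta') return 'preparing';
  return event.active ? 'executing' : 'complete';
}

// Usage: only the 'complete' phase carries results worth attaching.
const raw =
  '{"type":"tool_call","session_id":"edgar-special","active":false,' +
  '"tool_calls":[{"id":"toolu_01"}],' +
  '"tool_results":[{"tool_use_id":"toolu_01","content":"..."}]}';
const parsed = JSON.parse(raw) as ToolEvent;
console.log(toolPhase(parsed)); // complete
```

Classifying each event on its own fields, rather than on its position in the stream, is what makes the handler safe against interleaved events.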

Why Session Tracking is Critical

Common Scenario:

1. Main session: Agent says "I'll check the file..."
2. Main session: tool_select_delta for workspace_read
3. Main session: tool_call (active=true)
4. [Tool uses act_oneshot - creates SUB-SESSION]
5. Sub-session: 1000+ events (entire delegated agent interaction)
6. Main session: tool_call (active=false) with results

Without session_id tracking:

  • Can't match step 6 back to step 2 (thousands of events in between)
  • Sub-session tool calls interfere with main session notifications
  • Can't find correct message to attach results to

With session_id tracking:

  • Match by session_id + tool_call_id → works regardless of intervening events
  • Sub-session has different session_id → no interference
  • Search for previous message IN SAME session_id → correct attachment
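
The scenario above can be simulated in a few lines. The sketch below uses hypothetical names (`StreamEvent`, `NotificationTracker`) and a deliberately reduced event shape; it only demonstrates that keying notifications by session_id plus tool call id survives a thousand interleaved sub-session events, including a sub-session tool call that reuses the same id.

```typescript
// Reduced, hypothetical event shape for the simulation only.
type StreamEvent =
  | { kind: 'tool_start'; sessionId: string; toolCallId: string; toolName: string }
  | { kind: 'tool_done'; sessionId: string; toolCallId: string }
  | { kind: 'other'; sessionId: string };

interface ActiveTool {
  sessionId: string;
  toolName: string;
}

class NotificationTracker {
  private readonly active = new Map<string, ActiveTool>();

  private key(sessionId: string, toolCallId: string): string {
    return `${sessionId}:${toolCallId}`;
  }

  handle(event: StreamEvent): void {
    switch (event.kind) {
      case 'tool_start':
        this.active.set(this.key(event.sessionId, event.toolCallId), {
          sessionId: event.sessionId,
          toolName: event.toolName,
        });
        break;
      case 'tool_done':
        this.active.delete(this.key(event.sessionId, event.toolCallId));
        break;
      case 'other':
        break; // unrelated events never touch tool state
    }
  }

  /** Names of tools currently shown as active for one session. */
  activeFor(sessionId: string): string[] {
    const names: string[] = [];
    this.active.forEach((tool) => {
      if (tool.sessionId === sessionId) names.push(tool.toolName);
    });
    return names;
  }
}

// Usage: main session starts a delegation tool, the sub-session floods the stream.
const tracker = new NotificationTracker();
tracker.handle({ kind: 'tool_start', sessionId: 'main', toolCallId: 'toolu_01', toolName: 'act_oneshot' });
for (let i = 0; i < 1000; i++) tracker.handle({ kind: 'other', sessionId: 'sub-1' });
tracker.handle({ kind: 'tool_start', sessionId: 'sub-1', toolCallId: 'toolu_01', toolName: 'workspace_read' });
tracker.handle({ kind: 'tool_done', sessionId: 'sub-1', toolCallId: 'toolu_01' });
console.log(tracker.activeFor('main')); // [ 'act_oneshot' ]
tracker.handle({ kind: 'tool_done', sessionId: 'main', toolCallId: 'toolu_01' });
console.log(tracker.activeFor('main')); // []
```

With a map keyed by tool call id alone, the sub-session's `tool_done` in this run would have removed the main session's notification early.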

Session Context

All SessionEvents include:

  • session_id - The chat session this event came from (may be a sub-session)
  • role - Role that triggered the event
  • parent_session_id - Parent session if this is a sub-session (optional)
  • user_session_id - ALWAYS the user's main chat session ID

Why This Matters:

  • Tool calls from delegation tools (sub-sessions) have DIFFERENT session_ids
  • Messages from sub-sessions have DIFFERENT session_ids
  • Must match tool calls to messages in SAME session_id
  • Current code doesn't track session_id at all → everything breaks with sub-sessions
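
As a rough illustration of how these fields might be used, the sketch below defines a base shape from the four fields listed above and two small helpers. `SessionEventBase`, `isSubSessionEvent`, and `eventsForSession` are hypothetical names, and the test for a sub-session (session_id differing from user_session_id) is an assumption drawn from the field descriptions, not a confirmed rule of the framework.

```typescript
// Base shape built from the four fields listed above; names are illustrative.
interface SessionEventBase {
  session_id: string;
  role: string;
  parent_session_id?: string;
  user_session_id: string;
}

/**
 * Assumption: an event belongs to a sub-session when its session_id differs
 * from user_session_id, which always names the user's main chat session.
 */
function isSubSessionEvent(event: SessionEventBase): boolean {
  return event.session_id !== event.user_session_id;
}

/** Keep only events from one session, preserving stream order. */
function eventsForSession<T extends SessionEventBase>(events: T[], sessionId: string): T[] {
  return events.filter((event) => event.session_id === sessionId);
}

// Usage: two main-session events with a delegated sub-session event in between.
const stream: SessionEventBase[] = [
  { session_id: 'main', role: 'assistant', user_session_id: 'main' },
  { session_id: 'sub-1', role: 'assistant', parent_session_id: 'main', user_session_id: 'main' },
  { session_id: 'main', role: 'assistant', user_session_id: 'main' },
];
console.log(stream.map(isSubSessionEvent)); // [ false, true, false ]
console.log(eventsForSession(stream, 'main').length); // 2
```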

Problems Identified (From Code Review)

Problem 1: ToolCallManager Has No Session Tracking ⚠️⚠️⚠️ CRITICAL

File: packages/core/src/events/ToolCallManager.ts

Current Code:

export class ToolCallManager {
  private activeTools: Map<string, ToolNotification> = new Map();
  //                          ^^^^^^ Only tool_call_id!
  
  onToolSelect(event: ToolSelectDeltaEvent): ToolNotification {
    const toolCall = event.tool_calls[0];
    const notification: ToolNotification = { id: toolCall.id, ... };
    
    this.activeTools.set(toolCall.id, notification);  // ❌ No session_id!
  }
}

Problems This Causes:

  1. Tool calls from main session and sub-sessions collide (same tool_call_id)
  2. Cannot match tool events to correct session when events are interleaved
  3. Cannot clear notifications for a specific session when interaction ends
  4. System messages arriving between tool events can break matching

Evidence From User:

"Currently it seems like SystemMessageEvents at least seem to cause orphaned tool call notifications in the chat."

Why: When a SystemMessage event arrives between tool_select_delta and tool_call, the code can't distinguish which session's tool call it belongs to.

Problem 2: No Cleanup on Interaction End

File: packages/core/src/events/EventStreamProcessor.ts line ~360

Current Code:

private handleInteraction(event: InteractionEvent): void {
  if (event.started) {
    this.messageBuilder.reset();
    this.toolCallManager.reset();  // Only reset on NEW interaction
  } else {
    Logger.info(`[EventStreamProcessor] Interaction ended: ${event.id}`);
    // ❌ NO CLEANUP!
  }
}

Problem: When an interaction ends (event.started === false), any tool calls still in "preparing" or "executing" state are orphaned but never cleared.

User Observation:

"When our last interaction completed there were three orphaned tool call notifications in the list: 'Agent is using workspace_read.', 'Agent is using act_oneshot.', 'Agent is using workspace_read'"

Problem 3: No Safety Net on User Turn Start

File: packages/core/src/events/EventStreamProcessor.ts

Problem: No handler for user_turn_start event.

Why This Matters: If the user's turn is starting, the agent CANNOT be using tools. This is a guaranteed cleanup point that should clear ALL active notifications as a safety net.

Problem 4: Tool Calls Attached at Wrong Time ⚠️⚠️ CRITICAL

File: packages/core/src/events/EventStreamProcessor.ts line ~395

Current Flow (INCORRECT):

1. text_delta events → Building assistant message
2. completion event → Finalize message
   handleCompletion() {
     const completedToolCalls = toolCallManager.getCompletedToolCalls();
     // ❌ Attaches tool calls from PREVIOUS interaction!
     message = messageBuilder.finalize({ toolCalls, toolResults });
   }
3. tool_select_delta → Tool starting
4. tool_call (active=true) → Tool executing
5. tool_call (active=false) → Tool completed
   toolCallManager stores the completed calls
6. Next text_delta → New message starts
7. Next completion → Finalize WITH tool calls from step 5
   // ❌ Tool calls appear on WRONG message!

Root Cause: Tool calls are attached in handleCompletion() which gets tool calls from ToolCallManager.getCompletedToolCalls(). But these are from the PREVIOUS interaction, not the current message.

Correct Flow Should Be:

1-2. Message finalized WITHOUT tool calls (correct - tools haven't run yet)
3-4. Tool events arrive
5. tool_call (active=false) → Tool completed
   → IMMEDIATELY find PREVIOUS assistant/thought message
   → ATTACH tool calls to THAT message
   → EMIT message-updated event
6-7. Next message finalized WITHOUT old tool calls

Problem 5: Resumed Sessions Explicitly Skip Tool Calls

File: packages/core/src/events/EventStreamProcessor.ts line ~1250+

Current Code:

private processAssistantMessageForResume(...) {
  if (isToolUseBlockParam(block)) {
    if (block.name === 'think') {
      // ... handle think tool ...
    }
    if (this.isDelegationTool(block.name)) {
      // ... handle delegation ...
    }
    
    // Regular tool calls - skip for now in resume
    messagesConsumed = 1;  // ❌ EXPLICITLY SKIPPED!
  }
}

Problem: Tool calls in resumed sessions are intentionally skipped, as the "skip for now" comment shows. This is why tool calls don't appear in resumed chat sessions.

What's Needed: Extract tool_use blocks, match with tool_result blocks from next user message, attach to message.metadata.
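
The matching step can be sketched independently of the processor. The block shapes below follow the common `tool_use` / `tool_result` content-block convention, simplified so that result content is always a string; `HistoryMessage`, `PairedToolCall`, and `pairToolCalls` are hypothetical names, and the think and delegation special cases are deliberately left out.

```typescript
// Simplified content blocks; real history messages may allow more block types
// and non-string result content.
interface TextBlock { type: 'text'; text: string }
interface ToolUseBlock { type: 'tool_use'; id: string; name: string; input: unknown }
interface ToolResultBlock { type: 'tool_result'; tool_use_id: string; content: string }
type ContentBlock = TextBlock | ToolUseBlock | ToolResultBlock;

interface HistoryMessage {
  role: 'user' | 'assistant';
  content: ContentBlock[];
}

interface PairedToolCall {
  id: string;
  name: string;
  input: unknown;
  result: string | undefined; // undefined when no matching tool_result exists
}

/** Pair each tool_use block with the tool_result in the following user message. */
function pairToolCalls(assistant: HistoryMessage, next?: HistoryMessage): PairedToolCall[] {
  // Index results by tool_use_id so block order does not matter.
  const results = new Map<string, string>();
  if (next && next.role === 'user') {
    for (const block of next.content) {
      if (block.type === 'tool_result') results.set(block.tool_use_id, block.content);
    }
  }

  const paired: PairedToolCall[] = [];
  for (const block of assistant.content) {
    if (block.type !== 'tool_use') continue;
    paired.push({ id: block.id, name: block.name, input: block.input, result: results.get(block.id) });
  }
  return paired;
}

// Usage: one tool call with a result in the next user message.
const assistantMsg: HistoryMessage = {
  role: 'assistant',
  content: [
    { type: 'text', text: "I'll check the file..." },
    { type: 'tool_use', id: 'toolu_01', name: 'workspace_read', input: { path: 'notes.md' } },
  ],
};
const userMsg: HistoryMessage = {
  role: 'user',
  content: [{ type: 'tool_result', tool_use_id: 'toolu_01', content: 'file contents' }],
};
console.log(pairToolCalls(assistantMsg, userMsg)[0]?.result); // file contents
```

Matching on `tool_use_id` rather than on position keeps the pairing correct when an assistant message contains several tool calls.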

Problem 6: Thought Messages Missing Footer

File: packages/ui/src/components/chat/Message.tsx

Problem: ThoughtMessage component doesn't use MessageFooter, so tool calls aren't displayed even if properly attached.

Note: This is a UI issue, but included here for completeness.


Solution Architecture

Fix 1: Add Session Tracking to ToolCallManager ⚠️ FOUNDATION

File: packages/core/src/events/ToolCallManager.ts

Changes Required:

1.1 Update ToolNotification interface:

export interface ToolNotification {
  id: string;
  sessionId: string;  // ✅ ADD THIS
  toolName: string;
  status: 'preparing' | 'executing' | 'complete';
  timestamp: Date;
  arguments?: string;
}

1.2 Change activeTools key structure:

export class ToolCallManager {
  // Keys are now: `${session_id}:${tool_call_id}`
  private activeTools: Map<string, ToolNotification> = new Map();
  
  private makeKey(sessionId: string, toolCallId: string): string {
    return `${sessionId}:${toolCallId}`;
  }
}

1.3 Update all methods to use session_id:

onToolSelect(event: ToolSelectDeltaEvent): ToolNotification {
  const toolCall = event.tool_calls[0];
  if (!toolCall) throw new Error('ToolSelectDeltaEvent has no tool calls');
  
  const notification: ToolNotification = {
    id: toolCall.id,
    sessionId: event.session_id,  // ✅ Extract from event
    toolName: toolCall.name,
    status: 'preparing',
    timestamp: new Date(),
    arguments: JSON.stringify(toolCall.input)
  };
  
  const key = this.makeKey(event.session_id, toolCall.id);  // ✅ Use session_id
  this.activeTools.set(key, notification);
  
  Logger.info(`[ToolCallManager] Tool selected: ${toolCall.name}`, {
    sessionId: event.session_id,
    id: toolCall.id
  });
  
  return notification;
}

onToolCallActive(event: ToolCallEvent): ToolNotification | null {
  if (!event.active) return null;
  
  const toolCall = event.tool_calls[0];
  if (!toolCall) return null;
  
  const key = this.makeKey(event.session_id, toolCall.id);  // ✅ Use session_id
  const notification = this.activeTools.get(key);
  
  if (notification) {
    notification.status = 'executing';
    Logger.info(`[ToolCallManager] Tool executing: ${notification.toolName}`, {
      sessionId: event.session_id,
      id: toolCall.id
    });
    return notification;
  }
  
  // Create notification if not found (edge case handling)
  const newNotification: ToolNotification = {
    id: toolCall.id,
    sessionId: event.session_id,  // ✅ Extract from event
    toolName: toolCall.name,
    status: 'executing',
    timestamp: new Date(),
    arguments: JSON.stringify(toolCall.input)
  };
  
  this.activeTools.set(key, newNotification);
  return newNotification;
}

onToolCallComplete(event: ToolCallEvent): ToolCallWithResult[] {
  if (event.active) return [];
  
  const newlyCompleted: ToolCallWithResult[] = [];
  
  event.tool_calls.forEach(toolCall => {
    const key = this.makeKey(event.session_id, toolCall.id);  // ✅ Use session_id
    
    // Mark as complete and remove from active
    const notification = this.activeTools.get(key);
    if (notification) {
      notification.status = 'complete';
    }
    this.activeTools.delete(key);
    
    // Find result and add to completed list
    const result = event.tool_results?.find(r => r.tool_use_id === toolCall.id);
    const completedCall: ToolCallWithResult = { ...toolCall, result };
    
    this.completedToolCalls.push(completedCall);
    newlyCompleted.push(completedCall);
    
    Logger.info(`[ToolCallManager] Tool completed: ${toolCall.name}`, {
      sessionId: event.session_id,
      id: toolCall.id,
      hasResult: !!result
    });
  });
  
  return newlyCompleted;
}

1.4 Add session-specific cleanup methods:

/**
 * Clear tool notifications for a specific session
 * Used when an interaction ends or session changes
 */
clearSessionNotifications(sessionId: string): void {
  const keysToDelete: string[] = [];
  
  for (const [key, notification] of this.activeTools) {
    if (notification.sessionId === sessionId) {
      keysToDelete.push(key);
      Logger.debug(`[ToolCallManager] Clearing notification for ${notification.toolName} in session ${sessionId}`);
    }
  }
  
  keysToDelete.forEach(key => this.activeTools.delete(key));
  
  if (keysToDelete.length > 0) {
    Logger.info(`[ToolCallManager] Cleared ${keysToDelete.length} orphaned notifications for session ${sessionId}`);
  }
}

/**
 * Clear all active tool notifications
 * Used as safety net when user_turn_start arrives
 */
clearAllActiveNotifications(): void {
  const count = this.activeTools.size;
  this.activeTools.clear();
  
  if (count > 0) {
    Logger.info(`[ToolCallManager] Cleared ${count} active notifications (user turn start safety net)`);
  }
}

Rationale: Session tracking is the FOUNDATION for all other fixes. Without it, we cannot:

  • Match tool events across interleaved streams
  • Clear notifications for specific sessions
  • Handle sub-sessions correctly

Fix 2: Clear Notifications on Interaction End

File: packages/core/src/events/EventStreamProcessor.ts line ~360

Changes Required:

private handleInteraction(event: InteractionEvent): void {
  if (event.started) {
    Logger.info(`[EventStreamProcessor] Interaction started: ${event.id}`);
    this.messageBuilder.reset();
    this.toolCallManager.reset();
  } else {
    Logger.info(`[EventStreamProcessor] Interaction ended: ${event.id}`);
    
    // ✅ Clear any orphaned tool notifications for this session
    // If interaction ended, any tools still "active" are orphaned
    this.toolCallManager.clearSessionNotifications(event.session_id);
    
    // Emit event for UI updates
    this.sessionManager.emit('interaction-ended', {
      sessionId: event.session_id,
      interactionId: event.id
    });
  }
}

Rationale: When InteractionEvent with started=false arrives, it signals the interaction has definitively ended. Any tools still "active" at this point are orphaned and should be cleaned up.

Fix 3: Add User Turn Start Handler (Safety Net)

File: packages/core/src/events/EventStreamProcessor.ts

Changes Required:

3.1 Add case to processEvent switch:

case 'user_turn_start':
  this.handleUserTurnStart(event as UserTurnStartEvent);
  break;

3.2 Add handler method:

/**
 * Handle user turn start - safety net to clear orphaned notifications
 */
private handleUserTurnStart(event: UserTurnStartEvent): void {
  Logger.debug('[EventStreamProcessor] User turn started - clearing all tool notifications (safety net)');
  
  // ✅ Clear ALL active tool notifications as safety net
  // If user's turn is starting, agent CANNOT be using tools
  this.toolCallManager.clearAllActiveNotifications();
  
  // Emit event for UI
  this.sessionManager.emit('user-turn-start', {});
}

Rationale:

  • User turn start is a GUARANTEED signal that agent is not active
  • Acts as safety net to catch any orphaned notifications
  • User explicitly mentioned this: "If it's the users turn, there's ZERO chance they're ever going to complete"

Fix 4: Attach Tool Calls to Correct Message ⚠️ COMPLEX

File: packages/core/src/events/EventStreamProcessor.ts

Changes Required:

4.1 Modify handleToolCall() - the ONLY place that matters for results:

private handleToolCall(event: ToolCallEvent): void {
  // ... existing code for think tool handling ...
  
  if (event.active) {
    // Tool is executing - UX update only
    const notification = this.toolCallManager.onToolCallActive(event);
    if (notification) {
      this.sessionManager.emit('tool-notification', notification);
    }
  } else {
    // ⚠️ Tool completed - THIS IS THE IMPORTANT PART
    
    // Step 1: Remove the notification
    event.tool_calls.forEach(tc => {
      this.sessionManager.emit('tool-notification-removed', `${event.session_id}:${tc.id}`);
    });
    
    // Step 2: Store completed tool calls in manager
    const completedToolCalls = this.toolCallManager.onToolCallComplete(event);
    
    // Step 3: Try to attach to PREVIOUS assistant/thought in SAME session_id
    const attached = this.attachToolCallsToPreviousMessage(
      event.session_id,  // Match by session_id
      event.tool_calls,
      event.tool_results || []
    );
    
    // Step 4: If no previous message, buffer for next message in this session_id
    if (!attached) {
      Logger.debug(`[EventStreamProcessor] No previous message found - buffering tool calls for next message in session ${event.session_id}`);
      this.sessionManager.bufferPendingToolCalls(event.session_id, completedToolCalls);
    }
    
    // Step 5: Emit completion event for any listeners
    this.sessionManager.emit('tool-call-complete', {
      sessionId: event.session_id,
      toolCalls: event.tool_calls,
      toolResults: event.tool_results
    });
  }
}

4.2 Add method to find and attach to previous message:

/**
 * Find PREVIOUS assistant or thought message in SAME session and attach tool calls
 * Returns true if attached, false if no suitable message found (need to buffer)
 */
private attachToolCallsToPreviousMessage(
  sessionId: string,
  toolCalls: ToolCall[],
  toolResults: ToolResult[]
): boolean {
  const session = this.sessionManager.getCurrentSession();
  if (!session || !session.messages || session.messages.length === 0) {
    return false;
  }
  
  // Search backward for most recent assistant or thought message
  // that matches this session_id
  for (let i = session.messages.length - 1; i >= 0; i--) {
    const msg = session.messages[i];
    
    // Check if message is from the same session (important for sub-sessions)
    const isSameSession = msg.metadata?.sessionId === sessionId;
    if (!isSameSession) continue;
    
    // Check if it's an assistant or thought message
    const isAssistant = msg.role === 'assistant';
    const isThought = msg.role === 'assistant (thought)' || msg.isThought === true;
    
    if (isAssistant || isThought) {
      Logger.info(`[EventStreamProcessor] Attaching ${toolCalls.length} tool calls to previous ${msg.role} message at index ${i}`);
      
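      // Attach to message metadata, appending so that tool calls already
      // attached to this message by an earlier event are not overwritten
      msg.metadata = msg.metadata || {};
      msg.metadata.toolCalls = [...(msg.metadata.toolCalls || []), ...toolCalls];
      msg.metadata.toolResults = [...(msg.metadata.toolResults || []), ...toolResults];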
      
      // Emit update so UI can refresh this specific message
      this.sessionManager.emit('message-updated', {
        sessionId: sessionId,
        messageIndex: i,
        message: msg
      });
      
      return true;
    }
  }
  
  // No previous assistant/thought found in this session
  return false;
}

4.3 Modify handleCompletion() to attach any buffered tool calls:

private handleCompletion(event: CompletionEvent): void {
  if (!event.running && this.messageBuilder.hasCurrentMessage()) {
    // Check if there are buffered tool calls for THIS session
    // (from rare case where interaction started with tool calls)
    let toolCalls: ToolCall[] | undefined;
    let toolResults: ToolResult[] | undefined;
    
    if (this.sessionManager.hasPendingToolCalls(event.session_id)) {
      const buffered = this.sessionManager.getPendingToolCalls(event.session_id);
      toolCalls = buffered.map(tc => ({
        id: tc.id,
        type: tc.type,
        name: tc.name,
        input: tc.input
      })) as ToolCall[];
      toolResults = buffered.filter(tc => tc.result).map(tc => tc.result!);
      
      Logger.debug(`[EventStreamProcessor] Attaching ${buffered.length} buffered tool calls to current message in session ${event.session_id}`);
    }
    
    // Finalize message with buffered tool calls ONLY (if any exist for this session)
    const message = this.messageBuilder.finalize({
      inputTokens: event.input_tokens,
      outputTokens: event.output_tokens,
      stopReason: event.stop_reason,
      toolCalls: toolCalls,
      toolResults: toolResults
    });
    
    // ... rest of existing completion handling ...
  }
}

4.4 Add session-aware buffering to SessionManager: File: packages/core/src/session/SessionManager.ts

export class SessionManager extends EventEmitter {
  private currentSession: ChatSession | null = null;
  // Buffer tool calls per session (key = session_id)
  private pendingToolCallsBySession: Map<string, ToolCallWithResult[]> = new Map();
  
  /**
   * Buffer tool calls for a specific session
   * Used when tool calls complete but no previous message exists (rare)
   */
  bufferPendingToolCalls(sessionId: string, toolCalls: ToolCallWithResult[]): void {
    const existing = this.pendingToolCallsBySession.get(sessionId) || [];
    existing.push(...toolCalls);
    this.pendingToolCallsBySession.set(sessionId, existing);
    Logger.debug(`[SessionManager] Buffered ${toolCalls.length} tool calls for session ${sessionId}`);
  }
  
  /**
   * Get and clear buffered tool calls for a specific session
   */
  getPendingToolCalls(sessionId: string): ToolCallWithResult[] {
    const buffered = this.pendingToolCallsBySession.get(sessionId) || [];
    this.pendingToolCallsBySession.delete(sessionId);
    return buffered;
  }
  
  /**
   * Check if there are pending buffered tool calls for a specific session
   */
  hasPendingToolCalls(sessionId: string): boolean {
    const buffered = this.pendingToolCallsBySession.get(sessionId);
    return buffered !== undefined && buffered.length > 0;
  }
  
  /**
   * Clear all pending tool calls for a session (on session change)
   */
  clearPendingToolCalls(sessionId: string): void {
    this.pendingToolCallsBySession.delete(sessionId);
  }
  
  /**
   * Clear ALL pending tool calls (on disconnect, etc.)
   */
  clearAllPendingToolCalls(): void {
    this.pendingToolCallsBySession.clear();
  }
}

Rationale:

  • Buffers tool calls PER SESSION (handles sub-sessions correctly)
  • Default: Attach to previous message (agent explains THEN uses tools)
  • Fallback: Attach to next message (rare case where interaction starts with tools)
  • Session-aware: Won't mix up tool calls from different sessions
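A minimal, self-contained sketch of the per-session isolation described above. `PendingToolCallBuffer`, the simplified `ToolCallWithResult` shape, and the tool names are hypothetical stand-ins for illustration, not the real SessionManager API:

```typescript
// Hypothetical, simplified shape; the real ToolCallWithResult lives in the core package.
interface ToolCallWithResult {
  id: string;
  name: string;
}

// Stand-in for the buffering methods on SessionManager (names are assumptions).
class PendingToolCallBuffer {
  private readonly bySession = new Map<string, ToolCallWithResult[]>();

  // Append tool calls to the buffer of exactly one session.
  buffer(sessionId: string, toolCalls: ToolCallWithResult[]): void {
    const existing = this.bySession.get(sessionId) ?? [];
    this.bySession.set(sessionId, [...existing, ...toolCalls]);
  }

  // Return the buffered calls for one session and clear only that session.
  drain(sessionId: string): ToolCallWithResult[] {
    const buffered = this.bySession.get(sessionId) ?? [];
    this.bySession.delete(sessionId);
    return buffered;
  }
}

const buffer = new PendingToolCallBuffer();
buffer.buffer("main", [{ id: "t1", name: "tool_a" }]);
buffer.buffer("sub-1", [{ id: "t2", name: "tool_b" }]);

// Draining the main session leaves the sub-session buffer untouched.
const mainCalls = buffer.drain("main");
const subCalls = buffer.drain("sub-1");
```

Keying the Map by session_id is what keeps a delegated agent's tool calls from landing on a main-session message.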

Fix 5: Extract Tool Calls in Resumed Sessions

File: packages/core/src/events/EventStreamProcessor.ts line ~1250+

Changes Required:

Modify processAssistantMessageForResume():

private processAssistantMessageForResume(
  message: MessageParam,
  nextMessage: MessageParam | undefined,
  sessionId: string
): { messages: Message[], messagesConsumed: number } {
  const messages: Message[] = [];
  let messagesConsumed = 0;
  
  if (message.content && Array.isArray(message.content)) {
    let hasTextContent = false;
    const textParts: string[] = [];
    const toolCalls: ToolCall[] = [];
    const toolResults: ToolResult[] = [];
    
    for (const block of message.content) {
      if (isTextBlockParam(block)) {
        hasTextContent = true;
        textParts.push(block.text);
      } else if (isToolUseBlockParam(block)) {
        // THINK TOOL - existing special handling
        if (block.name === 'think') {
          // ... existing think tool code ...
          continue;
        }
        
        // DELEGATION TOOLS - existing special handling
        if (this.isDelegationTool(block.name)) {
          // ... existing delegation code ...
          continue;
        }
        
        // ✅ REGULAR TOOL CALLS - NOW EXTRACT THEM
        Logger.debug(`[EventStreamProcessor] Extracting tool call: ${block.name} (${block.id})`);
        
        toolCalls.push({
          id: block.id,
          type: block.type as 'tool_use',
          name: block.name,
          input: block.input
        });
        
        // ✅ Find corresponding result in next message
        if (nextMessage && nextMessage.role === 'user' && Array.isArray(nextMessage.content)) {
          for (const resultBlock of nextMessage.content) {
            if ('type' in resultBlock && resultBlock.type === 'tool_result') {
              const toolResult = resultBlock as any;
              if (toolResult.tool_use_id === block.id) {
                Logger.debug(`[EventStreamProcessor] Found matching tool result for ${block.id}`);
                toolResults.push({
                  type: 'tool_result',
                  tool_use_id: toolResult.tool_use_id,
                  content: toolResult.content || '',
                  is_error: toolResult.is_error
                });
                // ✅ Consume the next message only when it actually holds a matching
                // tool result; otherwise a genuine user message would be skipped
                messagesConsumed = 1;
              }
            }
          }
        }
      }
    }
    
    // ✅ Create message with tool calls attached
    if (hasTextContent || toolCalls.length > 0) {
      const combinedText = textParts.join('');
      const msg: Message = {
        role: 'assistant',
        content: combinedText || '[Tool execution]',
        timestamp: new Date().toISOString(),
        format: 'text',
        metadata: {}
      };
      
      // ✅ Attach tool calls if present
      if (toolCalls.length > 0) {
        msg.metadata!.toolCalls = toolCalls;
        msg.metadata!.toolResults = toolResults;
        Logger.info(`[EventStreamProcessor] Attached ${toolCalls.length} tool calls to resumed assistant message`);
      }
      
      messages.push(msg);
    }
  } else {
    // ... existing simple text handling ...
  }
  
  return { messages, messagesConsumed };
}

Rationale:

  • Removes the "skip for now" TODO code
  • Extracts tool_use blocks from assistant messages
  • Matches with tool_result blocks from next user message
  • Attaches to message.metadata just like streaming messages
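The matching step can be sketched in isolation. The block shapes below are simplified assumptions modeled on the fields used above (`id`, `tool_use_id`, `content`, `is_error`), and `pairToolCalls` is a hypothetical helper, not a function in the codebase:

```typescript
// Hypothetical minimal block shapes; the real types come from the message params.
interface TextBlock { type: "text"; text: string; }
interface ToolUseBlock { type: "tool_use"; id: string; name: string; input: unknown; }
interface ToolResultBlock { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean; }
type Block = TextBlock | ToolUseBlock | ToolResultBlock;

interface ToolPair { call: ToolUseBlock; result: ToolResultBlock | undefined; }

// Pair every tool_use block with the tool_result that references its id, if any.
function pairToolCalls(assistantBlocks: Block[], nextUserBlocks: Block[]): ToolPair[] {
  // Index results by tool_use_id once instead of rescanning for every call.
  const resultsById = new Map<string, ToolResultBlock>();
  for (const block of nextUserBlocks) {
    if (block.type === "tool_result") {
      resultsById.set(block.tool_use_id, block);
    }
  }

  const pairs: ToolPair[] = [];
  for (const block of assistantBlocks) {
    if (block.type === "tool_use") {
      pairs.push({ call: block, result: resultsById.get(block.id) });
    }
  }
  return pairs;
}

const pairs = pairToolCalls(
  [
    { type: "text", text: "Let me check." },
    { type: "tool_use", id: "a", name: "tool_a", input: {} },
    { type: "tool_use", id: "b", name: "tool_b", input: {} }
  ],
  [{ type: "tool_result", tool_use_id: "b", content: "ok" }]
);
// pairs[0].result is undefined (no result recorded for "a"); pairs[1].result is the "ok" result.
```

A call without a result stays in the list with `result: undefined`, so the UI can still show that the tool was invoked.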

Fix 6: Add Footer to Thought Messages

File: packages/ui/src/components/chat/Message.tsx

Changes Required:

Modify ThoughtMessage component:

const ThoughtMessage: React.FC<ThoughtMessageProps> = ({
  message,
  isStreaming = false
}) => {
  const [isExpanded, setIsExpanded] = React.useState(false);
  
  const firstLine = React.useMemo(() => {
    const textContent = extractTextContent(message.content);
    const lines = textContent.split('\n');
    const first = lines[0] || '';
    return first.length > 80 ? `${first.slice(0, 77)}...` : first;
  }, [message.content]);
  
  return (
    <div className="rounded-lg border border-border/50 my-2 bg-muted/30">
      <button
        className="group/thought flex w-full items-center justify-between gap-4 rounded-lg px-3 py-2"
        onClick={() => setIsExpanded(!isExpanded)}
      >
        {/* ... existing button content ... */}
      </button>
      
      <AnimatePresence>
        {isExpanded && (
          <motion.div
            initial={{ height: 0, opacity: 0 }}
            animate={{ height: "auto", opacity: 1 }}
            exit={{ height: 0, opacity: 0 }}
          >
            <div className="px-3 pb-3 text-sm text-muted-foreground">
              <MarkdownRenderer
                content={extractTextContent(message.content)}
                compact={true}
                className="prose-muted"
              />
              
              {/* ✅ USE MessageFooter instead of custom footer */}
              {!isStreaming && (
                <div className="mt-3 pt-2 border-t border-border/30">
                  <MessageFooter 
                    message={message}
                    showTimestamp={false}
                  />
                </div>
              )}
            </div>
          </motion.div>
        )}
      </AnimatePresence>
    </div>
  );
};

Rationale:

  • Reuses existing MessageFooter component
  • Shows tool calls when thought is expanded
  • Maintains consistency across message types

React Layer Changes

Update useChat Hook

File: packages/react/src/hooks/useChat.ts

Changes Required:

Add listener for message-updated event:

// In the useChat hook body (top level, not inside useEffect, because hooks cannot be called there)
const handleMessageUpdated = useCallback((event: unknown) => {
  const updateEvent = event as { 
    sessionId: string; 
    messageIndex: number; 
    message: MessageChatItem;
  };
  
  Logger.debug('[useChat] Message updated event:', updateEvent);
  
  setMessages(prev => {
    const newMessages = [...prev];
    if (updateEvent.messageIndex >= 0 && updateEvent.messageIndex < newMessages.length) {
      newMessages[updateEvent.messageIndex] = {
        ...updateEvent.message,
        type: 'message',
        id: updateEvent.message.id || `msg-updated-${Date.now()}`
      };
    }
    return newMessages;
  });
}, []);

// Inside the useEffect that sets up listeners: register the handler
sessionManager.on('message-updated', handleMessageUpdated);

// Add to cleanup
return () => {
  sessionManager.off('message-updated', handleMessageUpdated);
  // ... other cleanup ...
};

Rationale: Allows UI to reactively update when tool calls are attached to previous messages.
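The state update itself is a pure function and can be tested without React. A minimal sketch, where `ChatItem` and `replaceMessageAt` are hypothetical simplifications of the real message type and of the updater passed to `setMessages`:

```typescript
// Hypothetical, reduced message shape for illustration only.
interface ChatItem {
  id: string;
  content: string;
  toolCalls: string[];
}

// Return a new array with the item at `index` replaced; the input is never mutated.
// Out-of-range indexes yield an unchanged copy, mirroring the bounds check in the hook.
function replaceMessageAt(prev: readonly ChatItem[], index: number, updated: ChatItem): ChatItem[] {
  const next = [...prev];
  if (index >= 0 && index < next.length) {
    next[index] = updated;
  }
  return next;
}

const before: ChatItem[] = [
  { id: "m1", content: "Looking that up.", toolCalls: [] },
  { id: "m2", content: "Done.", toolCalls: [] }
];

// A message-updated event for index 0 swaps in the version that carries the tool call.
const after = replaceMessageAt(before, 0, { id: "m1", content: "Looking that up.", toolCalls: ["t1"] });
const outOfRange = replaceMessageAt(before, 5, { id: "mX", content: "ignored", toolCalls: [] });
```

Producing a new array rather than mutating `prev` is what lets React detect the change and re-render the affected message.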


Implementation Order & Risk Assessment

Phase 1: Foundation (Session Tracking) ⚠️ CRITICAL FIRST

Risk: Medium - Changes key data structures but isolated

Files:

  • packages/core/src/events/ToolCallManager.ts

Tasks:

  1. Update ToolNotification interface with sessionId
  2. Change activeTools Map key structure
  3. Add makeKey() helper method
  4. Update onToolSelect(), onToolCallActive(), onToolCallComplete()
  5. Add clearSessionNotifications()
  6. Add clearAllActiveNotifications()

Testing:

  • Create unit tests for session-specific tool tracking
  • Test with multiple concurrent sessions
  • Test sub-session tool calls don't interfere with main session
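A sketch of what the session-keyed tracking in Phase 1 could look like. The `sessionId:toolCallId` key format, the `ToolTracker` class, and the `add`/`has`/`activeCount` helpers are assumptions made for illustration; only `makeKey()`, `activeTools`, and `clearSessionNotifications()` are names taken from the task list:

```typescript
// Hypothetical notification shape; the plan only requires that it carries a sessionId.
interface ToolNotification {
  id: string;        // tool call id
  sessionId: string; // session that issued the tool call
  toolName: string;
}

class ToolTracker {
  private readonly activeTools = new Map<string, ToolNotification>();

  // Composite key so identical tool call ids in different sessions cannot collide.
  private makeKey(sessionId: string, toolCallId: string): string {
    return `${sessionId}:${toolCallId}`;
  }

  add(notification: ToolNotification): void {
    this.activeTools.set(this.makeKey(notification.sessionId, notification.id), notification);
  }

  has(sessionId: string, toolCallId: string): boolean {
    return this.activeTools.has(this.makeKey(sessionId, toolCallId));
  }

  // Remove every notification that belongs to one session, leaving the others alone.
  clearSessionNotifications(sessionId: string): void {
    this.activeTools.forEach((notification, key) => {
      if (notification.sessionId === sessionId) {
        this.activeTools.delete(key);
      }
    });
  }

  activeCount(): number {
    return this.activeTools.size;
  }
}

const tracker = new ToolTracker();
tracker.add({ id: "call-1", sessionId: "main", toolName: "tool_a" });
tracker.add({ id: "call-1", sessionId: "sub-1", toolName: "tool_b" }); // same id, different session
tracker.clearSessionNotifications("sub-1");
```

Filtering on the stored `sessionId` rather than on a key prefix avoids accidental matches when one session id happens to be a prefix of another.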

DO NOT PROCEED TO PHASE 2 UNTIL PHASE 1 IS TESTED AND WORKING

Phase 2: Cleanup Mechanisms (Safety Nets)

Risk: Low - Additive changes, no breaking changes

Files:

  • packages/core/src/events/EventStreamProcessor.ts

Tasks:

  1. Update handleInteraction() to clear on interaction end
  2. Add handleUserTurnStart() method
  3. Add user_turn_start case to processEvent switch

Testing:

  • Verify notifications clear on interaction end
  • Verify notifications clear on user turn start
  • Test with interrupted tool sequences
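The two safety nets can be sketched as a small dispatcher. The event shapes (in particular the `started` flag on the interaction event), the `NotificationStore` class, and the choice to clear per session on interaction end but globally on user turn start are assumptions for illustration, not the confirmed server contract:

```typescript
// Hypothetical event shapes; only the event names come from the plan.
type StreamEvent =
  | { type: "interaction"; session_id: string; started: boolean }
  | { type: "user_turn_start" }
  | { type: "text_delta"; session_id: string; content: string };

// Stand-in for the notification side of ToolCallManager.
class NotificationStore {
  private readonly bySession = new Map<string, Set<string>>();

  add(sessionId: string, toolCallId: string): void {
    const ids = this.bySession.get(sessionId) ?? new Set<string>();
    ids.add(toolCallId);
    this.bySession.set(sessionId, ids);
  }

  clearSession(sessionId: string): void {
    this.bySession.delete(sessionId);
  }

  clearAll(): void {
    this.bySession.clear();
  }

  count(): number {
    let total = 0;
    this.bySession.forEach((ids) => { total += ids.size; });
    return total;
  }
}

// Apply the cleanup rules; events that are not safety nets leave the store alone.
function applyCleanup(event: StreamEvent, store: NotificationStore): void {
  switch (event.type) {
    case "interaction":
      if (!event.started) store.clearSession(event.session_id); // interaction ended
      break;
    case "user_turn_start":
      store.clearAll(); // the agent is done, nothing can still be running
      break;
    default:
      break;
  }
}

const store = new NotificationStore();
store.add("main", "t1");
store.add("sub-1", "t2");

applyCleanup({ type: "interaction", session_id: "sub-1", started: false }, store);
const afterInteractionEnd = store.count();

applyCleanup({ type: "text_delta", session_id: "main", content: "hi" }, store);
const afterUnrelatedEvent = store.count();

applyCleanup({ type: "user_turn_start" }, store);
const afterUserTurn = store.count();
```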

Phase 3: Message Attachment (Core Fix)

Risk: HIGH - Changes message flow logic

Files:

  • packages/core/src/events/EventStreamProcessor.ts
  • packages/core/src/session/SessionManager.ts
  • packages/react/src/hooks/useChat.ts

Tasks:

  1. Add buffering to SessionManager
  2. Add attachToolCallsToPreviousMessage() method
  3. Modify handleToolCall() to attach immediately
  4. Modify handleCompletion() to use buffered calls only
  5. Add message-updated event handling to useChat

Testing:

  • Test tool calls appear on PREVIOUS message
  • Test buffering works when no previous message
  • Test message-updated event updates UI correctly
  • Test NO duplication of tool calls
  • Test with multiple rapid tool calls
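The attach-or-buffer decision at the heart of Phase 3 can be sketched as follows. `Msg`, `attachOrBuffer`, and the rule of stopping the backward search at a user message are assumptions for illustration; the real `attachToolCallsToPreviousMessage()` may differ:

```typescript
// Hypothetical, reduced message shape.
interface Msg {
  role: "user" | "assistant" | "thought" | "system";
  content: string;
  toolCalls: string[];
}

// Walk backwards to the most recent assistant/thought message of the current turn.
// System messages are skipped (interruptions); a user message ends the search,
// in which case the tool calls are buffered for the next message instead.
function attachOrBuffer(messages: Msg[], pending: string[], toolCallIds: string[]): "attached" | "buffered" {
  for (let i = messages.length - 1; i >= 0; i--) {
    const candidate = messages[i];
    if (candidate === undefined || candidate.role === "user") {
      break;
    }
    if (candidate.role === "assistant" || candidate.role === "thought") {
      messages[i] = { ...candidate, toolCalls: [...candidate.toolCalls, ...toolCallIds] };
      return "attached";
    }
  }
  pending.push(...toolCallIds);
  return "buffered";
}

// Case 1: the agent explained itself, then a system message interrupted the stream.
const interrupted: Msg[] = [
  { role: "user", content: "What changed?", toolCalls: [] },
  { role: "assistant", content: "Let me look.", toolCalls: [] },
  { role: "system", content: "notice", toolCalls: [] }
];
const pendingA: string[] = [];
const outcomeA = attachOrBuffer(interrupted, pendingA, ["t1"]);

// Case 2: the interaction starts with a tool call, so there is nothing to attach to yet.
const toolsFirst: Msg[] = [{ role: "user", content: "Go.", toolCalls: [] }];
const pendingB: string[] = [];
const outcomeB = attachOrBuffer(toolsFirst, pendingB, ["t2"]);
```

Stopping at the user message keeps a tool call from being attached to an assistant message that belongs to an earlier turn.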

Phase 4: Resumed Session Support

Risk: Low - Fixes existing TODO code

Files:

  • packages/core/src/events/EventStreamProcessor.ts

Tasks:

  1. Modify processAssistantMessageForResume()
  2. Extract tool_use blocks
  3. Match with tool_result blocks
  4. Attach to message.metadata

Testing:

  • Create session with tool calls
  • Disconnect and reconnect
  • Verify tool calls appear in UI
  • Verify tool results match correctly

Phase 5: UI Polish

Risk: Low - UI-only change

Files:

  • packages/ui/src/components/chat/Message.tsx

Tasks:

  1. Update ThoughtMessage to use MessageFooter

Testing:

  • Verify thought messages show footer
  • Verify tool calls display in thoughts
  • Verify expand/collapse works

Testing Strategy

Unit Tests

ToolCallManager:

  • Test session-specific key generation
  • Test tool tracking across multiple sessions
  • Test clearSessionNotifications() with mixed sessions
  • Test clearAllActiveNotifications()

EventStreamProcessor:

  • Test handleInteraction() cleanup
  • Test handleUserTurnStart() cleanup
  • Test attachToolCallsToPreviousMessage() finds correct message
  • Test tool call buffering when no message exists

Integration Tests

Tool Call Flow:

  1. Start interaction
  2. Send message requiring tool use
  3. Verify tool notifications appear
  4. Verify tool calls attach to CORRECT message
  5. Verify next message doesn't have old tool calls

Sub-session Handling:

  1. Start main session
  2. Use delegation tool (creates sub-session)
  3. Verify sub-session tools don't interfere
  4. Verify main session tools still tracked correctly
  5. End sub-session
  6. Verify main session state intact

Interrupted Sequences:

  1. Start tool call
  2. Send system message (interrupt)
  3. Complete tool call
  4. Verify tool call attaches correctly despite interruption

Resumed Sessions:

  1. Create session with 3+ tool calls
  2. Disconnect
  3. Reconnect
  4. Verify all tool calls visible in MessageFooter dropdowns
  5. Verify tool results displayed correctly

E2E Tests

User Turn Safety Net:

  1. Start agent interaction with tools
  2. Force user turn to start (send message)
  3. Verify ALL tool notifications cleared
  4. Verify no orphaned "Agent is using" in UI

Multiple Tools:

  1. Send message requiring 5+ tools
  2. Verify all tools tracked separately
  3. Verify all attach to correct message
  4. Verify notifications clear properly

Success Criteria

  • ✅ No orphaned notifications: Tool notifications clear when user turn starts
  • ✅ Correct message attachment: Tool calls appear on PREVIOUS assistant/thought message
  • ✅ Sub-session isolation: Tool calls from sub-sessions don't interfere with main session
  • ✅ Interrupted sequences handled: System messages and other events don't break tool tracking
  • ✅ Resumed sessions work: Tool calls visible and correct in resumed chat sessions
  • ✅ Thought messages have footer: Tool calls visible in thought message footers
  • ✅ No regressions: All existing tool call functionality preserved
  • ✅ Build passes: No TypeScript errors, all tests pass
  • ✅ Clean logs: Appropriate logging for debugging without noise


Open Questions & Notes

Q1: What if tool_call complete event is lost?

A: The interaction-end and user_turn_start handlers act as cleanup safety nets. It is acceptable for a notification to stay orphaned until the next safety net fires.

Q2: Should buffering persist across session changes?

A: NO - SessionManager should clearPendingToolCalls() on session change to avoid cross-contamination.

Q3: Can we have nested sub-sessions?

A: YES - session_id tracking handles this. Each session tracked independently by session_id.

Q4: What's the maximum tool concurrency?

A: Unknown - need to confirm with server team. Current design handles any number via Map.

Q5: Should we add tool call analytics?

A: Out of scope for this fix. Can add in separate enhancement task.


Rollback Plan

If issues arise during implementation:

  1. Phase 1 Rollback: Revert ToolCallManager changes, keep old key structure
  2. Phase 2 Rollback: Comment out cleanup calls in handlers
  3. Phase 3 Rollback: Revert to old handleCompletion() logic
  4. Phase 4 Rollback: Re-add "skip for now" comment
  5. Phase 5 Rollback: Remove MessageFooter from ThoughtMessage

Feature Flag Option: Consider putting Phase 3 (message attachment) behind a feature flag if there are concerns about stability.


Timeline Estimate

  • Phase 1: 4-6 hours (foundation changes + unit tests)
  • Phase 2: 2-3 hours (cleanup mechanisms + tests)
  • Phase 3: 6-8 hours (complex logic + extensive testing)
  • Phase 4: 3-4 hours (resume logic + tests)
  • Phase 5: 1-2 hours (UI polish)

Total: 16-23 hours of development + testing time

Recommendation: Implement in phases over 3-4 days with thorough testing between phases.


Notes for Implementation Team

  1. DO NOT SKIP PHASE 1 - Session tracking is the foundation. Everything else depends on it.

  2. The event stream IS interruptible - Do not assume sequential tool events. Test with interruptions.

  3. Sub-sessions are real - Delegation tools create sub-sessions with different session_ids. Test this explicitly.

  4. Message updates are the pattern - Don't buffer and hope. Update messages when tool calls complete; buffering is only the fallback for the rare case where no previous message exists.

  5. Safety nets are required - Multiple cleanup points (interaction end, user turn start) are not redundant - they're necessary.

  6. Log everything - Tool call tracking is complex. Comprehensive logging is essential for debugging.

  7. Resumed sessions are not optional - This is a user-facing feature that's currently broken.


Plan Status: READY FOR REVIEW

Next Step: Review with coordinator, then begin Phase 1 implementation
