The integration of OpenAI's reasoning models (o-series) with the Agents SDK presents intriguing possibilities for developers who want to observe an agent's thinking process in real-time. While there are limitations to accessing the complete "train of thought," there are several methods to stream insights into an agent's reasoning as it works.
OpenAI's reasoning models (the o1, o3, and o4 series) utilize a special type of processing called "reasoning tokens" in addition to standard input and output tokens. These reasoning tokens represent the model's internal thinking process as it breaks down problems and considers multiple approaches.[^1]
A crucial point to understand is that these reasoning tokens are typically invisible to the end user:
> "While reasoning tokens are not visible via the API, they still occupy space in the model's context window and are billed as output tokens."[^1]
This means that while the models are indeed performing deep reasoning, by default this process happens behind the scenes.
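The invisible reasoning work does surface in billing metadata: the Responses API `usage` field breaks output tokens down via `output_tokens_details.reasoning_tokens`. A minimal sketch of that accounting, using an illustrative payload (the numbers are assumptions, not output from a real call):

```python
# Illustrative payload shaped like the Responses API `usage` field;
# the token counts here are assumptions, not from a real request.
usage = {
    "input_tokens": 75,
    "output_tokens": 1186,
    "output_tokens_details": {"reasoning_tokens": 1024},
}

# Reasoning tokens never appear in the response text, but they are billed
# as output tokens, so visible text accounts for only part of the total.
visible_tokens = usage["output_tokens"] - usage["output_tokens_details"]["reasoning_tokens"]
print(visible_tokens)  # tokens that actually appeared in the response text
```

In this example only 162 of the 1,186 billed output tokens would show up as visible text; the rest was spent on hidden reasoning.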
The OpenAI Agents SDK provides robust streaming functionality through the `Runner.run_streamed()` method, which returns a `RunResultStreaming` object. This allows developers to subscribe to updates as an agent run proceeds.[^2]
The SDK supports two primary types of streaming events:
`RawResponsesStreamEvent` events are passed directly from the LLM in OpenAI Responses API format. These can be used to stream response messages token by token as they're generated[^2]:
```python
from openai.types.responses import ResponseTextDeltaEvent

# `result` is the RunResultStreaming returned by Runner.run_streamed(...)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)
```
Higher-level events like `RunItemStreamEvent` provide updates when an item has been fully generated, enabling progress updates at the level of "message generated" or "tool ran".[^2]
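To see the shape of that coarser event loop without calling the API, here is a self-contained sketch that mimics the SDK's higher-level events with stand-in classes. The `RunItemStreamEvent` name comes from the SDK; the fake stream and the event names used here are illustrative assumptions, not the SDK's exact values:

```python
import asyncio
from dataclasses import dataclass

# Stand-in for the SDK's higher-level stream event: it carries a coarse
# `name` (e.g. a tool ran, a message was produced) rather than token deltas.
@dataclass
class RunItemStreamEvent:
    type: str
    name: str

async def fake_stream():
    # Stands in for `result.stream_events()` from Runner.run_streamed();
    # the event names below are fabricated for illustration.
    for name in ("tool_called", "tool_output", "message_output_created"):
        yield RunItemStreamEvent(type="run_item_stream_event", name=name)

async def report_progress(events):
    # Collect item-level milestones instead of printing raw tokens.
    seen = []
    async for event in events:
        if event.type == "run_item_stream_event":
            seen.append(event.name)
    return seen

print(asyncio.run(report_progress(fake_stream())))
# → ['tool_called', 'tool_output', 'message_output_created']
```

Swapping the fake stream for a real `RunResultStreaming` keeps the same loop structure; only the event source changes.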
While the raw reasoning tokens are not directly exposed through the API, OpenAI provides a mechanism to gain insights into the model's reasoning process through "reasoning summaries."
The reasoning summary feature lets you view a structured overview of the model's thinking process:
> "While we don't expose the raw reasoning tokens emitted by the model, you can view a summary of the model's reasoning using the `summary` parameter."[^1]
Different models support different summarizer types:
- o4-mini supports the "detailed" summarizer
- The computer use model supports the "concise" summarizer
Importantly, this feature works with streaming and is supported across reasoning models, including `o4-mini`, `o3`, `o3-mini`, and `o1`.[^1]
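When summaries are enabled (by setting the request's `reasoning` options, e.g. `reasoning={"summary": "detailed"}` in the Responses API), the summary text arrives incrementally through dedicated streaming events. The sketch below assembles a summary from such events; the event type string follows the Responses API's streaming event naming, but the sample events themselves are fabricated stand-ins, not real model output:

```python
# Fabricated stand-ins for streamed events; in a real stream these would
# arrive one at a time from the Responses API.
events = [
    {"type": "response.reasoning_summary_text.delta", "delta": "Breaking the task "},
    {"type": "response.reasoning_summary_text.delta", "delta": "into subproblems."},
    {"type": "response.output_text.delta", "delta": "Final answer..."},
]

# Keep only the reasoning-summary deltas, ignoring ordinary output text.
summary = "".join(
    e["delta"] for e in events if e["type"] == "response.reasoning_summary_text.delta"
)
print(summary)  # → Breaking the task into subproblems.
```

The same filter-and-accumulate pattern works inside the Agents SDK's `stream_events()` loop: handle summary deltas separately from answer deltas so the UI can show "thinking" text distinct from the final response.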
When implementing streaming reasoning with the Agents SDK, there are several factors to consider:
Reasoning tokens consume significant space in the context window. The models may generate anywhere from a few hundred to tens of thousands of reasoning tokens depending on problem complexity.[^1]
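A practical consequence is budgeting: it's worth reserving headroom in the context window for reasoning tokens when choosing how much visible output to allow. A rough budgeting helper, where the window size and reserve are illustrative assumptions rather than values for any specific model:

```python
# Illustrative numbers: substitute values appropriate to your model.
CONTEXT_WINDOW = 200_000      # assumed total context window
REASONING_RESERVE = 25_000    # assumed headroom for hidden reasoning tokens

def max_visible_output(prompt_tokens: int) -> int:
    """Space left for visible output after the prompt and reasoning reserve."""
    return max(0, CONTEXT_WINDOW - prompt_tokens - REASONING_RESERVE)

print(max_visible_output(50_000))  # → 125000
```

If the prompt plus reserve already fills the window, the helper clamps to zero, signaling that the prompt should be shortened before the run.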
The Agents SDK includes built-in tracing that lets you visualize and debug your agentic flows.[^3][^4] The tracing feature records all events during an agent run, which can provide additional insight into the agent's decision-making process.[^5]
If working with sensitive data, be aware that both traces and logs can contain this information. The SDK provides environment variables to disable tracing and logging of sensitive data[^6]:
```bash
export OPENAI_AGENTS_DISABLE_TRACING=1
export OPENAI_AGENTS_DONT_LOG_MODEL_DATA=1
export OPENAI_AGENTS_DONT_LOG_TOOL_DATA=1
```
Access to reasoning models depends on your usage tier with OpenAI. While o1 and o3-mini are available to all API users on tiers 1-5, access to o3 is limited to tiers 4 and 5 with some exceptions, and o4-mini requires organization verification.[^7]
While it's not possible to stream the complete, raw reasoning tokens that constitute an agent's full train of thought, the OpenAI Agents SDK does provide mechanisms to gain insights into the reasoning process. Through reasoning summaries, streaming capabilities, and tracing tools, developers can observe meaningful representations of how agents are approaching problems in real-time.
For applications where understanding the agent's reasoning process is critical, the combination of streaming functionality with reasoning summaries offers a practical solution that balances insight with efficiency.
Footnotes

[^1]: https://platform.openai.com/docs/guides/reasoning
[^2]: https://openai.github.io/openai-agents-python/streaming/
[^3]: https://adasci.org/building-agentic-ai-applications-using-openai-agents-sdk/
[^6]: https://community.openai.com/t/sensitive-data-in-logs-agents-adk-configuration/1259351
[^7]: https://help.openai.com/en/articles/10362446-api-access-to-o1-o3-and-o4-models