We’re seeing a strong wave of AI agent applications that are increasingly capable of taking on and automating complex workflows. Much of this is being driven at the model layer, where frontier models are becoming better at planning and reasoning. As Sam Altman has outlined, the better reasoning capabilities of models such as OpenAI o1 provide the foundation for potentially rapid progress towards highly capable AI agents that can understand context, plan and reason to make decisions, and then take actions to achieve goals.
However, improved models alone will not deliver advanced, autonomous agents, and, as of today, most models remain poor planners and limited reasoners. As a result, most AI agents are still susceptible to errors that compound across multi-step processes, which leaves them unable to reliably take on complex end-to-end tasks. In response, an ecosystem of enabling software and new architectural approaches is emerging that supports agentic workflows through tool use, multi-agent frameworks, chain-of-thought reasoning, self-reflection, planning, and other methods.
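To make the compounding-error point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that every step in a workflow succeeds independently with the same probability; real agents rarely behave this neatly, and the function name and the 95% figure are hypothetical rather than measured.

```python
def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Chance that every step succeeds, ASSUMING steps are independent
    and share the same success rate (an illustrative simplification)."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


# Hypothetical agent that gets each individual step right 95% of the time.
for steps in (1, 5, 10, 20):
    rate = end_to_end_success(0.95, steps)
    print(f"{steps:>2} steps: {rate:.1%} chance of a fully correct run")
```

Under these assumptions, a 95% per-step success rate falls to roughly 60% over 10 steps and roughly 36% over 20, which is why small per-step error rates matter so much for long end-to-end tasks.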
We expect to see this AI agent infrastructure evolve rapidly in the coming years as a result of (i) the abundance of new agent applications being built, and (ii) the changing challenges and opportunities that tooling can address as models improve and agents can become increasingly unconstrained.
We are seeing the emergence of infrastructure and tooling including:
- Enterprise platforms for building with agents, such as Emergence
- Frameworks and developer kits for building agents, such as LangChain, Oscar, AgentKit (from BCG X), and Semantic Kernel
- Platforms for hosting agents, such as LangServe and numerous platforms for enterprise LLM applications
- Agent evaluations, such as AgentOps, Braintrust, Langfuse, and Context
- Orchestration between different models and agents
- Tools supporting the ability to personalise agent memory towards a given user and their current context, such as Letta
- Tools supporting the ability of agents to take actions, such as NPi, Mindware, and Imprompt
- Tools supporting agents in browsing the web and extracting web data, such as Reworkd, Tiny Fish, Browse AI, Browserbase, Apify, and Browserless
- Authentication for agents to take actions on a user’s behalf, such as Anon and Clerk
- Multi-agent frameworks supporting effective hierarchies and collaboration between several specialised agents, such as CrewAI, AutoGen, and AgentScope
- Setting of guardrails and constraints to ensure an agent stays on track and avoids harmful or misleading output