This security research analyzed Microsoft's AutoGen multi-agent framework for vulnerabilities. The analysis identified 8 security findings, 4 of them rated CRITICAL, each of which could lead to arbitrary code execution.
- Repository: https://github.com/microsoft/autogen
- Packages Analyzed:
  - `autogen-core` (Core message passing and agent runtime)
  - `autogen-agentchat` (High-level multi-agent API)
  - `autogen-ext` (Extensions: code executors, MCP tools, HTTP tools)
  - `autogen-studio` (Web UI and API server)
  - Experimental packages
| ID | Finding | Severity | Type |
|---|---|---|---|
| F001 | Arbitrary Code Execution via FunctionTool Config | CRITICAL | RCE |
| F002 | Weak Code Sanitization in Approval Function | HIGH | Bypass |
| F003 | LocalCommandLineCodeExecutor No Sandbox | CRITICAL | RCE |
| F004 | MCP StdioServerParams Command Injection | CRITICAL | RCE |
| F005 | Malicious Content in Agent Messages | MEDIUM | Injection |
| F006 | HTTP Tool SSRF Risk | MEDIUM | SSRF |
| F007 | Unsafe Pickle Deserialization | CRITICAL | RCE |
| F008 | No Validation of WebSocket Team Config | HIGH | RCE |
The framework has multiple paths to arbitrary code execution:
- `FunctionTool._from_config()` uses `exec()` on untrusted source code
- Memory Bank uses `pickle.load()` on potentially untrusted files
Recommendation: Replace with safe alternatives (AST parsing, JSON serialization).
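
The serialization half of this recommendation can be sketched as follows. This is a minimal illustration, assuming the stored memory can be expressed as plain JSON-compatible records; `save_memory`, `load_memory`, and the `{"id", "text"}` schema are hypothetical names chosen for the example, not part of AutoGen.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical schema: a memory bank is a list of {"id": str, "text": str} records.
REQUIRED_KEYS = {"id", "text"}


def save_memory(records: list[dict[str, Any]], path: Path) -> None:
    """Write records as JSON. Unlike pickle, the file holds data only, never code."""
    path.write_text(json.dumps(records), encoding="utf-8")


def load_memory(path: Path) -> list[dict[str, str]]:
    """Load records and reject anything that does not match the expected shape."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("memory file must contain a JSON list")
    for item in data:
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise ValueError(f"unexpected record shape: {item!r}")
        if not all(isinstance(value, str) for value in item.values()):
            raise ValueError("record values must be strings")
    return data
```

Parsing JSON never executes code from the file, so a tampered file can at worst fail validation. The explicit shape check is a design choice for the sketch; a real schema would likely be richer.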
- `LocalCommandLineCodeExecutor` has no actual command filtering
- MCP `StdioServerParams` allows arbitrary commands
Recommendation: Implement command allowlisting and sandboxing.
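
One way command allowlisting might look is sketched below. The `ALLOWED_COMMANDS` set and the `check_command` helper are hypothetical and not part of AutoGen; the sketch assumes the command arrives as a single string and is later run without a shell.

```python
import shlex

# Hypothetical allowlist; a real deployment would choose its own set.
ALLOWED_COMMANDS = frozenset({"python3", "node"})


def check_command(command_line: str) -> list[str]:
    """Return the parsed argv if the executable is allowlisted, else raise."""
    argv = shlex.split(command_line)
    if not argv:
        raise PermissionError("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {argv[0]!r}")
    return argv
```

The returned list is meant to be passed to `subprocess.run(argv)` with the default `shell=False`, so shell metacharacters in the arguments are not interpreted. An allowlist only narrows which binaries may start; it does not confine what an allowed interpreter such as `python3` does afterwards, which is why the recommendation pairs it with sandboxing.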
WebSocket team configs bypass validation and can trigger RCE.
Recommendation: Validate all team configs before loading.
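
A validation step of this kind could be sketched as a provider allowlist applied before any component is instantiated. The sketch assumes team configs arrive as nested dictionaries in which each component names its class under a `provider` key; the `TRUSTED_PROVIDERS` set and both function names are hypothetical.

```python
from typing import Any

# Hypothetical allowlist of provider paths that this deployment trusts.
TRUSTED_PROVIDERS = frozenset({
    "autogen_agentchat.teams.RoundRobinGroupChat",
    "autogen_agentchat.agents.AssistantAgent",
})


def find_untrusted_providers(config: Any) -> list[str]:
    """Recursively collect every `provider` value that is not allowlisted."""
    found: list[str] = []
    if isinstance(config, dict):
        if "provider" in config:
            provider = config["provider"]
            # Non-string providers are rejected outright.
            if not isinstance(provider, str) or provider not in TRUSTED_PROVIDERS:
                found.append(str(provider))
        for value in config.values():
            found.extend(find_untrusted_providers(value))
    elif isinstance(config, list):
        for item in config:
            found.extend(find_untrusted_providers(item))
    return found


def validate_team_config(config: dict[str, Any]) -> None:
    """Raise before loading if any nested component uses an untrusted provider."""
    bad = find_untrusted_providers(config)
    if bad:
        raise ValueError(f"untrusted providers in config: {bad}")
```

Calling `validate_team_config` on a WebSocket payload before handing it to the loader would reject a config that embeds, for example, a code-carrying tool component, regardless of how deeply it is nested.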
AutoGen is designed with code execution as a core feature. This fundamentally changes the threat model:
- HIGH RISK: Local execution, untrusted configs, sandbox escapes
- MEDIUM RISK: SSRF, prompt injection, context pollution
- DESIGN LIMITATION: Complete isolation is not achievable without disabling features
- Document security boundaries clearly
- Add security warnings to dangerous APIs
- Implement command allowlisting
- Replace pickle with JSON
- Add sandboxed execution modes
- Implement component validation for all config sources
- Add audit logging for sensitive operations
- Create security best practices guide
- Consider AST-based code analysis instead of string matching
- Implement multi-tenant isolation for production deployments
- Add optional security-hardened execution modes
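
To illustrate the AST-based analysis suggested in the list above, the following sketch parses code without running it and reports risky imports and calls. The deny lists and the `find_risky_nodes` helper are hypothetical and deliberately incomplete; they are not AutoGen's approval logic.

```python
import ast

# Hypothetical deny lists for illustration; not a complete policy.
DANGEROUS_CALLS = frozenset({"eval", "exec", "compile", "__import__"})
DANGEROUS_MODULES = frozenset({"os", "subprocess", "pickle"})


def find_risky_nodes(source: str) -> list[str]:
    """Parse source without executing it and report risky imports and calls."""
    findings: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in DANGEROUS_MODULES:
                    findings.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in DANGEROUS_MODULES:
                findings.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                findings.append(f"call {node.func.id}()")
    return findings
```

Because the check works on the syntax tree, `import  os` with irregular spacing is still flagged, while the text `import os` inside a string literal is not. Static checks like this can still be evaded, for instance through `getattr` or dynamically built names, so they complement sandboxing rather than replace it.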
AutoGen is a powerful but inherently high-risk framework due to its code execution capabilities. The identified vulnerabilities primarily stem from:
- Trusting user-provided configurations
- Using unsafe serialization in experimental features
- Inadequate sandboxing by default
Users should NOT deploy AutoGen with untrusted inputs without additional security controls.
- Research conducted: 2025-04-14
- Target: microsoft/autogen (main branch)