Red team report: Maia agent prompt-level security bypass
Maia runs inside a Claude Code agent with access to an E2B cloud sandbox. The sandbox has shell access, Python, Node, and two environment variables ($GRAPHQL_URL and $JWT) that authenticate it against our Rails GraphQL API. Maia interacts with the platform through curated "skills": predefined GraphQL queries and mutations for goals, objectives, initiatives, KPIs, updates, dashboards, and stage gates.
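To make the setup concrete, here is a minimal sketch of what a skill call from inside the sandbox might look like. Only $GRAPHQL_URL and $JWT come from the description above. The ListGoals query, its field names, the helper functions, and the use of a Bearer Authorization header are assumptions for illustration, not the actual skill definitions or API contract.

```python
import json
import os
import urllib.request

# Hypothetical skill: the query text and field names are illustrative only.
LIST_GOALS_QUERY = """
query ListGoals($first: Int!) {
  goals(first: $first) { nodes { id name } }
}
"""


def build_skill_request(query, variables, env=os.environ):
    """Build (but do not send) a GraphQL POST using the sandbox credentials."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        env["GRAPHQL_URL"],
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts the JWT as a Bearer token.
            "Authorization": f"Bearer {env['JWT']}",
        },
        method="POST",
    )


def run_skill(query, variables):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_skill_request(query, variables)) as resp:
        return json.load(resp)
```

Note that nothing in this sketch is specific to a curated skill: any process in the sandbox that can read the two environment variables can build a request with an arbitrary query string in the same way.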
We added several layers of prompt-level restrictions to Maia's CLAUDE.md to keep it inside those boundaries: