Agent Substrate implements a virtual actor model on Kubernetes. The core idea: decouple long-lived stateful actors from physical pods so that thousands of suspended actors can share a small pool of warm worker pods. An actor's full memory and filesystem state is checkpointed to object storage (GCS/S3) when idle, and restored onto any available worker pod in milliseconds when traffic arrives.
Here's how each component contributes.
The controller is the entry point for platform operators. It reconciles two CRDs:
WorkerPool: defines a pool of warm compute capacity. The controller creates a Kubernetes Deployment (via server-side apply) with the pool's configured number of replica pods, each running the ateom-gvisor container image. It owns the Deployment via owner references, so deleting the WorkerPool cascades to the Deployment, and it syncs the Deployment's actual replica count back into the WorkerPool status.
ActorTemplate: defines a workload specification (container images, environment, snapshot storage location, gVisor binary config). The controller runs a multi-phase initialization state machine:
- Creates a temporary "golden actor" via ateapi's CreateActor RPC
- Resumes it via ResumeActor to boot the workload fresh
- Waits 20 seconds for initialization, then suspends it via SuspendActor
- Stores the resulting snapshot as the golden snapshot: a pre-warmed checkpoint that all future actors of this template clone from, avoiding cold-start costs
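The initialization sequence can be sketched as a small phase machine. This is an illustrative sketch only: the phase names, the `FakeAteapi` stand-in, and the snapshot path it returns are assumptions made for the example, not the controller's actual types. Only the RPC names and the 20-second wait come from the description above.

```python
import time
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names; the text only describes the steps.
    PENDING = "Pending"
    CREATED = "Created"
    WARMING = "Warming"
    READY = "Ready"


class FakeAteapi:
    """In-memory stand-in for an ateapi client (assumed interface)."""

    def __init__(self):
        self.calls = []

    def create_actor(self, template):
        self.calls.append("CreateActor")
        return f"golden-{template}"

    def resume_actor(self, actor_id):
        self.calls.append("ResumeActor")

    def suspend_actor(self, actor_id):
        self.calls.append("SuspendActor")
        return f"snapshots/{actor_id}"  # hypothetical snapshot location


def build_golden_snapshot(api, template, warmup_seconds=20.0, sleep=time.sleep):
    """Walk the phases in order and return (final_phase, snapshot_path)."""
    phase = Phase.PENDING
    actor_id = api.create_actor(template)
    phase = Phase.CREATED
    api.resume_actor(actor_id)   # boots the workload fresh
    phase = Phase.WARMING
    sleep(warmup_seconds)        # fixed initialization wait
    snapshot = api.suspend_actor(actor_id)
    phase = Phase.READY
    return phase, snapshot
```

A real reconciler would more likely persist the phase in the resource status and requeue instead of sleeping inline; the injectable `sleep` here just keeps the sketch testable.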
The controller is the only component that talks to both the Kubernetes API (for Deployments) and ateapi (for golden actor lifecycle).
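For the WorkerPool side, the reconcile step amounts to deriving a Deployment from the pool and copying the observed replica count back. The sketch below builds the manifest as a plain dict. The `pool` dictionary keys, the use of the pool name as the `ate.dev/worker-pool` label value, and the bare `ateom-gvisor` image reference are all assumptions for illustration; the Deployment and owner-reference field names are standard Kubernetes.

```python
def deployment_for(pool, image="ateom-gvisor"):
    """Build the Deployment manifest a WorkerPool would own.

    `pool` is a plain dict with hypothetical keys:
    apiVersion, name, namespace, uid, replicas.
    """
    # Assumption: the label value is the pool name.
    labels = {"ate.dev/worker-pool": pool["name"]}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {
            "name": pool["name"],
            "namespace": pool["namespace"],
            "labels": labels,
            # Owner reference: deleting the WorkerPool garbage-collects
            # the Deployment.
            "ownerReferences": [{
                "apiVersion": pool["apiVersion"],
                "kind": "WorkerPool",
                "name": pool["name"],
                "uid": pool["uid"],
                "controller": True,
                "blockOwnerDeletion": True,
            }],
        },
        "spec": {
            "replicas": pool["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": "ateom-gvisor", "image": image}],
                },
            },
        },
    }


def sync_status(pool_status, deployment_status):
    """Copy the Deployment's observed replica count into WorkerPool status."""
    return {**pool_status, "replicas": deployment_status.get("replicas", 0)}
```

The privileged security context and host mounts that worker pods need are omitted to keep the example short.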
ateapi is a stateless gRPC server that manages all actor and worker state in Redis/Valkey (not etcd). This is a deliberate design choice: actors are high-churn, high-volume objects (target: 1 billion per cluster) that would overwhelm the Kubernetes API server.
It keeps three kinds of data there:
- Actor records: ID, status (SUSPENDED/RESUMING/RUNNING/SUSPENDING), version, template reference, assigned worker, snapshot paths
- Worker records: namespace, pool, pod name, IP, assigned actor
- Distributed locks for multi-step workflows (actor resume/suspend)
Its actor lifecycle RPCs:
- CreateActor: writes a new actor record in SUSPENDED state
- ResumeActor: orchestrates a multi-step workflow: find a free worker in the right pool → mark it assigned → tell atelet to restore the snapshot → mark actor RUNNING
- SuspendActor: tell atelet to checkpoint → upload state → free the worker → mark actor SUSPENDED
- DeleteActor: only works on SUSPENDED actors
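The status values and workflows above imply a small state machine. The following sketch models it in memory, assuming a simple list of free worker names in place of the real worker records. The transition table is inferred from the described workflows, and failure handling (rolling back a half-finished resume, for instance) is not covered because the text does not describe it.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed status transitions, inferred from the workflows described above.
TRANSITIONS = {
    "SUSPENDED": {"RESUMING"},
    "RESUMING": {"RUNNING"},
    "RUNNING": {"SUSPENDING"},
    "SUSPENDING": {"SUSPENDED"},
}


@dataclass
class Actor:
    actor_id: str
    status: str = "SUSPENDED"       # CreateActor writes SUSPENDED
    worker: Optional[str] = None


def transition(actor, new_status):
    if new_status not in TRANSITIONS[actor.status]:
        raise ValueError(f"{actor.status} -> {new_status} not allowed")
    actor.status = new_status


def resume(actor, free_workers):
    """Find a free worker, assign it, restore, then mark RUNNING."""
    if actor.status == "RUNNING":
        return actor.worker          # nothing to do
    if not free_workers:
        raise RuntimeError("no free worker in pool")
    transition(actor, "RESUMING")
    actor.worker = free_workers.pop()
    # (the atelet Restore call would happen here)
    transition(actor, "RUNNING")
    return actor.worker


def suspend(actor, free_workers):
    """Checkpoint, free the worker, then mark SUSPENDED."""
    transition(actor, "SUSPENDING")
    # (the atelet Checkpoint call and upload would happen here)
    free_workers.append(actor.worker)
    actor.worker = None
    transition(actor, "SUSPENDED")


def delete(actor, registry):
    if actor.status != "SUSPENDED":
        raise ValueError("only SUSPENDED actors can be deleted")
    del registry[actor.actor_id]
```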
ateapi also runs Kubernetes informers watching worker pods (by label ate.dev/worker-pool), atelet pods (by label app=atelet in ate-system), and ActorTemplate CRDs. When resuming an actor, it looks up which node the target worker pod is on, finds the atelet pod on that node, and dials it directly via gRPC.
For concurrency control, it combines optimistic versioning on Redis records with distributed locks (Redis SETNX with TTL) for the multi-step resume/suspend workflows.
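To make the two mechanisms concrete, here is a sketch using a tiny in-memory stand-in for Redis. `FakeKV`, the key names, and the record shape are assumptions for the example. With a real Redis client, lock acquisition would be a single `SET key value NX EX ttl` command, and the compare-then-write steps would need to be made atomic (for example with a Lua script or WATCH/MULTI), which this single-threaded sketch glosses over.

```python
import time


class FakeKV:
    """Tiny in-memory stand-in for Redis, just enough for the sketch."""

    def __init__(self, clock=time.monotonic):
        self._data = {}        # key -> (value, expires_at or None)
        self._clock = clock

    def _live(self, key):
        item = self._data.get(key)
        if item and item[1] is not None and item[1] <= self._clock():
            del self._data[key]          # lazily expire
            return None
        return item

    def set_nx(self, key, value, ttl):
        """Set only if absent, with an expiry (like SET ... NX EX ttl)."""
        if self._live(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def get(self, key):
        item = self._live(key)
        return item[0] if item else None

    def put(self, key, value):
        self._data[key] = (value, None)

    def delete_if_equal(self, key, value):
        """Release a lock only if the caller still owns it."""
        if self.get(key) == value:
            del self._data[key]
            return True
        return False


def update_with_version(kv, key, expected_version, fields):
    """Optimistic write: apply only if the stored version still matches."""
    record = kv.get(key)
    if record is None or record["version"] != expected_version:
        return False
    kv.put(key, {**record, **fields, "version": expected_version + 1})
    return True
```

The TTL bounds how long a crashed lock holder can block a workflow, and the version check rejects writes based on a stale read.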
atelet runs as a privileged DaemonSet — one pod per node. It does not create or destroy worker pods (that's the controller's job). Instead, it's the bridge between the control plane and the physical sandbox runtime.
ateapi calls these three RPCs:
- Run: boot a workload from scratch (no snapshot):
  - Downloads and SHA256-verifies the gVisor runsc binary from GCS/S3
  - Pulls container images (with a memory cache) and extracts them into OCI bundle rootfs directories
  - Generates OCI config.json specs with proper namespace configuration
  - Calls ateom-gvisor's RunWorkload RPC via Unix socket
- Restore: resume from a checkpoint:
  - Downloads checkpoint files (checkpoint.img, pages.img, pages_meta.img) from GCS/S3 with zstd decompression
  - Prepares OCI bundles (same as Run)
  - Calls ateom-gvisor's RestoreWorkload RPC
- Checkpoint: freeze and save state:
  - Calls ateom-gvisor's CheckpointWorkload RPC
  - Uploads checkpoint artifacts to GCS/S3 with zstd compression
  - Resets the actor's directories for the next workload
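The SHA256 verification in the Run path is a standard integrity check. A minimal sketch of that kind of check, assuming the expected digest is supplied as a hex string (where atelet gets the digest from is not stated above, and the function name is illustrative):

```python
import hashlib
import hmac


def verify_sha256(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file at `path` hashes to `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream the file so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(digest.hexdigest(), expected_hex.lower())
```

A caller would typically refuse to execute the downloaded binary, and delete it, when this returns False.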
atelet and ateom-gvisor coordinate via a host-mounted directory at /run/ateom-gvisor/. This contains:
- static-files/: downloaded runsc binaries
- ateoms/<pod-uid>/ateom.sock: Unix socket for gRPC
- actors/<ns>:<template>:<id>/: OCI bundles, checkpoint state, PID files, runsc state
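Because both processes have to agree on this layout, it helps to derive every path from one place. A small sketch, with hypothetical helper names, that builds the paths exactly as listed above:

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/run/ateom-gvisor")


def static_files_dir():
    """Directory holding downloaded runsc binaries."""
    return ROOT / "static-files"


def socket_path(pod_uid):
    """Unix socket a worker pod's ateom-gvisor listens on."""
    return ROOT / "ateoms" / pod_uid / "ateom.sock"


def actor_dir(namespace, template, actor_id):
    """Per-actor directory; the name joins the three parts with colons."""
    return ROOT / "actors" / f"{namespace}:{template}:{actor_id}"
```

For example, `actor_dir("default", "web", "a1")` yields `/run/ateom-gvisor/actors/default:web:a1`.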
ateom-gvisor is the primary container in each worker pod. It's the only thing that actually calls runsc commands. It runs privileged because it needs to manipulate network namespaces.
On startup, it:
- Creates a Unix socket at /run/ateom-gvisor/ateoms/<pod-uid>/ateom.sock
- Captures the pod's eth0 network configuration (addresses and routes)
- Creates an interior network namespace (ateom:<pod-uid>) for gVisor sandboxes
- Starts a child process reaper (since it's not PID 1)
- Serves the Ateom gRPC service
Its RPCs map onto runsc commands:
- RunWorkload: Creates a pause container + application containers via runsc create + runsc start. Moves eth0 from the pod netns into the interior netns so gVisor can use the pod's network identity.
- CheckpointWorkload: Calls runsc checkpoint on the pause container (which captures the entire sandbox including all application containers). Then deletes all containers and moves eth0 back to the pod netns, leaving the worker clean for the next actor.
- RestoreWorkload: Calls runsc restore with flags -background -direct -detach for fast resume. The -direct flag loads checkpoint pages straight into memory; -background returns immediately while demand-paging continues asynchronously.
ateom-gvisor only executes runsc commands. It does not pull images, download checkpoints, or upload state — that's all atelet's job. atelet prepares everything on the shared filesystem before calling ateom.
atenet is what makes curl http://<actor-id>.actors.resources.substrate.ate.dev/ work, including auto-resuming suspended actors on first request.
atenet dns — Runs a CoreDNS instance that resolves *.actors.resources.substrate.ate.dev to the router's ClusterIP. It also patches the cluster's kube-dns ConfigMap to add a stub domain so all pods in the cluster can resolve actor hostnames. Reconciles every 10 seconds.
atenet router — The request routing brain, built on Envoy with External Processing:
- Manages an Envoy Deployment + Service via the Kubernetes API
- Runs an xDS server that dynamically configures Envoy's listeners, clusters, and routes
- Runs an ExtProc server (Envoy External Processing filter) that intercepts every request:
  - Extracts the actor ID from the Host header
  - Calls ateapi.ResumeActor(), which is a no-op if the actor is already running, or triggers a full restore if it's suspended
  - Uses singleflight to deduplicate concurrent resume calls for the same actor
  - Mutates the Host header to the worker pod's IP address
  - Envoy's dynamic forward proxy then routes the request to that IP
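Two of the ExtProc steps are easy to illustrate in isolation: pulling the actor ID out of the Host header, and collapsing concurrent resume calls for the same actor into one. The sketch below is not the router's code. The parsing rules (strip the port, reject IDs containing dots) are assumptions, and `SingleFlight` is a minimal thread-based imitation of the singleflight pattern, not the library the router uses.

```python
import threading

ACTOR_SUFFIX = ".actors.resources.substrate.ate.dev"


def actor_id_from_host(host):
    """Return the actor ID from a Host header, or None if it doesn't match."""
    hostname = host.split(":", 1)[0].rstrip(".")   # drop port, trailing dot
    if not hostname.lower().endswith(ACTOR_SUFFIX):
        return None
    actor_id = hostname[: -len(ACTOR_SUFFIX)]
    # Assumption: an actor ID is a single DNS label.
    return actor_id if actor_id and "." not in actor_id else None


class SingleFlight:
    """Collapse concurrent calls with the same key into one execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}   # key -> in-flight call record

    def do(self, key, fn):
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "result": None,
                        "error": None, "waiters": 0}
                self._calls[key] = call
            else:
                call["waiters"] += 1
        if leader:
            try:
                call["result"] = fn()
            except Exception as exc:
                call["error"] = exc
            finally:
                with self._lock:
                    del self._calls[key]
                call["done"].set()    # wake every waiter
        else:
            call["done"].wait()
        if call["error"] is not None:
            raise call["error"]
        return call["result"]
```

With this shape, a burst of requests to a suspended actor results in one `ResumeActor` call; every waiting request receives the same worker IP (or the same error) once it completes.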
Request flow:
Client DNS lookup → CoreDNS (stub domain) → returns router ClusterIP
Client HTTP request → Envoy listener (port 8080)
    → ext_proc filter → ExtProc gRPC server → ateapi.ResumeActor()
    → Host header rewritten to worker pod IP
    → dynamic_forward_proxy → worker pod → actor handles request
This means actors are resumed on demand: the first request to a suspended actor triggers a restore, and subsequent requests route directly to the running actor. gRPC errors from ateapi are mapped to HTTP statuses: FAILED_PRECONDITION (no workers) → 503, DEADLINE_EXCEEDED → 504, NOT_FOUND → 404, etc.
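The error mapping is essentially a lookup table. The sketch below contains only the three pairs named above; the fallback of 500 for any other code is an assumption, since the remaining mappings are not listed.

```python
# gRPC status name -> HTTP status, limited to the pairs listed above.
GRPC_TO_HTTP = {
    "FAILED_PRECONDITION": 503,   # no workers available
    "DEADLINE_EXCEEDED": 504,
    "NOT_FOUND": 404,
}

# Assumption: the text does not say what unlisted codes map to.
DEFAULT_HTTP_STATUS = 500


def http_status_for(grpc_code):
    """Translate a gRPC status name into the HTTP status sent to the client."""
    return GRPC_TO_HTTP.get(grpc_code, DEFAULT_HTTP_STATUS)
```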
                   ┌─────────────────┐
                   │  atecontroller  │
                   │  (Deployment)   │
                   └────────┬────────┘
                            │ gRPC: Create/Resume/SuspendActor
                            │ K8s: Manage Deployments
                            ▼
┌──────────┐  gRPC   ┌─────────────┐  K8s Informers  ┌──────────────┐
│  atenet  │────────▶│   ateapi    │◀───────────────▶│   K8s API    │
│  router  │ Resume  │   (Redis)   │   Pods, CRDs    │    Server    │
└──────────┘ Actor   └──────┬──────┘                 └──────────────┘
                            │ gRPC: Run/Checkpoint/Restore
                            ▼
                     ┌─────────────┐
                     │   atelet    │
                     │ (DaemonSet) │
                     └──────┬──────┘
                            │ gRPC over Unix socket
                            │ + shared /run/ateom-gvisor filesystem
                            ▼
                     ┌──────────────┐
                     │ ateom-gvisor │
                     │  (in worker  │
                     │     pod)     │
                     └──────┬───────┘
                            │ exec: runsc create/start/checkpoint/restore
                            ▼
                     ┌──────────────┐
                     │    gVisor    │
                     │   sandbox    │
                     └──────────────┘
Key design insight: Kubernetes manages infrastructure (pods, deployments, services) through the controller, but actor lifecycle is managed entirely through Redis + direct gRPC calls, bypassing etcd for the hot path. This lets the system scale to millions of actors while keeping resume latency under 100ms.