@johnlindquist
Created February 20, 2026 16:19
OpenClaw Cron in Vercel Sandbox: The Pairing Wall — Root Cause Analysis & Fix

OpenClaw Cron in Vercel Sandbox: The Pairing Wall

The User Story

You deploy OpenClaw (an AI gateway) inside a Vercel Sandbox and connect it to Telegram. Users chat with the bot. Everything works great — until someone says:

"Text me a silly joke every 5 minutes"

The bot tries to create a scheduled task (a cron job). It fails:

Cron: 'Silly Joke' failed: gateway closed (1008): pairing required

The bot apologizes and says the cron system isn't available because "the gateway isn't connected in this sandbox environment."

What the user wanted: A recurring scheduled message. What they got: A cryptic pairing error.


The Beginner Explanation

OpenClaw has a security system called "device pairing" — think of it like Bluetooth pairing. Before a device (like a CLI tool or the AI agent) can talk to the gateway, it needs to be "paired" — approved and registered.

On a normal computer, this happens automatically: the gateway sees the connection is coming from localhost (the same machine) and auto-approves it silently. No human intervention needed.

Inside a Vercel Sandbox (a lightweight virtual machine), this breaks. The auto-approval mechanism has a subtle bug that causes it to fail silently, and the gateway permanently rejects the AI agent's attempts to create cron jobs.

The fix is to "pre-pair" the device during setup — like pre-approving a Bluetooth device before you even turn it on.


The Technical Deep Dive

Architecture Context

Browser → Vercel Proxy → Sandbox MicroVM (port 3000)
                              └── OpenClaw Gateway
                                    ├── /v1/chat/completions (HTTP API) ✅ Works
                                    └── WebSocket Gateway Protocol (RPC) ❌ Fails

The HTTP chat completions endpoint uses simple token auth. The WebSocket gateway protocol requires device pairing on top of token auth.

The Call Chain That Fails

  1. User sends message via Telegram
  2. Message queued in Redis, drained to /v1/chat/completions on port 3000
  3. AI agent processes the request, decides to create a cron job
  4. Agent calls callGatewayTool("cron.add", ...) (src/agents/tools/cron-tool.ts)
  5. callGatewayTool → callGateway() → new GatewayClient(...) (src/gateway/call.ts)
  6. GatewayClient always calls loadOrCreateDeviceIdentity() — there's no option to skip this
  7. Client opens WebSocket to ws://127.0.0.1:3000 with device identity
  8. Gateway message handler processes the connect frame (src/gateway/server/ws-connection/message-handler.ts)

The Auth & Pairing Decision Tree

Connection arrives with device identity
  │
  ├─ Is client "openclaw-control-ui"? → allowInsecureAuth bypass → SKIP pairing ✅
  │
  └─ Is client "gateway-client"? (callGatewayTool uses this)
       │
       ├─ skipPairing = allowControlUiBypass && sharedAuthOk
       │   └─ allowControlUiBypass = false (not control UI) → skipPairing = false
       │
       └─ Pairing check runs:
            │
            ├─ getPairedDevice(device.id) → Is device already paired?
            │   ├─ YES + publicKey matches → Skip pairing ✅
            │   └─ NO → requestDevicePairing({ silent: isLocalClient })
            │
            └─ requestDevicePairing():
                 │
                 ├─ Existing pending request for this deviceId?
                 │   └─ YES → Return EXISTING request (BUG: silent flag NOT updated)
                 │
                 └─ No existing → Create new request with silent: isLocalClient
                      │
                      ├─ silent: true → Auto-approve, connection continues ✅
                      └─ silent: false → Close with 1008 "pairing required" ❌
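
The branching above can be condensed into a small function. This is only a simplified model for reasoning about the tree, not the handler's actual code: the ConnectContext shape, the decidePairing name, and the outcome labels are hypothetical, and the real handler takes more inputs.

```typescript
// Simplified, hypothetical model of the decision tree; not the real handler.
type PairingOutcome = "skip-pairing" | "auto-approved" | "closed-1008";

interface ConnectContext {
  isControlUi: boolean;            // client is "openclaw-control-ui"
  allowInsecureAuth: boolean;      // config flag, only honored for the Control UI
  sharedAuthOk: boolean;           // token auth succeeded
  alreadyPaired: boolean;          // paired.json holds a matching publicKey
  isLocalClient: boolean;          // outcome of the local-request check
  existingPendingSilent?: boolean; // silent flag of a stale pending request, if any
}

function decidePairing(ctx: ConnectContext): PairingOutcome {
  const allowControlUiBypass = ctx.isControlUi && ctx.allowInsecureAuth;
  const skipPairing = allowControlUiBypass && ctx.sharedAuthOk;
  if (skipPairing || ctx.alreadyPaired) return "skip-pairing";
  // A stale pending request wins over the current connection's locality.
  const silent = ctx.existingPendingSilent ?? ctx.isLocalClient;
  return silent ? "auto-approved" : "closed-1008";
}

// The agent's gateway-client connection inside the sandbox:
const agent: ConnectContext = {
  isControlUi: false,
  allowInsecureAuth: true,
  sharedAuthOk: true,
  alreadyPaired: false,
  isLocalClient: false,
};
console.log(decidePairing(agent));                           // closed-1008
console.log(decidePairing({ ...agent, alreadyPaired: true })); // skip-pairing
```

Note that allowInsecureAuth has no effect on the agent's outcome, while alreadyPaired short-circuits everything else.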

Root Cause #1: The Sticky Pending Request Bug

In src/infra/device-pairing.ts, requestDevicePairing() has this code:

const existing = Object.values(state.pendingById).find((p) => p.deviceId === deviceId);
if (existing) {
  return { status: "pending", request: existing, created: false };
}

If the first-ever pairing attempt for a deviceId was non-local (silent: false), it creates a pending request with silent: false. Every subsequent attempt — even local ones with silent: true — returns that stale request without updating the silent flag. The pending request has a 5-minute TTL, but in practice the gateway keeps retrying and the request never expires.
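
To make the sticky behavior concrete, here is a minimal in-memory sketch. It assumes a simplified PendingRequest shape and ignores TTLs and persistence. The upgradeSilent switch is a hypothetical patch added purely for illustration; it is not something OpenClaw is known to ship.

```typescript
// Minimal in-memory model of the pending-request lookup (assumed shapes).
interface PendingRequest {
  requestId: string;
  deviceId: string;
  silent: boolean;
}

interface PairingState {
  pendingById: Record<string, PendingRequest>;
}

function requestDevicePairingSketch(
  state: PairingState,
  deviceId: string,
  silent: boolean,
  upgradeSilent = false, // hypothetical patch: let a local retry upgrade the flag
): { request: PendingRequest; created: boolean } {
  const existing = Object.values(state.pendingById).find((p) => p.deviceId === deviceId);
  if (existing) {
    if (upgradeSilent && silent) existing.silent = true;
    return { request: existing, created: false }; // stale request is reused
  }
  const requestId = `req-${Object.keys(state.pendingById).length + 1}`;
  const request: PendingRequest = { requestId, deviceId, silent };
  state.pendingById[requestId] = request;
  return { request, created: true };
}

// First attempt is seen as non-local, the retry as local.
const state: PairingState = { pendingById: {} };
requestDevicePairingSketch(state, "device-a", false);
const retry = requestDevicePairingSketch(state, "device-a", true);
console.log(retry.created, retry.request.silent); // false false

// With the hypothetical patch, the local retry flips the flag.
const patched = requestDevicePairingSketch(state, "device-a", true, true);
console.log(patched.request.silent); // true
```

The retry never gets a chance to be auto-approved because the lookup returns before the new silent value is considered.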

Root Cause #2: Vercel Sandbox Networking Quirks

Even with --bind loopback, isLocalDirectRequest() may return false inside a MicroVM due to:

  • Internal proxy headers injected by the VM infrastructure
  • Non-standard req.socket.remoteAddress values
  • IPv6 edge cases in the VM's network stack

The isLocalDirectRequest() check requires ALL of:

  1. isLoopbackAddress(clientIp) = true
  2. Host header resolves to localhost, 127.0.0.1, or ::1
  3. No proxy headers (or from trusted proxy)

If any of these fails, isLocalClient = false → silent = false, which triggers Root Cause #1.
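
The three conditions can be sketched as a pure function, which makes it easier to see how a single injected header flips the result. This is an approximation under assumptions: the LocalCheckInput shape and the helper bodies are hypothetical, and the real isLocalDirectRequest() works on an actual request object.

```typescript
// Approximate model of the three-part check (assumed input shape).
interface LocalCheckInput {
  remoteAddress?: string;     // e.g. req.socket.remoteAddress
  hostHeader?: string;        // e.g. "127.0.0.1:3000"
  hasProxyHeaders: boolean;   // e.g. X-Forwarded-For is present
  fromTrustedProxy: boolean;  // sender matches gateway.trustedProxies
}

function isLoopbackAddress(ip?: string): boolean {
  if (!ip) return false;
  if (ip === "::1") return true;
  // IPv4-mapped IPv6, e.g. "::ffff:127.0.0.1"
  const v4 = ip.startsWith("::ffff:") ? ip.slice(7) : ip;
  return v4.startsWith("127.");
}

function hostIsLocal(host?: string): boolean {
  if (!host) return false;
  if (host === "::1") return true;
  // Strip the port; bracketed IPv6 hosts look like "[::1]:3000".
  const name = host.startsWith("[") ? host.slice(1, host.indexOf("]")) : host.split(":")[0];
  return name === "localhost" || name === "127.0.0.1" || name === "::1";
}

function isLocalDirectRequestSketch(input: LocalCheckInput): boolean {
  const proxyOk = !input.hasProxyHeaders || input.fromTrustedProxy;
  return isLoopbackAddress(input.remoteAddress) && hostIsLocal(input.hostHeader) && proxyOk;
}

const direct: LocalCheckInput = {
  remoteAddress: "127.0.0.1",
  hostHeader: "127.0.0.1:3000",
  hasProxyHeaders: false,
  fromTrustedProxy: false,
};
console.log(isLocalDirectRequestSketch(direct));                              // true
console.log(isLocalDirectRequestSketch({ ...direct, hasProxyHeaders: true })); // false
```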

What We Tried (and Why Each Failed)

| Attempt | Why It Didn't Work |
| --- | --- |
| --bind loopback | Correct for self-connection URL, but doesn't fix isLocalClient in VM |
| allowInsecureAuth: true | Only applies to Control UI clients, not gateway-client |
| dangerouslyDisableDeviceAuth: true | Only nullifies device for Control UI, not other clients |
| mkdir -p devices/ | writeJsonAtomic() already creates dirs; not the issue |
| Clearing paired.json on startup | Was already happening; doesn't help if auto-approval never runs |

What DOES Work: Pre-Pairing the Device

The gateway's pairing check (lines 674-675) does the following:

const paired = await getPairedDevice(device.id);
const isPaired = paired?.publicKey === devicePublicKey;

If the device is already in paired.json with a matching public key and sufficient scopes, the entire pairing flow is skipped. The connection proceeds immediately.

The Fix

Run a Node.js script before the gateway starts that:

  1. Ensures the device identity exists (~/.openclaw/identity/device.json)
  2. Reads the device's public key
  3. Writes a pre-approved entry to ~/.openclaw/devices/paired.json
  4. Clears ~/.openclaw/devices/pending.json to prevent the sticky-request bug

The script generates an Ed25519 key pair of the same kind OpenClaw uses, derives the deviceId from the SHA-256 hash of the public key, and writes the entry with the operator role and the three scopes that callGatewayTool requests: operator.admin, operator.approvals, operator.pairing.
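
A rough sketch of such a script follows, using only Node's built-in crypto and fs modules. The on-disk JSON layouts are assumptions made for illustration: the field names in device.json and paired.json, the base64url key encoding, and hashing the raw 32-byte key are all guesses and should be checked against the files your OpenClaw version actually writes. The function names are hypothetical too.

```typescript
import { createHash, generateKeyPairSync } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// ASSUMED identity layout; verify against a real ~/.openclaw/identity/device.json.
interface DeviceIdentity {
  deviceId: string;
  publicKey: string;     // raw Ed25519 public key, base64url (assumed encoding)
  privateKeyPem: string;
}

function createIdentity(): DeviceIdentity {
  const { publicKey, privateKey } = generateKeyPairSync("ed25519");
  // An Ed25519 SPKI DER blob ends with the 32 raw public-key bytes.
  const spki = publicKey.export({ type: "spki", format: "der" });
  const raw = spki.subarray(spki.length - 32);
  return {
    deviceId: createHash("sha256").update(raw).digest("hex"),
    publicKey: raw.toString("base64url"),
    privateKeyPem: privateKey.export({ type: "pkcs8", format: "pem" }).toString(),
  };
}

// ASSUMED paired.json layout: a map keyed by deviceId.
function buildPairedFile(identity: DeviceIdentity, nowMs: number) {
  return {
    [identity.deviceId]: {
      deviceId: identity.deviceId,
      publicKey: identity.publicKey,
      role: "operator",
      scopes: ["operator.admin", "operator.approvals", "operator.pairing"],
      approvedAtMs: nowMs,
    },
  };
}

// stateDir would be ~/.openclaw in the sandbox; run this before the gateway starts.
function prePair(stateDir: string): DeviceIdentity {
  const identityFile = join(stateDir, "identity", "device.json");
  let identity: DeviceIdentity;
  if (existsSync(identityFile)) {
    identity = JSON.parse(readFileSync(identityFile, "utf8")) as DeviceIdentity;
  } else {
    identity = createIdentity();
    mkdirSync(join(stateDir, "identity"), { recursive: true });
    writeFileSync(identityFile, JSON.stringify(identity, null, 2), { mode: 0o600 });
  }
  const devicesDir = join(stateDir, "devices");
  mkdirSync(devicesDir, { recursive: true });
  const paired = buildPairedFile(identity, Date.now());
  writeFileSync(join(devicesDir, "paired.json"), JSON.stringify(paired, null, 2));
  // Start with no pending requests so a stale non-silent one cannot be reused.
  writeFileSync(join(devicesDir, "pending.json"), "{}");
  return identity;
}
```

A setup step could call prePair with the sandbox user's ~/.openclaw directory. Keeping key generation and entry construction separate from the file writes makes it easier to swap in the real file formats once they are confirmed.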

Additional Hardening

Add the loopback and private-network ranges to gateway.trustedProxies in the config:

{
  "gateway": {
    "trustedProxies": ["10.0.0.0/8", "127.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "::1"]
  }
}

This ensures isLocalDirectRequest() correctly handles any proxy headers the VM might inject.


Summary

| Layer | Issue | Fix |
| --- | --- | --- |
| OpenClaw bug | requestDevicePairing() reuses stale pending requests without updating silent flag | Clear pending.json at boot |
| VM networking | isLocalDirectRequest() may return false in MicroVM | Add loopback to trustedProxies |
| Architecture | callGatewayTool() always sends device identity, no bypass option | Pre-pair the device identity on disk |
| Config | allowInsecureAuth only covers Control UI, not gateway-client | Pre-pairing makes this irrelevant |

The "do this now" plan: Write a pre-pairing script into the sandbox setup that creates the device identity and writes paired.json before the gateway starts. This makes the entire pairing flow irrelevant — the device is already approved.
