@johnlindquist
Created February 17, 2026 20:08
Messaging Cold-Start UX Plan

tl;dr

  • Users messaging OpenClaw bots via Telegram/Slack often hit stopped sandboxes; the current proxy returns HTML waiting pages, which webhook senders can't use
  • URL stability is already solved: stable subdomain proxy ({key}.basedomain.com) survives sandbox restarts
  • Build dedicated /api/webhooks/telegram (and /slack) endpoints that return 200 OK immediately and queue messages in Redis
  • Send "booting up..." progress messages via platform API, editing them every ~30s until sandbox is ready
  • Call OpenClaw directly via sandbox.domain(3000) + gateway token — skip the proxy and password gate entirely
  • Add a cron-based messaging pump to send progress updates and drain queues
  • Extend idle timeout from 10min to 30-60min for sandboxes with active conversations
  • Open question: our OpenClaw setup currently passes --skip-channels. If OpenClaw has built-in Telegram/Slack support, should we enable it instead of building our own adapter?

Problem

When users message our OpenClaw bots via Telegram/Slack/etc., the sandbox is often stopped. The current proxy returns HTML waiting pages — useless for webhook senders that expect a fast 200 OK.

Key Insight: URL Stability Is Already Solved

Our stable subdomain proxy ({key}.basedomain.com) survives sandbox restarts. Raw sandbox hostnames are ephemeral, but the proxy re-resolves them via Redis. Webhook URLs registered with Telegram will keep working across restarts.

What We Need to Build

1. Dedicated webhook endpoints

POST /api/webhooks/telegram (and /slack, etc.)

  • Lives under /api/* so proxy.ts never touches it
  • Parses sandboxKey from Host header via existing parseSubdomain
  • Validates webhook (Telegram secret token, Slack signature)
  • Deduplicates retries with SET NX EX in Redis
  • Enqueues message in Redis LIST, triggers restore via after()
  • Sends "booting up..." message via platform API
  • Returns 200 OK immediately
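
The validation steps in the list above could look roughly like the sketch below. It is not the project's code: `parseSandboxKey` is a hypothetical stand-in for the existing `parseSubdomain`, and the function names and the 5-minute replay window are assumptions. The header names and the Slack `v0:{timestamp}:{body}` signing scheme follow the platforms' public documentation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical stand-in for the existing parseSubdomain helper:
// extracts "{key}" from "{key}.basedomain.com" (any port is ignored).
function parseSandboxKey(host: string, baseDomain: string): string | null {
  const hostname = host.split(":")[0].toLowerCase();
  const suffix = "." + baseDomain.toLowerCase();
  if (!hostname.endsWith(suffix)) return null;
  const key = hostname.slice(0, -suffix.length);
  // Reject empty keys and nested subdomains.
  return key.length > 0 && !key.includes(".") ? key : null;
}

// Constant-time string comparison that tolerates length mismatches.
function safeEqual(a: string, b: string): boolean {
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  return bufA.length === bufB.length && timingSafeEqual(bufA, bufB);
}

// Telegram echoes the secret_token passed to setWebhook in the
// X-Telegram-Bot-Api-Secret-Token request header.
function isValidTelegramRequest(headerValue: string | null, expectedSecret: string): boolean {
  return headerValue !== null && safeEqual(headerValue, expectedSecret);
}

// Slack signs "v0:{timestamp}:{rawBody}" with HMAC-SHA256 and sends "v0=<hex>"
// in X-Slack-Signature; the timestamp arrives in X-Slack-Request-Timestamp.
function isValidSlackRequest(
  signingSecret: string,
  timestamp: string,
  rawBody: string,
  signature: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  const ts = Number(timestamp);
  // Assumed 5-minute window: reject stale requests to limit replay.
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > 300) return false;
  const expected =
    "v0=" + createHmac("sha256", signingSecret).update(`v0:${timestamp}:${rawBody}`).digest("hex");
  return safeEqual(expected, signature);
}
```

Slack verification needs the raw request body, so the route would have to read the body as text before parsing it as JSON.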

2. Redis message queue

Per-conversation LIST + dedup keys:

  • msgq:{platform}:{sandboxKey}:{conversationId} — message queue
  • msgq:dedup:{platform}:{eventId} — retry dedup
  • msgq:drain-lock:{platform}:{sandboxKey}:{conversationId} — drain lock
  • msgq:warm:{platform}:{sandboxKey}:{conversationId} — warmup job state

Atomic batch drain via EVAL (LRANGE + LTRIM).
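
A minimal sketch of the key scheme, the retry dedup, and the atomic drain follows. `MinimalRedis` is a hypothetical adapter over whichever Redis client the project uses, not a real library interface, and the 1-hour dedup TTL is an assumption. It also assumes messages are appended with RPUSH, so the head of the list is the oldest message.

```typescript
// Hypothetical adapter; map these onto the real client's SET and EVAL calls.
interface MinimalRedis {
  // SET key value NX EX ttl; resolves true if the key was newly set.
  setIfAbsent(key: string, value: string, ttlSeconds: number): Promise<boolean>;
  // EVAL script with the given KEYS and ARGV.
  evalScript(script: string, keys: string[], args: string[]): Promise<unknown>;
}

type ConversationRef = { platform: string; sandboxKey: string; conversationId: string };

const queueKey = (c: ConversationRef): string =>
  `msgq:${c.platform}:${c.sandboxKey}:${c.conversationId}`;
const dedupKey = (platform: string, eventId: string): string =>
  `msgq:dedup:${platform}:${eventId}`;

// Reads up to ARGV[1] items from the head and removes exactly those items.
// Redis runs the script atomically, so no push or pop can interleave.
const DRAIN_SCRIPT = `
local items = redis.call('LRANGE', KEYS[1], 0, tonumber(ARGV[1]) - 1)
if #items > 0 then
  redis.call('LTRIM', KEYS[1], #items, -1)
end
return items
`;

// True the first time an event id is seen within the TTL window (assumed 1 hour).
async function isFirstDelivery(
  redis: MinimalRedis,
  platform: string,
  eventId: string,
  ttlSeconds = 3600,
): Promise<boolean> {
  return redis.setIfAbsent(dedupKey(platform, eventId), "1", ttlSeconds);
}

// Pops up to maxItems messages in arrival order.
async function drainBatch(
  redis: MinimalRedis,
  conversation: ConversationRef,
  maxItems: number,
): Promise<string[]> {
  // Guard: LRANGE 0 -1 would return the whole list if maxItems were 0.
  if (maxItems < 1) return [];
  const result = await redis.evalScript(DRAIN_SCRIPT, [queueKey(conversation)], [String(maxItems)]);
  return Array.isArray(result) ? result.map(String) : [];
}
```

If delivery to OpenClaw fails after a drain, the batch is already gone from Redis, so the caller would need to re-queue it or hold the drain lock while retrying.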

3. Shared restore logic

Extract the proxy route's restore-from-snapshot into src/server/sandboxes/restore-runtime.ts. Reuse from:

  • Proxy route (browser traffic)
  • Webhook routes (messaging)
  • Admin restore route (fixes inconsistency where admin restore currently skips .on-restore.sh)

4. Platform progress messages

Send "booting up..." via platform API, edit the same message with updates:

  • T=0: "Starting up..."
  • T=30s: "Still starting..."
  • T=60-90s: "Still starting; I'll respond when ready."
  • T=180s: "Still not ready; I'll keep trying."
  • When ready: Drain queue, send actual responses
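
One way to encode the schedule above is a small pure function that maps elapsed time to message text. This is a sketch under assumptions: the 60-90s step is modeled as starting at 60s, each message persists until the next threshold, and the function names are made up for illustration.

```typescript
// Thresholds and texts taken from the schedule above.
const PROGRESS_STEPS: ReadonlyArray<{ afterSeconds: number; text: string }> = [
  { afterSeconds: 0, text: "Starting up..." },
  { afterSeconds: 30, text: "Still starting..." },
  { afterSeconds: 60, text: "Still starting; I'll respond when ready." },
  { afterSeconds: 180, text: "Still not ready; I'll keep trying." },
];

// Returns the text of the latest step whose threshold has been reached.
function progressText(elapsedSeconds: number): string {
  let current = PROGRESS_STEPS[0].text;
  for (const step of PROGRESS_STEPS) {
    if (elapsedSeconds >= step.afterSeconds) current = step.text;
  }
  return current;
}

// Returns the new text only when it differs from what was last sent,
// so the caller can skip redundant edit calls; null means "nothing to do".
function nextProgressEdit(elapsedSeconds: number, lastSentText: string | null): string | null {
  const text = progressText(elapsedSeconds);
  return text === lastSentText ? null : text;
}
```

Skipping unchanged text matters because platform edit endpoints (Telegram's editMessageText, Slack's chat.update) count against rate limits, and Telegram rejects an edit whose content is identical to the current message.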

5. Cron-based messaging pump

/api/cron/messaging-pump — same locking pattern as existing check-sandboxes:

  • Send progress updates for sandboxes still restoring
  • Drain queues for sandboxes that are now running
  • Expire stale warmup jobs and notify users
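
The three pump responsibilities above can be read as a per-job decision. The sketch below assumes a hypothetical `WarmupJob` shape and a caller-supplied `maxWarmupMs`; the plan does not specify when a warmup job counts as stale, so that value is left as a parameter.

```typescript
// Hypothetical warmup job state; field names are assumptions for illustration.
interface WarmupJob {
  startedAtMs: number;              // when the restore was triggered
  status: "restoring" | "running"; // latest known sandbox state
  queuedMessages: number;           // current length of the message queue
}

type PumpAction = "send-progress" | "drain-queue" | "expire-and-notify" | "noop";

// Decides what a single pump tick should do for one conversation.
function planPumpAction(job: WarmupJob, nowMs: number, maxWarmupMs: number): PumpAction {
  // A running sandbox takes priority: deliver whatever is queued.
  if (job.status === "running") {
    return job.queuedMessages > 0 ? "drain-queue" : "noop";
  }
  // Still restoring past the deadline: give up and tell the user.
  if (nowMs - job.startedAtMs > maxWarmupMs) return "expire-and-notify";
  // Still restoring within the deadline: keep the user informed.
  return "send-progress";
}
```

Keeping the decision pure makes it easy to unit-test separately from the Redis locking and platform API calls that the cron route would wrap around it.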

6. Smarter keepalive

Extend idle threshold from 10min → 30-60min for sandboxes with active messaging conversations. Reduces unnecessary cold starts during normal chat pauses.
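
As a rough illustration, the threshold choice could be a single branch in the idle check. The constant and function names are hypothetical, 30 minutes is just the low end of the proposed 30-60min range, and how "active conversation" is determined is left to the caller.

```typescript
const DEFAULT_IDLE_MS = 10 * 60 * 1000;   // current 10min threshold
const MESSAGING_IDLE_MS = 30 * 60 * 1000; // assumed: low end of the 30-60min range

// Picks the idle threshold based on whether the sandbox has an active conversation.
function idleThresholdMs(hasActiveConversation: boolean): number {
  return hasActiveConversation ? MESSAGING_IDLE_MS : DEFAULT_IDLE_MS;
}

// True when the sandbox has been idle long enough to stop.
function shouldStopSandbox(
  lastActivityAtMs: number,
  nowMs: number,
  hasActiveConversation: boolean,
): boolean {
  return nowMs - lastActivityAtMs >= idleThresholdMs(hasActiveConversation);
}
```

With these values, a sandbox idle for 15 minutes is stopped if nobody is chatting with it but kept alive if a conversation is active.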

Architecture Decision: Call OpenClaw Directly

Skip the proxy for messaging. Load sandbox meta from Redis, then:

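// Look up the running sandbox and address OpenClaw on port 3000 directly.
const sandbox = await Sandbox.get({ sandboxId: meta.sandboxId })
const baseUrl = sandbox.domain(3000)
// Await the call so failures surface here instead of as an unhandled rejection.
const res = await fetch(baseUrl + '/api/...', {
  // The gateway token is known only to our server.
  headers: { Authorization: `Bearer ${meta.gatewayToken}` }
})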

This avoids the password gate entirely and keeps OpenClaw private (only our server knows the token).

Open Question

OpenClaw setup currently passes --skip-channels. Does OpenClaw have built-in Telegram/Slack channel support we should enable instead of building our own adapter layer?
