Files
clawdbot/docs/concepts/architecture.md
2026-01-22 22:02:06 +00:00

123 lines
4.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
summary: "WebSocket gateway architecture, components, and client flows"
read_when:
- Working on gateway protocol, clients, or transports
---
# Gateway architecture
Last updated: 2026-01-22
## Overview
- A single longlived **Gateway** owns all messaging surfaces (WhatsApp via
Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
- Control-plane clients (macOS app, CLI, web UI, automations) connect to the
Gateway over **WebSocket** on the configured bind host (default
`127.0.0.1:18789`).
- **Nodes** (macOS/iOS/Android/headless) also connect over **WebSocket**, but
declare `role: node` with explicit caps/commands.
- One Gateway per host; it is the only place that opens a WhatsApp session.
- A **canvas host** (default `18793`) serves agenteditable HTML and A2UI.
## Components and flows
### Gateway (daemon)
- Maintains provider connections.
- Exposes a typed WS API (requests, responses, serverpush events).
- Validates inbound frames against JSON Schema.
- Emits events like `agent`, `chat`, `presence`, `health`, `heartbeat`, `cron`.
### Clients (mac app / CLI / web admin)
- One WS connection per client.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`).
- Subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
### Nodes (macOS / iOS / Android / headless)
- Connect to the **same WS server** with `role: node`.
- Provide a device identity in `connect`; pairing is **devicebased** (role `node`) and
approval lives in the device pairing store.
- Expose commands like `canvas.*`, `camera.*`, `screen.record`, `location.get`.
Protocol details:
- [Gateway protocol](/gateway/protocol)
### WebChat
- Static UI that uses the Gateway WS API for chat history and sends.
- In remote setups, connects through the same SSH/Tailscale tunnel as other
clients.
## Connection lifecycle (single client)
```
Client Gateway
| |
|---- req:connect -------->|
|<------ res (ok) ---------| (or res error + close)
| (payload=hello-ok carries snapshot: presence + health)
| |
|<------ event:presence ---|
|<------ event:tick -------|
| |
|------- req:agent ------->|
|<------ res:agent --------| (ack: {runId,status:"accepted"})
|<------ event:agent ------| (streaming)
|<------ res:agent --------| (final: {runId,status,summary})
| |
```
## Wire protocol (summary)
- Transport: WebSocket, text frames with JSON payloads.
- First frame **must** be `connect`.
- After handshake:
- Requests: `{type:"req", id, method, params}``{type:"res", id, ok, payload|error}`
- Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
must match or the socket closes.
- Idempotency keys are required for sideeffecting methods (`send`, `agent`) to
safely retry; the server keeps a shortlived dedupe cache.
- Nodes must include `role: "node"` plus caps/commands/permissions in `connect`.
## Pairing + local trust
- All WS clients (operators + nodes) include a **device identity** on `connect`.
- New device IDs require pairing approval; the Gateway issues a **device token**
for subsequent connects.
- **Local** connects (loopback or the gateway hosts own tailnet address) can be
autoapproved to keep samehost UX smooth.
- **Nonlocal** connects must sign the `connect.challenge` nonce and require
explicit approval.
- Gateway auth (`gateway.auth.*`) still applies to **all** connections, local or
remote.
Details: [Gateway protocol](/gateway/protocol), [Pairing](/start/pairing),
[Security](/gateway/security).
## Protocol typing and codegen
- TypeBox schemas define the protocol.
- JSON Schema is generated from those schemas.
- Swift models are generated from the JSON Schema.
## Remote access
- Preferred: Tailscale or VPN.
- Alternative: SSH tunnel
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- The same handshake + auth token apply over the tunnel.
- TLS + optional pinning can be enabled for WS in remote setups.
## Operations snapshot
- Start: `clawdbot gateway` (foreground, logs to stdout).
- Health: `health` over WS (also included in `hello-ok`).
- Supervision: launchd/systemd for autorestart.
## Invariants
- Exactly one Gateway controls a single Baileys session per host.
- Handshake is mandatory; any nonJSON or nonconnect first frame is a hard close.
- Events are not replayed; clients must refresh on gaps.