---
summary: "WebSocket gateway architecture, components, and client flows"
read_when:
- Working on gateway protocol, clients, or transports
---
# Gateway architecture
Last updated: 2026-01-05
## Overview
- A single long-lived **Gateway** owns all messaging surfaces (WhatsApp via
Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over
**one transport: WebSocket** on the configured bind host (default
`127.0.0.1:18789`).
- One Gateway per host; it is the only place that opens a WhatsApp session.
- A **bridge** (default `18790`) is used for nodes (macOS/iOS/Android).
- A **canvas host** (default `18793`) serves agent-editable HTML and A2UI.
## Components and flows
### Gateway (daemon)
- Maintains provider connections.
- Exposes a typed WS API (requests, responses, server-push events).
- Validates inbound frames against JSON Schema.
- Emits events like `agent`, `chat`, `presence`, `health`, `heartbeat`, `cron`.
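
As a rough illustration of the typed frames the Gateway accepts, the sketch below classifies a raw text frame into one of the three shapes listed in the wire protocol summary. It is not the Gateway's validator: the hand-rolled checks stand in for JSON Schema validation, the helper names `parseFrame` and `isValidFirstFrame` are hypothetical, and the field types (for example `id` as a string) are assumptions.

```typescript
// Frame shapes follow the wire protocol summary; concrete field types are assumptions.
type ReqFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number; stateVersion?: unknown };
type Frame = ReqFrame | ResFrame | EventFrame;

// Hypothetical helper: returns a typed frame, or null for non-JSON or malformed input.
function parseFrame(text: string): Frame | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null; // not JSON at all
  }
  if (typeof raw !== "object" || raw === null) return null;
  const f = raw as Record<string, unknown>;
  const hasId = typeof f.id === "string";
  if (f.type === "req" && hasId && typeof f.method === "string") return f as unknown as ReqFrame;
  if (f.type === "res" && hasId && typeof f.ok === "boolean") return f as unknown as ResFrame;
  if (f.type === "event" && typeof f.event === "string") return f as unknown as EventFrame;
  return null;
}

// Hypothetical helper: the first frame on a socket must be a `connect` request.
function isValidFirstFrame(text: string): boolean {
  const frame = parseFrame(text);
  return frame !== null && frame.type === "req" && frame.method === "connect";
}
```

A server built this way would close the socket whenever `isValidFirstFrame` returns `false` for the first frame it receives.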
### Clients (macOS app / CLI / web admin)
- One WS connection per client.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`).
- Subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
### Nodes (macOS / iOS / Android)
- Connect to the **bridge** (TCP JSONL) rather than the WS server.
- Pair with the Gateway to receive a token.
- Expose commands like `canvas.*`, `camera.*`, `screen.record`, `location.get`.
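
Because the bridge speaks JSONL over TCP, each message is one JSON document terminated by a newline, and a TCP chunk may end in the middle of a message. The sketch below shows one way to frame such a stream. The names `JsonlDecoder` and `encodeJsonl` are hypothetical, and the actual message shapes used on the bridge are not specified here.

```typescript
// Hypothetical helper: serialize one message as a single JSONL line.
function encodeJsonl(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Hypothetical helper: buffers partial chunks and yields complete messages.
class JsonlDecoder {
  private buffer = "";

  // Feed one chunk of text; returns every message completed by this chunk.
  // Throws (via JSON.parse) if a complete line is not valid JSON.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""), so keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    const messages: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue; // skip blank lines
      messages.push(JSON.parse(trimmed));
    }
    return messages;
  }
}
```

In a real node client, `push` would be called from the socket's data handler with each decoded chunk.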
### WebChat
- Static UI that uses the Gateway WS API for chat history and sends.
- In remote setups, connects through the same SSH/Tailscale tunnel as other
clients.
## Connection lifecycle (single client)
```
Client                  Gateway
|                          |
|---- req:connect -------->|
|<------ res (ok) ---------|  (or res error + close)
|   (payload=hello-ok carries snapshot: presence + health)
|                          |
|<------ event:presence ---|
|<------ event:tick -------|
|                          |
|------- req:agent ------->|
|<------ res:agent --------|  (ack: {runId,status:"accepted"})
|<------ event:agent ------|  (streaming)
|<------ res:agent --------|  (final: {runId,status,summary})
|                          |
```
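
The `agent` request in the diagram is unusual in that it produces two `res` frames: an immediate ack and a later final result, with streamed `agent` events in between. A client could track that with a small state machine, sketched below. The class `AgentRunTracker` is hypothetical, error responses are ignored for brevity, and the sketch assumes that both `res` frames carry the id of the original request.

```typescript
// Minimal frame shapes for this sketch; payload field types are assumptions.
type LifecycleFrame =
  | { type: "res"; id: string; ok: boolean; payload?: { runId?: string; status?: string; summary?: unknown } }
  | { type: "event"; event: string; payload?: unknown };

type AgentRunState = "pending" | "accepted" | "done";

// Hypothetical client-side tracker for one `agent` request.
class AgentRunTracker {
  state: AgentRunState = "pending";
  runId: string | undefined = undefined;
  summary: unknown = undefined;
  readonly chunks: unknown[] = [];
  private readonly requestId: string;

  constructor(requestId: string) {
    this.requestId = requestId;
  }

  handle(frame: LifecycleFrame): void {
    if (frame.type === "res") {
      // Ignore responses to other requests; error handling is omitted here.
      if (frame.id !== this.requestId || !frame.ok) return;
      const payload = frame.payload ?? {};
      if (this.state === "pending" && payload.status === "accepted") {
        this.state = "accepted"; // first res: the ack
        this.runId = payload.runId;
      } else if (this.state === "accepted") {
        this.state = "done"; // second res: the final result
        this.summary = payload.summary;
      }
      return;
    }
    // Streamed output only counts while the run is in flight.
    if (frame.event === "agent" && this.state === "accepted") {
      this.chunks.push(frame.payload);
    }
  }
}
```

Feeding the frames from the diagram into `handle` in order moves the tracker from `pending` to `accepted` to `done`.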
## Wire protocol (summary)
- Transport: WebSocket, text frames with JSON payloads.
- First frame **must** be `connect`.
- After handshake:
  - Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
  - Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
must match or the socket closes.
- Idempotency keys are required for side-effecting methods (`send`, `agent`) so
clients can retry safely; the server keeps a short-lived dedupe cache.
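
To make the idempotency rule concrete, here is a minimal sketch of a short-lived dedupe cache keyed by idempotency key. It is an illustration under assumptions, not the Gateway's implementation: the class `DedupeCache`, its TTL parameter, and the injectable `now` clock are all hypothetical, and expired entries are only replaced on reuse rather than purged.

```typescript
// Hypothetical server-side cache: one stored result per idempotency key.
class DedupeCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Runs `effect` at most once per key within the TTL.
  // A retry with the same key inside the TTL gets the cached result instead.
  run(key: string, effect: () => T, now: number = Date.now()): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > now) return hit.value;
    const value = effect();
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}
```

With this shape, a client that times out waiting for a `send` response can resend the same request with the same key, and the message goes out only once as long as the retry arrives before the entry expires.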
## Protocol typing and codegen
- TypeBox schemas define the protocol.
- JSON Schema is generated from those schemas.
- Swift models are generated from the JSON Schema.
## Remote access
- Preferred: Tailscale or VPN.
- Alternative: SSH tunnel
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- The same handshake + auth token apply over the tunnel.
## Operations snapshot
- Start: `clawdbot gateway` (foreground, logs to stdout).
- Health: `health` over WS (also included in `hello-ok`).
- Supervision: launchd/systemd for auto-restart.
## Invariants
- Exactly one Gateway controls a single Baileys session per host.
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
- Events are not replayed; clients must refresh on gaps.
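
Since events are not replayed, a client needs some way to notice that it missed one. The sketch below uses the optional `seq` field for that, assuming (this is not stated above) that `seq` increases by exactly one per event on a connection. The class `SeqGapDetector` is hypothetical.

```typescript
// Hypothetical client-side helper: flags missed events from `seq` numbers.
class SeqGapDetector {
  private lastSeq: number | undefined = undefined;

  // Returns true when `seq` is not exactly lastSeq + 1, meaning events were missed.
  // Events without a seq are ignored, since the field is optional.
  observe(seq: number | undefined): boolean {
    if (seq === undefined) return false;
    const gap = this.lastSeq !== undefined && seq !== this.lastSeq + 1;
    this.lastSeq = seq;
    return gap;
  }
}
```

When `observe` returns `true`, the client would refresh its state, for example by re-issuing requests such as `health` or `system-presence`, rather than waiting for the missing events.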