docs: refresh and simplify docs

Peter Steinberger
2026-01-08 23:06:56 +01:00
parent 88dca1afdf
commit a6c309824e
46 changed files with 1117 additions and 2155 deletions


@@ -5,11 +5,11 @@ read_when:
---
# Agent Loop (Clawdis)
Short, exact flow of one agent run. Source of truth: current code in `src/`.
Short, exact flow of one agent run.
## Entry points
- Gateway RPC: `agent` and `agent.wait` in [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts).
- CLI: `agentCommand` in [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts).
- Gateway RPC: `agent` and `agent.wait`.
- CLI: `agent` command.
## High-level flow
1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately.
@@ -37,10 +37,8 @@ Short, exact flow of one agent run. Source of truth: current code in `src/`.
- `tool`: streamed tool events from pi-agent-core
## Chat provider handling
- `createAgentEventHandler` in [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts):
- buffers assistant deltas
- emits chat `delta` messages
- emits chat `final` when **lifecycle end/error** arrives
- Assistant deltas are buffered into chat `delta` messages.
- A chat `final` is emitted on **lifecycle end/error**.
## Timeouts
- `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides.
@@ -51,11 +49,3 @@ Short, exact flow of one agent run. Source of truth: current code in `src/`.
- AbortSignal (cancel)
- Gateway disconnect or RPC timeout
- `agent.wait` timeout (wait-only, does not stop agent)
## Files
- [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts)
- [`src/gateway/server-methods/agent-job.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent-job.ts)
- [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts)
- [`src/agents/pi-embedded-runner.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-runner.ts)
- [`src/agents/pi-embedded-subscribe.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-subscribe.ts)
- [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts)


@@ -5,7 +5,7 @@ read_when:
---
# Agent Runtime 🤖
CLAWDBOT runs a single embedded agent runtime derived from **p-mono** (internal name: **p**).
CLAWDBOT runs a single embedded agent runtime derived from **p-mono**.
## Workspace (required)
@@ -43,9 +43,9 @@ To disable bootstrap file creation entirely (for pre-seeded workspaces), set:
{ agent: { skipBootstrap: true } }
```
## Built-in tools (internal)
## Built-in tools
p's embedded core tools (read/bash/edit/write and related internals) are defined in code and always available. `TOOLS.md` does **not** control which tools exist; it's guidance for how *you* want them used.
Core tools (read/bash/edit/write and related system tools) are always available. `TOOLS.md` does **not** control which tools exist; it's guidance for how *you* want them used.
## Skills
@@ -63,18 +63,6 @@ Clawdbot reuses pieces of the p-mono codebase (models/tools), but **session mana
- No p-coding agent runtime.
- No `~/.pi/agent` or `<workspace>/.pi` settings are consulted.
## Peter @ steipete (only)
Apply these notes **only** when the user is Peter Steinberger at steipete.
- Gateway runs on the **Mac Studio in London**.
- Primary work computer: **MacBook Pro**.
- Peter travels between **Vienna** and **London**; there are two networks bridged via **Tailscale**.
- For debugging, connect to the Mac Studio (London) or MacBook Pro (primary).
- There is also an **M1 MacBook Pro** on the Vienna tailnet you can use to access the Vienna network.
- Nodes can be accessed via the `clawdbot` binary (`pnpm clawdbot` in `~/Projects/clawdbot`).
- See also `skills/clawdbot*` for node/browser/canvas/cron usage.
## Sessions
Session transcripts are stored as JSONL at:


@@ -3,67 +3,55 @@ summary: "WebSocket gateway architecture, components, and client flows"
read_when:
- Working on gateway protocol, clients, or transports
---
# Gateway Architecture
# Gateway architecture
Last updated: 2026-01-05
## Overview
- A single long-lived **Gateway** process owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack via Bolt, Discord via discord.js, Signal via signal-cli, iMessage via imsg, WebChat) and the control/event plane.
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over one transport: **WebSocket on the configured bind host** (default `127.0.0.1:18789`; tunnel or VPN for remote).
- One Gateway per host; it is the only place that is allowed to open a WhatsApp session. All sends/agent runs go through it.
- By default: the Gateway exposes a Canvas host on `canvasHost.port` (default `18793`), serving `~/clawd/canvas` at `/__clawdbot__/canvas/` with live-reload; disable via `canvasHost.enabled=false` or `CLAWDBOT_SKIP_CANVAS_HOST=1`.
## Implementation snapshot (current code)
### TypeScript Gateway ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts))
- Single HTTP + WebSocket server (default `18789`); bind policy `loopback|lan|tailnet|auto`. Refuses non-loopback binds without auth; Tailscale serve/funnel requires loopback.
- Handshake: first frame must be a `connect` request; AJV validates request + params against TypeBox schemas; protocol negotiated via `minProtocol`/`maxProtocol`.
- `hello-ok` includes snapshot (presence/health/stateVersion/uptime/configPath/stateDir), features (methods/events), policy (max payload/buffer/tick), and `canvasHostUrl` when available.
- Events emitted: `agent`, `chat`, `presence`, `tick`, `health`, `heartbeat`, `cron`, `talk.mode`, `node.pair.requested`, `node.pair.resolved`, `voicewake.changed`, `shutdown`.
- Idempotency keys are required for `send`, `agent`, `chat.send`, and node invokes; the dedupe cache avoids double-sends on reconnects. Payload sizes are capped per connection.
- Optional node bridge ([`src/infra/bridge/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/bridge/server.ts)): TCP JSONL frames (`hello`, `pair-request`, `req/res`, `event`, `invoke`, `ping`). Node connect/disconnect updates presence and flows into the session bus.
- Control UI + Canvas host: HTTP serves Control UI (base path configurable) and can host the A2UI canvas via [`src/canvas-host/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/canvas-host/server.ts) (live reload). Canvas host URL is advertised to nodes + clients.
### iOS node (`apps/ios`)
- Discovery + pairing: `BridgeDiscoveryModel` uses `NWBrowser` Bonjour discovery and reads TXT fields for LAN/tailnet host hints plus gateway/bridge/canvas ports.
- Auto-connect: `BridgeConnectionController` uses stored `node.instanceId` + Keychain token; supports manual host/port; performs `pair-and-hello`.
- Bridge runtime: `BridgeSession` actor owns an `NWConnection`, JSONL frames, `hello`/`hello-ok`, ping/pong, `req/res`, server `event`s, and `invoke` callbacks; stores `canvasHostUrl`.
- Commands: `NodeAppModel` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, `screen.record`, `location.get`. Canvas/camera/screen are blocked when backgrounded.
- Canvas + actions: `WKWebView` with A2UI action bridge; accepts actions from local-network or trusted file URLs; intercepts `clawdbot://` deep links and forwards `agent.request` to the bridge.
- Voice/talk: voice wake sends `voice.transcript` events and syncs triggers via `voicewake.get` + `voicewake.changed`; Talk Mode attaches to the bridge.
### Android node (`apps/android`)
- Discovery + pairing: `BridgeDiscovery` uses mDNS/NSD to find `_clawdbot-bridge._tcp`, with manual host/port fallback.
- Auto-connect: `NodeRuntime` restores a stored token, performs `pair-and-hello`, and reconnects to the last discovered or manual bridge.
- Bridge runtime: `BridgeSession` owns the TCP JSONL session (`hello`/`hello-ok`, ping/pong, `req/res`, `event`, `invoke`); stores `canvasHostUrl`.
- Commands: `NodeRuntime` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, and chat/session events; foreground-only for canvas/camera.
- A single long-lived **Gateway** owns all messaging surfaces (WhatsApp via
Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over
**one transport: WebSocket** on the configured bind host (default
`127.0.0.1:18789`).
- One Gateway per host; it is the only place that opens a WhatsApp session.
- A **bridge** (default `18790`) is used for nodes (macOS/iOS/Android).
- A **canvas host** (default `18793`) serves agent-editable HTML and A2UI.
## Components and flows
- **Gateway (daemon)**
- Maintains WhatsApp (Baileys), Telegram (grammY), Slack (Bolt), Discord (discord.js), Signal (signal-cli), and iMessage (imsg) connections.
- Exposes a typed WS API (req/resp + server push events).
- Validates every inbound frame against JSON Schema; rejects anything before a mandatory `connect`.
- **Clients (mac app / CLI / web admin)**
- One WS connection per client.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`, toggles) and subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
- On macOS, the app can also be invoked via deep links (`clawdbot://agent?...`) which translate into the same Gateway `agent` request path (see [`docs/macos.md`](/platforms/macos)).
- **Agent process (Pi)**
- Spawned by the Gateway on demand for `agent` calls; streams events back over the same WS connection.
- **WebChat**
- Serves static assets locally.
- Holds a single WS connection to the Gateway for control/data; all sends/agent runs go through the Gateway WS.
- Remote use goes through the same SSH/Tailscale tunnel as other clients.
### Gateway (daemon)
- Maintains provider connections.
- Exposes a typed WS API (requests, responses, server-push events).
- Validates inbound frames against JSON Schema.
- Emits events like `agent`, `chat`, `presence`, `health`, `heartbeat`, `cron`.
### Clients (mac app / CLI / web admin)
- One WS connection per client.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`).
- Subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
### Nodes (macOS / iOS / Android)
- Connect to the **bridge** (TCP JSONL) rather than the WS server.
- Pair with the Gateway to receive a token.
- Expose commands like `canvas.*`, `camera.*`, `screen.record`, `location.get`.
### WebChat
- Static UI that uses the Gateway WS API for chat history and sends.
- In remote setups, connects through the same SSH/Tailscale tunnel as other
clients.
## Connection lifecycle (single client)
```
Client Gateway
| |
|---- req:connect -------->|
|<------ res (ok) ---------| (or res error + close)
| (payload=hello-ok carries snapshot: presence + health)
| (payload=hello-ok carries snapshot: presence + health)
| |
|<------ event:presence ---| (deltas)
|<------ event:tick -------| (keepalive/no-op)
|<------ event:presence ---|
|<------ event:tick -------|
| |
|------- req:agent ------->|
|<------ res:agent --------| (ack: {runId,status:"accepted"})
@@ -71,44 +59,42 @@ Client Gateway
|<------ res:agent --------| (final: {runId,status,summary})
| |
```
## Wire protocol (summary)
- Transport: WebSocket, text frames with JSON payloads.
- First frame must be `req {type:"req", id, method:"connect", params:{minProtocol, maxProtocol, client:{name,version,platform,mode,instanceId}, caps, auth?, locale?, userAgent? } }`.
- Server replies `res {type:"res", id, ok:true, payload: hello-ok }` or `ok:false` then closes.
- After handshake:
- Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
- Events: `{type:"event", event:"agent"|"presence"|"tick"|"shutdown", payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token` must match; otherwise the socket closes with policy violation.
- Presence payload is structured, not free text: `{host, ip, version, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }`.
- Agent runs are acked `{runId,status:"accepted"}` then complete with a final res `{runId,status,summary}`; streamed output arrives as `event:"agent"`.
- Protocol versions are bumped on breaking changes; clients must match `minClient`; Gateway chooses within the client's min/max.
- Idempotency keys are required for side-effecting methods (`send`, `agent`) to safely retry; server keeps a short-lived dedupe cache.
- Policy in `hello-ok` communicates payload/queue limits and tick interval.
- First frame **must** be `connect`.
- After handshake:
- Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
- Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
must match or the socket closes.
- Idempotency keys are required for side-effecting methods (`send`, `agent`) to
safely retry; the server keeps a short-lived dedupe cache.
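The frame shapes summarized above can be sketched in TypeScript as follows. This is a minimal illustration, not the Gateway's actual client code: the frame fields and the `connect` params are taken from this summary, while the helper names, the placeholder client values, the empty `caps` list, and the `idempotencyKey` field name are assumptions made for the example.

```typescript
import { randomUUID } from "node:crypto";

// Frame shapes as described in the wire protocol summary.
type ReqFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number; stateVersion?: unknown };
type Frame = ReqFrame | ResFrame | EventFrame;

// Hypothetical helper: builds the mandatory first frame of a connection.
function buildConnectFrame(opts: {
  minProtocol: number;
  maxProtocol: number;
  instanceId: string;
  token?: string;
}): ReqFrame {
  return {
    type: "req",
    id: randomUUID(),
    method: "connect",
    params: {
      minProtocol: opts.minProtocol,
      maxProtocol: opts.maxProtocol,
      // Placeholder client identity; real clients report their own values.
      client: { name: "example-client", version: "0.0.0", platform: "node", mode: "cli", instanceId: opts.instanceId },
      caps: [],
      // Only included when a shared token is configured on the Gateway.
      ...(opts.token ? { auth: { token: opts.token } } : {}),
    },
  };
}

// Hypothetical helper: a side-effecting request that carries an idempotency key,
// so a retry after a reconnect can be deduplicated by the server.
// The `idempotencyKey` field name is an assumption.
function buildSendFrame(params: Record<string, unknown>, idempotencyKey: string = randomUUID()): ReqFrame {
  return { type: "req", id: randomUUID(), method: "send", params: { ...params, idempotencyKey } };
}
```

A retry would reuse the same idempotency key with a fresh frame `id`, which is why the key is passed in separately from the request id.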
## Type system and codegen
- Source of truth: TypeBox (or ArkType) definitions in `protocol/` on the server.
- Build step emits JSON Schema.
- Clients:
- TypeScript: uses the same TypeBox types directly.
- Swift: generated `Codable` models via quicktype from the JSON Schema.
- Validation: AJV on the server for every inbound frame; optional client-side validation for defensive programming.
## Protocol typing and codegen
## Invariants
- Exactly one Gateway controls a single Baileys session per host. No fallbacks to ad-hoc direct Baileys sends.
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
- All methods and events are versioned; new fields are additive; breaking changes increment `protocol`.
- No event replay: on seq gaps, clients must refresh (`health` + `system-presence`) and continue; presence is bounded via TTL/max entries.
- TypeBox schemas define the protocol.
- JSON Schema is generated from those schemas.
- Swift models are generated from the JSON Schema.
## Remote access
- Preferred: Tailscale or VPN; alternate: SSH tunnel `ssh -N -L 18789:127.0.0.1:18789 user@host`.
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel.
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel.
- Preferred: Tailscale or VPN.
- Alternative: SSH tunnel
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- The same handshake + auth token apply over the tunnel.
## Operations snapshot
- Start: `clawdbot gateway` (foreground, logs to stdout).
Supervise with launchd/systemd for restarts.
- Health: request `health` over WS; also surfaced in `hello-ok.health`.
- Metrics/logging: keep outside this spec; gateway should expose Prometheus text or structured logs separately.
## Migration notes
- This architecture supersedes the legacy stdin RPC and the ad-hoc TCP control port. New clients should speak only the WS protocol. Legacy compatibility is intentionally dropped.
- Start: `clawdbot gateway` (foreground, logs to stdout).
- Health: `health` over WS (also included in `hello-ok`).
- Supervision: launchd/systemd for auto-restart.
## Invariants
- Exactly one Gateway controls a single Baileys session per host.
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
- Events are not replayed; clients must refresh on gaps.
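Since events are not replayed, a client has to notice gaps itself. The sketch below shows one way to do that with the optional `seq` field on event frames. It assumes `seq` increases by exactly 1 per event on a connection, which this document does not state; the `SeqTracker` class and its callback are hypothetical names.

```typescript
type GatewayEvent = { type: "event"; event: string; payload?: unknown; seq?: number };

// Hypothetical client-side helper that flags gaps in the event sequence.
class SeqTracker {
  private lastSeq: number | undefined = undefined;
  private readonly onGap: (expected: number, got: number) => void;

  constructor(onGap: (expected: number, got: number) => void) {
    this.onGap = onGap;
  }

  // Returns true when the event is in order, false when a gap was detected.
  observe(evt: GatewayEvent): boolean {
    if (evt.seq === undefined) return true; // unsequenced events carry no ordering info
    // The first sequenced event establishes the baseline (assumption).
    const expected = this.lastSeq === undefined ? evt.seq : this.lastSeq + 1;
    this.lastSeq = evt.seq;
    if (evt.seq !== expected) {
      // Missed events cannot be recovered; the caller should refresh its state.
      this.onGap(expected, evt.seq);
      return false;
    }
    return true;
  }
}
```

In the `onGap` callback a client would re-request current state (the earlier version of this page names `health` and `system-presence`) and then continue from the new sequence number.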


@@ -61,7 +61,6 @@ Only the owner number (from `whatsapp.allowFrom`, or the bot's own E.164 when
4) Session-level directives (`/verbose on`, `/think high`, `/new` or `/reset`, `/compact`) apply only to that group's session; send them as standalone messages so they register. Your personal DM session remains independent.
## Testing / verification
- Automated: `pnpm test -- src/web/auto-reply.test.ts --runInBand` (covers mention gating, history injection, sender suffix).
- Manual smoke:
- Send an `@clawd` ping in the group and confirm a reply that references the sender name.
- Send a second ping and verify the history block is included then cleared on the next turn.


@@ -1,157 +1,114 @@
---
summary: "Plan for models CLI: scan, list, aliases, fallbacks, status"
summary: "Models CLI: list, set, aliases, fallbacks, scan, status"
read_when:
- Adding or modifying models CLI (models list/set/scan/aliases/fallbacks)
- Changing model fallback behavior or selection UX
- Updating model scan probes (tools/images)
---
# Models CLI plan
# Models CLI
See [`docs/model-failover.md`](/concepts/model-failover) for how auth profiles rotate (OAuth vs API keys), cooldowns, and how that interacts with model fallbacks.
See [/concepts/model-failover](/concepts/model-failover) for auth profile
rotation, cooldowns, and how that interacts with fallbacks.
Goal: give clear model visibility + control (configured vs available), plus scan tooling
that prefers tool-call + image-capable models and maintains ordered fallbacks.
## How Clawdbot models work (quick explainer)
## How model selection works
Clawdbot selects models in this order:
1) The configured **primary** model (`agent.model.primary`).
2) If it fails, fallbacks in `agent.model.fallbacks` (in order).
3) Auth failover happens **inside** the provider first (see [/concepts/model-failover](/concepts/model-failover)).
Key pieces:
- `provider/model` is the canonical model id (e.g. `anthropic/claude-opus-4-5`).
- `agent.models` is the **allowlist/catalog** of models Clawdbot can use, with optional aliases and provider params.
- `agent.imageModel` is only used when the primary model **can't** accept images.
- `models.providers` lets you add custom providers + models (written to `models.json`).
- `/model <id>` switches the active model for the current session; `/model list` shows what's allowed.
1) **Primary** model (`agent.model.primary` or `agent.model`).
2) **Fallbacks** in `agent.model.fallbacks` (in order).
3) **Provider auth failover** happens inside a provider before moving to the
next model.
Related:
- Context limits are model-specific; long sessions may trigger compaction. See [/concepts/compaction](/concepts/compaction).
- `agent.models` is the allowlist/catalog of models Clawdbot can use (plus aliases).
- `agent.imageModel` is used **only when** the primary model can't accept images.
## Model recommendations
## Config keys (overview)
- [Claude Opus 4.5](https://www.anthropic.com/claude/opus): default primary for assistant + general work. It's pricey and cap-prone, so consider the [Claude Max $200 subscription](https://www.anthropic.com/pricing/) if you live here.
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): default fallback when Opus caps out. Similar behavior with fewer limit headaches.
- [GPT-5.2-Codex](https://developers.openai.com/codex/models): recommended for coding and sub-agents. Prefer the [Codex CLI](https://developers.openai.com/codex/cli) if you want the strongest feel.
- `agent.model.primary` and `agent.model.fallbacks`
- `agent.imageModel.primary` and `agent.imageModel.fallbacks`
- `agent.models` (allowlist + aliases + provider params)
- `models.providers` (custom providers written into `models.json`)
Suggested stacks:
- Assistant-first: Opus 4.5 primary → Sonnet 4.5 fallback.
- Agentic coding: Opus 4.5 primary → GPT-5.2-Codex for sub-agents → Sonnet 4.5 fallback.
Model refs are normalized to lowercase. Provider aliases like `z.ai/*` normalize
to `zai/*`.
## Model discussions (community notes)
## CLI commands
Anecdotal notes from the Discord thread on January 4-5, 2026. Treat as “reported by users,” not a benchmark.
```bash
clawdbot models list
clawdbot models status
clawdbot models set <provider/model>
clawdbot models set-image <provider/model>
**Reported working well**
- [Claude Opus 4.5](https://www.anthropic.com/claude/opus): best overall quality in Clawdbot, especially for “assistant” work. Tradeoff is cost and hitting usage limits quickly.
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): common fallback when Opus caps out. Similar behavior with fewer limit headaches.
- [Gemini 3 Pro](https://deepmind.google/en/models/gemini/pro/): some users felt it maps well to Clawdbot's structure. Vibe was “fits the framework” more than “best at everything.”
- [GLM](https://www.zhipuai.cn/en/): used successfully as a worker model under orchestration. Seen as strong for delegated/secondary tasks, not the primary brain.
- [MiniMax M2.1](https://platform.minimax.io/docs/guides/models-intro): “good enough” for grunt work or a cheap fallback. Community nickname was “Temu-Sonnet,” i.e. usable but not Sonnet-level polish.
clawdbot models aliases list
clawdbot models aliases add <alias> <provider/model>
clawdbot models aliases remove <alias>
**Mixed / unclear**
- [Antigravity](https://blog.google/technology/ai/google-ai-updates-november-2025/) (Claude Opus access): some reported extra Opus quota. Pricing/limits were unclear, so the value is hard to predict.
clawdbot models fallbacks list
clawdbot models fallbacks add <provider/model>
clawdbot models fallbacks remove <provider/model>
clawdbot models fallbacks clear
**Reported weak in Clawdbot**
- [GPT-5.2-Codex](https://developers.openai.com/codex/models) inside Clawdbot: reported as rough for conversation/assistant tasks when embedded. Same notes said Codex felt stronger via the [Codex CLI](https://developers.openai.com/codex/cli) than embedded use.
- [Grok](https://docs.x.ai/docs/models/grok-4): people tried it and then abandoned it. No strong upside showed up in the notes.
clawdbot models image-fallbacks list
clawdbot models image-fallbacks add <provider/model>
clawdbot models image-fallbacks remove <provider/model>
clawdbot models image-fallbacks clear
```
**Theme**
- Token burn feels higher than expected in long sessions; people suspect context buildup + tool outputs. Pruning/compaction helps. Check session logs before blaming providers. See [/concepts/session](/concepts/session) and [/concepts/model-failover](/concepts/model-failover).
`clawdbot models` (no subcommand) is a shortcut for `models status`.
Want a tailored stack? Share whether you're using Clawdbot or Clawdis and your main workload (agentic coding vs “assistant” work), and we can suggest a primary + fallback set based on these reports.
### `models list`
## Models CLI
Shows configured models by default. Useful flags:
See [/cli](/cli) for the full command tree and CLI flags.
- `--all`: full catalog
- `--local`: local providers only
- `--provider <name>`: filter by provider
- `--plain`: one model per line
- `--json`: machine-readable output
### CLI output (list + status)
### `models status`
`clawdbot models list` (default) prints a table with these columns:
- `Model`: `provider/model` key (truncated in TTY).
- `Input`: `text` or `text+image`.
- `Ctx`: context window in K tokens (from the model registry).
- `Local`: `yes/no` when the provider base URL is local.
- `Auth`: `yes/no` when the provider has usable auth.
- `Tags`: origin + role hints.
Shows the resolved primary model, fallbacks, image model, and an auth overview
of configured providers. `--plain` prints only the resolved primary model.
Common tags:
- `default`: resolved default model.
- `fallback#N`: `agent.model.fallbacks` order.
- `image`: `agent.imageModel.primary`.
- `img-fallback#N`: `agent.imageModel.fallbacks` order.
- `configured`: present in `agent.models`.
- `alias:<name>`: alias from `agent.models.*.alias`.
- `missing`: referenced in config but not found in the registry.
## Scanning (OpenRouter free models)
Output formats:
- `--plain`: prints only `provider/model` keys (one per line).
- `--json`: `{ count, models: [{ key, name, input, contextWindow, local, available, tags, missing }] }`.
`clawdbot models scan` inspects OpenRouter's **free model catalog** and can
optionally probe models for tool and image support.
`clawdbot models status` prints the resolved defaults, fallbacks, image model, aliases,
and an **Auth overview** section showing which providers have profiles/env/models.json keys.
`--plain` prints the resolved default model only; `--json` returns a structured object for tooling.
Key flags:
## Config changes
- `--no-probe`: skip live probes (metadata only)
- `--min-params <b>`: minimum parameter size (billions)
- `--max-age-days <days>`: skip older models
- `--provider <name>`: provider prefix filter
- `--max-candidates <n>`: fallback list size
- `--set-default`: set `agent.model.primary` to the first selection
- `--set-image`: set `agent.imageModel.primary` to the first image selection
- `agent.models` (configured model catalog + aliases).
- `agent.models.*.params` (provider-specific API params passed through to requests).
- `agent.model.primary` + `agent.model.fallbacks`.
- `agent.imageModel.primary` + `agent.imageModel.fallbacks` (optional).
- `auth.profiles` + `auth.order` for per-provider auth failover.
Probing requires an OpenRouter API key (from auth profiles or
`OPENROUTER_API_KEY`). Without a key, use `--no-probe` to list candidates only.
## Scan behavior (models scan)
Scan results are ranked by:
1) Image support
2) Tool latency
3) Context size
4) Parameter count
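The ranking order above can be sketched as a comparator. This is an illustration under assumptions, not the scan implementation: the `ScanResult` shape and field names are hypothetical, lower tool latency and larger context are preferred as stated in the earlier version of this page, and preferring a larger parameter count is an assumption.

```typescript
// Hypothetical per-model probe result; field names are illustrative only.
type ScanResult = {
  id: string;
  imageOk: boolean;
  toolLatencyMs: number;
  contextWindow: number;
  paramsB: number; // parameter count in billions
};

// Rank by image support, then lower tool latency, then larger context,
// then larger parameter count (the direction of the last key is an assumption).
function rankScanResults(results: ScanResult[]): ScanResult[] {
  return [...results].sort(
    (a, b) =>
      Number(b.imageOk) - Number(a.imageOk) ||
      a.toolLatencyMs - b.toolLatencyMs ||
      b.contextWindow - a.contextWindow ||
      b.paramsB - a.paramsB,
  );
}
```

Each key only breaks ties left by the previous one, so an image-capable model always ranks above one without image support regardless of latency.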
<<<<<<< HEAD
Input
- OpenRouter `/models` list (filter `:free`)
- Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/environment))
- Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates`
- Probe controls: `--timeout`, `--concurrency`
Probes (direct pi-ai complete)
- Tool-call probe (required):
- Provide a dummy tool, verify tool call emitted.
- Image probe (preferred):
- Prompt includes 1x1 PNG; success if no "unsupported image" error.
When run in a TTY, you can select fallbacks interactively. In non-interactive
mode, pass `--yes` to accept defaults.
Scoring/selection
- Prefer models passing tool + image for text/tool fallbacks.
- Prefer image-only models for image tool fallback (even if tool probe fails).
- Rank by: image ok, then lower tool latency, then larger context, then params.
## Models registry (`models.json`)
Interactive selection (TTY)
- Multiselect list with per-model stats:
- model id, tool ok, image ok, median latency, context, inferred params.
- Pre-select top N (default 6).
- Non-TTY: auto-select; require `--yes`/`--no-input` to apply.
Output
- Writes `agent.model.fallbacks` ordered.
- Writes `agent.imageModel.fallbacks` ordered (image-capable models).
- Ensures `agent.models` entries exist for selected models.
- Optional `--set-default` to set `agent.model.primary`.
- Optional `--set-image` to set `agent.imageModel.primary`.
## Runtime fallback
- On model failure: try `agent.model.fallbacks` in order.
- Per-provider auth failover uses `auth.order` (or stored profile order) **before**
moving to the next model.
- Image routing uses `agent.imageModel` **only when configured** and the primary
model lacks image input.
- Persist last successful provider/model to session entry; auth profile success is global.
- See [`docs/model-failover.md`](/concepts/model-failover) for auth profile rotation, cooldowns, and timeout handling.
## Tests
- Unit: scan selection ordering + probe classification.
- CLI: list/aliases/fallbacks add/remove + scan writes config.
- Status: shows last used model + fallbacks.
## Docs
- Update [`docs/configuration.md`](/gateway/configuration) with `agent.models` + `agent.model` + `agent.imageModel`.
- Keep this doc current when CLI surface or scan logic changes.
- Note provider aliases like `z.ai/*` -> `zai/*` when relevant.
- Provider ids in model refs are normalized to lowercase.
Custom providers in `models.providers` are written into `models.json` under the
agent directory (default `~/.clawdbot/agents/<agentId>/models.json`). This file
is merged by default unless `models.mode` is set to `replace`.
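As a rough illustration of the selection order and normalization described under "How model selection works", the sketch below builds the ordered candidate list from an `agent.model` value. The function names are hypothetical, the alias table contains only the `z.ai` to `zai` mapping named in this document, and provider auth failover is deliberately left out.

```typescript
// `agent.model` may be a plain ref or an object with primary/fallbacks.
type ModelConfig = string | { primary?: string; fallbacks?: string[] };

// Hypothetical alias table; this document only names `z.ai/*` -> `zai/*`.
const PROVIDER_ALIASES = new Map<string, string>([["z.ai", "zai"]]);

// Lowercase the ref and map known provider aliases.
function normalizeModelRef(ref: string): string {
  const lower = ref.trim().toLowerCase();
  const slash = lower.indexOf("/");
  if (slash < 0) return lower;
  const provider = lower.slice(0, slash);
  const model = lower.slice(slash + 1);
  return `${PROVIDER_ALIASES.get(provider) ?? provider}/${model}`;
}

// Primary first, then fallbacks in configured order.
function modelCandidates(model: ModelConfig): string[] {
  const primary = typeof model === "string" ? model : model.primary;
  const fallbacks = typeof model === "string" ? [] : model.fallbacks ?? [];
  return [primary, ...fallbacks]
    .filter((ref): ref is string => typeof ref === "string" && ref.length > 0)
    .map(normalizeModelRef);
}
```

A runtime would try each candidate in turn and move to the next one only after auth failover inside the current provider has been exhausted.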


@@ -97,7 +97,7 @@ At runtime:
- if `expires` is in the future → use the stored access token
- if expired → refresh (under a file lock) and overwrite the stored credentials
See implementation: `src/agents/auth-profiles.ts`.
The refresh flow is automatic; you generally don't need to manage tokens manually.
## Multiple accounts (profiles) + routing


@@ -7,127 +7,93 @@ read_when:
---
# Presence
Clawdbot “presence” is a lightweight, best-effort view of:
- The **Gateway** itself (one per host), and
- The **clients connected to the Gateway** (mac app, WebChat, CLI, etc.).
Clawdbot “presence” is a lightweight, best-effort view of:
- the **Gateway** itself, and
- **clients connected to the Gateway** (mac app, WebChat, CLI, etc.)
Presence is used primarily to render the mac app's **Instances** tab and to provide quick operator visibility.
Presence is used primarily to render the macOS app's **Instances** tab and to
provide quick operator visibility.
## The data model
## Presence fields (what shows up)
Presence entries are structured objects with (some) fields:
- `instanceId` (optional but strongly recommended): stable client identity used for dedupe
- `host`: a human-readable name (often the machine name)
- `ip`: best-effort IP address (may be missing or stale)
Presence entries are structured objects with fields like:
- `instanceId` (optional but strongly recommended): stable client identity
- `host`: human-friendly host name
- `ip`: best-effort IP address
- `version`: client version string
- `deviceFamily` (optional): hardware family like `iPad`, `iPhone`, `Mac`
- `modelIdentifier` (optional): hardware model identifier like `iPad16,6` or `Mac16,6`
- `mode`: e.g. `gateway`, `app`, `webchat`, `cli`
- `lastInputSeconds` (optional): “seconds since last user input” for that client machine
- `reason`: a short marker like `self`, `connect`, `node-connected`, `node-disconnected`, `periodic`, `instances-refresh`
- `text`: legacy/debug summary string (kept for backwards compatibility and UI display)
- `deviceFamily` / `modelIdentifier`: hardware hints
- `mode`: `gateway`, `app`, `webchat`, `cli`, `node`, ...
- `lastInputSeconds`: “seconds since last user input” (if known)
- `reason`: `self`, `connect`, `node-connected`, `periodic`, ...
- `ts`: last update timestamp (ms since epoch)
## Producers (where presence comes from)
Presence entries are produced by multiple sources and then **merged**.
Presence entries are produced by multiple sources and **merged**.
### 1) Gateway self entry
The Gateway seeds a “self” entry at startup so UIs always show at least the current gateway host.
The Gateway always seeds a “self” entry at startup so UIs show the gateway host
even before any clients connect.
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`initSelfPresence()`).
### 2) WebSocket connect
### 2) WebSocket connect (connection-derived presence)
Every WS client begins with a `connect` request. On successful handshake the
Gateway upserts a presence entry for that connection.
Every WS client must begin with a `connect` request. On successful handshake, the Gateway upserts a presence entry for that connection.
#### Why one-off CLI commands don't show up
This is meant to answer: “Which clients are currently connected?”
The CLI often connects for short, one-off commands. To avoid spamming the
Instances list, `client.mode === "cli"` is **not** turned into a presence entry.
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (connect handling uses `connect.params.client.instanceId` when provided; otherwise falls back to `connId`).
### 3) `system-event` beacons
#### Why one-off CLI commands do not show up
Clients can send richer periodic beacons via the `system-event` method. The mac
app uses this to report host name, IP, and `lastInputSeconds`.
The CLI connects to the Gateway to execute one-off commands (health/status/send/agent/etc.). These are not “nodes” and would spam the Instances list, so the Gateway does not create presence entries for clients with `client.mode === "cli"`.
### 3) `system-event` beacons (client-reported presence)
Clients can publish richer periodic beacons via the `system-event` method. The mac app uses this to report:
- a human-friendly host name
- its best-known IP address
- `lastInputSeconds`
Implementation:
- Gateway: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) handles method `system-event` by calling `updateSystemPresence(...)`.
- mac app beaconing: [`apps/macos/Sources/Clawdbot/PresenceReporter.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/PresenceReporter.swift).
### 4) Node bridge beacons (gateway-owned presence)
### 4) Node bridge beacons
When a node bridge connection authenticates, the Gateway emits a presence entry
for that node and starts periodic refresh beacons so it does not expire.
- Connect/disconnect markers: `node-connected`, `node-disconnected`
- Periodic heartbeat: every 3 minutes (`reason: periodic`)
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (node bridge handlers + timer beacons).
for that node and refreshes it periodically so it doesn't expire.
## Merge + dedupe rules (why `instanceId` matters)
All producers write into a single in-memory presence map.
Presence entries are stored in a single in-memory map:
Key points:
- Entries are **keyed** by a “presence key”. If two producers use the same key, they update the same entry.
- The best key is a stable, opaque `instanceId` that does not change across restarts.
- Keys are treated case-insensitively.
- Entries are keyed by a **presence key**.
- The best key is a stable `instanceId` that survives restarts.
- Keys are case-insensitive.
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`normalizePresenceKey()`).
If a client reconnects without a stable `instanceId`, it may show up as a
**duplicate** row.
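
The keying rules above can be sketched as follows. This is a minimal illustration, not the Gateway's code: the `PresenceRow` shape, the `upsertPresence` helper, and the trim-and-lowercase normalization are assumptions made for the example; only the fallback from `instanceId` to `connId` and the case-insensitive keys come from this document.

```typescript
// Hypothetical entry shape for illustration only.
type PresenceRow = {
  instanceId?: string;
  host?: string;
  reason: string;
  ts: number; // ms since epoch
};

const presence = new Map<string, PresenceRow>();

// Assumption: case-insensitivity is implemented by trimming and lowercasing.
function presenceKey(instanceId: string | undefined, connId: string): string {
  return (instanceId ?? connId).trim().toLowerCase();
}

// Producers that resolve to the same key update the same row.
function upsertPresence(connId: string, update: PresenceRow): PresenceRow {
  const key = presenceKey(update.instanceId, connId);
  const merged: PresenceRow = { ...presence.get(key), ...update };
  presence.set(key, merged);
  return merged;
}
```

In this sketch, two connections that report the same `instanceId` (in any letter case) share one row, while a connection without an `instanceId` is keyed by its `connId` and therefore produces a new row on every reconnect.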
### mac app identity (stable UUID)
## TTL and bounded size
The mac app uses a persisted UUID as `instanceId` so:
- restarts/reconnects do not create duplicates
- renaming the Mac does not create a new “instance”
- debug/release builds can share the same identity
Presence is intentionally ephemeral:
Implementation: [`apps/macos/Sources/Clawdbot/InstanceIdentity.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstanceIdentity.swift).
- **TTL:** entries older than 5 minutes are pruned
- **Max entries:** 200 (oldest dropped first)
`displayName` (machine name) is used for UI, while `instanceId` is used for dedupe.
## TTL and bounded size (why stale rows disappear)
Presence entries are not permanent:
- TTL: entries older than 5 minutes are pruned
- Max: map is capped at 200 entries (LRU by `ts`)
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`TTL_MS`, `MAX_ENTRIES`, pruning in `listSystemPresence()`).
This keeps the list fresh and avoids unbounded memory growth.
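
A pruning pass with these two limits might look like the sketch below. The 5 minute TTL and the cap of 200 come from this document; the `prunePresence` function, its signature, and the choice to sort by `ts` when over the cap are assumptions for illustration.

```typescript
const TTL_MS = 5 * 60 * 1000; // entries older than 5 minutes are pruned
const MAX_ENTRIES = 200;      // oldest entries are dropped first

// Hypothetical helper: removes expired entries, then enforces the size cap.
function prunePresence<T extends { ts: number }>(
  entries: Map<string, T>,
  now: number,
): void {
  // 1) TTL: drop anything not updated within the window.
  Array.from(entries.entries()).forEach(([key, entry]) => {
    if (now - entry.ts > TTL_MS) entries.delete(key);
  });

  // 2) Cap: if still too large, drop the entries with the smallest `ts`.
  const overflow = entries.size - MAX_ENTRIES;
  if (overflow > 0) {
    Array.from(entries.entries())
      .sort((a, b) => a[1].ts - b[1].ts)
      .slice(0, overflow)
      .forEach(([key]) => entries.delete(key));
  }
}
```

Running the pass whenever the list is read keeps the cost proportional to the (bounded) map size and needs no background timer.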
## Remote/tunnel caveat (loopback IPs)
When a client connects over an SSH tunnel / local port forward, the Gateway may see the remote address as loopback (`127.0.0.1`).
When a client connects over an SSH tunnel / local port forward, the Gateway may
see the remote address as `127.0.0.1`. To avoid overwriting a good client-reported
IP, loopback remote addresses are ignored.
To avoid degrading an otherwise-correct client beacon IP, the Gateway avoids writing loopback remote addresses into presence entries.
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (`isLoopbackAddress()`).
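
As a rough sketch of the idea, assuming the socket's remote address would otherwise replace the stored IP: the `isLoopback` and `mergeIp` helpers below are hypothetical, and the set of prefixes treated as loopback is an assumption rather than the Gateway's actual check.

```typescript
// Hypothetical loopback test covering IPv4, IPv6, and IPv4-mapped IPv6 forms.
function isLoopback(addr: string | undefined): boolean {
  if (!addr) return false;
  const a = addr.trim().toLowerCase();
  return a === "::1" || a.startsWith("127.") || a.startsWith("::ffff:127.");
}

// Keep the existing (client-reported) IP when the socket address is loopback.
function mergeIp(
  existingIp: string | undefined,
  remoteAddress: string | undefined,
): string | undefined {
  if (remoteAddress && !isLoopback(remoteAddress)) return remoteAddress;
  return existingIp;
}
```

With this shape, a tunneled client that beacons `192.168.1.20` keeps that address even though the Gateway sees the connection arrive from `127.0.0.1`.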
## Consumers (who reads presence)
## Consumers
### macOS Instances tab
The mac app's Instances tab renders the result of `system-presence`.
Implementation:
- View: [`apps/macos/Sources/Clawdbot/InstancesSettings.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesSettings.swift)
- Store: [`apps/macos/Sources/Clawdbot/InstancesStore.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesStore.swift)
The Instances rows show a small presence indicator (Active/Idle/Stale) based on
the last beacon age. The label is derived from the entry timestamp (`ts`).
The store refreshes periodically and also applies `presence` WS events.
The macOS app renders the output of `system-presence` and applies a small status
indicator (Active/Idle/Stale) based on the age of the last update.
## Debugging tips
- To see the raw list, call `system-presence` against the gateway.
- To see the raw list, call `system-presence` against the Gateway.
- If you see duplicates:
- confirm clients send a stable `instanceId` in the handshake (`connect.params.client.instanceId`)
- confirm beaconing uses the same `instanceId`
- check whether the connection-derived entry is missing `instanceId` (then it will be keyed by `connId` and duplicates are expected on reconnect)
- confirm clients send a stable `instanceId` in the handshake
- confirm periodic beacons use the same `instanceId`
  - check whether the connection-derived entry is missing `instanceId` (duplicates are expected)


@@ -3,24 +3,97 @@ summary: "Routing rules per provider (WhatsApp, Telegram, Discord, web) and shar
read_when:
- Changing provider routing or inbox behavior
---
# Providers & Routing
# Providers & routing
Updated: 2026-01-06
Goal: deterministic replies per provider, while supporting multi-agent + multi-account routing.
Clawdbot routes replies **back to the provider where a message came from**. The
model does not choose a provider; routing is deterministic and controlled by the
host configuration.
- **Provider**: provider label (`whatsapp`, `webchat`, `telegram`, `discord`, `signal`, `imessage`, …). Routing is fixed: replies go back to the origin provider; the model doesn't choose.
- **AccountId**: provider account instance (e.g. WhatsApp account `"default"` vs `"work"`). Not every provider supports multi-account yet.
- **AgentId**: one isolated “brain” (workspace + per-agent agentDir + per-agent session store).
- **Reply context:** inbound replies include `ReplyToId`, `ReplyToBody`, and `ReplyToSender`, and the quoted context is appended to `Body` as a `[Replying to ...]` block.
- **Canonical direct session (per agent):** direct chats collapse to `agent:<agentId>:<mainKey>` (default `main`). Groups/channels stay isolated per agent:
- group: `agent:<agentId>:<provider>:group:<id>`
- channel/room: `agent:<agentId>:<provider>:channel:<id>`
- Telegram forum topics: `agent:<agentId>:telegram:group:<chatId>:topic:<threadId>`
- **Session store:** per-agent store lives under `~/.clawdbot/agents/<agentId>/sessions/sessions.json` (override via `session.store` with `{agentId}` templating). JSONL transcripts live next to it.
- **WebChat:** attaches to the selected agent's main session (so desktop reflects cross-provider history for that agent).
- **Implementation hints:**
- Set `Provider` + `AccountId` in each ingress.
- Route inbound to an agent via `routing.bindings` (match on `provider`, `accountId`, plus optional peer/guild/team).
- Keep routing deterministic: originate → same provider. Use the gateway WebSocket for sends; avoid side channels.
- Do not let the agent emit “send to X” decisions; keep that policy in the host code.
## Key terms
- **Provider**: `whatsapp`, `telegram`, `discord`, `slack`, `signal`, `imessage`, `webchat`.
- **AccountId**: per-provider account instance (when supported).
- **AgentId**: an isolated workspace + session store (“brain”).
- **SessionKey**: the internal bucket key used to store context and control concurrency.
## Session key shapes (examples)
Direct messages collapse to the agent's **main** session:
- `agent:<agentId>:<mainKey>` (default: `agent:main:main`)
Groups and channels remain isolated per provider:
- Groups: `agent:<agentId>:<provider>:group:<id>`
- Channels/rooms: `agent:<agentId>:<provider>:channel:<id>`
Threads:
- Slack/Discord threads append `:thread:<threadId>` to the base key.
- Telegram forum topics embed `:topic:<topicId>` in the group key.
Examples:
- `agent:main:telegram:group:-1001234567890:topic:42`
- `agent:main:discord:channel:123456:thread:987654`
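
The key shapes above can be reproduced with a small builder. This is a sketch only: `SessionTarget` and `buildSessionKey` are hypothetical names, and the real code may assemble keys differently; the output format is taken from the shapes and examples listed in this section.

```typescript
// Hypothetical description of where a message came from.
type SessionTarget =
  | { kind: "direct" }
  | {
      kind: "group" | "channel";
      provider: string;
      id: string;
      topicId?: string;  // Telegram forum topic
      threadId?: string; // Slack/Discord thread
    };

function buildSessionKey(
  agentId: string,
  target: SessionTarget,
  mainKey = "main",
): string {
  // Direct chats collapse to the agent's main session.
  if (target.kind === "direct") return `agent:${agentId}:${mainKey}`;

  // Groups and channels stay isolated per provider.
  let key = `agent:${agentId}:${target.provider}:${target.kind}:${target.id}`;
  if (target.topicId !== undefined) key += `:topic:${target.topicId}`;
  if (target.threadId !== undefined) key += `:thread:${target.threadId}`;
  return key;
}
```

For instance, `buildSessionKey("main", { kind: "direct" })` yields the default `agent:main:main`, and the Telegram and Discord examples above follow from passing `topicId: "42"` or `threadId: "987654"` respectively.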
## Routing rules (how an agent is chosen)
Routing picks **one agent** for each inbound message:
1. **Exact peer match** (`routing.bindings` with `peer.kind` + `peer.id`).
2. **Guild match** (Discord) via `guildId`.
3. **Team match** (Slack) via `teamId`.
4. **Account match** (`accountId` on the provider).
5. **Provider match** (any account on that provider).
6. **Default agent** (`routing.defaultAgentId`, fallback to `main`).
The matched agent determines which workspace and session store are used.
## Config overview
- `routing.defaultAgentId`: default agent when no binding matches.
- `routing.agents`: named agent definitions (workspace, model, etc.).
- `routing.bindings`: map inbound providers/accounts/peers to agents.
Example:
```json5
{
routing: {
defaultAgentId: "main",
agents: {
support: { name: "Support", workspace: "~/clawd-support" }
},
bindings: [
{ match: { provider: "slack", teamId: "T123" }, agentId: "support" },
{ match: { provider: "telegram", peer: { kind: "group", id: "-100123" } }, agentId: "support" }
]
}
}
```
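
To make the precedence concrete, here is a hedged sketch of the routing order (peer, guild, team, account, provider, default) applied to bindings shaped like the example above. The `Inbound` type, the `pickAgent` function, and the rule that a binding only counts for the most specific field it sets are assumptions for illustration, not the actual resolver.

```typescript
type Peer = { kind: string; id: string };
type Match = {
  provider: string;
  accountId?: string;
  guildId?: string;
  teamId?: string;
  peer?: Peer;
};
type Binding = { match: Match; agentId: string };
type Inbound = Match; // hypothetical: an inbound message carries the same fields

function pickAgent(
  msg: Inbound,
  bindings: Binding[],
  defaultAgentId = "main",
): string {
  const candidates = bindings.filter((b) => b.match.provider === msg.provider);

  // Tiers are checked in order; the first tier with a hit wins.
  const tiers: Array<(m: Match) => boolean> = [
    (m) => !!m.peer && !!msg.peer && m.peer.kind === msg.peer.kind && m.peer.id === msg.peer.id,
    (m) => !m.peer && !!m.guildId && m.guildId === msg.guildId,
    (m) => !m.peer && !!m.teamId && m.teamId === msg.teamId,
    (m) => !m.peer && !m.guildId && !m.teamId && !!m.accountId && m.accountId === msg.accountId,
    (m) => !m.peer && !m.guildId && !m.teamId && !m.accountId, // provider-wide binding
  ];

  for (const tier of tiers) {
    const hit = candidates.find((b) => tier(b.match));
    if (hit) return hit.agentId;
  }
  return defaultAgentId;
}
```

Given the two bindings in the example config, a Slack message from team `T123` and a Telegram message from group `-100123` would both resolve to `support`, and anything else would fall through to the default agent.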
## Session storage
Session stores live under the state directory (default `~/.clawdbot`):
- `~/.clawdbot/agents/<agentId>/sessions/sessions.json`
- JSONL transcripts live alongside the store
You can override the store path via `session.store` and `{agentId}` templating.
## WebChat behavior
WebChat attaches to the **selected agent** and defaults to the agent's main
session. Because of this, WebChat lets you see cross-provider context for that
agent in one place.
## Reply context
Inbound replies include:
- `ReplyToId`, `ReplyToBody`, and `ReplyToSender` when available.
- Quoted context is appended to `Body` as a `[Replying to ...]` block.
This is consistent across providers.


@@ -12,7 +12,7 @@ We now serialize command-based auto-replies (WhatsApp Web listener) through a ti
- Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools.
## How it works
- [`src/process/command-queue.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/process/command-queue.ts) holds a lane-aware FIFO queue and drains each lane synchronously.
- A lane-aware FIFO queue drains each lane synchronously.
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agent.maxConcurrent`.
- When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting.
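
The two-level queueing described above (one run per session, then a capped global lane) could be sketched like this. It is an illustration under assumptions: `enqueueInLane` and `runForSession` are hypothetical names, the lane bookkeeping is simplified, and the real queue's API may differ.

```typescript
type Task<T> = () => Promise<T>;
type Lane = { active: number; max: number; waiting: Array<() => void> };

const lanes = new Map<string, Lane>();

// Runs `task` in the named lane, starting at most `max` tasks at once (FIFO).
function enqueueInLane<T>(laneName: string, task: Task<T>, max = 1): Promise<T> {
  let lane = lanes.get(laneName);
  if (!lane) {
    lane = { active: 0, max, waiting: [] };
    lanes.set(laneName, lane);
  }
  const l = lane;
  return new Promise<T>((resolve, reject) => {
    const done = () => {
      l.active--;
      const next = l.waiting.shift();
      if (next) next(); // hand the slot to the oldest waiter
    };
    const start = () => {
      l.active++;
      Promise.resolve()
        .then(task)
        .then(
          (value) => { done(); resolve(value); },
          (error) => { done(); reject(error); },
        );
    };
    if (l.active < l.max) start();
    else l.waiting.push(start);
  });
}

// Session lane guarantees one active run per session; the global lane caps parallelism.
function runForSession<T>(sessionKey: string, task: Task<T>, maxConcurrent: number): Promise<T> {
  return enqueueInLane(
    `session:${sessionKey}`,
    () => enqueueInLane("main", task, maxConcurrent),
    1,
  );
}
```

Nesting the lanes this way means a second message for the same session waits behind the first, while messages for different sessions only compete for the global `maxConcurrent` slots.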
@@ -74,4 +74,4 @@ Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- `enqueueCommand` exposes a lightweight `getQueueSize()` helper if you need to surface queue depth in future diagnostics.
- If you need queue depth, enable verbose logs and watch for queue timing lines.


@@ -21,7 +21,7 @@ Goal: small, hard-to-misuse tool set so agents can list sessions, fetch history,
- Hooks use `hook:<uuid>` unless explicitly set.
- Node bridge uses `node-<nodeId>` unless explicitly set.
`global` and `unknown` are internal-only and never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`.
`global` and `unknown` are reserved values and are never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`.
## sessions_list
List sessions as an array of rows.


@@ -8,7 +8,7 @@ read_when:
Clawdbot builds a custom system prompt for every agent run. The prompt is **Clawdbot-owned** and does not use the pi-coding-agent default prompt.
The prompt is assembled in `src/agents/system-prompt.ts` and injected by `src/agents/pi-embedded-runner.ts`.
The prompt is assembled by Clawdbot and injected into each agent run.
## Structure
@@ -56,9 +56,3 @@ Skills are **not** auto-injected. Instead, the prompt instructs the model to use
```
This keeps the base prompt small while still enabling targeted skill usage.
## Code references
- Prompt text: `src/agents/system-prompt.ts`
- Prompt assembly + injection: `src/agents/pi-embedded-runner.ts`
- Bootstrap trimming: `src/agents/pi-embedded-helpers.ts`


@@ -3,40 +3,34 @@ summary: "TypeBox schemas as the single source of truth for the gateway protocol
read_when:
- Updating protocol schemas or codegen
---
# TypeBox as Protocol Source of Truth
# TypeBox as protocol source of truth
Last updated: 2025-12-09
Last updated: 2026-01-08
We use TypeBox schemas in [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) as the single source of truth for the Gateway control plane (connect/req/res/event frames and payloads). All derived artifacts should be generated from these schemas, not edited by hand.
TypeBox schemas define the Gateway control plane (connect/req/res/event frames and
payloads). All generated artifacts must come from these schemas.
## Current pipeline
- **TypeBox → JSON Schema**: `pnpm protocol:gen` writes [`dist/protocol.schema.json`](https://github.com/clawdbot/clawdbot/blob/main/dist/protocol.schema.json) (draft-07) and runs AJV in the server tests.
- **TypeBox → Swift**: `pnpm protocol:gen:swift` generates [`apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift).
- `pnpm protocol:gen`
- writes the JSON Schema output (draft-07)
- `pnpm protocol:gen:swift`
- generates Swift gateway models
- `pnpm protocol:check`
- runs both generators and verifies the output is committed
## Problem
## Swift codegen behavior
- We want strong typing in Swift, including a sealed `GatewayFrame` enum with a discriminator and a forward-compatible `unknown` case.
The Swift generator emits:
## Preferred plan (next step)
- `GatewayFrame` enum with `req`, `res`, `event`, and `unknown` cases
- Strongly typed payload structs/enums
- `ErrorCode` values and `GATEWAY_PROTOCOL_VERSION`
- Add a small, custom Swift generator driven directly by the TypeBox schemas:
- Emit a sealed `enum GatewayFrame: Codable { case req(RequestFrame), res(ResponseFrame), event(EventFrame) }`.
- Emit strongly typed payload structs/enums (`ConnectParams`, `HelloOk`, `RequestFrame`, `ResponseFrame`, `EventFrame`, `PresenceEntry`, `Snapshot`, `StateVersion`, `ErrorShape`, `AgentEvent`, `TickEvent`, `ShutdownEvent`, `SendParams`, `AgentParams`, `ErrorCode`, `PROTOCOL_VERSION`).
- Custom `init(from:)` / `encode(to:)` enforces the `type` discriminator and can include an `unknown` case for forward compatibility.
- Wire a new script (e.g., `pnpm protocol:gen:swift`) into `protocol:check` so CI fails if the generated Swift is stale.
Unknown frame types are preserved as raw payloads for forward compatibility.
Why this path:
- Single source of truth stays TypeBox; no new IDL to maintain.
- Predictable, strongly typed Swift (no optional soup).
- Small deterministic codegen (~150-200 LOC script) we control.
## When you change schemas
## Alternative (if we want off-the-shelf codegen)
- Wrap the existing JSON Schema into an OpenAPI 3.1 doc (auto-generated) and use **swift-openapi-generator** or **openapi-generator swift5**. More moving parts, but also yields enums with discriminator support. Keep this as a fallback if we don't want a custom emitter.
## Action items
- Implement `protocol:gen:swift` that reads the TypeBox schemas and emits the sealed Swift enum + payload structs.
- Update `protocol:check` to include the Swift generator output in the diff check.
- Remove quicktype output once the custom generator is in place (or keep it for docs only).
1) Update the TypeBox schemas.
2) Run `pnpm protocol:check`.
3) Commit the regenerated schema + Swift models.