docs: refresh and simplify docs

Peter Steinberger
2026-01-08 23:06:56 +01:00
parent 88dca1afdf
commit a6c309824e
46 changed files with 1117 additions and 2155 deletions


@@ -171,9 +171,9 @@ Notes:
Recommended: `clawdbot hooks gmail run` wraps the same flow and auto-renews the watch. Recommended: `clawdbot hooks gmail run` wraps the same flow and auto-renews the watch.
## Expose the handler (dev, unsupported hack) ## Expose the handler (advanced, unsupported)
If you insist on a non-Tailscale tunnel, wire it manually and use the public URL in the push If you need a non-Tailscale tunnel, wire it manually and use the public URL in the push
subscription (unsupported, no guardrails): subscription (unsupported, no guardrails):
```bash ```bash


@@ -7,8 +7,7 @@ read_when:
# CLI reference # CLI reference
This page mirrors `src/cli/*` and is the source of truth for CLI behavior. This page describes the current CLI behavior. If commands change, update this doc.
If you change the CLI code, update this doc.
## Global flags ## Global flags
@@ -25,7 +24,7 @@ If you change the CLI code, update this doc.
## Color palette ## Color palette
Clawdbot uses a lobster palette for CLI output. Source of truth: `src/terminal/theme.ts`. Clawdbot uses a lobster palette for CLI output.
- `accent` (#FF5A2D): headings, provider labels, primary highlights. - `accent` (#FF5A2D): headings, provider labels, primary highlights.
- `accentBright` (#FF7A3D): command names, emphasis. - `accentBright` (#FF7A3D): command names, emphasis.


@@ -5,11 +5,11 @@ read_when:
--- ---
# Agent Loop (Clawdis) # Agent Loop (Clawdis)
Short, exact flow of one agent run. Source of truth: current code in `src/`. Short, exact flow of one agent run.
## Entry points ## Entry points
- Gateway RPC: `agent` and `agent.wait` in [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts). - Gateway RPC: `agent` and `agent.wait`.
- CLI: `agentCommand` in [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts). - CLI: `agent` command.
## High-level flow ## High-level flow
1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately. 1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately.
@@ -37,10 +37,8 @@ Short, exact flow of one agent run. Source of truth: current code in `src/`.
- `tool`: streamed tool events from pi-agent-core - `tool`: streamed tool events from pi-agent-core
## Chat provider handling ## Chat provider handling
- `createAgentEventHandler` in [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts): - Assistant deltas are buffered into chat `delta` messages.
- buffers assistant deltas - A chat `final` is emitted on **lifecycle end/error**.
- emits chat `delta` messages
- emits chat `final` when **lifecycle end/error** arrives
## Timeouts ## Timeouts
- `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides. - `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides.
@@ -51,11 +49,3 @@ Short, exact flow of one agent run. Source of truth: current code in `src/`.
- AbortSignal (cancel) - AbortSignal (cancel)
- Gateway disconnect or RPC timeout - Gateway disconnect or RPC timeout
- `agent.wait` timeout (wait-only, does not stop agent) - `agent.wait` timeout (wait-only, does not stop agent)
## Files
- [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts)
- [`src/gateway/server-methods/agent-job.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent-job.ts)
- [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts)
- [`src/agents/pi-embedded-runner.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-runner.ts)
- [`src/agents/pi-embedded-subscribe.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-subscribe.ts)
- [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts)


@@ -5,7 +5,7 @@ read_when:
--- ---
# Agent Runtime 🤖 # Agent Runtime 🤖
CLAWDBOT runs a single embedded agent runtime derived from **p-mono** (internal name: **p**). CLAWDBOT runs a single embedded agent runtime derived from **p-mono**.
## Workspace (required) ## Workspace (required)
@@ -43,9 +43,9 @@ To disable bootstrap file creation entirely (for pre-seeded workspaces), set:
{ agent: { skipBootstrap: true } } { agent: { skipBootstrap: true } }
``` ```
## Built-in tools (internal) ## Built-in tools
p's embedded core tools (read/bash/edit/write and related internals) are defined in code and always available. `TOOLS.md` does **not** control which tools exist; it's guidance for how *you* want them used. Core tools (read/bash/edit/write and related system tools) are always available. `TOOLS.md` does **not** control which tools exist; it's guidance for how *you* want them used.
## Skills ## Skills
@@ -63,18 +63,6 @@ Clawdbot reuses pieces of the p-mono codebase (models/tools), but **session mana
- No p-coding agent runtime. - No p-coding agent runtime.
- No `~/.pi/agent` or `<workspace>/.pi` settings are consulted. - No `~/.pi/agent` or `<workspace>/.pi` settings are consulted.
## Peter @ steipete (only)
Apply these notes **only** when the user is Peter Steinberger at steipete.
- Gateway runs on the **Mac Studio in London**.
- Primary work computer: **MacBook Pro**.
- Peter travels between **Vienna** and **London**; there are two networks bridged via **Tailscale**.
- For debugging, connect to the Mac Studio (London) or MacBook Pro (primary).
- There is also an **M1 MacBook Pro** on the Vienna tailnet you can use to access the Vienna network.
- Nodes can be accessed via the `clawdbot` binary (`pnpm clawdbot` in `~/Projects/clawdbot`).
- See also `skills/clawdbot*` for node/browser/canvas/cron usage.
## Sessions ## Sessions
Session transcripts are stored as JSONL at: Session transcripts are stored as JSONL at:


@@ -3,58 +3,46 @@ summary: "WebSocket gateway architecture, components, and client flows"
read_when: read_when:
- Working on gateway protocol, clients, or transports - Working on gateway protocol, clients, or transports
--- ---
# Gateway Architecture # Gateway architecture
Last updated: 2026-01-05 Last updated: 2026-01-05
## Overview ## Overview
- A single long-lived **Gateway** process owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack via Bolt, Discord via discord.js, Signal via signal-cli, iMessage via imsg, WebChat) and the control/event plane.
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over one transport: **WebSocket on the configured bind host** (default `127.0.0.1:18789`; tunnel or VPN for remote).
- One Gateway per host; it is the only place that is allowed to open a WhatsApp session. All sends/agent runs go through it.
- By default: the Gateway exposes a Canvas host on `canvasHost.port` (default `18793`), serving `~/clawd/canvas` at `/__clawdbot__/canvas/` with live-reload; disable via `canvasHost.enabled=false` or `CLAWDBOT_SKIP_CANVAS_HOST=1`.
## Implementation snapshot (current code) - A single long-lived **Gateway** owns all messaging surfaces (WhatsApp via
Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
### TypeScript Gateway ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts)) - All clients (macOS app, CLI, web UI, automations) connect to the Gateway over
- Single HTTP + WebSocket server (default `18789`); bind policy `loopback|lan|tailnet|auto`. Refuses non-loopback binds without auth; Tailscale serve/funnel requires loopback. **one transport: WebSocket** on the configured bind host (default
- Handshake: first frame must be a `connect` request; AJV validates request + params against TypeBox schemas; protocol negotiated via `minProtocol`/`maxProtocol`. `127.0.0.1:18789`).
- `hello-ok` includes snapshot (presence/health/stateVersion/uptime/configPath/stateDir), features (methods/events), policy (max payload/buffer/tick), and `canvasHostUrl` when available. - One Gateway per host; it is the only place that opens a WhatsApp session.
- Events emitted: `agent`, `chat`, `presence`, `tick`, `health`, `heartbeat`, `cron`, `talk.mode`, `node.pair.requested`, `node.pair.resolved`, `voicewake.changed`, `shutdown`. - A **bridge** (default `18790`) is used for nodes (macOS/iOS/Android).
- Idempotency keys are required for `send`, `agent`, `chat.send`, and node invokes; the dedupe cache avoids double-sends on reconnects. Payload sizes are capped per connection. - A **canvas host** (default `18793`) serves agent-editable HTML and A2UI.
- Optional node bridge ([`src/infra/bridge/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/bridge/server.ts)): TCP JSONL frames (`hello`, `pair-request`, `req/res`, `event`, `invoke`, `ping`). Node connect/disconnect updates presence and flows into the session bus.
- Control UI + Canvas host: HTTP serves Control UI (base path configurable) and can host the A2UI canvas via [`src/canvas-host/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/canvas-host/server.ts) (live reload). Canvas host URL is advertised to nodes + clients.
### iOS node (`apps/ios`)
- Discovery + pairing: `BridgeDiscoveryModel` uses `NWBrowser` Bonjour discovery and reads TXT fields for LAN/tailnet host hints plus gateway/bridge/canvas ports.
- Auto-connect: `BridgeConnectionController` uses stored `node.instanceId` + Keychain token; supports manual host/port; performs `pair-and-hello`.
- Bridge runtime: `BridgeSession` actor owns an `NWConnection`, JSONL frames, `hello`/`hello-ok`, ping/pong, `req/res`, server `event`s, and `invoke` callbacks; stores `canvasHostUrl`.
- Commands: `NodeAppModel` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, `screen.record`, `location.get`. Canvas/camera/screen are blocked when backgrounded.
- Canvas + actions: `WKWebView` with A2UI action bridge; accepts actions from local-network or trusted file URLs; intercepts `clawdbot://` deep links and forwards `agent.request` to the bridge.
- Voice/talk: voice wake sends `voice.transcript` events and syncs triggers via `voicewake.get` + `voicewake.changed`; Talk Mode attaches to the bridge.
### Android node (`apps/android`)
- Discovery + pairing: `BridgeDiscovery` uses mDNS/NSD to find `_clawdbot-bridge._tcp`, with manual host/port fallback.
- Auto-connect: `NodeRuntime` restores a stored token, performs `pair-and-hello`, and reconnects to the last discovered or manual bridge.
- Bridge runtime: `BridgeSession` owns the TCP JSONL session (`hello`/`hello-ok`, ping/pong, `req/res`, `event`, `invoke`); stores `canvasHostUrl`.
- Commands: `NodeRuntime` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, and chat/session events; foreground-only for canvas/camera.
## Components and flows ## Components and flows
- **Gateway (daemon)**
- Maintains WhatsApp (Baileys), Telegram (grammY), Slack (Bolt), Discord (discord.js), Signal (signal-cli), and iMessage (imsg) connections. ### Gateway (daemon)
- Exposes a typed WS API (req/resp + server push events). - Maintains provider connections.
- Validates every inbound frame against JSON Schema; rejects anything before a mandatory `connect`. - Exposes a typed WS API (requests, responses, server-push events).
- **Clients (mac app / CLI / web admin)** - Validates inbound frames against JSON Schema.
- One WS connection per client. - Emits events like `agent`, `chat`, `presence`, `health`, `heartbeat`, `cron`.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`, toggles) and subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
- On macOS, the app can also be invoked via deep links (`clawdbot://agent?...`) which translate into the same Gateway `agent` request path (see [`docs/macos.md`](/platforms/macos)). ### Clients (mac app / CLI / web admin)
- **Agent process (Pi)** - One WS connection per client.
- Spawned by the Gateway on demand for `agent` calls; streams events back over the same WS connection. - Send requests (`health`, `status`, `send`, `agent`, `system-presence`).
- **WebChat** - Subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
- Serves static assets locally.
- Holds a single WS connection to the Gateway for control/data; all sends/agent runs go through the Gateway WS. ### Nodes (macOS / iOS / Android)
- Remote use goes through the same SSH/Tailscale tunnel as other clients. - Connect to the **bridge** (TCP JSONL) rather than the WS server.
- Pair with the Gateway to receive a token.
- Expose commands like `canvas.*`, `camera.*`, `screen.record`, `location.get`.
### WebChat
- Static UI that uses the Gateway WS API for chat history and sends.
- In remote setups, connects through the same SSH/Tailscale tunnel as other
clients.
## Connection lifecycle (single client) ## Connection lifecycle (single client)
``` ```
Client Gateway Client Gateway
| | | |
@@ -62,8 +50,8 @@ Client Gateway
|<------ res (ok) ---------| (or res error + close) |<------ res (ok) ---------| (or res error + close)
| (payload=hello-ok carries snapshot: presence + health) | (payload=hello-ok carries snapshot: presence + health)
| | | |
|<------ event:presence ---| (deltas) |<------ event:presence ---|
|<------ event:tick -------| (keepalive/no-op) |<------ event:tick -------|
| | | |
|------- req:agent ------->| |------- req:agent ------->|
|<------ res:agent --------| (ack: {runId,status:"accepted"}) |<------ res:agent --------| (ack: {runId,status:"accepted"})
@@ -71,44 +59,42 @@ Client Gateway
|<------ res:agent --------| (final: {runId,status,summary}) |<------ res:agent --------| (final: {runId,status,summary})
| | | |
``` ```
## Wire protocol (summary) ## Wire protocol (summary)
- Transport: WebSocket, text frames with JSON payloads. - Transport: WebSocket, text frames with JSON payloads.
- First frame must be `req {type:"req", id, method:"connect", params:{minProtocol, maxProtocol, client:{name,version,platform,mode,instanceId}, caps, auth?, locale?, userAgent? } }`. - First frame **must** be `connect`.
- Server replies `res {type:"res", id, ok:true, payload: hello-ok }` or `ok:false` then closes.
- After handshake: - After handshake:
- Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}` - Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
- Events: `{type:"event", event:"agent"|"presence"|"tick"|"shutdown", payload, seq?, stateVersion?}` - Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token` must match; otherwise the socket closes with policy violation. - If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
- Presence payload is structured, not free text: `{host, ip, version, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }`. must match or the socket closes.
- Agent runs are acked `{runId,status:"accepted"}` then complete with a final res `{runId,status,summary}`; streamed output arrives as `event:"agent"`. - Idempotency keys are required for side-effecting methods (`send`, `agent`) to
- Protocol versions are bumped on breaking changes; clients must match `minClient`; Gateway chooses within client's min/max. safely retry; the server keeps a short-lived dedupe cache.
- Idempotency keys are required for side-effecting methods (`send`, `agent`) to safely retry; server keeps a short-lived dedupe cache.
- Policy in `hello-ok` communicates payload/queue limits and tick interval.
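The frame shapes in the wire protocol summary can be sketched in TypeScript. This is a minimal illustration, not the project's actual TypeBox schema: the helper names (`buildConnect`, `isValidFirstFrame`, `describeFrame`), the protocol numbers, and the client values are assumptions made for the example.

```typescript
// Frame shapes as summarized in the wire protocol notes; field lists are illustrative.
type ReqFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number; stateVersion?: unknown };
type Frame = ReqFrame | ResFrame | EventFrame;

// Hypothetical helper: builds the mandatory first frame (`connect`).
// Protocol numbers and client fields are placeholder values.
function buildConnect(id: string, token?: string): ReqFrame {
  return {
    type: "req",
    id,
    method: "connect",
    params: {
      minProtocol: 1,
      maxProtocol: 1,
      client: { name: "example", version: "0.0.0", platform: "node", mode: "cli", instanceId: "example-1" },
      // The auth token is only attached when a shared token is configured.
      ...(token ? { auth: { token } } : {}),
    },
  };
}

// Hypothetical guard mirroring the rule that a non-JSON or non-connect first frame is rejected.
function isValidFirstFrame(raw: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const frame = parsed as { type?: unknown; method?: unknown };
  return frame.type === "req" && frame.method === "connect";
}

// Small dispatcher showing how the three frame kinds are told apart by `type`.
function describeFrame(frame: Frame): string {
  switch (frame.type) {
    case "req":
      return `req ${frame.method}`;
    case "res":
      return frame.ok ? "res ok" : "res error";
    case "event":
      return `event ${frame.event}`;
  }
}
```

A client would send `JSON.stringify(buildConnect(id, token))` as its first text frame and treat anything other than an `ok` response as a failed handshake.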
## Type system and codegen ## Protocol typing and codegen
- Source of truth: TypeBox (or ArkType) definitions in `protocol/` on the server.
- Build step emits JSON Schema.
- Clients:
- TypeScript: uses the same TypeBox types directly.
- Swift: generated `Codable` models via quicktype from the JSON Schema.
- Validation: AJV on the server for every inbound frame; optional client-side validation for defensive programming.
## Invariants - TypeBox schemas define the protocol.
- Exactly one Gateway controls a single Baileys session per host. No fallbacks to ad-hoc direct Baileys sends. - JSON Schema is generated from those schemas.
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close. - Swift models are generated from the JSON Schema.
- All methods and events are versioned; new fields are additive; breaking changes increment `protocol`.
- No event replay: on seq gaps, clients must refresh (`health` + `system-presence`) and continue; presence is bounded via TTL/max entries.
## Remote access ## Remote access
- Preferred: Tailscale or VPN; alternate: SSH tunnel `ssh -N -L 18789:127.0.0.1:18789 user@host`.
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel. - Preferred: Tailscale or VPN.
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel. - Alternative: SSH tunnel
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- The same handshake + auth token apply over the tunnel.
## Operations snapshot ## Operations snapshot
- Start: `clawdbot gateway` (foreground, logs to stdout).
Supervise with launchd/systemd for restarts.
- Health: request `health` over WS; also surfaced in `hello-ok.health`.
- Metrics/logging: keep outside this spec; gateway should expose Prometheus text or structured logs separately.
## Migration notes - Start: `clawdbot gateway` (foreground, logs to stdout).
- This architecture supersedes the legacy stdin RPC and the ad-hoc TCP control port. New clients should speak only the WS protocol. Legacy compatibility is intentionally dropped. - Health: `health` over WS (also included in `hello-ok`).
- Supervision: launchd/systemd for auto-restart.
## Invariants
- Exactly one Gateway controls a single Baileys session per host.
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
- Events are not replayed; clients must refresh on gaps.
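Because events are not replayed, a client has to notice gaps itself. The sketch below shows one way to do that, assuming `seq` increases by exactly one per event; that assumption and the `SeqTracker` name are illustrative and not taken from the gateway code.

```typescript
// Hypothetical gap detector for the optional `seq` field on event frames.
// Assumption: consecutive events carry consecutive sequence numbers.
class SeqTracker {
  private last: number | undefined;

  // Returns true when a gap is detected and the caller should refresh its state.
  observe(seq: number | undefined): boolean {
    if (seq === undefined) return false; // events without `seq` cannot be checked
    const gap = this.last !== undefined && seq > this.last + 1;
    this.last = seq;
    return gap;
  }
}
```

When `observe` returns `true`, the client would re-request `health` and `system-presence` and then continue processing events.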


@@ -61,7 +61,6 @@ Only the owner number (from `whatsapp.allowFrom`, or the bot's own E.164 when
4) Session-level directives (`/verbose on`, `/think high`, `/new` or `/reset`, `/compact`) apply only to that group's session; send them as standalone messages so they register. Your personal DM session remains independent. 4) Session-level directives (`/verbose on`, `/think high`, `/new` or `/reset`, `/compact`) apply only to that group's session; send them as standalone messages so they register. Your personal DM session remains independent.
## Testing / verification ## Testing / verification
- Automated: `pnpm test -- src/web/auto-reply.test.ts --runInBand` (covers mention gating, history injection, sender suffix).
- Manual smoke: - Manual smoke:
- Send an `@clawd` ping in the group and confirm a reply that references the sender name. - Send an `@clawd` ping in the group and confirm a reply that references the sender name.
- Send a second ping and verify the history block is included then cleared on the next turn. - Send a second ping and verify the history block is included then cleared on the next turn.


@@ -1,157 +1,114 @@
--- ---
summary: "Plan for models CLI: scan, list, aliases, fallbacks, status" summary: "Models CLI: list, set, aliases, fallbacks, scan, status"
read_when: read_when:
- Adding or modifying models CLI (models list/set/scan/aliases/fallbacks) - Adding or modifying models CLI (models list/set/scan/aliases/fallbacks)
- Changing model fallback behavior or selection UX - Changing model fallback behavior or selection UX
- Updating model scan probes (tools/images) - Updating model scan probes (tools/images)
--- ---
# Models CLI plan # Models CLI
See [`docs/model-failover.md`](/concepts/model-failover) for how auth profiles rotate (OAuth vs API keys), cooldowns, and how that interacts with model fallbacks. See [/concepts/model-failover](/concepts/model-failover) for auth profile
rotation, cooldowns, and how that interacts with fallbacks.
Goal: give clear model visibility + control (configured vs available), plus scan tooling ## How model selection works
that prefers tool-call + image-capable models and maintains ordered fallbacks.
## How Clawdbot models work (quick explainer)
Clawdbot selects models in this order: Clawdbot selects models in this order:
1) The configured **primary** model (`agent.model.primary`).
2) If it fails, fallbacks in `agent.model.fallbacks` (in order).
3) Auth failover happens **inside** the provider first (see [/concepts/model-failover](/concepts/model-failover)).
Key pieces: 1) **Primary** model (`agent.model.primary` or `agent.model`).
- `provider/model` is the canonical model id (e.g. `anthropic/claude-opus-4-5`). 2) **Fallbacks** in `agent.model.fallbacks` (in order).
- `agent.models` is the **allowlist/catalog** of models Clawdbot can use, with optional aliases and provider params. 3) **Provider auth failover** happens inside a provider before moving to the
- `agent.imageModel` is only used when the primary model **can't** accept images. next model.
- `models.providers` lets you add custom providers + models (written to `models.json`).
- `/model <id>` switches the active model for the current session; `/model list` shows what's allowed.
Related: Related:
- Context limits are model-specific; long sessions may trigger compaction. See [/concepts/compaction](/concepts/compaction). - `agent.models` is the allowlist/catalog of models Clawdbot can use (plus aliases).
- `agent.imageModel` is used **only when** the primary model can't accept images.
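The selection order described here (primary, then fallbacks, with auth failover inside a provider before moving on) can be sketched as a nested loop. This is a simplified synchronous illustration: `candidateOrder`, `runWithFallback`, the callback shapes, and the de-duplication step are assumptions, not the real runtime.

```typescript
type ModelConfig = { primary: string; fallbacks?: string[] };

// Candidate order: primary first, then fallbacks in their listed order.
// Dropping duplicates is an assumption made for this sketch.
function candidateOrder(cfg: ModelConfig): string[] {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const ref of [cfg.primary, ...(cfg.fallbacks ?? [])]) {
    if (!seen.has(ref)) {
      seen.add(ref);
      ordered.push(ref);
    }
  }
  return ordered;
}

// Tries every auth profile of a model before moving to the next model.
// `profilesFor` and `attempt` are hypothetical callbacks supplied by the caller;
// `attempt` signals failure by throwing.
function runWithFallback<T>(
  cfg: ModelConfig,
  profilesFor: (modelRef: string) => string[],
  attempt: (modelRef: string, profile: string) => T,
): T {
  let lastError: unknown = new Error("no model or auth profile available");
  for (const modelRef of candidateOrder(cfg)) {
    for (const profile of profilesFor(modelRef)) {
      try {
        return attempt(modelRef, profile);
      } catch (err) {
        lastError = err; // remember the failure and keep going
      }
    }
  }
  throw lastError;
}
```

The inner loop is what keeps a provider-level auth problem from immediately consuming a fallback slot.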
## Model recommendations ## Config keys (overview)
- [Claude Opus 4.5](https://www.anthropic.com/claude/opus): default primary for assistant + general work. It's pricey and cap-prone, so consider the [Claude Max $200 subscription](https://www.anthropic.com/pricing/) if you live here. - `agent.model.primary` and `agent.model.fallbacks`
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): default fallback when Opus caps out. Similar behavior with fewer limit headaches. - `agent.imageModel.primary` and `agent.imageModel.fallbacks`
- [GPT-5.2-Codex](https://developers.openai.com/codex/models): recommended for coding and sub-agents. Prefer the [Codex CLI](https://developers.openai.com/codex/cli) if you want the strongest feel. - `agent.models` (allowlist + aliases + provider params)
- `models.providers` (custom providers written into `models.json`)
Suggested stacks: Model refs are normalized to lowercase. Provider aliases like `z.ai/*` normalize
- Assistant-first: Opus 4.5 primary → Sonnet 4.5 fallback. to `zai/*`.
- Agentic coding: Opus 4.5 primary → GPT-5.2-Codex for sub-agents → Sonnet 4.5 fallback.
## Model discussions (community notes) ## CLI commands
Anecdotal notes from the Discord thread on January 4-5, 2026. Treat as “reported by users,” not a benchmark. ```bash
clawdbot models list
clawdbot models status
clawdbot models set <provider/model>
clawdbot models set-image <provider/model>
**Reported working well** clawdbot models aliases list
- [Claude Opus 4.5](https://www.anthropic.com/claude/opus): best overall quality in Clawdbot, especially for “assistant” work. Tradeoff is cost and hitting usage limits quickly. clawdbot models aliases add <alias> <provider/model>
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): common fallback when Opus caps out. Similar behavior with fewer limit headaches. clawdbot models aliases remove <alias>
- [Gemini 3 Pro](https://deepmind.google/en/models/gemini/pro/): some users felt it maps well to Clawdbot's structure. Vibe was “fits the framework” more than “best at everything.”
- [GLM](https://www.zhipuai.cn/en/): used successfully as a worker model under orchestration. Seen as strong for delegated/secondary tasks, not the primary brain.
- [MiniMax M2.1](https://platform.minimax.io/docs/guides/models-intro): “good enough” for grunt work or a cheap fallback. Community nickname was “Temu-Sonnet,” i.e. usable but not Sonnet-level polish.
**Mixed / unclear** clawdbot models fallbacks list
- [Antigravity](https://blog.google/technology/ai/google-ai-updates-november-2025/) (Claude Opus access): some reported extra Opus quota. Pricing/limits were unclear, so the value is hard to predict. clawdbot models fallbacks add <provider/model>
clawdbot models fallbacks remove <provider/model>
clawdbot models fallbacks clear
**Reported weak in Clawdbot** clawdbot models image-fallbacks list
- [GPT-5.2-Codex](https://developers.openai.com/codex/models) inside Clawdbot: reported as rough for conversation/assistant tasks when embedded. Same notes said Codex felt stronger via the [Codex CLI](https://developers.openai.com/codex/cli) than embedded use. clawdbot models image-fallbacks add <provider/model>
- [Grok](https://docs.x.ai/docs/models/grok-4): people tried it and then abandoned it. No strong upside showed up in the notes. clawdbot models image-fallbacks remove <provider/model>
clawdbot models image-fallbacks clear
```
**Theme** `clawdbot models` (no subcommand) is a shortcut for `models status`.
- Token burn feels higher than expected in long sessions; people suspect context buildup + tool outputs. Pruning/compaction helps. Check session logs before blaming providers. See [/concepts/session](/concepts/session) and [/concepts/model-failover](/concepts/model-failover).
Want a tailored stack? Share whether you're using Clawdbot or Clawdis and your main workload (agentic coding vs “assistant” work), and we can suggest a primary + fallback set based on these reports. ### `models list`
## Models CLI Shows configured models by default. Useful flags:
See [/cli](/cli) for the full command tree and CLI flags. - `--all`: full catalog
- `--local`: local providers only
- `--provider <name>`: filter by provider
- `--plain`: one model per line
- `--json`: machine-readable output
### CLI output (list + status) ### `models status`
`clawdbot models list` (default) prints a table with these columns: Shows the resolved primary model, fallbacks, image model, and an auth overview
- `Model`: `provider/model` key (truncated in TTY). of configured providers. `--plain` prints only the resolved primary model.
- `Input`: `text` or `text+image`.
- `Ctx`: context window in K tokens (from the model registry).
- `Local`: `yes/no` when the provider base URL is local.
- `Auth`: `yes/no` when the provider has usable auth.
- `Tags`: origin + role hints.
Common tags: ## Scanning (OpenRouter free models)
- `default`: resolved default model.
- `fallback#N`: `agent.model.fallbacks` order.
- `image`: `agent.imageModel.primary`.
- `img-fallback#N`: `agent.imageModel.fallbacks` order.
- `configured`: present in `agent.models`.
- `alias:<name>`: alias from `agent.models.*.alias`.
- `missing`: referenced in config but not found in the registry.
Output formats: `clawdbot models scan` inspects OpenRouter's **free model catalog** and can
- `--plain`: prints only `provider/model` keys (one per line). optionally probe models for tool and image support.
- `--json`: `{ count, models: [{ key, name, input, contextWindow, local, available, tags, missing }] }`.
`clawdbot models status` prints the resolved defaults, fallbacks, image model, aliases, Key flags:
and an **Auth overview** section showing which providers have profiles/env/models.json keys.
`--plain` prints the resolved default model only; `--json` returns a structured object for tooling.
## Config changes - `--no-probe`: skip live probes (metadata only)
- `--min-params <b>`: minimum parameter size (billions)
- `--max-age-days <days>`: skip older models
- `--provider <name>`: provider prefix filter
- `--max-candidates <n>`: fallback list size
- `--set-default`: set `agent.model.primary` to the first selection
- `--set-image`: set `agent.imageModel.primary` to the first image selection
- `agent.models` (configured model catalog + aliases). Probing requires an OpenRouter API key (from auth profiles or
- `agent.models.*.params` (provider-specific API params passed through to requests). `OPENROUTER_API_KEY`). Without a key, use `--no-probe` to list candidates only.
- `agent.model.primary` + `agent.model.fallbacks`.
- `agent.imageModel.primary` + `agent.imageModel.fallbacks` (optional).
- `auth.profiles` + `auth.order` for per-provider auth failover.
## Scan behavior (models scan) Scan results are ranked by:
1) Image support
2) Tool latency
3) Context size
4) Parameter count
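The ranking order listed here maps naturally onto a comparator. The sketch below assumes lower tool latency, larger context, and larger parameter count are preferred; the `ScanResult` shape and its field names are hypothetical and not taken from the scan implementation.

```typescript
// Hypothetical per-model scan result; field names are illustrative.
type ScanResult = {
  id: string;
  imageOk: boolean;
  toolLatencyMs: number;
  contextWindow: number;
  paramsB: number; // parameter count in billions
};

// Orders results by: image support, then lower tool latency,
// then larger context window, then larger parameter count.
function compareScanResults(a: ScanResult, b: ScanResult): number {
  if (a.imageOk !== b.imageOk) return a.imageOk ? -1 : 1;
  if (a.toolLatencyMs !== b.toolLatencyMs) return a.toolLatencyMs - b.toolLatencyMs;
  if (a.contextWindow !== b.contextWindow) return b.contextWindow - a.contextWindow;
  return b.paramsB - a.paramsB;
}

// Returns a ranked copy so the caller's array is left untouched.
function rankScanResults(results: ScanResult[]): ScanResult[] {
  return [...results].sort(compareScanResults);
}
```

Taking the first `n` entries of the ranked list would then correspond to a `--max-candidates <n>` style cut-off.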
<<<<<<< HEAD
Input Input
- OpenRouter `/models` list (filter `:free`) - OpenRouter `/models` list (filter `:free`)
- Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/environment)) - Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/environment))
- Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates` - Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates`
- Probe controls: `--timeout`, `--concurrency` - Probe controls: `--timeout`, `--concurrency`
Probes (direct pi-ai complete) When run in a TTY, you can select fallbacks interactively. In non-interactive
- Tool-call probe (required): mode, pass `--yes` to accept defaults.
- Provide a dummy tool, verify tool call emitted.
- Image probe (preferred):
- Prompt includes 1x1 PNG; success if no "unsupported image" error.
Scoring/selection ## Models registry (`models.json`)
- Prefer models passing tool + image for text/tool fallbacks.
- Prefer image-only models for image tool fallback (even if tool probe fails).
- Rank by: image ok, then lower tool latency, then larger context, then params.
Interactive selection (TTY) Custom providers in `models.providers` are written into `models.json` under the
- Multiselect list with per-model stats: agent directory (default `~/.clawdbot/agents/<agentId>/models.json`). This file
- model id, tool ok, image ok, median latency, context, inferred params. is merged by default unless `models.mode` is set to `replace`.
- Pre-select top N (default 6).
- Non-TTY: auto-select; require `--yes`/`--no-input` to apply.
Output
- Writes `agent.model.fallbacks` ordered.
- Writes `agent.imageModel.fallbacks` ordered (image-capable models).
- Ensures `agent.models` entries exist for selected models.
- Optional `--set-default` to set `agent.model.primary`.
- Optional `--set-image` to set `agent.imageModel.primary`.
## Runtime fallback
- On model failure: try `agent.model.fallbacks` in order.
- Per-provider auth failover uses `auth.order` (or stored profile order) **before**
moving to the next model.
- Image routing uses `agent.imageModel` **only when configured** and the primary
model lacks image input.
- Persist last successful provider/model to session entry; auth profile success is global.
- See [`docs/model-failover.md`](/concepts/model-failover) for auth profile rotation, cooldowns, and timeout handling.
## Tests
- Unit: scan selection ordering + probe classification.
- CLI: list/aliases/fallbacks add/remove + scan writes config.
- Status: shows last used model + fallbacks.
## Docs
- Update [`docs/configuration.md`](/gateway/configuration) with `agent.models` + `agent.model` + `agent.imageModel`.
- Keep this doc current when CLI surface or scan logic changes.
- Note provider aliases like `z.ai/*` -> `zai/*` when relevant.
- Provider ids in model refs are normalized to lowercase.
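The normalization rules for model refs can be illustrated with a small helper. This is a sketch under assumptions: the alias table contains only the `z.ai` to `zai` mapping named in the docs, the function name is hypothetical, and lowercasing the whole ref (not just the provider id) is an assumption about the actual behavior.

```typescript
// Hypothetical alias table; only the `z.ai` -> `zai` mapping comes from the docs.
const PROVIDER_ALIASES: Record<string, string> = { "z.ai": "zai" };

// Normalizes a `provider/model` ref: lowercases it and resolves provider aliases.
// Assumption: the model part is lowercased along with the provider id.
function normalizeModelRef(ref: string): string {
  const trimmed = ref.trim().toLowerCase();
  const slash = trimmed.indexOf("/");
  if (slash <= 0 || slash === trimmed.length - 1) {
    throw new Error(`expected provider/model, got "${ref}"`);
  }
  const provider = trimmed.slice(0, slash);
  const model = trimmed.slice(slash + 1);
  return `${PROVIDER_ALIASES[provider] ?? provider}/${model}`;
}
```

Normalizing on both write and lookup keeps config keys and catalog entries comparable with a plain string match.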


@@ -97,7 +97,7 @@ At runtime:
- if `expires` is in the future → use the stored access token - if `expires` is in the future → use the stored access token
- if expired → refresh (under a file lock) and overwrite the stored credentials - if expired → refresh (under a file lock) and overwrite the stored credentials
See implementation: `src/agents/auth-profiles.ts`. The refresh flow is automatic; you generally don't need to manage tokens manually.
## Multiple accounts (profiles) + routing ## Multiple accounts (profiles) + routing

View File

@@ -7,127 +7,93 @@ read_when:
--- ---
# Presence # Presence
Clawdbot “presence” is a lightweight, best-effort view of: Clawdbot “presence” is a lightweight, best-effort view of:
- The **Gateway** itself (one per host), and - the **Gateway** itself, and
- The **clients connected to the Gateway** (mac app, WebChat, CLI, etc.). - **clients connected to the Gateway** (mac app, WebChat, CLI, etc.)
Presence is used primarily to render the mac app's **Instances** tab and to provide quick operator visibility. Presence is used primarily to render the macOS app's **Instances** tab and to
provide quick operator visibility.
## The data model ## Presence fields (what shows up)
Presence entries are structured objects with (some) fields: Presence entries are structured objects with fields like:
- `instanceId` (optional but strongly recommended): stable client identity used for dedupe
- `host`: a human-readable name (often the machine name) - `instanceId` (optional but strongly recommended): stable client identity
- `ip`: best-effort IP address (may be missing or stale) - `host`: humanfriendly host name
- `ip`: besteffort IP address
- `version`: client version string - `version`: client version string
- `deviceFamily` (optional): hardware family like `iPad`, `iPhone`, `Mac` - `deviceFamily` / `modelIdentifier`: hardware hints
- `modelIdentifier` (optional): hardware model identifier like `iPad16,6` or `Mac16,6` - `mode`: `gateway`, `app`, `webchat`, `cli`, `node`, ...
- `mode`: e.g. `gateway`, `app`, `webchat`, `cli` - `lastInputSeconds`: “seconds since last user input” (if known)
- `lastInputSeconds` (optional): “seconds since last user input” for that client machine - `reason`: `self`, `connect`, `node-connected`, `periodic`, ...
- `reason`: a short marker like `self`, `connect`, `node-connected`, `node-disconnected`, `periodic`, `instances-refresh`
- `text`: legacy/debug summary string (kept for backwards compatibility and UI display)
- `ts`: last update timestamp (ms since epoch) - `ts`: last update timestamp (ms since epoch)
## Producers (where presence comes from) ## Producers (where presence comes from)
Presence entries are produced by multiple sources and then **merged**. Presence entries are produced by multiple sources and **merged**.
### 1) Gateway self entry ### 1) Gateway self entry
The Gateway seeds a “self” entry at startup so UIs always show at least the current gateway host. The Gateway always seeds a “self” entry at startup so UIs show the gateway host
even before any clients connect.
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`initSelfPresence()`). ### 2) WebSocket connect
### 2) WebSocket connect (connection-derived presence) Every WS client begins with a `connect` request. On successful handshake the
Gateway upserts a presence entry for that connection.
Every WS client must begin with a `connect` request. On successful handshake, the Gateway upserts a presence entry for that connection. #### Why one-off CLI commands don't show up
This is meant to answer: “Which clients are currently connected?” The CLI often connects for short, one-off commands. To avoid spamming the
Instances list, `client.mode === "cli"` is **not** turned into a presence entry.
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (connect handling uses `connect.params.client.instanceId` when provided; otherwise falls back to `connId`). ### 3) `system-event` beacons
#### Why one-off CLI commands do not show up Clients can send richer periodic beacons via the `system-event` method. The mac
app uses this to report host name, IP, and `lastInputSeconds`.
The CLI connects to the Gateway to execute one-off commands (health/status/send/agent/etc.). These are not “nodes” and would spam the Instances list, so the Gateway does not create presence entries for clients with `client.mode === "cli"`. ### 4) Node bridge beacons
### 3) `system-event` beacons (client-reported presence)
Clients can publish richer periodic beacons via the `system-event` method. The mac app uses this to report:
- a human-friendly host name
- its best-known IP address
- `lastInputSeconds`
Implementation:
- Gateway: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) handles method `system-event` by calling `updateSystemPresence(...)`.
- mac app beaconing: [`apps/macos/Sources/Clawdbot/PresenceReporter.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/PresenceReporter.swift).
### 4) Node bridge beacons (gateway-owned presence)
When a node bridge connection authenticates, the Gateway emits a presence entry When a node bridge connection authenticates, the Gateway emits a presence entry
for that node and starts periodic refresh beacons so it does not expire. for that node and refreshes it periodically so it doesn't expire.
- Connect/disconnect markers: `node-connected`, `node-disconnected`
- Periodic heartbeat: every 3 minutes (`reason: periodic`)
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (node bridge handlers + timer beacons).
## Merge + dedupe rules (why `instanceId` matters) ## Merge + dedupe rules (why `instanceId` matters)
All producers write into a single in-memory presence map. Presence entries are stored in a single in-memory map:
Key points: - Entries are keyed by a **presence key**.
- Entries are **keyed** by a “presence key”. If two producers use the same key, they update the same entry. - The best key is a stable `instanceId` that survives restarts.
- The best key is a stable, opaque `instanceId` that does not change across restarts. - Keys are case-insensitive.
- Keys are treated case-insensitively.
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`normalizePresenceKey()`). If a client reconnects without a stable `instanceId`, it may show up as a
**duplicate** row.
### mac app identity (stable UUID) ## TTL and bounded size
The mac app uses a persisted UUID as `instanceId` so: Presence is intentionally ephemeral:
- restarts/reconnects do not create duplicates
- renaming the Mac does not create a new “instance”
- debug/release builds can share the same identity
Implementation: [`apps/macos/Sources/Clawdbot/InstanceIdentity.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstanceIdentity.swift). - **TTL:** entries older than 5 minutes are pruned
- **Max entries:** 200 (oldest dropped first)
`displayName` (machine name) is used for UI, while `instanceId` is used for dedupe. This keeps the list fresh and avoids unbounded memory growth.
## TTL and bounded size (why stale rows disappear)
Presence entries are not permanent:
- TTL: entries older than 5 minutes are pruned
- Max: map is capped at 200 entries (LRU by `ts`)
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`TTL_MS`, `MAX_ENTRIES`, pruning in `listSystemPresence()`).
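
The merge, TTL, and size-cap rules above can be sketched as a small in-memory store. This is an illustrative sketch under stated assumptions, not the code in `src/infra/system-presence.ts`: the `PresenceStore` class and its method names are hypothetical, and only the 5 minute TTL, the 200 entry cap, and the case-insensitive keys come from the text.

```typescript
// Hypothetical sketch of the merge/TTL rules described above.
interface PresenceEntry {
  instanceId?: string;
  host?: string;
  ip?: string;
  mode?: string;
  reason?: string;
  ts: number; // last update, ms since epoch
}

const TTL_MS = 5 * 60 * 1000; // entries older than 5 minutes are pruned
const MAX_ENTRIES = 200; // oldest entries are dropped first

class PresenceStore {
  private entries = new Map<string, PresenceEntry>();

  // Keys are case-insensitive, so normalize before every lookup.
  private static normalizeKey(key: string): string {
    return key.trim().toLowerCase();
  }

  // Producers that share a key update the same entry instead of adding a row.
  upsert(key: string, patch: Partial<PresenceEntry>, now: number = Date.now()): void {
    const k = PresenceStore.normalizeKey(key);
    const previous = this.entries.get(k);
    this.entries.set(k, { ...previous, ...patch, ts: now });
  }

  // Pruning happens on read: drop expired rows, then enforce the size cap.
  list(now: number = Date.now()): PresenceEntry[] {
    this.entries.forEach((entry, k) => {
      if (now - entry.ts > TTL_MS) this.entries.delete(k);
    });
    const sorted = Array.from(this.entries.entries()).sort((a, b) => a[1].ts - b[1].ts);
    while (sorted.length > MAX_ENTRIES) {
      const oldest = sorted.shift();
      if (oldest) this.entries.delete(oldest[0]);
    }
    return sorted.map((pair) => pair[1]);
  }
}
```

With this shape, a connect-derived entry and a later beacon that both use the same `instanceId` collapse into one row, which is why a stable `instanceId` avoids duplicates.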
## Remote/tunnel caveat (loopback IPs) ## Remote/tunnel caveat (loopback IPs)
When a client connects over an SSH tunnel / local port forward, the Gateway may see the remote address as loopback (`127.0.0.1`). When a client connects over an SSH tunnel / local port forward, the Gateway may
see the remote address as `127.0.0.1`. To avoid overwriting a good client-reported
IP, loopback remote addresses are ignored.
To avoid degrading an otherwise-correct client beacon IP, the Gateway avoids writing loopback remote addresses into presence entries. ## Consumers
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (`isLoopbackAddress()`).
## Consumers (who reads presence)
### macOS Instances tab ### macOS Instances tab
The mac app's Instances tab renders the result of `system-presence`. The macOS app renders the output of `system-presence` and applies a small status
indicator (Active/Idle/Stale) based on the age of the last update.
Implementation:
- View: [`apps/macos/Sources/Clawdbot/InstancesSettings.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesSettings.swift)
- Store: [`apps/macos/Sources/Clawdbot/InstancesStore.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesStore.swift)
The Instances rows show a small presence indicator (Active/Idle/Stale) based on
the last beacon age. The label is derived from the entry timestamp (`ts`).
The store refreshes periodically and also applies `presence` WS events.
## Debugging tips ## Debugging tips
- To see the raw list, call `system-presence` against the gateway. - To see the raw list, call `system-presence` against the Gateway.
- If you see duplicates: - If you see duplicates:
- confirm clients send a stable `instanceId` in the handshake (`connect.params.client.instanceId`) - confirm clients send a stable `instanceId` in the handshake
- confirm beaconing uses the same `instanceId` - confirm periodic beacons use the same `instanceId`
- check whether the connection-derived entry is missing `instanceId` (then it will be keyed by `connId` and duplicates are expected on reconnect) - check whether the connection-derived entry is missing `instanceId` (duplicates are expected)

View File

@@ -3,24 +3,97 @@ summary: "Routing rules per provider (WhatsApp, Telegram, Discord, web) and shar
read_when: read_when:
- Changing provider routing or inbox behavior - Changing provider routing or inbox behavior
--- ---
# Providers & Routing # Providers & routing
Updated: 2026-01-06 Updated: 2026-01-06
Goal: deterministic replies per provider, while supporting multi-agent + multi-account routing. Clawdbot routes replies **back to the provider where a message came from**. The
model does not choose a provider; routing is deterministic and controlled by the
host configuration.
- **Provider**: provider label (`whatsapp`, `webchat`, `telegram`, `discord`, `signal`, `imessage`, …). Routing is fixed: replies go back to the origin provider; the model doesn't choose. ## Key terms
- **AccountId**: provider account instance (e.g. WhatsApp account `"default"` vs `"work"`). Not every provider supports multi-account yet.
- **AgentId**: one isolated “brain” (workspace + per-agent agentDir + per-agent session store). - **Provider**: `whatsapp`, `telegram`, `discord`, `slack`, `signal`, `imessage`, `webchat`.
- **Reply context:** inbound replies include `ReplyToId`, `ReplyToBody`, and `ReplyToSender`, and the quoted context is appended to `Body` as a `[Replying to ...]` block. - **AccountId**: per-provider account instance (when supported).
- **Canonical direct session (per agent):** direct chats collapse to `agent:<agentId>:<mainKey>` (default `main`). Groups/channels stay isolated per agent: - **AgentId**: an isolated workspace + session store (“brain”).
- group: `agent:<agentId>:<provider>:group:<id>` - **SessionKey**: the internal bucket key used to store context and control concurrency.
- channel/room: `agent:<agentId>:<provider>:channel:<id>`
- Telegram forum topics: `agent:<agentId>:telegram:group:<chatId>:topic:<threadId>` ## Session key shapes (examples)
- **Session store:** per-agent store lives under `~/.clawdbot/agents/<agentId>/sessions/sessions.json` (override via `session.store` with `{agentId}` templating). JSONL transcripts live next to it.
- **WebChat:** attaches to the selected agent's main session (so desktop reflects cross-provider history for that agent). Direct messages collapse to the agent's **main** session:
- **Implementation hints:**
- Set `Provider` + `AccountId` in each ingress. - `agent:<agentId>:<mainKey>` (default: `agent:main:main`)
- Route inbound to an agent via `routing.bindings` (match on `provider`, `accountId`, plus optional peer/guild/team).
- Keep routing deterministic: originate → same provider. Use the gateway WebSocket for sends; avoid side channels. Groups and channels remain isolated per provider:
- Do not let the agent emit “send to X” decisions; keep that policy in the host code.
- Groups: `agent:<agentId>:<provider>:group:<id>`
- Channels/rooms: `agent:<agentId>:<provider>:channel:<id>`
Threads:
- Slack/Discord threads append `:thread:<threadId>` to the base key.
- Telegram forum topics embed `:topic:<topicId>` in the group key.
Examples:
- `agent:main:telegram:group:-1001234567890:topic:42`
- `agent:main:discord:channel:123456:thread:987654`
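
The key shapes listed above can be assembled with a small helper. This is a sketch only: `buildSessionKey` and the `SessionTarget` type are hypothetical names introduced for illustration, and the real code may build keys differently.

```typescript
// Hypothetical helper; the key shapes come from the doc, the function and types do not.
type SessionTarget =
  | { kind: "direct" }
  | {
      kind: "group" | "channel";
      provider: string;
      id: string;
      topicId?: string; // Telegram forum topic
      threadId?: string; // Slack/Discord thread
    };

function buildSessionKey(agentId: string, target: SessionTarget, mainKey: string = "main"): string {
  const base = `agent:${agentId}`;
  // Direct messages collapse to the agent's main session.
  if (target.kind === "direct") return `${base}:${mainKey}`;
  // Groups and channels stay isolated per provider.
  let key = `${base}:${target.provider}:${target.kind}:${target.id}`;
  if (target.topicId) key += `:topic:${target.topicId}`;
  if (target.threadId) key += `:thread:${target.threadId}`;
  return key;
}
```

For example, `buildSessionKey("main", { kind: "direct" })` yields `agent:main:main`, the default direct-message key.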
## Routing rules (how an agent is chosen)
Routing picks **one agent** for each inbound message:
1. **Exact peer match** (`routing.bindings` with `peer.kind` + `peer.id`).
2. **Guild match** (Discord) via `guildId`.
3. **Team match** (Slack) via `teamId`.
4. **Account match** (`accountId` on the provider).
5. **Provider match** (any account on that provider).
6. **Default agent** (`routing.defaultAgentId`, fallback to `main`).
The matched agent determines which workspace and session store are used.
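
The priority order above can be sketched as a tiered lookup. This is a hedged illustration, not the actual router: `resolveAgentId`, `Inbound`, and the rule that a binding belongs to the most specific field it sets are all assumptions made for the example; only the tier order and the `routing.bindings` field names come from the text.

```typescript
// Hypothetical sketch of the routing tiers; types and function names are illustrative.
interface Peer { kind: string; id: string }
interface BindingMatch {
  provider: string;
  accountId?: string;
  guildId?: string;
  teamId?: string;
  peer?: Peer;
}
interface Binding { match: BindingMatch; agentId: string }
interface Inbound {
  provider: string;
  accountId?: string;
  guildId?: string;
  teamId?: string;
  peer?: Peer;
}

function resolveAgentId(msg: Inbound, bindings: Binding[], defaultAgentId: string = "main"): string {
  const sameProvider = bindings.filter((b) => b.match.provider === msg.provider);
  // Tiers are checked from most to least specific; the first hit wins.
  const tiers: Array<(m: BindingMatch) => boolean> = [
    (m) => !!m.peer && !!msg.peer && m.peer.kind === msg.peer.kind && m.peer.id === msg.peer.id,
    (m) => !m.peer && !!m.guildId && m.guildId === msg.guildId,
    (m) => !m.peer && !!m.teamId && m.teamId === msg.teamId,
    (m) => !m.peer && !m.guildId && !m.teamId && !!m.accountId && m.accountId === msg.accountId,
    (m) => !m.peer && !m.guildId && !m.teamId && !m.accountId, // provider-wide binding
  ];
  for (const tier of tiers) {
    const hit = sameProvider.find((b) => tier(b.match));
    if (hit) return hit.agentId;
  }
  return defaultAgentId; // routing.defaultAgentId, falling back to "main"
}
```

Because the tiers are evaluated in order, a peer-specific binding always beats a provider-wide binding for the same provider, regardless of where each appears in the list.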
## Config overview
- `routing.defaultAgentId`: default agent when no binding matches.
- `routing.agents`: named agent definitions (workspace, model, etc.).
- `routing.bindings`: map inbound providers/accounts/peers to agents.
Example:
```json5
{
routing: {
defaultAgentId: "main",
agents: {
support: { name: "Support", workspace: "~/clawd-support" }
},
bindings: [
{ match: { provider: "slack", teamId: "T123" }, agentId: "support" },
{ match: { provider: "telegram", peer: { kind: "group", id: "-100123" } }, agentId: "support" }
]
}
}
```
## Session storage
Session stores live under the state directory (default `~/.clawdbot`):
- `~/.clawdbot/agents/<agentId>/sessions/sessions.json`
- JSONL transcripts live alongside the store
You can override the store path via `session.store` and `{agentId}` templating.
## WebChat behavior
WebChat attaches to the **selected agent** and defaults to the agent's main
session. Because of this, WebChat lets you see cross-provider context for that
agent in one place.
## Reply context
Inbound replies include:
- `ReplyToId`, `ReplyToBody`, and `ReplyToSender` when available.
- Quoted context is appended to `Body` as a `[Replying to ...]` block.
This is consistent across providers.

View File

@@ -12,7 +12,7 @@ We now serialize command-based auto-replies (WhatsApp Web listener) through a ti
- Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools. - Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools.
## How it works ## How it works
- [`src/process/command-queue.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/process/command-queue.ts) holds a lane-aware FIFO queue and drains each lane synchronously. - A lane-aware FIFO queue drains each lane synchronously.
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session. - `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agent.maxConcurrent`. - Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agent.maxConcurrent`.
- When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting. - When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting.
@@ -74,4 +74,4 @@ Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
## Troubleshooting ## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining. - If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- `enqueueCommand` exposes a lightweight `getQueueSize()` helper if you need to surface queue depth in future diagnostics. - If you need queue depth, enable verbose logs and watch for queue timing lines.

View File

@@ -21,7 +21,7 @@ Goal: small, hard-to-misuse tool set so agents can list sessions, fetch history,
- Hooks use `hook:<uuid>` unless explicitly set. - Hooks use `hook:<uuid>` unless explicitly set.
- Node bridge uses `node-<nodeId>` unless explicitly set. - Node bridge uses `node-<nodeId>` unless explicitly set.
`global` and `unknown` are internal-only and never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`. `global` and `unknown` are reserved values and are never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`.
## sessions_list ## sessions_list
List sessions as an array of rows. List sessions as an array of rows.

View File

@@ -8,7 +8,7 @@ read_when:
ClaudeBot builds a custom system prompt for every agent run. The prompt is **Clawdbot-owned** and does not use the p-coding-agent default prompt. ClaudeBot builds a custom system prompt for every agent run. The prompt is **Clawdbot-owned** and does not use the p-coding-agent default prompt.
The prompt is assembled in `src/agents/system-prompt.ts` and injected by `src/agents/pi-embedded-runner.ts`. The prompt is assembled by Clawdbot and injected into each agent run.
## Structure ## Structure
@@ -56,9 +56,3 @@ Skills are **not** auto-injected. Instead, the prompt instructs the model to use
``` ```
This keeps the base prompt small while still enabling targeted skill usage. This keeps the base prompt small while still enabling targeted skill usage.
## Code references
- Prompt text: `src/agents/system-prompt.ts`
- Prompt assembly + injection: `src/agents/pi-embedded-runner.ts`
- Bootstrap trimming: `src/agents/pi-embedded-helpers.ts`

View File

@@ -3,40 +3,34 @@ summary: "TypeBox schemas as the single source of truth for the gateway protocol
read_when: read_when:
- Updating protocol schemas or codegen - Updating protocol schemas or codegen
--- ---
# TypeBox as Protocol Source of Truth # TypeBox as protocol source of truth
Last updated: 2025-12-09 Last updated: 2026-01-08
We use TypeBox schemas in [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) as the single source of truth for the Gateway control plane (connect/req/res/event frames and payloads). All derived artifacts should be generated from these schemas, not edited by hand. TypeBox schemas define the Gateway control plane (connect/req/res/event frames and
payloads). All generated artifacts must come from these schemas.
## Current pipeline ## Current pipeline
- **TypeBox → JSON Schema**: `pnpm protocol:gen` writes [`dist/protocol.schema.json`](https://github.com/clawdbot/clawdbot/blob/main/dist/protocol.schema.json) (draft-07) and runs AJV in the server tests. - `pnpm protocol:gen`
- **TypeBox → Swift**: `pnpm protocol:gen:swift` generates [`apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift). - writes the JSON Schema output (draft-07)
- `pnpm protocol:gen:swift`
- generates Swift gateway models
- `pnpm protocol:check`
- runs both generators and verifies the output is committed
## Problem ## Swift codegen behavior
- We want strong typing in Swift, including a sealed `GatewayFrame` enum with a discriminator and a forward-compatible `unknown` case. The Swift generator emits:
## Preferred plan (next step) - `GatewayFrame` enum with `req`, `res`, `event`, and `unknown` cases
- Strongly typed payload structs/enums
- `ErrorCode` values and `GATEWAY_PROTOCOL_VERSION`
- Add a small, custom Swift generator driven directly by the TypeBox schemas: Unknown frame types are preserved as raw payloads for forward compatibility.
- Emit a sealed `enum GatewayFrame: Codable { case req(RequestFrame), res(ResponseFrame), event(EventFrame) }`.
- Emit strongly typed payload structs/enums (`ConnectParams`, `HelloOk`, `RequestFrame`, `ResponseFrame`, `EventFrame`, `PresenceEntry`, `Snapshot`, `StateVersion`, `ErrorShape`, `AgentEvent`, `TickEvent`, `ShutdownEvent`, `SendParams`, `AgentParams`, `ErrorCode`, `PROTOCOL_VERSION`).
- Custom `init(from:)` / `encode(to:)` enforces the `type` discriminator and can include an `unknown` case for forward compatibility.
- Wire a new script (e.g., `pnpm protocol:gen:swift`) into `protocol:check` so CI fails if the generated Swift is stale.
Why this path: ## When you change schemas
- Single source of truth stays TypeBox; no new IDL to maintain.
- Predictable, strongly typed Swift (no optional soup).
- Small deterministic codegen (~150-200 LOC script) we control.
## Alternative (if we want off-the-shelf codegen) 1) Update the TypeBox schemas.
2) Run `pnpm protocol:check`.
- Wrap the existing JSON Schema into an OpenAPI 3.1 doc (auto-generated) and use **swift-openapi-generator** or **openapi-generator swift5**. More moving parts, but also yields enums with discriminator support. Keep this as a fallback if we don't want a custom emitter. 3) Commit the regenerated schema + Swift models.
## Action items
- Implement `protocol:gen:swift` that reads the TypeBox schemas and emits the sealed Swift enum + payload structs.
- Update `protocol:check` to include the Swift generator output in the diff check.
- Remove quicktype output once the custom generator is in place (or keep it for docs only).

View File

@@ -8,11 +8,11 @@ read_when: "Changing onboarding wizard steps or config schema endpoints"
Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI. Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI.
## Components ## Components
- Wizard engine: `src/wizard` (session + prompts + onboarding state). - Wizard engine (shared session + prompts + onboarding state).
- CLI: [`src/commands/onboard-*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/onboard-*.ts) uses the wizard with the CLI prompter. - CLI onboarding uses the same wizard flow as the UI clients.
- Gateway RPC: wizard + config schema endpoints serve UI clients. - Gateway RPC exposes wizard + config schema endpoints.
- macOS: SwiftUI onboarding uses the wizard step model. - macOS onboarding uses the wizard step model.
- Web UI: config form renders from JSON Schema + hints. - Web UI renders config forms from JSON Schema + UI hints.
## Gateway RPC ## Gateway RPC
- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }` - `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }`

View File

@@ -28,44 +28,29 @@ Recent gateway logs show repeated `cron.add` failures with invalid parameters (m
- Agent cron tool schema allows arbitrary `job` objects, enabling malformed inputs. - Agent cron tool schema allows arbitrary `job` objects, enabling malformed inputs.
- Gateway strictly validates `cron.add` with no normalization, so wrapped payloads fail. - Gateway strictly validates `cron.add` with no normalization, so wrapped payloads fail.
## Proposed Approach ## What changed
1. **Normalize** incoming `cron.add` payloads (unwrap `data`/`job`, infer `schedule.kind` and `payload.kind`, default `wakeMode` + `sessionTarget` when safe).
2. **Harden** the agent cron tool schema using the canonical gateway `CronAddParamsSchema` and normalize before sending to the gateway.
3. **Align** provider enums and cron status fields across gateway schema, TS types, CLI descriptions, and UI form controls.
4. **Test** normalization in gateway tests and tool behavior in agent tests.
## Multi-phase Execution Plan - `cron.add` and `cron.update` now normalize common wrapper shapes and infer missing `kind` fields.
- Agent cron tool schema matches the gateway schema, which reduces invalid payloads.
- Provider enums are aligned across gateway, CLI, UI, and macOS picker.
- Control UI uses the gateway's `jobs` count field for status.
### Phase 1 — Schema + type alignment ## Current behavior
- [x] Expand gateway `CronPayloadSchema` provider enum to include `signal` and `imessage`.
- [x] Update CLI `--provider` descriptions to include `slack` (already supported by gateway).
- [x] Update UI Cron payload/provider union types to include all supported providers.
- [x] Fix UI CronStatus type to match gateway (`jobs` instead of `jobCount`).
- [x] Update cron UI provider select to include Discord/Slack/Signal/iMessage.
- [x] Update macOS CronJobEditor provider picker + enum to include Slack/Signal/iMessage.
- [x] Document cron compatibility normalization policy in [`docs/cron-jobs.md`](/automation/cron-jobs).
### Phase 2 — Input normalization + tooling hardening - **Normalization:** wrapped `data`/`job` payloads are unwrapped; `schedule.kind` and `payload.kind` are inferred when safe.
- [x] Add shared cron input normalization helpers (`normalizeCronJobCreate`/`normalizeCronJobPatch`). - **Defaults:** safe defaults are applied for `wakeMode` and `sessionTarget` when missing.
- [x] Apply normalization in gateway `cron.add` (and patch normalization in `cron.update`). - **Providers:** Discord/Slack/Signal/iMessage are now consistently surfaced across CLI/UI.
- [x] Tighten agent cron tool schema to `CronAddParamsSchema` and normalize job/patch before sending.
### Phase 3 — Tests See [`docs/cron-jobs.md`](/automation/cron-jobs) for the normalized shape and examples.
- [x] Add gateway test covering wrapped `cron.add` payload normalization.
- [x] Add cron tool test to assert normalization and defaulting for `cron.add`.
- [x] Add gateway test covering `cron.update` normalization.
- [x] Add UI + Swift conformance test for cron channels + status fields.
### Phase 4 — Verification ## Verification
- [x] Run tests (full suite executed via `pnpm test -- cron-tool`).
## Rollout/Monitoring
- Watch gateway logs for reduced `cron.add` INVALID_REQUEST errors. - Watch gateway logs for reduced `cron.add` INVALID_REQUEST errors.
- Confirm Control UI cron status shows job count after refresh. - Confirm Control UI cron status shows job count after refresh.
- If errors persist, extend normalization for additional common shapes (e.g., `schedule.at`, `payload.message` without `kind`).
## Optional Follow-ups ## Optional Follow-ups
- Manual Control UI smoke: add cron job per provider + verify status job count.
- Manual Control UI smoke: add a cron job per provider + verify status job count.
## Open Questions ## Open Questions
- Should `cron.add` accept explicit `state` from clients (currently disallowed by schema)? - Should `cron.add` accept explicit `state` from clients (currently disallowed by schema)?

View File

@@ -1,126 +1,38 @@
--- ---
summary: "Spec: groupPolicy hardening for Telegram allowlist parity" summary: "Telegram allowlist hardening: prefix + whitespace normalization"
read_when: read_when:
- Reviewing historical Telegram allowlist normalization changes - Reviewing historical Telegram allowlist changes
--- ---
# Engineering Execution Spec: groupPolicy Hardening (Telegram Allowlist Parity) # Telegram Allowlist Hardening
**Date**: 2026-01-05 **Date**: 2026-01-05
**Status**: Complete **Status**: Complete
**PR**: #216 (feat/whatsapp-group-policy) **PR**: #216
--- ## Summary
## Executive Summary Telegram allowlists now accept `telegram:` and `tg:` prefixes case-insensitively, and tolerate
accidental whitespace. This aligns inbound allowlist checks with outbound send normalization.
Follow-up hardening work ensures Telegram allowlists behave consistently across inbound group/DM filtering and outbound send normalization. The focus is on prefix parity (`telegram:` / `tg:`), case-insensitive matching for prefixes, and resilience to accidental whitespace in config entries. Documentation and tests were updated to reflect and lock in this behavior. ## What changed
--- - Prefixes `telegram:` and `tg:` are treated the same (case-insensitive).
- Allowlist entries are trimmed; empty entries are ignored.
## Findings Analysis ## Examples
### [MED] F1: Telegram Allowlist Prefix Handling Is Case-Sensitive and Excludes `tg:` All of these are accepted for the same ID:
**Location**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts) - `telegram:123456`
- `TG:123456`
- ` tg:123456 `
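
The normalization described here can be sketched in a few lines. This is an illustration under stated assumptions, not the code in the Telegram provider: `normalizeTelegramAllowlist` and `isAllowedSender` are hypothetical names.

```typescript
// Hypothetical sketch: trim entries, strip `telegram:` / `tg:` case-insensitively, drop empties.
function normalizeTelegramAllowlist(entries: Array<string | number>): string[] {
  return entries
    .map((entry) => String(entry).trim().replace(/^(telegram|tg):/i, "").trim())
    .filter((entry) => entry.length > 0);
}

// Compare a sender ID against the normalized allowlist.
function isAllowedSender(senderId: string | number, allowlist: Array<string | number>): boolean {
  return normalizeTelegramAllowlist(allowlist).indexOf(String(senderId)) !== -1;
}
```

Normalizing once when the config is loaded, rather than on every message, keeps the per-message check to a plain string comparison.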
**Problem**: Inbound allowlist normalization only stripped a lowercase `telegram:` prefix. This rejected `TG:123` / `Telegram:123` and did not accept the `tg:` shorthand even though outbound send normalization already accepts `tg:` and case-insensitive prefixes. ## Why it matters
**Impact**: Copy/paste from logs or chat IDs often includes prefixes and whitespace. Normalizing avoids
- DMs and group allowlists fail when users copy/paste prefixed IDs from logs or existing send format. false negatives when deciding whether to respond in DMs or groups.
- Behavior is inconsistent between inbound filtering and outbound send normalization.
**Fix**: Normalize allowlist entries by trimming whitespace and stripping `telegram:` / `tg:` prefixes case-insensitively at pre-compute time. ## Related docs
--- - [Group Chats](/concepts/groups)
- [Telegram Provider](/providers/telegram)
### [LOW] F2: Allowlist Entries Are Not Trimmed
**Location**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts)
**Problem**: Allowlist entries are not trimmed; accidental whitespace causes mismatches.
**Fix**: Trim and drop empty entries while normalizing allowlist inputs.
---
## Implementation Phases
### Phase 1: Normalize Telegram Allowlist Inputs
**File**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts)
**Changes**:
1. Trim allowlist entries and drop empty values.
2. Strip `telegram:` / `tg:` prefixes case-insensitively.
3. Simplify DM allowlist check to rely on normalized values.
---
### Phase 2: Add Coverage for Prefix + Whitespace
**File**: [`src/telegram/bot.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.test.ts)
**Add Tests**:
- DM allowlist accepts `TG:` prefix with surrounding whitespace.
- Group allowlist accepts `TG:` prefix case-insensitively.
---
### Phase 3: Documentation Updates
**Files**:
- [`docs/groups.md`](/concepts/groups)
- [`docs/telegram.md`](/providers/telegram)
**Changes**:
- Document `tg:` alias and case-insensitive prefixes for Telegram allowlists.
---
### Phase 4: Verification
1. Run targeted Telegram tests (`pnpm test -- src/telegram/bot.test.ts`).
2. If time allows, run full suite (`pnpm test`).
---
## Files Modified
| File | Change Type | Description |
|------|-------------|-------------|
| [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts) | Fix | Trim allowlist values; strip `telegram:` / `tg:` prefixes case-insensitively |
| [`src/telegram/bot.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.test.ts) | Test | Add DM + group allowlist coverage for `TG:` prefix + whitespace |
| [`docs/groups.md`](/concepts/groups) | Docs | Mention `tg:` alias + case-insensitive prefixes |
| [`docs/telegram.md`](/providers/telegram) | Docs | Mention `tg:` alias + case-insensitive prefixes |
---
## Success Criteria
- [x] Telegram allowlist accepts `telegram:` / `tg:` prefixes case-insensitively.
- [x] Telegram allowlist tolerates whitespace in config entries.
- [x] DM and group allowlist tests cover prefixed cases.
- [x] Docs updated to reflect allowlist formats.
- [x] Targeted tests pass.
- [x] Full test suite passes.
---
## Risk Assessment
| Risk | Severity | Mitigation |
|------|----------|------------|
| Behavior change for malformed entries | Low | Normalization is additive and trims only whitespace |
| Test fragility | Low | Isolated unit tests; no external dependencies |
| Doc drift | Low | Updated docs alongside code |
---
## Estimated Complexity
- **Phase 1**: Low (normalization helpers)
- **Phase 2**: Low (2 new tests)
- **Phase 3**: Low (doc edits)
- **Phase 4**: Low (verification)
**Total**: ~20 minutes


@@ -1,147 +1,32 @@
--- ---
summary: "Proposal: model config, auth profiles, and fallback behavior" summary: "Exploration: model config, auth profiles, and fallback behavior"
read_when: read_when:
- Designing model selection, auth profiles, or fallback behavior - Exploring future model selection + auth profile ideas
- Migrating model config schema
--- ---
# Model Config (Exploration)
# Model config proposal This document captures **ideas** for future model configuration. It is not a
shipping spec. For current behavior, see:
- [Models](/concepts/models)
- [Model failover](/concepts/model-failover)
- [OAuth + profiles](/concepts/oauth)
Goals ## Motivation
- Multi OAuth + multi API key per provider
- Model selection via `/model` with sensible fallback
- Global (not per-session) fallback logic
- Keep last-known-good auth profile when switching models
- Profile override only when explicitly requested
- Image routing override only when explicitly configured
Non-goals (v1) Operators want:
- Auto-discovery of provider capabilities beyond catalog input tags - Multiple auth profiles per provider (personal vs work).
- Per-model auth profile order (see open questions) - Simple `/model` selection with predictable fallbacks.
- Clear separation between text models and image-capable models.
## Proposed config shape ## Possible direction (high level)
```json - Keep model selection simple: `provider/model` with optional aliases.
{ - Let providers have multiple auth profiles, with an explicit order.
"auth": { - Use a global fallback list so all sessions fail over consistently.
"profiles": { - Only override image routing when explicitly configured.
"anthropic:default": {
"provider": "anthropic",
"mode": "oauth"
},
"anthropic:work": {
"provider": "anthropic",
"mode": "api_key"
},
"openai:default": {
"provider": "openai",
"mode": "oauth"
}
},
"order": {
"anthropic": ["anthropic:default", "anthropic:work"],
"openai": ["openai:default"]
}
},
"agent": {
"models": {
"anthropic/claude-opus-4-5": {
"alias": "Opus"
},
"openai/gpt-5.2": {
"alias": "gpt52"
}
},
"model": {
"primary": "anthropic/claude-opus-4-5",
"fallbacks": ["openai/gpt-5.2"]
},
"imageModel": {
"primary": "openai/gpt-5.2",
"fallbacks": ["anthropic/claude-opus-4-5"]
}
}
}
```
Notes ## Open questions
- Canonical model keys are full `provider/model`.
- `alias` optional; used by `/model` resolution.
- `auth.profiles` is keyed. Default CLI login creates `provider:default`.
- `auth.order[provider]` controls rotation order for that provider.
## CLI / UX - Should profile rotation be per-provider or per-model?
- How should the UI surface profile selection for a session?
Login - What is the safest migration path from legacy config keys?
- `clawdbot login anthropic` → create/replace `anthropic:default`.
- `clawdbot login anthropic --profile work` → create/replace `anthropic:work`.
Model selection
- `/model Opus` → resolve alias to full id.
- `/model anthropic/claude-opus-4-5` → explicit.
- Optional: `/model Opus@anthropic:work` (explicit profile override for session only).
Model listing
- `/model` list shows:
- model id
- alias
- provider
- auth order (from `auth.order`)
- auth source for the current provider (auth-profiles.json/env/shell env/models.json)
## Fallback behavior (global)
Fallback list
- Use `agent.model.fallbacks` globally.
- No per-session fallback list; last-known-good is per-session but uses global ordering.
Auth profile rotation
- If provider auth error (401/403/invalid refresh):
- advance to next profile in `auth.order[provider]`.
- if all fail, fall back to next model.
Rate limiting
- If rate limit / quota error:
- rotate auth profile first (same provider)
- if still failing, fall back to next model.
Model not found / capability mismatch
- immediate model fallback.
## Image routing
Rule
- Only use `agent.imageModel` when explicitly configured.
- If `agent.imageModel` is configured and the current text model lacks image input, use it.
Support detection
- From model catalog `input` tags when available (e.g. `image` in models.json).
- If unknown: treat as text-only and use `agent.imageModel`.
## Migration (doctor + gateway auto-run)
Inputs
- Legacy keys (pre-migration):
- `agent.model` (string)
- `agent.modelFallbacks` (string[])
- `agent.imageModel` (string)
- `agent.imageModelFallbacks` (string[])
- `agent.allowedModels` (string[])
- `agent.modelAliases` (record)
Outputs
- `agent.models` map with keys for all referenced models
- `agent.model.primary/fallbacks`
- `agent.imageModel.primary/fallbacks`
- Auth profile store seeded from current auth-profiles.json/auth.json + oauth.json + env (as `provider:default`)
- `auth.order` seeded with `["provider:default"]` when config is updated
Auto-run
- Gateway start detects legacy keys and runs doctor migration.
## Decisions
- Auth order is per-provider (`auth.order`).
- Doctor migration is required; gateway will auto-run on startup when legacy keys detected.
- `/model Opus@profile` is explicit session override only.
- Image routing override only when `agent.imageModel` is explicitly configured.
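As a rough illustration of the rotation-then-fallback order listed under "Fallback behavior (global)", the next attempt after a failure could be chosen like this. It is a sketch of the exploratory idea only; `nextAttempt`, `Attempt`, and `ErrorKind` are hypothetical names, not shipping code.

```typescript
// Hypothetical types and names; illustrates the explored ordering only.
type ErrorKind = "auth" | "rate_limit" | "model_not_found";

interface Attempt {
  model: string; // full `provider/model` id
  profile: string; // auth profile id, e.g. `anthropic:default`
}

function providerOf(model: string): string {
  return model.split("/", 1)[0];
}

function nextAttempt(
  current: Attempt,
  error: ErrorKind,
  models: string[], // primary first, then global fallbacks
  authOrder: Record<string, string[]>,
): Attempt | undefined {
  // Auth and rate-limit errors rotate profiles within the same provider first.
  if (error !== "model_not_found") {
    const profiles = authOrder[providerOf(current.model)] ?? [];
    const idx = profiles.indexOf(current.profile);
    if (idx >= 0 && idx + 1 < profiles.length) {
      return { model: current.model, profile: profiles[idx + 1] };
    }
  }
  // Otherwise (or once profiles are exhausted) move to the next model.
  for (let i = models.indexOf(current.model) + 1; i < models.length; i++) {
    const first = (authOrder[providerOf(models[i])] ?? [])[0];
    if (first !== undefined) return { model: models[i], profile: first };
  }
  return undefined; // nothing left to try
}
```

Keeping the model list global means every session walks the same order, which matches the "fail over consistently" goal above.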


@@ -1,12 +1,12 @@
--- ---
summary: "Proposal + research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)" summary: "Research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)"
read_when: read_when:
- Designing workspace memory (~/clawd) beyond daily Markdown logs - Designing workspace memory (~/clawd) beyond daily Markdown logs
- Deciding: standalone CLI vs deep Clawdbot integration - Deciding: standalone CLI vs deep Clawdbot integration
- Adding offline recall + reflection (retain/recall/reflect) - Adding offline recall + reflection (retain/recall/reflect)
--- ---
# Workspace Memory v2 (offline): proposal + research # Workspace Memory v2 (offline): research notes
Target: Clawd-style workspace (`agent.workspace`, default `~/clawd`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`). Target: Clawd-style workspace (`agent.workspace`, default `~/clawd`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).
@@ -171,8 +171,7 @@ Recommendation: **deep integration in Clawdbot**, but keep a separable core libr
- reuse from other contexts (local scripts, future desktop app, etc.) - reuse from other contexts (local scripts, future desktop app, etc.)
Shape: Shape:
- `src/memory/*` (library-ish core; pure functions + sqlite adapter) The memory tooling is intended to be a small CLI + library layer, but this is exploratory only.
- [`src/commands/memory/*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/memory/*.ts) (CLI glue)
## “S-Collide” / SuCo: when to use it (research) ## “S-Collide” / SuCo: when to use it (research)
@@ -196,29 +195,13 @@ Open question:
- what’s the **best** offline embedding model for “personal assistant memory” on your machines (MacBook + Castle)? - what’s the **best** offline embedding model for “personal assistant memory” on your machines (MacBook + Castle)?
- if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain. - if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.
## Implementation plan (phased, shippable) ## Smallest useful pilot
### Phase 0: workspace conventions (no code) If you want a minimal, still-useful version:
- add `bank/` files + entity pages
- add `## Retain` convention to daily logs
### Phase 1: `clawdbot memory index|recall` (FTS-only) - Add `bank/` entity pages and a `## Retain` section in daily logs.
- parse Markdown (`memory/*.md`, `bank/*.md`) into chunks - Use SQLite FTS for recall with citations (path + line numbers).
- write to SQLite: `facts`, `entities`, `fact_entities`, `opinions` - Add embeddings only if recall quality or scale demands it.
- FTS5 table over `facts.content`
- `recall` returns citations (path + line) + trimmed content budget
### Phase 2: entity summaries + opinion tracking
- `reflect` updates `bank/entities/*.md`
- opinion confidence updates with evidence pointers (no embeddings required yet)
### Phase 3: semantic recall (offline embeddings)
- compute embeddings during indexing (incremental)
- retrieval = `hybrid(FTS, vector)` with simple fusion
### Phase 4: “graph-ish” traversal (still simple)
- entity links enable multi-hop: “related to Peter via warelay”
- optional: “topic” nodes, lightweight edges (not a full KG)
## References ## References


@@ -6,24 +6,29 @@ read_when:
--- ---
# Bonjour / mDNS discovery # Bonjour / mDNS discovery
Clawdbot uses Bonjour (mDNS / DNS-SD) as a **LAN-only convenience** to discover a running Gateway bridge transport. It is best-effort and does **not** replace SSH or Tailnet-based connectivity. Clawdbot uses Bonjour (mDNS / DNS-SD) as a **LAN-only convenience** to discover
an active Gateway bridge. It is best-effort and does **not** replace SSH or
Tailnet-based connectivity.
## Wide-Area Bonjour (Unicast DNS-SD) over Tailscale ## Wide-area Bonjour (Unicast DNS-SD) over Tailscale
If you want iOS node auto-discovery while the Gateway is on another network (e.g. Vienna ⇄ London), you can keep the `NWBrowser` UX but switch discovery from multicast mDNS (`local.`) to **unicast DNS-SD** (“Wide-Area Bonjour”) over Tailscale. If the node and gateway are on different networks, multicast mDNS won’t cross the
boundary. You can keep the same discovery UX by switching to **unicast DNS-SD**
("Wide-Area Bonjour") over Tailscale.
High level: High-level steps:
1) Run a DNS server on the gateway host (reachable via tailnet IP). 1) Run a DNS server on the gateway host (reachable over Tailnet).
2) Publish DNS-SD records for `_clawdbot-bridge._tcp` in a dedicated zone (example: `clawdbot.internal.`). 2) Publish DNS-SD records for `_clawdbot-bridge._tcp` under a dedicated zone
3) Configure Tailscale **split DNS** so `clawdbot.internal` resolves via that DNS server for clients (including iOS). (example: `clawdbot.internal.`).
3) Configure Tailscale **split DNS** so `clawdbot.internal` resolves via that
DNS server for clients (including iOS).
Clawdbot standardizes on the discovery domain `clawdbot.internal.` for this mode. iOS/Android nodes browse both `local.` and `clawdbot.internal.` automatically (no per-device knob). Clawdbot standardizes on `clawdbot.internal.` for this mode. iOS/Android nodes
browse both `local.` and `clawdbot.internal.` automatically.
### Gateway config (recommended) ### Gateway config (recommended)
On the gateway host (the machine running the Gateway bridge), add to `~/.clawdbot/clawdbot.json` (JSON5):
```json5 ```json5
{ {
bridge: { bind: "tailnet" }, // tailnet-only (recommended) bridge: { bind: "tailnet" }, // tailnet-only (recommended)
@@ -31,21 +36,17 @@ On the gateway host (the machine running the Gateway bridge), add to `~/.clawdbo
} }
``` ```
### One-time DNS server setup (gateway host) ### One-time DNS server setup (gateway host)
On the gateway host (macOS), run:
```bash ```bash
clawdbot dns setup --apply clawdbot dns setup --apply
``` ```
This installs CoreDNS and configures it to: This installs CoreDNS and configures it to:
- listen on port 53 **only** on the gateway’s Tailscale interface IPs - listen on port 53 only on the gateway’s Tailscale interfaces
- serve the zone `clawdbot.internal.` from the gateway-owned zone file `~/.clawdbot/dns/clawdbot.internal.db` - serve `clawdbot.internal.` from `~/.clawdbot/dns/clawdbot.internal.db`
The Gateway writes/updates that zone file when `discovery.wideArea.enabled` is true. Validate from a tailnet-connected machine:
Validate from any tailnet-connected machine:
```bash ```bash
dns-sd -B _clawdbot-bridge._tcp clawdbot.internal. dns-sd -B _clawdbot-bridge._tcp clawdbot.internal.
@@ -59,99 +60,102 @@ In the Tailscale admin console:
- Add a nameserver pointing at the gateway’s tailnet IP (UDP/TCP 53). - Add a nameserver pointing at the gateway’s tailnet IP (UDP/TCP 53).
- Add split DNS so the domain `clawdbot.internal` uses that nameserver. - Add split DNS so the domain `clawdbot.internal` uses that nameserver.
Once clients accept tailnet DNS, iOS nodes can browse `_clawdbot-bridge._tcp` in `clawdbot.internal.` without multicast. Once clients accept tailnet DNS, iOS nodes can browse
Wide-area beacons also include `tailnetDns` (when available) so the macOS app can auto-fill SSH targets off-LAN. `_clawdbot-bridge._tcp` in `clawdbot.internal.` without multicast.
### Bridge listener security (recommended) ### Bridge listener security (recommended)
The bridge port (default `18790`) is a plain TCP service. By default it binds to `0.0.0.0`, which makes it reachable from *any* interface on the gateway machine (LAN/Wi-Fi/Tailscale). The bridge port (default `18790`) is a plain TCP service. By default it binds to
`0.0.0.0`, which makes it reachable from any interface on the gateway host.
For a tailnet-only setup, bind it to the Tailscale IP instead:
For tailnet-only setups:
- Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json`. - Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json`.
- Restart the Gateway (or restart the macOS menubar app via [`./scripts/restart-mac.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/restart-mac.sh) on that machine). - Restart the Gateway (or restart the macOS menubar app).
This keeps the bridge reachable only from devices on your tailnet (while still listening on loopback for local/SSH port-forwards).
## What advertises ## What advertises
Only the **Node Gateway** (`clawd` / `clawdbot gateway`) advertises Bonjour beacons. Only the Gateway (when the **bridge is enabled**) advertises `_clawdbot-bridge._tcp`.
- Implementation: [`src/infra/bonjour.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/bonjour.ts)
- Gateway wiring: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts)
## Service types ## Service types
- `_clawdbot-bridge._tcp` — bridge transport beacon (used by macOS/iOS/Android nodes). - `_clawdbot-bridge._tcp` — bridge transport beacon (used by macOS/iOS/Android nodes).
## TXT keys (non-secret hints) ## TXT keys (non-secret hints)
The Gateway advertises small non-secret hints to make UI flows convenient: The Gateway advertises small non-secret hints to make UI flows convenient:
- `role=gateway` - `role=gateway`
- `displayName=<friendly name>`
- `lanHost=<hostname>.local` - `lanHost=<hostname>.local`
- `sshPort=<port>` (defaults to 22 when not overridden) - `gatewayPort=<port>` (informational; Gateway WS is usually loopback-only)
- `gatewayPort=<port>` (informational; the Gateway WS is typically loopback-only)
- `bridgePort=<port>` (only when bridge is enabled) - `bridgePort=<port>` (only when bridge is enabled)
- `canvasPort=<port>` (only when the canvas host is enabled + reachable; default `18793`; serves `/__clawdbot__/canvas/`) - `canvasPort=<port>` (only when the canvas host is enabled; default `18793`)
- `cliPath=<path>` (optional; absolute path to a runnable `clawdbot` entrypoint or binary) - `sshPort=<port>` (defaults to 22 when not overridden)
- `tailnetDns=<magicdns>` (optional hint; auto-detected from Tailscale when available; may be absent) - `transport=bridge`
- `cliPath=<path>` (optional; absolute path to a runnable `clawdbot` entrypoint)
- `tailnetDns=<magicdns>` (optional hint when Tailnet is available)
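A client that receives these TXT entries could turn them into a lookup table along these lines. This is a minimal sketch; `parseTxtHints` and `bridgeEndpoint` are hypothetical names, and the input is assumed to be an array of already-decoded `key=value` strings.

```typescript
// Hypothetical helper names; assumes TXT entries arrive as decoded `key=value` strings.
function parseTxtHints(entries: string[]): Record<string, string> {
  const hints: Record<string, string> = {};
  for (const entry of entries) {
    const eq = entry.indexOf("=");
    if (eq <= 0) continue; // this sketch skips entries with no key or no "="
    hints[entry.slice(0, eq)] = entry.slice(eq + 1);
  }
  return hints;
}

// Derive a LAN bridge endpoint from the hints, if both pieces are present.
function bridgeEndpoint(
  hints: Record<string, string>,
): { host: string; port: number } | undefined {
  const port = Number(hints["bridgePort"]);
  const host = hints["lanHost"];
  if (!host || !Number.isInteger(port) || port <= 0) return undefined;
  return { host, port };
}
```

Because the keys are hints rather than secrets or guarantees, the sketch returns `undefined` instead of guessing when `bridgePort` or `lanHost` is absent.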
## Debugging on macOS ## Debugging on macOS
Useful built-in tools: Useful built-in tools:
- Browse instances: - Browse instances:
- `dns-sd -B _clawdbot-bridge._tcp local.` ```bash
dns-sd -B _clawdbot-bridge._tcp local.
```
- Resolve one instance (replace `<instance>`): - Resolve one instance (replace `<instance>`):
- `dns-sd -L "<instance>" _clawdbot-bridge._tcp local.` ```bash
dns-sd -L "<instance>" _clawdbot-bridge._tcp local.
```
If browsing shows instances but resolving fails, you’re usually hitting a LAN policy / multicast issue. If browsing works but resolving fails, you’re usually hitting a LAN policy or
mDNS resolver issue.
## Debugging in Gateway logs ## Debugging in Gateway logs
The Gateway writes a rolling log file (printed on startup as `gateway log file: ...`). The Gateway writes a rolling log file (printed on startup as
`gateway log file: ...`). Look for `bonjour:` lines, especially:
Look for `bonjour:` lines, especially: - `bonjour: advertise failed ...`
- `bonjour: advertise failed ...` (probing/announce failure)
- `bonjour: ... name conflict resolved` / `hostname conflict resolved` - `bonjour: ... name conflict resolved` / `hostname conflict resolved`
- `bonjour: watchdog detected non-announced service; attempting re-advertise ...` (self-heal attempt after sleep/interface churn) - `bonjour: watchdog detected non-announced service ...`
## Debugging on iOS node ## Debugging on iOS node
The iOS node app discovers bridges via `NWBrowser` browsing `_clawdbot-bridge._tcp`. The iOS node uses `NWBrowser` to discover `_clawdbot-bridge._tcp`.
To capture what the browser is doing: To capture logs:
- Settings → Bridge → Advanced → **Discovery Debug Logs**
- Settings → Bridge → Advanced → **Discovery Logs** → reproduce → **Copy**
- Settings → Bridge → Advanced → enable **Discovery Debug Logs** The log includes browser state transitions and result-set changes.
- Settings → Bridge → Advanced → open **Discovery Logs** → reproduce the “Searching…” / “No bridges found” case → **Copy**
The log includes browser state transitions (`ready`, `waiting`, `failed`, `cancelled`) and result-set changes (added/removed counts).
## Common failure modes ## Common failure modes
- **Bonjour doesn’t cross networks**: London/Vienna style setups require Tailnet (MagicDNS/IP) or SSH. - **Bonjour doesn’t cross networks**: use Tailnet or SSH.
- **Multicast blocked**: some Wi-Fi networks (enterprise/hotels) disable mDNS; expect “no results”. - **Multicast blocked**: some Wi-Fi networks disable mDNS.
- **Sleep / interface churn**: macOS may temporarily drop mDNS results when switching networks; retry. - **Sleep / interface churn**: macOS may temporarily drop mDNS results; retry.
- **Browse works but resolve fails (iOS “NoSuchRecord”)**: make sure the advertiser publishes a valid SRV target hostname. - **Browse works but resolve fails**: keep machine names simple (avoid emojis or
- Implementation detail: `@homebridge/ciao` defaults `hostname` to the *service instance name* when `hostname` is omitted. If your instance name contains spaces/parentheses, some resolvers can fail to resolve the implied A/AAAA record. punctuation), then restart the Gateway. The bridge instance name derives from
- Fix: set an explicit DNS-safe `hostname` (single label; no `.local`) in [`src/infra/bonjour.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/bonjour.ts). the host name, so overly complex names can confuse some resolvers.
## Escaped instance names (`\\032`) ## Escaped instance names (`\032`)
Bonjour/DNS-SD often escapes bytes in service instance names as decimal `\\DDD` sequences (e.g. spaces become `\\032`).
Bonjour/DNSSD often escapes bytes in service instance names as decimal `\DDD`
sequences (e.g. spaces become `\032`).
- This is normal at the protocol level. - This is normal at the protocol level.
- UIs should decode for display (iOS uses `BonjourEscapes.decode` in `apps/shared/ClawdbotKit`). - UIs should decode for display (iOS uses `BonjourEscapes.decode`).
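For illustration, decoding those escapes for display might look like the sketch below. It is not the `BonjourEscapes.decode` implementation; `decodeBonjourEscapes` is a hypothetical name, and the sketch assumes `\DDD` encodes one byte of a UTF-8 name while `\X` stands for the literal character `X`.

```typescript
// Hypothetical decoder; assumes \DDD is a decimal byte and the name is UTF-8.
function decodeBonjourEscapes(name: string): string {
  const encoder = new TextEncoder();
  const bytes: number[] = [];
  let i = 0;
  while (i < name.length) {
    if (name[i] === "\\") {
      const digits = name.slice(i + 1, i + 4);
      if (/^\d{3}$/.test(digits) && Number(digits) <= 255) {
        bytes.push(Number(digits)); // \DDD: one raw byte
        i += 4;
        continue;
      }
      if (i + 1 < name.length) {
        // \X: the escaped character stands for itself
        const literal = String.fromCodePoint(name.codePointAt(i + 1)!);
        bytes.push(...encoder.encode(literal));
        i += 1 + literal.length;
        continue;
      }
    }
    const ch = String.fromCodePoint(name.codePointAt(i)!);
    bytes.push(...encoder.encode(ch));
    i += ch.length;
  }
  // Reassemble the bytes so multi-byte UTF-8 sequences decode correctly.
  return new TextDecoder().decode(Uint8Array.from(bytes));
}
```

Collecting bytes first, rather than replacing each escape with a character, keeps multi-byte sequences such as `\195\169` intact.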
## Disabling / configuration ## Disabling / configuration
- `CLAWDBOT_DISABLE_BONJOUR=1` disables advertising. - `CLAWDBOT_DISABLE_BONJOUR=1` disables advertising.
- `CLAWDBOT_BRIDGE_ENABLED=0` disables the bridge listener (and therefore the bridge beacon). - `CLAWDBOT_BRIDGE_ENABLED=0` disables the bridge listener (and the bridge beacon).
- `bridge.bind` / `bridge.port` in `~/.clawdbot/clawdbot.json` control bridge bind/port (preferred). - `bridge.bind` / `bridge.port` in `~/.clawdbot/clawdbot.json` control bridge bind/port.
- `CLAWDBOT_BRIDGE_HOST` / `CLAWDBOT_BRIDGE_PORT` still work as a back-compat override when `bridge.bind` / `bridge.port` are not set. - `CLAWDBOT_BRIDGE_HOST` / `CLAWDBOT_BRIDGE_PORT` still work as back-compat overrides.
- `CLAWDBOT_SSH_PORT` overrides the SSH port advertised in `_clawdbot-bridge._tcp`. - `CLAWDBOT_SSH_PORT` overrides the SSH port advertised in TXT.
- `CLAWDBOT_TAILNET_DNS` publishes a `tailnetDns` hint (MagicDNS) in `_clawdbot-bridge._tcp`. If unset, the gateway auto-detects Tailscale and publishes the MagicDNS name when possible. - `CLAWDBOT_TAILNET_DNS` publishes a MagicDNS hint in TXT.
- `CLAWDBOT_CLI_PATH` overrides the advertised CLI path.
## Related docs ## Related docs


@@ -1383,8 +1383,8 @@ Notes:
- `z.ai/*` and `z-ai/*` are accepted aliases and normalize to `zai/*`. - `z.ai/*` and `z-ai/*` are accepted aliases and normalize to `zai/*`.
- If `ZAI_API_KEY` is missing, requests to `zai/*` will fail with an auth error at runtime. - If `ZAI_API_KEY` is missing, requests to `zai/*` will fail with an auth error at runtime.
- Example error: `No API key found for provider "zai".` - Example error: `No API key found for provider "zai".`
- Z.AI’s general API endpoint is `https://api.z.ai/api/paas/v4`. The GLM Coding - Z.AI’s general API endpoint is `https://api.z.ai/api/paas/v4`. GLM coding
Plan uses the dedicated Coding endpoint `https://api.z.ai/api/coding/paas/v4`. requests use the dedicated Coding endpoint `https://api.z.ai/api/coding/paas/v4`.
The built-in `zai` provider uses the Coding endpoint. If you need the general The built-in `zai` provider uses the Coding endpoint. If you need the general
endpoint, define a custom provider in `models.providers` with the base URL endpoint, define a custom provider in `models.providers` with the base URL
override (see the custom providers section above). override (see the custom providers section above).


@@ -44,7 +44,7 @@ Target direction:
Troubleshooting and beacon details: [`docs/bonjour.md`](/gateway/bonjour). Troubleshooting and beacon details: [`docs/bonjour.md`](/gateway/bonjour).
#### Current implementation #### Service beacon details
- Service types: - Service types:
- `_clawdbot-bridge._tcp` (bridge transport beacon) - `_clawdbot-bridge._tcp` (bridge transport beacon)
@@ -98,15 +98,8 @@ The gateway is the source of truth for node/client admission.
- scopes/ACLs (bridge is not a raw proxy to every gateway method) - scopes/ACLs (bridge is not a raw proxy to every gateway method)
- rate limits - rate limits
## Where the code lives (target architecture) ## Responsibilities by component
- Node gateway: - **Gateway**: advertises discovery beacons, owns pairing decisions, runs the bridge listener.
- advertises discovery beacons (Bonjour) - **macOS app**: helps you pick a gateway, shows pairing prompts, and uses SSH only as a fallback.
- owns pairing storage + decisions - **iOS/Android nodes**: browse Bonjour as a convenience and connect via the paired bridge.
- runs the bridge listener (direct transport)
- macOS app:
- UI for picking a gateway, showing pairing prompts, and troubleshooting
- SSH tunneling only for the fallback path
- iOS node:
- browses Bonjour (LAN) as a convenience only
- uses direct transport + pairing to connect to the gateway


@@ -1,47 +1,33 @@
--- ---
summary: "Plan for heartbeat polling messages and notification rules" summary: "Heartbeat polling messages and notification rules"
read_when: read_when:
- Adjusting heartbeat cadence or messaging - Adjusting heartbeat cadence or messaging
--- ---
# Heartbeat (Gateway) # Heartbeat (Gateway)
Heartbeat runs periodic agent turns in the **main session** so the model can Heartbeat runs **periodic agent turns** in the main session so the model can
surface anything that needs attention without spamming the user. surface anything that needs attention without spamming you.
## Defaults ## Defaults
- Interval: `30m` (set `agent.heartbeat.every` to change, `0m` disables).
- Interval: `30m` (set `agent.heartbeat.every`; use `0m` to disable).
- Prompt body (configurable via `agent.heartbeat.prompt`): - Prompt body (configurable via `agent.heartbeat.prompt`):
`Read HEARTBEAT.md if exists. Consider outstanding tasks. Checkup sometimes on your human during (user local) day time.` `Read HEARTBEAT.md if exists. Consider outstanding tasks. Checkup sometimes on your human during (user local) day time.`
- Heartbeat prompt text is sent **verbatim** as the user message. Clawdbot does - The heartbeat prompt is sent **verbatim** as the user message. The system
not append extra body text. The system prompt includes a Heartbeats section prompt includes a Heartbeat section and the run is flagged internally.
and the run is flagged as a heartbeat internally.
## Prompt contract ## Response contract
- If nothing needs attention, the model should reply `HEARTBEAT_OK`.
- During heartbeat runs, Clawdbot treats `HEARTBEAT_OK` as an ack when it appears at
the **start or end** of the reply. Clawdbot strips the token and discards the
reply if the remaining content is **≤ `ackMaxChars`** (default: 30).
- If `HEARTBEAT_OK` is in the **middle** of a reply, it is not treated specially.
- For alerts, do **not** include `HEARTBEAT_OK`; return only the alert text.
## Prompt overrides - If nothing needs attention, reply with **`HEARTBEAT_OK`**.
- Overriding `agent.heartbeat.prompt` **replaces** the default body. Nothing is - During heartbeat runs, Clawdbot treats `HEARTBEAT_OK` as an ack when it appears
merged for you. at the **start or end** of the reply. The token is stripped and the reply is
- If you still want `HEARTBEAT.md` instructions, keep a line like dropped if the remaining content is **≤ `ackMaxChars`** (default: 30).
`Read HEARTBEAT.md if exists` in your custom prompt. - If `HEARTBEAT_OK` appears in the **middle** of a reply, it is not treated
- `HEARTBEAT_OK` handling stays the same; changing the prompt won’t break acks. specially.
- For alerts, **do not** include `HEARTBEAT_OK`; return only the alert text.
### Stray `HEARTBEAT_OK` outside heartbeats Outside heartbeats, stray `HEARTBEAT_OK` at the start/end of a message is stripped
If the model accidentally includes `HEARTBEAT_OK` at the start or end of a and logged; a message that is only `HEARTBEAT_OK` is dropped.
normal (non-heartbeat) reply, Clawdbot strips the token and logs a verbose
message. If the reply is only `HEARTBEAT_OK`, it is dropped.
### Outbound normalization (all providers)
For **all providers** (WhatsApp/Web, Telegram, Slack, Discord, Signal, iMessage),
Clawdbot applies the same filtering to tool summaries, streaming block replies,
and final replies:
- drop payloads that are only `HEARTBEAT_OK` with no media
- strip `HEARTBEAT_OK` at the edges when mixed with other text
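The ack rule for heartbeat runs can be sketched as follows. This is an illustration under stated assumptions, not the Gateway's implementation: `filterHeartbeatReply` and its return shape are hypothetical, and the sketch assumes a reply is dropped when the text left after stripping is at most `ackMaxChars` long.

```typescript
// Hypothetical function and result shape; sketches the edge-token ack rule.
const HEARTBEAT_TOKEN = "HEARTBEAT_OK";

interface HeartbeatReply {
  deliver: boolean; // false means the reply is treated as an ack and dropped
  text: string; // reply text with edge tokens stripped
}

function filterHeartbeatReply(reply: string, ackMaxChars = 30): HeartbeatReply {
  let text = reply.trim();
  let acked = false;
  if (text.startsWith(HEARTBEAT_TOKEN)) {
    text = text.slice(HEARTBEAT_TOKEN.length).trim();
    acked = true;
  }
  if (text.endsWith(HEARTBEAT_TOKEN)) {
    text = text.slice(0, -HEARTBEAT_TOKEN.length).trim();
    acked = true;
  }
  if (!acked) {
    // A token in the middle is not special; only empty output is dropped.
    return { deliver: text.length > 0, text };
  }
  // Short leftovers after an ack are discarded; longer text is still delivered.
  return { deliver: text.length > ackMaxChars, text };
}
```

For example, `HEARTBEAT_OK all quiet` would be dropped under the default limit, while an alert that omits the token is always delivered.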
## Config ## Config
@@ -51,8 +37,8 @@ and final replies:
heartbeat: { heartbeat: {
every: "30m", // default: 30m (0m disables) every: "30m", // default: 30m (0m disables)
model: "anthropic/claude-opus-4-5", model: "anthropic/claude-opus-4-5",
target: "last", // last | whatsapp | telegram | discord | slack | signal | imessage | none target: "last", // last | whatsapp | telegram | discord | slack | signal | imessage | none
to: "+15551234567", // optional provider-specific override (e.g. E.164 or chat id) to: "+15551234567", // optional provider-specific override
prompt: "Read HEARTBEAT.md if exists. Consider outstanding tasks. Checkup sometimes on your human during (user local) day time.", prompt: "Read HEARTBEAT.md if exists. Consider outstanding tasks. Checkup sometimes on your human during (user local) day time.",
ackMaxChars: 30 // max chars allowed after HEARTBEAT_OK ackMaxChars: 30 // max chars allowed after HEARTBEAT_OK
} }
@@ -60,47 +46,45 @@ and final replies:
} }
``` ```
### Fields ### Field notes
- `every`: heartbeat interval (duration string; default unit minutes). Default:
`30m`. Set to `0m` to disable.
- `model`: optional model override for heartbeat runs (`provider/model`).
- `target`: where heartbeat output is delivered.
- `last` (default): send to the last used external provider.
- `whatsapp` / `telegram` / `discord` / `slack` / `signal` / `imessage`: force the provider (optionally set `to`).
- `none`: do not deliver externally; output stays in the session (WebChat-visible).
- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram).
- `prompt`: optional override for the heartbeat body (default shown above). Safe to
change; heartbeat acks are still keyed off `HEARTBEAT_OK`.
- `ackMaxChars`: max chars allowed after `HEARTBEAT_OK` before delivery (default: 30).
## Cost awareness - `every`: heartbeat interval (duration string; default unit = minutes).
Heartbeats run full agent turns. Shorter intervals burn more tokens. Be - `model`: optional model override for heartbeat runs (`provider/model`).
intentional about `every`, keep `HEARTBEAT.md` tiny, and consider a cheaper - `target`:
`model` or `target: "none"` if you only want internal state updates. - `last` (default): deliver to the last used external provider.
- explicit provider: `whatsapp` / `telegram` / `discord` / `slack` / `signal` / `imessage`.
- `none`: run the heartbeat but **do not deliver** externally.
- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram, etc.).
- `prompt`: overrides the default prompt body (not merged).
- `ackMaxChars`: max chars allowed after `HEARTBEAT_OK` before delivery.
## Delivery behavior
- Heartbeats run in the **main session** (`main`, or `global` when scope is global).
- If the main queue is busy, the heartbeat is skipped and retried later.
- If `target` resolves to no external destination, the run still happens but no
outbound message is sent.
- Heartbeat-only replies do **not** keep the session alive; the last `updatedAt`
is restored so idle expiry behaves normally.
## HEARTBEAT.md (optional) ## HEARTBEAT.md (optional)
If a `HEARTBEAT.md` file exists in the workspace, the default prompt tells the If a `HEARTBEAT.md` file exists in the workspace, the default prompt tells the
agent to read it. Keep it tiny (short checklist or reminders) to avoid prompt agent to read it. Keep it tiny (short checklist or reminders) to avoid prompt
bloat. bloat.
## Behavior ## Manual wake (on-demand)
- Runs in the main session (`main`, or `global` when scope is global).
- Uses the main lane queue; if requests are in flight, the wake is retried.
- Empty output or `HEARTBEAT_OK` is treated as “ok” and does **not** keep the
session alive (`updatedAt` is restored).
- If `target` resolves to no external destination (no last route or `none`), the
heartbeat still runs but no outbound message is sent.
## Ideas for use You can enqueue a system event and trigger an immediate heartbeat with:
- Check up on the user (light, respectful pings during daytime).
- Handle mundane tasks (triage inboxes, summarize queues, refresh notes).
- Nudge on open loops or reminders.
- Background monitoring (health checks, status polling, low-priority alerts).
- Scheduled routines (use [Cron jobs](/automation/cron-jobs) when you
need exact schedules or isolated runs).
## Wake hook ```bash
- The gateway exposes a heartbeat wake hook so cron/jobs/webhooks can request an clawdbot wake --text "Check for urgent follow-ups" --mode now
immediate run (`requestHeartbeatNow`). ```
- `wake` endpoints should enqueue system events and optionally trigger a wake; the
heartbeat runner picks those up on the next tick or immediately. Use `--mode next-heartbeat` to wait for the next scheduled tick.
## Cost awareness
Heartbeats run full agent turns. Shorter intervals burn more tokens. Keep
`HEARTBEAT.md` small and consider a cheaper `model` or `target: "none"` if you
only want internal state updates.


@@ -127,7 +127,9 @@ See also: [`docs/presence.md`](/concepts/presence) for how presence is produced/
## Typing and validation ## Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions. - Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo's generator). - Clients (TS/Swift) consume generated types (TS directly; Swift via the repo's generator).
- Types live in [`src/gateway/protocol/*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/*.ts); regenerate schemas/models with `pnpm protocol:gen` (writes [`dist/protocol.schema.json`](https://github.com/clawdbot/clawdbot/blob/main/dist/protocol.schema.json)) and `pnpm protocol:gen:swift` (writes [`apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift)). - Protocol definitions are the source of truth; regenerate schema/models with:
- `pnpm protocol:gen`
- `pnpm protocol:gen:swift`
## Connection snapshot ## Connection snapshot
- `hello-ok` includes a `snapshot` with `presence`, `health`, `stateVersion`, and `uptimeMs` plus `policy {maxPayload,maxBufferedBytes,tickIntervalMs}` so clients can render immediately without extra requests. - `hello-ok` includes a `snapshot` with `presence`, `health`, `stateVersion`, and `uptimeMs` plus `policy {maxPayload,maxBufferedBytes,tickIntervalMs}` so clients can render immediately without extra requests.


@@ -10,12 +10,10 @@ read_when:
Clawdbot has two log “surfaces”: Clawdbot has two log “surfaces”:
- **Console output** (what you see in the terminal / Debug UI). - **Console output** (what you see in the terminal / Debug UI).
- **File logs** (JSON lines) written by the internal logger. - **File logs** (JSON lines) written by the gateway logger.
## File-based logger ## File-based logger
Clawdbot uses a file logger backed by `tslog` ([`src/logging.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/logging.ts)).
- Default rolling log file is under `/tmp/clawdbot/` (one file per day): `clawdbot-YYYY-MM-DD.log` - Default rolling log file is under `/tmp/clawdbot/` (one file per day): `clawdbot-YYYY-MM-DD.log`
- The log file path and level can be configured via `~/.clawdbot/clawdbot.json`: - The log file path and level can be configured via `~/.clawdbot/clawdbot.json`:
- `logging.file` - `logging.file`
@@ -40,9 +38,8 @@ clawdbot logs --follow
## Console capture ## Console capture
The CLI entrypoint enables console capture ([`src/index.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/index.ts) calls `enableConsoleCapture()`). The CLI captures `console.log/info/warn/error/debug/trace` and writes them to file logs,
That means every `console.log/info/warn/error/debug/trace` is also written into the file logs, while still printing to stdout/stderr.
while still behaving normally on stdout/stderr.
You can tune console verbosity independently via: You can tune console verbosity independently via:
@@ -94,13 +91,8 @@ clawdbot gateway --verbose --ws-log full
## Console formatting (subsystem logging) ## Console formatting (subsystem logging)
Clawdbot formats console logs via a small wrapper on top of the existing stack:
- **tslog** for structured file logs ([`src/logging.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/logging.ts))
- **chalk** for colors ([`src/globals.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/globals.ts))
The console formatter is **TTY-aware** and prints consistent, prefixed lines. The console formatter is **TTY-aware** and prints consistent, prefixed lines.
Subsystem loggers are created via `createSubsystemLogger("gateway")`. Subsystem loggers keep output grouped and scannable.
Behavior: Behavior:


@@ -7,103 +7,83 @@ read_when:
--- ---
# Gateway-owned pairing (Option B) # Gateway-owned pairing (Option B)
Goal: The Gateway (`clawd`) is the **source of truth** for which nodes are allowed to join the network. In Gateway-owned pairing, the **Gateway** is the source of truth for which nodes
are allowed to join. UIs (macOS app, future clients) are just frontends that
This enables: approve or reject pending requests.
- Headless approval via terminal/CLI (no Swift UI required).
- Optional macOS UI approval (Swift app is just a frontend).
- One consistent membership store for iOS, mac nodes, future hardware nodes.
## Concepts ## Concepts
- **Pending request**: a node asked to join; requires explicit approve/reject.
- **Paired node**: node is allowed; gateway returns an auth token for subsequent connects. - **Pending request**: a node asked to join; requires approval.
- **Bridge**: direct transport endpoint owned by the gateway. The bridge does not decide membership. - **Paired node**: approved node with an issued auth token.
- **Bridge**: transport endpoint only; it forwards requests but does not decide
membership.
## How pairing works
1. A node connects to the bridge and requests pairing.
2. The Gateway stores a **pending request** and emits `node.pair.requested`.
3. You approve or reject the request (CLI or UI).
4. On approval, the Gateway issues a **new token** (tokens are rotated when a node re-pairs).
5. The node reconnects using the token and is now “paired”.
Pending requests expire automatically after **5 minutes**.
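
The lifecycle above (pending request, approval, fresh token, expiry) can be modeled as a small in-memory state machine. The sketch below is hypothetical: the type and function names are invented for illustration and the real Gateway store may differ. Only the 5-minute expiry, one pending request per node, and token rotation on approval are taken from this page.

```typescript
// Illustrative in-memory model of the pairing flow. Type and function names
// are hypothetical; the 5-minute TTL and token rotation come from the docs.
const PENDING_TTL_MS = 5 * 60 * 1000;

interface PendingRequest {
  requestId: string;
  nodeId: string;
  ts: number; // ms since epoch
}

interface PairedNode {
  nodeId: string;
  token: string;
}

interface PairingState {
  pending: PendingRequest[];
  paired: PairedNode[];
  nextId: number;
}

function createPairingState(): PairingState {
  return { pending: [], paired: [], nextId: 1 };
}

// Drop pending requests that have reached the TTL.
function dropExpired(state: PairingState, now: number): void {
  state.pending = state.pending.filter((p) => now - p.ts < PENDING_TTL_MS);
}

// Create a pending request, or reuse the one already waiting for this node.
function requestPairing(
  state: PairingState,
  nodeId: string,
  now: number,
): { created: boolean; request: PendingRequest } {
  dropExpired(state, now);
  const existing = state.pending.filter((p) => p.nodeId === nodeId)[0];
  if (existing) return { created: false, request: existing };
  const request: PendingRequest = { requestId: "req-" + state.nextId, nodeId: nodeId, ts: now };
  state.nextId += 1;
  state.pending.push(request);
  return { created: true, request: request };
}

// Approve a pending request and issue a fresh token, replacing any old one.
// `newToken` is injected to keep the sketch deterministic; a real token must
// be an unguessable secret.
function approvePairing(
  state: PairingState,
  requestId: string,
  newToken: () => string,
  now: number,
): PairedNode | undefined {
  dropExpired(state, now);
  const request = state.pending.filter((p) => p.requestId === requestId)[0];
  if (!request) return undefined; // unknown or expired
  state.pending = state.pending.filter((p) => p.requestId !== requestId);
  const node: PairedNode = { nodeId: request.nodeId, token: newToken() };
  state.paired = state.paired.filter((n) => n.nodeId !== node.nodeId).concat([node]);
  return node;
}

// Check a { nodeId, token } pair against the paired list.
function verifyNode(state: PairingState, nodeId: string, token: string): boolean {
  return state.paired.some((n) => n.nodeId === nodeId && n.token === token);
}
```

In this model a previously issued token stops verifying as soon as the node is approved again, which is one way to read "tokens are rotated".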
## CLI workflow (headless friendly)
```bash
clawdbot nodes pending
clawdbot nodes approve <requestId>
clawdbot nodes reject <requestId>
clawdbot nodes status
clawdbot nodes rename --node <id|name|ip> --name "Living Room iPad"
```
`nodes status` shows paired/connected nodes and their capabilities.
## API surface (gateway protocol) ## API surface (gateway protocol)
These are conceptual method names; wire them into [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) and regenerate Swift types.
### Events Events:
- `node.pair.requested` - `node.pair.requested` — emitted when a new pending request is created.
- Emitted whenever a new pending pairing request is created. - `node.pair.resolved` — emitted when a request is approved/rejected/expired.
- Payload:
- `requestId` (string)
- `nodeId` (string)
- `displayName?` (string)
- `platform?` (string)
- `version?` (string)
- `remoteIp?` (string)
- `silent?` (boolean) — hint that the UI may attempt auto-approval
- `ts` (ms since epoch)
- `node.pair.resolved`
- Emitted when a pending request is approved/rejected.
- Payload:
- `requestId` (string)
- `nodeId` (string)
- `decision` ("approved" | "rejected" | "expired")
- `ts` (ms since epoch)
### Methods Methods:
- `node.pair.request` - `node.pair.request` — create or reuse a pending request.
- Creates (or returns) a pending request. - `node.pair.list` — list pending + paired nodes.
- Params: node metadata (same shape as `node.pair.requested` payload, minus `requestId`/`ts`). - `node.pair.approve` — approve a pending request (issues token).
- Optional `silent` flag hints that the UI can attempt an SSH auto-approve before showing an alert. - `node.pair.reject` — reject a pending request.
- Result: - `node.pair.verify` — verify `{ nodeId, token }`.
- `status` ("pending")
- `created` (boolean) — whether this call created the pending request
- `request` (pending request object), including `isRepair` when the node was already paired
- Security: **never returns an existing token**. If a paired node “lost” its token, it must be approved again (token rotation).
- `node.pair.list`
- Returns:
- `pending[]` (pending requests)
- `paired[]` (paired node records)
- `node.pair.approve`
- Params: `{ requestId }`
- Result: `{ requestId, node: { nodeId, token, ... } }`
- Must be idempotent (first decision wins).
- `node.pair.reject`
- Params: `{ requestId }`
- Result: `{ requestId, nodeId }`
- `node.pair.verify`
- Params: `{ nodeId, token }`
- Result: `{ ok: boolean, node?: { nodeId, ... } }`
## CLI flows
CLI must be able to fully operate without any GUI:
- `clawdbot nodes pending`
- `clawdbot nodes approve <requestId>`
- `clawdbot nodes reject <requestId>`
- `clawdbot nodes status` (paired nodes + connection status/capabilities)
Optional interactive helper:
- `clawdbot nodes watch` (subscribe to `node.pair.requested` and prompt in-place)
Implementation pointers:
- CLI commands: [`src/cli/nodes-cli.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/nodes-cli.ts)
- Gateway handlers + events: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) + [`src/gateway/server-methods/nodes.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/nodes.ts)
- Pairing store: [`src/infra/node-pairing.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/node-pairing.ts) (under `~/.clawdbot/nodes/`)
- Optional macOS UI prompt (frontend only): [`apps/macos/Sources/Clawdbot/NodePairingApprovalPrompter.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/NodePairingApprovalPrompter.swift)
- Push-first: listens to `node.pair.requested`/`node.pair.resolved`, does a `node.pair.list` on startup/reconnect,
and only runs a slow safety poll while a request is pending/visible.
## Storage (private, local)
Gateway stores the authoritative state under `~/.clawdbot/`:
- `~/.clawdbot/nodes/paired.json`
- `~/.clawdbot/nodes/pending.json` (or `~/.clawdbot/nodes/pending/*.json`)
Notes: Notes:
- Tokens are secrets. Treat `paired.json` as sensitive. - `node.pair.request` is idempotent per node: repeated calls return the same
- Pending entries should have a TTL (e.g. 5 minutes) and expire automatically. pending request.
- Approval **always** generates a fresh token; no token is ever returned from
`node.pair.request`.
- Requests may include `silent: true` as a hint for auto-approval flows.
## Bridge integration ## Auto-approval (macOS app)
Target direction:
- The gateway runs the bridge listener (LAN/tailnet-facing) and advertises discovery beacons (Bonjour).
- The bridge is transport only; it forwards/scopes requests and enforces ACLs, but pairing decisions are made by the gateway.
The macOS UI (Swift) can: The macOS app can optionally attempt a **silent approval** when:
- Subscribe to `node.pair.requested`, show an alert (including `remoteIp`), and call `node.pair.approve` or `node.pair.reject`. - the request is marked `silent`, and
- Or ignore/dismiss (“Later”) and let CLI handle it. - the app can verify an SSH connection to the gateway host using the same user.
- When `silent` is set, it can try a short SSH probe (same user) and auto-approve if reachable; otherwise fall back to the normal alert.
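
The decision described above fits in one small function. This is a sketch under assumptions: `chooseApprovalPath` and `probeSsh` are hypothetical names, and `probeSsh` stands in for whatever reachability check the macOS app actually performs.

```typescript
type ApprovalPath = "auto-approve" | "prompt";

// `probeSsh` is a placeholder for the app's SSH reachability check; it is
// not a real Clawdbot function.
function chooseApprovalPath(
  request: { silent?: boolean },
  probeSsh: () => boolean,
): ApprovalPath {
  // Without the `silent` hint the user is always asked.
  if (request.silent !== true) return "prompt";
  try {
    return probeSsh() ? "auto-approve" : "prompt";
  } catch (_err) {
    // A failing probe falls back to the normal alert instead of blocking.
    return "prompt";
  }
}
```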
## Implementation note If silent approval fails, it falls back to the normal “Approve/Reject” prompt.
If the bridge is only provided by the macOS app, then “no Swift app running” cannot work end-to-end.
The long-term goal is to move bridge hosting + Bonjour advertising into the Node gateway so headless pairing works by default. ## Storage (local, private)
Pairing state is stored under the Gateway state directory (default `~/.clawdbot`):
- `~/.clawdbot/nodes/paired.json`
- `~/.clawdbot/nodes/pending.json`
If you override `CLAWDBOT_STATE_DIR`, the `nodes/` folder moves with it.
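
As a rough illustration of how the override moves the files, the helper below computes both paths from an environment map and a home directory. `resolvePairingPaths` is a hypothetical name, and treating an empty `CLAWDBOT_STATE_DIR` as unset is an assumption, not documented behavior.

```typescript
// Hypothetical helper: resolves where pairing state would live.
// Treating an empty CLAWDBOT_STATE_DIR as "unset" is an assumption.
function resolvePairingPaths(
  env: Record<string, string | undefined>,
  homeDir: string,
): { paired: string; pending: string } {
  const override = (env["CLAWDBOT_STATE_DIR"] || "").trim();
  const stateDir = override !== "" ? override : homeDir + "/.clawdbot";
  const base = stateDir.replace(/\/+$/, ""); // drop trailing slashes
  return {
    paired: base + "/nodes/paired.json",
    pending: base + "/nodes/pending.json",
  };
}
```

Under Node.js you would typically pass `process.env` and `os.homedir()` as the two arguments.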
Security notes:
- Tokens are secrets; treat `paired.json` as sensitive.
- Rotating a token requires re-approval (or deleting the node entry).
## Bridge behavior
- The bridge is **transport only**; it does not store membership.
- If the Gateway is offline or pairing is disabled, nodes cannot pair.
- If the bridge is running but the Gateway is in remote mode, pairing still
happens against the remote Gateway's store.


@@ -140,11 +140,3 @@ Nodes may include a `permissions` map in `node.list` / `node.describe`, keyed by
- The macOS menubar app connects to the Gateway bridge as a node (so `clawdbot nodes …` works against this Mac). - The macOS menubar app connects to the Gateway bridge as a node (so `clawdbot nodes …` works against this Mac).
- In remote mode, the app opens an SSH tunnel for the bridge port and connects to `localhost`. - In remote mode, the app opens an SSH tunnel for the bridge port and connects to `localhost`.
## Where to look in code
- CLI wiring: [`src/cli/nodes-cli.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/nodes-cli.ts)
- Canvas snapshot decoding/temp paths: [`src/cli/nodes-canvas.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/nodes-canvas.ts)
- Duration parsing for CLI: [`src/cli/parse-duration.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/parse-duration.ts)
- iOS node commands: [`apps/ios/Sources/Model/NodeAppModel.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/ios/Sources/Model/NodeAppModel.swift)
- Android node commands: `apps/android/app/src/main/java/com/clawdbot/android/node/*`


@@ -76,7 +76,7 @@ Goal: model can request location even when node is backgrounded, but only when:
Push-triggered flow (future): Push-triggered flow (future):
1) Gateway sends a push to the node (silent push or FCM data). 1) Gateway sends a push to the node (silent push or FCM data).
2) Node wakes briefly and calls `location.get` internally. 2) Node wakes briefly and requests location from the device.
3) Node forwards payload to Gateway. 3) Node forwards payload to Gateway.
Notes: Notes:


@@ -1,381 +1,105 @@
--- ---
summary: "iOS app (node): architecture + connection runbook" summary: "iOS node app: connect to the Gateway, pairing, canvas, and troubleshooting"
read_when: read_when:
- Pairing or reconnecting the iOS node - Pairing or reconnecting the iOS node
- Debugging iOS bridge discovery or auth - Running the iOS app from source
- Sending screen/canvas commands to iOS - Debugging bridge discovery or canvas commands
- Designing iOS node + gateway integration
- Extending the Gateway protocol for node/canvas commands
- Implementing Bonjour pairing or transport security
--- ---
# iOS App (Node) # iOS App (Node)
Status: prototype implemented (internal) · Date: 2025-12-13 Availability: internal preview. The iOS app is not publicly distributed yet.
## Support snapshot ## What it does
- Role: companion node app (iOS does not host the Gateway).
- Gateway required: yes (run it on macOS, Linux, or Windows via WSL2).
- Install: [Getting Started](/start/getting-started) + [Pairing](/gateway/pairing).
- Gateway: [Runbook](/gateway) + [Configuration](/gateway/configuration).
## System control - Connects to a Gateway over the bridge (LAN or tailnet).
System control (launchd/systemd) lives on the Gateway host. See [Gateway](/gateway). - Exposes node capabilities: Canvas, Screen snapshot, Camera capture, Location, Talk mode, Voice wake.
- Receives `node.invoke` commands and reports node status events.
## Connection Runbook ## Requirements
This is the practical “how do I connect the iOS node” guide: - Gateway running on another device (macOS, Linux, or Windows via WSL2).
- Bridge enabled (default).
- Network path:
- Same LAN via Bonjour, **or**
- Tailnet via unicast DNS-SD (`clawdbot.internal.`), **or**
- Manual host/port (fallback).
**iOS app** ⇄ (Bonjour + TCP bridge) ⇄ **Gateway bridge** ⇄ (loopback WS) ⇄ **Gateway** ## Quick start (pair + connect)
The Gateway WebSocket stays loopback-only (`ws://127.0.0.1:18789`). The iOS node talks to the LAN-facing **bridge** (default `tcp://0.0.0.0:18790`) and uses Gateway-owned pairing. 1) Start the Gateway (bridge enabled by default):
### Prerequisites
- You can run the Gateway on the “master” machine.
- iOS node app can reach the gateway bridge:
- Same LAN with Bonjour/mDNS, **or**
- Same Tailscale tailnet using Wide-Area Bonjour / unicast DNS-SD (see below), **or**
- Manual bridge host/port (fallback)
- You can run the CLI (`clawdbot`) on the gateway machine (or via SSH).
### 1) Start the Gateway (with bridge enabled)
Bridge is enabled by default (disable via `CLAWDBOT_BRIDGE_ENABLED=0`).
```bash ```bash
clawdbot gateway --port 18789 --verbose clawdbot gateway --port 18789
``` ```
Confirm in logs you see something like: 2) In the iOS app, open Settings and pick a discovered gateway (or enable Manual Bridge and enter host/port).
- `bridge listening on tcp://0.0.0.0:18790 (node)`
For tailnet-only setups (recommended for Vienna ⇄ London), bind the bridge to the gateway machine's Tailscale IP instead: 3) Approve the pairing request on the gateway host:
- Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json` on the gateway host.
- Restart the Gateway / macOS menubar app.
### 2) Verify Bonjour discovery (optional but recommended)
From the gateway machine:
```bash
dns-sd -B _clawdbot-bridge._tcp local.
```
You should see your gateway advertising `_clawdbot-bridge._tcp`.
If browse works, but the iOS node can't connect, try resolving one instance:
```bash
dns-sd -L "<instance name>" _clawdbot-bridge._tcp local.
```
More debugging notes: [`docs/bonjour.md`](/gateway/bonjour).
#### Tailnet (Vienna ⇄ London) discovery via unicast DNS-SD
If the iOS node and the gateway are on different networks but connected via Tailscale, multicast mDNS won't cross the boundary. Use Wide-Area Bonjour / unicast DNS-SD instead:
1) Set up a DNS-SD zone (example `clawdbot.internal.`) on the gateway host and publish `_clawdbot-bridge._tcp` records.
2) Configure Tailscale split DNS for `clawdbot.internal` pointing at that DNS server.
Details and example CoreDNS config: [`docs/bonjour.md`](/gateway/bonjour).
### 3) Connect from the iOS node app
In the iOS node app:
- Pick the discovered bridge (or hit refresh).
- If not paired yet, it will initiate pairing automatically.
- After the first successful pairing, it will auto-reconnect **strictly to the last discovered gateway** on launch (including after reinstall), as long as the iOS Keychain entry is still present.
#### Connection indicator (always visible)
The Settings tab icon shows a small status dot:
- **Green**: connected to the bridge
- **Yellow**: connecting (subtle pulse)
- **Red**: not connected / error
### 4) Approve pairing (CLI)
On the gateway machine:
```bash ```bash
clawdbot nodes pending clawdbot nodes pending
```
Approve the request:
```bash
clawdbot nodes approve <requestId> clawdbot nodes approve <requestId>
``` ```
After approval, the iOS node receives/stores the token and reconnects authenticated. 4) Verify connection:
Pairing details: [`docs/gateway/pairing.md`](/gateway/pairing).
### 5) Verify the node is connected
- In the macOS app: **Instances** tab should show something like `iOS Node (...)` with a green “Active” presence dot shortly after connect.
- Via nodes status (paired + connected):
```bash
clawdbot nodes status
```
- Via Gateway (paired + connected):
```bash
clawdbot gateway call node.list --params "{}"
```
- Via Gateway presence (legacy-ish, still useful):
```bash
clawdbot gateway call system-presence --params "{}"
```
Look for the node `instanceId` (often a UUID).
### 6) Drive the iOS Canvas (draw / snapshot)
The iOS node runs a WKWebView “Canvas” scaffold which exposes:
- `window.__clawdbot.canvas`
- `window.__clawdbot.ctx` (2D context)
- `window.__clawdbot.setStatus(title, subtitle)`
#### Gateway Canvas Host (recommended for web content)
If you want the node to show real HTML/CSS/JS that the agent can edit on disk, point it at the Gateway canvas host.
Note: nodes always use the standalone canvas host on `canvasHost.port` (default `18793`), bound to the bridge interface.
1) Create `~/clawd/canvas/index.html` on the gateway host.
2) Navigate the node to it (LAN):
```bash ```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.navigate --params '{"url":"http://<gateway-hostname>.local:18793/__clawdbot__/canvas/"}' clawdbot nodes status
clawdbot gateway call node.list --params "{}"
```
## Discovery paths
### Bonjour (LAN)
The Gateway advertises `_clawdbot-bridge._tcp` on `local.`. The iOS app lists these automatically.
### Tailnet (cross-network)
If mDNS is blocked, use a unicast DNS-SD zone (recommended domain: `clawdbot.internal.`) and Tailscale split DNS.
See [`docs/bonjour.md`](/gateway/bonjour) for the CoreDNS example.
### Manual host/port
In Settings, enable **Manual Bridge** and enter the gateway host + port (default `18790`).
## Canvas + A2UI
The iOS node renders a WKWebView canvas. Use `node.invoke` to drive it:
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.navigate --params '{"url":"http://<gateway-host>:18793/__clawdbot__/canvas/"}'
``` ```
Notes: Notes:
- The server injects a live-reload client into HTML and reloads on file changes. - The Gateway canvas host serves `/__clawdbot__/canvas/` and `/__clawdbot__/a2ui/`.
- A2UI is hosted on the same canvas host at `http://<gateway-host>:18793/__clawdbot__/a2ui/`. - The iOS node auto-navigates to A2UI on connect when a canvas host URL is advertised.
- Tailnet (optional): if both devices are on Tailscale, use a MagicDNS name or tailnet IP instead of `.local`, e.g. `http://<gateway-magicdns>:18793/__clawdbot__/canvas/`. - Return to the built-in scaffold with `canvas.navigate` and `{"url":""}`.
- iOS may require App Transport Security allowances to load plain `http://` URLs; if it fails to load, prefer HTTPS or adjust the iOS app's ATS config.
#### Draw with `canvas.eval` ### Canvas eval / snapshot
```bash ```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.eval --params "$(cat <<'JSON' clawdbot nodes invoke --node "iOS Node" --command canvas.eval --params '{"javaScript":"(() => { const {ctx} = window.__clawdbot; ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle=\"#ff2d55\"; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); return \"ok\"; })()"}'
{"javaScript":"(() => { const {ctx,setStatus} = window.__clawdbot; setStatus('Drawing','…'); ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle='#ff2d55'; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); setStatus(null,null); return 'ok'; })()"}
JSON
)"
``` ```
#### Snapshot with `canvas.snapshot`
```bash ```bash
clawdbot nodes invoke --node 192.168.0.88 --command canvas.snapshot --params '{"maxWidth":900}' clawdbot nodes invoke --node "iOS Node" --command canvas.snapshot --params '{"maxWidth":900,"format":"jpeg"}'
``` ```
The response includes `{ format, base64 }` image data (default `format="jpeg"`; pass `{"format":"png"}` when you specifically need lossless PNG). ## Voice wake + talk mode
### Common gotchas - Voice wake and talk mode are available in Settings.
- iOS may suspend background audio; treat voice features as best-effort when the app is not active.
- **iOS in background:** all `canvas.*` commands fail fast with `NODE_BACKGROUND_UNAVAILABLE` (bring the iOS node app to foreground). ## Common errors
- **Return to default scaffold:** `canvas.navigate` with `{"url":""}` or `{"url":"/"}` returns to the built-in scaffold page.
- **mDNS blocked:** some networks block multicast; use a different LAN or plan a tailnet-capable bridge (see [`docs/discovery.md`](/gateway/discovery)).
- **Wrong node selector:** `--node` can be the node id (UUID), display name (e.g. `iOS Node`), IP, or an unambiguous prefix. If it's ambiguous, the CLI will tell you.
- **Stale pairing / Keychain cleared:** if the pairing token is missing (or iOS Keychain was wiped), the node must pair again; approve a new pending request.
- **App reinstall but no reconnect:** the node restores `instanceId` + last bridge preference from Keychain; if it still comes up “unpaired”, verify Keychain persistence on your device/simulator and re-pair once.
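
To make the `--node` selector rule from the list above concrete, here is one way such a lookup could work. It is a sketch, not the CLI's code: `NodeRecord`, `NodeResolution`, and `resolveNode` are hypothetical names, and both the case-insensitive matching and the choice to apply prefixes to the node id only are assumptions.

```typescript
interface NodeRecord {
  nodeId: string;
  displayName?: string;
  remoteIp?: string;
}

type NodeResolution =
  | { kind: "match"; node: NodeRecord }
  | { kind: "ambiguous"; candidates: NodeRecord[] }
  | { kind: "none" };

// Turn a candidate list into a resolution, or undefined when it is empty.
function pickFrom(found: NodeRecord[]): NodeResolution | undefined {
  const first = found[0];
  if (first === undefined) return undefined;
  return found.length === 1
    ? { kind: "match", node: first }
    : { kind: "ambiguous", candidates: found };
}

function resolveNode(selector: string, nodes: NodeRecord[]): NodeResolution {
  const q = selector.trim().toLowerCase();
  if (q === "") return { kind: "none" };

  // 1) Exact node id, display name, or IP.
  const exact = nodes.filter(
    (n) =>
      n.nodeId.toLowerCase() === q ||
      (n.displayName || "").toLowerCase() === q ||
      n.remoteIp === q,
  );

  // 2) Otherwise, treat the selector as a prefix of the node id (assumption).
  const byPrefix = nodes.filter((n) => n.nodeId.toLowerCase().indexOf(q) === 0);

  return pickFrom(exact) || pickFrom(byPrefix) || { kind: "none" };
}
```

Returning the candidates in the ambiguous case lets a caller print them, which matches the "the CLI will tell you" behavior described above.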
## Design + Architecture - `NODE_BACKGROUND_UNAVAILABLE`: bring the iOS app to the foreground (canvas/camera/screen commands require it).
- `A2UI_HOST_NOT_CONFIGURED`: the Gateway did not advertise a canvas host URL; check `canvasHost` in [`docs/configuration.md`](/gateway/configuration).
### Goals - Pairing prompt never appears: run `clawdbot nodes pending` and approve manually.
- Build an **iOS app** that acts as a **remote node** for Clawdbot: - Reconnect fails after reinstall: the Keychain pairing token was cleared; re-pair the node.
- **Voice trigger** (wake-word / always-listening intent) that forwards transcripts to the Gateway `agent` method.
- **Canvas** surface that the agent can control: navigate, draw/render, evaluate JS, snapshot.
- **Dead-simple setup**:
- Auto-discover the host on the local network via **Bonjour**.
- One-tap pairing with an approval prompt on the Mac.
- iOS is **never** a local gateway; it is always a remote node.
- Operational clarity:
- When iOS is backgrounded, voice may still run; **canvas commands must fail fast** with a structured error.
- Provide **settings**: node display name, enable/disable voice wake, pairing status.
Non-goals (v1):
- Exposing the Node Gateway directly on the LAN.
- Supporting arbitrary third-party “plugins” on iOS.
- Perfect App Store compliance; this is **internal-only** initially.
### Current repo reality (constraints we respect)
- The Gateway WebSocket server binds to `127.0.0.1:18789` ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts)) with an optional `CLAWDBOT_GATEWAY_TOKEN`.
- The Gateway exposes a Canvas file server (`canvasHost`) on `canvasHost.port` (default `18793`), so nodes can `canvas.navigate` to `http://<lanHost>:18793/__clawdbot__/canvas/` and auto-reload on file changes ([`docs/configuration.md`](/gateway/configuration)).
- macOS “Canvas” is controlled via the Gateway node protocol (`canvas.*`), matching iOS/Android ([`docs/mac/canvas.md`](/platforms/mac/canvas)).
- Voice wake forwards via `GatewayChannel` to Gateway `agent` (mac app: `VoiceWakeForwarder` → `GatewayConnection.sendAgent`).
### Recommended topology (B): Gateway-owned Bridge + loopback Gateway
Keep the Node gateway loopback-only; expose a dedicated **gateway-owned bridge** to the LAN/tailnet.
**iOS App** ⇄ (TLS + pairing) ⇄ **Bridge (in gateway)** ⇄ (loopback) ⇄ **Gateway WS** (`ws://127.0.0.1:18789`)
Why:
- Preserves current threat model: Gateway remains local-only.
- Centralizes auth, rate limiting, and allowlisting in the bridge.
- Lets us unify “canvas node” semantics across mac + iOS without exposing raw gateway methods.
### Security plan (internal, but still robust)
#### Transport
- **Current (v0):** bridge is a LAN-facing **TCP** listener with token-based auth after pairing.
- **Next:** wrap the bridge in **TLS** and prefer key-pinned or mTLS-like auth after pairing.
#### Pairing
- Bonjour discovery shows a candidate “Clawdbot Bridge” on the LAN.
- First connection:
1) iOS generates a keypair (Secure Enclave if available).
2) iOS connects to the bridge and requests pairing.
3) The bridge forwards the pairing request to the **Gateway** as a *pending request*.
4) Approval can happen via:
- **macOS UI** (Clawdbot shows an alert with Approve/Reject/Later, including the node IP), or
- **Terminal/CLI** (headless flows).
5) Once approved, the bridge returns a token to iOS; iOS stores it in Keychain.
- Subsequent connections:
- The bridge requires the paired identity. Unpaired clients get a structured “not paired” error and no access.
##### Gateway-owned pairing (Option B details)
Pairing decisions must be owned by the Gateway (`clawd` / Node) so nodes can be approved without the macOS app running.
Key idea:
- The Swift app may still show an alert, but it is only a **frontend** for pending requests stored in the Gateway.
Desired behavior:
- If the Swift UI is present: show alert with Approve/Reject/Later.
- If the Swift UI is not present: `clawdbot` CLI can list pending requests and approve/reject.
See [`docs/gateway/pairing.md`](/gateway/pairing) for the API/events and storage.
CLI (headless approvals):
- `clawdbot nodes pending`
- `clawdbot nodes approve <requestId>`
- `clawdbot nodes reject <requestId>`
#### Authorization / scope control (bridge-side ACL)
The bridge must not be a raw proxy to every gateway method.
- Allow by default:
- `agent` (with guardrails; idempotency required)
- minimal `system-event` beacons (presence updates for the node)
- node/canvas methods defined below (new protocol surface)
- Deny by default:
- anything that widens control without explicit intent (future “shell”, “files”, etc.)
- Rate limit:
- handshake attempts
- voice forwards per minute
- snapshot frequency / payload size
### Protocol unification: add “node/canvas” to Gateway protocol
#### Principle
Unify mac Canvas + iOS Canvas under a single conceptual surface:
- The agent talks to the Gateway using a stable method set (typed protocol).
- The Gateway routes node-targeted requests to:
- local mac Canvas implementation, or
- remote iOS node via the bridge
#### Minimal protocol additions (v1)
Add to [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) (and regenerate Swift models):
**Identity**
- Node identity comes from `connect.params.client.instanceId` (stable), and `connect.params.client.mode = "node"` (or `"ios-node"`).
**Methods**
- `node.list` → list paired/connected nodes + capabilities
- `node.describe` → describe a node (capabilities + supported `node.invoke` commands)
- `node.invoke` → send a command to a specific node
- Params: `{ nodeId, command, params?, timeoutMs? }`
**Events**
- `node.event` → async node status/errors
- e.g. background/foreground transitions, voice availability, canvas availability
#### Node command set (canvas)
These are values for `node.invoke.command`:
- `canvas.present` / `canvas.hide`
- `canvas.navigate` with `{ url }` (loads a URL; use `""` or `"/"` to return to the default scaffold)
- `canvas.eval` with `{ javaScript }`
- `canvas.snapshot` with `{ maxWidth?, quality?, format? }`
- A2UI (mobile + macOS canvas):
- `canvas.a2ui.push` with `{ messages: [...] }` (A2UI v0.8 server→client messages)
- `canvas.a2ui.pushJSONL` with `{ jsonl: "..." }` (legacy alias)
- `canvas.a2ui.reset`
- A2UI is hosted by the Gateway canvas host (`/__clawdbot__/a2ui/`) on `canvasHost.port`. Commands fail if the host is unreachable.
Result pattern:
- Request is a standard `req/res` with `ok` / `error`.
- Long operations (loads, streaming drawing, etc.) may also emit `node.event` progress.
##### Current (implemented)
As of 2025-12-13, the Gateway supports `node.invoke` for bridge-connected nodes.
Example: draw a diagonal line on the iOS Canvas:
```bash
clawdbot nodes invoke --node ios-node --command canvas.eval --params '{"javaScript":"(() => { const {ctx} = window.__clawdbot; ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle=\"#ff2d55\"; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); return \"ok\"; })()"}'
```
### Background behavior requirement
When iOS is backgrounded:
- Voice may still be active (subject to iOS suspension).
- **All `canvas.*` commands must fail** with a stable error code, e.g.:
- `NODE_BACKGROUND_UNAVAILABLE`
- Include `retryable: true` and `retryAfterMs` if we want the agent to wait.
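
A structured failure of this kind might look like the sketch below. The error code and the `retryable` / `retryAfterMs` fields come from the requirement above; the `NodeInvokeError` type, the `guardCanvasCommand` function, and the message text are hypothetical.

```typescript
interface NodeInvokeError {
  code: string;
  message: string;
  retryable: boolean;
  retryAfterMs?: number;
}

// Returns an error for canvas commands while the app is backgrounded, or
// undefined when the command may proceed. `retryAfterMs` is an optional,
// caller-chosen hint.
function guardCanvasCommand(
  command: string,
  appInBackground: boolean,
  retryAfterMs?: number,
): NodeInvokeError | undefined {
  const isCanvas = command.indexOf("canvas.") === 0;
  if (!isCanvas || !appInBackground) return undefined;

  const error: NodeInvokeError = {
    code: "NODE_BACKGROUND_UNAVAILABLE",
    message: "Canvas is unavailable while the app is in the background",
    retryable: true,
  };
  if (retryAfterMs !== undefined) error.retryAfterMs = retryAfterMs;
  return error;
}
```

Checking before dispatch keeps the failure fast: the command never reaches the WKWebView, so the caller gets the stable code without waiting for a timeout.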
## iOS app architecture (SwiftUI)
### App structure
- Single fullscreen Canvas surface (WKWebView).
- One settings entry point: a **gear button** that opens a settings sheet.
- All navigation is **agent-driven** (no local URL bar).
### Components
- `BridgeDiscovery`: Bonjour browse + resolve (Network.framework `NWBrowser`)
- `BridgeConnection`: TCP session + pairing handshake + reconnect (TLS planned)
- `NodeRuntime`:
- Voice pipeline (wake-word + capture + forward)
- Canvas pipeline (WKWebView controller + snapshot + eval)
- Background state tracking; enforces “canvas unavailable in background”
### Voice in background (internal)
- Enable background audio mode (and required session configuration) so the mic pipeline can keep running when the user switches apps.
- If iOS suspends the app anyway, surface a clear node status (`node.event`) so operators can see voice is unavailable.
## Code sharing (macOS + iOS)
Create/expand SwiftPM targets so both apps share:
- `ClawdbotProtocol` (generated models; platform-neutral)
- `ClawdbotGatewayClient` (shared WS framing + connect/req/res + seq-gap handling)
- `ClawdbotKit` (node/canvas command types + deep links + shared utilities)
macOS continues to own:
- local Canvas implementation details (custom scheme handler serving on-disk HTML, window/panel presentation)
iOS owns:
- iOS-specific audio/speech + WKWebView presentation and lifecycle
## Repo layout
- iOS app: `apps/ios/` (XcodeGen `project.yml`)
- Shared Swift packages: `apps/shared/`
- Lint/format: iOS target runs `swiftformat --lint` + `swiftlint lint` using repo configs (`.swiftformat`, `.swiftlint.yml`).
Generate the Xcode project:
```bash
cd apps/ios
xcodegen generate
open Clawdbot.xcodeproj
```
## Storage plan (private by default)
### iOS
- Canvas/workspace files (persistent, private):
- `Application Support/Clawdbot/canvas/<sessionKey>/...`
- Snapshots / temp exports (evictable):
- `Library/Caches/Clawdbot/canvas-snapshots/<sessionKey>/...`
- Credentials:
- Keychain (paired identity + bridge trust anchor)
## Related docs ## Related docs
- [`docs/gateway.md`](/gateway) (gateway runbook) - [Pairing](/gateway/pairing)
- [`docs/gateway/pairing.md`](/gateway/pairing) (approval + storage) - [Discovery](/gateway/discovery)
- [`docs/bonjour.md`](/gateway/bonjour) (discovery debugging) - [Bonjour](/gateway/bonjour)
- [`docs/discovery.md`](/gateway/discovery) (LAN vs tailnet vs SSH)

@@ -15,7 +15,7 @@ Goal: ship **Clawdbot.app** with a self-contained relay binary that can run both
App bundle layout: App bundle layout:
- `Clawdbot.app/Contents/Resources/Relay/clawdbot` - `Clawdbot.app/Contents/Resources/Relay/clawdbot`
- bun `--compile` relay executable built from [`dist/macos/relay.js`](https://github.com/clawdbot/clawdbot/blob/main/dist/macos/relay.js) - bun `--compile` relay executable built from `dist/macos/relay.js`
- Supports: - Supports:
- `clawdbot …` (CLI) - `clawdbot …` (CLI)
- `clawdbot gateway …` (LaunchAgent daemon) - `clawdbot gateway …` (LaunchAgent daemon)
@@ -47,7 +47,7 @@ Important bundler flags:
Version injection: Version injection:
- `--define "__CLAWDBOT_VERSION__=\"<pkg version>\""` - `--define "__CLAWDBOT_VERSION__=\"<pkg version>\""`
- [`src/version.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/version.ts) also supports `__CLAWDBOT_VERSION__` (and `CLAWDBOT_BUNDLED_VERSION`) so `--version` doesn’t depend on reading `package.json` at runtime. - The relay honors `__CLAWDBOT_VERSION__` / `CLAWDBOT_BUNDLED_VERSION` so `--version` doesn’t depend on reading `package.json` at runtime.
## Launchd (Gateway as LaunchAgent) ## Launchd (Gateway as LaunchAgent)
@@ -58,7 +58,7 @@ Plist location (per-user):
- `~/Library/LaunchAgents/com.clawdbot.gateway.plist` - `~/Library/LaunchAgents/com.clawdbot.gateway.plist`
Manager: Manager:
- [`apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift) - The macOS app owns LaunchAgent install/update for the bundled gateway.
Behavior: Behavior:
- “Clawdbot Active” enables/disables the LaunchAgent. - “Clawdbot Active” enables/disables the LaunchAgent.
@@ -79,7 +79,7 @@ Symptom (when mis-signed):
Fix: Fix:
- The bun executable needs JIT-ish permissions under hardened runtime. - The bun executable needs JIT-ish permissions under hardened runtime.
- [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) signs `Relay/clawdbot` with: - `scripts/codesign-mac-app.sh` signs `Relay/clawdbot` with:
- `com.apple.security.cs.allow-jit` - `com.apple.security.cs.allow-jit`
- `com.apple.security.cs.allow-unsigned-executable-memory` - `com.apple.security.cs.allow-unsigned-executable-memory`
@@ -89,18 +89,14 @@ Problem:
- bun can’t load some native Node addons like `sharp` (and we don’t want to ship native addon trees for the gateway). - bun can’t load some native Node addons like `sharp` (and we don’t want to ship native addon trees for the gateway).
Solution: Solution:
- Central helper [`src/media/image-ops.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/media/image-ops.ts) - Image operations prefer `/usr/bin/sips` on macOS (especially under bun).
- Prefers `/usr/bin/sips` on macOS (esp. when running under bun) - When running in Node/dev, `sharp` is used when available.
- Falls back to `sharp` when available (Node/dev) - This affects inbound/outbound media, screenshots, and tool image sanitization.
- Used by:
- [`src/web/media.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/media.ts) (optimize inbound/outbound images)
- [`src/browser/screenshot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/browser/screenshot.ts)
- [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts) (image sanitization)
## Browser control server ## Browser control server
The Gateway starts the browser control server (loopback only) from [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts). The Gateway starts the browser control server (loopback only) from the relay daemon process,
It’s started from the relay daemon process, so the relay binary includes Playwright deps. so the relay binary includes Playwright deps.
## Tests / smoke checks ## Tests / smoke checks
@@ -127,7 +123,7 @@ Bun may leave dotfiles like `*.bun-build` in the repo root or subfolders.
## DMG styling (human installer) ## DMG styling (human installer)
[`scripts/create-dmg.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/create-dmg.sh) styles the DMG via Finder AppleScript. `scripts/create-dmg.sh` styles the DMG via Finder AppleScript.
Rules of thumb: Rules of thumb:
- Use a **72dpi** background image that matches the Finder window size in points. - Use a **72dpi** background image that matches the Finder window size in points.

@@ -5,157 +5,117 @@ read_when:
- Adding agent controls for visual workspace - Adding agent controls for visual workspace
- Debugging WKWebView canvas loads - Debugging WKWebView canvas loads
--- ---
# Canvas (macOS app) # Canvas (macOS app)
Status: draft spec · Date: 2025-12-12 The macOS app embeds an agent-controlled **Canvas panel** using `WKWebView`. It
is a lightweight visual workspace for HTML/CSS/JS, A2UI, and small interactive
UI surfaces.
Note: for iOS/Android nodes that should render agent-edited HTML/CSS/JS over the network, prefer the Gateway `canvasHost` (serves `~/clawd/canvas` over LAN/tailnet with live reload). A2UI is also **hosted by the Gateway** over HTTP. This doc focuses on the macOS in-app canvas panel. See [`docs/configuration.md`](/gateway/configuration). ## Where Canvas lives
Clawdbot can embed an agent-controlled “visual workspace” panel (“Canvas”) inside the macOS app using `WKWebView`, served via a **custom URL scheme** (no loopback HTTP port required). Canvas state is stored under Application Support:
This is designed for: - `~/Library/Application Support/Clawdbot/canvas/<session>/...`
- Agent-written HTML/CSS/JS on disk (per-session directory).
- A real browser engine for layout, rendering, and basic interactivity.
- Agent-driven visibility (show/hide), navigation, DOM/JS queries, and snapshots.
- Minimal chrome: borderless panel; bezel/chrome appears only on hover.
## Why a custom scheme (vs. loopback HTTP) The Canvas panel serves those files via a **custom URL scheme**:
Using `WKURLSchemeHandler` keeps Canvas entirely in-process:
- No port conflicts and no extra local server lifecycle.
- Easier to sandbox: only serve files we explicitly map.
- Works offline and can use an ephemeral data store (no persistent cookies/cache).
If a Canvas page truly needs “real web” semantics (CORS, fetch to loopback endpoints, service workers), consider the loopback-server variant instead (out of scope for this doc).
## URL ↔ directory mapping
The Canvas scheme is:
- `clawdbot-canvas://<session>/<path>` - `clawdbot-canvas://<session>/<path>`
Routing model: Examples:
- `clawdbot-canvas://main/` → `<canvasRoot>/main/index.html` (or `index.htm`) - `clawdbot-canvas://main/` → `<canvasRoot>/main/index.html`
- `clawdbot-canvas://main/yolo` → `<canvasRoot>/main/yolo/index.html` (or `index.htm`)
- `clawdbot-canvas://main/assets/app.css` → `<canvasRoot>/main/assets/app.css` - `clawdbot-canvas://main/assets/app.css` → `<canvasRoot>/main/assets/app.css`
- `clawdbot-canvas://main/widgets/todo/` → `<canvasRoot>/main/widgets/todo/index.html`
Directory listings are not served. If no `index.html` exists at the root, the app shows a **built-in scaffold page**.
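The mapping can be sketched as a pure function. This is an illustration only: the real handler lives in the macOS app (Swift), and `resolveCanvasPath`, the "no file extension means directory" heuristic, and the lack of query-string or percent-decoding handling are assumptions made here.

```typescript
// Maps clawdbot-canvas://<session>/<path> to a file path under <canvasRoot>.
// Returns null when the URL is malformed or tries to escape the session root.
// Hypothetical helper; not the app's actual implementation.
function resolveCanvasPath(url: string, canvasRoot: string): string | null {
  const prefix = "clawdbot-canvas://";
  if (!url.startsWith(prefix)) return null;

  const rest = url.slice(prefix.length);
  const slash = rest.indexOf("/");
  const session = slash === -1 ? rest : rest.slice(0, slash);
  const rawPath = slash === -1 ? "" : rest.slice(slash + 1);
  if (session === "" || session === "." || session === "..") return null;

  const segments: string[] = [];
  for (const part of rawPath.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") return null; // reject directory traversal outright
    segments.push(part);
  }

  // Directory-style targets (root, trailing slash, or a last segment without
  // a file extension) resolve to index.html; this rule is an assumption.
  const last = segments[segments.length - 1];
  const isDirectory =
    rawPath === "" || rawPath.endsWith("/") || (last !== undefined && !last.includes("."));
  if (isDirectory) segments.push("index.html");

  return [canvasRoot, session, ...segments].join("/");
}
```

Rejecting any `..` segment, rather than normalizing it, keeps the check simple and guarantees the result stays under `<canvasRoot>/<session>/`.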
When `/` has no `index.html` yet, the handler serves a **built-in scaffold page** (bundled with the macOS app). ## Panel behavior
This is a visual placeholder only (no A2UI renderer).
### Suggested on-disk location - Borderless, resizable panel anchored near the menu bar (or mouse cursor).
- Remembers size/position per session.
- Auto-reloads when local canvas files change.
- Only one Canvas panel is visible at a time (session is switched as needed).
Store Canvas state under the app support directory: Canvas can be disabled from Settings → **Allow Canvas**. When disabled, canvas
- `~/Library/Application Support/Clawdbot/canvas/<session>/…` node commands return `CANVAS_DISABLED`.
This keeps it alongside other app-owned state and avoids mixing with `~/.clawdbot/` gateway config. ## Agent API surface
## Panel behavior (agent-controlled) Canvas is exposed via the **node bridge**, so the agent can:
Canvas is presented as a borderless `NSPanel` (similar to the existing WebChat panel): - show/hide the panel
- Can be shown/hidden at any time by the agent. - navigate to a path or URL
- Supports an “anchored” presentation (near the menu bar icon or another anchor rect). - evaluate JavaScript
- Uses a rounded container; shadow stays on, but **chrome/bezel only appears on hover**. - capture a snapshot image
- Default position is the **top-right corner** of the current screen’s visible frame (unless the user moved/resized it previously).
- The panel is **user-resizable** (edge resize + hover resize handle) and the last frame is persisted per session.
### Hover-only chrome CLI examples:
Implementation notes: ```bash
- Keep the window borderless at all times (don’t toggle `styleMask`). clawdbot nodes canvas present --node <id>
- Add an overlay view inside the content container for chrome (stroke + subtle gradient/material). clawdbot nodes canvas navigate --node <id> --url "/"
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`. clawdbot nodes canvas eval --node <id> --js "document.title"
- Optionally show close/drag affordances only while hovered. clawdbot nodes canvas snapshot --node <id>
```
## Agent API surface (current) Notes:
- `canvas.navigate` accepts **local canvas paths**, `http(s)` URLs, and `file://` URLs.
- If you pass `"/"`, the Canvas shows the local scaffold or `index.html`.
Canvas is exposed via the Gateway **node bridge**, so the agent can: ## A2UI in Canvas
- Show/hide the panel.
- Navigate to a path (relative to the session root).
- Evaluate JavaScript and optionally return results.
- Query/modify DOM (helpers mirroring “dom query/all/attr/click/type/wait” patterns).
- Capture a snapshot image of the current canvas view.
- Optionally set panel placement (screen `x/y` + `width/height`) when showing/navigating.
This should be modeled after `WebChatManager`/`WebChatSwiftUIWindowController` but targeting `clawdbot-canvas://…` URLs. A2UI is hosted by the Gateway canvas host and rendered inside the Canvas panel.
When the Gateway advertises a Canvas host, the macOS app auto-navigates to the
A2UI host page on first open.
Related: Default A2UI host URL:
- For “invoke the agent again from UI” flows, prefer the macOS deep link scheme (`clawdbot://agent?...`) so *any* UI surface (Canvas, WebChat, native views) can trigger a new agent run. See [`docs/macos.md`](/platforms/macos).
## Agent commands (current)
Use the main `clawdbot` CLI; it invokes canvas commands via `node.invoke`.
- `clawdbot nodes canvas present --node <id> [--target <...>] [--x/--y/--width/--height]`
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
- If `/` has no index file, Canvas shows the built-in scaffold page and returns `status: "welcome"`.
- `clawdbot nodes canvas hide --node <id>`
- `clawdbot nodes canvas eval --js <code> --node <id>`
- `clawdbot nodes canvas snapshot --node <id>`
### Canvas A2UI
Canvas A2UI is hosted by the **Gateway canvas host** at:
``` ```
http://<gateway-host>:18793/__clawdbot__/a2ui/ http://<gateway-host>:18793/__clawdbot__/a2ui/
``` ```
The macOS app simply renders that page in the Canvas panel. The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line): ### A2UI commands (v0.8)
- `clawdbot nodes canvas a2ui push --jsonl <path> --node <id>` Canvas currently accepts **A2UI v0.8** server→client messages:
- `clawdbot nodes canvas a2ui reset --node <id>`
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer). - `beginRendering`
- `surfaceUpdate`
- `dataModelUpdate`
- `deleteSurface`
Minimal example (v0.8): `createSurface` (v0.9) is not supported.
CLI example:
```bash ```bash
cat > /tmp/a2ui-v0.8.jsonl <<'EOF' cat > /tmp/a2ui-v0.8.jsonl <<'EOFA2'
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, `nodes canvas a2ui push` works."},"usageHint":"body"}}}]}} {"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, A2UI push works."},"usageHint":"body"}}}]}}
{"beginRendering":{"surfaceId":"main","root":"root"}} {"beginRendering":{"surfaceId":"main","root":"root"}}
EOF EOFA2
clawdbot nodes canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id> clawdbot nodes canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
``` ```
Notes: Quick smoke:
- This does **not** support the A2UI v0.9 examples using `createSurface`.
- A2UI **fails** if the Gateway canvas host is unreachable (no local fallback).
- `nodes canvas a2ui push` validates JSONL (line numbers on errors) and rejects v0.9 payloads.
- Quick smoke: `clawdbot nodes canvas a2ui push --node <id> --text "Hello from A2UI"` renders a minimal v0.8 view.
## Triggering agent runs from Canvas (deep links) ```bash
clawdbot nodes canvas a2ui push --node <id> --text "Hello from A2UI"
```
## Triggering agent runs from Canvas
Canvas can trigger new agent runs via deep links:
Canvas can trigger new agent runs via the macOS app deep-link scheme:
- `clawdbot://agent?...` - `clawdbot://agent?...`
This is intentionally separate from `clawdbot-canvas://…` (which is only for serving local Canvas files into the `WKWebView`). Example (in JS):
Suggested patterns: ```js
- HTML: render links/buttons that navigate to `clawdbot://agent?message=...`. window.location.href = "clawdbot://agent?message=Review%20this%20design";
- JS: set `window.location.href = 'clawdbot://agent?...'` for “run this now” actions. ```
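A small sketch of building such a link safely, assuming only the `message` query parameter shown in this doc; `buildAgentDeepLink` is a hypothetical helper name, not an existing API.

```typescript
// Builds a clawdbot://agent deep link with a percent-encoded message.
// Only the `message` parameter is taken from the doc; the helper is illustrative.
function buildAgentDeepLink(message: string): string {
  return `clawdbot://agent?message=${encodeURIComponent(message)}`;
}
```

Encoding the message keeps spaces, `&`, and `#` in user text from being read as URL syntax.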
Implementation note (important): The app prompts for confirmation unless a valid key is provided.
- In `WKWebView`, intercept `clawdbot://…` navigations in `WKNavigationDelegate` and forward them to the app, e.g. by calling `DeepLinkHandler.shared.handle(url:)` and returning `.cancel` for the navigation.
Safety: ## Security notes
- Deep links (`clawdbot://agent?...`) are always enabled.
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
- With a valid `key`, the run is unattended (no prompt). For Canvas-originated actions, the app injects an internal key automatically.
## Security / guardrails - Canvas scheme blocks directory traversal; files must live under the session root.
- Local Canvas content uses a custom scheme (no loopback server required).
Recommended defaults: - External `http(s)` URLs are allowed only when explicitly navigated.
- `WKWebsiteDataStore.nonPersistent()` for Canvas (ephemeral).
- Navigation policy: allow only `clawdbot-canvas://…` (and optionally `about:blank`); open `http/https` externally.
- Scheme handler must prevent directory traversal: resolved file paths must stay under `<canvasRoot>/<session>/`.
- Disable or tightly scope any JS bridge; prefer query-string/bootstrap config over `window.webkit.messageHandlers` for sensitive data.
## Debugging
Suggested debugging hooks:
- Enable Web Inspector for Canvas builds (same approach as WebChat).
- Log scheme requests + resolution decisions to OSLog (subsystem `com.clawdbot`, category `Canvas`).
- Provide a “copy canvas dir” action in debug settings to quickly reveal the session directory in Finder.

@@ -1,72 +1,56 @@
--- ---
summary: "Running the gateway as a child process of the macOS app and why" summary: "Gateway lifecycle on macOS (launchd + attach-only)"
read_when: read_when:
- Integrating the mac app with the gateway lifecycle - Integrating the mac app with the gateway lifecycle
--- ---
# Clawdbot gateway as a child process of the macOS app # Gateway lifecycle on macOS
Date: 2025-12-06 · Status: draft · Owner: steipete The macOS app **manages the Gateway via launchd** by default. This gives you
reliable auto-start at login and restart on crashes.
Note (2025-12-19): the current implementation prefers a **launchd LaunchAgent** that runs the **bundled bun-compiled gateway**. This doc remains as an alternative mode for tighter coupling to the UI. Child-process mode (Gateway spawned directly by the app) is **not in use** today.
If you need tighter coupling to the UI, use **Attach-only** and run the Gateway
manually in a terminal.
## Goal ## Default behavior (launchd)
Run the Node-based Clawdbot/clawdbot gateway as a direct child of the LSUIElement app (instead of a launchd agent) while keeping all TCC-sensitive work inside the Swift app/broker layer and wiring the existing “Clawdbot Active” toggle to start/stop the child.
## When to prefer the child-process mode - The app installs a per-user LaunchAgent labeled `com.clawdbot.gateway`.
- You want gateway lifetime strictly coupled to the menu-bar app (dies when the app quits) and controlled by the “Clawdbot Active” toggle without touching launchd. - When Local mode is enabled, the app ensures the LaunchAgent is loaded and
- You’re okay giving up login persistence/auto-restart that launchd provides, or you’ll add your own backoff loop. starts the Gateway if needed.
- You want simpler log capture and supervision inside the app (no external plist or user-visible LaunchAgent). - Logs are written to the launchd gateway log path (visible in Debug Settings).
## Tradeoffs vs. launchd Common commands:
- **Pros:** tighter coupling to UI state; simpler surface (no plist install/bootout); easier to stream stdout/stderr; fewer moving parts for beta users.
- **Cons:** no built-in KeepAlive/login auto-start; app crash kills gateway; you must build your own restart/backoff; Activity Monitor will show both processes under the app; still need correct TCC handling (see below).
- **TCC:** behaviorally, child processes often inherit the parent app’s “responsible process” for TCC, but this is *not a contract*. Continue to route all protected actions through the Swift app/broker so prompts stay tied to the signed app bundle.
## TCC guardrails (must keep) ```bash
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; route through the app’s node commands (via Gateway `node.invoke`) for: launchctl kickstart -k gui/$UID/com.clawdbot.gateway
- `system.notify` launchctl bootout gui/$UID/com.clawdbot.gateway
- `system.run` (including `needsScreenRecording`) ```
- `screen.record` / `camera.*`
- PeekabooBridge UI automation (`peekaboo …`)
- Usage strings (`NSMicrophoneUsageDescription`, `NSSpeechRecognitionUsageDescription`, etc.) stay in the app target’s Info.plist; a bare Node binary has none and would fail.
- If you ever embed Node that *must* touch TCC, wrap that call in a tiny signed helper target inside the app bundle and have Node exec that helper instead of calling the API directly.
## Process manager design (Swift Subprocess) ## Attach-only (developer mode)
- Add a small `GatewayProcessManager` (Swift) that owns:
- `execution: Execution?` from `Swift Subprocess` to track the child.
- `start(config)` called when “Clawdbot Active” flips ON:
- binary: host Node running the bundled gateway under `Clawdbot.app/Contents/Resources/Gateway/`
- args: current clawdbot entrypoint and flags
- cwd/env: point to `~/.clawdbot` as today; inject the expanded PATH so Homebrew Node resolves under launchd
- output: stream stdout/stderr to `/tmp/clawdbot-gateway.log` (cap buffer via Subprocess OutputLimits)
- restart: optional linear/backoff restart if exit was non-zero and Active is still true
- `stop()` called when Active flips OFF or app terminates: cancel the execution and `waitUntilExit`.
- Wire SwiftUI toggle:
- ON: `GatewayProcessManager.start(...)`
- OFF: `GatewayProcessManager.stop()` (no launchctl calls in this mode)
- Keep the existing `LaunchdManager` around so we can switch back if needed; the toggle can choose between launchd or child mode with a flag if we want both.
## Packaging and signing Attach-only tells the app to **connect to an existing Gateway** without spawning
- Bundle the gateway payload (dist + production node_modules) under `Contents/Resources/Gateway/`; rely on host Node ≥22 instead of embedding a runtime. one. This is ideal for local dev (hot-reload, custom flags).
- Codesign native addons and dylibs inside the bundle; no nested runtime binary to sign now.
- Host runtime should not call TCC APIs directly; keep privileged work inside the app/broker.
## Logging and observability Steps:
- Stream child stdout/stderr to `/tmp/clawdbot-gateway.log`; surface the last N lines in the Debug tab.
- Emit a user notification (via existing NotificationManager) on crash/exit while Active is true.
- Add a lightweight heartbeat from Node → app (e.g., ping over stdout) so the app can show status in the menu.
## Failure/edge cases 1) Start the Gateway yourself:
- App crash/quit kills the gateway. Decide if that is acceptable for the deployment tier; otherwise, stick with launchd for production and keep child-process for dev/experiments. ```bash
- If the gateway exits repeatedly, back off (e.g., 1s/2s/5s/10s) and give up after N attempts with a menu warning. pnpm gateway:watch
- Respect the existing pause semantics: when paused, the broker should return `ok=false, "clawdbot paused"`; the gateway should avoid calling privileged routes while paused. ```
2) In the macOS app: Debug Settings → Gateway → **Attach only**.
## Open questions / follow-ups The UI should show “Using existing gateway …” once connected.
- Do we need dual-mode (launchd for prod, child for dev)? If yes, gate via a setting or build flag.
- Embedding a runtime is off the table for now; we rely on host Node for size/simplicity. Revisit only if host PATH drift becomes painful.
- Do we want a tiny signed helper for rare TCC actions that cannot be brokered via the Swift app/broker?
## Decision snapshot (current recommendation) ## Remote mode
- Keep all TCC surfaces in the Swift app/broker (node commands + PeekabooBridgeHost).
- Implement `GatewayProcessManager` with Swift Subprocess to start/stop the gateway on the “Clawdbot Active” toggle. Remote mode never starts a local Gateway. The app uses an SSH tunnel to the
- Maintain the launchd path as a fallback for uptime/login persistence until child-mode proves stable. remote host and connects over that tunnel.
## Why we prefer launchd
- Auto-start at login.
- Built-in restart/KeepAlive semantics.
- Predictable logs and supervision.
If a true child-process mode is ever needed again, it should be documented as a
separate, explicit dev-only mode.

@@ -1,170 +1,62 @@
--- ---
summary: "Plan for integrating Peekaboo automation into Clawdbot via PeekabooBridge (socket-based TCC broker)" summary: "PeekabooBridge integration for macOS UI automation"
read_when: read_when:
- Hosting PeekabooBridge in Clawdbot.app - Hosting PeekabooBridge in Clawdbot.app
- Integrating Peekaboo as a submodule - Integrating Peekaboo as a submodule
- Changing PeekabooBridge protocol/paths - Changing PeekabooBridge protocol/paths
--- ---
# Peekaboo Bridge in Clawdbot (macOS UI automation broker) # Peekaboo Bridge (macOS UI automation)
## TL;DR Clawdbot can host **PeekabooBridge** as a local, permission-aware UI automation
- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`). broker. This lets the `peekaboo` CLI drive UI automation while reusing the
- Clawdbot integrates by **optionally hosting the same bridge** inside **Clawdbot.app** (user-toggleable). The primary client is the **`peekaboo` CLI** (installed via npm); Clawdbot does not need its own `ui …` CLI surface. macOS app’s TCC permissions.
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdbot stays a thin broker host. No visualizer toggle in Clawdbot.
Non-goals: ## What this is (and isn’t)
- No auto-launching Peekaboo.app.
- No onboarding deep links from the automation endpoint (Clawdbot onboarding already handles permissions).
- No AI provider/agent runtime dependencies in Clawdbot (avoid pulling Tachikoma/MCP into the Clawdbot app/CLI).
## Big refactor (Dec 2025): XPC → Bridge - **Host**: Clawdbot.app can act as a PeekabooBridge host.
Peekaboo’s privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdbot this is a win: - **Client**: use the `peekaboo` CLI (no separate `clawdbot ui ...` surface).
- It matches the existing “local socket + codesign checks” approach. - **UI**: visual overlays stay in Peekaboo.app; Clawdbot is a thin broker host.
- It lets us piggyback on **either** Peekaboo.app’s permissions **or** Clawdbot.app’s permissions (whichever is running).
- It avoids “two apps with two TCC bubbles” unless needed.
Reference (Peekaboo submodule): `Peekaboo/docs/bridge-host.md`. ## Enable the bridge
## Architecture In the macOS app:
### Processes - Settings → **Enable Peekaboo Bridge**
- **Bridge hosts** (provide TCC-backed automation):
- **Peekaboo.app** (preferred; also provides visualizations + controls)
- **Claude.app** (secondary; lets `peekaboo` reuse Claude Desktop’s granted permissions)
- **Clawdbot.app** (secondary; “thin host” only)
- **Bridge clients** (trigger single actions):
- `peekaboo …` (preferred; humans + agents)
- Optional: Clawdbot/Node shells out to `peekaboo` when it needs UI automation/capture
### Host discovery (client-side) When enabled, Clawdbot starts a local UNIX socket server. If disabled, the host
Order is deliberate: is stopped and `peekaboo` will fall back to other available hosts.
1. Peekaboo.app host (full UX)
2. Claude.app host (piggyback on Claude Desktop permissions)
3. Clawdbot.app host (piggyback on Clawdbot permissions)
Socket paths (convention; exact paths must match Peekaboo): ## Client discovery order
- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
- Claude: `~/Library/Application Support/Claude/bridge.sock`
- Clawdbot: `~/Library/Application Support/clawdbot/bridge.sock`
No auto-launch: if a host isn’t reachable, the command fails with a clear error (start Peekaboo.app, Claude.app, or Clawdbot.app). Peekaboo clients typically try hosts in this order:
Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`. 1. Peekaboo.app (full UX)
2. Claude.app (if installed)
3. Clawdbot.app (thin broker)
### Protocol shape Use `peekaboo bridge status --verbose` to see which host is active and which
- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close. socket path is in use. You can override with:
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
- **Errors**: human-readable string by default; structured envelope in `--json`.
## Dependency strategy (submodule) ```bash
Integrate Peekaboo via git submodule (nested submodules are OK). export PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock
```
Path in Clawdbot repo: ## Security & permissions
- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps don’t churn).
What Clawdbot should use: - The bridge validates **caller code signatures**; TeamID `Y5PE65HELJ` is
- **Client side**: `PeekabooBridge` (socket client + protocol models). allowed by default (Peekaboo’s signing team), plus the Clawdbot app’s TeamID.
- **Host side (Clawdbot.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations. - Requests time out after ~10 seconds.
- If required permissions are missing, the bridge returns a clear error message
rather than launching System Settings.
What Clawdbot should *not* embed: ## Snapshot behavior (automation)
- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
- **XPC**: don’t reintroduce helper targets; use the bridge.
## IPC / CLI surface Snapshots are stored in memory and expire automatically after a short window.
### No `clawdbot ui …` If you need longer retention, recapture from the client.
We avoid a parallel “Clawdbot UI automation CLI”. Instead:
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
- Clawdbot.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdbot permissions when Peekaboo.app isn’t running.
### Diagnostics ## Troubleshooting
Use Peekaboo’s built-in diagnostics to see which host would be used:
- `peekaboo bridge status`
- `peekaboo bridge status --verbose`
- `peekaboo bridge status --json`
### Output format - If `peekaboo` reports “bridge client is not authorized”, ensure the client is
Peekaboo commands default to human text output. Add `--json` for a structured envelope. properly signed or run the host with `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`
in **debug** mode only.
### Timeouts - If no hosts are found, open one of the host apps (Peekaboo.app or Clawdbot.app)
Default timeout for UI actions: **10 seconds** end-to-end (client enforced; host should also enforce per-operation). and confirm permissions are granted.
## Coordinate model (multi-display)
Requirement: coordinates are **per screen**, not global.
Standardize for the CLI (agent-friendly): **top-left origin per screen**.
Proposed request shape:
- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
- Clawdbot.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
- Responses should echo both:
- The resolved `screenIndex`
- The local `{x, y}` and bounds
- Optionally the global `{x, y}` for debugging
Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
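The per-screen model can be sketched as follows. This assumes each screen frame is already expressed in a top-left-origin global space; the AppKit y-axis flip that a real `NSScreen`-based conversion would need is deliberately left out, and `ScreenFrame`, `Point`, and `toGlobalPoint` are hypothetical names.

```typescript
// Illustrative types. Frames are assumed to use a top-left-origin global space.
interface ScreenFrame {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Converts a per-screen local point into global coordinates.
// Returns null for an unknown screen index or a point outside the screen.
function toGlobalPoint(screens: ScreenFrame[], screenIndex: number, local: Point): Point | null {
  const screen = screens[screenIndex];
  if (screen === undefined) return null;

  const inside =
    local.x >= 0 && local.y >= 0 && local.x < screen.width && local.y < screen.height;
  if (!inside) return null;

  return { x: screen.x + local.x, y: screen.y + local.y };
}
```

Rejecting out-of-bounds points prevents a click meant for one display from landing on a neighboring one.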
## Targeting (per app/window)
Expose window/app targeting in the UI surface (align with Peekaboo targeting):
- frontmost
- by app name / bundle id
- by window title substring
- by (app, index)
Peekaboo CLI targeting (agent-friendly):
- `--bundle-id <id>` for app targeting
- `--window-index <n>` (0-based) for disambiguating within an app when capturing
All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
## “See” + click packs (Playwright-style)
Behavior stays aligned with Peekaboo:
- `peekaboo see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
- Follow-up actions reference those IDs without re-scanning.
`peekaboo see` should:
- capture (optionally targeted) window/screen
- return a screenshot **file path** (default: temp directory)
- return a list of elements (text or JSON)
Snapshot lifecycle requirement:
- Host apps are long-lived, so snapshot state should be **in-memory by default**.
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
Practical flow (agent-friendly):
- `peekaboo list apps` / `peekaboo list windows` provide bundle-id context for targeting.
- `peekaboo see --bundle-id X` updates the implicit snapshot for `X`.
- `peekaboo click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.
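The practical flow above can be sketched as a small shell helper. This is a minimal sketch, assuming `peekaboo` is on `PATH`; the function name `see_then_click` and the bundle id `com.example.App` are hypothetical, and only the subcommands and flags named in this section are used.

```bash
# Hypothetical helper: refresh the implicit snapshot for an app, then click
# an element by ID without passing --snapshot-id.
see_then_click() {
  local bundle_id="$1"   # e.g. com.example.App (placeholder)
  local element_id="$2"  # e.g. B1, as returned by `peekaboo see`

  # Update the implicit snapshot for this bundle id.
  peekaboo see --bundle-id "$bundle_id" || return 1

  # Reuse the most recent snapshot for the same bundle id.
  peekaboo click --bundle-id "$bundle_id" --on "$element_id"
}
```

Usage: `see_then_click com.example.App B1`. Because the snapshot is scoped per bundle id, both calls must target the same app.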
## Visualizer integration
Keep visualizations in **Peekaboo.app** for now.
- Clawdbot hosts the bridge, but does not render overlays.
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.
## Screenshots (legacy → Peekaboo takeover)
Clawdbot should not grow a separate screenshot CLI surface.
Migration plan:
- Use `peekaboo capture …` / `peekaboo see …` (returns a file path, default temp directory).
- Once Clawdbot legacy screenshot plumbing is replaced, remove it cleanly (no aliases).
## Permissions behavior
If required permissions are missing:
- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
- do not try to open System Settings from the automation endpoint
## Security (socket auth)
Both hosts must enforce:
- filesystem perms on the socket path (owner read/write only)
- server-side caller validation:
- require the caller's code signature TeamID to be `Y5PE65HELJ`
- optional bundle-id allowlist for tighter scoping
Debug-only escape hatch (development convenience):
- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
## Next integration steps (after this doc)
1. Add Peekaboo as a git submodule (nested submodules OK).
2. Host `PeekabooBridgeHost` inside Clawdbot.app behind a single setting (“Enable Peekaboo Bridge”, default on).
3. Ensure Clawdbot hosts the bridge at `~/Library/Application Support/clawdbot/bridge.sock` and speaks the PeekabooBridge JSON protocol.
4. Validate with `peekaboo bridge status --verbose` that Peekaboo can select Clawdbot as the fallback host (no auto-launch).
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).


@@ -3,25 +3,37 @@ summary: "How the mac app embeds the gateway WebChat and how to debug it"
read_when: read_when:
- Debugging mac WebChat view or loopback port - Debugging mac WebChat view or loopback port
--- ---
# Web Chat (macOS app) # WebChat (macOS app)
The macOS menu bar app shows the WebChat UI as a native SwiftUI view and reuses the **primary Clawd session** (`main`, or `global` when scope is global). The macOS menu bar app embeds the WebChat UI as a native SwiftUI view. It
connects to the Gateway and defaults to the **main session** for the selected
agent (with a session switcher for other sessions).
- **Local mode**: connects directly to the local Gateway WebSocket. - **Local mode**: connects directly to the local Gateway WebSocket.
- **Remote mode**: forwards the Gateway WebSocket control port over SSH and uses that as the data plane. - **Remote mode**: forwards the Gateway control port over SSH and uses that
tunnel as the data plane.
## Launch & debugging ## Launch & debugging
- Manual: Lobster menu → “Open Chat”. - Manual: Lobster menu → “Open Chat”.
- Auto-open for testing: run `dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat` (or pass `--webchat` to the binary launched by launchd). The window opens on startup. - Auto-open for testing:
- Logs: see [`./scripts/clawlog.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/clawlog.sh) (subsystem `com.clawdbot`, category `WebChatSwiftUI`). ```bash
dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat
```
- Logs: `./scripts/clawlog.sh` (subsystem `com.clawdbot`, category `WebChatSwiftUI`).
## How it's wired ## How it's wired
- Implementation: [`apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift) hosts `ClawdbotChatUI` and speaks to the Gateway over `GatewayConnection`.
- Data plane: Gateway WebSocket methods `chat.history`, `chat.send`, `chat.abort`; events `chat`, `agent`, `presence`, `tick`, `health`.
- Session: usually primary (`main`); multiple transports (WhatsApp/Telegram/Discord/Desktop) share the same key. The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
## Security / surface area - Data plane: Gateway WS methods `chat.history`, `chat.send`, `chat.abort` and
events `chat`, `agent`, `presence`, `tick`, `health`.
- Session: defaults to the primary session (`main`, or `global` when scope is
global). The UI can switch between sessions.
- Onboarding uses a dedicated session to keep first-run setup separate.
## Security surface
- Remote mode forwards only the Gateway WebSocket control port over SSH. - Remote mode forwards only the Gateway WebSocket control port over SSH.
## Known limitations ## Known limitations
- The UI is optimized for the primary session and typical “chat” usage (not a full browser-based sandbox surface).
- The UI is optimized for chat sessions (not a full browser sandbox).


@@ -3,7 +3,7 @@ summary: "macOS IPC architecture for Clawdbot app, gateway node bridge, and Peek
read_when: read_when:
- Editing IPC contracts or menu bar app IPC - Editing IPC contracts or menu bar app IPC
--- ---
# Clawdbot macOS IPC architecture (Dec 2025) # Clawdbot macOS IPC architecture
**Current model:** there is **no local control socket** and no `clawdbot-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge. **Current model:** there is **no local control socket** and no `clawdbot-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge.
@@ -21,10 +21,10 @@ read_when:
- UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol. - UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol.
- Host preference order (client-side): Peekaboo.app → Claude.app → Clawdbot.app → local execution. - Host preference order (client-side): Peekaboo.app → Claude.app → Clawdbot.app → local execution.
- Security: bridge hosts require TeamID `Y5PE65HELJ`; DEBUG-only same-UID escape hatch is guarded by `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (Peekaboo convention). - Security: bridge hosts require TeamID `Y5PE65HELJ`; DEBUG-only same-UID escape hatch is guarded by `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (Peekaboo convention).
- See: [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo) for the Clawdbot plan and naming. - See: [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo) for PeekabooBridge usage.
### Mach/XPC (future direction) ### Mach/XPC
- Still optional for internal app services, but **not required** for automation now that node.invoke is the surface. - Not required for automation; `node.invoke` + PeekabooBridge cover current needs.
## Operational flows ## Operational flows
- Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh` - Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
@@ -37,4 +37,4 @@ read_when:
- Prefer requiring a TeamID match for all privileged surfaces. - Prefer requiring a TeamID match for all privileged surfaces.
- PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development. - PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
- All communication remains local-only; no network sockets are exposed. - All communication remains local-only; no network sockets are exposed.
- TCC prompts originate only from the GUI app bundle; run [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) so the signed bundle ID stays stable. - TCC prompts originate only from the GUI app bundle; keep the signed bundle ID stable across rebuilds.


@@ -1,123 +1,97 @@
--- ---
summary: "Spec for the Clawdbot macOS companion menu bar app (gateway + node broker)" summary: "Clawdbot macOS companion app (menu bar + gateway broker)"
read_when: read_when:
- Implementing macOS app features - Implementing macOS app features
- Changing gateway lifecycle or node bridging on macOS - Changing gateway lifecycle or node bridging on macOS
--- ---
# Clawdbot macOS Companion (menu bar + gateway broker) # Clawdbot macOS Companion (menu bar + gateway broker)
Author: steipete · Status: draft spec · Date: 2025-12-20 The macOS app is the **menubar companion** for Clawdbot. It owns permissions,
manages the Gateway locally, and exposes macOS capabilities to the agent as a
node.
## Support snapshot ## What it does
- Core Gateway: supported (TypeScript on Node/Bun).
- Companion app: macOS menu bar app with permissions + node bridge.
- Install: [Getting Started](/start/getting-started) or [Install & updates](/install/updating).
- Gateway: [Runbook](/gateway) + [Configuration](/gateway/configuration).
## System control (launchd) - Shows native notifications and status in the menu bar.
If you run the bundled macOS app, it installs a per-user LaunchAgent labeled `com.clawdbot.gateway`. - Owns TCC prompts (Notifications, Accessibility, Screen Recording, Microphone,
CLI-only installs can use `clawdbot onboard --install-daemon`, `clawdbot daemon install`, or `clawdbot configure`**Gateway daemon**. Speech Recognition, Automation/AppleScript).
- Runs or connects to the Gateway (local or remote).
- Exposes macOS-only tools (Canvas, Camera, Screen Recording, `system.run`).
- Optionally hosts **PeekabooBridge** for UI automation.
- Installs a helper CLI (`clawdbot`) into `/usr/local/bin` and
`/opt/homebrew/bin` on request.
## Local vs remote mode
- **Local** (default): the app ensures a local Gateway is running via launchd.
- **Remote**: the app connects to a Gateway over SSH/Tailscale and never starts
a local process.
- **Attach-only** (debug): the app connects to an already-running local Gateway
and never spawns its own.
## Launchd control
The app manages a per-user LaunchAgent labeled `com.clawdbot.gateway`.
```bash ```bash
launchctl kickstart -k gui/$UID/com.clawdbot.gateway launchctl kickstart -k gui/$UID/com.clawdbot.gateway
launchctl bootout gui/$UID/com.clawdbot.gateway launchctl bootout gui/$UID/com.clawdbot.gateway
``` ```
`launchctl` only works if the LaunchAgent is installed; otherwise run `clawdbot daemon install` first. If the LaunchAgent isn't installed, enable it from the app or run
`clawdbot daemon install`.
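The check described above can be sketched in shell. This is a minimal sketch, assuming `launchctl print` exits non-zero when the service is not loaded; the function name `restart_gateway` is hypothetical.

```bash
# Restart the Gateway LaunchAgent if it is loaded; otherwise print a hint.
restart_gateway() {
  local uid target
  uid="$(id -u)"
  target="gui/${uid}/com.clawdbot.gateway"

  # Assumption: `launchctl print` fails when the service is not loaded.
  if launchctl print "$target" >/dev/null 2>&1; then
    launchctl kickstart -k "$target"
  else
    echo "LaunchAgent not installed; run: clawdbot daemon install" >&2
    return 1
  fi
}
```

Guarding the restart this way gives a readable hint in place of a raw `launchctl` error when the agent was never installed.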
Details: [Gateway runbook](/gateway) and [Bundled bun Gateway](/platforms/mac/bun). ## Node capabilities (mac)
## Purpose The macOS app presents itself as a node. Common commands:
- Single macOS menu-bar app named **Clawdbot** that:
- Shows native notifications for Clawdbot/clawdbot events.
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Automation/AppleScript, Microphone, Speech Recognition).
- Runs (or connects to) the **Gateway** and exposes itself as a **node** so agents can reach macOS-only features.
- Hosts **PeekabooBridge** for UI automation (consumed by `peekaboo`; see [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo)).
- Installs a single CLI (`clawdbot`) by symlinking the bundled binary.
## High-level design - Canvas: `canvas.present`, `canvas.navigate`, `canvas.eval`, `canvas.snapshot`, `canvas.a2ui.*`
- SwiftPM package in `apps/macos/` (macOS 15+, Swift 6). - Camera: `camera.snap`, `camera.clip`
- Targets:
- `ClawdbotIPC` (shared Codable types + helpers for app-internal actions).
- `Clawdbot` (LSUIElement MenuBarExtra app; hosts Gateway + node bridge + PeekabooBridgeHost).
- Bundle ID: `com.clawdbot.mac`.
- Bundled runtime binaries live under `Contents/Resources/Relay/`:
- `clawdbot` (bun-compiled relay: CLI + gateway)
- The app symlinks `clawdbot` into `/usr/local/bin` and `/opt/homebrew/bin`.
## Gateway + node bridge
- The mac app runs the Gateway in **local** mode (unless configured remote).
- The gateway port is configurable via `gateway.port` or `CLAWDBOT_GATEWAY_PORT` (default 18789). The mac app reads that value for launchd, probes, and remote SSH tunnels.
- The mac app connects to the bridge as a **node** and advertises capabilities/commands.
- Agent-facing actions are exposed via `node.invoke` (no local control socket).
- The mac app watches `~/.clawdbot/clawdbot.json` and switches modes live when `gateway.mode` or `gateway.remote.url` changes.
- If `gateway.mode` is unset but `gateway.remote.url` is set, the mac app treats it as remote mode.
- Changing connection mode in the mac app writes `gateway.mode` (and `gateway.remote.url` in remote mode) back to the config file.
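The mode-resolution rule in the list above can be written out as a small function. This is a sketch under stated assumptions: the mode values `local` and `remote` and the example URL are assumptions rather than confirmed config values, and the function name `effective_gateway_mode` is hypothetical.

```bash
# Resolve the effective mode from gateway.mode and gateway.remote.url.
# $1 = value of gateway.mode (may be empty), $2 = gateway.remote.url (may be empty)
effective_gateway_mode() {
  local mode="$1" remote_url="$2"

  if [ -n "$mode" ]; then
    # An explicit mode always wins.
    printf '%s\n' "$mode"
  elif [ -n "$remote_url" ]; then
    # Mode unset but a remote URL is configured: treat as remote.
    printf '%s\n' "remote"
  else
    # Nothing configured: fall back to local.
    printf '%s\n' "local"
  fi
}
```

For example, `effective_gateway_mode "" "ws://gateway.example:18789"` prints `remote` (the URL is a placeholder).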
### Node commands (mac)
- Canvas: `canvas.present|navigate|eval|snapshot|a2ui.*`
- Camera: `camera.snap|camera.clip`
- Screen: `screen.record` - Screen: `screen.record`
- System: `system.run` (shell) and `system.notify` - System: `system.run`, `system.notify`
### Permission advertising The node reports a `permissions` map so agents can decide what's allowed.
- Nodes include a `permissions` map in hello/pairing.
- The Gateway surfaces it via `node.list` / `node.describe` so agents can decide what to run.
## CLI (`clawdbot`) ## Deep links
- The **only** CLI is `clawdbot` (TS/bun). There is no `clawdbot-mac` helper.
- For mac-specific actions, the CLI uses `node.invoke`:
- `clawdbot nodes canvas present|navigate|eval|snapshot|a2ui push|a2ui reset`
- `clawdbot nodes run --node <id> -- <command...>`
- `clawdbot nodes notify --node <id> --title ...`
## Onboarding The app registers the `clawdbot://` URL scheme for local actions.
- Install CLI (symlink) → Permissions checklist → Test notification → Done.
- Remote mode skips local gateway/CLI steps.
- Selecting Local auto-enables the bundled Gateway via launchd (unless “Attach only” debug mode is enabled).
## Deep links (URL scheme)
Clawdbot (the macOS app) registers a URL scheme for triggering local actions from anywhere (browser, Shortcuts, CLI, etc.).
Scheme:
- `clawdbot://…`
### `clawdbot://agent` ### `clawdbot://agent`
Triggers a Gateway `agent` request (same machinery as WebChat/agent runs). Triggers a Gateway `agent` request.
Example:
```bash ```bash
open 'clawdbot://agent?message=Hello%20from%20deep%20link' open 'clawdbot://agent?message=Hello%20from%20deep%20link'
``` ```
Query parameters: Query parameters:
- `message` (required): the agent prompt (URL-encoded). - `message` (required)
- `sessionKey` (optional): explicit session key to use. - `sessionKey` (optional)
- `thinking` (optional): thinking hint (e.g. `low`; omit for default). - `thinking` (optional)
- `deliver` (optional): `true|false` (default: false). - `deliver` / `to` / `provider` (optional)
- `to` / `provider` (optional): forwarded to the Gateway `agent` method (only meaningful with `deliver=true`). - `timeoutSeconds` (optional)
- `timeoutSeconds` (optional): timeout hint forwarded to the Gateway. - `key` (optional unattended mode key)
- `key` (optional): unattended mode key (see below).
Safety/guardrails: Safety:
- Always enabled. - Without `key`, the app prompts for confirmation.
- Without a `key` query param, the app will prompt for confirmation before invoking the agent. - With a valid `key`, the run is unattended (intended for personal automations).
- With `key=<value>`, Clawdbot runs without prompting (intended for personal automations).
- The current key is shown in Debug Settings and stored locally in UserDefaults.
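A deep link with URL-encoded parameters can be built along these lines. This is a minimal sketch that assumes ASCII input; `urlencode` and `agent_deep_link` are hypothetical helper names, and only the `message` and `sessionKey` parameters listed above are used.

```bash
# Percent-encode a string for use as a URL query value (ASCII input assumed).
urlencode() {
  local s="$1" out="" c rest
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character
    s="$rest"
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s' "$out"
}

# Build a clawdbot://agent deep link; the session key argument is optional.
agent_deep_link() {
  local url
  url="clawdbot://agent?message=$(urlencode "$1")"
  if [ -n "${2:-}" ]; then
    url="$url&sessionKey=$(urlencode "$2")"
  fi
  printf '%s\n' "$url"
}
```

On macOS the result can be passed to `open`, for example `open "$(agent_deep_link 'Hello from deep link')"`. Without a `key` parameter the app prompts for confirmation before running.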
Notes: ## Onboarding flow (typical)
- In local mode, Clawdbot will start the local Gateway if needed before issuing the request.
- In remote mode, Clawdbot will use the configured remote tunnel/endpoint. 1) Install and launch **Clawdbot.app**.
2) Complete the permissions checklist (TCC prompts).
3) Ensure **Local** mode is active and the Gateway is running.
4) Install the CLI helper if you want terminal access.
## Build & dev workflow (native) ## Build & dev workflow (native)
- `cd apps/macos && swift build` (debug) / `swift build -c release`.
- Run app for dev: `swift run Clawdbot` (or Xcode scheme).
- Package app + CLI: [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) (builds bun CLI + gateway).
- Tests: add Swift Testing suites under `apps/macos/Tests`.
## Open questions / decisions - `cd apps/macos && swift build`
- Should `system.run` support streaming stdout/stderr or keep buffered responses only? - `swift run Clawdbot` (or Xcode)
- Should we allow node-side permission prompts, or always require explicit app UI action?
## Related docs
- [Gateway runbook](/gateway)
- [Bundled bun Gateway](/platforms/mac/bun)
- [macOS permissions](/platforms/mac/permissions)
- [Canvas](/platforms/mac/canvas)


@@ -107,18 +107,16 @@ Slack's Conversations API is type-scoped: you only need the scopes for the
conversation types you actually touch (channels, groups, im, mpim). See conversation types you actually touch (channels, groups, im, mpim). See
https://api.slack.com/docs/conversations-api for the overview. https://api.slack.com/docs/conversations-api for the overview.
### Required by current code ### Required scopes
- `chat:write` (send/update/delete messages via `chat.postMessage`) - `chat:write` (send/update/delete messages via `chat.postMessage`)
https://api.slack.com/methods/chat.postMessage https://api.slack.com/methods/chat.postMessage
- `im:write` (open DMs via `conversations.open` for user DMs) - `im:write` (open DMs via `conversations.open` for user DMs)
https://api.slack.com/methods/conversations.open https://api.slack.com/methods/conversations.open
- `channels:history`, `groups:history`, `im:history`, `mpim:history` - `channels:history`, `groups:history`, `im:history`, `mpim:history`
(`conversations.history` in [`src/slack/actions.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/slack/actions.ts))
https://api.slack.com/methods/conversations.history https://api.slack.com/methods/conversations.history
- `channels:read`, `groups:read`, `im:read`, `mpim:read` - `channels:read`, `groups:read`, `im:read`, `mpim:read`
(`conversations.info` in [`src/slack/monitor.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/slack/monitor.ts))
https://api.slack.com/methods/conversations.info https://api.slack.com/methods/conversations.info
- `users:read` (`users.info` in [`src/slack/monitor.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/slack/monitor.ts) + [`src/slack/actions.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/slack/actions.ts)) - `users:read` (user lookup)
https://api.slack.com/methods/users.info https://api.slack.com/methods/users.info
- `reactions:read`, `reactions:write` (`reactions.get` / `reactions.add`) - `reactions:read`, `reactions:write` (`reactions.get` / `reactions.add`)
https://api.slack.com/methods/reactions.get https://api.slack.com/methods/reactions.get


@@ -188,8 +188,3 @@ Recommended for personal numbers:
- Subsystems: `whatsapp/inbound`, `whatsapp/outbound`, `web-heartbeat`, `web-reconnect`. - Subsystems: `whatsapp/inbound`, `whatsapp/outbound`, `web-heartbeat`, `web-reconnect`.
- Log file: `/tmp/clawdbot/clawdbot-YYYY-MM-DD.log` (configurable). - Log file: `/tmp/clawdbot/clawdbot-YYYY-MM-DD.log` (configurable).
- Troubleshooting guide: [`docs/troubleshooting.md`](/gateway/troubleshooting). - Troubleshooting guide: [`docs/troubleshooting.md`](/gateway/troubleshooting).
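To follow today's log from a terminal, the dated path can be computed in shell. This is a minimal sketch, assuming the default log location and that the date in the file name follows local time; `today_log_path` is a hypothetical helper name.

```bash
# Print the default log path for today's date (local time assumed).
today_log_path() {
  printf '/tmp/clawdbot/clawdbot-%s.log\n' "$(date +%Y-%m-%d)"
}
```

Usage: `tail -f "$(today_log_path)"`. If the log location has been reconfigured, substitute the configured path.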
## Tests
- [`src/web/auto-reply.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/auto-reply.test.ts) (mention gating, history injection, reply flow)
- [`src/web/monitor-inbox.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/monitor-inbox.test.ts) (inbound parsing + reply context)
- [`src/web/outbound.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/outbound.test.ts) (send mapping + media)


@@ -125,7 +125,7 @@ Use these hubs to discover every page, including deep dives and reference docs t
- [Linux](https://docs.clawd.bot/platforms/linux) - [Linux](https://docs.clawd.bot/platforms/linux)
- [Web surfaces](https://docs.clawd.bot/web) - [Web surfaces](https://docs.clawd.bot/web)
## macOS companion app (internals) ## macOS companion app (advanced)
- [macOS dev setup](https://docs.clawd.bot/platforms/mac/dev-setup) - [macOS dev setup](https://docs.clawd.bot/platforms/mac/dev-setup)
- [macOS menu bar](https://docs.clawd.bot/platforms/mac/menu-bar) - [macOS menu bar](https://docs.clawd.bot/platforms/mac/menu-bar)
@@ -144,7 +144,7 @@ Use these hubs to discover every page, including deep dives and reference docs t
- [macOS bun gateway](https://docs.clawd.bot/platforms/mac/bun) - [macOS bun gateway](https://docs.clawd.bot/platforms/mac/bun)
- [macOS XPC](https://docs.clawd.bot/platforms/mac/xpc) - [macOS XPC](https://docs.clawd.bot/platforms/mac/xpc)
- [macOS skills](https://docs.clawd.bot/platforms/mac/skills) - [macOS skills](https://docs.clawd.bot/platforms/mac/skills)
- [macOS Peekaboo plan](https://docs.clawd.bot/platforms/mac/peekaboo) - [macOS Peekaboo](https://docs.clawd.bot/platforms/mac/peekaboo)
## Workspace + templates ## Workspace + templates
@@ -160,13 +160,13 @@ Use these hubs to discover every page, including deep dives and reference docs t
- [Templates: TOOLS](https://docs.clawd.bot/reference/templates/TOOLS) - [Templates: TOOLS](https://docs.clawd.bot/reference/templates/TOOLS)
- [Templates: USER](https://docs.clawd.bot/reference/templates/USER) - [Templates: USER](https://docs.clawd.bot/reference/templates/USER)
## Experiments + proposals ## Experiments (exploratory)
- [Onboarding config protocol](https://docs.clawd.bot/experiments/onboarding-config-protocol) - [Onboarding config protocol](https://docs.clawd.bot/experiments/onboarding-config-protocol)
- [Plan: cron hardening](https://docs.clawd.bot/experiments/plans/cron-add-hardening) - [Cron hardening notes](https://docs.clawd.bot/experiments/plans/cron-add-hardening)
- [Plan: group policy hardening](https://docs.clawd.bot/experiments/plans/group-policy-hardening) - [Group policy hardening notes](https://docs.clawd.bot/experiments/plans/group-policy-hardening)
- [Research: memory](https://docs.clawd.bot/experiments/research/memory) - [Research: memory](https://docs.clawd.bot/experiments/research/memory)
- [Proposal: model config](https://docs.clawd.bot/experiments/proposals/model-config) - [Model config exploration](https://docs.clawd.bot/experiments/proposals/model-config)
## Testing + release ## Testing + release


@@ -1,207 +1,102 @@
--- ---
summary: "Planned first-run onboarding flow for Clawdbot (local vs remote, OAuth auth, workspace bootstrap ritual)" summary: "First-run onboarding flow for Clawdbot (macOS app)"
read_when: read_when:
- Designing the macOS onboarding assistant - Designing the macOS onboarding assistant
- Implementing Anthropic/OpenAI auth or identity setup - Implementing auth or identity setup
--- ---
# Onboarding (macOS app) # Onboarding (macOS app)
This doc describes the intended **first-run onboarding** for Clawdbot. The goal is a good “day 0” experience: pick where the Gateway runs, bind subscription auth (Anthropic or OpenAI) for the embedded agent runtime, and then let the **agent bootstrap itself** via a first-run ritual in the workspace. This doc describes the **current** first-run onboarding flow. The goal is a
smooth “day 0” experience: pick where the Gateway runs, connect auth, run the
wizard, and let the agent bootstrap itself.
## Page order (high level) ## Page order (current)
1) **Local vs Remote** 1) Welcome + security notice
2) **(Local only)** Connect subscription auth (Anthropic / OpenAI OAuth) — optional, but recommended 2) **Gateway selection** (Local / Remote / Configure later)
3) **Connect Gmail (optional)** — run `clawdbot hooks gmail setup` to configure Pub/Sub hooks 3) **Auth (Anthropic OAuth)** — local only
4) **Onboarding chat** — dedicated session where the agent introduces itself and guides setup 4) **Setup Wizard** (Gateway-driven)
5) **Permissions** (TCC prompts)
6) **CLI helper** (optional)
7) **Onboarding chat** (dedicated session)
8) Ready
## 1) Local vs Remote ## 1) Local vs Remote
First question: where does the **Gateway** run? Where does the **Gateway** run?
- **Local (this Mac):** onboarding can run OAuth flows and write OAuth credentials locally. - **Local (this Mac):** onboarding can run OAuth flows and write credentials
- **Remote (over SSH/tailnet):** onboarding must not run OAuth locally, because credentials must exist on the **gateway host**. locally.
- **Remote (over SSH/Tailnet):** onboarding does **not** run OAuth locally;
credentials must exist on the gateway host.
- **Configure later:** skip setup and leave the app unconfigured.
Gateway auth tip: Gateway auth tip:
- If you only use Clawdbot on this Mac (loopback gateway), keep auth **Off**. - If you only use Clawdbot locally (loopback), auth can be **Off**.
- Use **Token** for multi-machine access or non-loopback binds. - Use a **token** for multi-machine access or non-loopback binds.
Implementation note (2025-12-19): in local mode, the macOS app bundles the Gateway and enables it via a per-user launchd LaunchAgent (no global npm install/Node requirement for the user). ## 2) Local-only auth (Anthropic OAuth)
## 2) Local-only: Connect subscription auth (Anthropic / OpenAI OAuth) The macOS app supports Anthropic OAuth (Claude Pro/Max). The flow:
This is the “bind Clawdbot to subscription auth” step. It is explicitly the **Anthropic (Claude Pro/Max)** or **OpenAI (ChatGPT/Codex)** OAuth flow, not a generic “login”. - Opens the browser for OAuth (PKCE)
- Asks the user to paste the `code#state` value
- Writes credentials to `~/.clawdbot/credentials/oauth.json`
More detail: [/concepts/oauth](/concepts/oauth) Other providers (OpenAI, custom APIs) are configured via environment variables
or config files for now.
### Recommended: OAuth (Anthropic) ## 3) Setup Wizard (Gateway-driven)
The macOS app should: The app can run the same setup wizard as the CLI. This keeps onboarding in sync
- Start the Anthropic OAuth (PKCE) flow in the user's browser. with Gateway-side behavior and avoids duplicating logic in SwiftUI.
- Ask the user to paste the `code#state` value.
- Exchange it for tokens and write credentials to:
- `~/.clawdbot/credentials/oauth.json` (file mode `0600`, directory mode `0700`)
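The file and directory modes above can be applied from a shell. This is a minimal sketch that only tightens permissions and never writes credential contents; `secure_oauth_store` and its optional base-directory argument are hypothetical and exist only for illustration.

```bash
# Tighten permissions on the OAuth store.
# $1 = base directory (hypothetical override; defaults to ~/.clawdbot)
secure_oauth_store() {
  local base="${1:-$HOME/.clawdbot}"
  local dir="$base/credentials"
  local file="$dir/oauth.json"

  # Directory: owner-only access (0700).
  mkdir -p "$dir" || return 1
  chmod 700 "$dir" || return 1

  # File: owner read/write only (0600), if it already exists.
  if [ -f "$file" ]; then
    chmod 600 "$file"
  fi
}
```

Running `secure_oauth_store` with no arguments targets `~/.clawdbot/credentials`. It leaves a missing `oauth.json` alone, since the OAuth flow is what writes that file.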
Why this location matters: it's the Clawdbot-owned OAuth store. ## 4) Permissions
Clawdbot also imports `oauth.json` into the agent auth profile store (`~/.clawdbot/agents/<agentId>/agent/auth-profiles.json`) on first use.
### Recommended: OAuth (OpenAI Codex) Onboarding requests TCC permissions needed for:
The macOS app should: - Notifications
- Start the OpenAI Codex OAuth (PKCE) flow in the user's browser. - Accessibility
- Auto-capture the callback on `http://127.0.0.1:1455/auth/callback` when possible. - Screen Recording
- If the callback fails, prompt the user to paste the redirect URL or code. - Microphone / Speech Recognition
- Store credentials in `~/.clawdbot/credentials/oauth.json` (same OAuth store as Anthropic). - Automation (AppleScript)
- Set `agent.model` to `openai-codex/gpt-5.2` when the model is unset or `openai/*`.
### Alternative: API key (instructions only) ## 5) CLI helper (optional)
Offer an “API key” option, but for now it is **instructions only**: The app can symlink the bundled `clawdbot` CLI into `/usr/local/bin` and
- Get an Anthropic API key. `/opt/homebrew/bin` so terminal workflows work out of the box.
- Provide it to Clawdbot via your preferred mechanism (env/config).
Note: environment variables are often confusing when the Gateway is launched by a GUI app (launchd environment != your shell). ## 6) Onboarding chat (dedicated session)
### Model safety rule After setup, the app opens a dedicated onboarding chat session so the agent can
introduce itself and guide next steps. This keeps firstrun guidance separate
from your normal conversation.
Clawdbot should **always pass** `--model` when invoking the embedded agent (don't rely on defaults). ## Agent bootstrap ritual
Example (CLI): On the first agent run, Clawdbot bootstraps a workspace (default `~/clawd`):
```bash - Seeds `AGENTS.md`, `BOOTSTRAP.md`, `IDENTITY.md`, `USER.md`
clawdbot agent --mode rpc --model anthropic/claude-opus-4-5 "<message>" - Runs a short Q&A ritual (one question at a time)
``` - Writes identity + preferences to `IDENTITY.md`, `USER.md`, `SOUL.md`
- Removes `BOOTSTRAP.md` when finished so it only runs once
If the user skips auth, onboarding should be clear: the agent likely won't respond until auth is configured. ## Optional: Gmail hooks (manual)
## 4) Onboarding chat (dedicated session) Gmail Pub/Sub setup is currently a manual step. Use:
The onboarding flow now embeds the SwiftUI chat view directly. It uses a **special session key**
(`onboarding`) so the “newborn agent” ritual stays separate from the main chat.
This onboarding chat is where the agent:
- does the BOOTSTRAP.md identity ritual (one question at a time)
- visits **soul.md** with the user and writes `SOUL.md` (values, tone, boundaries)
- asks how the user wants to talk (web-only / Telegram / WhatsApp)
- guides linking steps (including showing a QR inline for WhatsApp via the `whatsapp_login` tool)
If the workspace bootstrap is already complete (BOOTSTRAP.md removed), the onboarding chat step is skipped.
## 2.5) Optional: Connect Gmail
The macOS onboarding includes an optional Gmail step. It runs:
```bash ```bash
clawdbot hooks gmail setup --account you@gmail.com clawdbot hooks gmail setup --account you@gmail.com
``` ```
This writes the full `hooks.gmail` config, installs `gcloud` / `gog` / `tailscale` See [/automation/gmail-pubsub](/automation/gmail-pubsub) for details.
via Homebrew if needed, and configures the Pub/Sub push endpoint. After setup,
restart the gateway so the internal Gmail watcher starts.
Once setup is complete, the user can switch to the normal chat (`main`) via the menu bar panel. ## Remote mode notes
## 5) Agent bootstrap ritual (outside onboarding) When the Gateway runs on another machine, credentials and workspace files live
**on that host**. If you need OAuth in remote mode, create:
We no longer collect identity in the onboarding wizard. Instead, the **first agent run** performs a playful bootstrap ritual using files in the workspace: - `~/.clawdbot/credentials/oauth.json`
- `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json`
- Workspace is created implicitly (default `~/clawd`, configurable via `agent.workspace`) when local is selected, on the gateway host.
but only if the folder is empty or already contains `AGENTS.md`.
- Files are seeded: `AGENTS.md`, `BOOTSTRAP.md`, `IDENTITY.md`, `USER.md`.
- `BOOTSTRAP.md` tells the agent to keep it conversational:
- open with a cute hello
- ask **one question at a time** (no multi-question bombardment)
- offer a small set of suggestions where helpful (name, creature, emoji)
  - wait for the user's reply before asking the next question
- The agent writes results to:
- `IDENTITY.md` (agent name, vibe/creature, emoji)
- `USER.md` (who the user is + how they want to be addressed)
- `SOUL.md` (identity, tone, boundaries — crafted from the soul.md prompt)
- `~/.clawdbot/clawdbot.json` (structured identity defaults)
- After the ritual, the agent **deletes `BOOTSTRAP.md`** so it only runs once.
Identity data still feeds the same defaults as before:
- outbound prefix emoji (`messages.responsePrefix`)
- group mention patterns / wake words
- default session intro (“You are Samantha…”)
- macOS UI labels
## 6) Workspace notes (no explicit onboarding step)
The workspace is created automatically as part of agent bootstrap (no dedicated onboarding screen).
Recommendation: treat the workspace as the agent's “memory” and make it a git repo (ideally private) so identity + memories are backed up:
```bash
cd ~/clawd
git init
git add AGENTS.md
git commit -m "Add agent workspace"
```
Daily memory lives under `memory/` in the workspace:
- one file per day: `memory/YYYY-MM-DD.md`
- read today + yesterday on session start
- keep it short (durable facts, preferences, decisions; avoid secrets)
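The daily memory convention above can be sketched as a small path helper. This is only an illustration: `formatDay` and `memoryFilesFor` are hypothetical names, and the sketch assumes dates are taken in local time, which the text does not specify.

```typescript
// Hypothetical helper: format a Date as YYYY-MM-DD using local time.
function formatDay(date: Date): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return date.getFullYear() + "-" + pad(date.getMonth() + 1) + "-" + pad(date.getDate());
}

// Returns the memory files to read on session start: today first, then yesterday.
function memoryFilesFor(now: Date): string[] {
  // Day 0 or negative days roll back into the previous month/year automatically.
  const yesterday = new Date(now.getFullYear(), now.getMonth(), now.getDate() - 1);
  return [formatDay(now), formatDay(yesterday)].map((day) => "memory/" + day + ".md");
}
```

The returned paths are relative to the workspace root (default `~/clawd`).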
## Remote mode note (why OAuth is hidden)
If the Gateway runs on another machine, OAuth credentials must be created/stored on that host (where the agent runtime runs).
For now, remote onboarding should:
- explain why OAuth isn't shown
- point the user at the credential location (`~/.clawdbot/credentials/oauth.json`) and the auth profile store (`~/.clawdbot/agents/<agentId>/agent/auth-profiles.json`) on the gateway host
- mention that the **bootstrap ritual happens on the gateway host** (same BOOTSTRAP/IDENTITY/USER files)
### Manual credential setup
On the gateway host, create `~/.clawdbot/credentials/oauth.json` with this exact format:
```json
{
"anthropic": { "type": "oauth", "access": "sk-ant-oat01-...", "refresh": "sk-ant-ort01-...", "expires": 1767304352803 },
"openai-codex": { "type": "oauth", "access": "eyJhbGciOi...", "refresh": "oai-refresh-...", "expires": 1767304352803, "accountId": "acct_..." }
}
```
Set permissions: `chmod 600 ~/.clawdbot/credentials/oauth.json`
**Note:** Clawdbot can import from legacy pi-coding-agent paths (`~/.pi/agent/oauth.json`, etc.), but Claude Code/Codex CLI credentials live in different files.
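As a rough aid for checking a hand-written `oauth.json`, the entry shape from the example above can be described as a type with a simple expiry check. This is a sketch under assumptions: `OAuthEntry` and `isExpired` are hypothetical names, `expires` is assumed to be a Unix timestamp in milliseconds (consistent with the sample value), and the 60-second safety margin is arbitrary.

```typescript
// Shape of one provider entry, mirroring the fields in the JSON example.
interface OAuthEntry {
  type: "oauth";
  access: string;
  refresh: string;
  expires: number; // assumed: Unix epoch in milliseconds
  accountId?: string; // present in the openai-codex example only
}

// Hypothetical check: true when the access token is expired or about to expire.
function isExpired(entry: OAuthEntry, nowMs: number, skewMs: number = 60000): boolean {
  return entry.expires - skewMs <= nowMs;
}
```

A caller would pass `Date.now()` as `nowMs`; how Clawdbot itself handles refresh is not covered here.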
### Using Claude Code + Codex CLI credentials (direct)
If these CLIs are installed on the **gateway host** and you've already signed in, Clawdbot auto-syncs their OAuth tokens into the per-agent auth profile store (`~/.clawdbot/agents/<agentId>/agent/auth-profiles.json`) on load:
- **Claude Code**: reads `~/.claude/.credentials.json` → profile `anthropic:claude-cli`
- **Codex CLI**: reads `~/.codex/auth.json` → profile `openai-codex:codex-cli`
Verification:
```bash
clawdbot providers list
```
### Fallback: convert Claude Code credentials into `oauth.json`
If you don't want to install Claude Code on the gateway host, you can still seed the legacy import file:
```bash
cat ~/.claude/.credentials.json | jq '{
anthropic: {
type: "oauth",
access: .claudeAiOauth.accessToken,
refresh: .claudeAiOauth.refreshToken,
expires: .claudeAiOauth.expiresAt
}
}' > ~/.clawdbot/credentials/oauth.json
chmod 600 ~/.clawdbot/credentials/oauth.json
```
## Workspace backup (recommended)
We suggest creating a **private GitHub repository** to back up the agent
workspace. The agent is really good at keeping a git repo in shape, and GitHub
is the perfect place for it. Keep it **private**.
Setup steps: https://docs.clawd.bot/concepts/agent-workspace

View File

@@ -43,10 +43,6 @@ Stored under `~/.clawdbot/credentials/`:
Treat these as sensitive (they gate access to your assistant). Treat these as sensitive (they gate access to your assistant).
### Source of truth (code)
- DM pairing storage + code generation: [`src/pairing/pairing-store.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/pairing/pairing-store.ts)
- CLI commands: [`src/cli/pairing-cli.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/pairing-cli.ts)
## 2) Node pairing (iOS/Android nodes joining the gateway) ## 2) Node pairing (iOS/Android nodes joining the gateway)
@@ -70,11 +66,6 @@ Stored under `~/.clawdbot/nodes/`:
Full protocol + design notes: [Gateway pairing](/gateway/pairing) Full protocol + design notes: [Gateway pairing](/gateway/pairing)
### Source of truth (code)
- Node pairing store (pending/paired + token issuance): [`src/infra/node-pairing.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/node-pairing.ts)
- Gateway methods/events (`node.pair.*`): [`src/gateway/server-methods/nodes.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/nodes.ts)
- CLI: [`src/cli/nodes-cli.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/nodes-cli.ts)
## Related docs ## Related docs

View File

@@ -55,7 +55,7 @@ clawdbot providers login
clawdbot health clawdbot health
``` ```
If onboarding is still WIP/broken on your build: If onboarding is not available in your build:
- Run `clawdbot setup`, then `clawdbot providers login`, then start the Gateway manually (`clawdbot gateway`). - Run `clawdbot setup`, then `clawdbot providers login`, then start the Gateway manually (`clawdbot gateway`).
## Bleeding edge workflow (Gateway in a terminal) ## Bleeding edge workflow (Gateway in a terminal)
@@ -77,7 +77,7 @@ pnpm install
pnpm gateway:watch pnpm gateway:watch
``` ```
`gateway:watch` runs `src/entry.ts gateway --force` and reloads on [`src/**/*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/**/*.ts) changes. `gateway:watch` runs the gateway in watch mode and reloads on TypeScript changes.
### 2) Point the macOS app at your running Gateway ### 2) Point the macOS app at your running Gateway

View File

@@ -1,21 +1,44 @@
--- ---
summary: "Design notes for a direct `clawdbot agent` CLI subcommand without WhatsApp delivery" summary: "Direct `clawdbot agent` CLI runs (with optional delivery)"
read_when: read_when:
- Adding or modifying the agent CLI entrypoint - Adding or modifying the agent CLI entrypoint
--- ---
# `clawdbot agent` (direct-to-agent invocation) # `clawdbot agent` (direct agent runs)
`clawdbot agent` lets you talk to the **embedded** agent runtime directly (no chat send unless you opt in), while reusing the same session store and thinking/verbose persistence as inbound auto-replies. `clawdbot agent` runs a single agent turn without needing an inbound chat message.
By default it goes **through the Gateway**; add `--local` to force the embedded
runtime on the current machine.
## Behavior ## Behavior
- Required: `--message <text>` - Required: `--message <text>`
- Session selection: - Session selection:
- If `--session-id` is given, reuse it. - `--to <E.164>` derives the session key (normal direct-chat routing), **or**
- Else if `--to <e164>` is given, derive the session key from `session.scope` (direct chats collapse to `main`, or `global` when scope is global). - `--session-id <id>` reuses an existing session by id
- Runs the embedded Pi agent (configured via `agent`). - Runs the same embedded agent runtime as normal inbound replies.
- Thinking/verbose: - Thinking/verbose flags persist into the session store.
- Flags `--thinking <off|minimal|low|medium|high>` and `--verbose <on|off>` persist into the session store.
- Output: - Output:
- Default: prints text (and `MEDIA:<url>` lines) to stdout. - default: prints reply text (plus `MEDIA:<url>` lines)
- `--json`: prints structured payloads + meta. - `--json`: prints structured payload + metadata
- Optional: `--deliver` sends the reply back to the selected provider (`whatsapp`, `telegram`, `discord`, `signal`, `imessage`). - Optional delivery back to a provider with `--deliver` + `--provider`.
If the Gateway is unreachable, the CLI **falls back** to the embedded local run.
## Examples
```bash
clawdbot agent --to +15555550123 --message "status update"
clawdbot agent --session-id 1234 --message "Summarize inbox" --thinking medium
clawdbot agent --to +15555550123 --message "Trace logs" --verbose on --json
clawdbot agent --to +15555550123 --message "Summon reply" --deliver
```
## Flags
- `--local`: run locally (requires provider keys in your shell)
- `--deliver`: send the reply to the chosen provider (requires `--to`)
- `--provider`: `whatsapp|telegram|discord|slack|signal|imessage` (default: `whatsapp`)
- `--thinking <off|minimal|low|medium|high>`: persist thinking level
- `--verbose <on|off>`: persist verbose level
- `--timeout <seconds>`: override agent timeout
- `--json`: output structured JSON
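When scripting around the default (non-JSON) output, the reply text and the `MEDIA:<url>` lines can be separated with a small parser. The sketch below assumes each media reference sits on its own line and starts at column 0; `AgentOutput` and `splitAgentOutput` are hypothetical names, not part of the CLI.

```typescript
interface AgentOutput {
  text: string;
  mediaUrls: string[];
}

// Splits plain-text CLI output into reply text and MEDIA:<url> lines.
function splitAgentOutput(stdout: string): AgentOutput {
  const prefix = "MEDIA:";
  const textLines: string[] = [];
  const mediaUrls: string[] = [];
  const lines = stdout.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line.indexOf(prefix) === 0) {
      // Keep only the URL part after the prefix.
      mediaUrls.push(line.slice(prefix.length).trim());
    } else {
      textLines.push(line);
    }
  }
  return { text: textLines.join("\n").trim(), mediaUrls: mediaUrls };
}
```

For anything beyond quick scripts, `--json` is likely the safer choice, since it avoids guessing at line formats.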

View File

@@ -1,254 +1,142 @@
--- ---
summary: "Spec: integrated browser control server + action commands" summary: "Integrated browser control server + action commands"
read_when: read_when:
- Adding agent-controlled browser automation - Adding agent-controlled browser automation
- Debugging why clawd is interfering with your own Chrome - Debugging why clawd is interfering with your own Chrome
- Implementing browser settings + lifecycle in the macOS app - Implementing browser settings + lifecycle in the macOS app
--- ---
# Browser (integrated) — clawd-managed Chrome # Browser (clawd-managed)
Status: draft spec · Date: 2025-12-20 Clawdbot can run a **dedicated Chrome/Chromium profile** that the agent controls.
It is isolated from your personal browser and is managed through a small local
control server.
Goal: give the **clawd** persona its own browser that is: ## What you get
- Visually distinct (lobster-orange, profile labeled "clawd").
- Fully agent-manageable (start/stop, list tabs, focus/close tabs, open URLs, screenshot).
- Non-interfering with the user's own browser (separate profile + dedicated ports).
This doc covers the macOS app/gateway side. It intentionally does not mandate - A separate browser profile named **clawd** (orange accent by default).
Playwright vs Puppeteer; the key is the **contract** and the **separation guarantees**. - Deterministic tab control (list/open/focus/close).
- Agent actions (click/type/drag/select), snapshots, screenshots, PDFs.
- Optional multi-profile support (`clawd`, `work`, `remote`, ...).
## User-facing settings This browser is **not** your daily driver. It is a safe, isolated surface for
agent automation and verification.
Add a dedicated settings section (preferably under **Skills** or its own "Browser" tab): ## Quick start
- **Enable clawd browser** (`default: on`) ```bash
- When off: no browser is launched, and browser tools return "disabled". clawdbot browser status
- **Browser control URL** (`default: http://127.0.0.1:18791`) clawdbot browser start
- Interpreted as the base URL of the local/remote browser-control server. clawdbot browser open https://example.com
- If the URL host is not loopback, Clawdbot must **not** attempt to launch a local clawdbot browser snapshot
browser; it only connects. ```
- **CDP URL** (`default: controlUrl + 1`)
- Base URL for Chrome DevTools Protocol (e.g. `http://127.0.0.1:18792`).
- Set this to a non-loopback host to attach the local control server to a remote
Chrome/Chromium CDP endpoint (SSH/Tailscale tunnel recommended).
- If the CDP URL host is non-loopback, clawd does **not** auto-launch a local browser.
- If you tunnel a remote CDP to `localhost`, set **Attach to existing only** to
avoid accidentally launching a local browser.
- **Accent color** (`default: #FF4500`, "lobster-orange")
- Used to theme the clawd browser profile (best-effort) and to tint UI indicators
in Clawdbot.
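The launch rules in the settings above (never launch when a URL host is non-loopback, never launch in attach-only mode) can be summarized as one predicate. This is a minimal sketch, not the actual implementation: the function names are hypothetical, the field names follow the `browser` config keys, and treating `[::1]` as loopback is an assumption beyond the `127.0.0.1`/`localhost` hosts named in the text.

```typescript
// Extracts the host from an http(s) URL (IPv6 literals keep their brackets).
function hostOf(url: string): string {
  const match = /^https?:\/\/(\[[^\]]+\]|[^\/:?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : "";
}

// Loopback hosts named in the text, plus IPv6 ::1 (an assumption).
function isLoopback(url: string): boolean {
  const host = hostOf(url);
  return host === "127.0.0.1" || host === "localhost" || host === "[::1]";
}

interface BrowserSettings {
  enabled: boolean;
  controlUrl: string;
  cdpUrl: string;
  attachOnly: boolean;
}

// True only when every rule allows launching a local browser.
function mayLaunchLocalBrowser(s: BrowserSettings): boolean {
  return s.enabled && !s.attachOnly && isLoopback(s.controlUrl) && isLoopback(s.cdpUrl);
}
```

Note the tunnel case: a remote CDP endpoint forwarded to `localhost` looks like loopback to this check, which is why the text recommends enabling attach-only there.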
Optional (advanced, can be hidden behind Debug initially): If you get “Browser disabled”, enable it in config (see below) and restart the
- **Use headless browser** (`default: off`) Gateway.
- **Attach to existing only** (`default: off`) — if on, never launch; only connect if
already running.
- **Browser executable path** (override, optional)
- **No sandbox** (`default: off`) — adds `--no-sandbox` + `--disable-setuid-sandbox`
### Port convention ## Configuration
Clawdbot already uses: Browser settings live in `~/.clawdbot/clawdbot.json`.
- Gateway WebSocket: `18789`
- Bridge (voice/node): `18790`
For the clawd browser-control server, use "family" ports: ```json5
- Browser control HTTP API: `18791` (bridge + 1)
- Browser CDP/debugging port: `18792` (control + 1)
- Canvas host HTTP: `18793` by default, mounted at `/__clawdbot__/canvas/`
The user usually only configures the **control URL** (port `18791`). CDP is an
internal detail.
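The "family" port convention reduces to simple arithmetic on the Gateway port. The sketch below assumes the defaults listed above and the rule that control = gateway + 2 and CDP = control + 1; `BrowserPorts` and `defaultBrowserPorts` are hypothetical names used only for illustration.

```typescript
interface BrowserPorts {
  control: number;
  cdp: number;
}

// Derives the default browser ports from the Gateway port.
function defaultBrowserPorts(gatewayPort: number = 18789): BrowserPorts {
  // With defaults: gateway 18789, bridge 18790, control 18791, CDP 18792.
  const control = gatewayPort + 2;
  return { control: control, cdp: control + 1 };
}
```

With the default Gateway port this yields `18791` and `18792`; an overridden Gateway port shifts both by the same amount.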
## Browser isolation guarantees (non-negotiable)
1) **Dedicated user data dir**
- Never attach to or reuse the user's default Chrome profile.
- Store clawd browser state under an app-owned directory, e.g.:
- `~/Library/Application Support/Clawdbot/browser/clawd/` (mac app)
- or `~/.clawdbot/browser/clawd/` (gateway/CLI)
2) **Dedicated ports**
- Never use `9222` (reserved for ad-hoc dev workflows; avoids colliding with
`agent-tools/browser-tools`).
- Default ports are `18791/18792` unless overridden.
3) **Named tab/page management**
- The agent must be able to enumerate and target tabs deterministically (by
stable `targetId` or equivalent), not "last tab".
## Browser selection (macOS + Linux)
On startup (when enabled + local URL), Clawdbot chooses the browser executable
in this order:
1) **Google Chrome Canary** (if installed)
2) **Chromium** (if installed)
3) **Google Chrome** (fallback)
Linux:
- Looks for `google-chrome` / `chromium` in common system paths.
- Use **Browser executable path** to force a specific binary.
Implementation detail:
- macOS: detection is by existence of the `.app` bundle under `/Applications`
(and optionally `~/Applications`), then using the resolved executable path.
- Linux: common `/usr/bin`/`/snap/bin` paths.
Rationale:
- Canary/Chromium are easy to visually distinguish from the user's daily driver.
- Chrome fallback ensures the feature works on a stock machine.
## Visual differentiation ("lobster-orange")
The clawd browser should be obviously different at a glance:
- Profile name: **clawd**
- Profile color: **#FF4500**
Preferred behavior:
- Seed/patch the profile's preferences on first launch so the color + name persist.
Fallback behavior:
- If preferences patching is not reliable, open with the dedicated profile and let
the user set the profile color/name once via Chrome UI; it must persist because
the `userDataDir` is persistent.
## Control server contract (vNext)
Expose a small local HTTP API (and/or gateway RPC surface) so the agent can manage
state without touching the user's Chrome.
Basics:
- `GET /` status payload (enabled/running/pid/cdpPort/etc)
- `POST /start` start browser
- `POST /stop` stop browser
- `GET /tabs` list tabs
- `POST /tabs/open` open a new tab
- `POST /tabs/focus` focus a tab by id/prefix
- `DELETE /tabs/:targetId` close a tab by id/prefix
Inspection:
- `POST /screenshot` `{ targetId?, fullPage?, ref?, element?, type? }`
- `GET /snapshot` `?format=aria|ai&targetId?&limit?`
- `GET /console` `?level?&targetId?`
- `POST /pdf` `{ targetId? }`
Actions:
- `POST /navigate`
- `POST /act` `{ kind, targetId?, ... }` where `kind` is one of:
- `click`, `type`, `press`, `hover`, `drag`, `select`, `fill`, `wait`, `resize`, `close`, `evaluate`
Hooks (arming):
- `POST /hooks/file-chooser` `{ targetId?, paths, timeoutMs? }`
- `POST /hooks/dialog` `{ targetId?, accept, promptText?, timeoutMs? }`
### "Is it open or closed?"
"Open" means:
- the control server is reachable at the configured URL **and**
- it reports a live browser connection.
"Closed" means:
- control server not reachable, or server reports no browser.
Clawdbot should treat "open/closed" as a health check (fast path), not by scanning
global Chrome processes (avoid false positives).
## Multi-profile support
Clawdbot supports multiple named browser profiles, each with:
- Dedicated CDP port (auto-allocated from 18800-18899) **or** a per-profile CDP URL
- Persistent user data directory (`~/.clawdbot/browser/<name>/user-data/`)
- Unique color for visual distinction
### Configuration
```json
{ {
"browser": { browser: {
"enabled": true, enabled: true, // default: true
"defaultProfile": "clawd", controlUrl: "http://127.0.0.1:18791",
"profiles": { cdpUrl: "http://127.0.0.1:18792", // defaults to controlUrl + 1
"clawd": { "cdpPort": 18800, "color": "#FF4500" }, defaultProfile: "clawd",
"work": { "cdpPort": 18801, "color": "#0066CC" }, color: "#FF4500",
"remote": { "cdpUrl": "http://10.0.0.42:9222", "color": "#00AA00" } headless: false,
noSandbox: false,
attachOnly: false,
executablePath: "/Applications/Chromium.app/Contents/MacOS/Chromium",
profiles: {
clawd: { cdpPort: 18800, color: "#FF4500" },
work: { cdpPort: 18801, color: "#0066CC" },
remote: { cdpUrl: "http://10.0.0.42:9222", color: "#00AA00" }
} }
} }
} }
``` ```
### Profile actions Notes:
- `controlUrl` defaults to `http://127.0.0.1:18791`.
- If you override the Gateway port (`gateway.port` or `CLAWDBOT_GATEWAY_PORT`),
the default browser ports shift to stay in the same “family” (control = gateway + 2).
- `cdpUrl` defaults to `controlUrl + 1` when unset.
- `attachOnly: true` means “never launch Chrome; only attach if it is already running.”
- `GET /profiles` — list all profiles with status ## Local vs remote control
- `POST /profiles/create` `{ name, color?, cdpUrl? }` — create new profile (auto-allocates port if no `cdpUrl`)
- `DELETE /profiles/:name` — delete profile (stops browser + removes user data for local profiles)
- `POST /reset-profile?profile=<name>` — kill orphan process on profile's port (local profiles only)
### Profile parameter - **Local control (default):** `controlUrl` is loopback (`127.0.0.1`/`localhost`).
The Gateway starts the control server and can launch Chrome.
- **Remote control:** `controlUrl` is non-loopback. The Gateway **does not** start
a local server; it assumes you are pointing at an existing server elsewhere.
- **Remote CDP:** set `browser.profiles.<name>.cdpUrl` (or `browser.cdpUrl`) to
attach to a remote Chrome. In this case, Clawdbot will not launch a local browser.
All existing endpoints accept optional `?profile=<name>` query parameter: ## Profiles (multi-browser)
- `GET /?profile=work` — status for work profile
- `POST /start?profile=work` — start work profile browser
- `GET /tabs?profile=work` — list tabs for work profile
- `POST /tabs/open?profile=work` — open tab in work profile
- etc.
When `profile` is omitted, uses `browser.defaultProfile` (defaults to "clawd"). Clawdbot supports multiple named profiles. Each profile has its own:
- user data directory
- CDP port (local) or CDP URL (remote)
- accent color
### Agent browser tool Defaults:
- The `clawd` profile is auto-created if missing.
- Local CDP ports allocate from **1880018899** by default.
- Deleting a profile moves its local data directory to Trash.
The `browser` tool accepts an optional `profile` parameter for all actions: All control endpoints accept `?profile=<name>`; the CLI uses `--browser-profile`.
```json ## Isolation guarantees
{
"action": "open",
"targetUrl": "https://example.com",
"profile": "work"
}
```
This routes the operation to the specified profile's browser instance. Omitting - **Dedicated user data dir**: never touches your personal Chrome profile.
`profile` uses the default profile. - **Dedicated ports**: avoids `9222` to prevent collisions with dev workflows.
- **Deterministic tab control**: target tabs by `targetId`, not “last tab”.
### Profile naming rules ## Browser selection
- Lowercase alphanumeric characters and hyphens only When launching locally, Clawdbot picks the first available:
- Must start with a letter or number (not a hyphen) 1. Chrome Canary
- Maximum 64 characters 2. Chromium
- Examples: `clawd`, `work`, `my-project-1` 3. Chrome
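The profile naming rules translate directly into a short validator. This is a sketch of the rules as listed, not the project's own code; `PROFILE_NAME_PATTERN` and `isValidProfileName` are hypothetical names.

```typescript
// Lowercase letters, digits, and hyphens; first character must not be a hyphen.
const PROFILE_NAME_PATTERN = /^[a-z0-9][a-z0-9-]*$/;

function isValidProfileName(name: string): boolean {
  return name.length <= 64 && PROFILE_NAME_PATTERN.test(name);
}
```

All three documented examples (`clawd`, `work`, `my-project-1`) pass, while names such as `-bad` or `Work` are rejected.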
### Port allocation You can override with `browser.executablePath`.
Ports are allocated from range 18800-18899 (~100 profiles max). This is far more Platforms:
than practical use — memory and CPU exhaustion occur well before port exhaustion. - macOS: checks `/Applications` and `~/Applications`.
Ports are allocated once at profile creation and persisted permanently. - Linux: looks for `google-chrome`, `chromium`, etc.
Remote profiles are attach-only and do **not** use the local port range. - Windows: checks common install locations.
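Port allocation for local profiles can be pictured as picking a free port from the documented range. The lowest-free-port strategy below is an assumption chosen for simplicity (the text only states the range and that ports are persisted once assigned); `allocateCdpPort` is a hypothetical name.

```typescript
const CDP_PORT_MIN = 18800;
const CDP_PORT_MAX = 18899;

// Returns the lowest free port in the range, or null when the range is exhausted.
function allocateCdpPort(usedPorts: number[]): number | null {
  for (let port = CDP_PORT_MIN; port <= CDP_PORT_MAX; port++) {
    if (usedPorts.indexOf(port) === -1) {
      return port;
    }
  }
  return null;
}
```

Here `usedPorts` would hold the `cdpPort` values already persisted for existing local profiles; remote profiles contribute nothing because they use a `cdpUrl` instead.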
## Interaction with the agent (clawd)
The agent should use browser tools only when: ## Control API (optional)
- enabled in settings
- control URL is configured
If disabled, tools must fail fast with a friendly error ("Browser disabled in settings"). If you want to integrate directly, the browser control server exposes a small
HTTP API:
The agent should not assume tabs are ephemeral. It should: - Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- call `browser.tabs.list` to discover existing tabs first - Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
- reuse an existing tab when appropriate (e.g. a persistent "main" tab) - Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
- avoid opening duplicate tabs unless asked - Actions: `POST /navigate`, `POST /act`
- Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
- Debugging: `GET /console`, `POST /pdf`
## CLI quick reference (one example each) All endpoints accept `?profile=<name>`.
All commands accept `--browser-profile <name>` to target a specific profile (default: `clawd`). ### Playwright requirement
Some features (navigate/act/ai snapshot, element screenshots, PDF) require
Playwright. In embedded gateway builds, Playwright may be unavailable; those
endpoints return a clear 501 error. ARIA snapshots and basic screenshots still work.
## CLI quick reference
All commands accept `--browser-profile <name>` to target a specific profile.
Profile management:
- `clawdbot browser profiles`
- `clawdbot browser create-profile --name work`
- `clawdbot browser create-profile --name remote --cdp-url http://10.0.0.42:9222`
- `clawdbot browser delete-profile --name work`
Basics: Basics:
- `clawdbot browser status` - `clawdbot browser status`
- `clawdbot browser start` - `clawdbot browser start`
- `clawdbot browser stop` - `clawdbot browser stop`
- `clawdbot browser reset-profile`
- `clawdbot browser tabs` - `clawdbot browser tabs`
- `clawdbot browser open https://example.com` - `clawdbot browser open https://example.com`
- `clawdbot browser focus abcd1234` - `clawdbot browser focus abcd1234`
@@ -260,6 +148,8 @@ Inspection:
- `clawdbot browser screenshot --ref 12` - `clawdbot browser screenshot --ref 12`
- `clawdbot browser snapshot` - `clawdbot browser snapshot`
- `clawdbot browser snapshot --format aria --limit 200` - `clawdbot browser snapshot --format aria --limit 200`
- `clawdbot browser console --level error`
- `clawdbot browser pdf`
Actions: Actions:
- `clawdbot browser navigate https://example.com` - `clawdbot browser navigate https://example.com`
@@ -271,39 +161,27 @@ Actions:
- `clawdbot browser drag 10 11` - `clawdbot browser drag 10 11`
- `clawdbot browser select 9 OptionA OptionB` - `clawdbot browser select 9 OptionA OptionB`
- `clawdbot browser upload /tmp/file.pdf` - `clawdbot browser upload /tmp/file.pdf`
- `clawdbot browser fill --fields '[{\"ref\":\"1\",\"value\":\"Ada\"}]'` - `clawdbot browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'`
- `clawdbot browser dialog --accept` - `clawdbot browser dialog --accept`
- `clawdbot browser wait --text "Done"` - `clawdbot browser wait --text "Done"`
- `clawdbot browser evaluate --fn '(el) => el.textContent' --ref 7` - `clawdbot browser evaluate --fn '(el) => el.textContent' --ref 7`
- `clawdbot browser evaluate --fn "document.querySelector('.my-class').click()"`
- `clawdbot browser console --level error`
- `clawdbot browser pdf`
Notes: Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog. - `upload` and `dialog` are **arming** calls; run them before the click/press
- `upload` can take a `ref` to auto-click after arming (useful for single-step file uploads). that triggers the chooser/dialog.
- `upload` can also take `inputRef` (aria ref) or `element` (CSS selector) to set `<input type="file">` directly without waiting for a file chooser. - `upload` can also set file inputs directly via `--input-ref` or `--element`.
- The arm default timeout is **2 minutes** (clamped to max 2 minutes); pass `timeoutMs` if you need shorter. - `snapshot` defaults to `ai` when available; use `--format aria` for the
- `snapshot` defaults to `ai`; `aria` returns an accessibility tree for debugging. accessibility tree.
- `click`/`type` require `ref` from `snapshot --format ai`; use `evaluate` for rare CSS selector one-offs. - `click`/`type` require a `ref` from `snapshot` (CSS selectors are intentionally
- Avoid `wait` by default; use it only in exceptional cases when there is no reliable UI state to wait on. not supported for actions).
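The arming timeout note (default 2 minutes, clamped to a 2-minute maximum) can be expressed as a small resolver. This is an illustrative sketch: `resolveArmTimeout` is a hypothetical name, and falling back to the default for missing or non-positive values is an assumption.

```typescript
const ARM_TIMEOUT_MAX_MS = 2 * 60 * 1000;

// Resolves the effective arming timeout for upload/dialog hooks.
function resolveArmTimeout(timeoutMs?: number): number {
  // Missing, NaN, zero, or negative values fall back to the default (assumption).
  if (timeoutMs === undefined || !(timeoutMs > 0)) {
    return ARM_TIMEOUT_MAX_MS; // the default is also the maximum
  }
  return Math.min(timeoutMs, ARM_TIMEOUT_MAX_MS);
}
```

In practice this means `timeoutMs` can only shorten the wait, never extend it.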
## Security & privacy notes ## Security & privacy
- The clawd browser profile is app-owned; it may contain logged-in sessions. - The clawd browser profile may contain logged-in sessions; treat it as sensitive.
Treat it as sensitive data. - Keep control URLs loopback-only unless you intentionally expose the server.
- The control server must bind to loopback only by default (`127.0.0.1`) unless the - Remote CDP endpoints are powerful; tunnel and protect them.
user explicitly configures a non-loopback URL.
- Never reuse or copy the user's default Chrome profile.
- Remote CDP endpoints should be tunneled or protected; CDP is highly privileged.
## Non-goals (for the first cut)
- Cross-device "sync" of tabs between Mac and Pi.
- Sharing the user's logged-in Chrome sessions automatically.
- General-purpose web scraping; this is primarily for "close-the-loop" verification
and interaction.
## Troubleshooting ## Troubleshooting
For Linux-specific issues (especially Ubuntu with snap Chromium), see [browser-linux-troubleshooting](/tools/browser-linux-troubleshooting). For Linux-specific issues (especially snap Chromium), see
[Browser troubleshooting](/tools/browser-linux-troubleshooting).

View File

@@ -294,25 +294,12 @@ Node targeting:
- Respect user consent for camera/screen capture. - Respect user consent for camera/screen capture.
- Use `status/describe` to ensure permissions before invoking media commands. - Use `status/describe` to ensure permissions before invoking media commands.
## How the model sees tools (pi-mono internals) ## How tools are presented to the agent
Tools are exposed to the model in **two parallel channels**: Tools are exposed in two parallel channels:
1) **System prompt text**: a human-readable list + guidelines. 1) **System prompt text**: a human-readable list + guidance.
2) **Provider tool schema**: the actual function/tool declarations sent to the model API. 2) **Tool schema**: the structured function definitions sent to the model API.
In pi-mono: That means the agent sees both “what tools exist” and “how to call them.” If a tool
- System prompt builder: [`packages/coding-agent/src/core/system-prompt.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/system-prompt.ts) doesn't appear in the system prompt or the schema, the model cannot call it.
- Builds the `Available tools:` list from `toolDescriptions`.
- Appends skills and project context.
- Tool schemas passed to providers:
- OpenAI: [`packages/ai/src/providers/openai-responses.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/openai-responses.ts) (`convertTools`)
- Anthropic: [`packages/ai/src/providers/anthropic.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/anthropic.ts) (`convertTools`)
- Gemini: [`packages/ai/src/providers/google-shared.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/google-shared.ts) (`convertTools`)
- Tool execution loop:
- Agent loop: [`packages/ai/src/agent/agent-loop.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/agent/agent-loop.ts)
- Validates tool arguments and executes tools, then appends `toolResult` messages.
In Clawdbot:
- System prompt append: [`src/agents/system-prompt.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/system-prompt.ts)
- Tool list injected via `createClawdbotCodingTools()` in [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts)
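To make the two channels concrete, the sketch below renders one tool list for the system prompt and one structured schema array from the same definitions. Everything here is hypothetical: `ToolDefinition`, `ToolSchema`, `renderToolList`, and `toToolSchemas` are illustrative names, and real provider schemas (OpenAI, Anthropic, Gemini) each use their own shape.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  parameters: { [key: string]: unknown }; // JSON-Schema-like object
}

interface ToolSchema {
  name: string;
  description: string;
  parameters: { [key: string]: unknown };
}

// Channel 1: human-readable list for the system prompt.
function renderToolList(tools: ToolDefinition[]): string {
  const lines = tools.map((t) => "- " + t.name + ": " + t.description);
  return ["Available tools:"].concat(lines).join("\n");
}

// Channel 2: structured declarations passed to the model API.
function toToolSchemas(tools: ToolDefinition[]): ToolSchema[] {
  return tools.map((t) => ({ name: t.name, description: t.description, parameters: t.parameters }));
}
```

Deriving both outputs from one list keeps the prompt text and the schema from drifting apart.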

View File

@@ -65,8 +65,3 @@ Use SSH tunneling or Tailscale to reach the Gateway WS.
## Notes ## Notes
- The TUI shows Gateway chat deltas (`event: chat`) and agent tool events. - The TUI shows Gateway chat deltas (`event: chat`) and agent tool events.
- It registers as a Gateway client with `mode: "tui"` for presence and debugging. - It registers as a Gateway client with `mode: "tui"` for presence and debugging.
## Files
- CLI: [`src/cli/tui-cli.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/cli/tui-cli.ts)
- Runner: [`src/tui/tui.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/tui/tui.ts)
- Gateway client: [`src/tui/gateway-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/tui/gateway-chat.ts)