docs: reorganize documentation structure
This commit is contained in:
61
docs/concepts/agent-loop.md
Normal file
61
docs/concepts/agent-loop.md
Normal file
@@ -0,0 +1,61 @@
|
||||
---
|
||||
summary: "Agent loop lifecycle, streams, and wait semantics"
|
||||
read_when:
|
||||
- You need an exact walkthrough of the agent loop or lifecycle events
|
||||
---
|
||||
# Agent Loop (Clawdis)
|
||||
|
||||
Short, exact flow of one agent run. Source of truth: current code in `src/`.
|
||||
|
||||
## Entry points
|
||||
- Gateway RPC: `agent` and `agent.wait` in [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts).
|
||||
- CLI: `agentCommand` in [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts).
|
||||
|
||||
## High-level flow
|
||||
1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately.
|
||||
2) `agentCommand` runs the agent:
|
||||
- resolves model + thinking/verbose defaults
|
||||
- loads skills snapshot
|
||||
- calls `runEmbeddedPiAgent` (pi-agent-core runtime)
|
||||
- emits **lifecycle end/error** if the embedded loop does not emit one
|
||||
3) `runEmbeddedPiAgent`:
|
||||
- builds `AgentSession` and subscribes to pi events
|
||||
- streams assistant deltas + tool events
|
||||
- enforces timeout -> aborts run if exceeded
|
||||
- returns payloads + usage metadata
|
||||
4) `subscribeEmbeddedPiSession` bridges pi-agent-core events to Clawdis `agent` stream:
|
||||
- tool events => `stream: "tool"`
|
||||
- assistant deltas => `stream: "assistant"`
|
||||
- lifecycle events => `stream: "lifecycle"` (`phase: "start" | "end" | "error"`)
|
||||
5) `agent.wait` uses `waitForAgentJob`:
|
||||
- waits for **lifecycle end/error** for `runId`
|
||||
- returns `{ status: ok|error|timeout, startedAt, endedAt, error? }`
|
||||
|
||||
## Event streams (today)
|
||||
- `lifecycle`: emitted by `subscribeEmbeddedPiSession` (and as a fallback by `agentCommand`)
|
||||
- `assistant`: streamed deltas from pi-agent-core
|
||||
- `tool`: streamed tool events from pi-agent-core
|
||||
|
||||
## Chat provider handling
|
||||
- `createAgentEventHandler` in [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts):
|
||||
- buffers assistant deltas
|
||||
- emits chat `delta` messages
|
||||
- emits chat `final` when **lifecycle end/error** arrives
|
||||
|
||||
## Timeouts
|
||||
- `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides.
|
||||
- Agent runtime: `agent.timeoutSeconds` default 600s; enforced in `runEmbeddedPiAgent` abort timer.
|
||||
|
||||
## Where things can end early
|
||||
- Agent timeout (abort)
|
||||
- AbortSignal (cancel)
|
||||
- Gateway disconnect or RPC timeout
|
||||
- `agent.wait` timeout (wait-only, does not stop agent)
|
||||
|
||||
## Files
|
||||
- [`src/gateway/server-methods/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent.ts)
|
||||
- [`src/gateway/server-methods/agent-job.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-methods/agent-job.ts)
|
||||
- [`src/commands/agent.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/agent.ts)
|
||||
- [`src/agents/pi-embedded-runner.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-runner.ts)
|
||||
- [`src/agents/pi-embedded-subscribe.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-embedded-subscribe.ts)
|
||||
- [`src/gateway/server-chat.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server-chat.ts)
|
||||
187
docs/concepts/agent-workspace.md
Normal file
187
docs/concepts/agent-workspace.md
Normal file
@@ -0,0 +1,187 @@
|
||||
---
|
||||
summary: "Agent workspace: location, layout, and backup strategy"
|
||||
read_when:
|
||||
- You need to explain the agent workspace or its file layout
|
||||
- You want to back up or migrate an agent workspace
|
||||
---
|
||||
# Agent workspace
|
||||
|
||||
The workspace is the agent's home. It is the only working directory used for
|
||||
file tools and for workspace context. Keep it private and treat it as memory.
|
||||
|
||||
This is separate from `~/.clawdbot/`, which stores config, credentials, and
|
||||
sessions.
|
||||
|
||||
## Default location
|
||||
|
||||
- Default: `~/clawd`
|
||||
- If `CLAWDBOT_PROFILE` is set and not `"default"`, the default becomes
|
||||
`~/clawd-<profile>`.
|
||||
- Override in `~/.clawdbot/clawdbot.json`:
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: {
|
||||
workspace: "~/clawd"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`clawdbot onboard`, `clawdbot configure`, or `clawdbot setup` will create the
|
||||
workspace and seed the bootstrap files if they are missing.
|
||||
|
||||
If you already manage the workspace files yourself, you can disable bootstrap
|
||||
file creation:
|
||||
|
||||
```json5
|
||||
{ agent: { skipBootstrap: true } }
|
||||
```
|
||||
|
||||
## Workspace file map (what each file means)
|
||||
|
||||
These are the standard files Clawdbot expects inside the workspace:
|
||||
|
||||
- `AGENTS.md`
|
||||
- Operating instructions for the agent and how it should use memory.
|
||||
- Loaded at the start of every session.
|
||||
- Good place for rules, priorities, and "how to behave" details.
|
||||
|
||||
- `SOUL.md`
|
||||
- Persona, tone, and boundaries.
|
||||
- Loaded every session.
|
||||
|
||||
- `USER.md`
|
||||
- Who the user is and how to address them.
|
||||
- Loaded every session.
|
||||
|
||||
- `IDENTITY.md`
|
||||
- The agent's name, vibe, and emoji.
|
||||
- Created/updated during the bootstrap ritual.
|
||||
|
||||
- `TOOLS.md`
|
||||
- Notes about your local tools and conventions.
|
||||
- Does not control tool availability; it is only guidance.
|
||||
|
||||
- `HEARTBEAT.md`
|
||||
- Optional tiny checklist for heartbeat runs.
|
||||
- Keep it short to avoid token burn.
|
||||
|
||||
- `BOOTSTRAP.md`
|
||||
- One-time first-run ritual.
|
||||
- Only created for a brand-new workspace.
|
||||
- Delete it after the ritual is complete.
|
||||
|
||||
- `memory/YYYY-MM-DD.md`
|
||||
- Daily memory log (one file per day).
|
||||
- Recommended to read today + yesterday on session start.
|
||||
|
||||
- `MEMORY.md` (optional)
|
||||
- Curated long-term memory.
|
||||
- Only load in the main, private session (not shared/group contexts).
|
||||
|
||||
- `skills/` (optional)
|
||||
- Workspace-specific skills.
|
||||
- Overrides managed/bundled skills when names collide.
|
||||
|
||||
- `canvas/` (optional)
|
||||
- Canvas UI files for node displays (for example `canvas/index.html`).
|
||||
|
||||
If any bootstrap file is missing, Clawdbot injects a "missing file" marker into
|
||||
the session and continues. `clawdbot setup` can recreate missing defaults
|
||||
without overwriting existing files.
|
||||
|
||||
## What is NOT in the workspace
|
||||
|
||||
These live under `~/.clawdbot/` and should NOT be committed to the workspace repo:
|
||||
|
||||
- `~/.clawdbot/clawdbot.json` (config)
|
||||
- `~/.clawdbot/credentials/` (OAuth tokens, API keys)
|
||||
- `~/.clawdbot/agents/<agentId>/sessions/` (session transcripts + metadata)
|
||||
- `~/.clawdbot/skills/` (managed skills)
|
||||
|
||||
If you need to migrate sessions or config, copy them separately and keep them
|
||||
out of version control.
|
||||
|
||||
## Git backup (recommended, private)
|
||||
|
||||
Treat the workspace as private memory. Put it in a **private** git repo so it is
|
||||
backed up and recoverable.
|
||||
|
||||
Run these steps on the machine where the Gateway runs (that is where the
|
||||
workspace lives).
|
||||
|
||||
### 1) Initialize the repo
|
||||
|
||||
```bash
|
||||
cd ~/clawd
|
||||
git init
|
||||
git add AGENTS.md SOUL.md TOOLS.md IDENTITY.md USER.md HEARTBEAT.md memory/
|
||||
git commit -m "Add agent workspace"
|
||||
```
|
||||
|
||||
### 2) Add a private remote (beginner-friendly options)
|
||||
|
||||
Option A: GitHub web UI
|
||||
|
||||
1. Create a new **private** repository on GitHub.
|
||||
2. Do not initialize with a README (avoids merge conflicts).
|
||||
3. Copy the HTTPS remote URL.
|
||||
4. Add the remote and push:
|
||||
|
||||
```bash
|
||||
git branch -M main
|
||||
git remote add origin <https-url>
|
||||
git push -u origin main
|
||||
```
|
||||
|
||||
Option B: GitHub CLI (`gh`)
|
||||
|
||||
```bash
|
||||
gh auth login
|
||||
gh repo create clawd-workspace --private --source . --remote origin --push
|
||||
```
|
||||
|
||||
### 3) Ongoing updates
|
||||
|
||||
```bash
|
||||
git status
|
||||
git add .
|
||||
git commit -m "Update memory"
|
||||
git push
|
||||
```
|
||||
|
||||
## Do not commit secrets
|
||||
|
||||
Even in a private repo, avoid storing secrets in the workspace:
|
||||
|
||||
- API keys, OAuth tokens, passwords, or private credentials.
|
||||
- Anything under `~/.clawdbot/`.
|
||||
- Raw dumps of chats or sensitive attachments.
|
||||
|
||||
If you must store sensitive references, use placeholders and keep the real
|
||||
secret elsewhere (password manager, environment variables, or `~/.clawdbot/`).
|
||||
|
||||
Suggested `.gitignore` starter:
|
||||
|
||||
```gitignore
|
||||
.DS_Store
|
||||
.env
|
||||
**/*.key
|
||||
**/*.pem
|
||||
**/secrets*
|
||||
```
|
||||
|
||||
## Moving the workspace to a new machine
|
||||
|
||||
1. Clone the repo to the desired path (default `~/clawd`).
|
||||
2. Set `agent.workspace` to that path in `~/.clawdbot/clawdbot.json`.
|
||||
3. Run `clawdbot setup --workspace <path>` to seed any missing files.
|
||||
4. If you need sessions, copy `~/.clawdbot/agents/<agentId>/sessions/` from the
|
||||
old machine separately.
|
||||
|
||||
## Advanced notes
|
||||
|
||||
- Multi-agent routing can use different workspaces per agent. See
|
||||
`docs/provider-routing.md` for routing configuration.
|
||||
- If `agent.sandbox` is enabled, non-main sessions can use per-session sandbox
|
||||
workspaces under `agent.sandbox.workspaceRoot`.
|
||||
112
docs/concepts/agent.md
Normal file
112
docs/concepts/agent.md
Normal file
@@ -0,0 +1,112 @@
|
||||
---
|
||||
summary: "Agent runtime (embedded p-mono), workspace contract, and session bootstrap"
|
||||
read_when:
|
||||
- Changing agent runtime, workspace bootstrap, or session behavior
|
||||
---
|
||||
# Agent Runtime 🤖
|
||||
|
||||
CLAWDBOT runs a single embedded agent runtime derived from **p-mono** (internal name: **p**).
|
||||
|
||||
## Workspace (required)
|
||||
|
||||
CLAWDBOT uses a single agent workspace directory (`agent.workspace`) as the agent’s **only** working directory (`cwd`) for tools and context.
|
||||
|
||||
Recommended: use `clawdbot setup` to create `~/.clawdbot/clawdbot.json` if missing and initialize the workspace files.
|
||||
|
||||
Full workspace layout + backup guide: [`docs/agent-workspace.md`](/agent-workspace)
|
||||
|
||||
If `agent.sandbox` is enabled, non-main sessions can override this with
|
||||
per-session workspaces under `agent.sandbox.workspaceRoot` (see
|
||||
[`docs/configuration.md`](/configuration)).
|
||||
|
||||
## Bootstrap files (injected)
|
||||
|
||||
Inside `agent.workspace`, CLAWDBOT expects these user-editable files:
|
||||
- `AGENTS.md` — operating instructions + “memory”
|
||||
- `SOUL.md` — persona, boundaries, tone
|
||||
- `TOOLS.md` — user-maintained tool notes (e.g. `imsg`, `sag`, conventions)
|
||||
- `BOOTSTRAP.md` — one-time first-run ritual (deleted after completion)
|
||||
- `IDENTITY.md` — agent name/vibe/emoji
|
||||
- `USER.md` — user profile + preferred address
|
||||
|
||||
On the first turn of a new session, CLAWDBOT injects the contents of these files directly into the agent context.
|
||||
|
||||
If a file is missing, CLAWDBOT injects a single “missing file” marker line (and `clawdbot setup` will create a safe default template).
|
||||
|
||||
`BOOTSTRAP.md` is only created for a **brand new workspace** (no other bootstrap files present). If you delete it after completing the ritual, it should not be recreated on later restarts.
|
||||
|
||||
To disable bootstrap file creation entirely (for pre-seeded workspaces), set:
|
||||
|
||||
```json5
|
||||
{ agent: { skipBootstrap: true } }
|
||||
```
|
||||
|
||||
## Built-in tools (internal)
|
||||
|
||||
p’s embedded core tools (read/bash/edit/write and related internals) are defined in code and always available. `TOOLS.md` does **not** control which tools exist; it’s guidance for how *you* want them used.
|
||||
|
||||
## Skills
|
||||
|
||||
Clawdbot loads skills from three locations (workspace wins on name conflict):
|
||||
- Bundled (shipped with the install)
|
||||
- Managed/local: `~/.clawdbot/skills`
|
||||
- Workspace: `<workspace>/skills`
|
||||
|
||||
Skills can be gated by config/env (see `skills` in [`docs/configuration.md`](/configuration)).
|
||||
|
||||
## p-mono integration
|
||||
|
||||
Clawdbot reuses pieces of the p-mono codebase (models/tools), but **session management, discovery, and tool wiring are Clawdbot-owned**.
|
||||
|
||||
- No p-coding agent runtime.
|
||||
- No `~/.pi/agent` or `<workspace>/.pi` settings are consulted.
|
||||
|
||||
## Peter @ steipete (only)
|
||||
|
||||
Apply these notes **only** when the user is Peter Steinberger at steipete.
|
||||
|
||||
- Gateway runs on the **Mac Studio in London**.
|
||||
- Primary work computer: **MacBook Pro**.
|
||||
- Peter travels between **Vienna** and **London**; there are two networks bridged via **Tailscale**.
|
||||
- For debugging, connect to the Mac Studio (London) or MacBook Pro (primary).
|
||||
- There is also an **M1 MacBook Pro** on the Vienna tailnet you can use to access the Vienna network.
|
||||
- Nodes can be accessed via the `clawdbot` binary (`pnpm clawdbot` in `~/Projects/clawdbot`).
|
||||
- See also `skills/clawdbot*` for node/browser/canvas/cron usage.
|
||||
|
||||
## Sessions
|
||||
|
||||
Session transcripts are stored as JSONL at:
|
||||
- `~/.clawdbot/agents/<agentId>/sessions/<SessionId>.jsonl`
|
||||
|
||||
The session ID is stable and chosen by CLAWDBOT.
|
||||
Legacy Pi/Tau session folders are **not** read.
|
||||
|
||||
## Steering while streaming
|
||||
|
||||
When queue mode is `steer`, inbound messages are injected into the current run.
|
||||
The queue is checked **after each tool call**; if a queued message is present,
|
||||
remaining tool calls from the current assistant message are skipped (error tool
|
||||
results with "Skipped due to queued user message."), then the queued user
|
||||
message is injected before the next assistant response.
|
||||
|
||||
When queue mode is `followup` or `collect`, inbound messages are held until the
|
||||
current turn ends, then a new agent turn starts with the queued payloads. See
|
||||
[`docs/queue.md`](/queue) for mode + debounce/cap behavior.
|
||||
|
||||
Block streaming sends completed assistant blocks as soon as they finish; disable
|
||||
via `agent.blockStreamingDefault: "off"` if you only want the final response.
|
||||
Tune the boundary via `agent.blockStreamingBreak` (`text_end` vs `message_end`; defaults to text_end).
|
||||
Control soft block chunking with `agent.blockStreamingChunk` (defaults to
|
||||
800–1200 chars; prefers paragraph breaks, then newlines; sentences last).
|
||||
Verbose tool summaries are emitted at tool start (no debounce); Control UI
|
||||
streams tool output via agent events when available.
|
||||
|
||||
## Configuration (minimal)
|
||||
|
||||
At minimum, set:
|
||||
- `agent.workspace`
|
||||
- `whatsapp.allowFrom` (strongly recommended)
|
||||
|
||||
---
|
||||
|
||||
*Next: [Group Chats](/group-messages)* 🦞
|
||||
114
docs/concepts/architecture.md
Normal file
114
docs/concepts/architecture.md
Normal file
@@ -0,0 +1,114 @@
|
||||
---
|
||||
summary: "WebSocket gateway architecture, components, and client flows"
|
||||
read_when:
|
||||
- Working on gateway protocol, clients, or transports
|
||||
---
|
||||
# Gateway Architecture
|
||||
|
||||
Last updated: 2026-01-05
|
||||
|
||||
## Overview
|
||||
- A single long-lived **Gateway** process owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack via Bolt, Discord via discord.js, Signal via signal-cli, iMessage via imsg, WebChat) and the control/event plane.
|
||||
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over one transport: **WebSocket on the configured bind host** (default `127.0.0.1:18789`; tunnel or VPN for remote).
|
||||
- One Gateway per host; it is the only place that is allowed to open a WhatsApp session. All sends/agent runs go through it.
|
||||
- By default: the Gateway exposes a Canvas host on `canvasHost.port` (default `18793`), serving `~/clawd/canvas` at `/__clawdbot__/canvas/` with live-reload; disable via `canvasHost.enabled=false` or `CLAWDBOT_SKIP_CANVAS_HOST=1`.
|
||||
|
||||
## Implementation snapshot (current code)
|
||||
|
||||
### TypeScript Gateway ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts))
|
||||
- Single HTTP + WebSocket server (default `18789`); bind policy `loopback|lan|tailnet|auto`. Refuses non-loopback binds without auth; Tailscale serve/funnel requires loopback.
|
||||
- Handshake: first frame must be a `connect` request; AJV validates request + params against TypeBox schemas; protocol negotiated via `minProtocol`/`maxProtocol`.
|
||||
- `hello-ok` includes snapshot (presence/health/stateVersion/uptime/configPath/stateDir), features (methods/events), policy (max payload/buffer/tick), and `canvasHostUrl` when available.
|
||||
- Events emitted: `agent`, `chat`, `presence`, `tick`, `health`, `heartbeat`, `cron`, `talk.mode`, `node.pair.requested`, `node.pair.resolved`, `voicewake.changed`, `shutdown`.
|
||||
- Idempotency keys are required for `send`, `agent`, `chat.send`, and node invokes; the dedupe cache avoids double-sends on reconnects. Payload sizes are capped per connection.
|
||||
- Optional node bridge ([`src/infra/bridge/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/bridge/server.ts)): TCP JSONL frames (`hello`, `pair-request`, `req/res`, `event`, `invoke`, `ping`). Node connect/disconnect updates presence and flows into the session bus.
|
||||
- Control UI + Canvas host: HTTP serves Control UI (base path configurable) and can host the A2UI canvas via [`src/canvas-host/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/canvas-host/server.ts) (live reload). Canvas host URL is advertised to nodes + clients.
|
||||
|
||||
### iOS node (`apps/ios`)
|
||||
- Discovery + pairing: `BridgeDiscoveryModel` uses `NWBrowser` Bonjour discovery and reads TXT fields for LAN/tailnet host hints plus gateway/bridge/canvas ports.
|
||||
- Auto-connect: `BridgeConnectionController` uses stored `node.instanceId` + Keychain token; supports manual host/port; performs `pair-and-hello`.
|
||||
- Bridge runtime: `BridgeSession` actor owns an `NWConnection`, JSONL frames, `hello`/`hello-ok`, ping/pong, `req/res`, server `event`s, and `invoke` callbacks; stores `canvasHostUrl`.
|
||||
- Commands: `NodeAppModel` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, `screen.record`, `location.get`. Canvas/camera/screen are blocked when backgrounded.
|
||||
- Canvas + actions: `WKWebView` with A2UI action bridge; accepts actions from local-network or trusted file URLs; intercepts `clawdbot://` deep links and forwards `agent.request` to the bridge.
|
||||
- Voice/talk: voice wake sends `voice.transcript` events and syncs triggers via `voicewake.get` + `voicewake.changed`; Talk Mode attaches to the bridge.
|
||||
|
||||
### Android node (`apps/android`)
|
||||
- Discovery + pairing: `BridgeDiscovery` uses mDNS/NSD to find `_clawdbot-bridge._tcp`, with manual host/port fallback.
|
||||
- Auto-connect: `NodeRuntime` restores a stored token, performs `pair-and-hello`, and reconnects to the last discovered or manual bridge.
|
||||
- Bridge runtime: `BridgeSession` owns the TCP JSONL session (`hello`/`hello-ok`, ping/pong, `req/res`, `event`, `invoke`); stores `canvasHostUrl`.
|
||||
- Commands: `NodeRuntime` executes `canvas.*`, `canvas.a2ui.*`, `camera.*`, and chat/session events; foreground-only for canvas/camera.
|
||||
|
||||
## Components and flows
|
||||
- **Gateway (daemon)**
|
||||
- Maintains WhatsApp (Baileys), Telegram (grammY), Slack (Bolt), Discord (discord.js), Signal (signal-cli), and iMessage (imsg) connections.
|
||||
- Exposes a typed WS API (req/resp + server push events).
|
||||
- Validates every inbound frame against JSON Schema; rejects anything before a mandatory `connect`.
|
||||
- **Clients (mac app / CLI / web admin)**
|
||||
- One WS connection per client.
|
||||
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`, toggles) and subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
|
||||
- On macOS, the app can also be invoked via deep links (`clawdbot://agent?...`) which translate into the same Gateway `agent` request path (see [`docs/macos.md`](/macos)).
|
||||
- **Agent process (Pi)**
|
||||
- Spawned by the Gateway on demand for `agent` calls; streams events back over the same WS connection.
|
||||
- **WebChat**
|
||||
- Serves static assets locally.
|
||||
- Holds a single WS connection to the Gateway for control/data; all sends/agent runs go through the Gateway WS.
|
||||
- Remote use goes through the same SSH/Tailscale tunnel as other clients.
|
||||
|
||||
## Connection lifecycle (single client)
|
||||
```
|
||||
Client Gateway
|
||||
| |
|
||||
|---- req:connect -------->|
|
||||
|<------ res (ok) ---------| (or res error + close)
|
||||
| (payload=hello-ok carries snapshot: presence + health)
|
||||
| |
|
||||
|<------ event:presence ---| (deltas)
|
||||
|<------ event:tick -------| (keepalive/no-op)
|
||||
| |
|
||||
|------- req:agent ------->|
|
||||
|<------ res:agent --------| (ack: {runId,status:"accepted"})
|
||||
|<------ event:agent ------| (streaming)
|
||||
|<------ res:agent --------| (final: {runId,status,summary})
|
||||
| |
|
||||
```
|
||||
## Wire protocol (summary)
|
||||
- Transport: WebSocket, text frames with JSON payloads.
|
||||
- First frame must be `req {type:"req", id, method:"connect", params:{minProtocol, maxProtocol, client:{name,version,platform,mode,instanceId}, caps, auth?, locale?, userAgent? } }`.
|
||||
- Server replies `res {type:"res", id, ok:true, payload: hello-ok }` or `ok:false` then closes.
|
||||
- After handshake:
|
||||
- Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
|
||||
- Events: `{type:"event", event:"agent"|"presence"|"tick"|"shutdown", payload, seq?, stateVersion?}`
|
||||
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token` must match; otherwise the socket closes with policy violation.
|
||||
- Presence payload is structured, not free text: `{host, ip, version, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }`.
|
||||
- Agent runs are acked `{runId,status:"accepted"}` then complete with a final res `{runId,status,summary}`; streamed output arrives as `event:"agent"`.
|
||||
- Protocol versions are bumped on breaking changes; clients must match `minClient`; Gateway chooses within client’s min/max.
|
||||
- Idempotency keys are required for side-effecting methods (`send`, `agent`) to safely retry; server keeps a short-lived dedupe cache.
|
||||
- Policy in `hello-ok` communicates payload/queue limits and tick interval.
|
||||
|
||||
## Type system and codegen
|
||||
- Source of truth: TypeBox (or ArkType) definitions in `protocol/` on the server.
|
||||
- Build step emits JSON Schema.
|
||||
- Clients:
|
||||
- TypeScript: uses the same TypeBox types directly.
|
||||
- Swift: generated `Codable` models via quicktype from the JSON Schema.
|
||||
- Validation: AJV on the server for every inbound frame; optional client-side validation for defensive programming.
|
||||
|
||||
## Invariants
|
||||
- Exactly one Gateway controls a single Baileys session per host. No fallbacks to ad-hoc direct Baileys sends.
|
||||
- Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
|
||||
- All methods and events are versioned; new fields are additive; breaking changes increment `protocol`.
|
||||
- No event replay: on seq gaps, clients must refresh (`health` + `system-presence`) and continue; presence is bounded via TTL/max entries.
|
||||
|
||||
## Remote access
|
||||
- Preferred: Tailscale or VPN; alternate: SSH tunnel `ssh -N -L 18789:127.0.0.1:18789 user@host`.
|
||||
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel.
|
||||
- Same protocol over the tunnel; same handshake. If a shared token is configured, clients must send it in `connect.params.auth.token` even over the tunnel.
|
||||
|
||||
## Operations snapshot
|
||||
- Start: `clawdbot gateway` (foreground, logs to stdout).
|
||||
Supervise with launchd/systemd for restarts.
|
||||
- Health: request `health` over WS; also surfaced in `hello-ok.health`.
|
||||
- Metrics/logging: keep outside this spec; gateway should expose Prometheus text or structured logs separately.
|
||||
|
||||
## Migration notes
|
||||
- This architecture supersedes the legacy stdin RPC and the ad-hoc TCP control port. New clients should speak only the WS protocol. Legacy compatibility is intentionally dropped.
|
||||
73
docs/concepts/group-messages.md
Normal file
73
docs/concepts/group-messages.md
Normal file
@@ -0,0 +1,73 @@
|
||||
---
|
||||
summary: "Behavior and config for WhatsApp group message handling (mentionPatterns are shared across surfaces)"
|
||||
read_when:
|
||||
- Changing group message rules or mentions
|
||||
---
|
||||
# Group messages (web provider)
|
||||
|
||||
Goal: let Clawd sit in WhatsApp groups, wake up only when pinged, and keep that thread separate from the personal DM session.
|
||||
|
||||
Note: `routing.groupChat.mentionPatterns` is now used by Telegram/Discord/Slack/iMessage as well; this doc focuses on WhatsApp-specific behavior.
|
||||
|
||||
## What’s implemented (2025-12-03)
|
||||
- Activation modes: `mention` (default) or `always`. `mention` requires a ping (real WhatsApp @-mentions via `mentionedJids`, regex patterns, or the bot’s E.164 anywhere in the text). `always` wakes the agent on every message but it should reply only when it can add meaningful value; otherwise it returns the silent token `NO_REPLY`. Defaults can be set in config (`whatsapp.groups`) and overridden per group via `/activation`. When `whatsapp.groups` is set, it also acts as a group allowlist (include `"*"` to allow all).
|
||||
- Group policy: `whatsapp.groupPolicy` controls whether group messages are accepted (`open|disabled|allowlist`). `allowlist` uses `whatsapp.groupAllowFrom` (fallback: explicit `whatsapp.allowFrom`).
|
||||
- Per-group sessions: session keys look like `whatsapp:group:<jid>` so commands such as `/verbose on` or `/think high` (sent as standalone messages) are scoped to that group; personal DM state is untouched. Heartbeats are skipped for group threads.
|
||||
- Context injection: last N (default 50) group messages are prefixed under `[Chat messages since your last reply - for context]`, with the triggering line under `[Current message - respond to this]`.
|
||||
- Sender surfacing: every group batch now ends with `[from: Sender Name (+E164)]` so Pi knows who is speaking.
|
||||
- Ephemeral/view-once: we unwrap those before extracting text/mentions, so pings inside them still trigger.
|
||||
- Group system prompt: on the first turn of a group session (and whenever `/activation` changes the mode) we inject a short blurb into the system prompt like `You are replying inside the WhatsApp group "<subject>". Group members: Alice (+44...), Bob (+43...), … Activation: trigger-only … Address the specific sender noted in the message context.` If metadata isn’t available we still tell the agent it’s a group chat.
|
||||
|
||||
## Config for Clawd UK (+447700900123)
|
||||
Add a `groupChat` block to `~/.clawdbot/clawdbot.json` so display-name pings work even when WhatsApp strips the visual `@` in the text body:
|
||||
|
||||
```json5
|
||||
{
|
||||
"whatsapp": {
|
||||
"groups": {
|
||||
"*": { "requireMention": true }
|
||||
}
|
||||
},
|
||||
"routing": {
|
||||
"groupChat": {
|
||||
"historyLimit": 50,
|
||||
"mentionPatterns": [
|
||||
"@?clawd",
|
||||
"@?clawd\\s*uk",
|
||||
"@?clawdbot",
|
||||
"\\+?447700900123"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Notes:
|
||||
- The regexes are case-insensitive; they cover `@clawd`, `@clawd uk`, `clawdbot`, and the raw number with or without `+`/spaces.
|
||||
- WhatsApp still sends canonical mentions via `mentionedJids` when someone taps the contact, so the number fallback is rarely needed but is a good safety net.
|
||||
|
||||
### Activation command (owner-only)
|
||||
|
||||
Use the group chat command:
|
||||
- `/activation mention`
|
||||
- `/activation always`
|
||||
|
||||
Only the owner number (from `whatsapp.allowFrom`, or the bot’s own E.164 when unset) can change this. Send `/status` as a standalone message in the group to see the current activation mode.
|
||||
|
||||
## How to use
|
||||
1) Add Clawd UK (`+447700900123`) to the group.
|
||||
2) Say `@clawd …` (or `@clawd uk`, `@clawdbot`, or include the number). Anyone in the group can trigger it.
|
||||
3) The agent prompt will include recent group context plus the trailing `[from: …]` marker so it can address the right person.
|
||||
4) Session-level directives (`/verbose on`, `/think high`, `/new` or `/reset`, `/compact`) apply only to that group’s session; send them as standalone messages so they register. Your personal DM session remains independent.
|
||||
|
||||
## Testing / verification
|
||||
- Automated: `pnpm test -- src/web/auto-reply.test.ts --runInBand` (covers mention gating, history injection, sender suffix).
|
||||
- Manual smoke:
|
||||
- Send an `@clawd` ping in the group and confirm a reply that references the sender name.
|
||||
- Send a second ping and verify the history block is included then cleared on the next turn.
|
||||
- Check gateway logs (run with `--verbose`) to see `inbound web message` entries showing `from: <groupJid>` and the `[from: …]` suffix.
|
||||
|
||||
## Known considerations
|
||||
- Heartbeats are intentionally skipped for groups to avoid noisy broadcasts.
|
||||
- Echo suppression uses the combined batch string; if you send identical text twice without mentions, only the first will get a response.
|
||||
- Session store entries will appear as `agent:<agentId>:whatsapp:group:<jid>` in the session store (`~/.clawdbot/agents/<agentId>/sessions/sessions.json` by default); a missing entry just means the group hasn’t triggered a run yet.
|
||||
130
docs/concepts/groups.md
Normal file
130
docs/concepts/groups.md
Normal file
@@ -0,0 +1,130 @@
|
||||
---
|
||||
summary: "Group chat behavior across surfaces (WhatsApp/Telegram/Discord/Slack/Signal/iMessage)"
|
||||
read_when:
|
||||
- Changing group chat behavior or mention gating
|
||||
---
|
||||
# Groups
|
||||
|
||||
Clawdbot treats group chats consistently across surfaces: WhatsApp, Telegram, Discord, Slack, Signal, iMessage.
|
||||
|
||||
## Session keys
|
||||
- Group sessions use `agent:<agentId>:<provider>:group:<id>` session keys (rooms/channels use `agent:<agentId>:<provider>:channel:<id>`).
|
||||
- Direct chats use the main session (or per-sender if configured).
|
||||
- Heartbeats are skipped for group sessions.
|
||||
|
||||
## Display labels
|
||||
- UI labels use `displayName` when available, formatted as `<provider>:<token>`.
|
||||
- `#room` is reserved for rooms/channels; group chats use `g-<slug>` (lowercase, spaces -> `-`, keep `#@+._-`).
|
||||
|
||||
## Group policy
|
||||
Control how group/room messages are handled per provider:
|
||||
|
||||
```json5
|
||||
{
|
||||
whatsapp: {
|
||||
groupPolicy: "disabled", // "open" | "disabled" | "allowlist"
|
||||
groupAllowFrom: ["+15551234567"]
|
||||
},
|
||||
telegram: {
|
||||
groupPolicy: "disabled",
|
||||
groupAllowFrom: ["123456789", "@username"]
|
||||
},
|
||||
signal: {
|
||||
groupPolicy: "disabled",
|
||||
groupAllowFrom: ["+15551234567"]
|
||||
},
|
||||
imessage: {
|
||||
groupPolicy: "disabled",
|
||||
groupAllowFrom: ["chat_id:123"]
|
||||
},
|
||||
discord: {
|
||||
groupPolicy: "allowlist",
|
||||
guilds: {
|
||||
"GUILD_ID": { channels: { help: { allow: true } } }
|
||||
}
|
||||
},
|
||||
slack: {
|
||||
groupPolicy: "allowlist",
|
||||
channels: { "#general": { allow: true } }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
| Policy | Behavior |
|
||||
|--------|----------|
|
||||
| `"open"` | Default. Groups bypass allowlists; mention-gating still applies. |
|
||||
| `"disabled"` | Block all group messages entirely. |
|
||||
| `"allowlist"` | Only allow groups/rooms that match the configured allowlist. |
|
||||
|
||||
Notes:
|
||||
- `groupPolicy` is separate from mention-gating (which requires @mentions).
|
||||
- WhatsApp/Telegram/Signal/iMessage: use `groupAllowFrom` (fallback: explicit `allowFrom`).
|
||||
- Discord: allowlist uses `discord.guilds.<id>.channels`.
|
||||
- Slack: allowlist uses `slack.channels`.
|
||||
- Group DMs are controlled separately (`discord.dm.*`, `slack.dm.*`).
|
||||
- Telegram allowlist can match user IDs (`"123456789"`, `"telegram:123456789"`, `"tg:123456789"`) or usernames (`"@alice"` or `"alice"`); prefixes are case-insensitive.
|
||||
|
||||
## Mention gating (default)
|
||||
Group messages require a mention unless overridden per group. Defaults live per subsystem under `*.groups."*"`.
|
||||
|
||||
```json5
|
||||
{
|
||||
whatsapp: {
|
||||
groups: {
|
||||
"*": { requireMention: true },
|
||||
"123@g.us": { requireMention: false }
|
||||
}
|
||||
},
|
||||
telegram: {
|
||||
groups: {
|
||||
"*": { requireMention: true },
|
||||
"123456789": { requireMention: false }
|
||||
}
|
||||
},
|
||||
imessage: {
|
||||
groups: {
|
||||
"*": { requireMention: true },
|
||||
"123": { requireMention: false }
|
||||
}
|
||||
},
|
||||
routing: {
|
||||
groupChat: {
|
||||
mentionPatterns: ["@clawd", "clawdbot", "\\+15555550123"],
|
||||
historyLimit: 50
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Notes:
|
||||
- `mentionPatterns` are case-insensitive regexes.
|
||||
- Surfaces that provide explicit mentions still pass; patterns are a fallback.
|
||||
- Mention gating is only enforced when mention detection is possible (native mentions or `mentionPatterns` are configured).
|
||||
- Discord defaults live in `discord.guilds."*"` (overridable per guild/channel).
|
||||
|
||||
## Group allowlists
|
||||
When `whatsapp.groups`, `telegram.groups`, or `imessage.groups` is configured, the keys act as a group allowlist. Use `"*"` to allow all groups while still setting default mention behavior.
|
||||
|
||||
## Activation (owner-only)
|
||||
Group owners can toggle per-group activation:
|
||||
- `/activation mention`
|
||||
- `/activation always`
|
||||
|
||||
Owner is determined by `whatsapp.allowFrom` (or the bot’s self E.164 when unset). Send the command as a standalone message. Other surfaces currently ignore `/activation`.
|
||||
|
||||
## Context fields
|
||||
Group inbound payloads set:
|
||||
- `ChatType=group`
|
||||
- `GroupSubject` (if known)
|
||||
- `GroupMembers` (if known)
|
||||
- `WasMentioned` (mention gating result)
|
||||
|
||||
The agent system prompt includes a group intro on the first turn of a new group session.
|
||||
|
||||
## iMessage specifics
|
||||
- Prefer `chat_id:<id>` when routing or allowlisting.
|
||||
- List chats: `imsg chats --limit 20`.
|
||||
- Group replies always go back to the same `chat_id`.
|
||||
|
||||
## WhatsApp specifics
|
||||
See [`docs/group-messages.md`](/group-messages) for WhatsApp-only behavior (history injection, mention handling details).
|
||||
93
docs/concepts/model-failover.md
Normal file
93
docs/concepts/model-failover.md
Normal file
@@ -0,0 +1,93 @@
|
||||
---
|
||||
summary: "How Clawdbot rotates auth profiles and falls back across models"
|
||||
read_when:
|
||||
- Diagnosing auth profile rotation, cooldowns, or model fallback behavior
|
||||
- Updating failover rules for auth profiles or models
|
||||
---
|
||||
|
||||
# Model failover
|
||||
|
||||
Clawdbot handles failures in two stages:
|
||||
1) **Auth profile rotation** within the current provider.
|
||||
2) **Model fallback** to the next model in `agent.model.fallbacks`.
|
||||
|
||||
This doc explains the runtime rules and the data that backs them.
|
||||
|
||||
## Auth storage (keys + OAuth)
|
||||
|
||||
Clawdbot uses **auth profiles** for both API keys and OAuth tokens.
|
||||
|
||||
- Secrets live in `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json` (legacy: `~/.clawdbot/agent/auth-profiles.json`).
|
||||
- Config `auth.profiles` / `auth.order` are **metadata + routing only** (no secrets).
|
||||
- Legacy import-only OAuth file: `~/.clawdbot/credentials/oauth.json` (imported into `auth-profiles.json` on first use).
|
||||
|
||||
Credential types:
|
||||
- `type: "api_key"` → `{ provider, key }`
|
||||
- `type: "oauth"` → `{ provider, access, refresh, expires, email? }` (+ `projectId`/`enterpriseUrl` for some providers)
|
||||
|
||||
## Profile IDs
|
||||
|
||||
OAuth logins create distinct profiles so multiple accounts can coexist.
|
||||
- Default: `provider:default` when no email is available.
|
||||
- OAuth with email: `provider:<email>` (for example `google-antigravity:user@gmail.com`).
|
||||
|
||||
Profiles live in `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json` under `profiles`.
|
||||
|
||||
## Rotation order
|
||||
|
||||
When a provider has multiple profiles, Clawdbot chooses an order like this:
|
||||
|
||||
1) **Explicit config**: `auth.order[provider]` (if set).
|
||||
2) **Configured profiles**: `auth.profiles` filtered by provider.
|
||||
3) **Stored profiles**: entries in `auth-profiles.json` for the provider.
|
||||
|
||||
If no explicit order is configured, Clawdbot uses a round‑robin order:
|
||||
- **Primary key:** profile type (**OAuth before API keys**).
|
||||
- **Secondary key:** `usageStats.lastUsed` (oldest first, within each type).
|
||||
- **Cooldown profiles** are moved to the end, ordered by soonest cooldown expiry.
|
||||
|
||||
### Why OAuth can “look lost”
|
||||
|
||||
If you have both an OAuth profile and an API key profile for the same provider, round‑robin can switch between them across messages unless pinned. To force a single profile:
|
||||
- Pin with `auth.order[provider] = ["provider:profileId"]`, or
|
||||
- Use a per-session override via `/model …` with a profile override (when supported by your UI/chat surface).
|
||||
|
||||
## Cooldowns
|
||||
|
||||
When a profile fails due to auth/rate‑limit errors (or a timeout that looks
|
||||
like rate limiting), Clawdbot marks it in cooldown and moves to the next profile.
|
||||
|
||||
Cooldowns use exponential backoff:
|
||||
- 1 minute
|
||||
- 5 minutes
|
||||
- 25 minutes
|
||||
- 1 hour (cap)
|
||||
|
||||
State is stored in `auth-profiles.json` under `usageStats`:
|
||||
|
||||
```json
|
||||
{
|
||||
"usageStats": {
|
||||
"provider:profile": {
|
||||
"lastUsed": 1736160000000,
|
||||
"cooldownUntil": 1736160600000,
|
||||
"errorCount": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Model fallback
|
||||
|
||||
If all profiles for a provider fail, Clawdbot moves to the next model in
|
||||
`agent.model.fallbacks`. This applies to auth failures, rate limits, and
|
||||
timeouts that exhausted profile rotation.
|
||||
|
||||
## Related config
|
||||
|
||||
See [`docs/configuration.md`](/configuration) for:
|
||||
- `auth.profiles` / `auth.order`
|
||||
- `agent.model.primary` / `agent.model.fallbacks`
|
||||
- `agent.imageModel` routing
|
||||
|
||||
See [`docs/models.md`](/models) for the broader model selection and fallback overview.
|
||||
97
docs/concepts/models.md
Normal file
97
docs/concepts/models.md
Normal file
@@ -0,0 +1,97 @@
|
||||
---
|
||||
summary: "Plan for models CLI: scan, list, aliases, fallbacks, status"
|
||||
read_when:
|
||||
- Adding or modifying models CLI (models list/set/scan/aliases/fallbacks)
|
||||
- Changing model fallback behavior or selection UX
|
||||
- Updating model scan probes (tools/images)
|
||||
---
|
||||
# Models CLI plan
|
||||
|
||||
See [`docs/model-failover.md`](/model-failover) for how auth profiles rotate (OAuth vs API keys), cooldowns, and how that interacts with model fallbacks.
|
||||
|
||||
Goal: give clear model visibility + control (configured vs available), plus scan tooling
|
||||
that prefers tool-call + image-capable models and maintains ordered fallbacks.
|
||||
|
||||
## Model recommendations
|
||||
|
||||
Through testing, we’ve found Anthropic Opus 4.5 is the most useful general-purpose model for anything coding-related. We suggest GPT 5.2 Codex for coding and sub-agents. For personal assistant work, nothing comes close to Opus. If you’re going all-in on Claude, we recommend the Max $200 subscription: https://claude.com/pricing
|
||||
|
||||
## Command tree (draft)
|
||||
|
||||
- `clawdbot models list`
|
||||
- default: configured models only
|
||||
- flags: `--all` (full catalog), `--local`, `--provider <name>`, `--json`, `--plain`
|
||||
- `clawdbot models status`
|
||||
- show default model + aliases + fallbacks + configured models
|
||||
- `clawdbot models set <modelOrAlias>`
|
||||
- writes `agent.model.primary` and ensures `agent.models` entry
|
||||
- `clawdbot models set-image <modelOrAlias>`
|
||||
- writes `agent.imageModel.primary` and ensures `agent.models` entry
|
||||
- `clawdbot models aliases list|add|remove`
|
||||
- writes `agent.models.*.alias`
|
||||
- `clawdbot models fallbacks list|add|remove|clear`
|
||||
- writes `agent.model.fallbacks`
|
||||
- `clawdbot models image-fallbacks list|add|remove|clear`
|
||||
- writes `agent.imageModel.fallbacks`
|
||||
- `clawdbot models scan`
|
||||
- OpenRouter :free scan; probe tool-call + image; interactive selection
|
||||
|
||||
## Config changes
|
||||
|
||||
- `agent.models` (configured model catalog + aliases).
|
||||
- `agent.model.primary` + `agent.model.fallbacks`.
|
||||
- `agent.imageModel.primary` + `agent.imageModel.fallbacks` (optional).
|
||||
- `auth.profiles` + `auth.order` for per-provider auth failover.
|
||||
|
||||
## Scan behavior (models scan)
|
||||
|
||||
Input
|
||||
- OpenRouter `/models` list (filter `:free`)
|
||||
- Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY`
|
||||
- Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates`
|
||||
- Probe controls: `--timeout`, `--concurrency`
|
||||
|
||||
Probes (direct pi-ai complete)
|
||||
- Tool-call probe (required):
|
||||
- Provide a dummy tool, verify tool call emitted.
|
||||
- Image probe (preferred):
|
||||
- Prompt includes 1x1 PNG; success if no "unsupported image" error.
|
||||
|
||||
Scoring/selection
|
||||
- Prefer models passing tool + image for text/tool fallbacks.
|
||||
- Prefer image-only models for image tool fallback (even if tool probe fails).
|
||||
- Rank by: image ok, then lower tool latency, then larger context, then params.
|
||||
|
||||
Interactive selection (TTY)
|
||||
- Multiselect list with per-model stats:
|
||||
- model id, tool ok, image ok, median latency, context, inferred params.
|
||||
- Pre-select top N (default 6).
|
||||
- Non-TTY: auto-select; require `--yes`/`--no-input` to apply.
|
||||
|
||||
Output
|
||||
- Writes `agent.model.fallbacks` ordered.
|
||||
- Writes `agent.imageModel.fallbacks` ordered (image-capable models).
|
||||
- Ensures `agent.models` entries exist for selected models.
|
||||
- Optional `--set-default` to set `agent.model.primary`.
|
||||
- Optional `--set-image` to set `agent.imageModel.primary`.
|
||||
|
||||
## Runtime fallback
|
||||
|
||||
- On model failure: try `agent.model.fallbacks` in order.
|
||||
- Per-provider auth failover uses `auth.order` (or stored profile order) **before**
|
||||
moving to the next model.
|
||||
- Image routing uses `agent.imageModel` **only when configured** and the primary
|
||||
model lacks image input.
|
||||
- Persist last successful provider/model to session entry; auth profile success is global.
|
||||
- See [`docs/model-failover.md`](/model-failover) for auth profile rotation, cooldowns, and timeout handling.
|
||||
|
||||
## Tests
|
||||
|
||||
- Unit: scan selection ordering + probe classification.
|
||||
- CLI: list/aliases/fallbacks add/remove + scan writes config.
|
||||
- Status: shows last used model + fallbacks.
|
||||
|
||||
## Docs
|
||||
|
||||
- Update [`docs/configuration.md`](/configuration) with `agent.models` + `agent.model` + `agent.imageModel`.
|
||||
- Keep this doc current when CLI surface or scan logic changes.
|
||||
121
docs/concepts/multi-agent.md
Normal file
121
docs/concepts/multi-agent.md
Normal file
@@ -0,0 +1,121 @@
|
||||
---
|
||||
summary: "Multi-agent routing: isolated agents, provider accounts, and bindings"
|
||||
title: Multi-Agent Routing
|
||||
read_when: "You want multiple isolated agents (workspaces + auth) in one gateway process."
|
||||
status: active
|
||||
---
|
||||
|
||||
# Multi-Agent Routing
|
||||
|
||||
Goal: multiple *isolated* agents (separate workspace + `agentDir` + sessions), plus multiple provider accounts (e.g. two WhatsApps) in one running Gateway. Inbound is routed to an agent via bindings.
|
||||
|
||||
## What is “one agent”?
|
||||
|
||||
An **agent** is a fully scoped brain with its own:
|
||||
|
||||
- **Workspace** (files, AGENTS.md/SOUL.md/USER.md, local notes, persona rules).
|
||||
- **State directory** (`agentDir`) for auth profiles, model registry, and per-agent config.
|
||||
- **Session store** (chat history + routing state) under `~/.clawdbot/agents/<agentId>/sessions`.
|
||||
|
||||
The Gateway can host **one agent** (default) or **many agents** side-by-side.
|
||||
|
||||
### Single-agent mode (default)
|
||||
|
||||
If you do nothing, Clawdbot runs a single agent:
|
||||
|
||||
- `agentId` defaults to **`main`**.
|
||||
- Sessions are keyed as `agent:main:<mainKey>`.
|
||||
- Workspace defaults to `~/clawd` (or `~/clawd-<profile>` when `CLAWDBOT_PROFILE` is set).
|
||||
- State defaults to `~/.clawdbot/agents/main/agent`.
|
||||
|
||||
## Multiple agents = multiple people, multiple personalities
|
||||
|
||||
With **multiple agents**, each `agentId` becomes a **fully isolated persona**:
|
||||
|
||||
- **Different phone numbers/accounts** (per provider `accountId`).
|
||||
- **Different personalities** (per-agent workspace files like `AGENTS.md` and `SOUL.md`).
|
||||
- **Separate auth + sessions** (no cross-talk unless explicitly enabled).
|
||||
|
||||
This lets **multiple people** share one Gateway server while keeping their AI “brains” and data isolated.
|
||||
|
||||
## Routing rules (how messages pick an agent)
|
||||
|
||||
Bindings are **deterministic** and **most-specific wins**:
|
||||
|
||||
1. `peer` match (exact DM/group/channel id)
|
||||
2. `guildId` (Discord)
|
||||
3. `teamId` (Slack)
|
||||
4. `accountId` match for a provider
|
||||
5. provider-level match (`accountId: "*"`)
|
||||
6. fallback to `routing.defaultAgentId` (default: `main`)
|
||||
|
||||
## Multiple accounts / phone numbers
|
||||
|
||||
Providers that support **multiple accounts** (e.g. WhatsApp) use `accountId` to identify
|
||||
each login. Each `accountId` can be routed to a different agent, so one server can host
|
||||
multiple phone numbers without mixing sessions.
|
||||
|
||||
## Concepts
|
||||
|
||||
- `agentId`: one “brain” (workspace, per-agent auth, per-agent session store).
|
||||
- `accountId`: one provider account instance (e.g. WhatsApp account `"personal"` vs `"biz"`).
|
||||
- `binding`: routes inbound messages to an `agentId` by `(provider, accountId, peer)` and optionally guild/team ids.
|
||||
- Direct chats collapse to `agent:<agentId>:<mainKey>` (per-agent “main”; `session.mainKey`).
|
||||
|
||||
## Example: two WhatsApps → two agents
|
||||
|
||||
`~/.clawdbot/clawdbot.json` (JSON5):
|
||||
|
||||
```js
|
||||
{
|
||||
routing: {
|
||||
defaultAgentId: "home",
|
||||
|
||||
agents: {
|
||||
home: {
|
||||
workspace: "~/clawd-home",
|
||||
agentDir: "~/.clawdbot/agents/home/agent",
|
||||
},
|
||||
work: {
|
||||
workspace: "~/clawd-work",
|
||||
agentDir: "~/.clawdbot/agents/work/agent",
|
||||
},
|
||||
},
|
||||
|
||||
// Deterministic routing: first match wins (most-specific first).
|
||||
bindings: [
|
||||
{ agentId: "home", match: { provider: "whatsapp", accountId: "personal" } },
|
||||
{ agentId: "work", match: { provider: "whatsapp", accountId: "biz" } },
|
||||
|
||||
// Optional per-peer override (example: send a specific group to work agent).
|
||||
{
|
||||
agentId: "work",
|
||||
match: {
|
||||
provider: "whatsapp",
|
||||
accountId: "personal",
|
||||
peer: { kind: "group", id: "1203630...@g.us" },
|
||||
},
|
||||
},
|
||||
],
|
||||
|
||||
// Off by default: agent-to-agent messaging must be explicitly enabled + allowlisted.
|
||||
agentToAgent: {
|
||||
enabled: false,
|
||||
allow: ["home", "work"],
|
||||
},
|
||||
},
|
||||
|
||||
whatsapp: {
|
||||
accounts: {
|
||||
personal: {
|
||||
// Optional override. Default: ~/.clawdbot/credentials/whatsapp/personal
|
||||
// authDir: "~/.clawdbot/credentials/whatsapp/personal",
|
||||
},
|
||||
biz: {
|
||||
// Optional override. Default: ~/.clawdbot/credentials/whatsapp/biz
|
||||
// authDir: "~/.clawdbot/credentials/whatsapp/biz",
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
133
docs/concepts/presence.md
Normal file
133
docs/concepts/presence.md
Normal file
@@ -0,0 +1,133 @@
|
||||
---
|
||||
summary: "How Clawdbot presence entries are produced, merged, and displayed"
|
||||
read_when:
|
||||
- Debugging the Instances tab
|
||||
- Investigating duplicate or stale instance rows
|
||||
- Changing gateway WS connect or system-event beacons
|
||||
---
|
||||
# Presence
|
||||
|
||||
Clawdbot “presence” is a lightweight, best-effort view of:
|
||||
- The **Gateway** itself (one per host), and
|
||||
- The **clients connected to the Gateway** (mac app, WebChat, CLI, etc.).
|
||||
|
||||
Presence is used primarily to render the mac app’s **Instances** tab and to provide quick operator visibility.
|
||||
|
||||
## The data model
|
||||
|
||||
Presence entries are structured objects with (some) fields:
|
||||
- `instanceId` (optional but strongly recommended): stable client identity used for dedupe
|
||||
- `host`: a human-readable name (often the machine name)
|
||||
- `ip`: best-effort IP address (may be missing or stale)
|
||||
- `version`: client version string
|
||||
- `deviceFamily` (optional): hardware family like `iPad`, `iPhone`, `Mac`
|
||||
- `modelIdentifier` (optional): hardware model identifier like `iPad16,6` or `Mac16,6`
|
||||
- `mode`: e.g. `gateway`, `app`, `webchat`, `cli`
|
||||
- `lastInputSeconds` (optional): “seconds since last user input” for that client machine
|
||||
- `reason`: a short marker like `self`, `connect`, `node-connected`, `node-disconnected`, `periodic`, `instances-refresh`
|
||||
- `text`: legacy/debug summary string (kept for backwards compatibility and UI display)
|
||||
- `ts`: last update timestamp (ms since epoch)
|
||||
|
||||
## Producers (where presence comes from)
|
||||
|
||||
Presence entries are produced by multiple sources and then **merged**.
|
||||
|
||||
### 1) Gateway self entry
|
||||
|
||||
The Gateway seeds a “self” entry at startup so UIs always show at least the current gateway host.
|
||||
|
||||
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`initSelfPresence()`).
|
||||
|
||||
### 2) WebSocket connect (connection-derived presence)
|
||||
|
||||
Every WS client must begin with a `connect` request. On successful handshake, the Gateway upserts a presence entry for that connection.
|
||||
|
||||
This is meant to answer: “Which clients are currently connected?”
|
||||
|
||||
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (connect handling uses `connect.params.client.instanceId` when provided; otherwise falls back to `connId`).
|
||||
|
||||
#### Why one-off CLI commands do not show up
|
||||
|
||||
The CLI connects to the Gateway to execute one-off commands (health/status/send/agent/etc.). These are not “nodes” and would spam the Instances list, so the Gateway does not create presence entries for clients with `client.mode === "cli"`.
|
||||
|
||||
### 3) `system-event` beacons (client-reported presence)
|
||||
|
||||
Clients can publish richer periodic beacons via the `system-event` method. The mac app uses this to report:
|
||||
- a human-friendly host name
|
||||
- its best-known IP address
|
||||
- `lastInputSeconds`
|
||||
|
||||
Implementation:
|
||||
- Gateway: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) handles method `system-event` by calling `updateSystemPresence(...)`.
|
||||
- mac app beaconing: [`apps/macos/Sources/Clawdbot/PresenceReporter.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/PresenceReporter.swift).
|
||||
|
||||
### 4) Node bridge beacons (gateway-owned presence)
|
||||
|
||||
When a node bridge connection authenticates, the Gateway emits a presence entry
|
||||
for that node and starts periodic refresh beacons so it does not expire.
|
||||
|
||||
- Connect/disconnect markers: `node-connected`, `node-disconnected`
|
||||
- Periodic heartbeat: every 3 minutes (`reason: periodic`)
|
||||
|
||||
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (node bridge handlers + timer beacons).
|
||||
|
||||
## Merge + dedupe rules (why `instanceId` matters)
|
||||
|
||||
All producers write into a single in-memory presence map.
|
||||
|
||||
Key points:
|
||||
- Entries are **keyed** by a “presence key”. If two producers use the same key, they update the same entry.
|
||||
- The best key is a stable, opaque `instanceId` that does not change across restarts.
|
||||
- Keys are treated case-insensitively.
|
||||
|
||||
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`normalizePresenceKey()`).
|
||||
|
||||
### mac app identity (stable UUID)
|
||||
|
||||
The mac app uses a persisted UUID as `instanceId` so:
|
||||
- restarts/reconnects do not create duplicates
|
||||
- renaming the Mac does not create a new “instance”
|
||||
- debug/release builds can share the same identity
|
||||
|
||||
Implementation: [`apps/macos/Sources/Clawdbot/InstanceIdentity.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstanceIdentity.swift).
|
||||
|
||||
`displayName` (machine name) is used for UI, while `instanceId` is used for dedupe.
|
||||
|
||||
## TTL and bounded size (why stale rows disappear)
|
||||
|
||||
Presence entries are not permanent:
|
||||
- TTL: entries older than 5 minutes are pruned
|
||||
- Max: map is capped at 200 entries (LRU by `ts`)
|
||||
|
||||
Implementation: [`src/infra/system-presence.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/infra/system-presence.ts) (`TTL_MS`, `MAX_ENTRIES`, pruning in `listSystemPresence()`).
|
||||
|
||||
## Remote/tunnel caveat (loopback IPs)
|
||||
|
||||
When a client connects over an SSH tunnel / local port forward, the Gateway may see the remote address as loopback (`127.0.0.1`).
|
||||
|
||||
To avoid degrading an otherwise-correct client beacon IP, the Gateway avoids writing loopback remote addresses into presence entries.
|
||||
|
||||
Implementation: [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts) (`isLoopbackAddress()`).
|
||||
|
||||
## Consumers (who reads presence)
|
||||
|
||||
### macOS Instances tab
|
||||
|
||||
The mac app’s Instances tab renders the result of `system-presence`.
|
||||
|
||||
Implementation:
|
||||
- View: [`apps/macos/Sources/Clawdbot/InstancesSettings.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesSettings.swift)
|
||||
- Store: [`apps/macos/Sources/Clawdbot/InstancesStore.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/InstancesStore.swift)
|
||||
|
||||
The Instances rows show a small presence indicator (Active/Idle/Stale) based on
|
||||
the last beacon age. The label is derived from the entry timestamp (`ts`).
|
||||
|
||||
The store refreshes periodically and also applies `presence` WS events.
|
||||
|
||||
## Debugging tips
|
||||
|
||||
- To see the raw list, call `system-presence` against the gateway.
|
||||
- If you see duplicates:
|
||||
- confirm clients send a stable `instanceId` in the handshake (`connect.params.client.instanceId`)
|
||||
- confirm beaconing uses the same `instanceId`
|
||||
- check whether the connection-derived entry is missing `instanceId` (then it will be keyed by `connId` and duplicates are expected on reconnect)
|
||||
25
docs/concepts/provider-routing.md
Normal file
25
docs/concepts/provider-routing.md
Normal file
@@ -0,0 +1,25 @@
|
||||
---
|
||||
summary: "Routing rules per provider (WhatsApp, Telegram, Discord, web) and shared context"
|
||||
read_when:
|
||||
- Changing provider routing or inbox behavior
|
||||
---
|
||||
# Providers & Routing
|
||||
|
||||
Updated: 2026-01-06
|
||||
|
||||
Goal: deterministic replies per provider, while supporting multi-agent + multi-account routing.
|
||||
|
||||
- **Provider**: provider label (`whatsapp`, `webchat`, `telegram`, `discord`, `signal`, `imessage`, …). Routing is fixed: replies go back to the origin provider; the model doesn’t choose.
|
||||
- **AccountId**: provider account instance (e.g. WhatsApp account `"default"` vs `"work"`). Not every provider supports multi-account yet.
|
||||
- **AgentId**: one isolated “brain” (workspace + per-agent agentDir + per-agent session store).
|
||||
- **Reply context:** inbound replies include `ReplyToId`, `ReplyToBody`, and `ReplyToSender`, and the quoted context is appended to `Body` as a `[Replying to ...]` block.
|
||||
- **Canonical direct session (per agent):** direct chats collapse to `agent:<agentId>:<mainKey>` (default `main`). Groups/channels stay isolated per agent:
|
||||
- group: `agent:<agentId>:<provider>:group:<id>`
|
||||
- channel/room: `agent:<agentId>:<provider>:channel:<id>`
|
||||
- **Session store:** per-agent store lives under `~/.clawdbot/agents/<agentId>/sessions/sessions.json` (override via `session.store` with `{agentId}` templating). JSONL transcripts live next to it.
|
||||
- **WebChat:** attaches to the selected agent’s main session (so desktop reflects cross-provider history for that agent).
|
||||
- **Implementation hints:**
|
||||
- Set `Provider` + `AccountId` in each ingress.
|
||||
- Route inbound to an agent via `routing.bindings` (match on `provider`, `accountId`, plus optional peer/guild/team).
|
||||
- Keep routing deterministic: originate → same provider. Use the gateway WebSocket for sends; avoid side channels.
|
||||
- Do not let the agent emit “send to X” decisions; keep that policy in the host code.
|
||||
77
docs/concepts/queue.md
Normal file
77
docs/concepts/queue.md
Normal file
@@ -0,0 +1,77 @@
|
||||
---
|
||||
summary: "Command queue design that serializes auto-reply command execution"
|
||||
read_when:
|
||||
- Changing auto-reply execution or concurrency
|
||||
---
|
||||
# Command Queue (2026-01-03)
|
||||
|
||||
We now serialize command-based auto-replies (WhatsApp Web listener) through a tiny in-process queue to prevent multiple commands from running at once, while allowing safe parallelism across sessions.
|
||||
|
||||
## Why
|
||||
- Some auto-reply commands are expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
|
||||
- Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools.
|
||||
|
||||
## How it works
|
||||
- [`src/process/command-queue.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/process/command-queue.ts) holds a lane-aware FIFO queue and drains each lane synchronously.
|
||||
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
|
||||
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agent.maxConcurrent`.
|
||||
- When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting.
|
||||
- Typing indicators (`onReplyStart`) still fire immediately on enqueue so user experience is unchanged while we wait our turn.
|
||||
|
||||
## Queue modes (per provider)
|
||||
Inbound messages can steer the current run, wait for a followup turn, or do both:
|
||||
- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
|
||||
- `followup`: enqueue for the next agent turn after the current run ends.
|
||||
- `collect`: coalesce all queued messages into a **single** followup turn (default).
|
||||
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
|
||||
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
|
||||
- `queue` (legacy alias): same as `steer`.
|
||||
|
||||
Steer-backlog means you can get a followup response after the steered run, so
|
||||
streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want
|
||||
one response per inbound message.
|
||||
Send `/queue collect` as a standalone command (per-session) or set `routing.queue.byProvider.discord: "collect"`.
|
||||
|
||||
Defaults (when unset in config):
|
||||
- All surfaces → `collect`
|
||||
|
||||
Configure globally or per provider via `routing.queue`:
|
||||
|
||||
```json5
|
||||
{
|
||||
routing: {
|
||||
queue: {
|
||||
mode: "collect",
|
||||
debounceMs: 1000,
|
||||
cap: 20,
|
||||
drop: "summarize",
|
||||
byProvider: { discord: "collect" }
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Queue options
|
||||
Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):
|
||||
- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
|
||||
- `cap`: max queued messages per session.
|
||||
- `drop`: overflow policy (`old`, `new`, `summarize`).
|
||||
|
||||
Summarize keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.
|
||||
Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
|
||||
|
||||
## Per-session overrides
|
||||
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
|
||||
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
|
||||
- `/queue default` or `/queue reset` clears the session override.
|
||||
|
||||
## Scope and guarantees
|
||||
- Applies only to config-driven command replies; plain text replies are unaffected.
|
||||
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agent.maxConcurrent` to allow multiple sessions in parallel.
|
||||
- Additional lanes may exist (e.g. `cron`) so background jobs can run in parallel without blocking inbound replies.
|
||||
- Per-session lanes guarantee that only one agent run touches a given session at a time.
|
||||
- No external dependencies or background worker threads; pure TypeScript + promises.
|
||||
|
||||
## Troubleshooting
|
||||
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
|
||||
- `enqueueCommand` exposes a lightweight `getQueueSize()` helper if you need to surface queue depth in future diagnostics.
|
||||
155
docs/concepts/session-tool.md
Normal file
155
docs/concepts/session-tool.md
Normal file
@@ -0,0 +1,155 @@
|
||||
---
|
||||
summary: "Agent session tools for listing sessions, fetching history, and sending cross-session messages"
|
||||
read_when:
|
||||
- Adding or modifying session tools
|
||||
---
|
||||
|
||||
# Session Tools
|
||||
|
||||
Goal: small, hard-to-misuse tool set so agents can list sessions, fetch history, and send to another session.
|
||||
|
||||
## Tool Names
|
||||
- `sessions_list`
|
||||
- `sessions_history`
|
||||
- `sessions_send`
|
||||
- `sessions_spawn`
|
||||
|
||||
## Key Model
|
||||
- Main direct chat bucket is always the literal key `"main"`.
|
||||
- Group chats use `<provider>:group:<id>` or `<provider>:channel:<id>`.
|
||||
- Cron jobs use `cron:<job.id>`.
|
||||
- Hooks use `hook:<uuid>` unless explicitly set.
|
||||
- Node bridge uses `node-<nodeId>` unless explicitly set.
|
||||
|
||||
`global` and `unknown` are internal-only and never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`.
|
||||
|
||||
## sessions_list
|
||||
List sessions as an array of rows.
|
||||
|
||||
Parameters:
|
||||
- `kinds?: string[]` filter: any of `"main" | "group" | "cron" | "hook" | "node" | "other"`
|
||||
- `limit?: number` max rows (default: server default, clamp e.g. 200)
|
||||
- `activeMinutes?: number` only sessions updated within N minutes
|
||||
- `messageLimit?: number` 0 = no messages (default 0); >0 = include last N messages
|
||||
|
||||
Behavior:
|
||||
- `messageLimit > 0` fetches `chat.history` per session and includes the last N messages.
|
||||
- Tool results are filtered out in list output; use `sessions_history` for tool messages.
|
||||
- When running in a **sandboxed** agent session, session tools default to **spawned-only visibility** (see below).
|
||||
|
||||
Row shape (JSON):
|
||||
- `key`: session key (string)
|
||||
- `kind`: `main | group | cron | hook | node | other`
|
||||
- `provider`: `whatsapp | telegram | discord | signal | imessage | webchat | internal | unknown`
|
||||
- `displayName` (group display label if available)
|
||||
- `updatedAt` (ms)
|
||||
- `sessionId`
|
||||
- `model`, `contextTokens`, `totalTokens`
|
||||
- `thinkingLevel`, `verboseLevel`, `systemSent`, `abortedLastRun`
|
||||
- `sendPolicy` (session override if set)
|
||||
- `lastProvider`, `lastTo`
|
||||
- `transcriptPath` (best-effort path derived from store dir + sessionId)
|
||||
- `messages?` (only when `messageLimit > 0`)
|
||||
|
||||
## sessions_history
|
||||
Fetch transcript for one session.
|
||||
|
||||
Parameters:
|
||||
- `sessionKey` (required)
|
||||
- `limit?: number` max messages (server clamps)
|
||||
- `includeTools?: boolean` (default false)
|
||||
|
||||
Behavior:
|
||||
- `includeTools=false` filters `role: "toolResult"` messages.
|
||||
- Returns messages array in the raw transcript format.
|
||||
|
||||
## sessions_send
|
||||
Send a message into another session.
|
||||
|
||||
Parameters:
|
||||
- `sessionKey` (required)
|
||||
- `message` (required)
|
||||
- `timeoutSeconds?: number` (default >0; 0 = fire-and-forget)
|
||||
|
||||
Behavior:
|
||||
- `timeoutSeconds = 0`: enqueue and return `{ runId, status: "accepted" }`.
|
||||
- `timeoutSeconds > 0`: wait up to N seconds for completion, then return `{ runId, status: "ok", reply }`.
|
||||
- If wait times out: `{ runId, status: "timeout", error }`. Run continues; call `sessions_history` later.
|
||||
- If the run fails: `{ runId, status: "error", error }`.
|
||||
- Waits via gateway `agent.wait` (server-side) so reconnects don't drop the wait.
|
||||
- Agent-to-agent message context is injected for the primary run.
|
||||
- After the primary run completes, Clawdbot runs a **reply-back loop**:
|
||||
- Round 2+ alternates between requester and target agents.
|
||||
- Reply exactly `REPLY_SKIP` to stop the ping‑pong.
|
||||
- Max turns is `session.agentToAgent.maxPingPongTurns` (0–5, default 5).
|
||||
- Once the loop ends, Clawdbot runs the **agent‑to‑agent announce step** (target agent only):
|
||||
- Reply exactly `ANNOUNCE_SKIP` to stay silent.
|
||||
- Any other reply is sent to the target provider.
|
||||
- Announce step includes the original request + round‑1 reply + latest ping‑pong reply.
|
||||
|
||||
## Provider Field
|
||||
- For groups, `provider` is the provider recorded on the session entry.
|
||||
- For direct chats, `provider` maps from `lastProvider`.
|
||||
- For cron/hook/node, `provider` is `internal`.
|
||||
- If missing, `provider` is `unknown`.
|
||||
|
||||
## Security / Send Policy
|
||||
Policy-based blocking by provider/chat type (not per session id).
|
||||
|
||||
```json
|
||||
{
|
||||
"session": {
|
||||
"sendPolicy": {
|
||||
"rules": [
|
||||
{
|
||||
"match": { "provider": "discord", "chatType": "group" },
|
||||
"action": "deny"
|
||||
}
|
||||
],
|
||||
"default": "allow"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Runtime override (per session entry):
|
||||
- `sendPolicy: "allow" | "deny"` (unset = inherit config)
|
||||
- Settable via `sessions.patch` or owner-only `/send on|off|inherit` (standalone message).
|
||||
|
||||
Enforcement points:
|
||||
- `chat.send` / `agent` (gateway)
|
||||
- auto-reply delivery logic
|
||||
|
||||
## sessions_spawn
|
||||
Spawn a sub-agent run in an isolated session and announce the result back to the requester chat provider.
|
||||
|
||||
Parameters:
|
||||
- `task` (required)
|
||||
- `label?` (optional; used for logs/UI)
|
||||
- `model?` (optional; overrides the sub-agent model; invalid values error)
|
||||
- `timeoutSeconds?` (default 0; 0 = fire-and-forget)
|
||||
- `cleanup?` (`delete|keep`, default `delete`)
|
||||
|
||||
Behavior:
|
||||
- Starts a new `subagent:<uuid>` session with `deliver: false`.
|
||||
- Sub-agents default to the full tool set **minus session tools** (configurable via `agent.subagents.tools`).
|
||||
- Sub-agents are not allowed to call `sessions_spawn` (no sub-agent → sub-agent spawning).
|
||||
- After completion (or best-effort wait), Clawdbot runs a sub-agent **announce step** and posts the result to the requester chat provider.
|
||||
- Reply exactly `ANNOUNCE_SKIP` during the announce step to stay silent.
|
||||
|
||||
## Sandbox Session Visibility
|
||||
|
||||
Sandboxed sessions can use session tools, but by default they only see sessions they spawned via `sessions_spawn`.
|
||||
|
||||
Config:
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: {
|
||||
sandbox: {
|
||||
// default: "spawned"
|
||||
sessionToolsVisibility: "spawned" // or "all"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
87
docs/concepts/session.md
Normal file
87
docs/concepts/session.md
Normal file
@@ -0,0 +1,87 @@
|
||||
---
|
||||
summary: "Session management rules, keys, and persistence for chats"
|
||||
read_when:
|
||||
- Modifying session handling or storage
|
||||
---
|
||||
# Session Management
|
||||
|
||||
Clawdbot treats **one direct-chat session per agent** as primary. Direct chats collapse to `agent:<agentId>:<mainKey>` (default `main`), while group/channel chats get their own keys. `session.mainKey` is honored.
|
||||
|
||||
## Gateway is the source of truth
|
||||
All session state is **owned by the gateway** (the “master” Clawdbot). UI clients (macOS app, WebChat, etc.) must query the gateway for session lists and token counts instead of reading local files.
|
||||
|
||||
- In **remote mode**, the session store you care about lives on the remote gateway host, not your Mac.
|
||||
- Token counts shown in UIs come from the gateway’s store fields (`inputTokens`, `outputTokens`, `totalTokens`, `contextTokens`). Clients do not parse JSONL transcripts to “fix up” totals.
|
||||
|
||||
## Where state lives
|
||||
- On the **gateway host**:
|
||||
- Store file: `~/.clawdbot/agents/<agentId>/sessions/sessions.json` (per agent).
|
||||
- Transcripts: `~/.clawdbot/agents/<agentId>/sessions/<SessionId>.jsonl` (one file per session id).
|
||||
- The store is a map `sessionKey -> { sessionId, updatedAt, ... }`. Deleting entries is safe; they are recreated on demand.
|
||||
- Group entries may include `displayName`, `provider`, `subject`, `room`, and `space` to label sessions in UIs.
|
||||
- Clawdbot does **not** read legacy Pi/Tau session folders.
|
||||
|
||||
## Mapping transports → session keys
|
||||
- Direct chats collapse to the per-agent primary key: `agent:<agentId>:<mainKey>`.
|
||||
- Multiple phone numbers and providers can map to the same agent main key; they act as transports into one conversation.
|
||||
- Group chats isolate state: `agent:<agentId>:<provider>:group:<id>` (rooms/channels use `agent:<agentId>:<provider>:channel:<id>`).
|
||||
- Legacy `group:<id>` keys are still recognized for migration.
|
||||
- Other sources:
|
||||
- Cron jobs: `cron:<job.id>`
|
||||
- Webhooks: `hook:<uuid>` (unless explicitly set by the hook)
|
||||
- Node bridge runs: `node-<nodeId>`
|
||||
|
||||
## Lifecyle
|
||||
- Idle expiry: `session.idleMinutes` (default 60). After the timeout a new `sessionId` is minted on the next message.
|
||||
- Reset triggers: exact `/new` or `/reset` (plus any extras in `resetTriggers`) start a fresh session id and pass the remainder of the message through. If `/new` or `/reset` is sent alone, Clawdbot runs a short “hello” greeting turn to confirm the reset.
|
||||
- Manual reset: delete specific keys from the store or remove the JSONL transcript; the next message recreates them.
|
||||
|
||||
## Send policy (optional)
|
||||
Block delivery for specific session types without listing individual ids.
|
||||
|
||||
```json5
|
||||
{
|
||||
session: {
|
||||
sendPolicy: {
|
||||
rules: [
|
||||
{ action: "deny", match: { provider: "discord", chatType: "group" } },
|
||||
{ action: "deny", match: { keyPrefix: "cron:" } }
|
||||
],
|
||||
default: "allow"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Runtime override (owner only):
|
||||
- `/send on` → allow for this session
|
||||
- `/send off` → deny for this session
|
||||
- `/send inherit` → clear override and use config rules
|
||||
Send these as standalone messages so they register.
|
||||
|
||||
## Configuration (optional rename example)
|
||||
```json5
|
||||
// ~/.clawdbot/clawdbot.json
|
||||
{
|
||||
session: {
|
||||
scope: "per-sender", // keep group keys separate
|
||||
idleMinutes: 120,
|
||||
resetTriggers: ["/new", "/reset"],
|
||||
store: "~/.clawdbot/agents/{agentId}/sessions/sessions.json",
|
||||
mainKey: "main",
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Inspecting
|
||||
- `pnpm clawdbot status` — shows store path and recent sessions.
|
||||
- `pnpm clawdbot sessions --json` — dumps every entry (filter with `--active <minutes>`).
|
||||
- `clawdbot gateway call sessions.list --params '{}'` — fetch sessions from the running gateway (use `--url`/`--token` for remote gateway access).
|
||||
- Send `/status` as a standalone message in chat to see whether the agent is reachable, how much of the session context is used, current thinking/verbose toggles, and when your WhatsApp web creds were last refreshed (helps spot relink needs).
|
||||
- Send `/stop` as a standalone message to abort the current run.
|
||||
- Send `/compact` (optional instructions) as a standalone message to summarize older context and free up window space.
|
||||
- JSONL transcripts can be opened directly to review full turns.
|
||||
|
||||
## Tips
|
||||
- Keep the primary key dedicated to 1:1 traffic; let groups keep their own keys.
|
||||
- When automating cleanup, delete individual keys instead of the whole store to preserve context elsewhere.
|
||||
8
docs/concepts/sessions.md
Normal file
8
docs/concepts/sessions.md
Normal file
@@ -0,0 +1,8 @@
|
||||
---
|
||||
summary: "Alias for session management docs"
|
||||
read_when:
|
||||
- You looked for docs/sessions.md; canonical doc lives in docs/session.md
|
||||
---
|
||||
# Sessions
|
||||
|
||||
Canonical session management docs live in [`docs/session.md`](/session).
|
||||
40
docs/concepts/timezone.md
Normal file
40
docs/concepts/timezone.md
Normal file
@@ -0,0 +1,40 @@
|
||||
---
|
||||
summary: "Timezone handling for agents, envelopes, and prompts"
|
||||
read_when:
|
||||
- You need to understand how timestamps are normalized for the model
|
||||
- Configuring the user timezone for system prompts
|
||||
---
|
||||
|
||||
# Timezones
|
||||
|
||||
Clawdbot standardizes timestamps so the model sees a **single reference time**.
|
||||
|
||||
## Message envelopes (UTC)
|
||||
|
||||
Inbound messages are wrapped in an envelope like:
|
||||
|
||||
```
|
||||
[Provider ... 2026-01-05T21:26Z] message text
|
||||
```
|
||||
|
||||
The timestamp in the envelope is **always UTC**, with minutes precision.
|
||||
|
||||
## Tool payloads (raw provider data)
|
||||
|
||||
Tool calls (`discord.readMessages`, `slack.readMessages`, etc.) return **raw provider timestamps**.
|
||||
These are typically UTC ISO strings (Discord) or UTC epoch strings (Slack). We do not rewrite them.
|
||||
|
||||
## User timezone for the system prompt
|
||||
|
||||
Set `agent.userTimezone` to tell the model the user's local time zone. If it is
|
||||
unset, Clawdbot resolves the **host timezone at runtime** (no config write).
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: { userTimezone: "America/Chicago" }
|
||||
}
|
||||
```
|
||||
|
||||
The system prompt includes:
|
||||
- `User timezone: America/Chicago`
|
||||
- `Current user time: 2026-01-05 15:26`
|
||||
42
docs/concepts/typebox.md
Normal file
42
docs/concepts/typebox.md
Normal file
@@ -0,0 +1,42 @@
|
||||
---
|
||||
summary: "TypeBox schemas as the single source of truth for the gateway protocol"
|
||||
read_when:
|
||||
- Updating protocol schemas or codegen
|
||||
---
|
||||
# TypeBox as Protocol Source of Truth
|
||||
|
||||
Last updated: 2025-12-09
|
||||
|
||||
We use TypeBox schemas in [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) as the single source of truth for the Gateway control plane (connect/req/res/event frames and payloads). All derived artifacts should be generated from these schemas, not edited by hand.
|
||||
|
||||
## Current pipeline
|
||||
|
||||
- **TypeBox → JSON Schema**: `pnpm protocol:gen` writes [`dist/protocol.schema.json`](https://github.com/clawdbot/clawdbot/blob/main/dist/protocol.schema.json) (draft-07) and runs AJV in the server tests.
|
||||
- **TypeBox → Swift**: `pnpm protocol:gen:swift` generates [`apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift).
|
||||
|
||||
## Problem
|
||||
|
||||
- We want strong typing in Swift, including a sealed `GatewayFrame` enum with a discriminator and a forward-compatible `unknown` case.
|
||||
|
||||
## Preferred plan (next step)
|
||||
|
||||
- Add a small, custom Swift generator driven directly by the TypeBox schemas:
|
||||
- Emit a sealed `enum GatewayFrame: Codable { case req(RequestFrame), res(ResponseFrame), event(EventFrame) }`.
|
||||
- Emit strongly typed payload structs/enums (`ConnectParams`, `HelloOk`, `RequestFrame`, `ResponseFrame`, `EventFrame`, `PresenceEntry`, `Snapshot`, `StateVersion`, `ErrorShape`, `AgentEvent`, `TickEvent`, `ShutdownEvent`, `SendParams`, `AgentParams`, `ErrorCode`, `PROTOCOL_VERSION`).
|
||||
- Custom `init(from:)` / `encode(to:)` enforces the `type` discriminator and can include an `unknown` case for forward compatibility.
|
||||
- Wire a new script (e.g., `pnpm protocol:gen:swift`) into `protocol:check` so CI fails if the generated Swift is stale.
|
||||
|
||||
Why this path:
|
||||
- Single source of truth stays TypeBox; no new IDL to maintain.
|
||||
- Predictable, strongly typed Swift (no optional soup).
|
||||
- Small deterministic codegen (~150–200 LOC script) we control.
|
||||
|
||||
## Alternative (if we want off-the-shelf codegen)
|
||||
|
||||
- Wrap the existing JSON Schema into an OpenAPI 3.1 doc (auto-generated) and use **swift-openapi-generator** or **openapi-generator swift5**. More moving parts, but also yields enums with discriminator support. Keep this as a fallback if we don’t want a custom emitter.
|
||||
|
||||
## Action items
|
||||
|
||||
- Implement `protocol:gen:swift` that reads the TypeBox schemas and emits the sealed Swift enum + payload structs.
|
||||
- Update `protocol:check` to include the Swift generator output in the diff check.
|
||||
- Remove quicktype output once the custom generator is in place (or keep it for docs only).
|
||||
Reference in New Issue
Block a user