feat: configurable heartbeat session

2026-01-19 15:42:07 -05:00
parent db61451c67
commit 7725dd6795
8 changed files with 230 additions and 298 deletions
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -24,7 +24,7 @@ Unknown keys, malformed types, or invalid values cause the Gateway to **refuse t

 When validation fails:
 - The Gateway does not boot.
- Only diagnostic commands are allowed (for example: `clawdbot doctor`, `clawdbot logs`, `clawdbot health`, `clawdbot status`, `clawdbot gateway status`, `clawdbot gateway probe`, `clawdbot help`).
+- Only diagnostic commands are allowed (for example: `clawdbot doctor`, `clawdbot logs`, `clawdbot health`, `clawdbot status`, `clawdbot service`, `clawdbot help`).
 - Run `clawdbot doctor` to see the exact issues.
 - Run `clawdbot doctor --fix` (or `--yes`) to apply migrations/repairs.

@@ -1414,7 +1414,7 @@ Each `agents.defaults.models` entry can include:
 - `alias` (optional model shortcut, e.g. `/opus`).
 - `params` (optional provider-specific API params passed through to the model request).

-`params` is also applied to streaming runs (embedded agent + compaction). Supported keys today: `temperature`, `maxTokens`, `cacheControlTtl` (`"5m"` or `"1h"`, Anthropic API + OpenRouter Anthropic models only; ignored for Anthropic OAuth/Claude Code tokens). These merge with call-time options; caller-supplied values win. `temperature` is an advanced knob—leave unset unless you know the model’s defaults and need a change. Clawdbot includes the `extended-cache-ttl-2025-04-11` beta flag for Anthropic API; keep it if you override provider headers.
+`params` is also applied to streaming runs (embedded agent + compaction). Supported keys today: `temperature`, `maxTokens`. These merge with call-time options; caller-supplied values win. `temperature` is an advanced knob—leave unset unless you know the model’s defaults and need a change.

 Example:

@@ -1569,7 +1569,7 @@ Example:
 }
 ```

-#### `agents.defaults.contextPruning` (TTL-aware tool-result pruning)
+#### `agents.defaults.contextPruning` (tool-result pruning)

 `agents.defaults.contextPruning` prunes **old tool results** from the in-memory context right before a request is sent to the LLM.
 It does **not** modify the session history on disk (`*.jsonl` remains complete).
@@ -1580,9 +1580,11 @@ High level:
 - Never touches user/assistant messages.
 - Protects the last `keepLastAssistants` assistant messages (no tool results after that point are pruned).
 - Protects the bootstrap prefix (nothing before the first user message is pruned).
- Mode:
-  - `cache-ttl`: pruning only runs when the last Anthropic call for the session is **older** than `ttl`.
-    When it runs, it uses the same soft-trim + hard-clear behavior as before.
+- Modes:
+  - `adaptive`: soft-trims oversized tool results (keep head/tail) when the estimated context ratio crosses `softTrimRatio`.
+    Then hard-clears the oldest eligible tool results when the estimated context ratio crosses `hardClearRatio` **and**
+    there’s enough prunable tool-result bulk (`minPrunableToolChars`).
+  - `aggressive`: always replaces eligible tool results before the cutoff with the `hardClear.placeholder` (no ratio checks).

 Soft vs hard pruning (what changes in the context sent to the LLM):
 - **Soft-trim**: only for *oversized* tool results. Keeps the beginning + end and inserts `...` in the middle.
@@ -1596,41 +1598,44 @@ Notes / current limitations:
 - Tool results containing **image blocks are skipped** (never trimmed/cleared) right now.
 - The estimated “context ratio” is based on **characters** (approximate), not exact tokens.
 - If the session doesn’t contain at least `keepLastAssistants` assistant messages yet, pruning is skipped.
- `cache-ttl` only activates for Anthropic API calls (and OpenRouter Anthropic models).
- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
- For best results, match `contextPruning.ttl` to the model `cacheControlTtl` you set in `agents.defaults.models.*.params`.
+- In `aggressive` mode, `hardClear.enabled` is ignored (eligible tool results are always replaced with `hardClear.placeholder`).

-Default (off, unless Anthropic auth profiles are detected):
+Default (adaptive):
+```json5
+{
+  agents: { defaults: { contextPruning: { mode: "adaptive" } } }
+}
+```
+
+To disable:
 ```json5
 {
  agents: { defaults: { contextPruning: { mode: "off" } } }
 }
 ```

-Enable TTL-aware pruning:
+Defaults (when `mode` is `"adaptive"` or `"aggressive"`):
+- `keepLastAssistants`: `3`
+- `softTrimRatio`: `0.3` (adaptive only)
+- `hardClearRatio`: `0.5` (adaptive only)
+- `minPrunableToolChars`: `50000` (adaptive only)
+- `softTrim`: `{ maxChars: 4000, headChars: 1500, tailChars: 1500 }` (adaptive only)
+- `hardClear`: `{ enabled: true, placeholder: "[Old tool result content cleared]" }`
+
+Example (aggressive, minimal):
 ```json5
 {
-  agents: { defaults: { contextPruning: { mode: "cache-ttl" } } }
+  agents: { defaults: { contextPruning: { mode: "aggressive" } } }
 }
 ```

-Defaults (when `mode` is `"cache-ttl"`):
- `ttl`: `"5m"`
- `keepLastAssistants`: `3`
- `softTrimRatio`: `0.3`
- `hardClearRatio`: `0.5`
- `minPrunableToolChars`: `50000`
- `softTrim`: `{ maxChars: 4000, headChars: 1500, tailChars: 1500 }`
- `hardClear`: `{ enabled: true, placeholder: "[Old tool result content cleared]" }`
-
-Example (cache-ttl tuned):
+Example (adaptive tuned):
 ```json5
 {
  agents: {
    defaults: {
      contextPruning: {
-        mode: "cache-ttl",
-        ttl: "5m",
+        mode: "adaptive",
        keepLastAssistants: 3,
        softTrimRatio: 0.3,
        hardClearRatio: 0.5,
@@ -1737,12 +1742,9 @@ Z.AI models are available as `zai/<model>` (e.g. `zai/glm-4.7`) and require
  `30m`. Set `0m` to disable.
 - `model`: optional override model for heartbeat runs (`provider/model`).
 - `includeReasoning`: when `true`, heartbeats will also deliver the separate `Reasoning:` message when available (same shape as `/reasoning on`). Default: `false`.
- `activeHours`: optional local-time window that controls when heartbeats run.
-  - `start`: start time (HH:MM, 24h). Inclusive.
-  - `end`: end time (HH:MM, 24h). Exclusive. Use `"24:00"` for end-of-day.
-  - `timezone`: `"user"` (default), `"local"`, or an IANA timezone id.
- `target`: optional delivery channel (`last`, `whatsapp`, `telegram`, `discord`, `slack`, `signal`, `imessage`, `none`). Default: `last`.
+- `session`: optional session key to control which session the heartbeat runs in. Default: `main`.
 - `to`: optional recipient override (channel-specific id, e.g. E.164 for WhatsApp, chat id for Telegram).
+- `target`: optional delivery channel (`last`, `whatsapp`, `telegram`, `discord`, `slack`, `msteams`, `signal`, `imessage`, `none`). Default: `last`.
 - `prompt`: optional override for the heartbeat body (default: `Read HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK.`). Overrides are sent verbatim; include a `Read HEARTBEAT.md` line if you still want the file read.
 - `ackMaxChars`: max chars allowed after `HEARTBEAT_OK` before delivery (default: 300).

@@ -1773,7 +1775,6 @@ Note: `applyPatch` is only under `tools.exec`.
 - `tools.web.fetch.maxChars` (default 50000)
 - `tools.web.fetch.timeoutSeconds` (default 30)
 - `tools.web.fetch.cacheTtlMinutes` (default 15)
- `tools.web.fetch.maxRedirects` (default 3)
 - `tools.web.fetch.userAgent` (optional override)
 - `tools.web.fetch.readability` (default true; disable to use basic HTML cleanup only)
 - `tools.web.fetch.firecrawl.enabled` (default true when an API key is set)
@@ -1840,7 +1841,7 @@ Example:

 `agents.defaults.subagents` configures sub-agent defaults:
 - `model`: default model for spawned sub-agents (string or `{ primary, fallbacks }`). If omitted, sub-agents inherit the caller’s model unless overridden per agent or per call.
- `maxConcurrent`: max concurrent sub-agent runs (default 8)
+- `maxConcurrent`: max concurrent sub-agent runs (default 1)
 - `archiveAfterMinutes`: auto-archive sub-agent sessions after N minutes (default 60; set `0` to disable)
 - Per-subagent tool policy: `tools.subagents.tools.allow` / `tools.subagents.tools.deny` (deny wins)

@@ -1974,7 +1975,7 @@ Notes:

 `agents.defaults.maxConcurrent` sets the maximum number of embedded agent runs that can
 execute in parallel across sessions. Each session is still serialized (one run
-per session key at a time). Default: 4.
+per session key at a time). Default: 1.

 ### `agents.defaults.sandbox`

@@ -2452,9 +2453,6 @@ Controls session scoping, reset policy, reset triggers, and where the session st
      dm: { mode: "idle", idleMinutes: 240 },
      group: { mode: "idle", idleMinutes: 120 }
    },
-    resetByChannel: {
-      discord: { mode: "idle", idleMinutes: 10080 }
-    },
    resetTriggers: ["/new", "/reset"],
    // Default is already per-agent under ~/.clawdbot/agents/<agentId>/sessions/sessions.json
    // You can override with {agentId} templating:
@@ -2490,7 +2488,7 @@ Fields:
  - `idleMinutes`: sliding idle window in minutes. When daily + idle are both configured, whichever expires first wins.
 - `resetByType`: per-session overrides for `dm`, `group`, and `thread`.
  - If you only set legacy `session.idleMinutes` without any `reset`/`resetByType`, Clawdbot stays in idle-only mode for backward compatibility.
- `resetByChannel`: channel-specific reset policy overrides (keyed by channel id, applies to all session types for that channel; overrides `reset`/`resetByType`).
+- `heartbeatIdleMinutes`: optional idle override for heartbeat checks (daily reset still applies when enabled).
 - `agentToAgent.maxPingPongTurns`: max reply-back turns between requester/target (0–5, default 5).
 - `sendPolicy.default`: `allow` or `deny` fallback when no rule matches.
 - `sendPolicy.rules[]`: match by `channel`, `chatType` (`direct|group|room`), or `keyPrefix` (e.g. `cron:`). First deny wins; otherwise allow.
@@ -2617,13 +2615,10 @@ Defaults:
    // noSandbox: false,
    // executablePath: "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
    // attachOnly: false, // set true when tunneling a remote CDP to localhost
-    // snapshotDefaults: { mode: "efficient" }, // tool/CLI default snapshot preset
  }
 }
 ```

-Note: `browser.snapshotDefaults` only affects Clawdbot's browser tool + CLI. Direct HTTP clients must pass `mode` explicitly.
-
 ### `ui` (Appearance)

 Optional accent color used by the native apps for UI chrome (e.g. Talk Mode bubble tint).
@@ -2647,13 +2642,6 @@ Defaults:
 - bind: `loopback`
 - port: `18789` (single port for WS + HTTP)

-Bind modes:
- `loopback`: `127.0.0.1` (local-only)
- `lan`: `0.0.0.0` (all interfaces)
- `tailnet`: Tailscale IPv4 address (100.64.0.0/10)
- `auto`: prefer loopback, fall back to LAN if loopback cannot bind
- `custom`: `gateway.customBindHost` (IPv4), fallback to LAN if unavailable
-
 ```json5
 {
  gateway: {
@@ -2684,15 +2672,14 @@ Notes:
 - `clawdbot gateway` refuses to start unless `gateway.mode` is set to `local` (or you pass the override flag).
 - `gateway.port` controls the single multiplexed port used for WebSocket + HTTP (control UI, hooks, A2UI).
 - OpenAI Chat Completions endpoint: **disabled by default**; enable with `gateway.http.endpoints.chatCompletions.enabled: true`.
- OpenResponses endpoint: **disabled by default**; enable with `gateway.http.endpoints.responses.enabled: true`.
 - Precedence: `--port` > `CLAWDBOT_GATEWAY_PORT` > `gateway.port` > default `18789`.
- Non-loopback binds (`lan`/`tailnet`/`custom`, or `auto` when loopback is unavailable) require auth. Use `gateway.auth.token` (or `CLAWDBOT_GATEWAY_TOKEN`).
+- Non-loopback binds (`lan`/`tailnet`/`auto`) require auth. Use `gateway.auth.token` (or `CLAWDBOT_GATEWAY_TOKEN`).
 - The onboarding wizard generates a gateway token by default (even on loopback).
 - `gateway.remote.token` is **only** for remote CLI calls; it does not enable local gateway auth. `gateway.token` is ignored.

 Auth and Tailscale:
 - `gateway.auth.mode` sets the handshake requirements (`token` or `password`).
- `gateway.auth.token` stores the shared token for token auth (used by the CLI on the same machine and as the bootstrap credential for device pairing).
+- `gateway.auth.token` stores the shared token for token auth (used by the CLI on the same machine).
 - When `gateway.auth.mode` is set, only that method is accepted (plus optional Tailscale headers).
 - `gateway.auth.password` can be set here, or via `CLAWDBOT_GATEWAY_PASSWORD` (recommended).
 - `gateway.auth.allowTailscale` allows Tailscale Serve identity headers
@@ -2701,9 +2688,6 @@ Auth and Tailscale:
  `true`, Serve requests do not need a token/password; set `false` to require
  explicit credentials. Defaults to `true` when `tailscale.mode = "serve"` and
  auth mode is not `password`.
- After pairing, the Gateway issues **device tokens** scoped to the device role + scopes.
-  These are returned in `hello-ok.auth.deviceToken`; clients should persist and reuse them
-  instead of the shared token. Rotate/revoke via `device.token.rotate`/`device.token.revoke`.
 - `gateway.tailscale.mode: "serve"` uses Tailscale Serve (tailnet only, loopback bind).
 - `gateway.tailscale.mode: "funnel"` exposes the dashboard publicly; requires auth.
 - `gateway.tailscale.resetOnExit` resets Serve/Funnel config on shutdown.
@@ -2712,7 +2696,6 @@ Remote client defaults (CLI):
 - `gateway.remote.url` sets the default Gateway WebSocket URL for CLI calls when `gateway.mode = "remote"`.
 - `gateway.remote.token` supplies the token for remote calls (leave unset for no auth).
 - `gateway.remote.password` supplies the password for remote calls (leave unset for no auth).
- `gateway.remote.tlsFingerprint` pins the gateway TLS cert fingerprint (sha256).

 macOS app behavior:
 - Clawdbot.app watches `~/.clawdbot/clawdbot.json` and switches modes live when `gateway.mode` or `gateway.remote.url` changes.
@@ -2726,36 +2709,12 @@ macOS app behavior:
    remote: {
      url: "ws://gateway.tailnet:18789",
      token: "your-token",
-      password: "your-password",
-      tlsFingerprint: "sha256:ab12cd34..."
+      password: "your-password"
    }
  }
 }
 ```

-### `gateway.nodes` (Node command allowlist)
-
-The Gateway enforces a per-platform command allowlist for `node.invoke`. Nodes must both
-**declare** a command and have it **allowed** by the Gateway to run it.
-
-Use this section to extend or deny commands:
-
-```json5
-{
-  gateway: {
-    nodes: {
-      allowCommands: ["custom.vendor.command"], // extra commands beyond defaults
-      denyCommands: ["sms.send"]      // block a command even if declared
-    }
-  }
-}
-```
-
-Notes:
- `allowCommands` extends the built-in per-platform defaults.
- `denyCommands` always wins (even if the node claims the command).
- `node.invoke` rejects commands that are not declared by the node.
-
 ### `gateway.reload` (Config hot reload)

 The Gateway watches `~/.clawdbot/clawdbot.json` (or `CLAWDBOT_CONFIG_PATH`) and applies changes automatically.
@@ -3003,7 +2962,7 @@ Auto-generated certs require `openssl` on PATH; if generation fails, the bridge

 ### `discovery.wideArea` (Wide-Area Bonjour / unicast DNS‑SD)

-When enabled, the Gateway writes a unicast DNS-SD zone for `_clawdbot-gw._tcp` under `~/.clawdbot/dns/` using the standard discovery domain `clawdbot.internal.`
+When enabled, the Gateway writes a unicast DNS-SD zone for `_clawdbot-bridge._tcp` under `~/.clawdbot/dns/` using the standard discovery domain `clawdbot.internal.`

 To make iOS/Android discover across networks (Vienna ⇄ London), pair this with:
 - a DNS server on the gateway host serving `clawdbot.internal.` (CoreDNS is recommended)
@@ -3033,9 +2992,6 @@ Template placeholders are expanded in `tools.media.*.models[].args` and `tools.m
 | `{{From}}` | Sender identifier (E.164 for WhatsApp; may differ per channel) |
 | `{{To}}` | Destination identifier |
 | `{{MessageSid}}` | Channel message id (when available) |
-| `{{MessageSidFull}}` | Provider-specific full message id when `MessageSid` is shortened |
-| `{{ReplyToId}}` | Reply-to message id (when available) |
-| `{{ReplyToIdFull}}` | Provider-specific full reply-to id when `ReplyToId` is shortened |
 | `{{SessionId}}` | Current session UUID |
 | `{{IsNewSession}}` | `"true"` when a new session was created |
 | `{{MediaUrl}}` | Inbound media pseudo-URL (if present) |
--- a/docs/gateway/heartbeat.md
+++ b/docs/gateway/heartbeat.md
@@ -127,18 +127,26 @@ Example: two agents, only the second agent runs heartbeats.
 - `every`: heartbeat interval (duration string; default unit = minutes).
 - `model`: optional model override for heartbeat runs (`provider/model`).
 - `includeReasoning`: when enabled, also deliver the separate `Reasoning:` message when available (same shape as `/reasoning on`).
+- `session`: optional session key for heartbeat runs.
+  - `main` (default): agent main session.
+  - Explicit session key (copy from `clawdbot sessions --json` or the [sessions CLI](/cli/sessions)).
+  - Session key formats: see [Sessions](/concepts/session) and [Groups](/concepts/groups).
 - `target`:
  - `last` (default): deliver to the last used external channel.
-  - explicit channel: `whatsapp` / `telegram` / `discord` / `slack` / `signal` / `imessage`.
+  - explicit channel: `whatsapp` / `telegram` / `discord` / `slack` / `msteams` / `signal` / `imessage`.
  - `none`: run the heartbeat but **do not deliver** externally.
- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram, etc.).
+- `to`: optional recipient override (channel-specific id, e.g. E.164 for WhatsApp or a Telegram chat id).
 - `prompt`: overrides the default prompt body (not merged).
 - `ackMaxChars`: max chars allowed after `HEARTBEAT_OK` before delivery.

 ## Delivery behavior

- Heartbeats run in each agent’s **main session** (`agent:<id>:<mainKey>`), or `global`
-  when `session.scope = "global"`.
+- Heartbeats run in the agent’s main session by default (`agent:<id>:<mainKey>`),
+  or `global` when `session.scope = "global"`. Set `session` to override to a
+  specific channel session (Discord/WhatsApp/etc.).
+- `session` only affects the run context; delivery is controlled by `target` and `to`.
+- To deliver to a specific channel/recipient, set `target` + `to`. With
+  `target: "last"`, delivery uses the last external channel for that session.
 - If the main queue is busy, the heartbeat is skipped and retried later.
 - If `target` resolves to no external destination, the run still happens but no
  outbound message is sent.