feat: unify gateway heartbeat
@@ -130,7 +130,8 @@ Controls the embedded agent runtime (model/thinking/verbose/timeouts).
     timeoutSeconds: 600,
     mediaMaxMb: 5,
     heartbeat: {
-      every: "30m"
+      every: "30m",
+      target: "last"
     },
     maxConcurrent: 3,
     bash: {
@@ -151,6 +152,9 @@ deprecation fallback.
 - `every`: duration string (`ms`, `s`, `m`, `h`); default unit minutes. Omit or set
   `0m` to disable.
 - `model`: optional override model for heartbeat runs (`provider/model`).
+- `target`: delivery channel (`last`, `whatsapp`, `telegram`, `none`). Default: `last`.
+- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram).
+- `prompt`: override the default heartbeat body (`HEARTBEAT`).
 
 `agent.bash` configures background bash defaults:
 - `backgroundMs`: time before auto-background (ms, default 20000)
14
docs/cron.md
@@ -14,7 +14,7 @@ Last updated: 2025-12-13
 ## Context
 
 Clawdis already has:
-- A **periodic reply heartbeat** that runs the agent with `HEARTBEAT` and suppresses `HEARTBEAT_OK` (`src/web/auto-reply.ts`).
+- A **gateway heartbeat runner** that runs the agent with `HEARTBEAT` and suppresses `HEARTBEAT_OK` (`src/infra/heartbeat-runner.ts`).
 - A lightweight, in-memory **system event queue** (`enqueueSystemEvent`) that is injected into the next **main session** turn (`drainSystemEvents` in `src/auto-reply/reply.ts`).
 - A WebSocket **Gateway** daemon that is intended to be always-on (`docs/gateway.md`).
 
@@ -197,12 +197,12 @@ This yields:
 We need a way for the Gateway (or the scheduler) to request an immediate heartbeat without duplicating heartbeat logic.
 
 Design:
-- `monitorWebProvider` owns the real `runReplyHeartbeat()` function (it already has all the local state needed).
-- Add a small global hook module:
-  - `setReplyHeartbeatWakeHandler(fn | null)` installed by `monitorWebProvider`
-  - `requestReplyHeartbeatNow({ reason, coalesceMs? })`
-  - If the handler is absent (provider not connected), the request is stored as “pending”; the next time the handler is installed, it runs once.
-  - Coalesce rapid calls and respect the existing “skip when queue busy” behavior (prefer retrying soon vs dropping).
+- `startHeartbeatRunner` owns the real heartbeat execution and installs a wake handler.
+- Wake hook lives in `src/infra/heartbeat-wake.ts`:
+  - `setHeartbeatWakeHandler(fn | null)` installed by the heartbeat runner
+  - `requestHeartbeatNow({ reason, coalesceMs? })`
+  - If the handler is absent, the request is stored as “pending”; the next time the handler is installed, it runs once.
+  - Coalesce rapid calls and respect the “skip when queue busy” behavior (retry soon vs dropping).
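The wake hook described in the design list above can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/infra/heartbeat-wake.ts`: only the two function names and the `{ reason, coalesceMs? }` option shape come from the text. The handler signature, the 250 ms default window, and the "latest reason wins" rule are assumptions, and the "skip when queue busy" retry is left out for brevity.

```typescript
// Assumed handler signature; the text only says the runner installs "a wake handler".
type WakeHandler = (reason: string) => void;

const DEFAULT_COALESCE_MS = 250; // assumed default, not taken from the project

let handler: WakeHandler | null = null;
let pendingReason: string | null = null;
let timer: ReturnType<typeof setTimeout> | null = null;

// Run the handler once for whatever request is pending, if a handler exists.
function flush(): void {
  if (timer !== null) {
    clearTimeout(timer);
    timer = null;
  }
  if (handler === null || pendingReason === null) return; // stays pending
  const reason = pendingReason;
  pendingReason = null;
  handler(reason);
}

function setHeartbeatWakeHandler(fn: WakeHandler | null): void {
  handler = fn;
  // A request that arrived while no handler was installed runs once now.
  if (fn !== null && pendingReason !== null) flush();
}

function requestHeartbeatNow(opts: { reason: string; coalesceMs?: number }): void {
  pendingReason = opts.reason; // assumption: the latest reason wins within a window
  // No handler: keep the request pending. Timer already armed: coalesce into it.
  if (handler === null || timer !== null) return;
  timer = setTimeout(flush, opts.coalesceMs ?? DEFAULT_COALESCE_MS);
}
```

With this shape, several calls made while the runner is not yet started collapse into a single run when the runner installs its handler, and bursts of calls made afterwards share one timer.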
 
 ## Run history log (JSONL)
 
@@ -3,49 +3,53 @@ summary: "Plan for heartbeat polling messages and notification rules"
 read_when:
 - Adjusting heartbeat cadence or messaging
 ---
-# Heartbeat polling plan (2025-11-26)
+# Heartbeat (Gateway)
 
-Goal: add a simple heartbeat poll for the embedded agent that only notifies users when something matters, using the `HEARTBEAT_OK` sentinel. The heartbeat body we send is `HEARTBEAT` so the model can easily spot it.
+Heartbeat runs periodic agent turns in the **main session** so the model can
+surface anything that needs attention without spamming the user.
 
 ## Prompt contract
-- Extend the agent system prompt to explain: “If this is a heartbeat poll and nothing needs attention, reply exactly `HEARTBEAT_OK` and nothing else. For any alert, do **not** include `HEARTBEAT_OK`; just return the alert text.” Heartbeat prompt body is `HEARTBEAT`.
-- Keep existing WhatsApp length guidance; forbid burying the sentinel inside alerts.
+- Heartbeat body defaults to `HEARTBEAT` (configurable via `agent.heartbeat.prompt`).
+- If nothing needs attention, the model must reply **exactly** `HEARTBEAT_OK`.
+- For alerts, do **not** include `HEARTBEAT_OK`; return only the alert text.
 
-## Config & defaults
-- New config key: `agent.heartbeat` with:
-  - `every`: duration string (`ms`, `s`, `m`, `h`; default unit minutes). `0m` disables.
-  - `model`: optional override model (`provider/model`) for heartbeat runs.
-- Default: disabled unless `agent.heartbeat.every` is set.
-- New optional idle override for heartbeats: `session.heartbeatIdleMinutes` (defaults to `idleMinutes`). Heartbeat skips do **not** update the session `updatedAt` so idle expiry still works.
+## Config
 
-## Poller behavior
-- When gateway runs with command-mode auto-reply, start a timer with the resolved heartbeat interval.
-- Each tick invokes the configured command with a short heartbeat body (e.g., “(heartbeat) summarize any important changes since last turn”) while reusing the active session args so Pi context stays warm.
-- Heartbeats never create a new session implicitly: if there’s no stored session for the target (fallback path), the heartbeat is skipped instead of starting a fresh Pi session.
-- Abort timer on SIGINT/abort of the gateway.
+```json5
+{
+  agent: {
+    heartbeat: {
+      every: "30m", // duration string: ms|s|m|h (0m disables)
+      model: "anthropic/claude-opus-4-5",
+      target: "last", // last | whatsapp | telegram | none
+      to: "+15551234567", // optional override for whatsapp/telegram
+      prompt: "HEARTBEAT" // optional override
+    }
+  }
+}
+```
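To make the `every` rule concrete (units `ms`, `s`, `m`, `h`; a bare number means minutes; omitted or `0m` disables), here is a small sketch of a resolver. The function name `parseEveryMs`, the accepted grammar, and the choice to throw on malformed input are assumptions for illustration and are not taken from the project's code.

```typescript
// Milliseconds per supported unit.
const UNIT_MS: Record<string, number> = { ms: 1, s: 1000, m: 60000, h: 3600000 };

// Hypothetical helper: returns the interval in milliseconds, or null when
// the heartbeat is disabled (value omitted or zero).
function parseEveryMs(every: string | undefined): number | null {
  if (every === undefined) return null; // omitted: disabled
  const match = /^(\d+(?:\.\d+)?)\s*(ms|s|m|h)?$/.exec(every.trim().toLowerCase());
  if (match === null) throw new Error(`invalid duration: ${every}`);
  const value = Number(match[1]);
  const unit = match[2] ?? "m"; // default unit is minutes
  const ms = value * UNIT_MS[unit];
  return ms > 0 ? ms : null; // "0m" (or any zero value) disables
}
```

Under these assumptions `parseEveryMs("30m")` yields 1800000 and `parseEveryMs("0m")` yields `null`, which a runner could treat as "do not start a timer".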
 
-## Sentinel handling
-- Trim output. If the trimmed text equals `HEARTBEAT_OK` (case-sensitive) -> skip outbound message.
-- Otherwise, send the text/media as normal, stripping the sentinel if it somehow appears.
-- Treat empty output as `HEARTBEAT_OK` to avoid spurious pings.
+### Fields
+- `every`: heartbeat interval (duration string; default unit minutes). Omit or set
+  to `0m` to disable.
+- `model`: optional model override for heartbeat runs (`provider/model`).
+- `target`: where heartbeat output is delivered.
+  - `last` (default): send to the last used external channel.
+  - `whatsapp` / `telegram`: force the channel (optionally set `to`).
+  - `none`: do not deliver externally; output stays in the session (WebChat-visible).
+- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram).
+- `prompt`: override the default heartbeat body.
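One way to read the `target` and `to` fields above is as a small resolution function from config plus the last used route to an optional delivery route. The sketch below is illustrative only: the type names, the `resolveHeartbeatDelivery` helper, and the fallback rules (applying `to` on top of `last`, and reusing the last recipient when a forced channel matches the last route) are assumptions, not behavior confirmed by this commit.

```typescript
type Channel = "whatsapp" | "telegram";
type HeartbeatTarget = "last" | Channel | "none";

// Hypothetical shapes for illustration.
interface Route { channel: Channel; to: string }
interface HeartbeatConfig { target?: HeartbeatTarget; to?: string }

// Returns where to deliver heartbeat output, or null for "no external delivery".
function resolveHeartbeatDelivery(cfg: HeartbeatConfig, lastRoute: Route | null): Route | null {
  const target = cfg.target ?? "last"; // documented default
  if (target === "none") return null; // output stays in the session
  if (target === "last") {
    if (lastRoute === null) return null; // no last route: nothing to send to
    return { channel: lastRoute.channel, to: cfg.to ?? lastRoute.to }; // assumption: `to` overrides
  }
  // Forced channel: prefer the explicit recipient, else (assumption) reuse the
  // last recipient if it was on the same channel.
  const fallback = lastRoute !== null && lastRoute.channel === target ? lastRoute.to : undefined;
  const to = cfg.to ?? fallback;
  return to === undefined ? null : { channel: target, to };
}
```

A `null` result corresponds to the documented case where the heartbeat still runs but no outbound message is sent.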
 
-## Logging requirements
-- Normal mode: single info line per tick, e.g., `heartbeat: ok (skipped)` or `heartbeat: alert sent (32ms)`.
-- `--verbose`: log start/end, command argv, duration, and whether it was skipped/sent/error; include session ID and connection/run IDs via `getChildLogger` for correlation.
-- On command failure: warn-level one-liner in normal mode; verbose log includes stdout/stderr snippets.
+## Behavior
+- Runs in the main session (`session.mainKey`, or `global` when scope is global).
+- Uses the main lane queue; if requests are in flight, the wake is retried.
+- Empty output or `HEARTBEAT_OK` is treated as “ok” and does **not** keep the
+  session alive (`updatedAt` is restored).
+- If `target` resolves to no external destination (no last route or `none`), the
+  heartbeat still runs but no outbound message is sent.
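The "ok" rule in the behavior list above (empty output or exactly `HEARTBEAT_OK` means nothing to deliver, and the session's `updatedAt` is put back) can be sketched as below. The helper names, the `SessionEntry` shape, and the outcome type are hypothetical; only the sentinel string and the restore-on-ok behavior come from the text.

```typescript
const HEARTBEAT_OK = "HEARTBEAT_OK";

type HeartbeatOutcome = { kind: "ok" } | { kind: "alert"; text: string };

// Classify a heartbeat reply: trimmed empty text or the exact sentinel is "ok".
function classifyHeartbeatReply(raw: string | undefined): HeartbeatOutcome {
  const text = (raw ?? "").trim();
  return text === "" || text === HEARTBEAT_OK ? { kind: "ok" } : { kind: "alert", text };
}

// Hypothetical session record; only `updatedAt` matters for this sketch.
interface SessionEntry { updatedAt: number }

// On "ok", restore the timestamp captured before the run so that an
// uneventful heartbeat does not keep the session alive.
function settleSession(session: SessionEntry, updatedAtBefore: number, outcome: HeartbeatOutcome): void {
  if (outcome.kind === "ok") session.updatedAt = updatedAtBefore;
}
```

The comparison is case-sensitive, so a reply such as `heartbeat_ok` would be treated as alert text in this sketch.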
 
-## Failure/backoff
-- If a heartbeat command errors, log it and retry on the next scheduled tick (no exponential backoff unless command repeatedly fails; keep it simple for now).
-
-## Tests to add
-- Unit: sentinel detection (`HEARTBEAT_OK`, empty output, mixed text), skip vs send decision, default interval resolver (30m, override, disable).
-- Unit/integration: verbose logger emits start/end lines; normal logger emits a single line.
-
-## Documentation
-- Add a short README snippet under configuration showing `agent.heartbeat` and the sentinel rule.
-- Expose CLI triggers:
-  - `clawdis heartbeat` (web provider, defaults to first `routing.allowFrom`; optional `--to` override)
-    - `--session-id <uuid>` forces resuming a specific session for that heartbeat
-  - `clawdis gateway --heartbeat-now` to run the gateway loop with an immediate heartbeat
-- Gateway supports `--heartbeat-now` to fire once at startup.
-- When multiple sessions are active or `routing.allowFrom` is only `"*"`, require `--to <E.164>` or `--all` for manual heartbeats to avoid ambiguous targets.
+## Wake hook
+- The gateway exposes a heartbeat wake hook so cron/jobs/webhooks can request an
+  immediate run (`requestHeartbeatNow`).
+- `wake` endpoints should enqueue system events and optionally trigger a wake; the
+  heartbeat runner picks those up on the next tick or immediately.
@@ -86,10 +86,9 @@ Status: WhatsApp Web via Baileys only. Gateway owns the single session.
 
 ## Heartbeats
 - **Gateway heartbeat** logs connection health (`web.heartbeatSeconds`, default 60s).
-- **Reply heartbeat** asks agent on a timer (`agent.heartbeat.every`).
-  - Uses `HEARTBEAT` prompt + `HEARTBEAT_TOKEN` skip behavior.
-  - Skips if queue busy or last inbound was a group.
-  - Falls back to last direct recipient if needed.
+- **Agent heartbeat** is global (`agent.heartbeat.*`) and runs in the main session.
+  - Uses `HEARTBEAT` prompt + `HEARTBEAT_OK` skip behavior.
+  - Delivery defaults to the last used channel (or configured target).
 
 ## Reconnect behavior
 - Backoff policy: `web.reconnect`:
@@ -106,6 +105,8 @@ Status: WhatsApp Web via Baileys only. Gateway owns the single session.
 - `agent.mediaMaxMb`
 - `agent.heartbeat.every`
 - `agent.heartbeat.model` (optional override)
+- `agent.heartbeat.target`
+- `agent.heartbeat.to`
 - `session.*` (scope, idle, store, mainKey)
 - `web.heartbeatSeconds`
 - `web.reconnect.*`