fix: hard-abort clears queues on /stop
@@ -1,28 +1,28 @@
 ---
-summary: "Command queue design that serializes auto-reply command execution"
+summary: "Command queue design that serializes inbound auto-reply runs"
 read_when:
 - Changing auto-reply execution or concurrency
 ---
-# Command Queue (2026-01-03)
+# Command Queue (2026-01-16)

-We now serialize command-based auto-replies (WhatsApp Web listener) through a tiny in-process queue to prevent multiple commands from running at once, while allowing safe parallelism across sessions.
+We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.

 ## Why
-- Some auto-reply commands are expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
-- Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools.
+- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
+- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.

 ## How it works
-- A lane-aware FIFO queue drains each lane synchronously.
+- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1).
 - `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
 - Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
-- When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting.
-- Typing indicators (`onReplyStart`) still fire immediately on enqueue so user experience is unchanged while we wait our turn.
+- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
+- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
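
The two-level lane scheme in this section (a per-session lane feeding a global lane, each with a concurrency cap) can be sketched roughly as below. This is an illustration under assumptions, not the project's implementation: `LaneQueue`, `setConcurrency`, `enqueue`, and `runForSession` are hypothetical names. Only the lane names `session:<key>` and `main` and the default cap of 1 come from the text.

```typescript
// Hypothetical sketch of a lane-aware FIFO queue; none of these names are the project's API.
type Task<T> = () => Promise<T>;

interface Lane {
  active: number;             // runs currently executing in this lane
  maxConcurrent: number;      // concurrency cap (defaults to 1)
  pending: Array<() => void>; // FIFO of starters waiting for a free slot
}

class LaneQueue {
  private lanes = new Map<string, Lane>();

  private lane(name: string): Lane {
    let lane = this.lanes.get(name);
    if (!lane) {
      lane = { active: 0, maxConcurrent: 1, pending: [] };
      this.lanes.set(name, lane);
    }
    return lane;
  }

  // Raise or lower a lane's cap, then start any work that now fits.
  setConcurrency(name: string, max: number): void {
    this.lane(name).maxConcurrent = Math.max(1, max);
    this.drain(name);
  }

  // Queue a task in a lane; the returned promise settles with the task's result.
  enqueue<T>(name: string, task: Task<T>): Promise<T> {
    const lane = this.lane(name);
    return new Promise<T>((resolve, reject) => {
      const start = async (): Promise<void> => {
        lane.active++;
        try {
          resolve(await task());
        } catch (err) {
          reject(err);
        } finally {
          lane.active--;
          this.drain(name); // a slot freed up, so start the next waiter
        }
      };
      lane.pending.push(() => void start());
      this.drain(name);
    });
  }

  // Start pending tasks in FIFO order while the lane is under its cap.
  private drain(name: string): void {
    const lane = this.lane(name);
    while (lane.active < lane.maxConcurrent && lane.pending.length > 0) {
      const start = lane.pending.shift()!;
      start();
    }
  }
}

// One active run per session (lane `session:<key>`), then capped globally by lane `main`.
function runForSession<T>(queue: LaneQueue, sessionKey: string, run: Task<T>): Promise<T> {
  return queue.enqueue(`session:${sessionKey}`, () => queue.enqueue("main", run));
}
```

In this sketch, calling `queue.setConcurrency("main", n)` plays the role that `agents.defaults.maxConcurrent` plays in the text: two sessions can then run side by side, while a second message for the same session still waits in its own lane.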

 ## Queue modes (per channel)
 Inbound messages can steer the current run, wait for a followup turn, or do both:
 - `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
 - `followup`: enqueue for the next agent turn after the current run ends.
-- `collect`: coalesce all queued messages into a **single** followup turn (default).
+- `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
 - `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
 - `interrupt` (legacy): abort the active run for that session, then run the newest message.
 - `queue` (legacy alias): same as `steer`.
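
The routing rule added to `collect` in this hunk could look something like the following. This is a sketch under assumptions: the `QueuedMessage` shape and the `collectFollowups` function are hypothetical, and debounce, cap, and drop handling are deliberately left out.

```typescript
// Hypothetical message shape for illustration; the real type is not shown in this diff.
interface QueuedMessage {
  channel: string;
  threadId?: string;
  text: string;
}

// Returns the followup turns to run: one combined turn when every queued
// message shares a channel/thread, otherwise one turn per message.
function collectFollowups(queued: QueuedMessage[]): QueuedMessage[][] {
  if (queued.length === 0) return [];
  const routeOf = (m: QueuedMessage): string => `${m.channel}|${m.threadId ?? ""}`;
  const firstRoute = routeOf(queued[0]!);
  const sameRoute = queued.every((m) => routeOf(m) === firstRoute);
  return sameRoute ? [queued.slice()] : queued.map((m) => [m]);
}
```

Draining individually when routes differ keeps each reply attached to the channel or thread that asked for it, at the cost of extra agent turns.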
@@ -66,9 +66,9 @@ Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
 - `/queue default` or `/queue reset` clears the session override.

 ## Scope and guarantees
-- Applies only to config-driven command replies; plain text replies are unaffected.
+- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
 - Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
-- Additional lanes may exist (e.g. `cron`) so background jobs can run in parallel without blocking inbound replies.
+- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
 - Per-session lanes guarantee that only one agent run touches a given session at a time.
 - No external dependencies or background worker threads; pure TypeScript + promises.
@@ -105,7 +105,7 @@ Send these as standalone messages so they register.
 - `clawdbot gateway call sessions.list --params '{}'` — fetch sessions from the running gateway (use `--url`/`--token` for remote gateway access).
 - Send `/status` as a standalone message in chat to see whether the agent is reachable, how much of the session context is used, current thinking/verbose toggles, and when your WhatsApp web creds were last refreshed (helps spot relink needs).
 - Send `/context list` or `/context detail` to see what’s in the system prompt and injected workspace files (and the biggest context contributors).
-- Send `/stop` as a standalone message to abort the current run.
+- Send `/stop` as a standalone message to abort the current run and clear queued followups for that session.
 - Send `/compact` (optional instructions) as a standalone message to summarize older context and free up window space. See [/concepts/compaction](/concepts/compaction).
 - JSONL transcripts can be opened directly to review full turns.
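
The behavior change this commit describes for `/stop` (abort the current run and clear queued followups for that session) might be modeled as below. The state shape, the `sessions` map, and `hardAbort` are all hypothetical names chosen for this sketch. The diff shown here does not include the actual implementation.

```typescript
// Hypothetical per-session state; field names are assumptions for this sketch.
interface SessionState {
  activeRun?: { cancel: () => void }; // handle for the in-flight run, if any
  followups: string[];                // messages queued for later turns
}

const sessions = new Map<string, SessionState>();

// Abort the session's active run (if any) and drop its queued followups.
function hardAbort(sessionKey: string): { aborted: boolean; cleared: number } {
  const state = sessions.get(sessionKey);
  if (!state) return { aborted: false, cleared: 0 };

  const aborted = state.activeRun !== undefined;
  state.activeRun?.cancel();
  delete state.activeRun;

  const cleared = state.followups.length;
  state.followups.length = 0; // only this session's queue is emptied
  return { aborted, cleared };
}
```

Clearing the queue together with the abort means a `/stop` is not immediately followed by a run over messages that were queued before the user asked to stop. Other sessions are untouched.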