---
summary: "Command queue design that serializes auto-reply command execution"
read_when:
  - Changing auto-reply execution or concurrency
---

# Command Queue (2026-01-03)

We now serialize command-based auto-replies (WhatsApp Web listener) through a tiny in-process queue to prevent multiple commands from running at once, while allowing safe parallelism across sessions.

## Why

- Some auto-reply commands are expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids competing for terminal/stdin, keeps logs readable, and reduces the chance of rate limits from upstream tools.

## How it works

- `src/process/command-queue.ts` holds a lane-aware FIFO queue and drains each lane sequentially, one task at a time (see the sketch below).
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agent.maxConcurrent`.
- When verbose logging is enabled, queued commands emit a short notice if they waited more than ~2s before starting.
- Typing indicators (`onReplyStart`) still fire immediately on enqueue so the user experience is unchanged while we wait our turn.

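For orientation, here is a minimal sketch of the lane-chaining idea, assuming one promise chain per lane. `enqueue` and `runForSession` are hypothetical names, not the real exports; the actual code lives in `src/process/command-queue.ts` and also enforces the `agent.maxConcurrent` cap, which this sketch omits (a cap above 1 would replace the single `main` chain with a small semaphore).

```ts
// Illustrative lane-aware FIFO queue: tasks in the same lane run one at a time.
type Task<T> = () => Promise<T>;

const laneTails = new Map<string, Promise<unknown>>();

function enqueue<T>(lane: string, task: Task<T>): Promise<T> {
  const tail = laneTails.get(lane) ?? Promise.resolve();
  // Chain onto the lane tail; swallow prior errors so one failure cannot wedge the lane.
  const run = tail.catch(() => undefined).then(task);
  laneTails.set(lane, run.catch(() => undefined));
  return run;
}

// Per-session serialization nested inside the global lane: only one run per session,
// and (with an effective concurrency of 1) only one run process-wide.
function runForSession(sessionKey: string, work: Task<void>): Promise<void> {
  return enqueue(`session:${sessionKey}`, () => enqueue("main", work));
}
```
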
## Queue modes (per surface)

Inbound messages can steer the current run, wait for a followup turn, or do both (a rough dispatch sketch follows below):

- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a **single** followup turn (default).
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
- `queue` (legacy alias): same as `steer`.

Steer-backlog means you can get a followup response after the steered run, so streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want one response per inbound message.

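The sketch below captures the rough semantics of these modes; `ActiveRun`, `steer`, and `abort` are hypothetical stand-ins for the real run handle, and the real routing additionally handles tool-boundary cancellation, debounce, and caps.

```ts
// Illustrative only: rough dispatch over the queue modes described above.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog" | "interrupt" | "queue";

interface ActiveRun {
  streaming: boolean;
  steer(text: string): void;      // inject into the live run
  abort(): Promise<void>;         // legacy `interrupt` path
}

function handleInbound(mode: QueueMode, text: string, run: ActiveRun | undefined, backlog: string[]): void {
  switch (mode) {
    case "queue":                  // legacy alias for steer
    case "steer":
      if (run?.streaming) run.steer(text);
      else backlog.push(text);     // not streaming: fall back to a followup turn
      break;
    case "steer-backlog":
      if (run?.streaming) run.steer(text);
      backlog.push(text);          // always keep it for a followup turn too
      break;
    case "interrupt":
      void run?.abort().then(() => backlog.splice(0, backlog.length, text)); // newest message only
      break;
    case "followup":
    case "collect":                // collect later coalesces the whole backlog into one turn
      backlog.push(text);
      break;
  }
}
```
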
Defaults (when unset in config):

- All surfaces → `collect`

Configure globally or per surface via `routing.queue`:

```json5
{
  routing: {
    queue: {
      mode: "collect",
      debounceMs: 1000,
      cap: 20,
      drop: "summarize",
      bySurface: { discord: "collect" }
    }
  }
}
```

## Queue options

Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):

- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`); see the sketch below.

Summarize keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.
Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.

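A sketch of the cap/drop and debounce behaviour under these defaults; `pushWithCap` and `scheduleFollowup` are illustrative names only, and the real summarize prompt wording will differ.

```ts
// Illustrative cap/drop handling for the per-session backlog.
type DropPolicy = "old" | "new" | "summarize";

function pushWithCap(backlog: string[], msg: string, cap: number, drop: DropPolicy): void {
  backlog.push(msg);
  if (backlog.length <= cap) return;
  switch (drop) {
    case "old":
      backlog.shift();                                   // discard the oldest queued message
      break;
    case "new":
      backlog.pop();                                     // discard the message that just arrived
      break;
    case "summarize": {
      // Collapse the oldest messages into one synthetic bullet-list prompt.
      const dropped = backlog.splice(0, backlog.length - cap + 1);
      backlog.unshift(`(dropped while busy)\n${dropped.map((m) => `- ${m}`).join("\n")}`);
      break;
    }
  }
}

// Debounce: only start the followup turn after `debounceMs` of quiet.
let timer: ReturnType<typeof setTimeout> | undefined;
function scheduleFollowup(startTurn: () => void, debounceMs: number): void {
  if (timer !== undefined) clearTimeout(timer);
  timer = setTimeout(startTurn, debounceMs);
}
```
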
## Per-session overrides

- `/queue <mode>` as a standalone command stores the mode for the current session.
- `/queue <mode>` embedded in a message applies **once** (no persistence).
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize` (parsed roughly as in the sketch below).
- `/queue default` or `/queue reset` clears the session override.

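Purely as an illustration of how the combined form could be parsed; the helper names and the exact option grammar here are assumptions, not the project's actual parser.

```ts
// Illustrative parser for the /queue directive; option names follow the examples above.
interface QueueOverride {
  mode: string;
  debounceMs?: number;
  cap?: number;
  drop?: string;
}

function parseQueueDirective(text: string): QueueOverride | "reset" | undefined {
  const match = text.match(/\/queue\s+(\S+)((?:\s+\S+)*)/i);
  if (!match) return undefined;                      // no /queue directive in this message
  const [, mode, rest] = match;
  if (mode === "default" || mode === "reset") return "reset";
  const override: QueueOverride = { mode };
  for (const opt of rest.trim().split(/\s+/).filter(Boolean)) {
    const [key, value] = opt.split(":");
    if (!value) continue;
    if (key === "debounce") override.debounceMs = parseDuration(value);  // e.g. "2s" -> 2000
    else if (key === "cap") override.cap = Number(value);
    else if (key === "drop") override.drop = value;
  }
  return override;
}

function parseDuration(value: string): number {
  if (value.endsWith("ms")) return Number(value.slice(0, -2));
  if (value.endsWith("s")) return Number(value.slice(0, -1)) * 1000;
  return Number(value);
}
```
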
## Scope and guarantees

- Applies only to config-driven command replies; plain text replies are unaffected.
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agent.maxConcurrent` to allow multiple sessions in parallel.
- Additional lanes may exist (e.g. `cron`) so background jobs can run in parallel without blocking inbound replies.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.

## Troubleshooting

- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- `enqueueCommand` exposes a lightweight `getQueueSize()` helper if you need to surface queue depth in future diagnostics.