Files
clawdbot/docs/concepts/streaming.md
Lloyd ab994d2c63 feat(agent): add human-like delay between block replies
Adds `agent.humanDelay` config option to create natural rhythm between
streamed message bubbles. When enabled, introduces a random delay
(default 800-2500ms) between block replies, making multi-message
responses feel more like natural human texting.

Config example:
```json
{
  "agent": {
    "blockStreamingDefault": "on",
    "humanDelay": {
      "enabled": true,
      "minMs": 800,
      "maxMs": 2500
    }
  }
}
```

- First message sends immediately
- Subsequent messages wait a random delay before sending
- Works with iMessage, Signal, and Discord providers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 17:12:50 +01:00

119 lines
5.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
summary: "Streaming + chunking behavior (block replies, draft streaming, limits)"
read_when:
- Explaining how streaming or chunking works on providers
- Changing block streaming or provider chunking behavior
- Debugging duplicate/early block replies or draft streaming
---
# Streaming + chunking
Clawdbot has two separate “streaming” layers:
- **Block streaming (providers):** emit completed **blocks** as the assistant writes. These are normal provider messages (not token deltas).
- **Token-ish streaming (Telegram only):** update a **draft bubble** with partial text while generating; final message is sent at the end.
There is **no real token streaming** to external provider messages today. Telegram draft streaming is the only partial-stream surface.
## Block streaming (provider messages)
Block streaming sends assistant output in coarse chunks as it becomes available.
```
Model output
└─ text_delta/events
├─ (blockStreamingBreak=text_end)
│ └─ chunker emits blocks as buffer grows
└─ (blockStreamingBreak=message_end)
└─ chunker flushes at message_end
└─ provider send (block replies)
```
Legend:
- `text_delta/events`: model stream events (may be sparse for non-streaming models).
- `chunker`: `EmbeddedBlockChunker` applying min/max bounds + break preference.
- `provider send`: actual outbound messages (block replies).
**Controls:**
- `agents.defaults.blockStreamingDefault`: `"on"`/`"off"` (default off).
- Provider overrides: `*.blockStreaming` (and per-account variants) to force `"on"`/`"off"` per provider.
- `agents.defaults.blockStreamingBreak`: `"text_end"` or `"message_end"`.
- `agents.defaults.blockStreamingChunk`: `{ minChars, maxChars, breakPreference? }`.
- `agents.defaults.blockStreamingCoalesce`: `{ minChars?, maxChars?, idleMs? }` (merge streamed blocks before send).
- Provider hard cap: `*.textChunkLimit` (e.g., `whatsapp.textChunkLimit`).
- Discord soft cap: `discord.maxLinesPerMessage` (default 17) splits tall replies to avoid UI clipping.
**Boundary semantics:**
- `text_end`: stream blocks as soon as chunker emits; flush on each `text_end`.
- `message_end`: wait until assistant message finishes, then flush buffered output.
`message_end` still uses the chunker if the buffered text exceeds `maxChars`, so it can emit multiple chunks at the end.
## Chunking algorithm (low/high bounds)
Block chunking is implemented by `EmbeddedBlockChunker`:
- **Low bound:** dont emit until buffer >= `minChars` (unless forced).
- **High bound:** prefer splits before `maxChars`; if forced, split at `maxChars`.
- **Break preference:** `paragraph``newline``sentence``whitespace` → hard break.
- **Code fences:** never split inside fences; when forced at `maxChars`, close + reopen the fence to keep Markdown valid.
`maxChars` is clamped to the provider `textChunkLimit`, so you cant exceed per-provider caps.
## Coalescing (merge streamed blocks)
When block streaming is enabled, Clawdbot can **merge consecutive block chunks**
before sending them out. This reduces “single-line spam” while still providing
progressive output.
- Coalescing waits for **idle gaps** (`idleMs`) before flushing.
- Buffers are capped by `maxChars` and will flush if they exceed it.
- `minChars` prevents tiny fragments from sending until enough text accumulates
(final flush always sends remaining text).
- Joiner is derived from `blockStreamingChunk.breakPreference`
(`paragraph``\n\n`, `newline``\n`, `sentence` → space).
- Provider overrides are available via `*.blockStreamingCoalesce` (including per-account configs).
- Default coalesce `minChars` is bumped to 1500 for Signal/Slack/Discord unless overridden.
## Human-like pacing between blocks
When block streaming is enabled, you can add a **randomized pause** between
block replies (after the first block). This makes multi-bubble responses feel
more natural.
- Config: `agents.defaults.humanDelay` (override per agent via `agents.list[].humanDelay`).
- Modes: `off` (default), `natural` (8002500ms), `custom` (`minMs`/`maxMs`).
- Applies only to **block replies**, not final replies or tool summaries.
## “Stream chunks or everything”
This maps to:
- **Stream chunks:** `blockStreamingDefault: "on"` + `blockStreamingBreak: "text_end"` (emit as you go). Non-Telegram providers also need `*.blockStreaming: true`.
- **Stream everything at end:** `blockStreamingBreak: "message_end"` (flush once, possibly multiple chunks if very long).
- **No block streaming:** `blockStreamingDefault: "off"` (only final reply).
**Provider note:** For non-Telegram providers, block streaming is **off unless**
`*.blockStreaming` is explicitly set to `true`. Telegram can stream drafts
(`telegram.streamMode`) without block replies.
## Telegram draft streaming (token-ish)
Telegram is the only provider with draft streaming:
- Uses Bot API `sendMessageDraft` in **private chats with topics**.
- `telegram.streamMode: "partial" | "block" | "off"`.
- `partial`: draft updates with the latest stream text.
- `block`: draft updates in chunked blocks (same chunker rules).
- `off`: no draft streaming.
- Draft streaming is separate from block streaming; block replies are off by default and only enabled by `*.blockStreaming: true` on non-Telegram providers.
- Final reply is still a normal message.
- `/reasoning stream` writes reasoning into the draft bubble (Telegram only).
When draft streaming is active, Clawdbot disables block streaming for that reply to avoid double-streaming.
```
Telegram (private + topics)
└─ sendMessageDraft (draft bubble)
├─ streamMode=partial → update latest text
└─ streamMode=block → chunker updates draft
└─ final reply → normal message
```
Legend:
- `sendMessageDraft`: Telegram draft bubble (not a real message).
- `final reply`: normal Telegram message send.