summary: Streaming + chunking behavior (block replies, draft streaming, limits)
read_when:
  • Explaining how streaming or chunking works on providers
  • Changing block streaming or provider chunking behavior
  • Debugging duplicate/early block replies or draft streaming

Streaming + chunking

Clawdbot has two separate “streaming” layers:

  • Block streaming (providers): emit completed blocks as the assistant writes. These are normal provider messages (not token deltas).
  • Token-ish streaming (Telegram only): update a draft bubble with partial text while generating; final message is sent at the end.

There is no real token streaming to external provider messages today. Telegram draft streaming is the only partial-stream surface.
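
A minimal sketch of how the two layers are toggled, assuming the dotted config paths used throughout this page map onto nested JSON keys (comments are annotations only):

```jsonc
{
  "agents": {
    "defaults": {
      "blockStreamingDefault": "on"   // layer 1: block replies (non-Telegram providers also need *.blockStreaming: true)
    }
  },
  "telegram": {
    "streamMode": "partial"           // layer 2: Telegram-only draft streaming
  }
}
```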

Block streaming (provider messages)

Block streaming sends assistant output in coarse chunks as it becomes available.

Model output
  └─ text_delta/events
       ├─ (blockStreamingBreak=text_end)
       │    └─ chunker emits blocks as buffer grows
       └─ (blockStreamingBreak=message_end)
            └─ chunker flushes at message_end
                   └─ provider send (block replies)

Legend:

  • text_delta/events: model stream events (may be sparse for non-streaming models).
  • chunker: EmbeddedBlockChunker applying min/max bounds + break preference.
  • provider send: actual outbound messages (block replies).

Controls:

  • agents.defaults.blockStreamingDefault: "on"/"off" (default off).
  • Provider overrides: *.blockStreaming (and per-account variants) to force "on"/"off" per provider.
  • agents.defaults.blockStreamingBreak: "text_end" or "message_end".
  • agents.defaults.blockStreamingChunk: { minChars, maxChars, breakPreference? }.
  • agents.defaults.blockStreamingCoalesce: { minChars?, maxChars?, idleMs? } (merge streamed blocks before send).
  • Provider hard cap: *.textChunkLimit (e.g., whatsapp.textChunkLimit).
  • Discord soft cap: discord.maxLinesPerMessage (default 17) splits tall replies to avoid UI clipping.
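
Putting several of these controls together, a hedged sketch (nesting inferred from the dotted paths above; the numbers are illustrative, not recommended defaults):

```jsonc
{
  "agents": {
    "defaults": {
      "blockStreamingDefault": "on",
      "blockStreamingBreak": "text_end",
      "blockStreamingChunk": { "minChars": 200, "maxChars": 1200, "breakPreference": "paragraph" },
      "blockStreamingCoalesce": { "minChars": 400, "maxChars": 2000, "idleMs": 1000 }
    }
  },
  "whatsapp": { "blockStreaming": true, "textChunkLimit": 4000 },
  "discord": { "blockStreaming": true, "maxLinesPerMessage": 17 }
}
```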

Boundary semantics:

  • text_end: stream blocks as soon as chunker emits; flush on each text_end.
  • message_end: wait until assistant message finishes, then flush buffered output.

message_end still uses the chunker if the buffered text exceeds maxChars, so it can emit multiple chunks at the end.

Chunking algorithm (low/high bounds)

Block chunking is implemented by EmbeddedBlockChunker:

  • Low bound: don't emit until buffer >= minChars (unless forced).
  • High bound: prefer splits before maxChars; if forced, split at maxChars.
  • Break preference: paragraph → newline → sentence → whitespace → hard break.
  • Code fences: never split inside fences; when forced at maxChars, close + reopen the fence to keep Markdown valid.

maxChars is clamped to the provider textChunkLimit, so you can't exceed per-provider caps.
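
As a hedged illustration of the bounds (numbers made up for the example): with minChars: 300 and maxChars: 800, a buffer that reaches 1,000 characters with a paragraph break at character 600 can emit one block at that break (above the low bound, below the high bound); with no usable break point, the chunker force-splits at 800.

```jsonc
{
  "agents": {
    "defaults": {
      "blockStreamingChunk": {
        "minChars": 300,                 // low bound: no block emitted below this unless forced
        "maxChars": 800,                 // high bound: clamped to the provider's textChunkLimit
        "breakPreference": "paragraph"   // assumed value; falls back toward harder breaks as needed
      }
    }
  }
}
```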

Coalescing (merge streamed blocks)

When block streaming is enabled, Clawdbot can merge consecutive block chunks before sending them out. This reduces “single-line spam” while still providing progressive output.

  • Coalescing waits for idle gaps (idleMs) before flushing.
  • Buffers are capped by maxChars and will flush if they exceed it.
  • minChars prevents tiny fragments from sending until enough text accumulates (final flush always sends remaining text).
  • Joiner is derived from blockStreamingChunk.breakPreference (paragraph → \n\n, newline → \n, sentence → space).
  • Provider overrides are available via *.blockStreamingCoalesce (including per-account configs).
  • Default coalesce minChars is bumped to 1500 for Signal/Slack/Discord unless overridden.
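
A hedged example of coalescing settings with a per-provider override (nesting inferred from the dotted paths; values illustrative):

```jsonc
{
  "agents": {
    "defaults": {
      "blockStreamingCoalesce": { "minChars": 800, "maxChars": 2500, "idleMs": 1200 }
    }
  },
  "signal": {
    // per-provider override; Signal/Slack/Discord otherwise default to minChars 1500
    "blockStreamingCoalesce": { "minChars": 1500, "idleMs": 2000 }
  }
}
```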

Human-like pacing between blocks

When block streaming is enabled, you can add a randomized pause between block replies (after the first block). This makes multi-bubble responses feel more natural.

  • Config: agents.defaults.humanDelay (override per agent via agents.list[].humanDelay).
  • Modes: off (default), natural (800-2500ms), custom (minMs/maxMs).
  • Applies only to block replies, not final replies or tool summaries.
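
A sketch of the non-default modes. The exact field names are an assumption here (this page only names the modes plus minMs/maxMs), so treat the shape as illustrative:

```jsonc
{
  "agents": {
    "defaults": {
      "humanDelay": { "mode": "natural" }   // assumed shape: random pause of roughly 800-2500ms between blocks
    },
    "list": [
      {
        // hypothetical per-agent override with explicit bounds
        "humanDelay": { "mode": "custom", "minMs": 500, "maxMs": 1500 }
      }
    ]
  }
}
```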

“Stream chunks or everything”

This maps to:

  • Stream chunks: blockStreamingDefault: "on" + blockStreamingBreak: "text_end" (emit as you go). Non-Telegram providers also need *.blockStreaming: true.
  • Stream everything at end: blockStreamingBreak: "message_end" (flush once, possibly multiple chunks if very long).
  • No block streaming: blockStreamingDefault: "off" (only final reply).

Provider note: For non-Telegram providers, block streaming is off unless *.blockStreaming is explicitly set to true. Telegram can stream drafts (telegram.streamMode) without block replies.
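
For instance, a hedged sketch of the "stream chunks" setup on a non-Telegram provider (Signal is just an example; nesting inferred from the dotted paths):

```jsonc
{
  "agents": {
    "defaults": {
      "blockStreamingDefault": "on",      // enable block replies by default
      "blockStreamingBreak": "text_end"   // emit blocks as the chunker produces them
    }
  },
  "signal": {
    "blockStreaming": true                // non-Telegram providers need this explicit opt-in
  }
}
```

Switching blockStreamingBreak to "message_end" gives the flush-at-end behavior, and setting blockStreamingDefault to "off" (or dropping the provider opt-in) falls back to final-reply-only.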

Telegram draft streaming (token-ish)

Telegram is the only provider with draft streaming:

  • Uses Bot API sendMessageDraft in private chats with topics.
  • telegram.streamMode: "partial" | "block" | "off".
    • partial: draft updates with the latest stream text.
    • block: draft updates in chunked blocks (same chunker rules).
    • off: no draft streaming.
  • Draft streaming is separate from block streaming; block replies are off by default and only enabled by *.blockStreaming: true on non-Telegram providers.
  • Final reply is still a normal message.
  • /reasoning stream writes reasoning into the draft bubble (Telegram only).

When draft streaming is active, Clawdbot disables block streaming for that reply to avoid double-streaming.
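
A hedged Telegram example (draft streaming only; block replies stay off because they require an explicit *.blockStreaming opt-in elsewhere):

```jsonc
{
  "telegram": {
    "streamMode": "partial"   // "partial" | "block" | "off"; drafts update while the reply is generated
  }
}
```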

Telegram (private + topics)
  └─ sendMessageDraft (draft bubble)
       ├─ streamMode=partial → update latest text
       └─ streamMode=block   → chunker updates draft
  └─ final reply → normal message

Legend:

  • sendMessageDraft: Telegram draft bubble (not a real message).
  • final reply: normal Telegram message send.