- Add isHeartbeat to AgentRunContext to track heartbeat runs
- Pass isHeartbeat flag through agent runner execution
- Suppress webchat broadcast (deltas + final) for heartbeat runs when showOk is false
- Webchat uses channels.defaults.heartbeat settings (no per-channel config)
- Default behavior: hide HEARTBEAT_OK from webchat (matches other channels)
This allows users to control whether heartbeat responses appear in
the webchat UI via channels.defaults.heartbeat.showOk (defaults to false).
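A minimal sketch of the resulting check, assuming hypothetical shapes for the run context and heartbeat config (only the `isHeartbeat` flag and `channels.defaults.heartbeat.showOk` come from this change; the surrounding names are illustrative):

```ts
// Sketch only: shapes and names below are illustrative.
interface AgentRunContext {
  isHeartbeat?: boolean; // new flag set for heartbeat-triggered runs
}

interface HeartbeatConfig {
  showOk?: boolean; // channels.defaults.heartbeat.showOk; defaults to false
}

// Decide whether webchat should broadcast deltas and the final message.
function shouldBroadcastToWebchat(
  ctx: AgentRunContext,
  heartbeat: HeartbeatConfig,
): boolean {
  // Suppress both streaming deltas and the final HEARTBEAT_OK payload
  // for heartbeat runs unless the user opted in via showOk.
  if (ctx.isHeartbeat && !(heartbeat.showOk ?? false)) return false;
  return true;
}
```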
CLI backends (claude-cli, etc.) don't emit streaming assistant events,
causing the TUI to show "(no output)" despite correct processing. The runner
now emits an assistant event carrying the final text before the lifecycle-end
event, so the server-chat buffer gets populated for WebSocket clients.
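A sketch of the shape of that fix, with assumed event and emitter types (`AgentEvent` and `emit` are illustrative; the actual event shapes in the codebase may differ):

```ts
// Assumed event shapes: after a CLI backend (claude-cli, etc.) finishes,
// emit one synthetic assistant event with the final text so streaming
// consumers (TUI, server-chat buffer, WebSocket clients) see the output.
type AgentEvent =
  | { type: "assistant"; text: string }
  | { type: "lifecycle"; phase: "end" };

function finishCliRun(finalText: string, emit: (e: AgentEvent) => void): void {
  if (finalText.length > 0) {
    emit({ type: "assistant", text: finalText }); // never streamed, so emit once here
  }
  emit({ type: "lifecycle", phase: "end" }); // lifecycle end comes last
}
```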
- Updated message processing to include full message IDs alongside short IDs for better context resolution.
- Improved reply handling by caching inbound messages, enabling accurate sender and body resolution without exposing dropped content.
- Adjusted tests to validate the new full ID properties and their integration into the message handling workflow.
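A hedged sketch of the caching idea; the type and function names here (`InboundMessage`, `resolveReplyTarget`) are hypothetical, not from the codebase:

```ts
// Illustrative only: cache inbound messages by full ID so a later reply
// can resolve the original sender and body from what was actually kept.
interface InboundMessage {
  id: string;      // full message ID
  shortId: string; // short ID still included for compact display
  sender: string;
  body: string;
}

const inboundCache = new Map<string, InboundMessage>();

function rememberInbound(msg: InboundMessage): void {
  inboundCache.set(msg.id, msg);
}

function resolveReplyTarget(
  fullId: string,
): { sender: string; body: string } | undefined {
  const cached = inboundCache.get(fullId);
  // Resolve only from the cache; dropped content is never reconstructed.
  return cached ? { sender: cached.sender, body: cached.body } : undefined;
}
```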
The regex `/context window/i` was matching "Model context window too small"
and rewriting it to the generic "Context overflow" message, hiding the actual
problem from users.
Add an exclusion for the "too small|minimum is" patterns so the original,
informative error message passes through.
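The two patterns below are quoted from this change; the surrounding function is only a sketch of how the exclusion slots in:

```ts
// Broad pattern that previously also caught "Model context window too small".
const contextOverflowRe = /context window/i;
// New exclusion: errors that already explain the real problem pass through.
const informativeRe = /too small|minimum is/i;

function normalizeContextError(message: string): string {
  if (contextOverflowRe.test(message) && !informativeRe.test(message)) {
    return "Context overflow"; // generic rewrite, as before
  }
  return message; // informative errors are preserved verbatim
}
```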
🤖 AI-assisted (Claude) - tested on local instance
Co-Authored-By: Claude <noreply@anthropic.com>
When the context limit is exceeded, the error message now suggests
setting agents.defaults.compaction.reserveTokensFloor to 4000
or higher to prevent future occurrences.
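For illustration, the suggested setting as a TypeScript object literal; the on-disk config format is an assumption here, and only the key path and value come from the message above:

```ts
// Hypothetical config fragment; the real config file format may differ.
const config = {
  agents: {
    defaults: {
      compaction: {
        // Reserve headroom so compaction can run before the limit is hit.
        reserveTokensFloor: 4000, // suggested minimum from the new error text
      },
    },
  },
};
```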
Previously, when auto-compaction failed due to context overflow, the system
would reset the session and silently continue the execution loop without
sending any response to the user. This made it appear as if messages were
being ignored.
This change ensures users receive a clear error message explaining that
the context limit was exceeded and the conversation has been reset,
consistent with how role ordering conflicts are already handled.
Fixes the silent-failure case where the message plus compaction overhead exceeds the context limit.
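A sketch of the new failure path under assumed helper names (`resetSession` and `sendToUser` are illustrative):

```ts
// Previously: reset the session and silently continue the loop.
// Now: reset, then tell the user what happened, mirroring how role
// ordering conflicts are already reported.
async function handleCompactionOverflow(
  resetSession: () => Promise<void>,
  sendToUser: (text: string) => Promise<void>,
): Promise<void> {
  await resetSession();
  await sendToUser(
    "Context limit exceeded; the conversation has been reset. " +
      "Consider raising agents.defaults.compaction.reserveTokensFloor (e.g. to 4000).",
  );
}
```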
Previously, when block streaming was disabled (the default), text generated
between tool calls would only appear after all tools completed. This was
because onBlockReply wasn't passed to the subscription when block streaming
was off, so flushBlockReplyBuffer() before tool execution did nothing.
Now onBlockReply is always passed, and when block streaming is disabled,
block replies are sent directly during the tool flush. Directly sent payloads
are tracked so they are not duplicated in the final payloads.
Also fixes a race condition where tool summaries could be emitted before the
typing indicator started: tool handlers now await onAgentEvent.
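A sketch of the always-wired handler and the dedupe set; apart from `onBlockReply` itself, the names are assumptions:

```ts
// onBlockReply is now always passed to the subscription. When block
// streaming is off, it delivers text directly at tool-flush time and
// records the payload so the final pass can skip it.
const directlySent = new Set<string>();

function makeOnBlockReply(
  blockStreamingEnabled: boolean,
  send: (text: string) => void,
): (text: string) => void {
  return (text: string): void => {
    if (blockStreamingEnabled) return; // streaming path delivers this normally
    send(text); // surface text generated between tool calls immediately
    directlySent.add(text); // remember it to avoid duplicate final payloads
  };
}

function withoutDirectlySent(finalPayloads: string[]): string[] {
  return finalPayloads.filter((p) => !directlySent.has(p));
}
```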
Adds support for template variables in `messages.responsePrefix` that
resolve dynamically at runtime with the actual model used (including
after fallback).
Supported variables (case-insensitive):
- {model} - short model name (e.g., "claude-opus-4-5", "gpt-4o")
- {modelFull} - full model identifier (e.g., "anthropic/claude-opus-4-5")
- {provider} - provider name (e.g., "anthropic", "openai")
- {thinkingLevel} or {think} - thinking level ("high", "low", "off")
- {identity.name} or {identityName} - agent identity name
Example: "[{model} | think:{thinkingLevel}]" → "[claude-opus-4-5 | think:high]"
Variables show the actual model used after fallback, not the intended
model. Unresolved variables remain as literal text.
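A sketch of the interpolation, assuming a context shape for the resolved model; the variable names and semantics come from the list above, but the function itself is illustrative, not the contents of response-prefix-template.ts:

```ts
// Assumed context shape for the actually-used model (post-fallback).
interface PrefixContext {
  model: string;         // e.g. "claude-opus-4-5"
  modelFull: string;     // e.g. "anthropic/claude-opus-4-5"
  provider: string;      // e.g. "anthropic"
  thinkingLevel: string; // "high" | "low" | "off"
  identityName: string;
}

function renderResponsePrefix(template: string, ctx: PrefixContext): string {
  // Lowercased keys give case-insensitive lookup.
  const vars: Record<string, string> = {
    model: ctx.model,
    modelfull: ctx.modelFull,
    provider: ctx.provider,
    thinkinglevel: ctx.thinkingLevel,
    think: ctx.thinkingLevel,
    "identity.name": ctx.identityName,
    identityname: ctx.identityName,
  };
  return template.replace(/\{([^}]+)\}/g, (match, name: string) => {
    const value = vars[name.toLowerCase()];
    return value ?? match; // unresolved variables stay as literal text
  });
}

// renderResponsePrefix("[{model} | think:{thinkingLevel}]", ctx)
// → "[claude-opus-4-5 | think:high]"
```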
Implementation:
- New module: src/auto-reply/reply/response-prefix-template.ts
- Template interpolation in normalize-reply.ts via context provider
- onModelSelected callback in agent-runner-execution.ts
- Updated all 6 provider message handlers (web, signal, discord,
telegram, slack, imessage)
- 27 unit tests covering all variables and edge cases
- Documentation in docs/gateway/configuration.md and JSDoc
Fixes #923