feat: stream reply blocks immediately
This commit is contained in:
@@ -71,6 +71,8 @@ Legacy Pi/Tau session folders are **not** read.
|
||||
## Steering while streaming
|
||||
|
||||
Incoming user messages are queued while the agent is streaming. The queue is checked **after each tool call**. If a queued message is present, remaining tool calls from the current assistant message are skipped (error tool results with "Skipped due to queued user message."), then the queued user message is injected before the next assistant response.
|
||||
Block streaming sends completed assistant blocks as soon as they finish; disable
|
||||
via `agent.blockStreamingDefault: "off"` if you only want the final response.
|
||||
|
||||
## Configuration (minimal)
|
||||
|
||||
|
||||
@@ -330,6 +330,7 @@ Controls the embedded agent runtime (model/thinking/verbose/timeouts).
|
||||
},
|
||||
thinkingDefault: "low",
|
||||
verboseDefault: "off",
|
||||
blockStreamingDefault: "on",
|
||||
timeoutSeconds: 600,
|
||||
mediaMaxMb: 5,
|
||||
heartbeat: {
|
||||
@@ -354,6 +355,10 @@ deprecation fallback.
|
||||
Z.AI models are available as `zai/<model>` (e.g. `zai/glm-4.7`) and require
|
||||
`ZAI_API_KEY` (or legacy `Z_AI_API_KEY`) in the environment.
|
||||
|
||||
`agent.blockStreamingDefault` controls whether completed assistant blocks
|
||||
(`message_end` chunks) are sent immediately (default: `on`). Set to `off` to
|
||||
only deliver the final consolidated reply.
|
||||
|
||||
`agent.heartbeat` configures periodic heartbeat runs:
|
||||
- `every`: duration string (`ms`, `s`, `m`, `h`); default unit minutes. Omit or set
|
||||
`0m` to disable.
|
||||
|
||||
Reference in New Issue
Block a user