docs: add session pruning docs
92
docs/concepts/session-pruning.md
Normal file
@@ -0,0 +1,92 @@
---
summary: "Session pruning: opt-in tool-result trimming to reduce context bloat"
read_when:
  - You want to reduce LLM context growth from tool outputs
  - You are tuning agent.contextPruning
---
# Session Pruning

Session pruning trims **old tool results** from the in-memory context right before each LLM call. It is **opt-in** and does **not** rewrite the on-disk session history (`*.jsonl`).

## When it runs
- Before each LLM request (context hook).
- Only affects the messages sent to the model for that request.

## What can be pruned
- Only `toolResult` messages.
- User + assistant messages are **never** modified.
- The last `keepLastAssistants` assistant messages are protected; tool results after that cutoff are not pruned.
- If there aren’t enough assistant messages to establish the cutoff, pruning is skipped.
- Tool results containing **image blocks** are skipped (never trimmed/cleared).
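
The cutoff rule above can be sketched roughly as follows. This is a minimal illustration, not Clawdbot's implementation: the `Msg` shape, the `hasImage` flag, and both function names are hypothetical, and the sketch assumes `keepLastAssistants` is at least 1.

```typescript
// Hypothetical message shape for illustration only.
type Role = "user" | "assistant" | "toolResult";
interface Msg {
  role: Role;
  hasImage?: boolean; // assumed flag: the tool result contains an image block
}

// Index of the Nth-from-last assistant message, or null when fewer than N exist.
function findCutoffIndex(messages: Msg[], keepLastAssistants: number): number | null {
  let seen = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === "assistant") {
      seen++;
      if (seen === keepLastAssistants) return i;
    }
  }
  return null; // not enough assistant messages to establish the cutoff
}

// Indexes of tool results that are eligible for pruning.
function prunableIndexes(messages: Msg[], keepLastAssistants: number): number[] {
  const cutoff = findCutoffIndex(messages, keepLastAssistants);
  if (cutoff === null) return []; // pruning is skipped entirely
  const eligible: number[] = [];
  for (let i = 0; i < cutoff; i++) {
    const m = messages[i];
    // Only toolResult messages before the cutoff, and never ones with images.
    if (m && m.role === "toolResult" && !m.hasImage) eligible.push(i);
  }
  return eligible;
}
```

User and assistant messages never appear in the result, so a caller that only rewrites the returned indexes leaves them untouched.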

## Context window estimation
Pruning works from an estimated context window measured in characters, using the approximation chars ≈ tokens × 4. The window size is resolved in this order:
1) Model definition `contextWindow` (from the model registry).
2) `models.providers.*.models[].contextWindow` override.
3) `agent.contextTokens`.
4) Default `200000` tokens.
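
As a rough sketch of the resolution order and the character-based ratio, assuming that the first source with a positive value wins: the `WindowSources` field names and both function names below are hypothetical and are not taken from Clawdbot's code.

```typescript
// Hypothetical inputs, one per documented source.
interface WindowSources {
  modelContextWindow?: number; // 1) model definition from the model registry
  providerOverride?: number;   // 2) models.providers.*.models[].contextWindow
  agentContextTokens?: number; // 3) agent.contextTokens
}

const DEFAULT_CONTEXT_TOKENS = 200_000; // 4) documented default
const CHARS_PER_TOKEN = 4;              // chars ≈ tokens × 4

function isPositive(n: number | undefined): n is number {
  return typeof n === "number" && Number.isFinite(n) && n > 0;
}

// Assumption: the first source (in documented order) with a usable value wins.
function resolveContextWindowTokens(src: WindowSources): number {
  const candidates = [src.modelContextWindow, src.providerOverride, src.agentContextTokens];
  for (const c of candidates) {
    if (isPositive(c)) return c;
  }
  return DEFAULT_CONTEXT_TOKENS;
}

// Share of the estimated window (in characters) that the context occupies.
function estimateContextRatio(totalChars: number, windowTokens: number): number {
  return totalChars / (windowTokens * CHARS_PER_TOKEN);
}
```

With the default window, 200000 tokens correspond to about 800000 characters, so a context of 240000 characters gives a ratio of 0.3.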

## Modes
### adaptive
- If estimated context ratio ≥ `softTrimRatio`: soft-trim oversized tool results.
- If still ≥ `hardClearRatio` **and** prunable tool text ≥ `minPrunableToolChars`: hard-clear oldest eligible tool results.

### aggressive
- Always hard-clears eligible tool results before the cutoff.
- Ignores `hardClear.enabled` (always clears when eligible).
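
The two modes can be read as a small decision function. The sketch below is an interpretation under stated assumptions, not the actual implementation: `planPruning`, `PruneSettings`, and `PrunePlan` are hypothetical names, "still" is taken to mean the ratio re-estimated after soft-trimming, and adaptive mode is assumed to honor `hardClear.enabled` (the source only says that aggressive mode ignores it).

```typescript
type Mode = "adaptive" | "aggressive";

interface PruneSettings {
  softTrimRatio: number;
  hardClearRatio: number;
  minPrunableToolChars: number;
  hardClearEnabled: boolean; // stands in for hardClear.enabled
}

// Documented defaults.
const PRUNE_DEFAULTS: PruneSettings = {
  softTrimRatio: 0.3,
  hardClearRatio: 0.5,
  minPrunableToolChars: 50_000,
  hardClearEnabled: true,
};

interface PrunePlan {
  softTrim: boolean;
  hardClear: boolean;
}

function planPruning(
  mode: Mode,
  ratioBefore: number,        // estimated context ratio before any pruning
  ratioAfterSoftTrim: number, // ratio re-estimated after soft-trimming
  prunableToolChars: number,  // total chars in eligible tool results
  s: PruneSettings = PRUNE_DEFAULTS,
): PrunePlan {
  // Aggressive: always hard-clear, no ratio checks, hardClear.enabled ignored.
  if (mode === "aggressive") return { softTrim: false, hardClear: true };

  // Adaptive: nothing happens below the soft-trim threshold.
  if (ratioBefore < s.softTrimRatio) return { softTrim: false, hardClear: false };

  // Assumption: adaptive mode respects hardClear.enabled.
  const hardClear =
    s.hardClearEnabled &&
    ratioAfterSoftTrim >= s.hardClearRatio &&
    prunableToolChars >= s.minPrunableToolChars;
  return { softTrim: true, hardClear };
}
```

For example, with the defaults a context at ratio 0.8 that only drops to 0.6 after soft-trimming and still holds 100000 prunable characters would be both soft-trimmed and hard-cleared.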

## Soft vs hard pruning
- **Soft-trim**: only for oversized tool results.
  - Keeps head + tail, inserts `...`, and appends a note with the original size.
  - Skips results with image blocks.
- **Hard-clear**: replaces the entire tool result with `hardClear.placeholder`.
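
A minimal sketch of the two transformations, using the documented `softTrim` defaults. The function names are hypothetical, and the wording of the appended note is made up for illustration, since the source does not spell out the exact note text.

```typescript
interface SoftTrimSettings {
  maxChars: number;  // results longer than this count as oversized
  headChars: number; // characters kept from the start
  tailChars: number; // characters kept from the end
}

// Documented defaults.
const SOFT_TRIM_DEFAULTS: SoftTrimSettings = { maxChars: 4000, headChars: 1500, tailChars: 1500 };

// Soft-trim: keep head + tail, insert "..." and append a size note.
function softTrimText(text: string, s: SoftTrimSettings = SOFT_TRIM_DEFAULTS): string {
  if (text.length <= s.maxChars) return text; // not oversized, leave unchanged
  const head = text.slice(0, s.headChars);
  // slice(-0) would return the whole string, so guard the zero case.
  const tail = s.tailChars > 0 ? text.slice(-s.tailChars) : "";
  // Hypothetical note wording; only "a note with the original size" is documented.
  const note = `[Tool result trimmed: original size ${text.length} chars]`;
  return `${head}\n...\n${tail}\n\n${note}`;
}

// Hard-clear: the whole result is replaced by the placeholder.
function hardClearText(placeholder: string = "[Old tool result content cleared]"): string {
  return placeholder;
}
```

With the defaults, a 5000-character result shrinks to its first and last 1500 characters plus the marker and note, while a 3000-character result passes through unchanged.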

## Tool selection
- `tools.allow` / `tools.deny` support `*` wildcards.
- Deny wins when a tool matches both lists.
- An empty allow list means all tools are allowed.
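
The selection rules can be sketched as below. This is an illustration under assumptions: `wildcardToRegExp` and `isToolPrunable` are hypothetical names, `*` is treated as "any run of characters", and matching is assumed to be case-sensitive, which the source does not state.

```typescript
// Convert a pattern with "*" wildcards into an anchored regular expression.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*"); // each "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

function matchesAny(name: string, patterns: string[]): boolean {
  return patterns.some((p) => wildcardToRegExp(p).test(name));
}

// Deny wins; an empty allow list allows every tool.
function isToolPrunable(name: string, allow: string[] = [], deny: string[] = []): boolean {
  if (matchesAny(name, deny)) return false;
  if (allow.length === 0) return true;
  return matchesAny(name, allow);
}
```

Under these rules, `allow: ["bash", "read"]` with `deny: ["*image*"]` makes `bash` prunable, while a tool named `browser` is not prunable because it is absent from the allow list.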

## Interaction with other limits
- Built-in tools already truncate their own output; session pruning is an extra layer that prevents long-running chats from accumulating too much tool output in the model context.
- Compaction is separate: compaction summarizes and persists, while pruning is transient and applies per request.

## Defaults (when enabled)
- `keepLastAssistants`: `3`
- `softTrimRatio`: `0.3`
- `hardClearRatio`: `0.5`
- `minPrunableToolChars`: `50000`
- `softTrim`: `{ maxChars: 4000, headChars: 1500, tailChars: 1500 }`
- `hardClear`: `{ enabled: true, placeholder: "[Old tool result content cleared]" }`

## Examples
Minimal (adaptive):
```json5
{
  agent: {
    contextPruning: { mode: "adaptive" }
  }
}
```

Aggressive:
```json5
{
  agent: {
    contextPruning: { mode: "aggressive" }
  }
}
```

Restrict pruning to specific tools:
```json5
{
  agent: {
    contextPruning: {
      mode: "adaptive",
      tools: { allow: ["bash", "read"], deny: ["*image*"] }
    }
  }
}
```

See config reference: [Gateway Configuration](/gateway/configuration)
@@ -21,6 +21,10 @@ All session state is **owned by the gateway** (the “master” Clawdbot). UI cl
- Group entries may include `displayName`, `provider`, `subject`, `room`, and `space` to label sessions in UIs.
- Clawdbot does **not** read legacy Pi/Tau session folders.

## Session pruning (optional)
Clawdbot can trim **old tool results** from the in-memory context right before LLM calls (opt-in).
This does **not** rewrite JSONL history. See [/concepts/session-pruning](/concepts/session-pruning).

## Mapping transports → session keys
- Direct chats collapse to the per-agent primary key: `agent:<agentId>:<mainKey>`.
- Multiple phone numbers and providers can map to the same agent main key; they act as transports into one conversation.
@@ -546,6 +546,7 @@
"concepts/agent-workspace",
"concepts/multi-agent",
"concepts/session",
"concepts/session-pruning",
"concepts/sessions",
"concepts/session-tool",
"concepts/presence",
@@ -813,6 +813,87 @@ If you configure the same alias name (case-insensitive) yourself, your value win
}
```

#### `agent.contextPruning` (opt-in tool-result pruning)

`agent.contextPruning` prunes **old tool results** from the in-memory context right before a request is sent to the LLM.
It does **not** modify the session history on disk (`*.jsonl` remains complete).

This is intended to reduce token usage for chatty agents that accumulate large tool outputs over time.

High level:
- Never touches user/assistant messages.
- Protects the last `keepLastAssistants` assistant messages (no tool results after that point are pruned).
- Modes:
  - `adaptive`: soft-trims oversized tool results (keep head/tail) when the estimated context ratio crosses `softTrimRatio`.
    Then hard-clears the oldest eligible tool results when the estimated context ratio crosses `hardClearRatio` **and**
    there’s enough prunable tool-result bulk (`minPrunableToolChars`).
  - `aggressive`: always replaces eligible tool results before the cutoff with the `hardClear.placeholder` (no ratio checks).

Soft vs hard pruning (what changes in the context sent to the LLM):
- **Soft-trim**: only for *oversized* tool results. Keeps the beginning + end and inserts `...` in the middle.
  - Before: `toolResult("…very long output…")`
  - After: `toolResult("HEAD…\n...\n…TAIL\n\n[Tool result trimmed: …]")`
- **Hard-clear**: replaces the entire tool result with the placeholder.
  - Before: `toolResult("…very long output…")`
  - After: `toolResult("[Old tool result content cleared]")`

Notes / current limitations:
- Tool results containing **image blocks are skipped** (never trimmed/cleared) for now.
- The estimated “context ratio” is based on **characters** (approximate), not exact tokens.
- If the session doesn’t contain at least `keepLastAssistants` assistant messages yet, pruning is skipped.
- In `aggressive` mode, `hardClear.enabled` is ignored (eligible tool results are always replaced with `hardClear.placeholder`).

Example (minimal):
```json5
{
  agent: {
    contextPruning: {
      mode: "adaptive"
    }
  }
}
```

Defaults (when `mode` is `"adaptive"` or `"aggressive"`):
- `keepLastAssistants`: `3`
- `softTrimRatio`: `0.3` (adaptive only)
- `hardClearRatio`: `0.5` (adaptive only)
- `minPrunableToolChars`: `50000` (adaptive only)
- `softTrim`: `{ maxChars: 4000, headChars: 1500, tailChars: 1500 }` (adaptive only)
- `hardClear`: `{ enabled: true, placeholder: "[Old tool result content cleared]" }`

Example (aggressive, minimal):
```json5
{
  agent: {
    contextPruning: {
      mode: "aggressive"
    }
  }
}
```

Example (adaptive tuned):
```json5
{
  agent: {
    contextPruning: {
      mode: "adaptive",
      keepLastAssistants: 3,
      softTrimRatio: 0.3,
      hardClearRatio: 0.5,
      minPrunableToolChars: 50000,
      softTrim: { maxChars: 4000, headChars: 1500, tailChars: 1500 },
      hardClear: { enabled: true, placeholder: "[Old tool result content cleared]" },
      // Optional: restrict pruning to specific tools (deny wins; supports "*" wildcards)
      tools: { deny: ["browser", "canvas"] },
    }
  }
}
```

See [/concepts/session-pruning](/concepts/session-pruning) for behavior details.

Block streaming:
- `agent.blockStreamingDefault`: `"on"`/`"off"` (default on).
- `agent.blockStreamingBreak`: `"text_end"` or `"message_end"` (default: text_end).
@@ -38,6 +38,7 @@ Use these hubs to discover every page, including deep dives and reference docs t
- [Multi-agent routing](https://docs.clawd.bot/concepts/multi-agent)
- [Sessions](https://docs.clawd.bot/concepts/session)
- [Sessions (alias)](https://docs.clawd.bot/concepts/sessions)
- [Session pruning](https://docs.clawd.bot/concepts/session-pruning)
- [Session tools](https://docs.clawd.bot/concepts/session-tool)
- [Queue](https://docs.clawd.bot/concepts/queue)
- [Slash commands](https://docs.clawd.bot/tools/slash-commands)