docs: clarify cache-ttl pruning window
@@ -13,6 +13,7 @@ Session pruning trims **old tool results** from the in-memory context right befo
- Only affects the messages sent to the model for that request.
- Only active for Anthropic API calls (and OpenRouter Anthropic models).
- For best results, match `ttl` to your model `cacheControlTtl`.
- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.

## What can be pruned
- Only `toolResult` messages.
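
The prune rule described above can be sketched in code. This is a minimal illustration, not the actual implementation: the `Message` shape and function name are hypothetical, and the exact semantics of which tool results count as "old" are assumed (here: any `toolResult` older than the last `keepLastAssistants` assistant messages). The TTL-based activation check is omitted for brevity.

```typescript
// Hypothetical message shape; only `toolResult` and `keepLastAssistants`
// are names taken from the docs above.
type Message = { role: "user" | "assistant" | "toolResult"; text: string };

// Drop toolResult messages that appear before the last
// `keepLastAssistants` assistant messages; keep everything else.
function pruneToolResults(messages: Message[], keepLastAssistants: number): Message[] {
  // Collect indices of assistant messages.
  const assistantIdxs = messages
    .map((m, i) => (m.role === "assistant" ? i : -1))
    .filter((i) => i >= 0);
  // Per the docs: if there aren't enough assistant messages yet, skip pruning.
  if (assistantIdxs.length < keepLastAssistants) return messages;
  // Cutoff = index of the Nth-from-last assistant message.
  const cutoff = assistantIdxs[assistantIdxs.length - keepLastAssistants];
  return messages.filter((m, i) => i >= cutoff || m.role !== "toolResult");
}
```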
@@ -1597,6 +1597,7 @@ Notes / current limitations:
- The estimated “context ratio” is based on **characters** (approximate), not exact tokens.
- If the session doesn’t contain at least `keepLastAssistants` assistant messages yet, pruning is skipped.
- `cache-ttl` only activates for Anthropic API calls (and OpenRouter Anthropic models).
- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
- For best results, match `contextPruning.ttl` to the model `cacheControlTtl` you set in `agents.defaults.models.*.params`.
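
To make the TTL-matching advice concrete, here is a hedged config sketch. The `contextPruning.ttl` and `agents.defaults.models.*.params.cacheControlTtl` paths come from this page; the `mode` field name, the placeholder model key, and the `5m` value are assumptions for illustration only.

```jsonc
{
  "contextPruning": {
    "mode": "cache-ttl",  // assumed field name for selecting this strategy
    "ttl": "5m"           // keep equal to cacheControlTtl below
  },
  "agents": {
    "defaults": {
      "models": {
        "anthropic/<model>": {  // placeholder model key
          "params": { "cacheControlTtl": "5m" }
        }
      }
    }
  }
}
```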
Default (off):