# Memory
Clawdbot memory is plain Markdown in the agent workspace. The files are the source of truth; the model only "remembers" what gets written to disk.
## Memory files (Markdown)
The default workspace layout uses two memory layers:
- `memory/YYYY-MM-DD.md`: daily log (append-only). Read today + yesterday at session start.
- `MEMORY.md` (optional): curated long-term memory. Only loaded in the main, private session (never in group contexts).
These files live under the workspace (`agents.defaults.workspace`, default `~/clawd`). See Agent workspace for the full layout.
## When to write memory
- Decisions, preferences, and durable facts go to `MEMORY.md`.
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says "remember this," write it down (do not keep it in RAM); see the sketch after this list.
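A "remember this" request, for example, reduces to appending one line to today's log. A minimal sketch (the `rememberNote` helper and the note format are assumptions for illustration, not Clawdbot APIs):

```ts
import { appendFileSync, mkdirSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: persist a note to the daily log so it survives
// the session instead of living only in model context.
function rememberNote(workspace: string, note: string): void {
  const today = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  const dir = join(workspace, "memory");
  mkdirSync(dir, { recursive: true });
  appendFileSync(join(dir, `${today}.md`), `- ${note}\n`);
}

rememberNote(join(homedir(), "clawd"), "User prefers short replies in group chats.");
```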
## Automatic memory flush (pre-compaction ping)
When a session is close to auto-compaction, Clawdbot triggers a silent, agentic turn that reminds the model to write durable memory before the context is compacted. The default prompts explicitly say the model may reply, but usually `NO_REPLY` is the correct response, so the user never sees this turn.
This is controlled by `agents.defaults.compaction.memoryFlush`:
```json5
{
  agents: {
    defaults: {
      compaction: {
        reserveTokensFloor: 20000,
        memoryFlush: {
          enabled: true,
          softThresholdTokens: 4000,
          systemPrompt: "Session nearing compaction. Store durable memories now.",
          prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
        }
      }
    }
  }
}
```
Details:

- Soft threshold: the flush triggers when the session token estimate crosses `contextWindow - reserveTokensFloor - softThresholdTokens` (see the sketch after this list).
- Silent by default: the prompts include `NO_REPLY` so nothing is delivered.
- Two prompts: a user prompt plus a system-prompt append carry the reminder.
- One flush per compaction cycle (tracked in `sessions.json`).
- Workspace must be writable: if the session runs sandboxed with `workspaceAccess: "ro"` or `"none"`, the flush is skipped.
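To make the trigger arithmetic concrete, here is a minimal sketch of the threshold check, with assumed names (`shouldFlushMemory` and its arguments are illustrative, not Clawdbot internals):

```ts
// Illustrative only: the flush fires once per compaction cycle when the
// estimated token count crosses the soft threshold.
interface MemoryFlushConfig {
  reserveTokensFloor: number;  // e.g. 20000
  softThresholdTokens: number; // e.g. 4000
}

function shouldFlushMemory(
  estimatedTokens: number,
  contextWindow: number,
  cfg: MemoryFlushConfig,
  alreadyFlushedThisCycle: boolean,
): boolean {
  const threshold =
    contextWindow - cfg.reserveTokensFloor - cfg.softThresholdTokens;
  return !alreadyFlushedThisCycle && estimatedTokens >= threshold;
}

// With an assumed 200k context window and the defaults above,
// the flush triggers at 176,000 estimated tokens.
shouldFlushMemory(
  176_500,
  200_000,
  { reserveTokensFloor: 20_000, softThresholdTokens: 4_000 },
  false,
); // -> true
```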
For the full compaction lifecycle, see Session management + compaction.
## Vector memory search
Clawdbot can build a small vector index over `MEMORY.md` and `memory/*.md` so semantic queries can find related notes even when wording differs.
Defaults:

- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings (OpenAI) unless configured for local.
- Local mode uses `node-llama-cpp` and may require `pnpm approve-builds`.
- Uses `sqlite-vec` (when available) to accelerate vector search inside SQLite.
Remote embeddings require an API key for the embedding provider. By default this is OpenAI (`OPENAI_API_KEY` or `models.providers.openai.apiKey`). Codex OAuth only covers chat/completions and does not satisfy embeddings for memory search. When using a custom OpenAI-compatible endpoint, set `memorySearch.remote.apiKey` (and optionally `memorySearch.remote.headers`).
If you want to use a custom OpenAI-compatible endpoint (such as Gemini, OpenRouter, or a proxy), use the `remote` configuration:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://generativelanguage.googleapis.com/v1beta/openai/",
        apiKey: "YOUR_GEMINI_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```
If you don't want to set an API key, use `memorySearch.provider = "local"` or set `memorySearch.fallback = "none"`.
Batch indexing (OpenAI only):

- Enabled by default for OpenAI embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- The default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs are submitted in parallel (default: 2).
- Batch mode currently applies only when `memorySearch.provider = "openai"` and uses your OpenAI API key.
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details.
Config example:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      fallback: "openai",
      remote: {
        batch: { enabled: true, concurrency: 2 }
      },
      sync: { watch: true }
    }
  }
}
```
Tools:

- `memory_search`: returns snippets with file + line ranges.
- `memory_get`: reads memory file content by path.
Local mode:

- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
## How the memory tools work
- `memory_search` semantically searches Markdown chunks (~400-token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped at ~700 chars), file path, line range, score, provider/model, and whether we fell back from local to remote embeddings. No full file payload is returned (see the type sketch after this list).
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
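Read as a type, one search hit carries roughly the following fields (an illustrative TypeScript rendering of the description above, not the actual tool schema):

```ts
// Illustrative shape of a memory_search hit; field names are assumptions.
interface MemorySearchSnippet {
  snippet: string;           // chunk text, capped at ~700 chars
  path: string;              // workspace-relative file path
  startLine: number;         // start of the chunk's line range
  endLine: number;           // end of the chunk's line range
  score: number;             // similarity score
  provider: string;          // embedding provider used ("openai" or "local")
  model: string;             // embedding model used
  fellBackToRemote: boolean; // true if local -> remote fallback occurred
}
```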
## What gets indexed (and when)
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.clawdbot/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports an `{agentId}` token).
- Freshness: a watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync runs on session start, on first search when dirty, and optionally on an interval.
- Reindex triggers: the index stores the embedding provider/model, endpoint fingerprint, and chunking params. If any of those change, Clawdbot automatically resets and reindexes the entire store (see the sketch below).
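The reindex trigger behaves like a fingerprint comparison: hash the indexing parameters and rebuild the store when the stored hash no longer matches. A minimal sketch with assumed field names (the real fingerprint format is an implementation detail):

```ts
import { createHash } from "node:crypto";

// Illustrative indexing parameters; names are assumptions.
interface IndexParams {
  provider: string;      // e.g. "openai"
  model: string;         // e.g. "text-embedding-3-small"
  endpoint: string;      // embedding endpoint, e.g. a custom baseUrl
  chunkTokens: number;   // e.g. 400
  overlapTokens: number; // e.g. 80
}

function fingerprint(p: IndexParams): string {
  return createHash("sha256").update(JSON.stringify(p)).digest("hex");
}

// A mismatch (provider/model/endpoint/chunking changed) forces a full reindex.
function needsFullReindex(stored: string | undefined, current: IndexParams): boolean {
  return stored !== fingerprint(current);
}
```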
## Embedding cache
Clawdbot can cache chunk embeddings in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.
Config:
```json5
agents: {
  defaults: {
    memorySearch: {
      cache: {
        enabled: true,
        maxEntries: 50000
      }
    }
  }
}
```
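The idea is a content-addressed lookup: hash each chunk's text and only call the embedding API on a miss. A minimal sketch (the in-memory `Map` stands in for the SQLite table, and `maxEntries` eviction is omitted):

```ts
import { createHash } from "node:crypto";

// Illustrative cache keyed by a hash of the chunk text.
const cache = new Map<string, number[]>();

async function embedWithCache(
  text: string,
  embed: (t: string) => Promise<number[]>, // any embedding call
): Promise<number[]> {
  const key = createHash("sha256").update(text).digest("hex");
  const hit = cache.get(key);
  if (hit) return hit; // unchanged text: skip the embedding call
  const vector = await embed(text);
  cache.set(key, vector);
  return vector;
}
```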
## Session memory search (experimental)
You can optionally index session transcripts and surface them via `memory_search`. This is gated behind an experimental flag.
```json5
agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}
```
Notes:
- Session indexing is opt-in (off by default).
- Session updates are debounced and indexed lazily on the next `memory_search` (or a manual `clawdbot memory index`).
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agent's session logs are indexed).
- Session logs live on disk (`~/.clawdbot/agents/<agentId>/sessions/*.jsonl`). Any process or user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
## SQLite vector acceleration (sqlite-vec)
When the `sqlite-vec` extension is available, Clawdbot stores embeddings in a SQLite virtual table (`vec0`) and performs vector distance queries in the database. This keeps search fast without loading every embedding into JS.
Configuration (optional):
```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: true,
          extensionPath: "/path/to/sqlite-vec"
        }
      }
    }
  }
}
```
Notes:

- `enabled` defaults to true; when disabled, search falls back to in-process cosine similarity over stored embeddings (see the sketch below).
- If the sqlite-vec extension is missing or fails to load, Clawdbot logs the error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds or non-standard install locations).
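The JS fallback amounts to a brute-force cosine-similarity scan over stored vectors; a minimal sketch of that idea (not Clawdbot's actual implementation):

```ts
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

// Rank stored chunks against a query embedding and keep the best k.
function topK(
  query: number[],
  rows: { id: string; vector: number[] }[],
  k = 5,
) {
  return rows
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```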
## Local embedding auto-download
- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing, it auto-downloads to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason (see the sketch after this list).
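The fallback can be pictured as a try-local-then-remote resolver; a sketch under assumed names (`resolveEmbedder` and the loader callbacks are illustrative, not Clawdbot internals):

```ts
type Embedder = (texts: string[]) => Promise<number[][]>;

// Illustrative flow: prefer local embeddings; fall back to remote only
// when configured, recording why local setup failed.
async function resolveEmbedder(
  loadLocal: () => Promise<Embedder>,
  loadRemote: () => Promise<Embedder>,
  fallback: "openai" | "none",
): Promise<{ embedder: Embedder; fellBack: boolean; reason?: string }> {
  try {
    return { embedder: await loadLocal(), fellBack: false };
  } catch (err) {
    if (fallback === "none") throw err; // no remote fallback allowed
    const reason = err instanceof Error ? err.message : String(err);
    return { embedder: await loadRemote(), fellBack: true, reason };
  }
}
```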
## Custom OpenAI-compatible endpoint example
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: {
          "X-Organization": "org-id",
          "X-Project": "project-id"
        }
      }
    }
  }
}
```
Notes:

- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.