---
summary: How Clawdbot memory works (workspace files + automatic memory flush)
read_when:
  - You want the memory file layout and workflow
  - You want to tune the automatic pre-compaction memory flush
---

# Memory

Clawdbot memory is plain Markdown in the agent workspace. The files are the source of truth; the model only "remembers" what gets written to disk.

## Memory files (Markdown)

The default workspace layout uses two memory layers:

- `memory/YYYY-MM-DD.md`
  - Daily log (append-only).
  - Read today + yesterday at session start.
- `MEMORY.md` (optional)
  - Curated long-term memory.
  - Only load in the main, private session (never in group contexts).

These files live under the workspace (`agents.defaults.workspace`, default `~/clawd`). See Agent workspace for the full layout.
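
Because memory is just files in the workspace, relocating the workspace relocates memory with it. A minimal sketch (showing the default):

```json5
agents: {
  defaults: {
    workspace: "~/clawd"   // memory/ and MEMORY.md live under this directory
  }
}
```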

## When to write memory

- Decisions, preferences, and durable facts go to `MEMORY.md`.
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says "remember this," write it down (do not keep it in RAM).

## Automatic memory flush (pre-compaction ping)

When a session is close to auto-compaction, Clawdbot triggers a silent, agentic turn that reminds the model to write durable memory before the context is compacted. The default prompts explicitly say the model may reply, but usually `NO_REPLY` is the correct response, so the user never sees this turn.

This is controlled by `agents.defaults.compaction.memoryFlush`:

```json5
{
  agents: {
    defaults: {
      compaction: {
        reserveTokensFloor: 20000,
        memoryFlush: {
          enabled: true,
          softThresholdTokens: 4000,
          systemPrompt: "Session nearing compaction. Store durable memories now.",
          prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
        }
      }
    }
  }
}
```

Details:

- Soft threshold: the flush triggers when the session token estimate crosses `contextWindow - reserveTokensFloor - softThresholdTokens`. For example, with a 200k-token context window and the defaults above, that is 200000 - 20000 - 4000 = 176000 tokens.
- Silent by default: the prompts include `NO_REPLY`, so nothing is delivered to the user.
- Two prompts: a user prompt (`prompt`) plus a system-prompt append (`systemPrompt`) deliver the reminder.
- One flush per compaction cycle (tracked in `sessions.json`).
- Workspace must be writable: if the session runs sandboxed with `workspaceAccess: "ro"` or `"none"`, the flush is skipped.
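
To disable the flush entirely, for example (a minimal sketch; other `compaction` settings keep their defaults):

```json5
agents: {
  defaults: {
    compaction: {
      memoryFlush: {
        enabled: false   // never inject the pre-compaction reminder turn
      }
    }
  }
}
```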

For the full compaction lifecycle, see Session management + compaction.

## Memory search

Clawdbot can build a small vector index over `MEMORY.md` and `memory/*.md` so semantic queries can find related notes even when the wording differs.

Defaults:

- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings (OpenAI) unless configured for local mode.
- Local mode uses `node-llama-cpp` and may require `pnpm approve-builds`.
- Uses `sqlite-vec` (when available) to accelerate vector search inside SQLite.

Remote embeddings require an API key for the embedding provider. By default this is OpenAI (`OPENAI_API_KEY` or `models.providers.openai.apiKey`). Codex OAuth only covers chat/completions and does not satisfy embeddings for memory search. When using a custom OpenAI-compatible endpoint, set `memorySearch.remote.apiKey` (and, optionally, `memorySearch.remote.headers`).

If you want to use a custom OpenAI-compatible endpoint (like Gemini, OpenRouter, or a proxy), you can use the `remote` configuration:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://generativelanguage.googleapis.com/v1beta/openai/",
        apiKey: "YOUR_GEMINI_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```

If you don't want to set an API key, use `memorySearch.provider = "local"` or set `memorySearch.fallback = "none"`.
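
A minimal sketch of that no-API-key setup (`local.modelPath` is optional because a default model can be auto-downloaded; see Local embedding auto-download below):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",   // embed with node-llama-cpp instead of a remote API
      fallback: "none"     // never fall back to remote embeddings
    }
  }
}
```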

Config example:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      fallback: "openai",
      sync: { watch: true }
    }
  }
}
```

Tools:

- `memory_search` — returns snippets with file + line ranges.
- `memory_get` — reads memory file content by path.
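
A hypothetical request/result sketch (the exact tool-call schema is not documented here; the result fields mirror those described under "How the memory tools work" below, but the argument names are illustrative):

```json5
// memory_search arguments — hypothetical shape:
{ query: "preferred database setup" }

// A single result snippet (fields per the tool description below):
{
  path: "memory/2026-01-17.md",      // file the snippet came from
  lines: "12-18",                    // line range within that file
  score: 0.82,                       // similarity score
  provider: "openai",                // embedding provider/model used
  model: "text-embedding-3-small",
  snippet: "Prefers Postgres for new services; staging creds rotated monthly."
}

// memory_get arguments — hypothetical shape: read 20 lines starting at line 12
{ path: "memory/2026-01-17.md", startLine: 12, lineCount: 20 }
```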

Local mode:

- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF file or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
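
Putting those three settings together (a sketch; the `modelPath` shown is the default model noted under Local embedding auto-download below):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",
      local: {
        // GGUF file or hf: URI; auto-downloaded if missing
        modelPath: "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf"
      },
      fallback: "none"   // optional: skip the remote fallback entirely
    }
  }
}
```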

## How the memory tools work

- `memory_search` semantically searches Markdown chunks (~400-token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped at ~700 chars), file path, line range, score, provider/model, and whether we fell back from local to remote embeddings. No full file payload is returned.
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves to true for the agent.

## What gets indexed (and when)

- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.clawdbot/state/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, which supports an `{agentId}` token).
- Freshness: a watcher on `MEMORY.md` + `memory/` marks the index dirty (1.5 s debounce). Sync runs on session start, on the first search while the index is dirty, and optionally on an interval. A reindex triggers when the embedding provider/model or chunk sizes change.
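
To move the index, override the store path (a sketch; the value shown mirrors the default location, and `{agentId}` expands per agent as noted above):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        path: "~/.clawdbot/state/memory/{agentId}.sqlite"
      }
    }
  }
}
```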

## SQLite vector acceleration (`sqlite-vec`)

When the `sqlite-vec` extension is available, Clawdbot stores embeddings in a SQLite virtual table (`vec0`) and performs vector distance queries in the database. This keeps search fast without loading every embedding into JS.

Configuration (optional):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: true,
          extensionPath: "/path/to/sqlite-vec"
        }
      }
    }
  }
}
```

Notes:

- `enabled` defaults to true; when disabled, search falls back to in-process cosine similarity over stored embeddings.
- If the `sqlite-vec` extension is missing or fails to load, Clawdbot logs the error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled `sqlite-vec` path (useful for custom builds or non-standard install locations).
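
For example, to force the in-process fallback deliberately (say, while debugging a `sqlite-vec` build), disable the vector table:

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: false   // use in-process cosine similarity instead of vec0
        }
      }
    }
  }
}
```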

## Local embedding auto-download

- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing, it auto-downloads to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then run `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
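
To control where the download lands, set the cache directory (a sketch; the directory value is illustrative):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",
      local: {
        modelCacheDir: "~/.clawdbot/models"   // illustrative path; any writable directory works
      }
    }
  }
}
```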

## Custom OpenAI-compatible endpoint example

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: {
          "X-Organization": "org-id",
          "X-Project": "project-id"
        }
      }
    }
  }
}
```

Notes:

- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with the OpenAI headers; `remote` wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.