clawdbot/docs/refactor/vector-memory.md

---
summary: "Vector memory search design plan (per-agent, watch/lazy sync, storage)"
read_when:
  - Designing or implementing vector memory search
  - Adding embedding providers or sync behavior
---

# Vector Memory Search — Design Plan

Goal: semantic search over **agent memory files** only, with minimal dependencies and
sensible UX defaults. Enabled by default, with per-agent overrides.

## Scope
- Sources: `MEMORY.md` + `memory/YYYY-MM-DD.md` inside the agent workspace.
- No indexing outside the workspace. No hidden paths.
- No QMD-style query expansion or rerank in v1.

## Config Shape
Location: `agents.defaults.memorySearch` + `agents.list[].memorySearch`.

```json5
agents: {
  defaults: {
    memorySearch: {
      enabled: true,
      provider: "openai", // "openai" | "local"
      fallback: "openai", // "openai" | "none"
      model: "text-embedding-3-small",
      store: {
        driver: "sqlite",
        path: "~/.clawdbot/memory/{agentId}.sqlite"
      },
      chunking: {
        tokens: 400,
        overlap: 80
      },
      sync: {
        onSessionStart: true,
        onSearch: true,        // LazySync
        watch: true,           // default on
        watchDebounceMs: 1500,
        intervalMinutes: 0
      },
      query: {
        maxResults: 6,
        minScore: 0.35
      }
    }
  },
  list: [
    { id: "peter", memorySearch: { provider: "local", sync: { watch: false } } }
  ]
}
```
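A minimal sketch of how per-agent overrides could be resolved against the defaults. It assumes a deep merge (so `sync: { watch: false }` keeps the other `sync` keys) and simple `{agentId}` substitution in the store path; neither is specified above. The names `MemorySearchConfig`, `deepMerge`, and `resolveMemorySearch` are hypothetical.

```typescript
// Hypothetical types; field names mirror the config shape above.
interface MemorySearchConfig {
  enabled: boolean;
  provider: "openai" | "local";
  fallback: "openai" | "none";
  model: string;
  store: { driver: "sqlite"; path: string };
  chunking: { tokens: number; overlap: number };
  sync: {
    onSessionStart: boolean;
    onSearch: boolean;
    watch: boolean;
    watchDebounceMs: number;
    intervalMinutes: number;
  };
  query: { maxResults: number; minScore: number };
}

type DeepPartial<T> = {
  [K in keyof T]?: T[K] extends object ? DeepPartial<T[K]> : T[K];
};

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Recursively overlay `override` on `base` without mutating either.
// Assumption: nested objects merge key by key; scalars and arrays are replaced.
function deepMerge<T>(base: T, override: DeepPartial<T> | undefined): T {
  if (!isPlainObject(base) || !isPlainObject(override)) {
    return (override === undefined ? base : override) as unknown as T;
  }
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(override)) {
    if (value === undefined) continue;
    out[key] = deepMerge((base as Record<string, unknown>)[key], value as never);
  }
  return out as T;
}

// Merge defaults with one agent's override and fill in the {agentId} placeholder.
function resolveMemorySearch(
  defaults: MemorySearchConfig,
  override: DeepPartial<MemorySearchConfig> | undefined,
  agentId: string,
): MemorySearchConfig {
  const merged = deepMerge(defaults, override);
  return {
    ...merged,
    store: { ...merged.store, path: merged.store.path.replace("{agentId}", agentId) },
  };
}
```

With the example config, agent `peter` would resolve to `provider: "local"` and `sync.watch: false` while inheriting every other default. `~` expansion is left to the caller in this sketch.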

## Storage
Per-agent DB (default): `~/.clawdbot/memory/{agentId}.sqlite`.

Tables (v1):
- `files(path PRIMARY KEY, hash, mtime, size)`
- `chunks(id PRIMARY KEY, path, start_line, end_line, hash, text, embedding, updated_at)`

Notes:
- `files.hash` = content hash of the whole file; `chunks.hash` = content hash of the chunk text.
- `embedding` is stored as a float array. A SQLite vec extension is optional; without it,
  store the vector as JSON and do a linear scan in memory (fine for small corpora).
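A possible rendering of the v1 tables for the JSON (no vec extension) path, plus helpers to encode and decode the `embedding` column. The column types, `NOT NULL` constraints, and the names `SCHEMA_V1`, `encodeEmbedding`, and `decodeEmbedding` are assumptions for illustration; only the table and column names come from the plan.

```typescript
// DDL sketch. Column types are assumed, not specified by the plan.
const SCHEMA_V1 = `
CREATE TABLE IF NOT EXISTS files (
  path  TEXT PRIMARY KEY,
  hash  TEXT NOT NULL,
  mtime INTEGER NOT NULL,
  size  INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
  id         TEXT PRIMARY KEY,
  path       TEXT NOT NULL,
  start_line INTEGER NOT NULL,
  end_line   INTEGER NOT NULL,
  hash       TEXT NOT NULL,
  text       TEXT NOT NULL,
  embedding  TEXT NOT NULL,  -- JSON array of floats when no vec extension is used
  updated_at INTEGER NOT NULL
);`;

// Serialize a vector for storage in the TEXT `embedding` column.
function encodeEmbedding(vec: number[]): string {
  return JSON.stringify(vec);
}

// Parse a stored vector and reject anything that is not a numeric array.
function decodeEmbedding(raw: string): number[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed) || !parsed.every((x) => typeof x === "number")) {
    throw new Error("embedding column does not contain a numeric array");
  }
  return parsed as number[];
}
```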

## Embedding Providers
Interface (core):
- `embedQuery(text): number[]`
- `embedBatch(texts[]): number[][]`

Providers:
- `openai` (default): OpenAI embeddings via existing keys.
- `local` (optional): node-llama-cpp (GGUF).
- Fallback: when `provider: "local"` fails, fall back to OpenAI unless `fallback: "none"`.
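The interface and fallback rule could look roughly like this. The sketch assumes the methods are async (the plan lists plain return types) and that "fails" means the call throws; `withFallback` is a hypothetical helper name.

```typescript
// Core provider interface, assumed async since real providers do I/O.
interface EmbeddingProvider {
  embedQuery(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Wrap a primary provider so that a thrown error retries on the fallback.
// Pass `null` as the fallback to model `fallback: "none"`.
function withFallback(
  primary: EmbeddingProvider,
  fallback: EmbeddingProvider | null,
): EmbeddingProvider {
  async function attempt<T>(run: (p: EmbeddingProvider) => Promise<T>): Promise<T> {
    try {
      return await run(primary);
    } catch (err) {
      if (!fallback) throw err; // no fallback configured: surface the original error
      return run(fallback);
    }
  }
  return {
    embedQuery: (text) => attempt((p) => p.embedQuery(text)),
    embedBatch: (texts) => attempt((p) => p.embedBatch(texts)),
  };
}
```

One point the plan leaves open: vectors from different models are generally not comparable, so an index built with one provider would likely need re-embedding if a later sync falls back to the other.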

## Index Pipeline
1) Resolve memory file list (workspace only).
2) Read file, compute file hash/mtime.
3) Chunk by headings + token cap (overlap).
4) Embed only changed chunks (hash compare).
5) Upsert `chunks` rows, prune deleted files.
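Steps 4 and 5 amount to a diff between stored chunk hashes and freshly computed ones. A sketch under stated assumptions: chunk ids are derived from path plus line range (the plan does not define `id`), and the stored state is available as an id-to-hash map. `planIndexUpdate` and `chunkId` are hypothetical names.

```typescript
interface ChunkInput {
  path: string;
  startLine: number;
  endLine: number;
  hash: string; // content hash of the chunk text
  text: string;
}

interface IndexPlan {
  toEmbed: ChunkInput[]; // new or changed chunks that need an embedding call
  toDelete: string[];    // ids of stored rows with no current counterpart
}

// Assumed id scheme: "<path>:<startLine>-<endLine>".
function chunkId(c: { path: string; startLine: number; endLine: number }): string {
  return `${c.path}:${c.startLine}-${c.endLine}`;
}

// Compare stored hashes with the current chunk set.
function planIndexUpdate(stored: Map<string, string>, current: ChunkInput[]): IndexPlan {
  const seen = new Set<string>();
  const toEmbed: ChunkInput[] = [];
  for (const chunk of current) {
    const id = chunkId(chunk);
    seen.add(id);
    // Missing id or different hash means the chunk is new or edited.
    if (stored.get(id) !== chunk.hash) toEmbed.push(chunk);
  }
  // Rows for removed chunks, including every chunk of a deleted file.
  const toDelete = Array.from(stored.keys()).filter((id) => !seen.has(id));
  return { toEmbed, toDelete };
}
```

Unchanged chunks appear in neither list, so they cost no embedding calls. With line-range ids, an insertion near the top of a file shifts every later chunk and forces re-embedding; keying on content hash instead would avoid that at the cost of a less readable id.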

Chunking:
- Prefer heading-aware splits.
- Max tokens + overlap; keep line ranges for snippets.
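A rough chunker illustrating the two rules above. It assumes tokens can be approximated by whitespace-separated words (a real tokenizer would differ), splits only on line boundaries, and applies overlap only at token-cap splits, not at headings. `chunkMarkdown` and `countTokens` are hypothetical names.

```typescript
interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  text: string;
}

// Assumption: word count stands in for the model's token count.
function countTokens(line: string): number {
  return line.split(/\s+/).filter(Boolean).length;
}

function chunkMarkdown(source: string, maxTokens = 400, overlapTokens = 80): Chunk[] {
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  let start = 0;  // index of the first line in the current chunk
  let tokens = 0; // tokens accumulated in the current chunk

  // Emit lines[start, endExclusive) as a chunk, skipping blank-only spans.
  const flush = (endExclusive: number): void => {
    if (endExclusive <= start) return;
    const text = lines.slice(start, endExclusive).join("\n");
    if (text.trim() !== "") chunks.push({ startLine: start + 1, endLine: endExclusive, text });
  };

  for (let i = 0; i < lines.length; i++) {
    const lineTokens = countTokens(lines[i]);
    const isHeading = /^#{1,6}\s/.test(lines[i]);
    if (isHeading && i > start) {
      // Heading boundary: close the chunk and start fresh, no overlap.
      flush(i);
      start = i;
      tokens = 0;
    } else if (tokens + lineTokens > maxTokens && i > start) {
      // Token cap: close the chunk, then walk back to carry trailing lines
      // worth at most `overlapTokens` into the next chunk.
      flush(i);
      let back = i;
      let carried = 0;
      while (back > start + 1 && carried + countTokens(lines[back - 1]) <= overlapTokens) {
        back--;
        carried += countTokens(lines[back]);
      }
      start = back;
      tokens = carried;
    }
    tokens += lineTokens;
  }
  flush(lines.length);
  return chunks;
}
```

A single line longer than the cap is kept whole in this sketch, so a chunk can exceed `maxTokens`; splitting inside a line would complicate the line ranges used for snippets.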

## Sync Strategy
Default: **watch + lazy + session-start**
- `watch`: chokidar on `MEMORY.md` + `memory/**/*.md` (debounced).
- `onSearch`: if dirty, sync before search (LazySync).
- `onSessionStart`: warm index once per session.
- `intervalMinutes`: optional for long-lived sessions.
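The `watch` and `onSearch` rules can share one dirty flag: file events mark the index dirty and schedule a debounced sync, and a search syncs first if the flag is still set. A sketch with an injectable scheduler so the debounce is testable without real timers; `createSyncTracker` and `Scheduler` are hypothetical, and `runSync` is synchronous here for brevity where a real one would likely be async.

```typescript
// Timer abstraction so tests can substitute a fake clock.
interface Scheduler {
  set(fn: () => void, ms: number): unknown; // returns an opaque timer handle
  clear(handle: unknown): void;
}

function createSyncTracker(runSync: () => void, debounceMs: number, scheduler: Scheduler) {
  let dirty = true; // nothing indexed yet, so the first search or session start syncs
  let pending: { handle: unknown } | null = null;

  // Run a sync only when something changed. Returns true if a sync ran.
  // Call before each search when `sync.onSearch` is enabled.
  const syncIfDirty = (): boolean => {
    if (!dirty) return false;
    dirty = false;
    runSync();
    return true;
  };

  // Call from the file watcher on every change event.
  const onFileChange = (): void => {
    dirty = true;
    // Restart the debounce window so a burst of events yields one sync.
    if (pending) scheduler.clear(pending.handle);
    pending = {
      handle: scheduler.set(() => {
        pending = null;
        syncIfDirty();
      }, debounceMs),
    };
  };

  return { onFileChange, syncIfDirty };
}
```

In production the scheduler would wrap `setTimeout` and `clearTimeout`, with `debounceMs` taken from `sync.watchDebounceMs`. If a search arrives inside the debounce window it syncs immediately, and the later timer callback becomes a no-op.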

If workspace access is read-only or missing: disable writes; return “not indexed”.

## Query Flow
1) Embed query.
2) Cosine similarity over all chunk embeddings.
3) Return top K with `{path, startLine, endLine, snippet, score}`.
4) Model may call `memory_get` when full context is needed.
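Steps 2 and 3 as a linear scan, assuming embeddings are already loaded as plain number arrays. The 200-character snippet length and the names `cosine`, `searchChunks`, and `StoredChunk` are illustrative choices, not part of the plan; the default limits mirror `query.maxResults` and `query.minScore`.

```typescript
interface StoredChunk {
  path: string;
  startLine: number;
  endLine: number;
  text: string;
  embedding: number[];
}

interface SearchHit {
  path: string;
  startLine: number;
  endLine: number;
  snippet: string;
  score: number;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk, drop those under minScore, return the best maxResults.
function searchChunks(
  queryVec: number[],
  chunks: StoredChunk[],
  maxResults = 6,
  minScore = 0.35,
): SearchHit[] {
  return chunks
    .map((c) => ({
      path: c.path,
      startLine: c.startLine,
      endLine: c.endLine,
      snippet: c.text.slice(0, 200), // assumed snippet length
      score: cosine(queryVec, c.embedding),
    }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults);
}
```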

Optional v2: add FTS5 + RRF merge (FTS + vector) for quality.
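For reference, reciprocal rank fusion scores each item as the sum of `1 / (k + rank)` over the rankings it appears in. A sketch of that merge, assuming each ranking is a list of chunk ids ordered best first and using the commonly cited constant `k = 60`; `rrfMerge` is a hypothetical name.

```typescript
// Merge several best-first rankings into one, using reciprocal rank fusion.
function rrfMerge(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top item contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

RRF uses only ranks, so FTS5 scores and cosine scores would not need to be normalized against each other.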

## Tool + CLI
Tools:
- `memory_search { query, maxResults?, minScore? }`
- `memory_get { path, from?, lines? }`
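The `from` and `lines` parameters of `memory_get` are not defined above. One plausible reading, shown here as an assumption, is that `from` is a 1-based starting line and `lines` is a line count, with both optional. `sliceLines` is a hypothetical helper operating on already-loaded file content.

```typescript
// Return `lines` lines starting at 1-based line `from`.
// Omitting `from` starts at line 1; omitting `lines` reads to the end.
function sliceLines(content: string, from = 1, lines?: number): string {
  const all = content.split("\n");
  const start = Math.max(0, Math.floor(from) - 1);
  const end =
    lines === undefined
      ? all.length
      : Math.min(all.length, start + Math.max(0, Math.floor(lines)));
  return all.slice(start, end).join("\n");
}
```

Under this reading, a search hit with `startLine: 12, endLine: 20` maps to `memory_get { path, from: 12, lines: 9 }`.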

CLI (optional):
- `clawdbot memory index|search|status`

## Security + Permissions
- Indexer reads only memory files in workspace.
- No scanning outside workspace; no “sneak” reads.
- Respect sandbox `workspaceAccess` (ro = read-only; none = disabled).
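These rules can be enforced with a strict allowlist on the workspace-relative path rather than a denylist. A sketch, assuming paths are already relative to the workspace root and use forward slashes, and assuming `workspaceAccess` takes the values `"rw"`, `"ro"`, and `"none"` (only the last two appear above). `isMemoryPath` and `canReadMemory` are hypothetical names.

```typescript
// Only MEMORY.md and memory/YYYY-MM-DD.md, per the Scope section.
const MEMORY_FILE = /^(MEMORY\.md|memory\/\d{4}-\d{2}-\d{2}\.md)$/;

// True only for the two allowed shapes. Traversal ("..") segments, absolute
// paths, and hidden paths fail simply because they do not match the pattern.
function isMemoryPath(relPath: string): boolean {
  return MEMORY_FILE.test(relPath);
}

type WorkspaceAccess = "rw" | "ro" | "none"; // "rw" is an assumed value

// Reads are allowed for memory files unless the sandbox disables access.
function canReadMemory(access: WorkspaceAccess, relPath: string): boolean {
  return access !== "none" && isMemoryPath(relPath);
}
```

A string check alone does not cover symlinks; resolving the real path and confirming it is still inside the workspace before reading would be a sensible extra step.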

## Tests
- Chunking boundaries + line ranges.
- Hash-based incremental updates.
- Search ranking (cosine).
- Watcher debounce (fake fs).

## Rollout
- Default enabled; if no memory files, index is empty (silent).
- No migration needed.