feat(memory): add gemini embeddings + auto select providers

Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
Peter Steinberger
2026-01-18 15:29:16 +00:00
parent 7252938339
commit be7191879a
11 changed files with 536 additions and 352 deletions


Semantic queries can find related notes even when wording differs.
Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Clawdbot auto-selects (see the sketch after this list):
  1. `local` if a `memorySearch.local.modelPath` is configured and the file exists.
  2. `openai` if an OpenAI key can be resolved.
  3. `gemini` if a Gemini key can be resolved.
  4. Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
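For clarity, a minimal sketch that pins the provider instead of relying on auto-selection (the `modelPath` value is a placeholder, not a bundled model):

```json5
agents: {
  defaults: {
    memorySearch: {
      // Pin the provider explicitly; auto-selection is skipped.
      provider: "local",
      local: {
        // Placeholder path; point this at a real local embedding model file.
        modelPath: "~/models/embedding-model.gguf"
      }
    }
  }
}
```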
Remote embeddings **require** an API key for the embedding provider. Clawdbot
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth only covers chat/completions and does **not** satisfy
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
`models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
### Gemini embeddings (native)
Set the provider to `gemini` to use the Gemini embeddings API directly:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```
Notes:
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
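Putting those notes together, a sketch of routing Gemini embeddings through a custom endpoint (the URL and header below are placeholders):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        // Placeholder; omit baseUrl to use the default Gemini API base URL.
        baseUrl: "https://gemini-proxy.example.com/",
        apiKey: "YOUR_GEMINI_API_KEY",
        // Optional extra headers, e.g. for proxy auth.
        headers: { "X-Proxy-Token": "value" }
      }
    }
  }
}
```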
If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```
If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.
Fallbacks:
- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
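For example, a sketch of a primary-plus-fallback setup (assumes an OpenAI key is resolvable and a local model is configured):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      // Only used if OpenAI embedding requests fail.
      fallback: "local"
    }
  }
}
```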
Batch indexing (OpenAI + Gemini):
- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
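A sketch of the batch knobs described above; the values are illustrative, not recommendations:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      remote: {
        batch: {
          enabled: true,         // set to false to disable batch indexing
          wait: true,            // wait for batch completion before finishing indexing
          pollIntervalMs: 30000, // illustrative: how often to poll job status
          timeoutMinutes: 60,    // illustrative: give up waiting after this long
          concurrency: 2         // parallel batch jobs (default: 2)
        }
      }
    }
  }
}
```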
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.