feat(memory): add gemini embeddings + auto select providers
Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>

Memory files are indexed with embeddings so semantic queries can find related notes even when wording differs.

Defaults:

- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Clawdbot auto-selects:
  1. `local` if a `memorySearch.local.modelPath` is configured and the file exists (see the example after this list).
  2. `openai` if an OpenAI key can be resolved.
  3. `gemini` if a Gemini key can be resolved.
  4. Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
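
For example, pinning auto-select to rule 1 above with an explicit local setup might look like this (a minimal sketch; the `modelPath` value is a placeholder, not a shipped model):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",
      local: {
        // placeholder: point at a real GGUF embedding model on disk
        modelPath: "/path/to/embedding-model.gguf"
      }
    }
  }
}
```
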
Remote embeddings **require** an API key for the embedding provider. Clawdbot
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth only covers chat/completions and does **not** satisfy
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
`models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
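
If you'd rather resolve the Gemini key from provider config than from `memorySearch.remote.apiKey`, a minimal sketch using the `models.providers.google.apiKey` path mentioned above:

```json5
models: {
  providers: {
    google: {
      // used for Gemini embeddings; GEMINI_API_KEY in the environment also works
      apiKey: "YOUR_GEMINI_API_KEY"
    }
  }
}
```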

### Gemini embeddings (native)

Set the provider to `gemini` to use the Gemini embeddings API directly:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```

Notes:

- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.

If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```

If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.

Fallbacks:

- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails (see the sketch after this list).

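As an illustration, a config that prefers Gemini embeddings but falls back to local if Gemini calls fail (a sketch; any of the values above are valid):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      // consulted only when the primary provider fails
      fallback: "local"
    }
  }
}
```
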
Batch indexing (OpenAI + Gemini):

- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed (see the sketch after this list).
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.

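As a sketch, the batch knobs named above might be tuned like this (values are illustrative, not recommendations):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      remote: {
        batch: {
          enabled: true,        // default for OpenAI and Gemini
          wait: true,           // wait for batch jobs to complete
          pollIntervalMs: 5000, // illustrative poll interval
          timeoutMinutes: 60,   // illustrative timeout
          concurrency: 2        // default number of parallel batch jobs
        }
      }
    }
  }
}
```
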
Why OpenAI batch is fast + cheap:

- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.