feat(memory): add gemini embeddings + auto select providers

Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
Peter Steinberger
2026-01-18 15:29:16 +00:00
parent 7252938339
commit be7191879a
11 changed files with 536 additions and 352 deletions


Semantic queries can find related notes even when wording differs.
Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Clawdbot auto-selects (see the sketch after this list):
  1. `local` if a `memorySearch.local.modelPath` is configured and the file exists.
  2. `openai` if an OpenAI key can be resolved.
  3. `gemini` if a Gemini key can be resolved.
  4. Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
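For clarity, a minimal sketch that pins the provider instead of relying on auto-selection (the `modelPath` value is a placeholder, not a bundled model):

```json5
agents: {
  defaults: {
    memorySearch: {
      // Pin the provider explicitly; auto-selection is skipped.
      provider: "local",
      local: {
        // Placeholder path; point this at a real local embedding model file.
        modelPath: "~/models/embedding-model.gguf"
      }
    }
  }
}
```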
Remote embeddings **require** an API key for the embedding provider. Clawdbot
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth only covers chat/completions and does **not** satisfy
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
`models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
### Gemini embeddings (native)
Set the provider to `gemini` to use the Gemini embeddings API directly:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```
Notes:
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
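Putting those notes together, a sketch of routing Gemini embeddings through a custom endpoint (the URL and header below are placeholders):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        // Placeholder; omit baseUrl to use the default Gemini API base URL.
        baseUrl: "https://gemini-proxy.example.com/",
        apiKey: "YOUR_GEMINI_API_KEY",
        // Optional extra headers, e.g. for proxy auth.
        headers: { "X-Proxy-Token": "value" }
      }
    }
  }
}
```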
If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```
If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.
Fallbacks:
- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
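For example, a sketch of a primary-plus-fallback setup (assumes an OpenAI key is resolvable and a local model is configured):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      // Only used if OpenAI embedding requests fail.
      fallback: "local"
    }
  }
}
```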
Batch indexing (OpenAI + Gemini):
- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
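A sketch of the batch knobs described above; the values are illustrative, not recommendations:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      remote: {
        batch: {
          enabled: true,         // set to false to disable batch indexing
          wait: true,            // wait for batch completion before finishing indexing
          pollIntervalMs: 30000, // illustrative: how often to poll job status
          timeoutMinutes: 60,    // illustrative: give up waiting after this long
          concurrency: 2         // parallel batch jobs (default: 2)
        }
      }
    }
  }
}
```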
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.