feat: add image model config + tool

Peter Steinberger
2026-01-04 19:35:00 +01:00
parent 0716a624a8
commit 78998dba9e
20 changed files with 856 additions and 144 deletions


@@ -431,6 +431,8 @@ Controls the embedded agent runtime (model/thinking/verbose/timeouts).
(omit to show the full catalog).
`modelAliases` adds short names for `/model` (alias -> provider/model).
`modelFallbacks` lists ordered fallback models to try when the default fails.
`imageModel` selects an image-capable model for the `image` tool.
`imageModelFallbacks` lists ordered fallback image models for the `image` tool.
```json5
{
@@ -448,6 +450,10 @@ Controls the embedded agent runtime (model/thinking/verbose/timeouts).
"openrouter/deepseek/deepseek-r1:free",
"openrouter/meta-llama/llama-3.3-70b-instruct:free"
],
imageModel: "openrouter/qwen/qwen-2.5-vl-72b-instruct:free",
imageModelFallbacks: [
"openrouter/google/gemini-2.0-flash-vision:free"
],
thinkingDefault: "low",
verboseDefault: "off",
elevatedDefault: "on",


@@ -19,16 +19,22 @@ that prefers tool-call + image-capable models and maintains ordered fallbacks.
- show default model + aliases + fallbacks + allowlist
- `clawdbot models set <modelOrAlias>`
- writes `agent.model` in config
- `clawdbot models set-image <modelOrAlias>`
- writes `agent.imageModel` in config
- `clawdbot models aliases list|add|remove`
- writes `agent.modelAliases`
- `clawdbot models fallbacks list|add|remove|clear`
- writes `agent.modelFallbacks`
- `clawdbot models image-fallbacks list|add|remove|clear`
- writes `agent.imageModelFallbacks`
- `clawdbot models scan`
- Scans OpenRouter `:free` models; probes tool-call and image support; interactive selection
## Config changes
- Add `agent.modelFallbacks: string[]` (ordered list of provider/model IDs).
- Add `agent.imageModel?: string` (optional image-capable model for image tool).
- Add `agent.imageModelFallbacks?: string[]` (ordered list for image tool).
- Keep existing:
- `agent.model` (default)
- `agent.allowedModels` (list filter)
@@ -49,8 +55,8 @@ Probes (direct pi-ai complete)
- Prompt includes 1x1 PNG; success if no "unsupported image" error.
Scoring/selection
- Prefer models passing tool + image.
- Fallback to tool-only if no tool+image pass.
- Prefer models passing tool + image for text/tool fallbacks.
- Prefer image-only models for image tool fallback (even if tool probe fails).
- Rank by: image ok, then lower tool latency, then larger context, then params.
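The ranking rule above can be sketched as a comparator. This is only an illustration: the `ScanResult` shape, its field names, and the two `pick*` helpers are hypothetical and not taken from the commit, and "then params" is assumed here to mean that larger parameter counts rank first.

```typescript
// Hypothetical probe result; field names are assumptions for illustration.
interface ScanResult {
  id: string;                   // provider/model ID
  toolOk: boolean;              // tool-call probe passed
  imageOk: boolean;             // image probe passed
  toolLatencyMs: number | null; // null when the tool probe failed
  contextLength: number;
  paramsB: number;              // parameter count in billions
}

// Order: image ok, then lower tool latency, then larger context, then params.
function compareResults(a: ScanResult, b: ScanResult): number {
  if (a.imageOk !== b.imageOk) return a.imageOk ? -1 : 1;
  // A missing latency sorts last.
  const la = a.toolLatencyMs ?? Number.POSITIVE_INFINITY;
  const lb = b.toolLatencyMs ?? Number.POSITIVE_INFINITY;
  if (la !== lb) return la < lb ? -1 : 1;
  if (a.contextLength !== b.contextLength) return b.contextLength - a.contextLength;
  return b.paramsB - a.paramsB; // assumption: larger models first
}

// Text/tool fallbacks: tool-capable models, with image-capable ones ranked first.
function pickTextFallbacks(results: ScanResult[]): string[] {
  return results.filter((r) => r.toolOk).sort(compareResults).map((r) => r.id);
}

// Image tool fallbacks: any image-capable model, even if the tool probe failed.
function pickImageFallbacks(results: ScanResult[]): string[] {
  return results.filter((r) => r.imageOk).sort(compareResults).map((r) => r.id);
}
```

Keeping the two lists separate lets an image-only model serve the `image` tool without ever being offered as a text/tool fallback.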
Interactive selection (TTY)
@@ -61,7 +67,9 @@ Interactive selection (TTY)
Output
- Writes `agent.modelFallbacks` ordered.
- Writes `agent.imageModelFallbacks` ordered (image-capable models).
- Optional `--set-default` to set `agent.model`.
- Optional `--set-image` to set `agent.imageModel`.
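As a rough sketch of the output step, the function below merges ranked scan results into the `agent` config. The function name, the options object, and the choice to promote the top-ranked entry for `--set-default` / `--set-image` are assumptions for illustration; the actual command may take the value from the interactive selection instead.

```typescript
// Subset of the agent config touched by the scan; keys match the docs above.
interface AgentConfig {
  model?: string;
  modelFallbacks?: string[];
  imageModel?: string;
  imageModelFallbacks?: string[];
}

// Hypothetical in-code equivalents of --set-default and --set-image.
interface ScanOutputOptions {
  setDefault?: boolean;
  setImage?: boolean;
}

// Returns a new config; the input object is not mutated.
function applyScanOutput(
  config: AgentConfig,
  textRanked: string[],
  imageRanked: string[],
  opts: ScanOutputOptions = {},
): AgentConfig {
  const next: AgentConfig = {
    ...config,
    modelFallbacks: [...textRanked],
    imageModelFallbacks: [...imageRanked],
  };
  // Assumption: the best-ranked entry becomes the default when requested.
  if (opts.setDefault && textRanked.length > 0) next.model = textRanked[0];
  if (opts.setImage && imageRanked.length > 0) next.imageModel = imageRanked[0];
  return next;
}
```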
## Runtime fallback


@@ -101,6 +101,19 @@ Notes:
- Videos return `FILE:<path>` (mp4).
- Location returns a JSON payload (lat/lon/accuracy/timestamp).
### `image`
Analyze an image with the configured image model.
Core parameters:
- `image` (required; path or URL)
- `prompt` (optional; defaults to "Describe the image.")
- `model` (optional override)
- `maxBytesMb` (optional size cap, in megabytes)
Notes:
- Only available when `agent.imageModel` or `agent.imageModelFallbacks` is set.
- Uses the image model directly (independent of the main chat model).
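The availability rule and model choice described in these notes might look roughly like the sketch below. The helper names are hypothetical, and the ordering (per-call `model` override first, then `agent.imageModel`, then `agent.imageModelFallbacks`) is an assumption; the commit does not state whether an override still falls through to the configured fallbacks.

```typescript
// Image-related keys of the agent config, as documented above.
interface AgentImageConfig {
  imageModel?: string;
  imageModelFallbacks?: string[];
}

// Default prompt as stated in the parameter list.
const DEFAULT_IMAGE_PROMPT = "Describe the image.";

// The tool is exposed only when an image model or a fallback list is configured.
function isImageToolAvailable(agent: AgentImageConfig): boolean {
  return Boolean(agent.imageModel) || (agent.imageModelFallbacks?.length ?? 0) > 0;
}

// Ordered, de-duplicated candidates to try for a single call (assumed order).
function imageModelCandidates(agent: AgentImageConfig, override?: string): string[] {
  const ordered = [override, agent.imageModel, ...(agent.imageModelFallbacks ?? [])];
  const seen = new Set<string>();
  const out: string[] = [];
  for (const id of ordered) {
    if (!id || seen.has(id)) continue; // skip unset and repeated entries
    seen.add(id);
    out.push(id);
  }
  return out;
}
```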
### `cron`
Manage Gateway cron jobs and wakeups.