diff --git a/CHANGELOG.md b/CHANGELOG.md
index 61188eba6..bd5732ae3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,7 +11,8 @@
 ### Changes
 - CLI/Onboarding: simplify MiniMax auth choice to a single M2.1 option.
 - CLI: configure section selection now loops until Continue.
-- Docs: explain MiniMax vs MiniMax Lightning (speed vs cost).
+- Docs: explain MiniMax vs MiniMax Lightning (speed vs cost) and restore LM Studio example.
+- Docs: add Cerebras GLM 4.6/4.7 config example (OpenAI-compatible endpoint).
 - Onboarding/CLI: group model/auth choice by provider and label Z.AI as GLM 4.7.
 - Auto-reply: add compact `/model` picker (models + available providers) and show provider endpoints in `/model status`.
 - Plugins: add extension loader (tools/RPC/CLI/services), discovery paths, and config schema + Control UI labels (uiHints).
diff --git a/docs/concepts/model-providers.md b/docs/concepts/model-providers.md
index 18979e972..8ae6722ea 100644
--- a/docs/concepts/model-providers.md
+++ b/docs/concepts/model-providers.md
@@ -100,6 +100,8 @@
 Clawdbot ships with the pi‑ai catalog. These providers require **no**
 - xAI: `xai` (`XAI_API_KEY`)
 - Groq: `groq` (`GROQ_API_KEY`)
 - Cerebras: `cerebras` (`CEREBRAS_API_KEY`)
+  - GLM models on Cerebras use ids `zai-glm-4.7` and `zai-glm-4.6`.
+  - OpenAI-compatible base URL: `https://api.cerebras.ai/v1`.
 - Mistral: `mistral` (`MISTRAL_API_KEY`)
 - GitHub Copilot: `github-copilot` (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN`)
diff --git a/docs/gateway/configuration.md b/docs/gateway/configuration.md
index a270b5655..f346e17af 100644
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -1850,6 +1850,46 @@
 Notes:
 - Available model: `MiniMax-M2.1` (default).
 - Update pricing in `models.json` if you need exact cost tracking.
+
+### Cerebras (GLM 4.6 / 4.7)
+
+Use Cerebras via their OpenAI-compatible endpoint:
+
+```json5
+{
+  env: { CEREBRAS_API_KEY: "sk-..." },
+  agents: {
+    defaults: {
+      model: {
+        primary: "cerebras/zai-glm-4.7",
+        fallbacks: ["cerebras/zai-glm-4.6"]
+      },
+      models: {
+        "cerebras/zai-glm-4.7": { alias: "GLM 4.7 (Cerebras)" },
+        "cerebras/zai-glm-4.6": { alias: "GLM 4.6 (Cerebras)" }
+      }
+    }
+  },
+  models: {
+    mode: "merge",
+    providers: {
+      cerebras: {
+        baseUrl: "https://api.cerebras.ai/v1",
+        apiKey: "${CEREBRAS_API_KEY}",
+        api: "openai-completions",
+        models: [
+          { id: "zai-glm-4.7", name: "GLM 4.7 (Cerebras)" },
+          { id: "zai-glm-4.6", name: "GLM 4.6 (Cerebras)" }
+        ]
+      }
+    }
+  }
+}
+```
+
+Notes:
+- Use `cerebras/zai-glm-4.7` for Cerebras; use `zai/glm-4.7` for Z.AI direct.
+- Set `CEREBRAS_API_KEY` in the environment or config.
 
 Notes:
 - Supported APIs: `openai-completions`, `openai-responses`, `anthropic-messages`, `google-generative-ai`
diff --git a/docs/providers/minimax.md b/docs/providers/minimax.md
index 1e0896521..2a8a293d9 100644
--- a/docs/providers/minimax.md
+++ b/docs/providers/minimax.md
@@ -27,7 +27,7 @@ MiniMax highlights these improvements in M2.1:
 
 ## MiniMax M2.1 vs MiniMax M2.1 Lightning
 
-- **Speed:** MiniMax docs list ~60 tps output for M2.1 and ~100 tps for Lightning.
+- **Speed:** Lightning is the “fast” variant in MiniMax’s pricing docs.
 - **Cost:** Pricing shows the same input cost, but Lightning has higher output cost.
 
 ## Choose a setup
@@ -69,6 +69,46 @@ Configure via CLI:
 }
 ```
 
+### Optional: Local via LM Studio (manual)
+
+**Best for:** local inference with LM Studio.
+We have seen strong results with MiniMax M2.1 on powerful hardware (e.g. a
+desktop/server) using LM Studio's local server.
+
+Configure manually via `clawdbot.json`:
+
+```json5
+{
+  agents: {
+    defaults: {
+      model: { primary: "lmstudio/minimax-m2.1-gs32" },
+      models: { "lmstudio/minimax-m2.1-gs32": { alias: "Minimax" } }
+    }
+  },
+  models: {
+    mode: "merge",
+    providers: {
+      lmstudio: {
+        baseUrl: "http://127.0.0.1:1234/v1",
+        apiKey: "lmstudio",
+        api: "openai-responses",
+        models: [
+          {
+            id: "minimax-m2.1-gs32",
+            name: "MiniMax M2.1 GS32",
+            reasoning: false,
+            input: ["text"],
+            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
+            contextWindow: 196608,
+            maxTokens: 8192
+          }
+        ]
+      }
+    }
+  }
+}
+```
+
 ## Configure via `clawdbot configure`
 
 Use the interactive config wizard to set MiniMax without editing JSON: