docs: refresh model notes and stacks

This commit is contained in:
Peter Steinberger
2026-01-08 08:50:04 +01:00
parent 44441dd5d8
commit ae6a476ffc

View File

@@ -31,33 +31,37 @@ Related:
## Model recommendations ## Model recommendations
Through testing, weve found [Claude Opus 4.5](https://www.anthropic.com/claude/opus) is the most useful general-purpose model for anything coding-related. We suggest [GPT-5.2-Codex](https://developers.openai.com/codex/models) for coding and sub-agents. For personal assistant work, nothing comes close to Opus. If youre going all-in on Claude, we recommend the [Claude Max $200 subscription](https://www.anthropic.com/pricing/). - [Claude Opus 4.5](https://www.anthropic.com/claude/opus): default primary for assistant + general work. Its pricey and cap-prone, so consider the [Claude Max $200 subscription](https://www.anthropic.com/pricing/) if you live here.
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): default fallback when Opus caps out. Similar behavior with fewer limit headaches.
- [GPT-5.2-Codex](https://developers.openai.com/codex/models): recommended for coding and sub-agents. Prefer the [Codex CLI](https://developers.openai.com/codex/cli) if you want the strongest feel.
Suggested stacks:
- Assistant-first: Opus 4.5 primary → Sonnet 4.5 fallback.
- Agentic coding: Opus 4.5 primary → GPT-5.2-Codex for sub-agents → Sonnet 4.5 fallback.
## Model discussions (community notes) ## Model discussions (community notes)
Anecdotal notes from the Discord thread on January 45, 2026. Treat as “what people reported,” not guarantees. Anecdotal notes from the Discord thread on January 45, 2026. Treat as “reported by users,” not a benchmark.
**Reported working well** **Reported working well**
- [Claude Opus 4.5](https://www.anthropic.com/claude/opus): best quality, but expensive and easy to hit limits. - [Claude Opus 4.5](https://www.anthropic.com/claude/opus): best overall quality in Clawdbot, especially for “assistant” work. Tradeoff is cost and hitting usage limits quickly.
- [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): solid fallback when Opus caps out. - [Claude Sonnet 4.5](https://www.anthropic.com/claude/sonnet): common fallback when Opus caps out. Similar behavior with fewer limit headaches.
- [GLM](https://www.zhipuai.cn/en/): used as a worker model under orchestration. - [Gemini 3 Pro](https://deepmind.google/en/models/gemini/pro/): some users felt it maps well to Clawdbots structure. Vibe was “fits the framework” more than “best at everything.”
- [MiniMax M2.1](https://platform.minimax.io/docs/guides/models-intro): “good enough” fallback for grunt tasks. - [GLM](https://www.zhipuai.cn/en/): used successfully as a worker model under orchestration. Seen as strong for delegated/secondary tasks, not the primary brain.
- “Temu-Sonnet” (community shorthand) for MiniMax quality vs Claude Sonnet. - [MiniMax M2.1](https://platform.minimax.io/docs/guides/models-intro): “good enough” for grunt work or a cheap fallback. Community nickname was “Temu-Sonnet,” i.e. usable but not Sonnet-level polish.
- [Gemini 3 Pro](https://deepmind.google/en/models/gemini/pro/): some users said it maps Clawdbot structure well.
**Mixed / unclear** **Mixed / unclear**
- [Antigravity](https://blog.google/technology/ai/google-ai-updates-november-2025/) (Claude Opus access): some reported extra Opus quota, pricing/limits unclear. - [Antigravity](https://blog.google/technology/ai/google-ai-updates-november-2025/) (Claude Opus access): some reported extra Opus quota. Pricing/limits were unclear, so the value is hard to predict.
**Reported weak in Clawdbot** **Reported weak in Clawdbot**
- [GPT-5.2-Codex](https://developers.openai.com/codex/models) inside Clawdbot: considered rough for conversation or assistant tasks. - [GPT-5.2-Codex](https://developers.openai.com/codex/models) inside Clawdbot: reported as rough for conversation/assistant tasks when embedded. Same notes said Codex felt stronger via the [Codex CLI](https://developers.openai.com/codex/cli) than embedded use.
- [Grok](https://docs.x.ai/docs/models/grok-4): tried, abandoned. - [Grok](https://docs.x.ai/docs/models/grok-4): people tried it and then abandoned it. No strong upside showed up in the notes.
**Tooling note**
- [Codex CLI](https://developers.openai.com/codex/cli) felt stronger than embedded use.
**Theme** **Theme**
- Token burn feels higher than expected in long sessions; people suspect context buildup + tool outputs. Pruning/compaction helps. Check session logs before blaming providers. See [/concepts/session](/concepts/session) and [/concepts/model-failover](/concepts/model-failover). - Token burn feels higher than expected in long sessions; people suspect context buildup + tool outputs. Pruning/compaction helps. Check session logs before blaming providers. See [/concepts/session](/concepts/session) and [/concepts/model-failover](/concepts/model-failover).
Want a tailored stack? Share whether youre using Clawdbot or Clawdis and your main workload (agentic coding vs “assistant” work), and we can suggest a primary + fallback set based on these reports.
## Models CLI ## Models CLI
See [/cli](/cli) for the full command tree and CLI flags. See [/cli](/cli) for the full command tree and CLI flags.