docs: harden local model guidance

Peter Steinberger
2026-01-12 17:10:56 +00:00
parent 05ac67c520
commit 1b2c1545a0
2 changed files with 4 additions and 4 deletions


@@ -7,9 +7,9 @@ read_when:
---
# Local models
-Local is doable, but Clawdbot expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency.
+Local is doable, but Clawdbot expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).
-## Recommended: LM Studio + MiniMax M2.1 (Responses API)
+## Recommended: LM Studio + MiniMax M2.1 (Responses API, full-size)
Best current local stack. Load MiniMax M2.1 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use Responses API to keep reasoning separate from final text.
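For illustration, a minimal TypeScript sketch of what a request against that local server can look like. The `/v1/responses` path, the `minimax-m2.1` model id, and the response parsing are assumptions, not Clawdbot internals; match them to what your LM Studio build exposes and what `/v1/models` reports (older builds may only offer `/v1/chat/completions`).

```ts
// Sketch only: talk to a local LM Studio server over an OpenAI-compatible
// Responses endpoint so reasoning items stay separate from the final text.
// Assumptions: Node 18+ (global fetch), LM Studio on the default port, and a
// model id matching what /v1/models reports on your machine.
const BASE_URL = "http://127.0.0.1:1234/v1";

async function askLocal(prompt: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/responses`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "minimax-m2.1", // hypothetical id; copy the exact one from /v1/models
      input: prompt,
      max_output_tokens: 1024,
    }),
  });
  if (!res.ok) throw new Error(`local server error: ${res.status}`);
  const data = await res.json();

  // Forward only "message" output items; drop "reasoning" items entirely.
  return (data.output ?? [])
    .filter((item: any) => item.type === "message")
    .flatMap((item: any) => item.content ?? [])
    .filter((part: any) => part.type === "output_text")
    .map((part: any) => part.text)
    .join("");
}

askLocal("Reply with one short sentence.").then(console.log);
```

Keeping reasoning and final text as separate output items is also why the checklist below pins WhatsApp to Responses API: only the final text is ever sent out.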
@@ -50,7 +50,7 @@ Best current local stack. Load MiniMax M2.1 in LM Studio, enable the local serve
**Setup checklist**
- Install LM Studio: https://lmstudio.ai
-- In LM Studio, download MiniMax M2.1, start the server, confirm `http://127.0.0.1:1234/v1/models` lists it.
+- In LM Studio, download the **largest MiniMax M2.1 build available** (avoid “small”/heavily quantized variants), start the server, confirm `http://127.0.0.1:1234/v1/models` lists it (a preflight sketch follows this checklist).
- Keep the model loaded; cold-load adds startup latency.
- Adjust `contextWindow`/`maxTokens` if your LM Studio build differs.
- For WhatsApp, stick to Responses API so only final text is sent.
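A small preflight sketch for the checklist above, assuming the default server address. It only verifies that the server answers and that some MiniMax build is listed; the substring match on the model id is an assumption, so adjust it to the exact id your `/v1/models` output shows.

```ts
// Sketch only: confirm the LM Studio server is reachable and a MiniMax M2.1
// build is loaded before pointing Clawdbot at it. Assumes Node 18+ (global
// fetch) and the default LM Studio port.
const BASE_URL = "http://127.0.0.1:1234/v1";

async function checkLocalServer(): Promise<void> {
  const res = await fetch(`${BASE_URL}/models`);
  if (!res.ok) throw new Error(`LM Studio server not reachable: ${res.status}`);
  const { data } = await res.json();
  const ids: string[] = (data ?? []).map((m: { id: string }) => m.id);
  console.log("models served:", ids);

  // Substring check is a stand-in; use the exact id /v1/models reports.
  if (!ids.some((id) => id.toLowerCase().includes("minimax"))) {
    console.warn("No MiniMax build listed; load the full-size M2.1 variant in LM Studio.");
  }
}

checkLocalServer();
```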


@@ -118,7 +118,7 @@ Clawdbot supports **OpenAI Code (Codex)** via OAuth or by reusing your Codex CLI
### Is a local model OK for casual chats?
-Usually no. Clawdbot needs large context + strong safety; small cards truncate. See [/gateway/local-models](/gateway/local-models) for hardware expectations and the LM Studio MiniMax M2.1 setup.
+Usually no. Clawdbot needs large context + strong safety; small cards truncate and leak. If you must, run the **largest** MiniMax M2.1 build you can run locally (LM Studio) and see [/gateway/local-models](/gateway/local-models). Smaller/quantized models increase prompt-injection risk (see [Security](/gateway/security)).
### Can I use Bun?