docs: add model allowlist + reasoning safety notes

Peter Steinberger
2026-01-09 02:07:33 +01:00
parent 9c33080f12
commit 14096fb629
4 changed files with 22 additions and 0 deletions


@@ -77,6 +77,13 @@ Even with strong system prompts, **prompt injection is not solved**. What helps
- Run sensitive tool execution in a sandbox; keep secrets out of the agent's reachable filesystem.
- **Model choice matters:** we recommend Anthropic Opus 4.5 because it's quite good at recognizing prompt injections (see [“A step forward on safety”](https://www.anthropic.com/news/claude-opus-4-5)). Using weaker models increases risk.
## Reasoning & verbose output in groups
`/reasoning` and `/verbose` can expose internal reasoning or tool output that
was not meant for a public channel. In group settings, treat them as **debug
only** and keep them off unless you explicitly need them. If you enable them,
do so only in trusted DMs or tightly controlled rooms.
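
The rule above could be enforced with a small gate before honoring either command. This is a minimal sketch, not the project's actual implementation: `ChatContext`, its fields, and `debugOutputAllowed` are hypothetical names introduced for illustration only.

```typescript
// Hypothetical shape of the chat a command arrived in (an assumption,
// not a real API): the operator marks DMs as trusted and rooms as
// tightly controlled through some out-of-band configuration.
interface ChatContext {
  kind: "dm" | "group";
  trusted: boolean;           // only meaningful for DMs
  tightlyControlled: boolean; // only meaningful for groups
}

// Returns true when /reasoning or /verbose output may be shown.
// Anything not explicitly trusted or controlled is denied by default.
function debugOutputAllowed(chat: ChatContext): boolean {
  if (chat.kind === "dm") {
    return chat.trusted;
  }
  return chat.tightlyControlled;
}

// Example: a public group denies debug output, a trusted DM allows it.
const publicGroup: ChatContext = { kind: "group", trusted: false, tightlyControlled: false };
const trustedDm: ChatContext = { kind: "dm", trusted: true, tightlyControlled: false };
console.log(debugOutputAllowed(publicGroup)); // false
console.log(debugOutputAllowed(trustedDm));   // true
```

Denying by default means a misconfigured or unknown chat never leaks reasoning or tool output; enabling it takes a deliberate opt-in.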
## Lessons Learned (The Hard Way)
### The `find ~` Incident 🦞