docs: reorganize documentation structure

This commit is contained in:
Peter Steinberger
2026-01-07 00:41:31 +01:00
parent b8db8502aa
commit db4d0b8e75
126 changed files with 881 additions and 270 deletions

View File

@@ -0,0 +1,34 @@
---
summary: "RPC protocol notes for onboarding wizard and config schema"
read_when: "Changing onboarding wizard steps or config schema endpoints"
---
# Onboarding + Config Protocol
Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI.
## Components
- Wizard engine: `src/wizard` (session + prompts + onboarding state).
- CLI: [`src/commands/onboard-*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/onboard-*.ts) uses the wizard with the CLI prompter.
- Gateway RPC: wizard + config schema endpoints serve UI clients.
- macOS: SwiftUI onboarding uses the wizard step model.
- Web UI: config form renders from JSON Schema + hints.
## Gateway RPC
- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }`
- `wizard.next` params: `{ sessionId, answer?: { stepId, value? } }`
- `wizard.cancel` params: `{ sessionId }`
- `wizard.status` params: `{ sessionId }`
- `config.schema` params: `{}`
Responses (shape)
- Wizard: `{ sessionId, done, step?, status?, error? }`
- Config schema: `{ schema, uiHints, version, generatedAt }`
## UI Hints
- `uiHints` keyed by path; optional metadata (label/help/group/order/advanced/sensitive/placeholder).
- Sensitive fields render as password inputs; no redaction layer.
- Unsupported schema nodes fall back to the raw JSON editor.
## Notes
- This doc is the single place to track protocol refactors for onboarding/config.

View File

@@ -0,0 +1,72 @@
---
summary: "Harden cron.add input handling, align schemas, and improve cron UI/agent tooling"
owner: "clawdbot"
status: "complete"
last_updated: "2026-01-05"
---
# Cron Add Hardening & Schema Alignment
## Context
Recent gateway logs show repeated `cron.add` failures with invalid parameters (missing `sessionTarget`, `wakeMode`, `payload`, and malformed `schedule`). This indicates that at least one client (likely the agent tool call path) is sending wrapped or partially specified job payloads. Separately, there is drift between cron provider enums in TypeScript, gateway schema, CLI flags, and UI form types, plus a UI mismatch for `cron.status` (expects `jobCount` while gateway returns `jobs`).
## Goals
- Stop `cron.add` INVALID_REQUEST spam by normalizing common wrapper payloads and inferring missing `kind` fields.
- Align cron provider lists across gateway schema, cron types, CLI docs, and UI forms.
- Make agent cron tool schema explicit so the LLM produces correct job payloads.
- Fix the Control UI cron status job count display.
- Add tests to cover normalization and tool behavior.
## Non-goals
- Change cron scheduling semantics or job execution behavior.
- Add new schedule kinds or cron expression parsing.
- Overhaul the UI/UX for cron beyond the necessary field fixes.
## Findings (current gaps)
- `CronPayloadSchema` in gateway excludes `signal` + `imessage`, while TS types include them.
- Control UI CronStatus expects `jobCount`, but gateway returns `jobs`.
- Agent cron tool schema allows arbitrary `job` objects, enabling malformed inputs.
- Gateway strictly validates `cron.add` with no normalization, so wrapped payloads fail.
## Proposed Approach
1. **Normalize** incoming `cron.add` payloads (unwrap `data`/`job`, infer `schedule.kind` and `payload.kind`, default `wakeMode` + `sessionTarget` when safe).
2. **Harden** the agent cron tool schema using the canonical gateway `CronAddParamsSchema` and normalize before sending to the gateway.
3. **Align** provider enums and cron status fields across gateway schema, TS types, CLI descriptions, and UI form controls.
4. **Test** normalization in gateway tests and tool behavior in agent tests.
## Multi-phase Execution Plan
### Phase 1 — Schema + type alignment
- [x] Expand gateway `CronPayloadSchema` provider enum to include `signal` and `imessage`.
- [x] Update CLI `--provider` descriptions to include `slack` (already supported by gateway).
- [x] Update UI Cron payload/provider union types to include all supported providers.
- [x] Fix UI CronStatus type to match gateway (`jobs` instead of `jobCount`).
- [x] Update cron UI provider select to include Discord/Slack/Signal/iMessage.
- [x] Update macOS CronJobEditor provider picker + enum to include Slack/Signal/iMessage.
- [x] Document cron compatibility normalization policy in [`docs/cron-jobs.md`](/cron-jobs).
### Phase 2 — Input normalization + tooling hardening
- [x] Add shared cron input normalization helpers (`normalizeCronJobCreate`/`normalizeCronJobPatch`).
- [x] Apply normalization in gateway `cron.add` (and patch normalization in `cron.update`).
- [x] Tighten agent cron tool schema to `CronAddParamsSchema` and normalize job/patch before sending.
### Phase 3 — Tests
- [x] Add gateway test covering wrapped `cron.add` payload normalization.
- [x] Add cron tool test to assert normalization and defaulting for `cron.add`.
- [x] Add gateway test covering `cron.update` normalization.
- [x] Add UI + Swift conformance test for cron channels + status fields.
### Phase 4 — Verification
- [x] Run tests (full suite executed via `pnpm test -- cron-tool`).
## Rollout/Monitoring
- Watch gateway logs for reduced `cron.add` INVALID_REQUEST errors.
- Confirm Control UI cron status shows job count after refresh.
- If errors persist, extend normalization for additional common shapes (e.g., `schedule.at`, `payload.message` without `kind`).
## Optional Follow-ups
- Manual Control UI smoke: add cron job per provider + verify status job count.
## Open Questions
- Should `cron.add` accept explicit `state` from clients (currently disallowed by schema)?
- Should we allow `webchat` as an explicit delivery provider (currently filtered in delivery resolution)?

View File

@@ -0,0 +1,126 @@
---
summary: "Spec: groupPolicy hardening for Telegram allowlist parity"
read_when:
- Reviewing historical Telegram allowlist normalization changes
---
# Engineering Execution Spec: groupPolicy Hardening (Telegram Allowlist Parity)
**Date**: 2026-01-05
**Status**: Complete
**PR**: #216 (feat/whatsapp-group-policy)
---
## Executive Summary
Follow-up hardening work ensures Telegram allowlists behave consistently across inbound group/DM filtering and outbound send normalization. The focus is on prefix parity (`telegram:` / `tg:`), case-insensitive matching for prefixes, and resilience to accidental whitespace in config entries. Documentation and tests were updated to reflect and lock in this behavior.
---
## Findings Analysis
### [MED] F1: Telegram Allowlist Prefix Handling Is Case-Sensitive and Excludes `tg:`
**Location**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts)
**Problem**: Inbound allowlist normalization only stripped a lowercase `telegram:` prefix. This rejected `TG:123` / `Telegram:123` and did not accept the `tg:` shorthand even though outbound send normalization already accepts `tg:` and case-insensitive prefixes.
**Impact**:
- DMs and group allowlists fail when users copy/paste prefixed IDs from logs or existing send format.
- Behavior is inconsistent between inbound filtering and outbound send normalization.
**Fix**: Normalize allowlist entries by trimming whitespace and stripping `telegram:` / `tg:` prefixes case-insensitively at pre-compute time.
---
### [LOW] F2: Allowlist Entries Are Not Trimmed
**Location**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts)
**Problem**: Allowlist entries are not trimmed; accidental whitespace causes mismatches.
**Fix**: Trim and drop empty entries while normalizing allowlist inputs.
---
## Implementation Phases
### Phase 1: Normalize Telegram Allowlist Inputs
**File**: [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts)
**Changes**:
1. Trim allowlist entries and drop empty values.
2. Strip `telegram:` / `tg:` prefixes case-insensitively.
3. Simplify DM allowlist check to rely on normalized values.
---
### Phase 2: Add Coverage for Prefix + Whitespace
**File**: [`src/telegram/bot.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.test.ts)
**Add Tests**:
- DM allowlist accepts `TG:` prefix with surrounding whitespace.
- Group allowlist accepts `TG:` prefix case-insensitively.
---
### Phase 3: Documentation Updates
**Files**:
- [`docs/groups.md`](/groups)
- [`docs/telegram.md`](/telegram)
**Changes**:
- Document `tg:` alias and case-insensitive prefixes for Telegram allowlists.
---
### Phase 4: Verification
1. Run targeted Telegram tests (`pnpm test -- src/telegram/bot.test.ts`).
2. If time allows, run full suite (`pnpm test`).
---
## Files Modified
| File | Change Type | Description |
|------|-------------|-------------|
| [`src/telegram/bot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.ts) | Fix | Trim allowlist values; strip `telegram:` / `tg:` prefixes case-insensitively |
| [`src/telegram/bot.test.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/telegram/bot.test.ts) | Test | Add DM + group allowlist coverage for `TG:` prefix + whitespace |
| [`docs/groups.md`](/groups) | Docs | Mention `tg:` alias + case-insensitive prefixes |
| [`docs/telegram.md`](/telegram) | Docs | Mention `tg:` alias + case-insensitive prefixes |
---
## Success Criteria
- [x] Telegram allowlist accepts `telegram:` / `tg:` prefixes case-insensitively.
- [x] Telegram allowlist tolerates whitespace in config entries.
- [x] DM and group allowlist tests cover prefixed cases.
- [x] Docs updated to reflect allowlist formats.
- [x] Targeted tests pass.
- [x] Full test suite passes.
---
## Risk Assessment
| Risk | Severity | Mitigation |
|------|----------|------------|
| Behavior change for malformed entries | Low | Normalization is additive and trims only whitespace |
| Test fragility | Low | Isolated unit tests; no external dependencies |
| Doc drift | Low | Updated docs alongside code |
---
## Estimated Complexity
- **Phase 1**: Low (normalization helpers)
- **Phase 2**: Low (2 new tests)
- **Phase 3**: Low (doc edits)
- **Phase 4**: Low (verification)
**Total**: ~20 minutes

View File

@@ -0,0 +1,147 @@
---
summary: "Proposal: model config, auth profiles, and fallback behavior"
read_when:
- Designing model selection, auth profiles, or fallback behavior
- Migrating model config schema
---
# Model config proposal
Goals
- Multi OAuth + multi API key per provider
- Model selection via `/model` with sensible fallback
- Global (not per-session) fallback logic
- Keep last-known-good auth profile when switching models
- Profile override only when explicitly requested
- Image routing override only when explicitly configured
Non-goals (v1)
- Auto-discovery of provider capabilities beyond catalog input tags
- Per-model auth profile order (see open questions)
## Proposed config shape
```json
{
"auth": {
"profiles": {
"anthropic:default": {
"provider": "anthropic",
"mode": "oauth"
},
"anthropic:work": {
"provider": "anthropic",
"mode": "api_key"
},
"openai:default": {
"provider": "openai",
"mode": "oauth"
}
},
"order": {
"anthropic": ["anthropic:default", "anthropic:work"],
"openai": ["openai:default"]
}
},
"agent": {
"models": {
"anthropic/claude-opus-4-5": {
"alias": "Opus"
},
"openai/gpt-5.2": {
"alias": "gpt52"
}
},
"model": {
"primary": "anthropic/claude-opus-4-5",
"fallbacks": ["openai/gpt-5.2"]
},
"imageModel": {
"primary": "openai/gpt-5.2",
"fallbacks": ["anthropic/claude-opus-4-5"]
}
}
}
```
Notes
- Canonical model keys are full `provider/model`.
- `alias` optional; used by `/model` resolution.
- `auth.profiles` is keyed. Default CLI login creates `provider:default`.
- `auth.order[provider]` controls rotation order for that provider.
## CLI / UX
Login
- `clawdbot login anthropic` → create/replace `anthropic:default`.
- `clawdbot login anthropic --profile work` → create/replace `anthropic:work`.
Model selection
- `/model Opus` → resolve alias to full id.
- `/model anthropic/claude-opus-4-5` → explicit.
- Optional: `/model Opus@anthropic:work` (explicit profile override for session only).
Model listing
- `/model` list shows:
- model id
- alias
- provider
- auth order (from `auth.order`)
- auth source for the current provider (auth-profiles.json/env/shell env/models.json)
## Fallback behavior (global)
Fallback list
- Use `agent.model.fallbacks` globally.
- No per-session fallback list; last-known-good is per-session but uses global ordering.
Auth profile rotation
- If provider auth error (401/403/invalid refresh):
- advance to next profile in `auth.order[provider]`.
- if all fail, fall back to next model.
Rate limiting
- If rate limit / quota error:
- rotate auth profile first (same provider)
- if still failing, fall back to next model.
Model not found / capability mismatch
- immediate model fallback.
## Image routing
Rule
- Only use `agent.imageModel` when explicitly configured.
- If `agent.imageModel` is configured and the current text model lacks image input, use it.
Support detection
- From model catalog `input` tags when available (e.g. `image` in models.json).
- If unknown: treat as text-only and use `agent.imageModel`.
## Migration (doctor + gateway auto-run)
Inputs
- Legacy keys (pre-migration):
- `agent.model` (string)
- `agent.modelFallbacks` (string[])
- `agent.imageModel` (string)
- `agent.imageModelFallbacks` (string[])
- `agent.allowedModels` (string[])
- `agent.modelAliases` (record)
Outputs
- `agent.models` map with keys for all referenced models
- `agent.model.primary/fallbacks`
- `agent.imageModel.primary/fallbacks`
- Auth profile store seeded from current auth-profiles.json/auth.json + oauth.json + env (as `provider:default`)
- `auth.order` seeded with `["provider:default"]` when config is updated
Auto-run
- Gateway start detects legacy keys and runs doctor migration.
## Decisions
- Auth order is per-provider (`auth.order`).
- Doctor migration is required; gateway will auto-run on startup when legacy keys detected.
- `/model Opus@profile` is explicit session override only.
- Image routing override only when `agent.imageModel` is explicitly configured.

View File

@@ -0,0 +1,227 @@
---
summary: "Proposal + research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)"
read_when:
- Designing workspace memory (~/clawd) beyond daily Markdown logs
- Deciding: standalone CLI vs deep Clawdbot integration
- Adding offline recall + reflection (retain/recall/reflect)
---
# Workspace Memory v2 (offline): proposal + research
Target: Clawd-style workspace (`agent.workspace`, default `~/clawd`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).
This doc proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) via a derived index.
## Why change?
The current setup (one file per day) is excellent for:
- “append-only” journaling
- human editing
- git-backed durability + auditability
- low-friction capture (“just write it down”)
Its weak for:
- high-recall retrieval (“what did we decide about X?”, “last time we tried Y?”)
- entity-centric answers (“tell me about Alice / The Castle / warelay”) without rereading many files
- opinion/preference stability (and evidence when it changes)
- time constraints (“what was true during Nov 2025?”) and conflict resolution
## Design goals
- **Offline**: works without network; can run on laptop/Castle; no cloud dependency.
- **Explainable**: retrieved items should be attributable (file + location) and separable from inference.
- **Low ceremony**: daily logging stays Markdown, no heavy schema work.
- **Incremental**: v1 is useful with FTS only; semantic/vector and graphs are optional upgrades.
- **Agent-friendly**: makes “recall within token budgets” easy (return small bundles of facts).
## North star model (Hindsight × Letta)
Two pieces to blend:
1) **Letta/MemGPT-style control loop**
- keep a small “core” always in context (persona + key user facts)
- everything else is out-of-context and retrieved via tools
- memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected next turn
2) **Hindsight-style memory substrate**
- separate whats observed vs whats believed vs whats summarized
- support retain/recall/reflect
- confidence-bearing opinions that can evolve with evidence
- entity-aware retrieval + temporal queries (even without full knowledge graphs)
## Proposed architecture (Markdown source-of-truth + derived index)
### Canonical store (git-friendly)
Keep `~/clawd` as canonical human-readable memory.
Suggested workspace layout:
```
~/clawd/
memory.md # small: durable facts + preferences (core-ish)
memory/
YYYY-MM-DD.md # daily log (append; narrative)
bank/ # “typed” memory pages (stable, reviewable)
world.md # objective facts about the world
experience.md # what the agent did (first-person)
opinions.md # subjective prefs/judgments + confidence + evidence pointers
entities/
Peter.md
The-Castle.md
warelay.md
...
```
Notes:
- **Daily log stays daily log**. No need to turn it into JSON.
- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand.
- `memory.md` remains “small + core-ish”: the things you want Clawd to see every session.
### Derived store (machine recall)
Add a derived index under the workspace (not necessarily git tracked):
```
~/clawd/.memory/index.sqlite
```
Back it with:
- SQLite schema for facts + entity links + opinion metadata
- SQLite **FTS5** for lexical recall (fast, tiny, offline)
- optional embeddings table for semantic recall (still offline)
The index is always **rebuildable from Markdown**.
## Retain / Recall / Reflect (operational loop)
### Retain: normalize daily logs into “facts”
Hindsights key insight that matters here: store **narrative, self-contained facts**, not tiny snippets.
Practical rule for `memory/YYYY-MM-DD.md`:
- at end of day (or during), add a `## Retain` section with 25 bullets that are:
- narrative (cross-turn context preserved)
- self-contained (standalone makes sense later)
- tagged with type + entity mentions
Example:
```
## Retain
- W @Peter: Currently in Marrakech (Nov 27Dec 1, 2025) for Andys birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files.
```
Minimal parsing:
- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated)
- Entities: `@Peter`, `@warelay`, etc (slugs map to `bank/entities/*.md`)
- Opinion confidence: `O(c=0.0..1.0)` optional
If you dont want authors to think about it: the reflect job can infer these bullets from the rest of the log, but having an explicit `## Retain` section is the easiest “quality lever”.
### Recall: queries over the derived index
Recall should support:
- **lexical**: “find exact terms / names / commands” (FTS5)
- **entity**: “tell me about X” (entity pages + entity-linked facts)
- **temporal**: “what happened around Nov 27” / “since last week”
- **opinion**: “what does Peter prefer?” (with confidence + evidence)
Return format should be agent-friendly and cite sources:
- `kind` (`world|experience|opinion|observation`)
- `timestamp` (source day, or extracted time range if present)
- `entities` (`["Peter","warelay"]`)
- `content` (the narrative fact)
- `source` (`memory/2025-11-27.md#L12` etc)
### Reflect: produce stable pages + update beliefs
Reflection is a scheduled job (daily or heartbeat `ultrathink`) that:
- updates `bank/entities/*.md` from recent facts (entity summaries)
- updates `bank/opinions.md` confidence based on reinforcement/contradiction
- optionally proposes edits to `memory.md` (“core-ish” durable facts)
Opinion evolution (simple, explainable):
- each opinion has:
- statement
- confidence `c ∈ [0,1]`
- last_updated
- evidence links (supporting + contradicting fact IDs)
- when new facts arrive:
- find candidate opinions by entity overlap + similarity (FTS first, embeddings later)
- update confidence by small deltas; big jumps require strong contradiction + repeated evidence
## CLI integration: standalone vs deep integration
Recommendation: **deep integration in Clawdbot**, but keep a separable core library.
### Why integrate into Clawdbot?
- Clawdbot already knows:
- the workspace path (`agent.workspace`)
- the session model + heartbeats
- logging + troubleshooting patterns
- You want the agent itself to call the tools:
- `clawdbot memory recall "…" --k 25 --since 30d`
- `clawdbot memory reflect --since 7d`
### Why still split a library?
- keep memory logic testable without gateway/runtime
- reuse from other contexts (local scripts, future desktop app, etc.)
Shape:
- `src/memory/*` (library-ish core; pure functions + sqlite adapter)
- [`src/commands/memory/*.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/commands/memory/*.ts) (CLI glue)
## “S-Collide” / SuCo: when to use it (research)
If “S-Collide” refers to **SuCo (Subspace Collision)**: its an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024).
Pragmatic take for `~/clawd`:
- **dont start** with SuCo.
- start with SQLite FTS + (optional) simple embeddings; youll get most UX wins immediately.
- consider SuCo/HNSW/ScaNN-class solutions only once:
- corpus is big (tens/hundreds of thousands of chunks)
- brute-force embedding search becomes too slow
- recall quality is meaningfully bottlenecked by lexical search
Offline-friendly alternatives (in increasing complexity):
- SQLite FTS5 + metadata filters (zero ML)
- Embeddings + brute force (works surprisingly far if chunk count is low)
- HNSW index (common, robust; needs a library binding)
- SuCo (research-grade; attractive if theres a solid implementation you can embed)
Open question:
- whats the **best** offline embedding model for “personal assistant memory” on your machines (MacBook + Castle)?
- if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.
## Implementation plan (phased, shippable)
### Phase 0: workspace conventions (no code)
- add `bank/` files + entity pages
- add `## Retain` convention to daily logs
### Phase 1: `clawdbot memory index|recall` (FTS-only)
- parse Markdown (`memory/*.md`, `bank/*.md`) into chunks
- write to SQLite: `facts`, `entities`, `fact_entities`, `opinions`
- FTS5 table over `facts.content`
- `recall` returns citations (path + line) + trimmed content budget
### Phase 2: entity summaries + opinion tracking
- `reflect` updates `bank/entities/*.md`
- opinion confidence updates with evidence pointers (no embeddings required yet)
### Phase 3: semantic recall (offline embeddings)
- compute embeddings during indexing (incremental)
- retrieval = `hybrid(FTS, vector)` with simple fusion
### Phase 4: “graph-ish” traversal (still simple)
- entity links enable multi-hop: “related to Peter via warelay”
- optional: “topic” nodes, lightweight edges (not a full KG)
## References
- Letta / MemGPT concepts: “core memory blocks” + “archival memory” + tool-driven self-editing memory.
- Hindsight Technical Report: “retain / recall / reflect”, four-network memory, narrative fact extraction, opinion confidence evolution.
- SuCo: arXiv 2411.14754 (2024): “Subspace Collision” approximate nearest neighbor retrieval.