fix: harden doctor state integrity checks

This commit is contained in:
Peter Steinberger
2026-01-07 23:22:58 +01:00
parent 17d052bcda
commit ee28b20419
5 changed files with 498 additions and 50 deletions

View File

@@ -141,6 +141,19 @@ gh auth login
gh repo create clawd-workspace --private --source . --remote origin --push
```
Option C: GitLab web UI
1. Create a new **private** repository on GitLab.
2. Do not initialize with a README (avoids merge conflicts).
3. Copy the HTTPS remote URL.
4. Add the remote and push:
```bash
git branch -M main
git remote add origin <https-url>
git push -u origin main
```
### 3) Ongoing updates
```bash

View File

@@ -6,52 +6,10 @@ read_when:
---
# Doctor
`clawdbot doctor` is the repair + migration tool for Clawdbot. It runs a quick health check, audits skills, and can migrate deprecated config entries to the new schema.
`clawdbot doctor` is the repair + migration tool for Clawdbot. It fixes stale
config/state, checks health, and provides actionable repair steps.
## What it does
- Runs a health check and offers to restart the gateway if it looks unhealthy.
- Prints a skills status summary (eligible/missing/blocked).
- Detects deprecated config keys and offers to migrate them.
- Migrates legacy `~/.clawdis/clawdis.json` when no Clawdbot config exists.
- Checks sandbox Docker images when sandboxing is enabled (offers to build or switch to legacy names).
- Detects legacy Clawdis services (launchd/systemd; legacy schtasks for native Windows) and offers to migrate them.
- Detects other gateway-like services and prints cleanup hints (optional deep scan for system services).
- On Linux, checks if systemd user lingering is enabled and can enable it (required to keep the Gateway alive after logout).
- Migrates legacy on-disk state layouts (sessions, agentDir, provider auth dirs) into the current per-agent/per-account structure.
## Legacy config file migration
If `~/.clawdis/clawdis.json` exists and `~/.clawdbot/clawdbot.json` does not, doctor will migrate the file and normalize old paths/image names.
## Legacy config migrations
When the config contains deprecated keys, other commands will refuse to run and ask you to run `clawdbot doctor`.
Doctor will:
- Explain which legacy keys were found.
- Show the migration it applied.
- Rewrite `~/.clawdbot/clawdbot.json` with the updated schema.
The Gateway also auto-runs doctor migrations on startup when it detects a legacy
config format, so stale configs are repaired without manual intervention.
Current migrations:
- `routing.allowFrom``whatsapp.allowFrom`
- `agent.model`/`allowedModels`/`modelAliases`/`modelFallbacks`/`imageModelFallbacks`
`agent.models` + `agent.model.primary/fallbacks` + `agent.imageModel.primary/fallbacks`
## Legacy state migrations (disk layout)
Doctor can migrate older on-disk layouts into the current structure:
- Sessions store + transcripts:
- from `~/.clawdbot/sessions/` to `~/.clawdbot/agents/<agentId>/sessions/`
- Agent dir:
- from `~/.clawdbot/agent/` to `~/.clawdbot/agents/<agentId>/agent/`
- WhatsApp auth state (Baileys):
- from legacy `~/.clawdbot/credentials/*.json` (except `oauth.json`)
- to `~/.clawdbot/credentials/whatsapp/<accountId>/...` (default account id: `default`)
These migrations are best-effort and idempotent; doctor will emit warnings when it leaves any legacy folders behind as backups.
The Gateway/CLI also auto-migrates the legacy sessions + agent dir on startup so history/auth/models land in the per-agent path without a manual doctor run. WhatsApp auth is intentionally only migrated via `clawdbot doctor`.
## Usage
## Quick start
```bash
clawdbot doctor
@@ -83,7 +41,112 @@ If you want to review changes before writing, open the config file first:
cat ~/.clawdbot/clawdbot.json
```
## Legacy service migrations
Doctor checks for older Clawdis gateway services (launchd/systemd/schtasks). WSL2 installs use systemd.
If found, it offers to remove them and install the Clawdbot service using the current gateway port.
Remote mode skips the install step, and Nix mode only reports what it finds.
## What it does (summary)
- Health check + restart prompt.
- Skills status summary (eligible/missing/blocked).
- Legacy config migration and normalization.
- Legacy on-disk state migration (sessions/agent dir/WhatsApp auth).
- State integrity and permissions checks (sessions, transcripts, state dir).
- Sandbox image repair when sandboxing is enabled.
- Legacy service migration and extra gateway detection.
- Security warnings for open DM policies.
- systemd linger check on Linux.
- Writes updated config + wizard metadata.
## Detailed behavior and rationale
### 1) Legacy config file migration
If `~/.clawdis/clawdis.json` exists and `~/.clawdbot/clawdbot.json` does not,
doctor migrates the file and normalizes old paths/image names. This prevents
new installs from silently booting with the wrong schema.
### 2) Legacy config key migrations
When the config contains deprecated keys, other commands refuse to run and ask
you to run `clawdbot doctor`.
Doctor will:
- Explain which legacy keys were found.
- Show the migration it applied.
- Rewrite `~/.clawdbot/clawdbot.json` with the updated schema.
The Gateway also auto-runs doctor migrations on startup when it detects a
legacy config format, so stale configs are repaired without manual intervention.
Current migrations:
- `routing.allowFrom``whatsapp.allowFrom`
- `agent.model`/`allowedModels`/`modelAliases`/`modelFallbacks`/`imageModelFallbacks`
`agent.models` + `agent.model.primary/fallbacks` + `agent.imageModel.primary/fallbacks`
### 3) Legacy state migrations (disk layout)
Doctor can migrate older on-disk layouts into the current structure:
- Sessions store + transcripts:
- from `~/.clawdbot/sessions/` to `~/.clawdbot/agents/<agentId>/sessions/`
- Agent dir:
- from `~/.clawdbot/agent/` to `~/.clawdbot/agents/<agentId>/agent/`
- WhatsApp auth state (Baileys):
- from legacy `~/.clawdbot/credentials/*.json` (except `oauth.json`)
- to `~/.clawdbot/credentials/whatsapp/<accountId>/...` (default account id: `default`)
These migrations are best-effort and idempotent; doctor will emit warnings when
it leaves any legacy folders behind as backups. The Gateway/CLI also auto-migrates
the legacy sessions + agent dir on startup so history/auth/models land in the
per-agent path without a manual doctor run. WhatsApp auth is intentionally only
migrated via `clawdbot doctor`.
### 4) State integrity checks (session persistence, routing, and safety)
The state directory is the operational brainstem. If it vanishes, you lose
sessions, credentials, logs, and config (unless you have backups elsewhere).
Doctor checks:
- **State dir missing**: warns about catastrophic state loss, prompts to recreate
the directory, and reminds you that it cannot recover missing data.
- **State dir permissions**: verifies writability; offers to repair permissions
(and emits a `chown` hint when owner/group mismatch is detected).
- **Session dirs missing**: `sessions/` and the session store directory are
required to persist history and avoid `ENOENT` crashes.
- **Transcript mismatch**: warns when recent session entries have missing
transcript files.
- **Main session “1-line JSONL”**: flags when the main transcript has only one
line (history is not accumulating).
- **Multiple state dirs**: warns when multiple `~/.clawdbot` folders exist across
home directories or when `CLAWDBOT_STATE_DIR` points elsewhere (history can
split between installs).
- **Remote mode reminder**: if `gateway.mode=remote`, doctor reminds you to run
it on the remote host (the state lives there).
### 5) Sandbox image repair
When sandboxing is enabled, doctor checks Docker images and offers to build or
switch to legacy names if the current image is missing.
### 6) Gateway service migrations and cleanup hints
Doctor detects legacy Clawdis gateway services (launchd/systemd/schtasks) and
offers to remove them and install the Clawdbot service using the current gateway
port. It can also scan for extra gateway-like services and print cleanup hints
to ensure only one gateway runs per machine.
### 7) Security warnings
Doctor emits warnings when a provider is open to DMs without an allowlist, or
when a policy is configured in a dangerous way.
### 8) systemd linger (Linux)
If running as a systemd user service, doctor ensures lingering is enabled so the
gateway stays alive after logout.
### 9) Skills status
Doctor prints a quick summary of eligible/missing/blocked skills for the current
workspace.
### 10) Gateway health check + restart
Doctor runs a health check and offers to restart the gateway when it looks
unhealthy.
### 11) Config write + wizard metadata
Doctor persists any config changes and stamps wizard metadata to record the
doctor run.
### 12) Workspace tips (backup + memory system)
Doctor suggests a workspace memory system when missing and prints a backup tip
if the workspace is not already under git.
See [/concepts/agent-workspace](/concepts/agent-workspace) for a full guide to
workspace structure and git backup (recommended private GitHub or GitLab).