--- summary: "Testing kit: unit/e2e/live suites, Docker runners, and what each test covers" read_when: - Running tests locally or in CI - Adding regressions for model/provider bugs - Debugging gateway + agent behavior --- # Testing Clawdbot has three Vitest suites (unit, e2e, live) plus a couple Docker helpers for “run with my real keys” smoke checks. ## Quick start - Full gate (what we expect before push): `pnpm lint && pnpm build && pnpm test` - Coverage gate: `pnpm test:coverage` - E2E suite: `pnpm test:e2e` - Live suite (opt-in): `LIVE=1 pnpm test:live` ## Test suites (what runs where) ### Unit / integration (default) - Command: `pnpm test` - Config: `vitest.config.ts` - Files: `src/**/*.test.ts` - Scope: pure unit tests + in-process integration tests (gateway server auth, routing, tooling, parsing, config). ### E2E (gateway smoke) - Command: `pnpm test:e2e` - Config: `vitest.e2e.config.ts` - Files: `src/**/*.e2e.test.ts` - Scope: multi-instance gateway end-to-end behavior (WebSocket/HTTP/node pairing), heavier networking surface. ### Live (real providers + real models) - Command: `pnpm test:live` - Config: `vitest.live.config.ts` - Files: `src/**/*.live.test.ts` - Default: **skipped** unless `LIVE=1` (or `CLAWDBOT_LIVE_TEST=1`) - Scope: “does this provider/model actually work today with real creds”. ## Live: model smoke (profile keys) Two layers: 1. Direct model completion (no gateway): - Test: `src/agents/models.profiles.live.test.ts` - Goal: enumerate discovered models, use `getApiKeyForModel` to pick ones you have creds for, then run a small completion. - Selection: - `CLAWDBOT_LIVE_ALL_MODELS=1` (required to run the suite) - `CLAWDBOT_LIVE_MODELS=all` or comma allowlist (`openai/gpt-5.2,anthropic/claude-opus-4-5,...`) - `CLAWDBOT_LIVE_REQUIRE_PROFILE_KEYS=1` to ensure creds come from the profile store (not ad-hoc env). - Regression hook: OpenAI Responses tool-only → follow-up path (the `reasoning` replay class) is covered here. 2. Gateway + dev agent smoke (what “@clawdbot” actually does): - Test: `src/gateway/gateway-models.profiles.live.test.ts` - Goal: spin up an in-process gateway, create/patch a `agent:dev:*` session, iterate models-with-keys, and assert “meaningful” responses. - Selection: - `CLAWDBOT_LIVE_GATEWAY=1` - `CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1` (scan all discovered models with available keys) - `CLAWDBOT_LIVE_GATEWAY_MODELS=all` or comma allowlist - Extra regression: for OpenAI Responses/Codex Responses models, force a tool-call-only turn followed by a user question (the exact failure mode that produced `400 … reasoning … required following item`). ## Credentials (never commit) Live tests discover credentials the same way the CLI does: - Profile store: `~/.clawdbot/credentials/` (preferred; what “profile keys” means in the tests) - Config: `~/.clawdbot/clawdbot.json` (or `CLAWDBOT_CONFIG_PATH`) If you want to rely on env keys (e.g. exported in your `~/.profile`), run local tests after `source ~/.profile`, or use the Docker runners below (they can mount `~/.profile` into the container). 
## Docker runners (optional “works in Linux” checks)

These run `pnpm test:live` inside the repo Docker image, mounting your local config dir and workspace:

- Direct models: `scripts/test-live-models-docker.sh`
- Gateway + dev agent: `scripts/test-live-gateway-models-docker.sh`

Useful env vars:

- `CLAWDBOT_CONFIG_DIR=...` (default: `~/.clawdbot`), mounted to `/home/node/.clawdbot`
- `CLAWDBOT_WORKSPACE_DIR=...` (default: `~/clawd`), mounted to `/home/node/clawd`
- `CLAWDBOT_LIVE_GATEWAY_MODELS=...` / `CLAWDBOT_LIVE_MODELS=...` to narrow the run

A sample invocation is sketched at the end of this page.

## Docs sanity

Run docs checks after doc edits: `pnpm docs:list`.
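As referenced from the Docker runners section above, a sample narrowed run might look like this (a sketch, assuming the default mounts and the env vars listed there; the model ID is just the allowlist example from earlier):

```sh
# Run the gateway + dev agent live smoke in Docker, limited to one model
CLAWDBOT_CONFIG_DIR="$HOME/.clawdbot" \
CLAWDBOT_LIVE_GATEWAY_MODELS=openai/gpt-5.2 \
scripts/test-live-gateway-models-docker.sh
```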