test: add setup-token live smoke
@@ -133,7 +133,7 @@ Live tests are split into two layers so we can isolate failures:
 - Optional tool-calling stress:
   - `CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1` enables an extra “bash writes file → read reads it back → echo nonce” check.
   - This is specifically meant to catch tool-calling compatibility issues across providers (formatting, history replay, tool_result pairing, etc.).
 - Optional image send smoke:
   - `CLAWDBOT_LIVE_GATEWAY_IMAGE_PROBE=1` sends a real image attachment through the gateway agent pipeline (multimodal message) and asserts the model can read back a per-run code from the image.
   - Flow (high level):
     - Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`)
@@ -142,6 +142,26 @@ Live tests are split into two layers so we can isolate failures:
     - Embedded agent forwards a multimodal user message to the model
     - Assertion: reply contains `cat` + the code (OCR tolerance: minor mistakes allowed)
 
+## Live: Anthropic setup-token smoke
+
+- Test: `src/agents/anthropic.setup-token.live.test.ts`
+- Goal: verify Claude CLI setup-token (or a pasted setup-token profile) can complete an Anthropic prompt.
+- Enable:
+  - `CLAWDBOT_LIVE_TEST=1` or `LIVE=1`
+  - `CLAWDBOT_LIVE_SETUP_TOKEN=1`
+- Token sources (pick one):
+  - Profile: `CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE=anthropic:setup-token-test`
+  - Raw token: `CLAWDBOT_LIVE_SETUP_TOKEN_VALUE=sk-ant-oat01-...`
+- Model override (optional):
+  - `CLAWDBOT_LIVE_SETUP_TOKEN_MODEL=anthropic/claude-opus-4-5`
+
+Setup example:
+
+```bash
+clawdbot models auth paste-token --provider anthropic --profile-id anthropic:setup-token-test
+CLAWDBOT_LIVE_TEST=1 CLAWDBOT_LIVE_SETUP_TOKEN=1 CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE=anthropic:setup-token-test pnpm test:live src/agents/anthropic.setup-token.live.test.ts
+```
+
 ### Recommended live recipes
 
 Narrow, explicit allowlists are fastest and least flaky:
@@ -153,22 +173,41 @@ Narrow, explicit allowlists are fastest and least flaky:
   - `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_MODELS="openai/gpt-5.2" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
 
 - Tool calling across several providers (bash + read probe):
-  - `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="openai/gpt-5.2,anthropic/claude-opus-4-5,google/gemini-flash-latest,zai/glm-4.7,minimax/minimax-m2.1" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
+  - `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="openai/gpt-5.2,anthropic/claude-opus-4-5,google/gemini-3-flash-preview,zai/glm-4.7,minimax/minimax-m2.1" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
 
 - Google focus (Gemini API key + Antigravity):
-  - Gemini (API key): `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_IMAGE_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="google/gemini-flash-latest" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
+  - Gemini (API key): `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_IMAGE_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="google/gemini-3-flash-preview" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
   - Antigravity (OAuth): `LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_ALL_MODELS=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_IMAGE_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="google-antigravity/claude-opus-4-5-thinking,google-antigravity/gemini-3-pro-high" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
 
+Notes:
+- `google/...` uses the Gemini API (API key).
+- `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).
+- `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks).
+
 ## Live: model matrix (what we cover)
 
 There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
 
+### Modern smoke set (tool calling + image)
+
+This is the “common models” run we expect to keep working:
+- OpenAI (non-Codex): `openai/gpt-5.2` (optional: `openai/gpt-5.1`)
+- OpenAI Codex: `openai-codex/gpt-5.2` (optional: `openai-codex/gpt-5.2-codex`)
+- Anthropic: `anthropic/claude-opus-4-5` (or `anthropic/claude-sonnet-4-5`)
+- Google (Gemini API): `google/gemini-3-pro-preview` and `google/gemini-3-flash-preview`
+- Google (Antigravity): `google-antigravity/claude-opus-4-5-thinking` and `google-antigravity/gemini-3-flash`
+- Z.AI (GLM): `zai/glm-4.7`
+- MiniMax: `minimax/minimax-m2.1`
+
+Run gateway smoke with tools + image:
+`LIVE=1 CLAWDBOT_LIVE_GATEWAY=1 CLAWDBOT_LIVE_GATEWAY_TOOL_PROBE=1 CLAWDBOT_LIVE_GATEWAY_IMAGE_PROBE=1 CLAWDBOT_LIVE_GATEWAY_MODELS="openai/gpt-5.2,openai-codex/gpt-5.2,anthropic/claude-opus-4-5,google/gemini-3-pro-preview,google/gemini-3-flash-preview,google-antigravity/claude-opus-4-5-thinking,google-antigravity/gemini-3-flash,zai/glm-4.7,minimax/minimax-m2.1" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
+
 ### Baseline: tool calling (Read + optional Bash)
 
 Pick at least one per provider family:
 - OpenAI: `openai/gpt-5.2` (or `openai/gpt-5-mini`)
 - Anthropic: `anthropic/claude-opus-4-5` (or `anthropic/claude-sonnet-4-5`)
-- Google: `google/gemini-flash-latest` (or `google/gemini-2.5-pro`)
+- Google: `google/gemini-3-flash-preview` (or `google/gemini-3-pro-preview`)
 - Z.AI (GLM): `zai/glm-4.7`
 - MiniMax: `minimax/minimax-m2.1`
 
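The image probe's assertion in the docs above (reply contains `cat` plus the per-run code, with minor OCR mistakes allowed) could be checked along the following lines. This is a minimal sketch and not the code in `src/gateway/live-image-probe.ts`, which is not part of this commit: the helper names `editDistance` and `replyMatchesImageProbe`, and the default tolerance of one character, are assumptions made for illustration.

```typescript
// Hypothetical helper: Levenshtein distance computed with a single DP row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // D[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Hypothetical check: the reply must mention "cat" and contain some
// alphanumeric token within `maxMistakes` edits of the expected code.
function replyMatchesImageProbe(
  reply: string,
  code: string,
  maxMistakes = 1, // assumed tolerance, not taken from the source
): boolean {
  const text = reply.toLowerCase();
  if (!text.includes("cat")) return false;
  const target = code.toLowerCase();
  const tokens = text.match(/[a-z0-9]+/g) ?? [];
  return tokens.some((token) => editDistance(token, target) <= maxMistakes);
}
```

Comparing token by token keeps the check strict about unrelated text while still accepting a single misread character such as `0` for `Q`.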
239
src/agents/anthropic.setup-token.live.test.ts
Normal file
@@ -0,0 +1,239 @@
+import { randomUUID } from "node:crypto";
+import fs from "node:fs/promises";
+import os from "node:os";
+import path from "node:path";
+
+import { type Api, completeSimple, type Model } from "@mariozechner/pi-ai";
+import {
+  discoverAuthStorage,
+  discoverModels,
+} from "@mariozechner/pi-coding-agent";
+import { describe, expect, it } from "vitest";
+import {
+  ANTHROPIC_SETUP_TOKEN_PREFIX,
+  validateAnthropicSetupToken,
+} from "../commands/auth-token.js";
+import { loadConfig } from "../config/config.js";
+import { resolveClawdbotAgentDir } from "./agent-paths.js";
+import {
+  type AuthProfileCredential,
+  ensureAuthProfileStore,
+  saveAuthProfileStore,
+} from "./auth-profiles.js";
+import { getApiKeyForModel } from "./model-auth.js";
+import { normalizeProviderId, parseModelRef } from "./model-selection.js";
+import { ensureClawdbotModelsJson } from "./models-config.js";
+
+const LIVE = process.env.LIVE === "1" || process.env.CLAWDBOT_LIVE_TEST === "1";
+const SETUP_TOKEN_RAW = process.env.CLAWDBOT_LIVE_SETUP_TOKEN?.trim() ?? "";
+const SETUP_TOKEN_VALUE =
+  process.env.CLAWDBOT_LIVE_SETUP_TOKEN_VALUE?.trim() ?? "";
+const SETUP_TOKEN_PROFILE =
+  process.env.CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE?.trim() ?? "";
+const SETUP_TOKEN_MODEL =
+  process.env.CLAWDBOT_LIVE_SETUP_TOKEN_MODEL?.trim() ?? "";
+
+const ENABLED =
+  LIVE && Boolean(SETUP_TOKEN_RAW || SETUP_TOKEN_VALUE || SETUP_TOKEN_PROFILE);
+const describeLive = ENABLED ? describe : describe.skip;
+
+type TokenSource = {
+  agentDir: string;
+  profileId: string;
+  cleanup?: () => Promise<void>;
+};
+
+function isSetupToken(value: string): boolean {
+  return value.startsWith(ANTHROPIC_SETUP_TOKEN_PREFIX);
+}
+
+function listSetupTokenProfiles(store: {
+  profiles: Record<string, AuthProfileCredential>;
+}): string[] {
+  return Object.entries(store.profiles)
+    .filter(([, cred]) => {
+      if (cred.type !== "token") return false;
+      if (normalizeProviderId(cred.provider) !== "anthropic") return false;
+      return isSetupToken(cred.token);
+    })
+    .map(([id]) => id);
+}
+
+function pickSetupTokenProfile(candidates: string[]): string {
+  const preferred = [
+    "anthropic:setup-token-test",
+    "anthropic:setup-token",
+    "anthropic:default",
+  ];
+  for (const id of preferred) {
+    if (candidates.includes(id)) return id;
+  }
+  return candidates[0] ?? "";
+}
+
+async function resolveTokenSource(): Promise<TokenSource> {
+  const explicitToken =
+    (SETUP_TOKEN_RAW && isSetupToken(SETUP_TOKEN_RAW) ? SETUP_TOKEN_RAW : "") ||
+    SETUP_TOKEN_VALUE;
+
+  if (explicitToken) {
+    const error = validateAnthropicSetupToken(explicitToken);
+    if (error) {
+      throw new Error(`Invalid setup-token: ${error}`);
+    }
+    const tempDir = await fs.mkdtemp(
+      path.join(os.tmpdir(), "clawdbot-setup-token-"),
+    );
+    const profileId = `anthropic:setup-token-live-${randomUUID()}`;
+    const store = ensureAuthProfileStore(tempDir, {
+      allowKeychainPrompt: false,
+    });
+    store.profiles[profileId] = {
+      type: "token",
+      provider: "anthropic",
+      token: explicitToken,
+    };
+    saveAuthProfileStore(store, tempDir);
+    return {
+      agentDir: tempDir,
+      profileId,
+      cleanup: async () => {
+        await fs.rm(tempDir, { recursive: true, force: true });
+      },
+    };
+  }
+
+  const agentDir = resolveClawdbotAgentDir();
+  const store = ensureAuthProfileStore(agentDir, {
+    allowKeychainPrompt: false,
+  });
+
+  const candidates = listSetupTokenProfiles(store);
+  if (SETUP_TOKEN_PROFILE) {
+    if (!candidates.includes(SETUP_TOKEN_PROFILE)) {
+      const available =
+        candidates.length > 0 ? candidates.join(", ") : "(none)";
+      throw new Error(
+        `Setup-token profile "${SETUP_TOKEN_PROFILE}" not found. Available: ${available}.`,
+      );
+    }
+    return { agentDir, profileId: SETUP_TOKEN_PROFILE };
+  }
+
+  if (
+    SETUP_TOKEN_RAW &&
+    SETUP_TOKEN_RAW !== "1" &&
+    SETUP_TOKEN_RAW !== "auto"
+  ) {
+    throw new Error(
+      "CLAWDBOT_LIVE_SETUP_TOKEN did not look like a setup-token. Use CLAWDBOT_LIVE_SETUP_TOKEN_VALUE for raw tokens.",
+    );
+  }
+
+  if (candidates.length === 0) {
+    throw new Error(
+      "No Anthropic setup-token profiles found. Set CLAWDBOT_LIVE_SETUP_TOKEN_VALUE or CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE.",
+    );
+  }
+  return { agentDir, profileId: pickSetupTokenProfile(candidates) };
+}
+
+function pickModel(models: Array<Model<Api>>, raw?: string): Model<Api> | null {
+  const normalized = raw?.trim() ?? "";
+  if (normalized) {
+    const parsed = parseModelRef(normalized, "anthropic");
+    if (!parsed) return null;
+    return (
+      models.find(
+        (model) =>
+          normalizeProviderId(model.provider) === parsed.provider &&
+          model.id === parsed.model,
+      ) ?? null
+    );
+  }
+
+  const preferred = [
+    "claude-opus-4-5",
+    "claude-sonnet-4-5",
+    "claude-sonnet-4-0",
+    "claude-haiku-3-5",
+  ];
+  for (const id of preferred) {
+    const match = models.find((model) => model.id === id);
+    if (match) return match;
+  }
+  return models[0] ?? null;
+}
+
+describeLive("live anthropic setup-token", () => {
+  it(
+    "completes using a setup-token profile",
+    async () => {
+      const tokenSource = await resolveTokenSource();
+      try {
+        const cfg = loadConfig();
+        await ensureClawdbotModelsJson(cfg, tokenSource.agentDir);
+
+        const authStorage = discoverAuthStorage(tokenSource.agentDir);
+        const modelRegistry = discoverModels(authStorage, tokenSource.agentDir);
+        const all = Array.isArray(modelRegistry)
+          ? modelRegistry
+          : modelRegistry.getAll();
+        const candidates = all.filter(
+          (model) => normalizeProviderId(model.provider) === "anthropic",
+        ) as Array<Model<Api>>;
+        expect(candidates.length).toBeGreaterThan(0);
+
+        const model = pickModel(candidates, SETUP_TOKEN_MODEL);
+        if (!model) {
+          throw new Error(
+            SETUP_TOKEN_MODEL
+              ? `Model not found: ${SETUP_TOKEN_MODEL}`
+              : "No Anthropic models available.",
+          );
+        }
+
+        const apiKeyInfo = await getApiKeyForModel({
+          model,
+          cfg,
+          profileId: tokenSource.profileId,
+          agentDir: tokenSource.agentDir,
+        });
+        const tokenError = validateAnthropicSetupToken(apiKeyInfo.apiKey);
+        if (tokenError) {
+          throw new Error(
+            `Resolved profile is not a setup-token: ${tokenError}`,
+          );
+        }
+
+        const res = await completeSimple(
+          model,
+          {
+            messages: [
+              {
+                role: "user",
+                content: "Reply with the word ok.",
+                timestamp: Date.now(),
+              },
+            ],
+          },
+          {
+            apiKey: apiKeyInfo.apiKey,
+            maxTokens: 64,
+            temperature: 0,
+          },
+        );
+        const text = res.content
+          .filter((block) => block.type === "text")
+          .map((block) => block.text.trim())
+          .join(" ");
+        expect(text.toLowerCase()).toContain("ok");
+      } finally {
+        if (tokenSource.cleanup) {
+          await tokenSource.cleanup();
+        }
+      }
+    },
+    5 * 60 * 1000,
+  );
+});
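The order in which `resolveTokenSource` consults the environment variables can be summarised as a pure function, which makes the precedence easy to check without a live token. This is a sketch under stated assumptions, not code from the commit: `decideTokenSource`, the `Env` and `Decision` types, and the `SETUP_TOKEN_PREFIX` value are hypothetical. The prefix is taken from the `sk-ant-oat01-...` example in the docs, since the real `ANTHROPIC_SETUP_TOKEN_PREFIX` constant is not shown here. Token validation and the profile-store lookup that the test performs are deliberately left out.

```typescript
// Assumed prefix, based on the `sk-ant-oat01-...` example in the docs;
// the real constant is ANTHROPIC_SETUP_TOKEN_PREFIX in ../commands/auth-token.js.
const SETUP_TOKEN_PREFIX = "sk-ant-oat01-";

// Hypothetical view of the three variables that select a token source.
type Env = {
  CLAWDBOT_LIVE_SETUP_TOKEN?: string;
  CLAWDBOT_LIVE_SETUP_TOKEN_VALUE?: string;
  CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE?: string;
};

type Decision =
  | { kind: "explicit-token"; token: string }
  | { kind: "named-profile"; profileId: string }
  | { kind: "auto-pick" }
  | { kind: "error"; reason: string };

function decideTokenSource(env: Env): Decision {
  const raw = env.CLAWDBOT_LIVE_SETUP_TOKEN?.trim() ?? "";
  const value = env.CLAWDBOT_LIVE_SETUP_TOKEN_VALUE?.trim() ?? "";
  const profile = env.CLAWDBOT_LIVE_SETUP_TOKEN_PROFILE?.trim() ?? "";

  // 1. A raw token wins: CLAWDBOT_LIVE_SETUP_TOKEN itself if it looks like
  //    a token, otherwise CLAWDBOT_LIVE_SETUP_TOKEN_VALUE.
  const explicit = (raw.startsWith(SETUP_TOKEN_PREFIX) ? raw : "") || value;
  if (explicit) return { kind: "explicit-token", token: explicit };

  // 2. Next, an explicitly named profile.
  if (profile) return { kind: "named-profile", profileId: profile };

  // 3. Any other non-flag value in CLAWDBOT_LIVE_SETUP_TOKEN is rejected.
  if (raw && raw !== "1" && raw !== "auto") {
    return {
      kind: "error",
      reason: "CLAWDBOT_LIVE_SETUP_TOKEN did not look like a setup-token",
    };
  }

  // 4. Otherwise fall back to auto-picking from stored profiles.
  return { kind: "auto-pick" };
}
```

For example, `decideTokenSource({ CLAWDBOT_LIVE_SETUP_TOKEN: "1" })` yields the `auto-pick` branch, which in the test corresponds to `pickSetupTokenProfile` choosing among stored Anthropic setup-token profiles.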