Verbose: send tool result metadata only

Peter Steinberger
2025-12-03 09:40:05 +00:00
parent 394c751d7d
commit 318166f8b0
8 changed files with 108 additions and 31 deletions

@@ -4,7 +4,7 @@
 ### Highlights
 - **Thinking directives & state:** `/t|/think|/thinking <level>` (aliases off|minimal|low|medium|high|max/highest). Inline applies to that message; directive-only message pins the level for the session; `/think:off` clears. Resolution: inline > session override > `inbound.reply.thinkingDefault` > off. Pi/Tau get `--thinking <level>` (except off); other agents append cue words (`think` → `think hard` → `think harder` → `ultrathink`). Heartbeat probe uses `HEARTBEAT /think:high`.
-- **Verbose directives + session hints:** `/v|/verbose on|full|off` mirrors thinking: inline > session > config default. Directive-only replies with an acknowledgement; invalid levels return a hint. When enabled, tool results from JSON-emitting agents (Pi/Tau, etc.) are forwarded as `[🛠️ <tool-name>]` messages (now streamed as they happen), and new sessions surface a `🧭 New session: <id>` hint.
+- **Verbose directives + session hints:** `/v|/verbose on|full|off` mirrors thinking: inline > session > config default. Directive-only replies with an acknowledgement; invalid levels return a hint. When enabled, tool results from JSON-emitting agents (Pi/Tau, etc.) are forwarded as metadata-only `[🛠️ <tool-name>]` messages (now streamed as they happen), and new sessions surface a `🧭 New session: <id>` hint.
 - **Directive confirmations:** Directive-only messages now reply with an acknowledgement (`Thinking level set to high.` / `Thinking disabled.`) and reject unknown levels with a helpful hint (state is unchanged).
 - **Pi/Tau stability:** RPC replies buffered until the assistant turn finishes; parsers return consistent `texts[]`; web auto-replies keep a warm Tau RPC process to avoid cold starts.
 - **Claude prompt flow:** One-time `sessionIntro` with per-message `/think:high` bodyPrefix; system prompt always sent on first turn even with `sendSystemOnce`.

@@ -166,7 +166,7 @@ warelay supports running on the same phone number you message from—you chat wi
 - Levels: `on|full` (same) or `off` (default). Use `/v on`, `/verbose:full`, `/v off`, etc.; colon optional.
 - Directive-only message sets a session-level verbose flag (`Verbose logging enabled./disabled.`); invalid levels reply with a hint and don't change state.
 - Inline directive applies only to that message; resolution: inline > session default > `inbound.reply.verboseDefault` (config) > off.
-- When verbose is on **and the agent emits structured tool results (Pi/Tau and other JSON-emitting agents)**, tool results are sent back as separate messages prefixed with `🛠️`.
+- When verbose is on **and the agent emits structured tool results (Pi/Tau and other JSON-emitting agents)**, only tool metadata is forwarded: each tool result becomes `[🛠️ <tool-name>]` (output/body is not inlined).
 - Starting a new session while verbose is on adds a first reply like `🧭 New session: <id>` so you can correlate runs.
 ### Logging (optional)
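
The resolution order documented in this hunk (inline > session default > `inbound.reply.verboseDefault` > off) could be sketched roughly as below. This is an illustration only: `normalizeVerbose` and `resolveVerbose` are hypothetical names, not functions from this commit, and the sketch simply treats an unrecognized level as absent, whereas the documented behavior for an invalid directive is to reply with a hint.

```typescript
// Hypothetical sketch; names are illustrative, not warelay's actual API.
type VerboseLevel = "on" | "off";

// `on` and `full` are documented as equivalent; anything else is treated as absent here.
function normalizeVerbose(raw?: string): VerboseLevel | undefined {
  const v = raw?.trim().toLowerCase();
  if (v === "on" || v === "full") return "on";
  if (v === "off") return "off";
  return undefined;
}

// First defined value wins: inline > session > config default > off.
function resolveVerbose(
  inline?: string,
  session?: string,
  configDefault?: string,
): VerboseLevel {
  return (
    normalizeVerbose(inline) ??
    normalizeVerbose(session) ??
    normalizeVerbose(configDefault) ??
    "off"
  );
}

console.log(resolveVerbose("off", "on", "on")); // off (inline wins)
console.log(resolveVerbose(undefined, "full", undefined)); // on (session level)
console.log(resolveVerbose()); // off (nothing set)
```

Using `??` rather than `||` keeps an explicit `off` from falling through to a lower-priority `on`.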

@@ -28,7 +28,7 @@
 - Levels: `on|full` or `off` (default).
 - Directive-only message toggles session verbose and replies `Verbose logging enabled.` / `Verbose logging disabled.`; invalid levels return a hint without changing state.
 - Inline directive affects only that message; session/global defaults apply otherwise.
-- When verbose is on, agents that emit structured tool results (Pi/Tau, other JSON agents) send each tool result back as its own message, prefixed with `🛠️`.
+- When verbose is on, agents that emit structured tool results (Pi/Tau, other JSON agents) send each tool result back as its own metadata-only message, prefixed with `[🛠️ <tool-name>]` (the tool output itself is not forwarded).
 ## Heartbeats
 - Heartbeat probe body is `HEARTBEAT /think:high`, so it always asks for max thinking on the probe. Inline directive wins; session/global defaults are used only when no directive is present.

@@ -67,6 +67,15 @@ describe("agent buildArgs + parseOutput helpers", () => {
     expect((parsed.meta?.usage as { output?: number })?.output).toBe(5);
   });

+  it("piSpec carries tool names when present", () => {
+    const stdout =
+      '{"type":"message_end","message":{"role":"tool_result","name":"bash","content":[{"type":"text","text":"ls output"}]}}';
+    const parsed = piSpec.parseOutput(stdout);
+    const tool = parsed.toolResults?.[0] as { text?: string; toolName?: string };
+    expect(tool?.text).toBe("ls output");
+    expect(tool?.toolName).toBe("bash");
+  });
+
   it("codexSpec parses agent_message and aggregates usage", () => {
     const stdout = [
       '{"type":"item.completed","item":{"type":"agent_message","text":"hi there"}}',

@@ -1,6 +1,11 @@
 import path from "node:path";
-import type { AgentMeta, AgentParseResult, AgentSpec } from "./types.js";
+import type {
+  AgentMeta,
+  AgentParseResult,
+  AgentSpec,
+  AgentToolResult,
+} from "./types.js";

 type PiAssistantMessage = {
   role?: string;
@@ -9,15 +14,37 @@ type PiAssistantMessage = {
   model?: string;
   provider?: string;
   stopReason?: string;
+  name?: string;
+  toolName?: string;
+  tool_call_id?: string;
   toolCallId?: string;
 };

+function inferToolName(msg: PiAssistantMessage): string | undefined {
+  const candidates = [
+    msg.toolName,
+    msg.name,
+    msg.toolCallId,
+    msg.tool_call_id,
+  ]
+    .map((c) => (typeof c === "string" ? c.trim() : ""))
+    .filter(Boolean);
+  if (candidates.length) return candidates[0];
+  if (msg.role && msg.role.includes(":")) {
+    const suffix = msg.role.split(":").slice(1).join(":").trim();
+    if (suffix) return suffix;
+  }
+  return undefined;
+}
+
 function parsePiJson(raw: string): AgentParseResult {
   const lines = raw.split(/\n+/).filter((l) => l.trim().startsWith("{"));
   // Collect only completed assistant messages (skip streaming updates/toolcalls).
   const texts: string[] = [];
-  const toolResults: string[] = [];
+  const toolResults: AgentToolResult[] = [];
   let lastAssistant: PiAssistantMessage | undefined;
   let lastPushed: string | undefined;
@@ -59,7 +86,9 @@ function parsePiJson(raw: string): AgentParseResult {
             .map((c) => c.text)
             .join("\n")
             .trim();
-          if (toolText) toolResults.push(toolText);
+          if (toolText) {
+            toolResults.push({ text: toolText, toolName: inferToolName(msg) });
+          }
         }
       } catch {
         // ignore malformed lines
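
To make the lookup order of `inferToolName` concrete, here is a small self-contained sketch. The function body follows the one added in this commit; the `ToolMessageLike` type is narrowed to the fields the function reads, and the sample messages are hypothetical, not captured Pi/Tau output.

```typescript
// Only the fields inferToolName reads.
type ToolMessageLike = {
  name?: string;
  toolName?: string;
  tool_call_id?: string;
  toolCallId?: string;
  role?: string;
};

// Same logic as the function added in this commit, reproduced so the example runs on its own.
function inferToolName(msg: ToolMessageLike): string | undefined {
  const candidates = [msg.toolName, msg.name, msg.toolCallId, msg.tool_call_id]
    .map((c) => (typeof c === "string" ? c.trim() : ""))
    .filter(Boolean);
  if (candidates.length) return candidates[0];
  if (msg.role && msg.role.includes(":")) {
    const suffix = msg.role.split(":").slice(1).join(":").trim();
    if (suffix) return suffix;
  }
  return undefined;
}

// Hypothetical sample messages.
const samples: ToolMessageLike[] = [
  { role: "tool_result", name: "bash" }, // -> "bash"
  { role: "tool_result", toolName: " read ", name: "bash" }, // -> "read" (toolName first, trimmed)
  { role: "tool_result", toolCallId: "call_1" }, // -> "call_1" (call id used as a label)
  { role: "tool_result:grep" }, // -> "grep" (suffix after the colon)
  { role: "tool_result" }, // -> undefined
];
for (const s of samples) console.log(inferToolName(s));
```

Note that a message carrying only a call id is labeled with that id, since the id fields sit in the same candidate list as the name fields.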

@@ -15,11 +15,16 @@ export type AgentMeta = {
   extra?: Record<string, unknown>;
 };

+export type AgentToolResult = {
+  text: string;
+  toolName?: string;
+};
+
 export type AgentParseResult = {
   // Plural to support agents that emit multiple assistant turns per prompt.
   texts?: string[];
   mediaUrls?: string[];
-  toolResults?: string[];
+  toolResults?: Array<string | AgentToolResult>;
   meta?: AgentMeta;
 };

@@ -2,7 +2,7 @@ import fs from "node:fs/promises";
 import path from "node:path";
 import { type AgentKind, getAgentSpec } from "../agents/index.js";
-import type { AgentMeta } from "../agents/types.js";
+import type { AgentMeta, AgentToolResult } from "../agents/types.js";
 import type { WarelayConfig } from "../config/config.js";
 import { isVerbose, logVerbose } from "../globals.js";
 import { logError } from "../logger.js";
@@ -53,6 +53,51 @@ export type CommandReplyResult = {
   meta: CommandReplyMeta;
 };

+type ToolMessageLike = {
+  name?: string;
+  toolName?: string;
+  tool_call_id?: string;
+  toolCallId?: string;
+  role?: string;
+};
+
+function inferToolName(message?: ToolMessageLike): string | undefined {
+  if (!message) return undefined;
+  const candidates = [
+    message.toolName,
+    message.name,
+    message.toolCallId,
+    message.tool_call_id,
+  ]
+    .map((c) => (typeof c === "string" ? c.trim() : ""))
+    .filter(Boolean);
+  if (candidates.length) return candidates[0];
+  if (message.role && message.role.includes(":")) {
+    const suffix = message.role.split(":").slice(1).join(":").trim();
+    if (suffix) return suffix;
+  }
+  return undefined;
+}
+
+function normalizeToolResults(
+  toolResults?: Array<string | AgentToolResult>,
+): AgentToolResult[] {
+  if (!toolResults) return [];
+  return toolResults
+    .map((tr) => (typeof tr === "string" ? { text: tr } : tr))
+    .map((tr) => ({
+      text: (tr.text ?? "").trim(),
+      toolName: tr.toolName?.trim() || undefined,
+    }))
+    .filter((tr) => tr.text.length > 0);
+}
+
+function formatToolPrefix(toolName?: string) {
+  const label = toolName?.trim() || "tool";
+  return `[🛠️ ${label}]`;
+}
+
 export function summarizeClaudeMetadata(payload: unknown): string | undefined {
   if (!payload || typeof payload !== "object") return undefined;
   const obj = payload as Record<string, unknown>;
@@ -289,23 +334,14 @@ export async function runCommandReply(
                       ev.message?.role === "tool_result" &&
                       Array.isArray(ev.message.content)
                     ) {
-                      const text = (
-                        ev.message.content as Array<{ text?: string }>
-                      )
-                        .map((c) => c.text)
-                        .filter((t): t is string => !!t)
-                        .join("\n")
-                        .trim();
-                      if (text) {
-                        const { text: cleanedText, mediaUrls: mediaFound } =
-                          splitMediaFromOutput(`🛠️ ${text}`);
-                        void onPartialReply({
-                          text: cleanedText,
-                          mediaUrls: mediaFound?.length
-                            ? mediaFound
-                            : undefined,
-                        } as ReplyPayload);
-                      }
+                      const toolName = inferToolName(ev.message);
+                      const prefix = formatToolPrefix(toolName);
+                      const { text: cleanedText, mediaUrls: mediaFound } =
+                        splitMediaFromOutput(prefix);
+                      void onPartialReply({
+                        text: cleanedText,
+                        mediaUrls: mediaFound?.length ? mediaFound : undefined,
+                      } as ReplyPayload);
                     }
                   } catch {
                     // ignore malformed lines
@@ -341,8 +377,7 @@ export async function runCommandReply(
     // Collect assistant texts and tool results from parseOutput (tau RPC can emit many).
     const parsedTexts =
       parsed?.texts?.map((t) => t.trim()).filter(Boolean) ?? [];
-    const parsedToolResults =
-      parsed?.toolResults?.map((t) => t.trim()).filter(Boolean) ?? [];
+    const parsedToolResults = normalizeToolResults(parsed?.toolResults);
     type ReplyItem = { text: string; media?: string[] };
     const replyItems: ReplyItem[] = [];
@@ -352,7 +387,7 @@ export async function runCommandReply(
     if (includeToolResultsInline) {
       for (const tr of parsedToolResults) {
-        const prefixed = `🛠️ ${tr}`;
+        const prefixed = formatToolPrefix(tr.toolName);
         const { text: cleanedText, mediaUrls: mediaFound } =
           splitMediaFromOutput(prefixed);
         replyItems.push({
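
The combined effect of `normalizeToolResults` and `formatToolPrefix` on the non-streaming path can be illustrated with the sketch below. The two helpers follow the versions in this diff (with an explicit return type annotation added so the snippet type-checks in isolation); `toVerboseMessages` is a hypothetical wrapper that stands in for the loop above and leaves out the `splitMediaFromOutput` step.

```typescript
type AgentToolResult = { text: string; toolName?: string };

// Accepts legacy string entries and structured entries; drops entries with blank text.
function normalizeToolResults(
  toolResults?: Array<string | AgentToolResult>,
): AgentToolResult[] {
  if (!toolResults) return [];
  return toolResults
    .map((tr): AgentToolResult => (typeof tr === "string" ? { text: tr } : tr))
    .map((tr) => ({
      text: tr.text.trim(),
      toolName: tr.toolName?.trim() || undefined,
    }))
    .filter((tr) => tr.text.length > 0);
}

// Falls back to the generic label "tool" when no name is known.
function formatToolPrefix(toolName?: string): string {
  const label = toolName?.trim() || "tool";
  return `[🛠️ ${label}]`;
}

// Hypothetical wrapper: one metadata-only message per surviving tool result.
function toVerboseMessages(
  toolResults?: Array<string | AgentToolResult>,
): string[] {
  return normalizeToolResults(toolResults).map((tr) =>
    formatToolPrefix(tr.toolName),
  );
}

const messages = toVerboseMessages([
  "ls output", // legacy string entry, no tool name
  { text: "file contents", toolName: " read " },
  { text: "   ", toolName: "bash" }, // blank output: removed by the filter
]);
console.log(messages); // [ '[🛠️ tool]', '[🛠️ read]' ]
```

The tool output is still used to decide whether a result is reported at all (blank results are skipped), but it never appears in the forwarded message.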

@@ -719,7 +719,7 @@ describe("config and templating", () => {
     const rpcSpy = vi.spyOn(tauRpc, "runPiRpc").mockResolvedValue({
       stdout:
         '{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"summary"}]}}\n' +
-        '{"type":"message_end","message":{"role":"tool_result","content":[{"type":"text","text":"ls output"}]}}',
+        '{"type":"message_end","message":{"role":"tool_result","name":"bash","content":[{"type":"text","text":"ls output"}]}}',
       stderr: "",
       code: 0,
       signal: null,
@@ -744,8 +744,7 @@ describe("config and templating", () => {
     expect(rpcSpy).toHaveBeenCalled();
     const payloads = Array.isArray(res) ? res : res ? [res] : [];
     expect(payloads.length).toBeGreaterThanOrEqual(2);
-    expect(payloads[0]?.text).toContain("🛠️");
-    expect(payloads[0]?.text).toContain("ls output");
+    expect(payloads[0]?.text).toBe("[🛠️ bash]");
     expect(payloads[1]?.text).toContain("summary");
   });