refactor: align agent lifecycle
61
docs/agent-loop.md
Normal file
@@ -0,0 +1,61 @@
---
summary: "Agent loop lifecycle, streams, and wait semantics"
read_when:
  - You need an exact walkthrough of the agent loop or lifecycle events
---
# Agent Loop (Clawdis)

Short, exact flow of one agent run. Source of truth: current code in `src/`.

## Entry points
- Gateway RPC: `agent` and `agent.wait` in `src/gateway/server-methods/agent.ts`.
- CLI: `agentCommand` in `src/commands/agent.ts`.

## High-level flow
1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately.
2) `agentCommand` runs the agent:
   - resolves model + thinking/verbose defaults
   - loads skills snapshot
   - calls `runEmbeddedPiAgent` (pi-agent-core runtime)
   - emits **lifecycle end/error** if the embedded loop does not emit one
3) `runEmbeddedPiAgent`:
   - builds `AgentSession` and subscribes to pi events
   - streams assistant deltas + tool events
   - enforces timeout -> aborts run if exceeded
   - returns payloads + usage metadata
4) `subscribeEmbeddedPiSession` bridges pi-agent-core events to Clawdis `agent` stream:
   - tool events => `stream: "tool"`
   - assistant deltas => `stream: "assistant"`
   - lifecycle events => `stream: "lifecycle"` (`phase: "start" | "end" | "error"`)
5) `agent.wait` uses `waitForAgentJob`:
   - waits for **lifecycle end/error** for `runId`
   - returns `{ status: ok|error|timeout, startedAt, endedAt, error? }`
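The wait in step 5 can be pictured as a promise that settles on the first lifecycle `end`/`error` for a `runId`, or on a timer, whichever comes first. The sketch below is illustrative only: `LifecycleBus`, `waitForRun`, and the `at` timestamp field are hypothetical names, and this is not the actual `waitForAgentJob` code.

```typescript
// Hypothetical sketch; not the real waitForAgentJob.
type LifecycleEvent = {
  runId: string;
  phase: "start" | "end" | "error";
  at: number; // assumed timestamp field (ms)
  error?: string;
};

type WaitResult = {
  status: "ok" | "error" | "timeout";
  startedAt?: number;
  endedAt?: number;
  error?: string;
};

type Listener = (evt: LifecycleEvent) => void;

// Minimal in-memory bus standing in for the agent event stream.
class LifecycleBus {
  private listeners = new Set<Listener>();

  subscribe(fn: Listener): () => void {
    this.listeners.add(fn);
    return () => {
      this.listeners.delete(fn);
    };
  }

  emit(evt: LifecycleEvent): void {
    // Copy first so listeners may unsubscribe while we iterate.
    for (const fn of Array.from(this.listeners)) fn(evt);
  }
}

function waitForRun(bus: LifecycleBus, runId: string, timeoutMs = 30000): Promise<WaitResult> {
  return new Promise<WaitResult>((resolve) => {
    let startedAt: number | undefined;
    let timer: ReturnType<typeof setTimeout> | undefined;
    let unsubscribe: () => void = () => {};

    // Settle once: stop the timer, detach the listener, resolve.
    const finish = (result: WaitResult): void => {
      if (timer !== undefined) clearTimeout(timer);
      unsubscribe();
      resolve(result);
    };

    unsubscribe = bus.subscribe((evt) => {
      if (evt.runId !== runId) return; // ignore other runs
      if (evt.phase === "start") {
        startedAt = evt.at;
        return;
      }
      finish({
        status: evt.phase === "end" ? "ok" : "error",
        startedAt,
        endedAt: evt.at,
        error: evt.error,
      });
    });

    // Wait-only timeout: it ends the wait, it does not touch the run.
    timer = setTimeout(() => finish({ status: "timeout", startedAt }), timeoutMs);
  });
}
```

Note that the timer in this sketch only resolves the waiter with `status: "timeout"`; nothing here aborts the run itself.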

## Event streams (today)
- `lifecycle`: emitted by `subscribeEmbeddedPiSession` (and as a fallback by `agentCommand`)
- `assistant`: streamed deltas from pi-agent-core
- `tool`: streamed tool events from pi-agent-core
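One way to model the three streams is a discriminated union keyed on `stream`. This is a sketch under assumptions: only the `stream` names and the `phase` values come from this doc; the envelope (`runId`, `data`) and the payload fields (`delta`, `name`, `error`) are hypothetical.

```typescript
// Assumed envelope: only the `stream` and `phase` values come from the doc.
type AgentStreamEvent =
  | { runId: string; stream: "lifecycle"; data: { phase: "start" | "end" | "error"; error?: string } }
  | { runId: string; stream: "assistant"; data: { delta: string } }
  | { runId: string; stream: "tool"; data: { name: string } };

// True for the lifecycle events that finish a run (end or error).
function isTerminal(evt: AgentStreamEvent): boolean {
  return evt.stream === "lifecycle" && evt.data.phase !== "start";
}

// Short one-line description per stream, e.g. for logging.
function describe(evt: AgentStreamEvent): string {
  switch (evt.stream) {
    case "lifecycle":
      return `lifecycle:${evt.data.phase}`;
    case "assistant":
      return `assistant:+${evt.data.delta.length} chars`;
    case "tool":
      return `tool:${evt.data.name}`;
  }
}
```

Switching on `stream` lets the compiler narrow `data` per case, so a consumer cannot read `delta` off a tool event by accident.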

## Chat surface handling
- `createAgentEventHandler` in `src/gateway/server-chat.ts`:
  - buffers assistant deltas
  - emits chat `delta` messages
  - emits chat `final` when **lifecycle end/error** arrives
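The buffering behaviour can be sketched as a small closure over a per-run text buffer. Everything here is an assumption made for illustration: `createChatHandler`, the event and message shapes, and the choice to send accumulated text in each `delta` are not taken from `createAgentEventHandler`.

```typescript
// Hypothetical shapes; the real handler is createAgentEventHandler.
type AgentEvent =
  | { runId: string; stream: "assistant"; delta: string }
  | { runId: string; stream: "lifecycle"; phase: "start" | "end" | "error" };

type ChatMessage = { runId: string; state: "delta" | "final"; text: string };

function createChatHandler(send: (msg: ChatMessage) => void) {
  const buffers = new Map<string, string>(); // runId -> text so far

  return (evt: AgentEvent): void => {
    if (evt.stream === "assistant") {
      // Buffer the delta and forward the accumulated text (an assumption).
      const text = (buffers.get(evt.runId) ?? "") + evt.delta;
      buffers.set(evt.runId, text);
      send({ runId: evt.runId, state: "delta", text });
      return;
    }
    if (evt.phase === "start") return;
    // Lifecycle end/error: flush the buffer as the final message.
    send({ runId: evt.runId, state: "final", text: buffers.get(evt.runId) ?? "" });
    buffers.delete(evt.runId);
  };
}
```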

## Timeouts
- `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides.
- Agent runtime: `agent.timeoutSeconds` default 600s; enforced in `runEmbeddedPiAgent` abort timer.
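For the runtime timeout, an abort timer is commonly built from `AbortController` plus `setTimeout`. The sketch below shows that general pattern under assumptions; `runWithTimeout` and `hangUntilAborted` are hypothetical helpers, not the code in `runEmbeddedPiAgent`.

```typescript
// Hypothetical helper; the real abort timer lives in runEmbeddedPiAgent.
async function runWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutSeconds = 600,
): Promise<T> {
  const controller = new AbortController();
  // Fire the abort once the budget is spent.
  const timer = setTimeout(() => controller.abort(), timeoutSeconds * 1000);
  try {
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // never leave the timer running after the task settles
  }
}

// Example task: never finishes on its own, rejects once the signal aborts.
function hangUntilAborted(signal: AbortSignal): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    signal.addEventListener("abort", () => reject(new Error("aborted")));
  });
}
```

Unlike the `agent.wait` timeout, this timer stops the work itself, provided the task cooperates by honouring the signal.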

## Where things can end early
- Agent timeout (abort)
- AbortSignal (cancel)
- Gateway disconnect or RPC timeout
- `agent.wait` timeout (wait-only, does not stop agent)

## Files
- `src/gateway/server-methods/agent.ts`
- `src/gateway/server-methods/agent-job.ts`
- `src/commands/agent.ts`
- `src/agents/pi-embedded-runner.ts`
- `src/agents/pi-embedded-subscribe.ts`
- `src/gateway/server-chat.ts`
65
docs/refactor/agent-loop.md
Normal file
@@ -0,0 +1,65 @@
---
summary: "Refactor plan: unify agent lifecycle events and wait semantics"
read_when:
  - Refactoring agent lifecycle events or wait behavior
---
# Refactor: Agent Loop

Goal: align Clawdis run lifecycle with pi/mom semantics, remove ambiguity between "job" and "agent_end".

## Problem
- Two lifecycles today:
  - `job` (gateway wrapper) => used by `agent.wait` + chat final
  - pi-agent `agent_end` (inner loop) => only logged
- This can finalize early (job done) while late assistant deltas still arrive.
- `afterMs` and timeouts can cause false timeouts in `agent.wait`.

## Reference (mom)
- Single lifecycle: `agent_start`/`agent_end` from pi-agent-core event stream.
- `waitForIdle()` resolves on `agent_end`.
- No separate job state exposed to clients.

## Proposed refactor (breaking allowed)
1) Replace public `job` stream with `lifecycle` stream
   - `stream: "lifecycle"`
   - `data: { phase: "start" | "end" | "error", startedAt, endedAt, error? }`
2) `agent.wait` waits on lifecycle end/error only
   - remove `afterMs`
   - return `{ runId, status, startedAt, endedAt, error? }`
3) Chat final emitted on lifecycle end only
   - deltas still from `assistant` stream
4) Centralize run registry
   - one map keyed by runId: sessionKey, startedAt, lastSeq, bufferedText
   - clear on lifecycle end
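Item 4 could look roughly like the registry below. It is a sketch under assumptions: `RunRegistry` and its method names are hypothetical, and it assumes `seq` starts at 1 and only increases within a run. Because the entry is cleared on lifecycle end, deltas that arrive afterwards find no entry and are dropped.

```typescript
// Hypothetical registry; field names follow the plan's list.
type RunEntry = {
  sessionKey: string;
  startedAt: number;
  lastSeq: number;
  bufferedText: string;
};

class RunRegistry {
  private runs = new Map<string, RunEntry>();

  start(runId: string, sessionKey: string, startedAt: number): void {
    this.runs.set(runId, { sessionKey, startedAt, lastSeq: 0, bufferedText: "" });
  }

  // Returns false (and drops the delta) for unknown runs or stale sequence numbers.
  appendDelta(runId: string, seq: number, delta: string): boolean {
    const entry = this.runs.get(runId);
    if (!entry || seq <= entry.lastSeq) return false;
    entry.lastSeq = seq;
    entry.bufferedText += delta;
    return true;
  }

  // Clears the entry on lifecycle end and hands back the buffered text.
  end(runId: string): string | undefined {
    const entry = this.runs.get(runId);
    this.runs.delete(runId);
    return entry?.bufferedText;
  }
}
```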

## Implementation outline
- `src/agents/pi-embedded-subscribe.ts`
  - emit lifecycle start/end events (translate pi `agent_start`/`agent_end`)
- `src/infra/agent-events.ts`
  - add `"lifecycle"` to stream type
- `src/gateway/protocol/schema.ts`
  - update AgentEvent schema; update AgentWait params (remove afterMs, add status)
- `src/gateway/server-methods/agent-job.ts`
  - rename to `agent-wait.ts` or similar; wait on lifecycle end/error
- `src/gateway/server-chat.ts`
  - finalize on lifecycle end (not job)
- `src/commands/agent.ts`
  - stop emitting `job` externally (keep internal log if needed)
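The translation step in `pi-embedded-subscribe.ts` might be shaped like the sketch below. Assumptions are marked: the `PiEvent` shape is hypothetical (only the `agent_start`/`agent_end` names come from this plan), and treating an `error` field on `agent_end` as the trigger for `phase: "error"` is a guess at how errors would surface.

```typescript
// Assumed pi event shape; only the agent_start/agent_end names come from the plan.
type PiEvent = { type: string; error?: string };

type LifecycleData = {
  phase: "start" | "end" | "error";
  startedAt: number;
  endedAt?: number;
  error?: string;
};

// Returns a translator for one run; `now` is injectable for tests.
function createLifecycleTranslator(now: () => number = Date.now) {
  let startedAt = 0;

  return (evt: PiEvent): LifecycleData | undefined => {
    if (evt.type === "agent_start") {
      startedAt = now();
      return { phase: "start", startedAt };
    }
    if (evt.type === "agent_end") {
      const endedAt = now();
      return evt.error
        ? { phase: "error", startedAt, endedAt, error: evt.error }
        : { phase: "end", startedAt, endedAt };
    }
    return undefined; // not a lifecycle event
  };
}
```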

## Migration notes (breaking)
- Update all callers of `agent.wait` to new response shape.
- Update tests that expect `timeout` based on job events.
- If any UI relies on job state, map lifecycle instead.

## Risks
- If lifecycle events are dropped, wait/chat could hang; add timeout in `agent.wait` to fail fast.
- Late deltas after lifecycle end should be ignored; keep seq tracking + drop.

## Acceptance
- One lifecycle visible to clients.
- `agent.wait` resolves when agent loop ends, not wrapper completion.
- Chat final never emits before last assistant delta.

## Rollout (if we wanted safety)
- Gate with config flag `agent.lifecycleMode = "legacy"|"refactor"`.
- Remove legacy after one release.
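A gate on that flag could be as small as the sketch below. The helper names and the config shape are hypothetical, and falling back to `"legacy"` when the flag is missing or unrecognized is an assumption; the plan does not name a default.

```typescript
type LifecycleMode = "legacy" | "refactor";

// Defaulting to "legacy" is an assumption; the plan does not name a default.
function resolveLifecycleMode(config: { agent?: { lifecycleMode?: string } }): LifecycleMode {
  return config.agent?.lifecycleMode === "refactor" ? "refactor" : "legacy";
}

// Which public stream carries run completion under each mode.
function completionStream(mode: LifecycleMode): "job" | "lifecycle" {
  return mode === "refactor" ? "lifecycle" : "job";
}
```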