feat(gateway): add OpenResponses /v1/responses endpoint

Add a new `/v1/responses` endpoint implementing the OpenResponses API
standard for agentic workflows. This provides:

- Item-based input (messages, function_call_output, reasoning)
- Semantic streaming events (response.created, response.output_text.delta,
  response.completed, etc.)
- Full SSE event support with both event: and data: lines
- Configuration via gateway.http.endpoints.responses.enabled

The endpoint is disabled by default and can be enabled independently from
the existing Chat Completions endpoint.

Phase 1 implementation supports:

- String or ItemParam[] input
- system/developer/user/assistant message roles
- function_call_output items
- instructions parameter
- Agent routing via headers or model parameter
- Session key management

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 10:44:48 +01:00
parent 7f6e87e918
commit f4b03599f0
11 changed files with 1563 additions and 0 deletions
--- a/docs/experiments/plans/openresponses-gateway.md
+++ b/docs/experiments/plans/openresponses-gateway.md
@@ -0,0 +1,110 @@
+---
+summary: "Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly"
+owner: "clawdbot"
+status: "draft"
+last_updated: "2026-01-19"
+---
+
+# OpenResponses Gateway Integration Plan
+
+## Context
+Clawdbot Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at
+`/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)).
+
+Open Responses is an open inference standard based on the OpenAI Responses API. It is designed
+for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses
+spec defines `/v1/responses`, not `/v1/chat/completions`.
+
+## Goals
+- Add a `/v1/responses` endpoint that adheres to OpenResponses semantics.
+- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
+- Standardize validation and parsing with isolated, reusable schemas.
+
+## Non-goals
+- Full OpenResponses feature parity in the first pass (images, files, hosted tools).
+- Replacing internal agent execution logic or tool orchestration.
+- Changing the existing `/v1/chat/completions` behavior during the first phase.
+
+## Research Summary
+Sources: OpenResponses OpenAPI, OpenResponses specification site, and the Hugging Face blog post.
+
+Key points extracted:
+- `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or
+  `ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and
+  `max_tool_calls`.
+- `ItemParam` is a discriminated union of:
+  - `message` items with roles `system`, `developer`, `user`, `assistant`
+  - `function_call` and `function_call_output`
+  - `reasoning`
+  - `item_reference`
+- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and
+  `output` items.
+- Streaming uses semantic events such as:
+  - `response.created`, `response.in_progress`, `response.completed`, `response.failed`
+  - `response.output_item.added`, `response.output_item.done`
+  - `response.content_part.added`, `response.content_part.done`
+  - `response.output_text.delta`, `response.output_text.done`
+- The spec requires:
+  - `Content-Type: text/event-stream`
+  - `event:` must match the JSON `type` field
+  - terminal event must be literal `[DONE]`
+- Reasoning items may expose `content`, `encrypted_content`, and `summary`.
+- HF examples include `OpenResponses-Version: latest` in requests (optional header).
+
+## Proposed Architecture
+- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
+- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
+- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
+- Add config `gateway.http.endpoints.responses.enabled` (default `false`).
+- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be
+  toggled separately.
+- Emit a startup warning when Chat Completions is enabled to signal legacy status.
+
+## Deprecation Path for Chat Completions
+- Maintain strict module boundaries: no shared schema types between responses and chat completions.
+- Make Chat Completions opt-in by config so it can be disabled without code changes.
+- Update docs to label Chat Completions as legacy once `/v1/responses` is stable.
+- Optional future step: map Chat Completions requests to the Responses handler for a simpler
+  removal path.
+
+## Phase 1 Support Subset
+- Accept `input` as string or `ItemParam[]` with message roles and `function_call_output`.
+- Extract system and developer messages into `extraSystemPrompt`.
+- Use the most recent `user` or `function_call_output` as the current message for agent runs.
+- Reject unsupported content parts (image/file) with `invalid_request_error`.
+- Return a single assistant message with `output_text` content.
+- Return `usage` with zeroed values until token accounting is wired.
+
+## Validation Strategy (No SDK)
+- Implement Zod schemas for the supported subset of:
+  - `CreateResponseBody`
+  - `ItemParam` + message content part unions
+  - `ResponseResource`
+  - Streaming event shapes used by the gateway
+- Keep schemas in a single, isolated module to avoid drift and allow future codegen.
+
+## Streaming Implementation (Phase 1)
+- Emit each SSE message with both an `event:` line and a `data:` line.
+- Required sequence (minimum viable):
+  - `response.created`
+  - `response.output_item.added`
+  - `response.content_part.added`
+  - `response.output_text.delta` (repeat as needed)
+  - `response.output_text.done`
+  - `response.content_part.done`
+  - `response.completed`
+  - `[DONE]`
+
+## Tests and Verification Plan
+- Add e2e coverage for `/v1/responses`:
+  - Auth required
+  - Non-stream response shape
+  - Stream event ordering and `[DONE]`
+  - Session routing with headers and `user`
+- Keep `src/gateway/openai-http.e2e.test.ts` unchanged.
+- Manual: curl to `/v1/responses` with `stream: true` and verify event ordering and terminal
+  `[DONE]`.
+
+## Doc Updates (Follow-up)
+- Add a new docs page for `/v1/responses` usage and examples.
+- Update `/gateway/openai-http-api` with a legacy note and pointer to `/v1/responses`.
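
Note on the "Streaming Implementation (Phase 1)" section of the plan: the event ordering and framing it describes can be illustrated with a minimal sketch. This is not the code from this commit. The function names (`formatSseEvent`, `buildPhase1Stream`) and the payload fields other than `type` (`response_id`, `output_index`, `content_index`, `delta`, `text`) are hypothetical, and framing the terminal marker as `data: [DONE]` is an assumption based on common SSE practice rather than something the plan spells out.

```typescript
// Hypothetical event shape: only `type` comes from the plan; other fields are illustrative.
type StreamEvent = { type: string; [key: string]: unknown };

// Frame one event as an SSE message. The `event:` line mirrors the JSON `type` field,
// as the plan's research summary requires.
function formatSseEvent(event: StreamEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Build the minimum Phase 1 sequence listed in the plan for a series of text deltas.
function buildPhase1Stream(responseId: string, deltas: string[]): string[] {
  const text = deltas.join("");
  const events: StreamEvent[] = [
    { type: "response.created", response_id: responseId },
    { type: "response.output_item.added", output_index: 0 },
    { type: "response.content_part.added", output_index: 0, content_index: 0 },
    // One delta event per chunk of generated text.
    ...deltas.map((delta): StreamEvent => ({ type: "response.output_text.delta", delta })),
    { type: "response.output_text.done", text },
    { type: "response.content_part.done", output_index: 0, content_index: 0 },
    { type: "response.completed", response_id: responseId },
  ];
  // Assumption: the terminal marker is sent as a bare `data:` line with no `event:` line.
  return [...events.map(formatSseEvent), "data: [DONE]\n\n"];
}
```

A handler could write each returned string to the HTTP response in order after setting `Content-Type: text/event-stream`. Keeping the framing in one small function makes the "`event:` must match `type`" rule hard to violate by accident.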