| summary | owner | status | last_updated |
|---|---|---|---|
| Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly | clawdbot | draft | 2026-01-19 |
# OpenResponses Gateway Integration Plan
## Context

Clawdbot Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at
`/v1/chat/completions` (see OpenAI Chat Completions).

Open Responses is an open inference standard based on the OpenAI Responses API. It is designed
for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses
spec defines `/v1/responses`, not `/v1/chat/completions`.
## Goals

- Add a `/v1/responses` endpoint that adheres to OpenResponses semantics.
- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
- Standardize validation and parsing with isolated, reusable schemas.
## Non-goals

- Full OpenResponses feature parity in the first pass (images, files, hosted tools).
- Replacing internal agent execution logic or tool orchestration.
- Changing the existing `/v1/chat/completions` behavior during the first phase.
## Research Summary

Sources: the OpenResponses OpenAPI spec, the OpenResponses specification site, and the Hugging Face blog post.

Key points extracted:

- `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or `ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and `max_tool_calls` (see the example exchange after this list).
- `ItemParam` is a discriminated union of:
  - `message` items with roles `system`, `developer`, `user`, `assistant`
  - `function_call` and `function_call_output`
  - `reasoning`
  - `item_reference`
- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and `output` items.
- Streaming uses semantic events such as:
  - `response.created`, `response.in_progress`, `response.completed`, `response.failed`
  - `response.output_item.added`, `response.output_item.done`
  - `response.content_part.added`, `response.content_part.done`
  - `response.output_text.delta`, `response.output_text.done`
- The spec requires:
  - `Content-Type: text/event-stream`
  - `event:` must match the JSON `type` field
  - the terminal event must be the literal `[DONE]`
- Reasoning items may expose `content`, `encrypted_content`, and `summary`.
- HF examples include `OpenResponses-Version: latest` in requests (optional header).
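
For orientation, here is a minimal non-stream exchange assembled from the points above, as TypeScript object literals. The ids, model name, and text are placeholders; input `content` is shown in its plain-string form, and the `usage` field names are assumed from the OpenAI Responses usage shape rather than taken from these notes.

```ts
// Illustrative CreateResponseBody subset; model id and text are placeholders.
const exampleRequest = {
  model: "<model-id>",
  instructions: "You are a helpful assistant.",
  input: [{ type: "message", role: "user", content: "Hello" }],
  stream: false,
};

// Illustrative ResponseResource shape per the notes above; the assistant
// message carries output_text content parts, and usage is zeroed in phase 1.
const exampleResponse = {
  id: "resp_abc123", // placeholder id
  object: "response",
  status: "completed",
  output: [
    {
      type: "message",
      role: "assistant",
      content: [{ type: "output_text", text: "Hi there!" }],
    },
  ],
  usage: { input_tokens: 0, output_tokens: 0, total_tokens: 0 }, // names assumed
};
```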
## Proposed Architecture

- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
- Add config `gateway.http.endpoints.responses.enabled` (default `false`); a config sketch follows this list.
- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be toggled separately.
- Emit a startup warning when Chat Completions is enabled to signal legacy status.
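
A sketch of how the two toggles could sit next to each other in config. Only the two key paths come from the plan above; the surrounding types and the Chat Completions default are assumptions.

```ts
// Hypothetical config types; only the two endpoint toggles are from the plan.
interface GatewayHttpEndpointsConfig {
  responses: { enabled: boolean }; // /v1/responses
  chatCompletions: { enabled: boolean }; // legacy /v1/chat/completions
}

const endpointDefaults: GatewayHttpEndpointsConfig = {
  responses: { enabled: false }, // default false, per the plan
  chatCompletions: { enabled: true }, // assumed default; logs a legacy warning at startup
};
```

Keeping the two flags independent means either endpoint can be turned off in config without touching the other, which is what makes the later deprecation a no-code change.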
## Deprecation Path for Chat Completions

- Maintain strict module boundaries: no shared schema types between Responses and Chat Completions.
- Make Chat Completions opt-in by config so it can be disabled without code changes.
- Update docs to label Chat Completions as legacy once `/v1/responses` is stable.
- Optional future step: map Chat Completions requests to the Responses handler for a simpler removal path (sketched below).
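
If that mapping step is taken, the adapter can stay small. A sketch with hypothetical names (`chatToResponsesBody` and the simplified message shape are illustrative, not the gateway's actual API):

```ts
// Hypothetical adapter: translate a legacy Chat Completions body into a
// Responses body, then delegate to the /v1/responses handler.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function chatToResponsesBody(body: { model: string; messages: ChatMessage[]; stream?: boolean }) {
  return {
    model: body.model,
    stream: body.stream,
    input: body.messages.map((m) => ({
      type: "message" as const,
      role: m.role,
      content: m.content,
    })),
  };
}
```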
## Phase 1 Support Subset

- Accept `input` as a string or `ItemParam[]` with message roles and `function_call_output`.
- Extract system and developer messages into `extraSystemPrompt`.
- Use the most recent `user` or `function_call_output` item as the current message for agent runs (see the sketch after this list).
- Reject unsupported content parts (image/file) with `invalid_request_error`.
- Return a single assistant message with `output_text` content.
- Return `usage` with zeroed values until token accounting is wired.
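
A sketch of that normalization, assuming simplified phase 1 item shapes and a hypothetical `normalizeInput` helper:

```ts
// Simplified phase 1 item shapes; the real ItemParam union has more
// variants and content part arrays.
type Item =
  | { type: "message"; role: "system" | "developer" | "user" | "assistant"; content: string }
  | { type: "function_call_output"; call_id: string; output: string };

// Hypothetical helper illustrating the phase 1 rules above.
function normalizeInput(input: string | Item[]) {
  if (typeof input === "string") {
    return { extraSystemPrompt: undefined, current: input };
  }

  // Fold system and developer messages into extraSystemPrompt.
  const systemParts: string[] = [];
  for (const item of input) {
    if (item.type === "message" && (item.role === "system" || item.role === "developer")) {
      systemParts.push(item.content);
    }
  }

  // The most recent user message or function_call_output drives the agent run.
  const current = [...input]
    .reverse()
    .find((i) => i.type === "function_call_output" || (i.type === "message" && i.role === "user"));

  return { extraSystemPrompt: systemParts.join("\n") || undefined, current };
}
```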
## Validation Strategy (No SDK)

- Implement Zod schemas for the supported subset of:
  - `CreateResponseBody`
  - `ItemParam` + message content part unions
  - `ResponseResource`
  - streaming event shapes used by the gateway
- Keep schemas in a single, isolated module to avoid drift and allow future codegen (see the sketch after this list).
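
A sketch of the isolated schema module. Zod is the stated choice; the field set is the phase 1 subset, and message content is reduced to a plain string here rather than the full content part unions.

```ts
import { z } from "zod";

// Phase 1 subset only; the real module would model the content part unions
// (and reject image/file parts with invalid_request_error).
const MessageItem = z.object({
  type: z.literal("message"),
  role: z.enum(["system", "developer", "user", "assistant"]),
  content: z.string(),
});

const FunctionCallOutputItem = z.object({
  type: z.literal("function_call_output"),
  call_id: z.string(),
  output: z.string(),
});

export const ItemParam = z.discriminatedUnion("type", [MessageItem, FunctionCallOutputItem]);

export const CreateResponseBody = z.object({
  model: z.string(),
  input: z.union([z.string(), z.array(ItemParam)]),
  instructions: z.string().optional(),
  stream: z.boolean().optional(),
  max_output_tokens: z.number().int().positive().optional(),
});
```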
## Streaming Implementation (Phase 1)

- SSE lines with both `event:` and `data:`.
- Required sequence (minimum viable; see the writer sketch after this list):
  - `response.created`
  - `response.output_item.added`
  - `response.content_part.added`
  - `response.output_text.delta` (repeat as needed)
  - `response.output_text.done`
  - `response.content_part.done`
  - `response.completed`
  - `[DONE]`
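
A sketch of a writer for that sequence against a Node `ServerResponse`. Payloads are reduced to their `type` (plus `delta`) for illustration; the real events carry the response/item/part objects and indexes.

```ts
import type { ServerResponse } from "node:http";

// Write one semantic SSE event. Per the spec, the `event:` line must
// match the JSON `type` field in the `data:` payload.
function writeEvent(res: ServerResponse, payload: { type: string; [key: string]: unknown }) {
  res.write(`event: ${payload.type}\n`);
  res.write(`data: ${JSON.stringify(payload)}\n\n`);
}

function streamText(res: ServerResponse, deltas: string[]) {
  res.setHeader("Content-Type", "text/event-stream");
  writeEvent(res, { type: "response.created" });
  writeEvent(res, { type: "response.output_item.added" });
  writeEvent(res, { type: "response.content_part.added" });
  for (const delta of deltas) {
    writeEvent(res, { type: "response.output_text.delta", delta });
  }
  writeEvent(res, { type: "response.output_text.done" });
  writeEvent(res, { type: "response.content_part.done" });
  writeEvent(res, { type: "response.completed" });
  res.write("data: [DONE]\n\n"); // terminal event is the literal [DONE]
  res.end();
}
```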
## Tests and Verification Plan

- Add e2e coverage for `/v1/responses`:
  - auth required
  - non-stream response shape
  - stream event ordering and `[DONE]`
  - session routing with headers and `user`
- Keep `src/gateway/openai-http.e2e.test.ts` unchanged.
- Manual: curl `/v1/responses` with `stream: true` and verify event ordering and the terminal `[DONE]` (a scripted variant follows).
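
The manual check can also be scripted. A sketch with `fetch` (URL, token, and model id are placeholders) that prints the raw SSE lines so the ordering and terminal `[DONE]` can be verified by eye:

```ts
// Placeholder URL, token, and model id; prints raw event:/data: lines.
async function manualCheck() {
  const res = await fetch("http://localhost:3000/v1/responses", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer <gateway-token>",
    },
    body: JSON.stringify({ model: "<model-id>", input: "ping", stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    process.stdout.write(decoder.decode(value));
  }
}

manualCheck().catch(console.error);
```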
## Doc Updates (Follow-up)

- Add a new docs page for `/v1/responses` usage and examples.
- Update `/gateway/openai-http-api` with a legacy note and a pointer to `/v1/responses`.