feat(gateway): add OpenResponses /v1/responses endpoint

Add a new `/v1/responses` endpoint implementing the OpenResponses API standard for agentic workflows. This provides: - Item-based input (messages, function_call_output, reasoning) - Semantic streaming events (response.created, response.output_text.delta, response.completed, etc.) - Full SSE event support with both event: and data: lines - Configuration via gateway.http.endpoints.responses.enabled The endpoint is disabled by default and can be enabled independently from the existing Chat Completions endpoint. Phase 1 implementation supports: - String or ItemParam[] input - system/developer/user/assistant message roles - function_call_output items - instructions parameter - Agent routing via headers or model parameter - Session key management Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 10:44:48 +01:00
parent 7f6e87e918
commit f4b03599f0
11 changed files with 1563 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -35,6 +35,7 @@ Docs: https://docs.clawd.bot
 - **BREAKING:** Reject invalid/unknown config entries and refuse to start the gateway for safety; run `clawdbot doctor --fix` to repair. 
 ### Changes
 - Gateway: add `/v1/responses` endpoint (OpenResponses API) for agentic workflows with item-based input and semantic streaming events. Enable via `gateway.http.endpoints.responses.enabled: true`.
 - Usage: add `/usage cost` summaries and macOS menu cost submenu with daily charting.
 - Agents: clarify node_modules read-only guidance in agent instructions.
 - TUI: add syntax highlighting for code blocks. (#1200) — thanks @vignesh07.
--- a/docs/experiments/plans/openresponses-gateway.md
+++ b/docs/experiments/plans/openresponses-gateway.md
@@ -0,0 +1,110 @@
 ---
 summary: "Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly"
 owner: "clawdbot"
 status: "draft"
 last_updated: "2026-01-19"
 ---
 # OpenResponses Gateway Integration Plan
 ## Context
 Clawdbot Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at
 `/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)).
 Open Responses is an open inference standard based on the OpenAI Responses API. It is designed
 for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses
 spec defines `/v1/responses`, not `/v1/chat/completions`.
 ## Goals
 - Add a `/v1/responses` endpoint that adheres to OpenResponses semantics.
 - Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
 - Standardize validation and parsing with isolated, reusable schemas.
 ## Non-goals
 - Full OpenResponses feature parity in the first pass (images, files, hosted tools).
 - Replacing internal agent execution logic or tool orchestration.
 - Changing the existing `/v1/chat/completions` behavior during the first phase.
 ## Research Summary
 Sources: OpenResponses OpenAPI, OpenResponses specification site, and the Hugging Face blog post.
 Key points extracted:
 - `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or
  `ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and
  `max_tool_calls`.
 - `ItemParam` is a discriminated union of:
  - `message` items with roles `system`, `developer`, `user`, `assistant`
  - `function_call` and `function_call_output`
  - `reasoning`
  - `item_reference`
 - Successful responses return a `ResponseResource` with `object: "response"`, `status`, and
  `output` items.
 - Streaming uses semantic events such as:
  - `response.created`, `response.in_progress`, `response.completed`, `response.failed`
  - `response.output_item.added`, `response.output_item.done`
  - `response.content_part.added`, `response.content_part.done`
  - `response.output_text.delta`, `response.output_text.done`
 - The spec requires:
  - `Content-Type: text/event-stream`
  - `event:` must match the JSON `type` field
  - terminal event must be literal `[DONE]`
 - Reasoning items may expose `content`, `encrypted_content`, and `summary`.
 - HF examples include `OpenResponses-Version: latest` in requests (optional header).
 ## Proposed Architecture
 - Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
 - Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
 - Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
 - Add config `gateway.http.endpoints.responses.enabled` (default `false`).
 - Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be
  toggled separately.
 - Emit a startup warning when Chat Completions is enabled to signal legacy status.
 ## Deprecation Path for Chat Completions
 - Maintain strict module boundaries: no shared schema types between responses and chat completions.
 - Make Chat Completions opt-in by config so it can be disabled without code changes.
 - Update docs to label Chat Completions as legacy once `/v1/responses` is stable.
 - Optional future step: map Chat Completions requests to the Responses handler for a simpler
  removal path.
 ## Phase 1 Support Subset
 - Accept `input` as string or `ItemParam[]` with message roles and `function_call_output`.
 - Extract system and developer messages into `extraSystemPrompt`.
 - Use the most recent `user` or `function_call_output` as the current message for agent runs.
 - Reject unsupported content parts (image/file) with `invalid_request_error`.
 - Return a single assistant message with `output_text` content.
 - Return `usage` with zeroed values until token accounting is wired.
 ## Validation Strategy (No SDK)
 - Implement Zod schemas for the supported subset of:
  - `CreateResponseBody`
  - `ItemParam` + message content part unions
  - `ResponseResource`
  - Streaming event shapes used by the gateway
 - Keep schemas in a single, isolated module to avoid drift and allow future codegen.
 ## Streaming Implementation (Phase 1)
 - SSE lines with both `event:` and `data:`.
 - Required sequence (minimum viable):
  - `response.created`
  - `response.output_item.added`
  - `response.content_part.added`
  - `response.output_text.delta` (repeat as needed)
  - `response.output_text.done`
  - `response.content_part.done`
  - `response.completed`
  - `[DONE]`
 ## Tests and Verification Plan
 - Add e2e coverage for `/v1/responses`:
  - Auth required
  - Non-stream response shape
  - Stream event ordering and `[DONE]`
  - Session routing with headers and `user`
 - Keep `src/gateway/openai-http.e2e.test.ts` unchanged.
 - Manual: curl to `/v1/responses` with `stream: true` and verify event ordering and terminal
  `[DONE]`.
 ## Doc Updates (Follow-up)
 - Add a new docs page for `/v1/responses` usage and examples.
 - Update `/gateway/openai-http-api` with a legacy note and pointer to `/v1/responses`.
--- a/src/config/types.gateway.ts
+++ b/src/config/types.gateway.ts
@@ -105,8 +105,17 @@ export type GatewayHttpChatCompletionsConfig = {
  enabled?: boolean;
 };
 export type GatewayHttpResponsesConfig = {
  /**
   * If false, the Gateway will not serve `POST /v1/responses` (OpenResponses API).
   * Default: false when absent.
   */
  enabled?: boolean;
 };
 export type GatewayHttpEndpointsConfig = {
  chatCompletions?: GatewayHttpChatCompletionsConfig;
  responses?: GatewayHttpResponsesConfig;
 };
 export type GatewayHttpConfig = {
--- a/src/config/zod-schema.ts
+++ b/src/config/zod-schema.ts
@@ -299,6 +299,12 @@ export const ClawdbotSchema = z
                  })
                  .strict()
                  .optional(),
                responses: z
                  .object({
                    enabled: z.boolean().optional(),
                  })
                  .strict()
                  .optional(),
              })
              .strict()
              .optional(),
--- a/src/gateway/open-responses.schema.ts
+++ b/src/gateway/open-responses.schema.ts
@@ -0,0 +1,325 @@
 /**
 * OpenResponses API Zod Schemas
 *
 * Zod schemas for the OpenResponses `/v1/responses` endpoint.
 * This module is isolated from gateway imports to enable future codegen and prevent drift.
 *
 * @see https://www.open-responses.com/
 */
 import { z } from "zod";
 // ─────────────────────────────────────────────────────────────────────────────
 // Content Parts
 // ─────────────────────────────────────────────────────────────────────────────
 export const InputTextContentPartSchema = z
  .object({
    type: z.literal("input_text"),
    text: z.string(),
  })
  .strict();
 export const OutputTextContentPartSchema = z
  .object({
    type: z.literal("output_text"),
    text: z.string(),
  })
  .strict();
 // For Phase 1, we reject image/file content with helpful errors
 export const InputImageContentPartSchema = z
  .object({
    type: z.literal("input_image"),
  })
  .passthrough();
 export const InputFileContentPartSchema = z
  .object({
    type: z.literal("input_file"),
  })
  .passthrough();
 export const ContentPartSchema = z.discriminatedUnion("type", [
  InputTextContentPartSchema,
  OutputTextContentPartSchema,
  InputImageContentPartSchema,
  InputFileContentPartSchema,
 ]);
 export type ContentPart = z.infer<typeof ContentPartSchema>;
 // ─────────────────────────────────────────────────────────────────────────────
 // Item Types (ItemParam)
 // ─────────────────────────────────────────────────────────────────────────────
 export const MessageItemRoleSchema = z.enum(["system", "developer", "user", "assistant"]);
 export type MessageItemRole = z.infer<typeof MessageItemRoleSchema>;
 export const MessageItemSchema = z
  .object({
    type: z.literal("message"),
    role: MessageItemRoleSchema,
    content: z.union([z.string(), z.array(ContentPartSchema)]),
  })
  .strict();
 export const FunctionCallItemSchema = z
  .object({
    type: z.literal("function_call"),
    id: z.string().optional(),
    call_id: z.string().optional(),
    name: z.string(),
    arguments: z.string(),
  })
  .strict();
 export const FunctionCallOutputItemSchema = z
  .object({
    type: z.literal("function_call_output"),
    call_id: z.string(),
    output: z.string(),
  })
  .strict();
 export const ReasoningItemSchema = z
  .object({
    type: z.literal("reasoning"),
    content: z.string().optional(),
    encrypted_content: z.string().optional(),
    summary: z.string().optional(),
  })
  .strict();
 export const ItemReferenceItemSchema = z
  .object({
    type: z.literal("item_reference"),
    id: z.string(),
  })
  .strict();
 export const ItemParamSchema = z.discriminatedUnion("type", [
  MessageItemSchema,
  FunctionCallItemSchema,
  FunctionCallOutputItemSchema,
  ReasoningItemSchema,
  ItemReferenceItemSchema,
 ]);
 export type ItemParam = z.infer<typeof ItemParamSchema>;
 // ─────────────────────────────────────────────────────────────────────────────
 // Tool Definitions
 // ─────────────────────────────────────────────────────────────────────────────
 export const FunctionToolDefinitionSchema = z
  .object({
    type: z.literal("function"),
    function: z.object({
      name: z.string(),
      description: z.string().optional(),
      parameters: z.record(z.string(), z.unknown()).optional(),
    }),
  })
  .strict();
 export const ToolDefinitionSchema = FunctionToolDefinitionSchema;
 export type ToolDefinition = z.infer<typeof ToolDefinitionSchema>;
 // ─────────────────────────────────────────────────────────────────────────────
 // Request Body
 // ─────────────────────────────────────────────────────────────────────────────
 export const ToolChoiceSchema = z.union([
  z.literal("auto"),
  z.literal("none"),
  z.literal("required"),
  z.object({
    type: z.literal("function"),
    function: z.object({ name: z.string() }),
  }),
 ]);
 export const CreateResponseBodySchema = z
  .object({
    model: z.string(),
    input: z.union([z.string(), z.array(ItemParamSchema)]),
    instructions: z.string().optional(),
    tools: z.array(ToolDefinitionSchema).optional(),
    tool_choice: ToolChoiceSchema.optional(),
    stream: z.boolean().optional(),
    max_output_tokens: z.number().int().positive().optional(),
    max_tool_calls: z.number().int().positive().optional(),
    user: z.string().optional(),
    // Phase 1: ignore but accept these fields
    temperature: z.number().optional(),
    top_p: z.number().optional(),
    metadata: z.record(z.string(), z.string()).optional(),
    store: z.boolean().optional(),
    previous_response_id: z.string().optional(),
    reasoning: z
      .object({
        effort: z.enum(["low", "medium", "high"]).optional(),
        summary: z.enum(["auto", "concise", "detailed"]).optional(),
      })
      .optional(),
    truncation: z.enum(["auto", "disabled"]).optional(),
  })
  .strict();
 export type CreateResponseBody = z.infer<typeof CreateResponseBodySchema>;
 // ─────────────────────────────────────────────────────────────────────────────
 // Response Resource
 // ─────────────────────────────────────────────────────────────────────────────
 export const ResponseStatusSchema = z.enum([
  "in_progress",
  "completed",
  "failed",
  "cancelled",
  "incomplete",
 ]);
 export type ResponseStatus = z.infer<typeof ResponseStatusSchema>;
 export const OutputItemSchema = z.discriminatedUnion("type", [
  z
    .object({
      type: z.literal("message"),
      id: z.string(),
      role: z.literal("assistant"),
      content: z.array(OutputTextContentPartSchema),
      status: z.enum(["in_progress", "completed"]).optional(),
    })
    .strict(),
  z
    .object({
      type: z.literal("function_call"),
      id: z.string(),
      call_id: z.string(),
      name: z.string(),
      arguments: z.string(),
      status: z.enum(["in_progress", "completed"]).optional(),
    })
    .strict(),
  z
    .object({
      type: z.literal("reasoning"),
      id: z.string(),
      content: z.string().optional(),
      summary: z.string().optional(),
    })
    .strict(),
 ]);
 export type OutputItem = z.infer<typeof OutputItemSchema>;
 export const UsageSchema = z.object({
  input_tokens: z.number().int().nonnegative(),
  output_tokens: z.number().int().nonnegative(),
  total_tokens: z.number().int().nonnegative(),
 });
 export type Usage = z.infer<typeof UsageSchema>;
 export const ResponseResourceSchema = z.object({
  id: z.string(),
  object: z.literal("response"),
  created_at: z.number().int(),
  status: ResponseStatusSchema,
  model: z.string(),
  output: z.array(OutputItemSchema),
  usage: UsageSchema,
  // Optional fields for future phases
  error: z
    .object({
      code: z.string(),
      message: z.string(),
    })
    .optional(),
 });
 export type ResponseResource = z.infer<typeof ResponseResourceSchema>;
 // ─────────────────────────────────────────────────────────────────────────────
 // Streaming Event Types
 // ─────────────────────────────────────────────────────────────────────────────
 export const ResponseCreatedEventSchema = z.object({
  type: z.literal("response.created"),
  response: ResponseResourceSchema,
 });
 export const ResponseInProgressEventSchema = z.object({
  type: z.literal("response.in_progress"),
  response: ResponseResourceSchema,
 });
 export const ResponseCompletedEventSchema = z.object({
  type: z.literal("response.completed"),
  response: ResponseResourceSchema,
 });
 export const ResponseFailedEventSchema = z.object({
  type: z.literal("response.failed"),
  response: ResponseResourceSchema,
 });
 export const OutputItemAddedEventSchema = z.object({
  type: z.literal("response.output_item.added"),
  output_index: z.number().int().nonnegative(),
  item: OutputItemSchema,
 });
 export const OutputItemDoneEventSchema = z.object({
  type: z.literal("response.output_item.done"),
  output_index: z.number().int().nonnegative(),
  item: OutputItemSchema,
 });
 export const ContentPartAddedEventSchema = z.object({
  type: z.literal("response.content_part.added"),
  item_id: z.string(),
  output_index: z.number().int().nonnegative(),
  content_index: z.number().int().nonnegative(),
  part: OutputTextContentPartSchema,
 });
 export const ContentPartDoneEventSchema = z.object({
  type: z.literal("response.content_part.done"),
  item_id: z.string(),
  output_index: z.number().int().nonnegative(),
  content_index: z.number().int().nonnegative(),
  part: OutputTextContentPartSchema,
 });
 export const OutputTextDeltaEventSchema = z.object({
  type: z.literal("response.output_text.delta"),
  item_id: z.string(),
  output_index: z.number().int().nonnegative(),
  content_index: z.number().int().nonnegative(),
  delta: z.string(),
 });
 export const OutputTextDoneEventSchema = z.object({
  type: z.literal("response.output_text.done"),
  item_id: z.string(),
  output_index: z.number().int().nonnegative(),
  content_index: z.number().int().nonnegative(),
  text: z.string(),
 });
 export type StreamingEvent =
  | z.infer<typeof ResponseCreatedEventSchema>
  | z.infer<typeof ResponseInProgressEventSchema>
  | z.infer<typeof ResponseCompletedEventSchema>
  | z.infer<typeof ResponseFailedEventSchema>
  | z.infer<typeof OutputItemAddedEventSchema>
  | z.infer<typeof OutputItemDoneEventSchema>
  | z.infer<typeof ContentPartAddedEventSchema>
  | z.infer<typeof ContentPartDoneEventSchema>
  | z.infer<typeof OutputTextDeltaEventSchema>
  | z.infer<typeof OutputTextDoneEventSchema>;
--- a/src/gateway/openresponses-http.e2e.test.ts
+++ b/src/gateway/openresponses-http.e2e.test.ts
@@ -0,0 +1,511 @@
 import { describe, expect, it } from "vitest";
 import { HISTORY_CONTEXT_MARKER } from "../auto-reply/reply/history.js";
 import { CURRENT_MESSAGE_MARKER } from "../auto-reply/reply/mentions.js";
 import { emitAgentEvent } from "../infra/agent-events.js";
 import { agentCommand, getFreePort, installGatewayTestHooks } from "./test-helpers.js";
 installGatewayTestHooks();
 async function startServerWithDefaultConfig(port: number) {
  const { startGatewayServer } = await import("./server.js");
  return await startGatewayServer(port, {
    host: "127.0.0.1",
    auth: { mode: "token", token: "secret" },
    controlUiEnabled: false,
  });
 }
 async function startServer(port: number, opts?: { openResponsesEnabled?: boolean }) {
  const { startGatewayServer } = await import("./server.js");
  return await startGatewayServer(port, {
    host: "127.0.0.1",
    auth: { mode: "token", token: "secret" },
    controlUiEnabled: false,
    openResponsesEnabled: opts?.openResponsesEnabled ?? true,
  });
 }
 async function postResponses(port: number, body: unknown, headers?: Record<string, string>) {
  const res = await fetch(`http://127.0.0.1:${port}/v1/responses`, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: "Bearer secret",
      ...headers,
    },
    body: JSON.stringify(body),
  });
  return res;
 }
 function parseSseEvents(text: string): Array<{ event?: string; data: string }> {
  const events: Array<{ event?: string; data: string }> = [];
  const lines = text.split("\n");
  let currentEvent: string | undefined;
  let currentData: string[] = [];
  for (const line of lines) {
    if (line.startsWith("event: ")) {
      currentEvent = line.slice("event: ".length);
    } else if (line.startsWith("data: ")) {
      currentData.push(line.slice("data: ".length));
    } else if (line.trim() === "" && currentData.length > 0) {
      events.push({ event: currentEvent, data: currentData.join("\n") });
      currentEvent = undefined;
      currentData = [];
    }
  }
  return events;
 }
 describe("OpenResponses HTTP API (e2e)", () => {
  it("is disabled by default (requires config)", async () => {
    const port = await getFreePort();
    const server = await startServerWithDefaultConfig(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(404);
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("can be disabled via config (404)", async () => {
    const port = await getFreePort();
    const server = await startServer(port, {
      openResponsesEnabled: false,
    });
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(404);
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("rejects non-POST", async () => {
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await fetch(`http://127.0.0.1:${port}/v1/responses`, {
        method: "GET",
        headers: { authorization: "Bearer secret" },
      });
      expect(res.status).toBe(405);
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("rejects missing auth", async () => {
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await fetch(`http://127.0.0.1:${port}/v1/responses`, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify({ model: "clawdbot", input: "hi" }),
      });
      expect(res.status).toBe(401);
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("rejects invalid request body (missing model)", async () => {
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, { input: "hi" });
      expect(res.status).toBe(400);
      const json = (await res.json()) as Record<string, unknown>;
      expect((json.error as Record<string, unknown> | undefined)?.type).toBe(
        "invalid_request_error",
      );
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("routes to a specific agent via header", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(
        port,
        { model: "clawdbot", input: "hi" },
        { "x-clawdbot-agent-id": "beta" },
      );
      expect(res.status).toBe(200);
      expect(agentCommand).toHaveBeenCalledTimes(1);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      expect((opts as { sessionKey?: string } | undefined)?.sessionKey ?? "").toMatch(
        /^agent:beta:/,
      );
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("routes to a specific agent via model (no custom headers)", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot:beta",
        input: "hi",
      });
      expect(res.status).toBe(200);
      expect(agentCommand).toHaveBeenCalledTimes(1);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      expect((opts as { sessionKey?: string } | undefined)?.sessionKey ?? "").toMatch(
        /^agent:beta:/,
      );
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("uses OpenResponses user for a stable session key", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        user: "alice",
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      expect((opts as { sessionKey?: string } | undefined)?.sessionKey ?? "").toContain(
        "openresponses-user:alice",
      );
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("accepts string input", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: "hello world",
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      expect((opts as { message?: string } | undefined)?.message).toBe("hello world");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("accepts array input with message items", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: [{ type: "message", role: "user", content: "hello there" }],
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      expect((opts as { message?: string } | undefined)?.message).toBe("hello there");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("extracts system and developer messages as extraSystemPrompt", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: [
          { type: "message", role: "system", content: "You are a helpful assistant." },
          { type: "message", role: "developer", content: "Be concise." },
          { type: "message", role: "user", content: "Hello" },
        ],
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      const extraSystemPrompt =
        (opts as { extraSystemPrompt?: string } | undefined)?.extraSystemPrompt ?? "";
      expect(extraSystemPrompt).toContain("You are a helpful assistant.");
      expect(extraSystemPrompt).toContain("Be concise.");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("includes instructions in extraSystemPrompt", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: "hi",
        instructions: "Always respond in French.",
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      const extraSystemPrompt =
        (opts as { extraSystemPrompt?: string } | undefined)?.extraSystemPrompt ?? "";
      expect(extraSystemPrompt).toContain("Always respond in French.");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("includes conversation history when multiple messages are provided", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "I am Claude" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: [
          { type: "message", role: "system", content: "You are a helpful assistant." },
          { type: "message", role: "user", content: "Hello, who are you?" },
          { type: "message", role: "assistant", content: "I am Claude." },
          { type: "message", role: "user", content: "What did I just ask you?" },
        ],
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      const message = (opts as { message?: string } | undefined)?.message ?? "";
      expect(message).toContain(HISTORY_CONTEXT_MARKER);
      expect(message).toContain("User: Hello, who are you?");
      expect(message).toContain("Assistant: I am Claude.");
      expect(message).toContain(CURRENT_MESSAGE_MARKER);
      expect(message).toContain("User: What did I just ask you?");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("includes function_call_output when it is the latest item", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "ok" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: [
          { type: "message", role: "user", content: "What's the weather?" },
          { type: "function_call_output", call_id: "call_1", output: "Sunny, 70F." },
        ],
      });
      expect(res.status).toBe(200);
      const [opts] = agentCommand.mock.calls[0] ?? [];
      const message = (opts as { message?: string } | undefined)?.message ?? "";
      expect(message).toContain("Sunny, 70F.");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("returns a non-streaming response with correct shape", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        stream: false,
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(200);
      const json = (await res.json()) as Record<string, unknown>;
      expect(json.object).toBe("response");
      expect(json.status).toBe("completed");
      expect(Array.isArray(json.output)).toBe(true);
      const output = json.output as Array<Record<string, unknown>>;
      expect(output.length).toBe(1);
      const item = output[0] ?? {};
      expect(item.type).toBe("message");
      expect(item.role).toBe("assistant");
      const content = item.content as Array<Record<string, unknown>>;
      expect(content.length).toBe(1);
      expect(content[0]?.type).toBe("output_text");
      expect(content[0]?.text).toBe("hello");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("requires a user message in input", async () => {
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        model: "clawdbot",
        input: [{ type: "message", role: "system", content: "yo" }],
      });
      expect(res.status).toBe(400);
      const json = (await res.json()) as Record<string, unknown>;
      expect((json.error as Record<string, unknown> | undefined)?.type).toBe(
        "invalid_request_error",
      );
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("streams SSE events when stream=true (delta events)", async () => {
    agentCommand.mockImplementationOnce(async (opts: unknown) => {
      const runId = (opts as { runId?: string } | undefined)?.runId ?? "";
      emitAgentEvent({ runId, stream: "assistant", data: { delta: "he" } });
      emitAgentEvent({ runId, stream: "assistant", data: { delta: "llo" } });
      return { payloads: [{ text: "hello" }] } as never;
    });
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        stream: true,
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(200);
      expect(res.headers.get("content-type") ?? "").toContain("text/event-stream");
      const text = await res.text();
      const events = parseSseEvents(text);
      // Check for required event types
      const eventTypes = events.map((e) => e.event).filter(Boolean);
      expect(eventTypes).toContain("response.created");
      expect(eventTypes).toContain("response.output_item.added");
      expect(eventTypes).toContain("response.content_part.added");
      expect(eventTypes).toContain("response.output_text.delta");
      expect(eventTypes).toContain("response.output_text.done");
      expect(eventTypes).toContain("response.content_part.done");
      expect(eventTypes).toContain("response.completed");
      // Check for [DONE] terminal event
      expect(events.some((e) => e.data === "[DONE]")).toBe(true);
      // Verify delta content
      const deltaEvents = events.filter((e) => e.event === "response.output_text.delta");
      const allDeltas = deltaEvents
        .map((e) => {
          const parsed = JSON.parse(e.data) as { delta?: string };
          return parsed.delta ?? "";
        })
        .join("");
      expect(allDeltas).toBe("hello");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("streams SSE events when stream=true (fallback when no deltas)", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        stream: true,
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(200);
      const text = await res.text();
      expect(text).toContain("[DONE]");
      expect(text).toContain("hello");
    } finally {
      await server.close({ reason: "test done" });
    }
  });
  it("event type matches JSON type field", async () => {
    agentCommand.mockResolvedValueOnce({
      payloads: [{ text: "hello" }],
    } as never);
    const port = await getFreePort();
    const server = await startServer(port);
    try {
      const res = await postResponses(port, {
        stream: true,
        model: "clawdbot",
        input: "hi",
      });
      expect(res.status).toBe(200);
      const text = await res.text();
      const events = parseSseEvents(text);
      for (const event of events) {
        if (event.data === "[DONE]") continue;
        const parsed = JSON.parse(event.data) as { type?: string };
        expect(event.event).toBe(parsed.type);
      }
    } finally {
      await server.close({ reason: "test done" });
    }
  });
 });
--- a/src/gateway/openresponses-http.ts
+++ b/src/gateway/openresponses-http.ts
@@ -0,0 +1,580 @@
 /**
 * OpenResponses HTTP Handler
 *
 * Implements the OpenResponses `/v1/responses` endpoint for Clawdbot Gateway.
 *
 * @see https://www.open-responses.com/
 */
 import { randomUUID } from "node:crypto";
 import type { IncomingMessage, ServerResponse } from "node:http";
 import { buildHistoryContextFromEntries, type HistoryEntry } from "../auto-reply/reply/history.js";
 import { createDefaultDeps } from "../cli/deps.js";
 import { agentCommand } from "../commands/agent.js";
 import { emitAgentEvent, onAgentEvent } from "../infra/agent-events.js";
 import { buildAgentMainSessionKey, normalizeAgentId } from "../routing/session-key.js";
 import { defaultRuntime } from "../runtime.js";
 import { authorizeGatewayConnect, type ResolvedGatewayAuth } from "./auth.js";
 import { readJsonBody } from "./hooks.js";
 import {
  CreateResponseBodySchema,
  type ContentPart,
  type CreateResponseBody,
  type ItemParam,
  type OutputItem,
  type ResponseResource,
  type StreamingEvent,
  type Usage,
 } from "./open-responses.schema.js";
 type OpenResponsesHttpOptions = {
  auth: ResolvedGatewayAuth;
  maxBodyBytes?: number;
 };
 function sendJson(res: ServerResponse, status: number, body: unknown) {
  res.statusCode = status;
  res.setHeader("Content-Type", "application/json; charset=utf-8");
  res.end(JSON.stringify(body));
 }
 function getHeader(req: IncomingMessage, name: string): string | undefined {
  const raw = req.headers[name.toLowerCase()];
  if (typeof raw === "string") return raw;
  if (Array.isArray(raw)) return raw[0];
  return undefined;
 }
 function getBearerToken(req: IncomingMessage): string | undefined {
  const raw = getHeader(req, "authorization")?.trim() ?? "";
  if (!raw.toLowerCase().startsWith("bearer ")) return undefined;
  const token = raw.slice(7).trim();
  return token || undefined;
 }
 function writeSseEvent(res: ServerResponse, event: StreamingEvent) {
  res.write(`event: ${event.type}\n`);
  res.write(`data: ${JSON.stringify(event)}\n\n`);
 }
 function writeDone(res: ServerResponse) {
  res.write("data: [DONE]\n\n");
 }
 function extractTextContent(content: string | ContentPart[]): string {
  if (typeof content === "string") return content;
  return content
    .map((part) => {
      if (part.type === "input_text") return part.text;
      if (part.type === "output_text") return part.text;
      return "";
    })
    .filter(Boolean)
    .join("\n");
 }
 function hasUnsupportedContent(content: string | ContentPart[]): string | null {
  if (typeof content === "string") return null;
  for (const part of content) {
    if (part.type === "input_image") return "input_image content is not supported in Phase 1";
    if (part.type === "input_file") return "input_file content is not supported in Phase 1";
  }
  return null;
 }
 function buildAgentPrompt(input: string | ItemParam[]): {
  message: string;
  extraSystemPrompt?: string;
 } {
  if (typeof input === "string") {
    return { message: input };
  }
  const systemParts: string[] = [];
  const conversationEntries: Array<{ role: "user" | "assistant" | "tool"; entry: HistoryEntry }> =
    [];
  for (const item of input) {
    if (item.type === "message") {
      const content = extractTextContent(item.content).trim();
      if (!content) continue;
      if (item.role === "system" || item.role === "developer") {
        systemParts.push(content);
        continue;
      }
      const normalizedRole = item.role === "assistant" ? "assistant" : "user";
      const sender = normalizedRole === "assistant" ? "Assistant" : "User";
      conversationEntries.push({
        role: normalizedRole,
        entry: { sender, body: content },
      });
    } else if (item.type === "function_call_output") {
      conversationEntries.push({
        role: "tool",
        entry: { sender: `Tool:${item.call_id}`, body: item.output },
      });
    }
    // Skip reasoning and item_reference for prompt building (Phase 1)
  }
  let message = "";
  if (conversationEntries.length > 0) {
    // Find the last user or tool message as the current message
    let currentIndex = -1;
    for (let i = conversationEntries.length - 1; i >= 0; i -= 1) {
      const entryRole = conversationEntries[i]?.role;
      if (entryRole === "user" || entryRole === "tool") {
        currentIndex = i;
        break;
      }
    }
    if (currentIndex < 0) currentIndex = conversationEntries.length - 1;
    const currentEntry = conversationEntries[currentIndex]?.entry;
    if (currentEntry) {
      const historyEntries = conversationEntries.slice(0, currentIndex).map((entry) => entry.entry);
      if (historyEntries.length === 0) {
        message = currentEntry.body;
      } else {
        const formatEntry = (entry: HistoryEntry) => `${entry.sender}: ${entry.body}`;
        message = buildHistoryContextFromEntries({
          entries: [...historyEntries, currentEntry],
          currentMessage: formatEntry(currentEntry),
          formatEntry,
        });
      }
    }
  }
  return {
    message,
    extraSystemPrompt: systemParts.length > 0 ? systemParts.join("\n\n") : undefined,
  };
 }
 function resolveAgentIdFromHeader(req: IncomingMessage): string | undefined {
  const raw =
    getHeader(req, "x-clawdbot-agent-id")?.trim() ||
    getHeader(req, "x-clawdbot-agent")?.trim() ||
    "";
  if (!raw) return undefined;
  return normalizeAgentId(raw);
 }
 function resolveAgentIdFromModel(model: string | undefined): string | undefined {
  const raw = model?.trim();
  if (!raw) return undefined;
  const m =
    raw.match(/^clawdbot[:/](?<agentId>[a-z0-9][a-z0-9_-]{0,63})$/i) ??
    raw.match(/^agent:(?<agentId>[a-z0-9][a-z0-9_-]{0,63})$/i);
  const agentId = m?.groups?.agentId;
  if (!agentId) return undefined;
  return normalizeAgentId(agentId);
 }
 function resolveAgentIdForRequest(params: {
  req: IncomingMessage;
  model: string | undefined;
 }): string {
  const fromHeader = resolveAgentIdFromHeader(params.req);
  if (fromHeader) return fromHeader;
  const fromModel = resolveAgentIdFromModel(params.model);
  return fromModel ?? "main";
 }
 function resolveSessionKey(params: {
  req: IncomingMessage;
  agentId: string;
  user?: string | undefined;
 }): string {
  const explicit = getHeader(params.req, "x-clawdbot-session-key")?.trim();
  if (explicit) return explicit;
  // Default: stateless per-request session key, but stable if OpenResponses "user" is provided.
  const user = params.user?.trim();
  const mainKey = user ? `openresponses-user:${user}` : `openresponses:${randomUUID()}`;
  return buildAgentMainSessionKey({ agentId: params.agentId, mainKey });
 }
 function createEmptyUsage(): Usage {
  return { input_tokens: 0, output_tokens: 0, total_tokens: 0 };
 }
 function createResponseResource(params: {
  id: string;
  model: string;
  status: ResponseResource["status"];
  output: OutputItem[];
  usage?: Usage;
  error?: { code: string; message: string };
 }): ResponseResource {
  return {
    id: params.id,
    object: "response",
    created_at: Math.floor(Date.now() / 1000),
    status: params.status,
    model: params.model,
    output: params.output,
    usage: params.usage ?? createEmptyUsage(),
    error: params.error,
  };
 }
 function createAssistantOutputItem(params: {
  id: string;
  text: string;
  status?: "in_progress" | "completed";
 }): OutputItem {
  return {
    type: "message",
    id: params.id,
    role: "assistant",
    content: [{ type: "output_text", text: params.text }],
    status: params.status,
  };
 }
 export async function handleOpenResponsesHttpRequest(
  req: IncomingMessage,
  res: ServerResponse,
  opts: OpenResponsesHttpOptions,
 ): Promise<boolean> {
  const url = new URL(req.url ?? "/", `http://${req.headers.host || "localhost"}`);
  if (url.pathname !== "/v1/responses") return false;
  if (req.method !== "POST") {
    res.statusCode = 405;
    res.setHeader("Allow", "POST");
    res.setHeader("Content-Type", "text/plain; charset=utf-8");
    res.end("Method Not Allowed");
    return true;
  }
  const token = getBearerToken(req);
  const authResult = await authorizeGatewayConnect({
    auth: opts.auth,
    connectAuth: { token, password: token },
    req,
  });
  if (!authResult.ok) {
    sendJson(res, 401, {
      error: { message: "Unauthorized", type: "unauthorized" },
    });
    return true;
  }
  const body = await readJsonBody(req, opts.maxBodyBytes ?? 1024 * 1024);
  if (!body.ok) {
    sendJson(res, 400, {
      error: { message: body.error, type: "invalid_request_error" },
    });
    return true;
  }
  // Validate request body with Zod
  const parseResult = CreateResponseBodySchema.safeParse(body.value);
  if (!parseResult.success) {
    const issue = parseResult.error.issues[0];
    const message = issue ? `${issue.path.join(".")}: ${issue.message}` : "Invalid request body";
    sendJson(res, 400, {
      error: { message, type: "invalid_request_error" },
    });
    return true;
  }
  const payload: CreateResponseBody = parseResult.data;
  const stream = Boolean(payload.stream);
  const model = payload.model;
  const user = payload.user;
  // Check for unsupported content types (Phase 1)
  if (Array.isArray(payload.input)) {
    for (const item of payload.input) {
      if (item.type === "message" && typeof item.content !== "string") {
        const unsupported = hasUnsupportedContent(item.content);
        if (unsupported) {
          sendJson(res, 400, {
            error: { message: unsupported, type: "invalid_request_error" },
          });
          return true;
        }
      }
    }
  }
  const agentId = resolveAgentIdForRequest({ req, model });
  const sessionKey = resolveSessionKey({ req, agentId, user });
  // Build prompt from input
  const prompt = buildAgentPrompt(payload.input);
  // Handle instructions as extra system prompt
  const extraSystemPrompt = [payload.instructions, prompt.extraSystemPrompt]
    .filter(Boolean)
    .join("\n\n");
  if (!prompt.message) {
    sendJson(res, 400, {
      error: {
        message: "Missing user message in `input`.",
        type: "invalid_request_error",
      },
    });
    return true;
  }
  const responseId = `resp_${randomUUID()}`;
  const outputItemId = `msg_${randomUUID()}`;
  const deps = createDefaultDeps();
  if (!stream) {
    try {
      const result = await agentCommand(
        {
          message: prompt.message,
          extraSystemPrompt: extraSystemPrompt || undefined,
          sessionKey,
          runId: responseId,
          deliver: false,
          messageChannel: "webchat",
          bestEffortDeliver: false,
        },
        defaultRuntime,
        deps,
      );
      const payloads = (result as { payloads?: Array<{ text?: string }> } | null)?.payloads;
      const content =
        Array.isArray(payloads) && payloads.length > 0
          ? payloads
              .map((p) => (typeof p.text === "string" ? p.text : ""))
              .filter(Boolean)
              .join("\n\n")
          : "No response from Clawdbot.";
      const response = createResponseResource({
        id: responseId,
        model,
        status: "completed",
        output: [
          createAssistantOutputItem({ id: outputItemId, text: content, status: "completed" }),
        ],
      });
      sendJson(res, 200, response);
    } catch (err) {
      const response = createResponseResource({
        id: responseId,
        model,
        status: "failed",
        output: [],
        error: { code: "api_error", message: String(err) },
      });
      sendJson(res, 500, response);
    }
    return true;
  }
  // ─────────────────────────────────────────────────────────────────────────
  // Streaming mode
  // ─────────────────────────────────────────────────────────────────────────
  res.statusCode = 200;
  res.setHeader("Content-Type", "text/event-stream; charset=utf-8");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders?.();
  let accumulatedText = "";
  let sawAssistantDelta = false;
  let closed = false;
  // Send initial events
  const initialResponse = createResponseResource({
    id: responseId,
    model,
    status: "in_progress",
    output: [],
  });
  writeSseEvent(res, { type: "response.created", response: initialResponse });
  // Add output item
  const outputItem = createAssistantOutputItem({
    id: outputItemId,
    text: "",
    status: "in_progress",
  });
  writeSseEvent(res, {
    type: "response.output_item.added",
    output_index: 0,
    item: outputItem,
  });
  // Add content part
  writeSseEvent(res, {
    type: "response.content_part.added",
    item_id: outputItemId,
    output_index: 0,
    content_index: 0,
    part: { type: "output_text", text: "" },
  });
  const unsubscribe = onAgentEvent((evt) => {
    if (evt.runId !== responseId) return;
    if (closed) return;
    if (evt.stream === "assistant") {
      const delta = evt.data?.delta;
      const text = evt.data?.text;
      const content = typeof delta === "string" ? delta : typeof text === "string" ? text : "";
      if (!content) return;
      sawAssistantDelta = true;
      accumulatedText += content;
      writeSseEvent(res, {
        type: "response.output_text.delta",
        item_id: outputItemId,
        output_index: 0,
        content_index: 0,
        delta: content,
      });
      return;
    }
    if (evt.stream === "lifecycle") {
      const phase = evt.data?.phase;
      if (phase === "end" || phase === "error") {
        closed = true;
        unsubscribe();
        // Complete the stream with final events
        const finalText = accumulatedText || "No response from Clawdbot.";
        const finalStatus = phase === "error" ? "failed" : "completed";
        writeSseEvent(res, {
          type: "response.output_text.done",
          item_id: outputItemId,
          output_index: 0,
          content_index: 0,
          text: finalText,
        });
        writeSseEvent(res, {
          type: "response.content_part.done",
          item_id: outputItemId,
          output_index: 0,
          content_index: 0,
          part: { type: "output_text", text: finalText },
        });
        const completedItem = createAssistantOutputItem({
          id: outputItemId,
          text: finalText,
          status: "completed",
        });
        writeSseEvent(res, {
          type: "response.output_item.done",
          output_index: 0,
          item: completedItem,
        });
        const finalResponse = createResponseResource({
          id: responseId,
          model,
          status: finalStatus,
          output: [completedItem],
        });
        writeSseEvent(res, { type: "response.completed", response: finalResponse });
        writeDone(res);
        res.end();
      }
    }
  });
  req.on("close", () => {
    closed = true;
    unsubscribe();
  });
  void (async () => {
    try {
      const result = await agentCommand(
        {
          message: prompt.message,
          extraSystemPrompt: extraSystemPrompt || undefined,
          sessionKey,
          runId: responseId,
          deliver: false,
          messageChannel: "webchat",
          bestEffortDeliver: false,
        },
        defaultRuntime,
        deps,
      );
      if (closed) return;
      // Fallback: if no streaming deltas were received, send the full response
      if (!sawAssistantDelta) {
        const payloads = (result as { payloads?: Array<{ text?: string }> } | null)?.payloads;
        const content =
          Array.isArray(payloads) && payloads.length > 0
            ? payloads
                .map((p) => (typeof p.text === "string" ? p.text : ""))
                .filter(Boolean)
                .join("\n\n")
            : "No response from Clawdbot.";
        accumulatedText = content;
        sawAssistantDelta = true;
        writeSseEvent(res, {
          type: "response.output_text.delta",
          item_id: outputItemId,
          output_index: 0,
          content_index: 0,
          delta: content,
        });
      }
    } catch (err) {
      if (closed) return;
      const errorResponse = createResponseResource({
        id: responseId,
        model,
        status: "failed",
        output: [],
        error: { code: "api_error", message: String(err) },
      });
      writeSseEvent(res, { type: "response.failed", response: errorResponse });
      emitAgentEvent({
        runId: responseId,
        stream: "lifecycle",
        data: { phase: "error" },
      });
    } finally {
      if (!closed) {
        // Emit lifecycle end to trigger completion
        emitAgentEvent({
          runId: responseId,
          stream: "lifecycle",
          data: { phase: "end" },
        });
      }
    }
  })();
  return true;
 }
--- a/src/gateway/server-http.ts
+++ b/src/gateway/server-http.ts
@@ -26,6 +26,7 @@ import {
 } from "./hooks.js";
 import { applyHookMappings } from "./hooks-mapping.js";
 import { handleOpenAiHttpRequest } from "./openai-http.js";
 import { handleOpenResponsesHttpRequest } from "./openresponses-http.js";
 type SubsystemLogger = ReturnType<typeof createSubsystemLogger>;
@@ -192,6 +193,7 @@ export function createGatewayHttpServer(opts: {
  controlUiEnabled: boolean;
  controlUiBasePath: string;
  openAiChatCompletionsEnabled: boolean;
  openResponsesEnabled: boolean;
  handleHooksRequest: HooksRequestHandler;
  handlePluginRequest?: HooksRequestHandler;
  resolvedAuth: import("./auth.js").ResolvedGatewayAuth;
@@ -202,6 +204,7 @@ export function createGatewayHttpServer(opts: {
    controlUiEnabled,
    controlUiBasePath,
    openAiChatCompletionsEnabled,
    openResponsesEnabled,
    handleHooksRequest,
    handlePluginRequest,
    resolvedAuth,
@@ -222,6 +225,9 @@ export function createGatewayHttpServer(opts: {
      if (await handleHooksRequest(req, res)) return;
      if (await handleSlackHttpRequest(req, res)) return;
      if (handlePluginRequest && (await handlePluginRequest(req, res))) return;
      if (openResponsesEnabled) {
        if (await handleOpenResponsesHttpRequest(req, res, { auth: resolvedAuth })) return;
      }
      if (openAiChatCompletionsEnabled) {
        if (await handleOpenAiHttpRequest(req, res, { auth: resolvedAuth })) return;
      }
--- a/src/gateway/server-runtime-config.ts
+++ b/src/gateway/server-runtime-config.ts
@@ -17,6 +17,7 @@ export type GatewayRuntimeConfig = {
  bindHost: string;
  controlUiEnabled: boolean;
  openAiChatCompletionsEnabled: boolean;
  openResponsesEnabled: boolean;
  controlUiBasePath: string;
  resolvedAuth: ResolvedGatewayAuth;
  authMode: ResolvedGatewayAuth["mode"];
@@ -33,6 +34,7 @@ export async function resolveGatewayRuntimeConfig(params: {
  host?: string;
  controlUiEnabled?: boolean;
  openAiChatCompletionsEnabled?: boolean;
  openResponsesEnabled?: boolean;
  auth?: GatewayAuthConfig;
  tailscale?: GatewayTailscaleConfig;
 }): Promise<GatewayRuntimeConfig> {
@@ -45,6 +47,8 @@ export async function resolveGatewayRuntimeConfig(params: {
    params.openAiChatCompletionsEnabled ??
    params.cfg.gateway?.http?.endpoints?.chatCompletions?.enabled ??
    false;
  const openResponsesEnabled =
    params.openResponsesEnabled ?? params.cfg.gateway?.http?.endpoints?.responses?.enabled ?? false;
  const controlUiBasePath = normalizeControlUiBasePath(params.cfg.gateway?.controlUi?.basePath);
  const authBase = params.cfg.gateway?.auth ?? {};
  const authOverrides = params.auth ?? {};
@@ -88,6 +92,7 @@ export async function resolveGatewayRuntimeConfig(params: {
    bindHost,
    controlUiEnabled,
    openAiChatCompletionsEnabled,
    openResponsesEnabled,
    controlUiBasePath,
    resolvedAuth,
    authMode,
--- a/src/gateway/server-runtime-state.ts
+++ b/src/gateway/server-runtime-state.ts
@@ -27,6 +27,7 @@ export async function createGatewayRuntimeState(params: {
  controlUiEnabled: boolean;
  controlUiBasePath: string;
  openAiChatCompletionsEnabled: boolean;
  openResponsesEnabled: boolean;
  resolvedAuth: ResolvedGatewayAuth;
  gatewayTls?: GatewayTlsRuntime;
  hooksConfig: () => HooksConfigResolved | null;
@@ -103,6 +104,7 @@ export async function createGatewayRuntimeState(params: {
    controlUiEnabled: params.controlUiEnabled,
    controlUiBasePath: params.controlUiBasePath,
    openAiChatCompletionsEnabled: params.openAiChatCompletionsEnabled,
    openResponsesEnabled: params.openResponsesEnabled,
    handleHooksRequest,
    handlePluginRequest,
    resolvedAuth: params.resolvedAuth,
--- a/src/gateway/server.impl.ts
+++ b/src/gateway/server.impl.ts
@@ -111,6 +111,11 @@ export type GatewayServerOptions = {
   * Default: config `gateway.http.endpoints.chatCompletions.enabled` (or false when absent).
   */
  openAiChatCompletionsEnabled?: boolean;
  /**
   * If false, do not serve `POST /v1/responses` (OpenResponses API).
   * Default: config `gateway.http.endpoints.responses.enabled` (or false when absent).
   */
  openResponsesEnabled?: boolean;
  /**
   * Override gateway auth configuration (merges with config).
   */
@@ -205,6 +210,7 @@ export async function startGatewayServer(
    host: opts.host,
    controlUiEnabled: opts.controlUiEnabled,
    openAiChatCompletionsEnabled: opts.openAiChatCompletionsEnabled,
    openResponsesEnabled: opts.openResponsesEnabled,
    auth: opts.auth,
    tailscale: opts.tailscale,
  });
@@ -212,6 +218,7 @@ export async function startGatewayServer(
    bindHost,
    controlUiEnabled,
    openAiChatCompletionsEnabled,
    openResponsesEnabled,
    controlUiBasePath,
    resolvedAuth,
    tailscaleConfig,
@@ -250,6 +257,7 @@ export async function startGatewayServer(
    controlUiEnabled,
    controlUiBasePath,
    openAiChatCompletionsEnabled,
    openResponsesEnabled,
    resolvedAuth,
    gatewayTls,
    hooksConfig: () => hooksConfig,