test(gateway): add OpenResponses parity E2E tests

- Add schema validation tests for input_image, input_file, client tools
- Add buildAgentPrompt tests for turn-based tool flow

Committed by Peter Steinberger
Parent: a5afe7bc2b
Commit: 4f02c74dca

BIN .agent/.DS_Store (vendored, normal file)
Binary file not shown.

.agent/workflows/update_clawdbot.md (normal file, +366)
@@ -0,0 +1,366 @@
---
description: Update Clawdbot from upstream when branch has diverged (ahead/behind)
---

# Clawdbot Upstream Sync Workflow

Use this workflow when your fork has diverged from upstream (e.g., "18 commits ahead, 29 commits behind").

## Quick Reference

```bash
# Check divergence status
git fetch upstream && git rev-list --left-right --count main...upstream/main

# Full sync (rebase preferred)
git fetch upstream && git rebase upstream/main && pnpm install && pnpm build && ./scripts/restart-mac.sh

# Check for Swift 6.2 issues after sync
grep -r "FileManager\.default\|Thread\.isMainThread" src/ apps/ --include="*.swift"
```

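
The divergence check above prints two counts separated by a tab: commits only on `main` (ahead) first, then commits only on `upstream/main` (behind). A minimal sketch for turning that pair into a readable summary follows; `describe_divergence` is a hypothetical helper name, not something the repository provides.

```bash
# Hypothetical helper: formats the "<ahead><TAB><behind>" pair printed by
#   git rev-list --left-right --count main...upstream/main
describe_divergence() {
  # Unquoted on purpose: word splitting separates the two counts.
  set -- $1
  echo "$1 ahead, $2 behind"
}

# Usage inside a clone that has an `upstream` remote:
#   describe_divergence "$(git rev-list --left-right --count main...upstream/main)"
```
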
---

## Step 1: Assess Divergence

```bash
git fetch upstream
git log --oneline --left-right main...upstream/main | head -20
```

This shows:
- `<` = your local commits (ahead)
- `>` = upstream commits you're missing (behind)

**Decision point:**
- Few local commits, many upstream → **Rebase** (cleaner history)
- Many local commits or shared branch → **Merge** (preserves history)

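
The decision point above can be sketched as a small function. This is only an illustration: `suggest_strategy` is a hypothetical name, and the cutoff of 10 local commits for "few" is an assumption that the workflow itself does not specify.

```bash
# Hypothetical helper encoding the decision point.
# MAX_REBASE_COMMITS is an ASSUMED cutoff for "few" local commits; tune it.
MAX_REBASE_COMMITS="${MAX_REBASE_COMMITS:-10}"

suggest_strategy() {
  ahead="$1"         # number of local commits not in upstream
  shared="${2:-no}"  # "yes" if other people also pull from this branch
  if [ "$shared" = "yes" ] || [ "$ahead" -gt "$MAX_REBASE_COMMITS" ]; then
    echo "merge"
  else
    echo "rebase"
  fi
}

# Usage: suggest_strategy 3        (private branch, 3 local commits)
#        suggest_strategy 3 yes    (shared branch)
```
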
---

## Step 2A: Rebase Strategy (Preferred)

Replays your commits on top of upstream. Results in linear history.

```bash
# Ensure working tree is clean
git status

# Rebase onto upstream
git rebase upstream/main
```

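
`git status` only reports the state of the tree; it does not stop the rebase when there are uncommitted changes. One possible guard, assuming the machine-readable `git status --porcelain` output (empty when the tree is clean), is sketched below. `is_clean_status` is a hypothetical helper name.

```bash
# Hypothetical helper: succeeds only when the porcelain status text is empty.
is_clean_status() {
  [ -z "$1" ]
}

# Usage inside the repository:
#   is_clean_status "$(git status --porcelain)" || {
#     echo "Commit or stash your changes before rebasing" >&2
#     exit 1
#   }
```
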
### Handling Rebase Conflicts

```bash
# When conflicts occur:
# 1. Fix conflicts in the listed files
# 2. Stage resolved files
git add <resolved-files>

# 3. Continue rebase
git rebase --continue

# If a commit is no longer needed (already in upstream):
git rebase --skip

# To abort and return to original state:
git rebase --abort
```

### Common Conflict Patterns

| File | Resolution |
|------|------------|
| `package.json` | Take upstream deps, keep local scripts if needed |
| `pnpm-lock.yaml` | Accept upstream, regenerate with `pnpm install` |
| `*.patch` files | Usually take upstream version |
| Source files | Merge logic carefully, prefer upstream structure |

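
The table above can also be expressed as a lookup that prints the suggested resolution for each conflicted path. This is a sketch only; `conflict_hint` is a hypothetical helper, and its messages simply restate the table.

```bash
# Hypothetical helper: maps a conflicted path to the resolution from the table.
conflict_hint() {
  case "$1" in
    pnpm-lock.yaml|*/pnpm-lock.yaml) echo "accept upstream, then regenerate with pnpm install" ;;
    package.json|*/package.json)     echo "take upstream deps, keep local scripts if needed" ;;
    *.patch)                         echo "usually take upstream version" ;;
    *)                               echo "merge logic carefully, prefer upstream structure" ;;
  esac
}

# Usage during a conflicted rebase or merge (lists unmerged paths):
#   git diff --name-only --diff-filter=U | while read -r f; do
#     printf '%s: %s\n' "$f" "$(conflict_hint "$f")"
#   done
```
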
---

## Step 2B: Merge Strategy (Alternative)

Preserves all history with a merge commit.

```bash
git merge upstream/main --no-edit
```

Resolve conflicts the same way as for a rebase, then:
```bash
git add <resolved-files>
git commit
```

---

## Step 3: Rebuild Everything

After sync completes:

```bash
# Install dependencies (regenerates lock if needed)
pnpm install

# Build TypeScript
pnpm build

# Build UI assets
pnpm ui:build

# Run diagnostics
pnpm clawdbot doctor
```

---

## Step 4: Rebuild macOS App

```bash
# Full rebuild, sign, and launch
./scripts/restart-mac.sh

# Or just package without restart
pnpm mac:package
```

### Install to /Applications

```bash
# Kill running app
pkill -x "Clawdbot" || true

# Move old version (clear any stale backup first so mv does not nest into it)
rm -rf /tmp/Clawdbot-backup.app
mv /Applications/Clawdbot.app /tmp/Clawdbot-backup.app

# Install new build
cp -R dist/Clawdbot.app /Applications/

# Launch
open /Applications/Clawdbot.app
```

---

## Step 4A: Verify macOS App & Agent

After rebuilding the macOS app, always verify it works correctly:

```bash
# Check gateway health
pnpm clawdbot health

# Verify no zombie processes
ps aux | grep -E "(clawdbot|gateway)" | grep -v grep

# Test agent functionality by sending a verification message
pnpm clawdbot agent --message "Verification: macOS app rebuild successful - agent is responding." --session-id YOUR_TELEGRAM_SESSION_ID

# Confirm the message was received on Telegram
# (Check your Telegram chat with the bot)
```

**Important:** Always wait for the Telegram verification message before proceeding. If the agent doesn't respond, troubleshoot the gateway or model configuration before pushing.

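
Because the verification command above ships with a `YOUR_TELEGRAM_SESSION_ID` placeholder, it is easy to run it unedited. A small guard along these lines could catch that. `require_session_id` and the `SESSION_ID` variable are hypothetical names used only for illustration.

```bash
# Hypothetical guard: rejects an empty value or the unedited placeholder.
require_session_id() {
  case "$1" in
    ""|YOUR_TELEGRAM_SESSION_ID)
      echo "Set a real Telegram session ID first" >&2
      return 1
      ;;
  esac
}

# Usage (SESSION_ID is an assumed variable name):
#   require_session_id "$SESSION_ID" &&
#     pnpm clawdbot agent --message "Verification ping" --session-id "$SESSION_ID"
```
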
---

## Step 5: Handle Swift/macOS Build Issues (Common After Upstream Sync)

Upstream updates may introduce Swift 6.2 / macOS 26 SDK incompatibilities. Use analyze-mode for systematic debugging:

### Analyze-Mode Investigation
```bash
# Gather context with parallel agents
morph-mcp_warpgrep_codebase_search search_string="Find deprecated FileManager.default and Thread.isMainThread usages in Swift files" repo_path="/Volumes/Main SSD/Developer/clawdis"
morph-mcp_warpgrep_codebase_search search_string="Locate Peekaboo submodule and macOS app Swift files with concurrency issues" repo_path="/Volumes/Main SSD/Developer/clawdis"
```

### Common Swift 6.2 Fixes

**FileManager.default Deprecation:**
```bash
# Search for deprecated usage
grep -r "FileManager\.default" src/ apps/ --include="*.swift"

# Replace with proper initialization
# OLD: FileManager.default
# NEW: FileManager()
```

**Thread.isMainThread Deprecation:**
```bash
# Search for deprecated usage
grep -r "Thread\.isMainThread" src/ apps/ --include="*.swift"

# Replace with modern concurrency check
# OLD: Thread.isMainThread
# NEW: await MainActor.run { ... } or DispatchQueue.main.sync { ... }
# Caution: DispatchQueue.main.sync deadlocks if the caller is already on the main thread
```

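
The two searches above can be combined into one pass that reports file and line number for either API. The sketch below uses `find` plus `grep -E` rather than `grep -r --include`, as an assumption that this is the more portable spelling. `scan_swift_apis` is a hypothetical helper name.

```bash
# Hypothetical helper: prints "path:line:text" for every use of either API
# in *.swift files under the given directories.
scan_swift_apis() {
  # /dev/null forces grep to print the file name even for a single file.
  find "$@" -name '*.swift' -exec \
    grep -nE 'FileManager\.default|Thread\.isMainThread' /dev/null {} +
}

# Usage from the repository root:
#   scan_swift_apis src apps
```
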
### Peekaboo Submodule Fixes
```bash
# Check Peekaboo for concurrency issues
cd src/canvas-host/a2ui
grep -r "Thread\.isMainThread\|FileManager\.default" . --include="*.swift"

# Fix and rebuild submodule (quote the path: it contains a space)
cd "/Volumes/Main SSD/Developer/clawdis"
pnpm canvas:a2ui:bundle
```

### macOS App Concurrency Fixes
```bash
# Check macOS app for issues
grep -r "Thread\.isMainThread\|FileManager\.default" apps/macos/ --include="*.swift"

# Clean and rebuild after fixes (stay in the repo root so the script path resolves)
rm -rf apps/macos/.build apps/macos/.swiftpm
./scripts/restart-mac.sh
```

### Model Configuration Updates
If upstream introduced new model configurations:
```bash
# Check for OpenRouter API key requirements
grep -r "openrouter\|OPENROUTER" src/ --include="*.ts" --include="*.js"

# Update clawdbot.json with fallback chains
# Add model fallback configurations as needed
```

---

## Step 6: Verify & Push

```bash
# Verify everything works
pnpm clawdbot health
pnpm test

# Push (force required after rebase)
git push origin main --force-with-lease

# Or regular push after merge
git push origin main
```

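
Which push form applies depends on the strategy chosen in Step 2. A minimal sketch of that mapping follows; `push_flags` is a hypothetical helper name, and the strategy labels `rebase` and `merge` are assumptions made for this example.

```bash
# Hypothetical helper: prints the extra git push flag for a sync strategy.
push_flags() {
  case "$1" in
    rebase) echo "--force-with-lease" ;;  # history was rewritten
    merge)  echo "" ;;                    # a plain push is enough
    *)      echo "unknown strategy: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   git push origin main $(push_flags rebase)
```
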
---

## Troubleshooting

### Build Fails After Sync

```bash
# Clean and rebuild
rm -rf node_modules dist
pnpm install
pnpm build
```

### Type Errors (Bun/Node Incompatibility)

A common issue is a `fetch.preconnect` type mismatch. Fix it by using the `FetchLike` type instead of `typeof fetch`.

### macOS App Crashes on Launch

Usually caused by a resource bundle mismatch. A full rebuild is required:
```bash
# Run from the repo root so the script path resolves
rm -rf apps/macos/.build apps/macos/.swiftpm
./scripts/restart-mac.sh
```

### Patch Failures

```bash
# Check patch status
pnpm install 2>&1 | grep -i patch

# If patches fail, they may need updating for new dep versions
# Check patches/ directory against package.json patchedDependencies
```

### Swift 6.2 / macOS 26 SDK Build Failures

**Symptoms:** Build fails with deprecation warnings about `FileManager.default` or `Thread.isMainThread`

**Search-Mode Investigation:**
```bash
# Exhaustive search for deprecated APIs
morph-mcp_warpgrep_codebase_search search_string="Find all Swift files using deprecated FileManager.default or Thread.isMainThread" repo_path="/Volumes/Main SSD/Developer/clawdis"
```

**Quick Fix Commands:**
```bash
# Find all affected files
find . -name "*.swift" -exec grep -l "FileManager\.default\|Thread\.isMainThread" {} \;

# Replace FileManager.default with FileManager()
# (sed -i '' is the macOS/BSD form; review the resulting diff before committing)
find . -name "*.swift" -exec sed -i '' 's/FileManager\.default/FileManager()/g' {} \;

# For Thread.isMainThread, need manual review of each usage
grep -rn "Thread\.isMainThread" --include="*.swift" .
```

**Rebuild After Fixes:**
```bash
# Clean all build artifacts
rm -rf apps/macos/.build apps/macos/.swiftpm
rm -rf src/canvas-host/a2ui/.build

# Rebuild Peekaboo bundle
pnpm canvas:a2ui:bundle

# Full macOS rebuild
./scripts/restart-mac.sh
```

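
The in-place `sed` above rewrites every match without showing what changed. One way to inspect the effect on a single file first is sketched below. `preview_filemanager_fix` is a hypothetical helper name, and the substitution is the same one the quick fix applies.

```bash
# Hypothetical helper: shows a unified diff of what the sed substitution
# would change in one file, without modifying the file.
preview_filemanager_fix() {
  # diff exits 1 when it finds differences; that is expected, so ignore it.
  sed 's/FileManager\.default/FileManager()/g' "$1" | diff -u "$1" - || true
}

# Usage:
#   preview_filemanager_fix apps/macos/Sources/SomeFile.swift   # example path, hypothetical
```
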
---

## Automation Script

Save as `scripts/sync-upstream.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

echo "==> Fetching upstream..."
git fetch upstream

echo "==> Current divergence:"
git rev-list --left-right --count main...upstream/main

echo "==> Rebasing onto upstream/main..."
git rebase upstream/main

echo "==> Installing dependencies..."
pnpm install

echo "==> Building..."
pnpm build
pnpm ui:build

echo "==> Running doctor..."
pnpm clawdbot doctor

echo "==> Rebuilding macOS app..."
./scripts/restart-mac.sh

echo "==> Verifying gateway health..."
pnpm clawdbot health

echo "==> Checking for Swift 6.2 compatibility issues..."
if grep -r "FileManager\.default\|Thread\.isMainThread" src/ apps/ --include="*.swift" --quiet; then
  echo "⚠️ Found potential Swift 6.2 deprecated API usage"
  echo " Run manual fixes or use analyze-mode investigation"
else
  echo "✅ No obvious Swift deprecation issues found"
fi

echo "==> Testing agent functionality..."
# Note: Update YOUR_TELEGRAM_SESSION_ID with actual session ID
pnpm clawdbot agent --message "Verification: Upstream sync and macOS rebuild completed successfully." --session-id YOUR_TELEGRAM_SESSION_ID || echo "Warning: Agent test failed - check Telegram for verification message"

echo "==> Done! Check Telegram for verification message, then run 'git push --force-with-lease' when ready."
```

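
The script above assumes `git` and `pnpm` are already on `PATH`. A preflight check along the following lines could fail early with a clearer message. `missing_commands` is a hypothetical helper name, not part of the script as committed.

```bash
# Hypothetical helper: prints each named command that is not on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Usage near the top of scripts/sync-upstream.sh:
#   missing="$(missing_commands git pnpm)"
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```
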
.serena/.gitignore (vendored, normal file, +1)
@@ -0,0 +1 @@
/cache

BIN .serena/cache/typescript/document_symbols.pkl (vendored, normal file)
Binary file not shown.

BIN .serena/cache/typescript/raw_document_symbols.pkl (vendored, normal file)
Binary file not shown.

.serena/project.yml (normal file, +87)
@@ -0,0 +1,87 @@
# list of languages for which language servers are started; choose from:
# al bash clojure cpp csharp csharp_omnisharp
# dart elixir elm erlang fortran fsharp
# go groovy haskell java julia kotlin
# lua markdown nix pascal perl php
# powershell python python_jedi r rego ruby
# ruby_solargraph rust scala swift terraform toml
# typescript typescript_vts yaml zig
# Note:
# - For C, use cpp
# - For JavaScript, use typescript
# - For Free Pascal / Lazarus, use pascal
# Special requirements:
# - csharp: Requires the presence of a .sln file in the project folder.
# - pascal: Requires Free Pascal Compiler (fpc) and optionally Lazarus.
# When using multiple languages, the first language server that supports a given file will be used for that file.
# The first language is the default language and the respective language server will be used as a fallback.
# Note that when using the JetBrains backend, language servers are not used and this list is correspondingly ignored.
languages:
- typescript

# the encoding used by text files in the project
# For a list of possible encodings, see https://docs.python.org/3.11/library/codecs.html#standard-encodings
encoding: "utf-8"

# whether to use the project's gitignore file to ignore files
# Added on 2025-04-07
ignore_all_files_in_gitignore: true

# list of additional paths to ignore
# same syntax as gitignore, so you can use * and **
# Was previously called `ignored_dirs`, please update your config if you are using that.
# Added (renamed) on 2025-04-07
ignored_paths: []

# whether the project is in read-only mode
# If set to true, all editing tools will be disabled and attempts to use them will result in an error
# Added on 2025-04-18
read_only: false

# list of tool names to exclude. We recommend not excluding any tools, see the readme for more details.
# Below is the complete list of tools for convenience.
# To make sure you have the latest list of tools, and to view their descriptions,
# execute `uv run scripts/print_tool_overview.py`.
#
# * `activate_project`: Activates a project by name.
# * `check_onboarding_performed`: Checks whether project onboarding was already performed.
# * `create_text_file`: Creates/overwrites a file in the project directory.
# * `delete_lines`: Deletes a range of lines within a file.
# * `delete_memory`: Deletes a memory from Serena's project-specific memory store.
# * `execute_shell_command`: Executes a shell command.
# * `find_referencing_code_snippets`: Finds code snippets in which the symbol at the given location is referenced.
# * `find_referencing_symbols`: Finds symbols that reference the symbol at the given location (optionally filtered by type).
# * `find_symbol`: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
# * `get_current_config`: Prints the current configuration of the agent, including the active and available projects, tools, contexts, and modes.
# * `get_symbols_overview`: Gets an overview of the top-level symbols defined in a given file.
# * `initial_instructions`: Gets the initial instructions for the current project.
#   Should only be used in settings where the system prompt cannot be set,
#   e.g. in clients you have no control over, like Claude Desktop.
# * `insert_after_symbol`: Inserts content after the end of the definition of a given symbol.
# * `insert_at_line`: Inserts content at a given line in a file.
# * `insert_before_symbol`: Inserts content before the beginning of the definition of a given symbol.
# * `list_dir`: Lists files and directories in the given directory (optionally with recursion).
# * `list_memories`: Lists memories in Serena's project-specific memory store.
# * `onboarding`: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
# * `prepare_for_new_conversation`: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
# * `read_file`: Reads a file within the project directory.
# * `read_memory`: Reads the memory with the given name from Serena's project-specific memory store.
# * `remove_project`: Removes a project from the Serena configuration.
# * `replace_lines`: Replaces a range of lines within a file with new content.
# * `replace_symbol_body`: Replaces the full definition of a symbol.
# * `restart_language_server`: Restarts the language server, may be necessary when edits not through Serena happen.
# * `search_for_pattern`: Performs a search for a pattern in the project.
# * `summarize_changes`: Provides instructions for summarizing the changes made to the codebase.
# * `switch_modes`: Activates modes by providing a list of their names
# * `think_about_collected_information`: Thinking tool for pondering the completeness of collected information.
# * `think_about_task_adherence`: Thinking tool for determining whether the agent is still on track with the current task.
# * `think_about_whether_you_are_done`: Thinking tool for determining whether the task is truly completed.
# * `write_memory`: Writes a named memory (for future reference) to Serena's project-specific memory store.
excluded_tools: []

# initial prompt for the project. It will always be given to the LLM upon activating the project
# (contrary to the memories, which are loaded on demand).
initial_prompt: ""

project_name: "clawdbot"
included_optional_tools: []

src/gateway/openresponses-parity.e2e.test.ts (normal file, +315)
@@ -0,0 +1,315 @@
/**
 * OpenResponses Feature Parity E2E Tests
 *
 * Tests for input_image, input_file, and client-side tools (Hosted Tools)
 * support in the OpenResponses `/v1/responses` endpoint.
 */

import { describe, it, expect } from "vitest";

describe("OpenResponses Feature Parity", () => {
  describe("Schema Validation", () => {
    it("should validate input_image with url source", async () => {
      const { InputImageContentPartSchema } = await import("./open-responses.schema.js");

      const validImage = {
        type: "input_image" as const,
        source: {
          type: "url" as const,
          url: "https://example.com/image.png",
        },
      };

      const result = InputImageContentPartSchema.safeParse(validImage);
      expect(result.success).toBe(true);
    });

    it("should validate input_image with base64 source", async () => {
      const { InputImageContentPartSchema } = await import("./open-responses.schema.js");

      const validImage = {
        type: "input_image" as const,
        source: {
          type: "base64" as const,
          media_type: "image/png" as const,
          data: "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==",
        },
      };

      const result = InputImageContentPartSchema.safeParse(validImage);
      expect(result.success).toBe(true);
    });

    it("should reject input_image with invalid mime type", async () => {
      const { InputImageContentPartSchema } = await import("./open-responses.schema.js");

      const invalidImage = {
        type: "input_image" as const,
        source: {
          type: "base64" as const,
          media_type: "application/json" as const, // Not an image
          data: "SGVsbG8gV29ybGQh",
        },
      };

      const result = InputImageContentPartSchema.safeParse(invalidImage);
      expect(result.success).toBe(false);
    });

    it("should validate input_file with url source", async () => {
      const { InputFileContentPartSchema } = await import("./open-responses.schema.js");

      const validFile = {
        type: "input_file" as const,
        source: {
          type: "url" as const,
          url: "https://example.com/document.txt",
        },
      };

      const result = InputFileContentPartSchema.safeParse(validFile);
      expect(result.success).toBe(true);
    });

    it("should validate input_file with base64 source", async () => {
      const { InputFileContentPartSchema } = await import("./open-responses.schema.js");

      const validFile = {
        type: "input_file" as const,
        source: {
          type: "base64" as const,
          media_type: "text/plain" as const,
          data: "SGVsbG8gV29ybGQh",
          filename: "hello.txt",
        },
      };

      const result = InputFileContentPartSchema.safeParse(validFile);
      expect(result.success).toBe(true);
    });

    it("should validate tool definition", async () => {
      const { ToolDefinitionSchema } = await import("./open-responses.schema.js");

      const validTool = {
        type: "function" as const,
        function: {
          name: "get_weather",
          description: "Get the current weather",
          parameters: {
            type: "object",
            properties: {
              location: { type: "string" },
            },
            required: ["location"],
          },
        },
      };

      const result = ToolDefinitionSchema.safeParse(validTool);
      expect(result.success).toBe(true);
    });

    it("should reject tool definition without name", async () => {
      const { ToolDefinitionSchema } = await import("./open-responses.schema.js");

      const invalidTool = {
        type: "function" as const,
        function: {
          name: "", // Empty name
          description: "Get the current weather",
        },
      };

      const result = ToolDefinitionSchema.safeParse(invalidTool);
      expect(result.success).toBe(false);
    });
  });

  describe("CreateResponseBody Schema", () => {
    it("should validate request with input_image", async () => {
      const { CreateResponseBodySchema } = await import("./open-responses.schema.js");

      const validRequest = {
        model: "claude-sonnet-4-20250514",
        input: [
          {
            type: "message" as const,
            role: "user" as const,
            content: [
              {
                type: "input_image" as const,
                source: {
                  type: "url" as const,
                  url: "https://example.com/photo.jpg",
                },
              },
              {
                type: "input_text" as const,
                text: "What's in this image?",
              },
            ],
          },
        ],
      };

      const result = CreateResponseBodySchema.safeParse(validRequest);
      expect(result.success).toBe(true);
    });

    it("should validate request with client tools", async () => {
      const { CreateResponseBodySchema } = await import("./open-responses.schema.js");

      const validRequest = {
        model: "claude-sonnet-4-20250514",
        input: [
          {
            type: "message" as const,
            role: "user" as const,
            content: "What's the weather?",
          },
        ],
        tools: [
          {
            type: "function" as const,
            function: {
              name: "get_weather",
              description: "Get weather for a location",
              parameters: {
                type: "object",
                properties: {
                  location: { type: "string" },
                },
                required: ["location"],
              },
            },
          },
        ],
      };

      const result = CreateResponseBodySchema.safeParse(validRequest);
      expect(result.success).toBe(true);
    });

    it("should validate request with function_call_output for turn-based tools", async () => {
      const { CreateResponseBodySchema } = await import("./open-responses.schema.js");

      const validRequest = {
        model: "claude-sonnet-4-20250514",
        input: [
          {
            type: "function_call_output" as const,
            call_id: "call_123",
            output: '{"temperature": "72°F", "condition": "sunny"}',
          },
        ],
      };

      const result = CreateResponseBodySchema.safeParse(validRequest);
      expect(result.success).toBe(true);
    });

    it("should validate complete turn-based tool flow", async () => {
      const { CreateResponseBodySchema } = await import("./open-responses.schema.js");

      // Turn 1: Client sends the user message along with tool definitions
      const turn1Request = {
        model: "claude-sonnet-4-20250514",
        input: [
          {
            type: "message" as const,
            role: "user" as const,
            content: "What's the weather in San Francisco?",
          },
        ],
        tools: [
          {
            type: "function" as const,
            function: {
              name: "get_weather",
              description: "Get weather for a location",
            },
          },
        ],
      };

      const turn1Result = CreateResponseBodySchema.safeParse(turn1Request);
      expect(turn1Result.success).toBe(true);

      // Turn 2: Client provides tool output
      const turn2Request = {
        model: "claude-sonnet-4-20250514",
        input: [
          {
            type: "function_call_output" as const,
            call_id: "call_123",
            output: '{"temperature": "72°F", "condition": "sunny"}',
          },
        ],
      };

      const turn2Result = CreateResponseBodySchema.safeParse(turn2Request);
      expect(turn2Result.success).toBe(true);
    });
  });

  describe("Response Resource Schema", () => {
    it("should validate response with function_call output", async () => {
      const { OutputItemSchema } = await import("./open-responses.schema.js");

      const functionCallOutput = {
        type: "function_call" as const,
        id: "msg_123",
        call_id: "call_456",
        name: "get_weather",
        arguments: '{"location": "San Francisco"}',
      };

      const result = OutputItemSchema.safeParse(functionCallOutput);
      expect(result.success).toBe(true);
    });
  });

  describe("buildAgentPrompt", () => {
    it("should convert function_call_output to tool entry", async () => {
      const { buildAgentPrompt } = await import("./openresponses-http.js");

      const result = buildAgentPrompt([
        {
          type: "function_call_output" as const,
          call_id: "call_123",
          output: '{"temperature": "72°F"}',
        },
      ]);

      // When there's only a tool output (no history), returns just the body
      expect(result.message).toBe('{"temperature": "72°F"}');
    });

    it("should handle mixed message and function_call_output items", async () => {
      const { buildAgentPrompt } = await import("./openresponses-http.js");

      const result = buildAgentPrompt([
        {
          type: "message" as const,
          role: "user" as const,
          content: "What's the weather?",
        },
        {
          type: "function_call_output" as const,
          call_id: "call_123",
          output: '{"temperature": "72°F"}',
        },
        {
          type: "message" as const,
          role: "user" as const,
          content: "Thanks!",
        },
      ]);

      // Should include both user messages and tool output
      expect(result.message).toContain("weather");
      expect(result.message).toContain("72°F");
      expect(result.message).toContain("Thanks");
    });
  });
});