diff --git a/README.md b/README.md
index ab5b8da..355a9ae 100644
--- a/README.md
+++ b/README.md
@@ -11,12 +11,13 @@ An OpenAI-compatible API proxy server for unified access to different LLM models.
 - **Smart priority** - FACTORY_API_KEY > refresh_token > client authorization
 - **Fault-tolerant startup** - starts without errors even with no auth configured, and keeps running with client-side authorization
 
-### 🧠 Model reasoning capability levels
-- **Four reasoning levels** - off/low/medium/high, for precise control over model thinking depth
+### 🧠 Smart reasoning level control
+- **Five reasoning levels** - auto/off/low/medium/high, for flexible control over reasoning behavior
+- **auto mode** - follows the client's original request exactly, with no reasoning parameter changes
+- **Fixed levels** - off/low/medium/high forcibly override the client's reasoning settings
 - **OpenAI models** - automatically injects the reasoning field; the effort parameter controls reasoning strength
 - **Anthropic models** - automatically configures the thinking field and budget_tokens (4096/12288/24576)
-- **Smart beta header management** - automatically adds/removes reasoning-related flags in the anthropic-beta field
-- **Config-driven** - flexibly adjust each model's reasoning level via config.json
+- **Smart header management** - adds/removes reasoning-related anthropic-beta flags according to the reasoning level
 
 ### 🚀 Server deployment / Docker deployment
 - **Local server** - quick start via npm start
@@ -103,34 +104,37 @@ export DROID_REFRESH_KEY="your_refresh_token_here"
 
 #### Reasoning level configuration
 
-Each model supports four reasoning levels:
+Each model supports five reasoning levels:
 
-- **`off`** - disables reasoning; standard responses are used
+- **`auto`** - follows the client's original request, with no reasoning parameter changes
+- **`off`** - forcibly disables reasoning and removes all reasoning fields
 - **`low`** - light reasoning (Anthropic: 4096 tokens, OpenAI: low effort)
-- **`medium`** - medium reasoning (Anthropic: 12288 tokens, OpenAI: medium effort) 
+- **`medium`** - medium reasoning (Anthropic: 12288 tokens, OpenAI: medium effort)
 - **`high`** - deep reasoning (Anthropic: 24576 tokens, OpenAI: high effort)
 
 **For Anthropic models (Claude)**:
 ```json
 {
-  "name": "Claude Sonnet 4.5",
+  "name": "Claude Sonnet 4.5",
   "id": "claude-sonnet-4-5-20250929",
   "type": "anthropic",
-  "reasoning": "high"
+  "reasoning": "auto"  // Recommended: let the client control reasoning
 }
 ```
-The thinking field and anthropic-beta header are added automatically; budget_tokens is set according to the level.
+- `auto`: keeps the client's thinking field and leaves the anthropic-beta header untouched
+- `low/medium/high`: automatically adds the thinking field and anthropic-beta header; budget_tokens is set according to the level
 
 **For OpenAI models (GPT)**:
 ```json
 {
-  "name": "GPT-5-Codex",
-  "id": "gpt-5-codex",
-  "type": "openai",
-  "reasoning": "medium"
+  "name": "GPT-5",
+  "id": "gpt-5-2025-08-07",
+  "type": "openai",
+  "reasoning": "auto"  // Recommended: let the client control reasoning
 }
 ```
-The reasoning field is added automatically; the effort parameter matches the configured level.
+- `auto`: leaves the client's reasoning field unchanged
+- `low/medium/high`: automatically adds the reasoning field; effort is set to the corresponding level
 
 ## Usage
 
@@ -311,6 +315,36 @@ droid2api fully respects the client's stream parameter:
 - **`"stream": false`** - disables streaming; the complete result is returned once ready
 - **stream unset** - the server decides the default behavior; nothing is forced
 
+### What is auto reasoning mode?
+
+`auto` is a reasoning level added in v1.3.0 that follows the client's original request exactly:
+
+**Behavior**:
+- 🎯 **Zero intervention** - no reasoning-related fields are added, removed, or modified
+- 🔄 **Full passthrough** - whatever the client sends is forwarded as-is
+- 🛡️ **Header protection** - reasoning-related headers such as anthropic-beta are left untouched
+
+**Use cases**:
+- The client needs full control over reasoning parameters
+- Behavior must stay 100% consistent with the original API
+- Different clients have different reasoning needs
+
+**Example comparison**:
+```jsonc
+// Client request includes a reasoning field
+{
+  "model": "claude-opus-4-1-20250805",
+  "messages": [...],
+  "thinking": {"type": "enabled", "budget_tokens": 8192}
+}
+
+// auto mode (model configured as "auto"): the client's settings are fully preserved
+// → the thinking field is forwarded as-is, without modification
+
+// If configured as "high", it would be overridden to {"type": "enabled", "budget_tokens": 24576}
+```
+
 ### How do I configure the reasoning level?
 
 Set the `reasoning` field for each model in `config.json`:
@@ -319,14 +353,24 @@ Set the `reasoning` field for each model in `config.json`:
 ```json
 {
   "models": [
     {
-      "id": "claude-opus-4-1-20250805",
+      "id": "claude-opus-4-1-20250805",
       "type": "anthropic",
-      "reasoning": "high"  // off/low/medium/high
+      "reasoning": "auto"  // auto/off/low/medium/high
     }
   ]
 }
 ```
+**Reasoning levels**:
+
+| Level | Behavior | Use case |
+|-------|----------|----------|
+| `auto` | Follows the client's original request parameters exactly | Let the client control reasoning |
+| `off` | Forcibly disables reasoning; removes all reasoning fields | Fast-response scenarios |
+| `low` | Light reasoning (4096 tokens) | Simple tasks |
+| `medium` | Medium reasoning (12288 tokens) | Balance of speed and quality |
+| `high` | Deep reasoning (24576 tokens) | Complex tasks |
+
 ### How often are tokens refreshed?
 
 The access token is refreshed automatically every 6 hours and is valid for 8 hours, leaving a 2-hour buffer.
@@ -346,9 +390,19 @@ Token refreshed successfully, expires at: 2025-01-XX XX:XX:XX
 
 ### Why isn't reasoning taking effect?
 
-1. Check that the `reasoning` field in the model configuration is set correctly
-2. Confirm the model type matches (anthropic models use thinking, openai models use reasoning)
-3. Check the request logs to confirm the fields were added correctly
+**If a reasoning level setting has no effect**:
+1. Check that the model's `reasoning` field is a valid value (`auto/off/low/medium/high`)
+2. Confirm the model ID matches the configuration in config.json
+3. Check the server logs to confirm the reasoning fields were processed correctly
+
+**If auto mode is set but reasoning doesn't kick in**:
+1. Confirm the client request actually includes a reasoning field (`reasoning` or `thinking`)
+2. auto mode never adds reasoning fields; it only preserves what the client sent
+3. To force reasoning, switch to a `low/medium/high` level
+
+**Reasoning field mapping**:
+- OpenAI models (`gpt-*`) → use the `reasoning` field
+- Anthropic models (`claude-*`) → use the `thinking` field
 
 ### How do I change the port?
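The Anthropic-side behavior documented above (auto passes the client's `thinking` field through untouched; low/medium/high override it with 4096/12288/24576 budget_tokens; off strips it) can be sketched as a small standalone helper. The function name `applyAnthropicReasoning` is hypothetical — the real logic lives in `transformers/request-anthropic.js` in the diff below:

```javascript
// Level -> budget_tokens mapping, as documented in the README
const BUDGET_TOKENS = { low: 4096, medium: 12288, high: 24576 };

// Hypothetical sketch of the documented dispatch, not the actual proxy code
function applyAnthropicReasoning(request, level) {
  const out = { ...request };
  if (level === 'auto') {
    // auto: forward the client's thinking field (or its absence) untouched
    return out;
  }
  if (BUDGET_TOKENS[level] !== undefined) {
    // low/medium/high: override with the configured budget
    out.thinking = { type: 'enabled', budget_tokens: BUDGET_TOKENS[level] };
  } else {
    // off (or invalid): strip any client-provided thinking field
    delete out.thinking;
  }
  return out;
}
```

For example, a client request carrying `thinking: { budget_tokens: 8192 }` comes back unchanged under `auto`, is rewritten to 24576 under `high`, and loses the field entirely under `off`.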
diff --git a/config.js b/config.js
index 44e3085..eaa13a1 100644
--- a/config.js
+++ b/config.js
@@ -56,7 +56,7 @@ export function getModelReasoning(modelId) {
     return null;
   }
   const reasoningLevel = model.reasoning.toLowerCase();
-  if (['low', 'medium', 'high'].includes(reasoningLevel)) {
+  if (['low', 'medium', 'high', 'auto'].includes(reasoningLevel)) {
     return reasoningLevel;
   }
   return null;
diff --git a/config.json b/config.json
index e3e211e..b4bad11 100644
--- a/config.json
+++ b/config.json
@@ -19,25 +19,25 @@
       "name": "Opus 4.1",
       "id": "claude-opus-4-1-20250805",
       "type": "anthropic",
-      "reasoning": "off"
+      "reasoning": "auto"
     },
     {
       "name": "Sonnet 4",
       "id": "claude-sonnet-4-20250514",
       "type": "anthropic",
-      "reasoning": "medium"
+      "reasoning": "auto"
     },
     {
       "name": "Sonnet 4.5",
       "id": "claude-sonnet-4-5-20250929",
       "type": "anthropic",
-      "reasoning": "high"
+      "reasoning": "auto"
     },
     {
       "name": "GPT-5",
       "id": "gpt-5-2025-08-07",
       "type": "openai",
-      "reasoning": "high"
+      "reasoning": "auto"
     },
     {
       "name": "GPT-5-Codex",
diff --git a/package.json b/package.json
index 39db529..d0ca984 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "droid2api",
-  "version": "1.2.2",
+  "version": "1.3.0",
   "description": "OpenAI Compatible API Proxy",
   "main": "server.js",
   "type": "module",
diff --git a/routes.js b/routes.js
index 3d968c1..d8227b9 100644
--- a/routes.js
+++ b/routes.js
@@ -238,7 +238,10 @@ async function handleDirectResponses(req, res) {
 
   // Handle the reasoning field
   const reasoningLevel = getModelReasoning(modelId);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: leave the original request's reasoning field unchanged
+    // (keep it if present, don't add one if absent)
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     modifiedRequest.reasoning = {
       effort: reasoningLevel,
       summary: 'auto'
@@ -373,7 +376,10 @@ async function handleDirectMessages(req, res) {
 
   // Handle the thinking field
   const reasoningLevel = getModelReasoning(modelId);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: leave the original request's thinking field unchanged
+    // (keep it if present, don't add one if absent)
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     const budgetTokens = {
       'low': 4096,
       'medium': 12288,
diff --git a/transformers/request-anthropic.js b/transformers/request-anthropic.js
index 1fea716..52f6f41 100644
--- a/transformers/request-anthropic.js
+++ b/transformers/request-anthropic.js
@@ -113,7 +113,14 @@ export function transformToAnthropic(openaiRequest) {
 
   // Handle thinking field based on model configuration
   const reasoningLevel = getModelReasoning(openaiRequest.model);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: preserve original request's thinking field exactly as-is
+    if (openaiRequest.thinking !== undefined) {
+      anthropicRequest.thinking = openaiRequest.thinking;
+    }
+    // If original request has no thinking field, don't add one
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
+    // Specific level: override with model configuration
     const budgetTokens = {
       'low': 4096,
       'medium': 12288,
@@ -125,7 +132,7 @@ export function transformToAnthropic(openaiRequest) {
       budget_tokens: budgetTokens[reasoningLevel]
     };
   } else {
-    // If reasoning is off or invalid, explicitly remove thinking field
+    // Off or invalid: explicitly remove thinking field
     // This ensures any thinking field from the original request is deleted
     delete anthropicRequest.thinking;
   }
@@ -179,7 +186,10 @@ export function getAnthropicHeaders(authHeader, clientHeaders = {}, isStreaming
 
   // Handle thinking beta based on reasoning configuration
   const thinkingBeta = 'interleaved-thinking-2025-05-14';
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: don't modify anthropic-beta header, preserve original
+    // betaValues remain unchanged from client headers
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     // Add thinking beta if not already present
     if (!betaValues.includes(thinkingBeta)) {
       betaValues.push(thinkingBeta);
diff --git a/transformers/request-openai.js b/transformers/request-openai.js
index 2d6a44c..6490a54 100644
--- a/transformers/request-openai.js
+++ b/transformers/request-openai.js
@@ -94,14 +94,20 @@ export function transformToOpenAI(openaiRequest) {
 
   // Handle reasoning field based on model configuration
   const reasoningLevel = getModelReasoning(openaiRequest.model);
-  if (reasoningLevel) {
-    // Add reasoning field based on model configuration
+  if (reasoningLevel === 'auto') {
+    // Auto mode: preserve original request's reasoning field exactly as-is
+    if (openaiRequest.reasoning !== undefined) {
+      targetRequest.reasoning = openaiRequest.reasoning;
+    }
+    // If original request has no reasoning field, don't add one
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
+    // Specific level: override with model configuration
     targetRequest.reasoning = {
       effort: reasoningLevel,
       summary: 'auto'
     };
   } else {
-    // If reasoning is off or invalid, explicitly remove reasoning field
+    // Off or invalid: explicitly remove reasoning field
     // This ensures any reasoning field from the original request is deleted
     delete targetRequest.reasoning;
   }
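The OpenAI-side dispatch introduced in `transformers/request-openai.js` above follows the same three-way split: auto passes the client's `reasoning` field through, low/medium/high override it with `{ effort, summary: 'auto' }`, and anything else (off or invalid) strips it. A minimal standalone sketch, with the hypothetical name `applyOpenAIReasoning`:

```javascript
// Hypothetical sketch of the three-way dispatch, not the actual proxy code
function applyOpenAIReasoning(request, level) {
  const out = { ...request };
  if (level === 'auto') {
    // auto: keep the client's reasoning field exactly as sent
    return out;
  }
  if (['low', 'medium', 'high'].includes(level)) {
    // fixed level: override with the configured effort
    out.reasoning = { effort: level, summary: 'auto' };
  } else {
    // off (or invalid): strip any client-provided reasoning field
    delete out.reasoning;
  }
  return out;
}
```

Putting the `auto` check first keeps the passthrough path from ever touching the request body, which is the whole point of the new level.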