Upgrade to v1.3.0: add auto reasoning mode and expand the reasoning-level documentation

Feature updates:
- New `auto` reasoning level that fully follows the client's original request parameters
- Five reasoning levels supported: auto/off/low/medium/high
- auto mode is zero-intervention: reasoning fields and the anthropic-beta header are never modified
- All models except gpt-5-codex now default to auto mode

Documentation:
- Updated the core feature overview to highlight intelligent reasoning-level control
- Added a detailed description of auto mode and its use cases
- Added a reasoning-level comparison table and configuration examples
- Expanded the FAQ with scenario-based answers to reasoning questions
- Documented the reasoning field mapping between OpenAI and Anthropic models

Implementation:
- Updated the getModelReasoning function to accept the auto option
- Added auto-mode handling to all transformers
- Added auto support to the direct-forward endpoints in routes.js
- Ensured headers and request bodies pass through untouched in auto mode
**README.md** · 94 lines changed
```diff
@@ -11,12 +11,13 @@ An OpenAI-compatible API proxy server for unified access to different LLM models.
 - **Smart priority** - FACTORY_API_KEY > refresh_token > client authorization
 - **Fault-tolerant startup** - starts without any auth configuration and keeps running, relying on client authorization
 
-### 🧠 Model Reasoning Capability Levels
+### 🧠 Intelligent Reasoning Level Control
-- **Four reasoning levels** - off/low/medium/high, for precise control over model thinking depth
+- **Five reasoning levels** - auto/off/low/medium/high, for flexible control over reasoning behavior
+- **auto mode** - fully follows the client's original request; no reasoning parameters are modified
+- **Fixed levels** - off/low/medium/high force-override the client's reasoning settings
 - **OpenAI models** - the reasoning field is injected automatically; the effort parameter controls reasoning intensity
 - **Anthropic models** - the thinking field and budget_tokens are configured automatically (4096/12288/24576)
-- **Smart beta header management** - reasoning-related flags in the anthropic-beta field are added/removed automatically
+- **Smart header management** - anthropic-beta reasoning flags are added/removed based on the reasoning level
+- **Config-driven** - adjust each model's reasoning level flexibly via config.json
 
 ### 🚀 Server / Docker Deployment
 - **Local server** - quick start via npm start
```
````diff
@@ -103,34 +104,37 @@ export DROID_REFRESH_KEY="your_refresh_token_here"
 
 #### Reasoning Level Configuration
 
-Each model supports four reasoning levels:
+Each model supports five reasoning levels:
 
-- **`off`** - disable reasoning and use standard responses
+- **`auto`** - follow the client's original request; no reasoning parameters are modified
+- **`off`** - force reasoning off and delete all reasoning fields
 - **`low`** - light reasoning (Anthropic: 4096 tokens, OpenAI: low effort)
 - **`medium`** - medium reasoning (Anthropic: 12288 tokens, OpenAI: medium effort)
 - **`high`** - deep reasoning (Anthropic: 24576 tokens, OpenAI: high effort)
 
 **For Anthropic models (Claude)**:
 ```json
 {
   "name": "Claude Sonnet 4.5",
   "id": "claude-sonnet-4-5-20250929",
   "type": "anthropic",
-  "reasoning": "high"
+  "reasoning": "auto" // recommended: let the client control reasoning
 }
 ```
-The thinking field and anthropic-beta header are added automatically; budget_tokens is set according to the level.
+- `auto`: the client's thinking field is kept; the anthropic-beta header is not modified
+- `low/medium/high`: the thinking field and anthropic-beta header are added automatically; budget_tokens is set according to the level
 
 **For OpenAI models (GPT)**:
 ```json
 {
-  "name": "GPT-5-Codex",
-  "id": "gpt-5-codex",
+  "name": "GPT-5",
+  "id": "gpt-5-2025-08-07",
   "type": "openai",
-  "reasoning": "medium"
+  "reasoning": "auto" // recommended: let the client control reasoning
 }
 ```
-The reasoning field is added automatically; the effort parameter matches the configured level.
+- `auto`: the client's reasoning field is left unchanged
+- `low/medium/high`: the reasoning field is added automatically with effort set to the corresponding level
 
 ## Usage
````
````diff
@@ -311,6 +315,36 @@ droid2api fully respects the client's stream parameter:
 - **`"stream": false`** - disable streaming and return the complete result when ready
 - **stream unset** - the server decides the default behavior; no forced conversion
 
+### What is auto reasoning mode?
+
+`auto` is a reasoning level added in v1.3.0 that fully follows the client's original request:
+
+**Behavior**:
+- 🎯 **Zero intervention** - no reasoning-related fields are added, removed, or modified
+- 🔄 **Full pass-through** - whatever the client sends is forwarded as-is
+- 🛡️ **Header protection** - reasoning-related headers such as anthropic-beta are left untouched
+
+**Use cases**:
+- the client needs full control over reasoning parameters
+- behavior must stay 100% consistent with the original API
+- different clients have different reasoning needs
+
+**Example comparison**:
+```bash
+# Client request containing a reasoning field
+{
+  "model": "claude-opus-4-1-20250805",
+  "reasoning": "auto",  // configured as auto
+  "messages": [...],
+  "thinking": {"type": "enabled", "budget_tokens": 8192}
+}
+
+# auto mode: the client's settings are fully preserved
+→ the thinking field is forwarded as-is, without modification
+
+# If configured as "high": it is overridden to {"type": "enabled", "budget_tokens": 24576}
+```
+
 ### How do I configure reasoning levels?
 
 Set the `reasoning` field for each model in `config.json`:
````
````diff
@@ -319,14 +353,24 @@ droid2api fully respects the client's stream parameter:
 {
   "models": [
     {
       "id": "claude-opus-4-1-20250805",
       "type": "anthropic",
-      "reasoning": "high" // off/low/medium/high
+      "reasoning": "auto" // auto/off/low/medium/high
     }
   ]
 }
 ```
+
+**Reasoning levels**:
+
+| Level | Behavior | Use case |
+|-------|----------|----------|
+| `auto` | fully follows the client's original request parameters | let the client control reasoning itself |
+| `off` | force-disables reasoning; deletes all reasoning fields | fast-response scenarios |
+| `low` | light reasoning (4096 tokens) | simple tasks |
+| `medium` | medium reasoning (12288 tokens) | balancing speed and quality |
+| `high` | deep reasoning (24576 tokens) | complex tasks |
 
 ### How often are tokens refreshed?
 
 The access token is refreshed automatically every 6 hours. The refresh token is valid for 8 hours, leaving a 2-hour buffer.
````
```diff
@@ -346,9 +390,19 @@ Token refreshed successfully, expires at: 2025-01-XX XX:XX:XX
 
 ### Why isn't reasoning taking effect?
 
-1. Check that the `reasoning` field in the model configuration is set correctly
-2. Confirm the model type matches (anthropic models use thinking, openai models use reasoning)
-3. Check the request logs to confirm the field was added correctly
+**If a configured reasoning level has no effect**:
+1. Check that the model's `reasoning` field is a valid value (`auto/off/low/medium/high`)
+2. Confirm the model ID matches the entry in config.json
+3. Check the server logs to confirm the reasoning fields were processed correctly
+
+**If you are using auto mode and reasoning does not take effect**:
+1. Confirm the client request actually contains a reasoning field (`reasoning` or `thinking`)
+2. auto mode never adds reasoning fields; it only preserves what the client sends
+3. To force reasoning on, switch to a `low/medium/high` level
+
+**Reasoning field mapping**:
+- OpenAI models (`gpt-*`) → use the `reasoning` field
+- Anthropic models (`claude-*`) → use the `thinking` field
 
 ### How do I change the port?
```
```diff
@@ -56,7 +56,7 @@ export function getModelReasoning(modelId) {
     return null;
   }
   const reasoningLevel = model.reasoning.toLowerCase();
-  if (['low', 'medium', 'high'].includes(reasoningLevel)) {
+  if (['low', 'medium', 'high', 'auto'].includes(reasoningLevel)) {
     return reasoningLevel;
   }
   return null;
```
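The updated check can be exercised standalone. This is a minimal sketch in which the project's config.json lookup is replaced by an in-memory map; the model entries below are illustrative, not the shipped configuration:

```javascript
// Sketch of the level lookup: returns a recognized reasoning level or null.
// The models map stands in for the project's config.json lookup.
const models = {
  'claude-opus-4-1-20250805': { reasoning: 'auto' },
  'gpt-5-codex': { reasoning: 'Medium' },
  'legacy-model': { reasoning: 'none' },
};

function getModelReasoning(modelId) {
  const model = models[modelId];
  if (!model || typeof model.reasoning !== 'string') {
    return null;
  }
  const reasoningLevel = model.reasoning.toLowerCase();
  // Note that 'off' is treated like any unrecognized value: the function
  // returns null, and the callers fall through to the branch that deletes
  // reasoning fields from the request.
  if (['low', 'medium', 'high', 'auto'].includes(reasoningLevel)) {
    return reasoningLevel;
  }
  return null;
}

console.log(getModelReasoning('claude-opus-4-1-20250805')); // 'auto'
console.log(getModelReasoning('gpt-5-codex'));              // 'medium' (case-insensitive)
console.log(getModelReasoning('legacy-model'));             // null
```

The `toLowerCase()` call means config values are case-insensitive, so `"Medium"` and `"medium"` behave identically.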
```diff
@@ -19,25 +19,25 @@
       "name": "Opus 4.1",
       "id": "claude-opus-4-1-20250805",
       "type": "anthropic",
-      "reasoning": "off"
+      "reasoning": "auto"
     },
     {
       "name": "Sonnet 4",
       "id": "claude-sonnet-4-20250514",
       "type": "anthropic",
-      "reasoning": "medium"
+      "reasoning": "auto"
     },
     {
       "name": "Sonnet 4.5",
       "id": "claude-sonnet-4-5-20250929",
       "type": "anthropic",
-      "reasoning": "high"
+      "reasoning": "auto"
     },
     {
       "name": "GPT-5",
       "id": "gpt-5-2025-08-07",
       "type": "openai",
-      "reasoning": "high"
+      "reasoning": "auto"
     },
     {
       "name": "GPT-5-Codex",
```
```diff
@@ -1,6 +1,6 @@
 {
   "name": "droid2api",
-  "version": "1.2.2",
+  "version": "1.3.0",
   "description": "OpenAI Compatible API Proxy",
   "main": "server.js",
   "type": "module",
```
**routes.js** · 10 lines changed

```diff
@@ -238,7 +238,10 @@ async function handleDirectResponses(req, res) {
 
   // Handle the reasoning field
   const reasoningLevel = getModelReasoning(modelId);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: keep the original request's reasoning field unchanged
+    // (preserve it if present; do not add one if absent)
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     modifiedRequest.reasoning = {
       effort: reasoningLevel,
       summary: 'auto'
```
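The three branches above can be tried in isolation. `applyReasoning` below is a hypothetical helper name invented for this sketch (the project inlines the logic in the handler); it shows how each configured level rewrites an OpenAI-style request body:

```javascript
// Hypothetical helper mirroring the routes.js branches: 'auto' passes the
// body through, fixed levels override, anything else (off/null) deletes.
function applyReasoning(request, reasoningLevel) {
  const modified = { ...request };
  if (reasoningLevel === 'auto') {
    // Auto mode: leave any client-supplied reasoning field untouched.
  } else if (['low', 'medium', 'high'].includes(reasoningLevel)) {
    modified.reasoning = { effort: reasoningLevel, summary: 'auto' };
  } else {
    // off or invalid: strip the client's reasoning field entirely.
    delete modified.reasoning;
  }
  return modified;
}

const clientRequest = { model: 'gpt-5-2025-08-07', reasoning: { effort: 'low' } };
console.log(applyReasoning(clientRequest, 'auto').reasoning);  // { effort: 'low' } (preserved)
console.log(applyReasoning(clientRequest, 'high').reasoning);  // { effort: 'high', summary: 'auto' }
console.log(applyReasoning(clientRequest, null).reasoning);    // undefined (deleted)
```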
```diff
@@ -373,7 +376,10 @@ async function handleDirectMessages(req, res) {
 
   // Handle the thinking field
   const reasoningLevel = getModelReasoning(modelId);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: keep the original request's thinking field unchanged
+    // (preserve it if present; do not add one if absent)
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     const budgetTokens = {
       'low': 4096,
       'medium': 12288,
```
```diff
@@ -113,7 +113,14 @@ export function transformToAnthropic(openaiRequest) {
 
   // Handle thinking field based on model configuration
   const reasoningLevel = getModelReasoning(openaiRequest.model);
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: preserve original request's thinking field exactly as-is
+    if (openaiRequest.thinking !== undefined) {
+      anthropicRequest.thinking = openaiRequest.thinking;
+    }
+    // If original request has no thinking field, don't add one
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
+    // Specific level: override with model configuration
     const budgetTokens = {
       'low': 4096,
       'medium': 12288,
@@ -125,7 +132,7 @@ export function transformToAnthropic(openaiRequest) {
       budget_tokens: budgetTokens[reasoningLevel]
     };
   } else {
-    // If reasoning is off or invalid, explicitly remove thinking field
+    // Off or invalid: explicitly remove thinking field
     // This ensures any thinking field from the original request is deleted
     delete anthropicRequest.thinking;
   }
```
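The thinking-field logic condenses to a small standalone sketch; `applyThinking` and `BUDGET_TOKENS` are names invented here for illustration, while the budget values come straight from the hunk above:

```javascript
// Illustrative condensation of the transformer branches: auto preserves the
// client's thinking field, fixed levels override it with the documented
// budget_tokens, and off/invalid deletes it.
const BUDGET_TOKENS = { low: 4096, medium: 12288, high: 24576 };

function applyThinking(request, reasoningLevel) {
  const out = { ...request };
  if (reasoningLevel === 'auto') {
    return out; // pass the client's thinking field through as-is
  }
  if (Object.hasOwn(BUDGET_TOKENS, reasoningLevel)) {
    out.thinking = { type: 'enabled', budget_tokens: BUDGET_TOKENS[reasoningLevel] };
    return out;
  }
  delete out.thinking; // off or invalid
  return out;
}

const req = {
  model: 'claude-sonnet-4-5-20250929',
  thinking: { type: 'enabled', budget_tokens: 8192 }
};
console.log(applyThinking(req, 'auto').thinking.budget_tokens); // 8192
console.log(applyThinking(req, 'high').thinking.budget_tokens); // 24576
console.log(applyThinking(req, 'off').thinking);                // undefined
```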
```diff
@@ -179,7 +186,10 @@ export function getAnthropicHeaders(authHeader, clientHeaders = {}, isStreaming
 
   // Handle thinking beta based on reasoning configuration
   const thinkingBeta = 'interleaved-thinking-2025-05-14';
-  if (reasoningLevel) {
+  if (reasoningLevel === 'auto') {
+    // Auto mode: don't modify anthropic-beta header, preserve original
+    // betaValues remain unchanged from client headers
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
     // Add thinking beta if not already present
     if (!betaValues.includes(thinkingBeta)) {
       betaValues.push(thinkingBeta);
```
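The header side works the same way. `adjustBetaHeader` below is a hypothetical standalone version; the removal branch for `off` is not visible in this hunk and is assumed from the README's "add/remove" description:

```javascript
// anthropic-beta carries a comma-separated list of beta flags. Fixed
// reasoning levels ensure the thinking beta is present, off removes it
// (assumed behavior), and auto leaves the client's value untouched.
const THINKING_BETA = 'interleaved-thinking-2025-05-14';

function adjustBetaHeader(headerValue, reasoningLevel) {
  const betaValues = (headerValue || '')
    .split(',')
    .map(v => v.trim())
    .filter(Boolean);
  if (reasoningLevel === 'auto') {
    // Auto mode: leave the client's flags exactly as received.
  } else if (['low', 'medium', 'high'].includes(reasoningLevel)) {
    if (!betaValues.includes(THINKING_BETA)) {
      betaValues.push(THINKING_BETA);
    }
  } else {
    const idx = betaValues.indexOf(THINKING_BETA);
    if (idx !== -1) betaValues.splice(idx, 1);
  }
  return betaValues.join(',');
}

console.log(adjustBetaHeader('some-other-beta', 'high'));
// 'some-other-beta,interleaved-thinking-2025-05-14'
console.log(adjustBetaHeader('interleaved-thinking-2025-05-14', null)); // ''
console.log(adjustBetaHeader('some-other-beta', 'auto')); // 'some-other-beta'
```

`some-other-beta` is a placeholder flag; any unrelated flags the client sends survive every branch.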
```diff
@@ -94,14 +94,20 @@ export function transformToOpenAI(openaiRequest) {
 
   // Handle reasoning field based on model configuration
   const reasoningLevel = getModelReasoning(openaiRequest.model);
-  if (reasoningLevel) {
-    // Add reasoning field based on model configuration
+  if (reasoningLevel === 'auto') {
+    // Auto mode: preserve original request's reasoning field exactly as-is
+    if (openaiRequest.reasoning !== undefined) {
+      targetRequest.reasoning = openaiRequest.reasoning;
+    }
+    // If original request has no reasoning field, don't add one
+  } else if (reasoningLevel && ['low', 'medium', 'high'].includes(reasoningLevel)) {
+    // Specific level: override with model configuration
     targetRequest.reasoning = {
       effort: reasoningLevel,
       summary: 'auto'
     };
   } else {
-    // If reasoning is off or invalid, explicitly remove reasoning field
+    // Off or invalid: explicitly remove reasoning field
     // This ensures any reasoning field from the original request is deleted
     delete targetRequest.reasoning;
  }
```