When Antigravity returns 429, throw immediately instead of waiting for the
server-provided retry delay (which can be 10+ minutes). This lets clawdbot
quickly rotate to another account.
This patch was previously in PR #454 but was accidentally removed in 0dbb569
when bumping to pi-ai 0.40.0. The upstream release did NOT include this fix.
Context: Antigravity rate limits cause pi-ai to sleep for the full retry
delay inside the request, blocking the thread. Clawdbot's timeout would
eventually fire, but waiting 10+ minutes is unacceptable UX.
Bundled with the empty error message filter since both handle 429 recovery.
When 429/500 errors occur during tool execution, empty assistant messages
with stopReason='error' and content=[] get persisted to the session file.
These break the tool_use -> tool_result chain that Claude/Gemini require:
- Claude expects every tool_use block to have a matching tool_result
- Empty error messages inserted mid-sequence violate this invariant
- Results in: 'tool_use ids were found without tool_result blocks'
This patch filters out empty error messages when building session context,
allowing sessions to recover gracefully from transient API errors.
Evidence from production:
- 113 of 170 sessions had empty error messages
- Session 30764430 demonstrated recovery: 429 at 14:30:11 IST,
resumed successfully at 14:30:22, completed at 14:30:34
Sorry Peter, one more patch! 🙈
- Add Microsoft 365 Agents SDK packages (@microsoft/agents-hosting,
@microsoft/agents-hosting-express, @microsoft/agents-hosting-extensions-teams)
- Add MSTeamsConfig type and zod schema
- Create src/msteams/ provider with monitor, token, send, probe
- Wire provider into gateway (server-providers.ts, server.ts)
- Add msteams to all provider type unions (hooks, queue, cron, etc.)
- Update implementation guide with new SDK and progress
Port Antigravity payload enhancements from CLIProxyAPI v6.6.89:
- Add ANTIGRAVITY_SYSTEM_INSTRUCTION with agent identity/guidelines
- Inject systemInstruction with role 'user' for Antigravity requests
- Add requestType: 'agent' to wrapped request body
- Update userAgent to 'antigravity' for Antigravity requests
This fixes RESOURCE_EXHAUSTED (429) errors when using Antigravity.
Adapted from: https://github.com/NoeFabris/opencode-antigravity-auth/pull/137
Reference: https://github.com/router-for-me/CLIProxyAPI/commit/67985d8
Add automatic thinking mode support for Z.AI GLM-4.x models:
- GLM-4.7: Preserved thinking (clear_thinking: false)
- GLM-4.5/4.6: Interleaved thinking (clear_thinking: true)
Uses Z.AI Cloud API format: thinking: { type: "enabled", clear_thinking: boolean }
Includes patches for pi-ai, pi-agent-core, and pi-coding-agent to pass
extraParams through the stream pipeline. User can override via config
or disable via --thinking off.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, grammY's default bot.start() processed updates sequentially,
blocking all Telegram messages while one was being handled. This made
maxConcurrent settings ineffective for Telegram.
Now uses @grammyjs/runner which processes updates concurrently, matching
the behavior of Discord (Promise.all) and WhatsApp (fire-and-forget).
Benefits:
- Ack reactions (👀) appear immediately, not after queue clears
- Multiple chats can be processed in parallel
- maxConcurrent setting now works correctly for Telegram
- Long-running tool calls no longer block other conversations
This commit fixes several issues with multi-account OAuth rotation that
were causing slow responses and inefficient account cycling.
## Changes
### 1. Fix usageStats race condition (auth-profiles.ts)
The `markAuthProfileUsed`, `markAuthProfileCooldown`, `markAuthProfileGood`,
and `clearAuthProfileCooldown` functions were using a stale in-memory store
passed as a parameter. Long-running sessions would overwrite usageStats
updates from concurrent sessions when saving.
**Fix:** Re-read the store from disk before each update to get fresh
usageStats from other sessions, then merge the update.
### 2. Capture AbortError from waitForCompactionRetry (pi-embedded-runner.ts)
When a request timed out, `session.abort()` was called which throws an
`AbortError`. The code structure was:
```javascript
try {
await session.prompt(params.prompt);
} catch (err) {
promptError = err; // Catches AbortError here
}
await waitForCompactionRetry(); // But THIS also throws AbortError!
```
The second `AbortError` from `waitForCompactionRetry()` escaped and
bypassed the rotation/fallback logic entirely.
**Fix:** Wrap `waitForCompactionRetry()` in its own try/catch to capture
the error as `promptError`, enabling proper timeout handling.
Root cause analysis and fix proposed by @erikpr1994 in #313.
Fixes#313
### 3. Fail fast on 429 rate limits (pi-ai patch)
The pi-ai library was retrying 429 errors up to 3 times with exponential
backoff before throwing. This meant a rate-limited account would waste
30+ seconds retrying before our rotation code could try the next account.
**Fix:** Patch google-gemini-cli.js to:
- Throw immediately on first 429 (no retries)
- Not catch and retry 429 errors in the network error handler
This allows the caller to rotate to the next account instantly on rate limit.
Note: We submitted this fix upstream (https://github.com/badlogic/pi-mono/pull/504)
but it was closed without merging. Keeping as a local patch for now.
## Testing
With 6 Antigravity accounts configured:
- Accounts rotate properly based on lastUsed (round-robin)
- 429s trigger immediate rotation to next account
- usageStats persist correctly across concurrent sessions
- Cooldown tracking works as expected
## Before/After
**Before:** Multiple 429 retries on same account, 30-90s delays
**After:** Instant rotation on 429, responses in seconds
* fix: Gemini stops working after one message in a session
* fix: small issue in test file
* test: cover google role-merge behavior
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* cron: skip delivery for HEARTBEAT_OK responses
When an isolated cron job has deliver:true, skip message delivery if the
response is just HEARTBEAT_OK (or contains HEARTBEAT_OK at edges with
short remaining content <= 30 chars). This allows cron jobs to silently
ack when nothing to report but still deliver actual content when there
is something meaningful to say.
Media is still delivered even if text is HEARTBEAT_OK, since the
presence of media indicates there's something to share.
* fix(heartbeat): make ack padding configurable
* chore(deps): update to latest
---------
Co-authored-by: Josh Lehman <josh@martian.engineering>