clawdbot/docs/concepts/model-failover.md
2026-01-16 00:25:49 +00:00


Summary: How Clawdbot rotates auth profiles and falls back across models

Read when:

  • Diagnosing auth profile rotation, cooldowns, or model fallback behavior
  • Updating failover rules for auth profiles or models

Model failover

Clawdbot handles failures in two stages:

  1. Auth profile rotation within the current provider.
  2. Model fallback to the next model in agents.defaults.model.fallbacks.

This doc explains the runtime rules and the data that backs them.

Auth storage (keys + OAuth)

Clawdbot uses auth profiles for both API keys and OAuth tokens.

  • Secrets live in ~/.clawdbot/agents/<agentId>/agent/auth-profiles.json (legacy: ~/.clawdbot/agent/auth-profiles.json).
  • Config auth.profiles / auth.order are metadata + routing only (no secrets).
  • Legacy import-only OAuth file: ~/.clawdbot/credentials/oauth.json (imported into auth-profiles.json on first use).

More detail: /concepts/oauth

Credential types:

  • type: "api_key" → { provider, key }
  • type: "oauth" → { provider, access, refresh, expires, email? } (+ projectId/enterpriseUrl for some providers)

Profile IDs

OAuth logins create distinct profiles so multiple accounts can coexist.

  • Default: provider:default when no email is available.
  • OAuth with email: provider:<email> (for example google-antigravity:user@gmail.com).

Profiles live in ~/.clawdbot/agents/<agentId>/agent/auth-profiles.json under profiles.
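
The naming rule above can be sketched as a small helper. This is an illustration only: the function name profileIdFor is hypothetical and not taken from the Clawdbot codebase, and the provider name "anthropic" is just an example value.

```typescript
// Hypothetical helper (not Clawdbot's actual API): builds a profile ID
// from a provider name and an optional account email.
function profileIdFor(provider: string, email?: string): string {
  // Use "default" when no email is available, as described above.
  const suffix = email && email.trim().length > 0 ? email.trim() : "default";
  return `${provider}:${suffix}`;
}

console.log(profileIdFor("google-antigravity", "user@gmail.com"));
console.log(profileIdFor("anthropic"));
```

Because the email is part of the ID, two OAuth logins for the same provider produce two distinct keys under profiles.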

Rotation order

When a provider has multiple profiles, Clawdbot chooses an order like this:

  1. Explicit config: auth.order[provider] (if set).
  2. Configured profiles: auth.profiles filtered by provider.
  3. Stored profiles: entries in auth-profiles.json for the provider.

If no explicit order is configured, Clawdbot uses a round-robin order:

  • Primary key: profile type (OAuth before API keys).
  • Secondary key: usageStats.lastUsed (oldest first, within each type).
  • Cooldown/disabled profiles are moved to the end, ordered by soonest expiry.
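
A minimal sketch of the round-robin ordering rules listed above, assuming a simplified in-memory shape. The ProfileEntry interface and the orderProfiles function are hypothetical names for illustration; only lastUsed, cooldownUntil, and disabledUntil mirror fields shown in this doc.

```typescript
type ProfileType = "oauth" | "api_key";

// Hypothetical, simplified view of a profile plus its usage stats.
interface ProfileEntry {
  id: string;
  type: ProfileType;
  lastUsed?: number;      // epoch ms
  cooldownUntil?: number; // epoch ms
  disabledUntil?: number; // epoch ms
}

// Returns the time until which the profile is unavailable, or 0 if usable now.
function unavailableUntil(p: ProfileEntry, now: number): number {
  const until = Math.max(p.cooldownUntil ?? 0, p.disabledUntil ?? 0);
  return until > now ? until : 0;
}

function orderProfiles(profiles: ProfileEntry[], now: number): ProfileEntry[] {
  const typeRank = (t: ProfileType): number => (t === "oauth" ? 0 : 1);
  // Usable profiles: OAuth before API keys, then least recently used first.
  const available = profiles
    .filter((p) => unavailableUntil(p, now) === 0)
    .sort(
      (a, b) =>
        typeRank(a.type) - typeRank(b.type) ||
        (a.lastUsed ?? 0) - (b.lastUsed ?? 0),
    );
  // Cooldown/disabled profiles go last, soonest expiry first.
  const blocked = profiles
    .filter((p) => unavailableUntil(p, now) > 0)
    .sort((a, b) => unavailableUntil(a, now) - unavailableUntil(b, now));
  return [...available, ...blocked];
}
```

Treating a never-used profile as lastUsed = 0 is an assumption of this sketch; it places unused profiles first within their type.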

Session stickiness (cache-friendly)

Clawdbot pins the chosen auth profile per session to keep provider caches warm. It does not rotate on every request. The pinned profile is reused until:

  • the session is reset (/new / /reset)
  • a compaction completes (compaction count increments)
  • the profile enters cooldown or is disabled

Manual selection via /model …@<profileId> sets a user override for that session and is not auto-rotated until a new session starts.
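
The stickiness rules can be read as a single decision: keep the pinned profile or pick a new one. The sketch below is one possible reading, not the actual implementation. SessionPin and keepPinnedProfile are hypothetical names, a session reset is modeled as the pin being absent, and it assumes a user override stays pinned even when the profile is in cooldown.

```typescript
// Hypothetical record of what a session has pinned.
interface SessionPin {
  profileId: string;
  compactionCount: number; // compaction count when the profile was pinned
  userOverride: boolean;   // set via a manual /model selection
}

// Returns true when the pinned profile should be reused for the next request.
function keepPinnedProfile(
  pin: SessionPin | undefined,
  currentCompactionCount: number,
  profileUnavailable: boolean,
): boolean {
  // No pin: the session is new or was reset.
  if (!pin) return false;
  // Assumption: a user override is never auto-rotated within the session.
  if (pin.userOverride) return true;
  // A completed compaction increments the count and releases the pin.
  if (currentCompactionCount !== pin.compactionCount) return false;
  // Cooldown/disabled profiles release the pin.
  return !profileUnavailable;
}
```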

Why OAuth can “look lost”

If you have both an OAuth profile and an API key profile for the same provider, round-robin can switch between them across messages unless pinned. To force a single profile:

  • Pin with auth.order[provider] = ["provider:profileId"], or
  • Use a per-session override via /model … with a profile override (when supported by your UI/chat surface).

Cooldowns

When a profile fails due to auth/rate-limit errors (or a timeout that looks like rate limiting), Clawdbot marks it in cooldown and moves to the next profile. Format/invalid-request errors (for example Cloud Code Assist tool call ID validation failures) are treated as failover-worthy and use the same cooldowns.

Cooldowns use exponential backoff:

  • 1 minute
  • 5 minutes
  • 25 minutes
  • 1 hour (cap)
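
The listed steps are consistent with a base of 1 minute multiplied by 5 per consecutive failure and capped at 1 hour. The sketch below encodes that reading. The factor of 5 is inferred from the steps above, and both the function name cooldownMs and the link to errorCount are assumptions rather than confirmed implementation details.

```typescript
const MINUTE_MS = 60_000;
const COOLDOWN_CAP_MS = 60 * MINUTE_MS; // 1 hour cap

// Hypothetical helper: cooldown duration for the Nth consecutive failure.
function cooldownMs(errorCount: number): number {
  const n = Math.max(1, Math.floor(errorCount));
  // 5^3 minutes already exceeds the 1 hour cap, so clamp the exponent there.
  const exponent = Math.min(n - 1, 3);
  return Math.min(COOLDOWN_CAP_MS, MINUTE_MS * 5 ** exponent);
}

// Durations in minutes for failures 1 through 5.
console.log([1, 2, 3, 4, 5].map((n) => cooldownMs(n) / MINUTE_MS));
```

Under this reading a profile would get cooldownUntil = now + cooldownMs(errorCount) after each failure.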

State is stored in auth-profiles.json under usageStats:

{
  "usageStats": {
    "provider:profile": {
      "lastUsed": 1736160000000,
      "cooldownUntil": 1736160600000,
      "errorCount": 2
    }
  }
}

Billing disables

Billing/credit failures (for example “insufficient credits” / “credit balance too low”) are treated as failover-worthy, but they're usually not transient. Instead of a short cooldown, Clawdbot marks the profile as disabled (with a longer backoff) and rotates to the next profile/provider.

State is stored in auth-profiles.json:

{
  "usageStats": {
    "provider:profile": {
      "disabledUntil": 1736178000000,
      "disabledReason": "billing"
    }
  }
}

Defaults:

  • Billing backoff starts at 5 hours, doubles per billing failure, and caps at 24 hours.
  • Backoff counters reset if the profile hasn't failed for 24 hours (configurable).
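
The billing defaults above can be sketched as a doubling schedule. The function names are hypothetical. The baseHours and maxHours parameters stand in for the auth.cooldowns.billingBackoffHours and auth.cooldowns.billingMaxHours settings, but how Clawdbot actually reads those settings is not shown here.

```typescript
const HOUR_MS = 3_600_000;

// Hypothetical helper: disable duration in hours for the Nth billing failure.
function billingDisableHours(
  failureCount: number,
  baseHours = 5,
  maxHours = 24,
): number {
  const n = Math.max(1, Math.floor(failureCount));
  // Clamp the number of doublings so the power stays small for large counts.
  const doublings = Math.min(n - 1, 10);
  return Math.min(maxHours, baseHours * 2 ** doublings);
}

// Hypothetical helper: the disabledUntil timestamp for a failure seen at nowMs.
function disabledUntil(nowMs: number, failureCount: number): number {
  return nowMs + billingDisableHours(failureCount) * HOUR_MS;
}

// Hours for billing failures 1 through 4 with the default settings.
console.log([1, 2, 3, 4].map((n) => billingDisableHours(n)));
```

With the defaults, a first billing failure at 1736160000000 gives a disabledUntil of 1736178000000, which matches the five-hour gap implied by the example state above.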

Model fallback

If all profiles for a provider fail, Clawdbot moves to the next model in agents.defaults.model.fallbacks. This applies to auth failures, rate limits, and timeouts that exhausted profile rotation (other errors do not advance fallback).

When a run starts with a model override (hooks or CLI), fallbacks still end at agents.defaults.model.primary after trying any configured fallbacks.
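
One way to picture the resulting order is as a list of candidate models. The sketch below is an assumption-laden illustration: candidateModels is a hypothetical function, the model names are placeholders, and dropping duplicates is a choice made for this sketch rather than a documented behavior.

```typescript
// Hypothetical helper: the order in which models would be tried.
// primary and fallbacks stand in for agents.defaults.model.primary
// and agents.defaults.model.fallbacks.
function candidateModels(
  primary: string,
  fallbacks: string[],
  override?: string,
): string[] {
  const ordered =
    override && override !== primary
      ? [override, ...fallbacks, primary] // an override run still ends at primary
      : [primary, ...fallbacks];
  // Keep only the first occurrence of each model.
  return ordered.filter((model, index) => ordered.indexOf(model) === index);
}

console.log(candidateModels("model-a", ["model-b", "model-c"]));
console.log(candidateModels("model-a", ["model-b", "model-c"], "model-x"));
```

Each entry in the list is only reached after every auth profile for the previous model's provider has been exhausted.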

See Gateway configuration for:

  • auth.profiles / auth.order
  • auth.cooldowns.billingBackoffHours / auth.cooldowns.billingBackoffHoursByProvider
  • auth.cooldowns.billingMaxHours / auth.cooldowns.failureWindowHours
  • agents.defaults.model.primary / agents.defaults.model.fallbacks
  • agents.defaults.imageModel routing

See Models for the broader model selection and fallback overview.