| summary | read_when |
|---|---|
| Runbook for the Gateway daemon, lifecycle, and operations | |
Gateway (daemon) runbook
Last updated: 2025-12-09
What it is
- The always-on process that owns the single Baileys/Telegram connection and the control/event plane.
- Replaces the legacy `gateway` command. CLI entry point: `clawdbot gateway`.
- Runs until stopped; exits non-zero on fatal errors so the supervisor restarts it.
How to run (local)
clawdbot gateway --port 18789
# for full debug/trace logs in stdio:
clawdbot gateway --port 18789 --verbose
# if the port is busy, terminate listeners then start:
clawdbot gateway --force
# dev loop (auto-reload on TS changes):
pnpm gateway:watch
- Config hot reload watches `~/.clawdbot/clawdbot.json` (or `CLAWDBOT_CONFIG_PATH`).
  - Default mode: `gateway.reload.mode="hybrid"` (hot-apply safe changes, restart on critical).
  - Hot reload uses in-process restart via SIGUSR1 when needed.
  - Disable with `gateway.reload.mode="off"`.
- Binds WebSocket control plane to `127.0.0.1:<port>` (default 18789).
- The same port also serves HTTP (control UI, hooks, A2UI). Single-port multiplex.
  - OpenAI Chat Completions (HTTP): `/v1/chat/completions`.
- Starts a Canvas file server by default on `canvasHost.port` (default `18793`), serving `http://<gateway-host>:18793/__clawdbot__/canvas/` from `~/clawd/canvas`. Disable with `canvasHost.enabled=false` or `CLAWDBOT_SKIP_CANVAS_HOST=1`.
- Logs to stdout; use launchd/systemd to keep it alive and rotate logs.
- Pass `--verbose` to mirror debug logging (handshakes, req/res, events) from the log file into stdio when troubleshooting.
- `--force` uses `lsof` to find listeners on the chosen port, sends SIGTERM, logs what it killed, then starts the gateway (fails fast if `lsof` is missing).
- If you run under a supervisor (launchd/systemd/mac app child-process mode), a stop/restart typically sends SIGTERM; older builds may surface this as pnpm `ELIFECYCLE` exit code 143 (SIGTERM), which is a normal shutdown, not a crash.
- SIGUSR1 triggers an in-process restart (no external supervisor required). This is what the `gateway` agent tool uses.
- Gateway auth: set `gateway.auth.mode=token` + `gateway.auth.token` (or pass `--token <value>` / `CLAWDBOT_GATEWAY_TOKEN`) to require clients to send `connect.params.auth.token`. The wizard now generates a token by default, even on loopback (see the config sketch below).
- Port precedence: `--port` > `CLAWDBOT_GATEWAY_PORT` > `gateway.port` > default `18789`.
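For orientation, here is a rough sketch of the gateway-related keys referenced above as they might sit in `~/.clawdbot/clawdbot.json`. The field names come from this runbook; the TypeScript shape itself is illustrative, not the generated schema.

```ts
// Illustrative sketch only; consult the generated config/protocol types for
// the authoritative shapes.
interface GatewayConfigSketch {
  gateway?: {
    port?: number;                             // default 18789 (overridden by --port / CLAWDBOT_GATEWAY_PORT)
    reload?: { mode?: string };                // e.g. "hybrid" (default) or "off"
    auth?: { mode?: "token"; token?: string }; // clients must then send connect.params.auth.token
  };
  canvasHost?: {
    enabled?: boolean;                         // false (or CLAWDBOT_SKIP_CANVAS_HOST=1) disables the canvas server
    port?: number;                             // default 18793
  };
}
```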
Remote access
- Tailscale/VPN preferred; otherwise SSH tunnel: `ssh -N -L 18789:127.0.0.1:18789 user@host`
- Clients then connect to `ws://127.0.0.1:18789` through the tunnel.
- If a token is configured, clients must include it in `connect.params.auth.token` even over the tunnel.
Multiple gateways (same host)
Supported if you isolate state + config and use unique ports.
Service names are profile-aware:
- macOS: `com.clawdbot.<profile>`
- Linux: `clawdbot-gateway-<profile>.service`
- Windows: `Clawdbot Gateway (<profile>)`
Install metadata is embedded in the service config:
- `CLAWDBOT_SERVICE_MARKER=clawdbot`
- `CLAWDBOT_SERVICE_KIND=gateway`
- `CLAWDBOT_SERVICE_VERSION=<version>`
Dev profile (--dev)
Fast path: run a fully-isolated dev instance (config/state/workspace) without touching your primary setup.
clawdbot --dev setup
clawdbot --dev gateway --allow-unconfigured
# then target the dev instance:
clawdbot --dev status
clawdbot --dev health
Defaults (can be overridden via env/flags/config):
- `CLAWDBOT_STATE_DIR=~/.clawdbot-dev`
- `CLAWDBOT_CONFIG_PATH=~/.clawdbot-dev/clawdbot.json`
- `CLAWDBOT_GATEWAY_PORT=19001` (Gateway WS + HTTP)
- `bridge.port=19002` (derived: `gateway.port + 1`)
- `browser.controlUrl=http://127.0.0.1:19003` (derived: `gateway.port + 2`)
- `canvasHost.port=19005` (derived: `gateway.port + 4`)
- `agents.defaults.workspace` default becomes `~/clawd-dev` when you run `setup`/`onboard` under `--dev`.
Derived ports (rules of thumb):
- Base port = `gateway.port` (or `CLAWDBOT_GATEWAY_PORT` / `--port`); see the sketch below.
- `bridge.port = base + 1` (or `CLAWDBOT_BRIDGE_PORT` / config override)
- `browser.controlUrl` port = `base + 2` (or `CLAWDBOT_BROWSER_CONTROL_URL` / config override)
- `canvasHost.port = base + 4` (or `CLAWDBOT_CANVAS_HOST_PORT` / config override)
- Browser profile CDP ports auto-allocate from `browser.controlPort + 9 .. + 108` (persisted per profile).
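As a quick illustration of those rules (the helper name is hypothetical; the gateway computes these internally):

```ts
// Rule-of-thumb port derivation from this section; illustrative only.
function derivePorts(basePort: number) {
  return {
    gateway: basePort,            // WS + HTTP control plane
    bridge: basePort + 1,         // bridge.port
    browserControl: basePort + 2, // browser.controlUrl port
    canvasHost: basePort + 4,     // canvasHost.port
  };
}

// Example: the --dev defaults use base 19001, giving
// { gateway: 19001, bridge: 19002, browserControl: 19003, canvasHost: 19005 }.
console.log(derivePorts(19001));
```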
Checklist per instance:
- unique `gateway.port`
- unique `CLAWDBOT_CONFIG_PATH`
- unique `CLAWDBOT_STATE_DIR`
- unique `agents.defaults.workspace`
- separate WhatsApp numbers (if using WA)
Example:
CLAWDBOT_CONFIG_PATH=~/.clawdbot/a.json CLAWDBOT_STATE_DIR=~/.clawdbot-a clawdbot gateway --port 19001
CLAWDBOT_CONFIG_PATH=~/.clawdbot/b.json CLAWDBOT_STATE_DIR=~/.clawdbot-b clawdbot gateway --port 19002
Protocol (operator view)
- Mandatory first frame from client: `req {type:"req", id, method:"connect", params:{minProtocol, maxProtocol, client:{id, displayName?, version, platform, deviceFamily?, modelIdentifier?, mode, instanceId?}, caps, auth?, locale?, userAgent?}}` (see the minimal client sketch below).
- Gateway replies `res {type:"res", id, ok:true, payload:hello-ok}` (or `ok:false` with an error, then closes).
- After handshake:
  - Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
  - Events: `{type:"event", event, payload, seq?, stateVersion?}`
- Structured presence entries: `{host, ip, version, platform?, deviceFamily?, modelIdentifier?, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId?}`.
- `agent` responses are two-stage: first a `res` ack `{runId, status:"accepted"}`, then a final `res` `{runId, status:"ok"|"error", summary}` after the run finishes; streamed output arrives as `event:"agent"`.
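A minimal handshake sketch using the Node `ws` package, assuming a local gateway on the default port. The protocol numbers, `caps`, and client metadata below are placeholder values, not authoritative:

```ts
// Handshake sketch: the first frame must be the connect request.
import WebSocket from "ws";

const ws = new WebSocket("ws://127.0.0.1:18789");

ws.on("open", () => {
  ws.send(
    JSON.stringify({
      type: "req",
      id: "1",
      method: "connect",
      params: {
        minProtocol: 1, // placeholder; use the versions your client was generated against
        maxProtocol: 1,
        client: { id: "ops-probe", version: "0.0.0", platform: "linux", mode: "cli" },
        caps: [],
        // Include the token when gateway.auth.mode=token is configured.
        auth: process.env.CLAWDBOT_GATEWAY_TOKEN
          ? { token: process.env.CLAWDBOT_GATEWAY_TOKEN }
          : undefined,
      },
    }),
  );
});

ws.on("message", (data) => {
  const frame = JSON.parse(String(data));
  if (frame.type === "res" && frame.id === "1") {
    console.log(frame.ok ? "connected (hello-ok)" : "connect failed", frame);
  }
});
```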
Methods (initial set)
- `health` — full health snapshot (same shape as `clawdbot health --json`).
- `status` — short summary.
- `system-presence` — current presence list.
- `system-event` — post a presence/system note (structured).
- `send` — send a message via the active provider(s).
- `agent` — run an agent turn (streams events back on same connection).
- `node.list` — list paired + currently-connected bridge nodes (includes `caps`, `deviceFamily`, `modelIdentifier`, `paired`, `connected`, and advertised `commands`).
- `node.describe` — describe a node (capabilities + supported `node.invoke` commands; works for paired nodes and for currently-connected unpaired nodes).
- `node.invoke` — invoke a command on a node (e.g. `canvas.*`, `camera.*`).
- `node.pair.*` — pairing lifecycle (`request`, `list`, `approve`, `reject`, `verify`).
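For scripting against these methods, a small request/response wrapper over an already-connected socket can look roughly like this; the `call` helper is illustrative, not the shipped client library:

```ts
// Request/response wrapper sketch for the post-handshake req/res frames.
import WebSocket from "ws";

let nextId = 1;

function call<T>(ws: WebSocket, method: string, params: unknown = {}): Promise<T> {
  const id = String(nextId++);
  return new Promise((resolve, reject) => {
    const onMessage = (data: unknown) => {
      const frame = JSON.parse(String(data));
      if (frame.type !== "res" || frame.id !== id) return; // ignore events and other responses
      ws.off("message", onMessage);
      if (frame.ok) resolve(frame.payload as T);
      else reject(frame.error);
    };
    ws.on("message", onMessage);
    ws.send(JSON.stringify({ type: "req", id, method, params }));
  });
}

// Usage (param shapes illustrative): const snapshot = await call(ws, "health");
```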
See also: Presence for how presence is produced/deduped and why instanceId matters.
Events
- `agent` — streamed tool/output events from the agent run (seq-tagged).
- `presence` — presence updates (deltas with `stateVersion`) pushed to all connected clients.
- `tick` — periodic keepalive/no-op to confirm liveness.
- `shutdown` — Gateway is exiting; payload includes `reason` and optional `restartExpectedMs`. Clients should reconnect.
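For example, a client might react to `shutdown` along these lines (the `reconnect` helper is assumed, not provided by the protocol):

```ts
// Sketch: schedule a reconnect when the gateway announces it is exiting.
function onShutdown(payload: { reason?: string; restartExpectedMs?: number }) {
  console.warn(`gateway shutting down: ${payload.reason ?? "unknown"}`);
  const delay = payload.restartExpectedMs ?? 2000; // fall back to a short retry
  setTimeout(() => void reconnect(), delay);
}

declare function reconnect(): Promise<void>; // supplied by the client application
```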
WebChat integration
- WebChat is a native SwiftUI UI that talks directly to the Gateway WebSocket for history, sends, abort, and events.
- Remote use goes through the same SSH/Tailscale tunnel; if a gateway token is configured, the client includes it during `connect`.
- macOS app connects via a single WS (shared connection); it hydrates presence from the initial snapshot and listens for `presence` events to update the UI.
Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo’s generator).
- Protocol definitions are the source of truth; regenerate schema/models with:
  - `pnpm protocol:gen`
  - `pnpm protocol:gen:swift`
Connection snapshot
- `hello-ok` includes a `snapshot` with `presence`, `health`, `stateVersion`, and `uptimeMs`, plus `policy {maxPayload, maxBufferedBytes, tickIntervalMs}`, so clients can render immediately without extra requests.
- `health` / `system-presence` remain available for manual refresh, but are not required at connect time.
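Roughly, the snapshot carries fields like these (a hand-written sketch; the authoritative types come from `pnpm protocol:gen`):

```ts
// Hand-written approximation of the hello-ok snapshot described above.
interface HelloOkSnapshotSketch {
  presence: unknown[];  // structured presence entries
  health: unknown;      // same shape as the `health` method payload
  stateVersion: number; // lets clients order subsequent presence deltas
  uptimeMs: number;
  policy: {
    maxPayload: number;
    maxBufferedBytes: number;
    tickIntervalMs: number;
  };
}
```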
Error codes (res.error shape)
- Errors use `{ code, message, details?, retryable?, retryAfterMs? }`.
- Standard codes:
  - `NOT_LINKED` — WhatsApp not authenticated.
  - `AGENT_TIMEOUT` — agent did not respond within the configured deadline.
  - `INVALID_REQUEST` — schema/param validation failed.
  - `UNAVAILABLE` — Gateway is shutting down or a dependency is unavailable.
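A client can honor the retry hints roughly as follows; the helper name and backoff policy are illustrative, not part of the protocol:

```ts
// Sketch of retrying calls that fail with a retryable res.error.
interface GatewayError {
  code: string; // e.g. "UNAVAILABLE"
  message: string;
  details?: unknown;
  retryable?: boolean;
  retryAfterMs?: number;
}

async function callWithRetry<T>(run: () => Promise<T>, attempts = 3): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await run();
    } catch (err) {
      const e = err as GatewayError;
      if (!e.retryable || i >= attempts - 1) throw err;
      await new Promise((r) => setTimeout(r, e.retryAfterMs ?? 1000));
    }
  }
}
```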
Keepalive behavior
- `tick` events (or WS ping/pong) are emitted periodically so clients know the Gateway is alive even when no traffic occurs.
- Send/agent acknowledgements remain separate responses; do not overload ticks for sends.
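A client-side liveness watchdog might look like this; `tickIntervalMs` comes from the `hello-ok` policy, while the watchdog itself and its grace factor are illustrative:

```ts
// Treat the connection as dead if nothing arrives within ~2 tick intervals.
function makeTickWatchdog(tickIntervalMs: number, onDead: () => void) {
  let timer = setTimeout(onDead, tickIntervalMs * 2);
  return () => {
    // Call this on every tick (or any other frame) received.
    clearTimeout(timer);
    timer = setTimeout(onDead, tickIntervalMs * 2);
  };
}
```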
Replay / gaps
- Events are not replayed. Clients detect seq gaps and should refresh (`health` + `system-presence`) before continuing (see the sketch below). WebChat and macOS clients now auto-refresh on gap.
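Gap detection can be as simple as tracking the last `seq` seen; `refresh` stands in for re-calling `health` and `system-presence` with whatever request helper the client uses:

```ts
// Seq-gap detection sketch; illustrative, not the shipped client logic.
let lastSeq: number | undefined;

function onGatewayEvent(frame: { event: string; seq?: number }, refresh: () => Promise<void>) {
  if (typeof frame.seq === "number") {
    const gap = lastSeq !== undefined && frame.seq > lastSeq + 1;
    lastSeq = frame.seq;
    if (gap) void refresh(); // missed events: re-pull health + system-presence
  }
}
```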
Supervision (macOS example)
- Use launchd to keep the daemon alive:
  - Program: path to `clawdbot`
  - Arguments: `gateway`
  - KeepAlive: true
  - StandardOut/Err: file paths or `syslog`
- On failure, launchd restarts; fatal misconfig should keep exiting so the operator notices.
- LaunchAgents are per-user and require a logged-in session; for headless setups use a custom LaunchDaemon (not shipped).
- `clawdbot daemon install` writes `~/Library/LaunchAgents/com.clawdbot.gateway.plist` (or `com.clawdbot.<profile>.plist`). `clawdbot doctor` audits the LaunchAgent config and can update it to current defaults.
Daemon management (CLI)
Use the CLI daemon manager for install/start/stop/restart/status:
clawdbot daemon status
clawdbot daemon install
clawdbot daemon stop
clawdbot daemon restart
clawdbot logs --follow
Notes:
- `daemon status` probes the Gateway RPC by default using the daemon's resolved port/config (override with `--url`).
- `daemon status --deep` adds system-level scans (LaunchDaemons/system units).
- `daemon status --no-probe` skips the RPC probe (useful when networking is down).
- `daemon status --json` is stable for scripts.
- `daemon status` reports supervisor runtime (launchd/systemd running) separately from RPC reachability (WS connect + status RPC).
- `daemon status` prints config path + probe target to avoid "localhost vs LAN bind" confusion and profile mismatches.
- `daemon status` includes the last gateway error line when the service looks running but the port is closed.
- `logs` tails the Gateway file log via RPC (no manual `tail`/`grep` needed).
- If other gateway-like services are detected, the CLI warns unless they are Clawdbot profile services.
We still recommend one gateway per machine unless you need redundant profiles.
- Cleanup: `clawdbot daemon uninstall` (current service) and `clawdbot doctor` (legacy migrations).
- `daemon install` is a no-op when already installed; use `clawdbot daemon install --force` to reinstall (profile/env/path changes).
Bundled mac app:
- Clawdbot.app can bundle a Node-based gateway relay and install a per-user LaunchAgent labeled `com.clawdbot.gateway` (or `com.clawdbot.<profile>`).
- To stop it cleanly, use `clawdbot daemon stop` (or `launchctl bootout gui/$UID/com.clawdbot.gateway`).
- To restart, use `clawdbot daemon restart` (or `launchctl kickstart -k gui/$UID/com.clawdbot.gateway`).
- `launchctl` only works if the LaunchAgent is installed; otherwise use `clawdbot daemon install` first.
- Replace the label with `com.clawdbot.<profile>` when running a named profile.
Supervision (systemd user unit)
Clawdbot installs a systemd user service by default on Linux/WSL2. We recommend user services for single-user machines (simpler env, per-user config). Use a system service for multi-user or always-on servers (no lingering required, shared supervision).
clawdbot daemon install writes the user unit. clawdbot doctor audits the
unit and can update it to match the current recommended defaults.
Create ~/.config/systemd/user/clawdbot-gateway[-<profile>].service:
[Unit]
Description=Clawdbot Gateway (profile: <profile>, v<version>)
After=network-online.target
Wants=network-online.target
[Service]
ExecStart=/usr/local/bin/clawdbot gateway --port 18789
Restart=always
RestartSec=5
Environment=CLAWDBOT_GATEWAY_TOKEN=
WorkingDirectory=/home/youruser
[Install]
WantedBy=default.target
Enable lingering (required so the user service survives logout/idle):
sudo loginctl enable-linger youruser
Onboarding runs this on Linux/WSL2 (may prompt for sudo; writes /var/lib/systemd/linger).
Then enable the service:
systemctl --user enable --now clawdbot-gateway[-<profile>].service
Alternative (system service) - for always-on or multi-user servers, you can
install a systemd system unit instead of a user unit (no lingering needed).
Create /etc/systemd/system/clawdbot-gateway[-<profile>].service (copy the unit above,
switch WantedBy=multi-user.target, set User= + WorkingDirectory=), then:
sudo systemctl daemon-reload
sudo systemctl enable --now clawdbot-gateway[-<profile>].service
Windows (WSL2)
Windows installs should use WSL2 and follow the Linux systemd section above.
Operational checks
- Liveness: open WS and send `req:connect` → expect `res` with `payload.type="hello-ok"` (with snapshot).
- Readiness: call `health` → expect `ok: true` and a linked provider in the `providers` payload (when applicable).
- Debug: subscribe to `tick` and `presence` events; ensure `status` shows linked/auth age; presence entries show Gateway host and connected clients.
Safety guarantees
- Assume one Gateway per host by default; if you run multiple profiles, isolate ports/state and target the right instance.
- No fallback to direct Baileys connections; if the Gateway is down, sends fail fast.
- Non-connect first frames or malformed JSON are rejected and the socket is closed.
- Graceful shutdown: emit `shutdown` event before closing; clients must handle close + reconnect.
CLI helpers
- `clawdbot gateway health|status` — request health/status over the Gateway WS.
- `clawdbot message send --to <num> --message "hi" [--media ...]` — send via Gateway (idempotent for WhatsApp).
- `clawdbot agent --message "hi" --to <num>` — run an agent turn (waits for final by default).
- `clawdbot gateway call <method> --params '{"k":"v"}'` — raw method invoker for debugging.
- `clawdbot daemon stop|restart` — stop/restart the supervised gateway service (launchd/systemd).
- Gateway helper subcommands assume a running gateway on `--url`; they no longer auto-spawn one.
Migration guidance
- Retire uses of `clawdbot gateway` and the legacy TCP control port.
- Update clients to speak the WS protocol with mandatory connect and structured presence.