| summary | read_when |
|---|---|
| Runbook for the Gateway daemon, lifecycle, and operations | |
Gateway (daemon) runbook
Last updated: 2025-12-09
What it is
- The always-on process that owns the single Baileys/Telegram connection and the control/event plane.
- Replaces the legacy gateway command. CLI entry point: clawdbot gateway.
- Runs until stopped; exits non-zero on fatal errors so the supervisor restarts it.
How to run (local)
clawdbot gateway --port 18789
# for full debug/trace logs in stdio:
clawdbot gateway --port 18789 --verbose
# if the port is busy, terminate listeners then start:
clawdbot gateway --force
# dev loop (auto-reload on TS changes):
pnpm gateway:watch
- Config hot reload watches ~/.clawdbot/clawdbot.json (or CLAWDBOT_CONFIG_PATH).
  - Default mode: gateway.reload.mode="hybrid" (hot-apply safe changes, restart on critical).
  - Hot reload uses in-process restart via SIGUSR1 when needed.
  - Disable with gateway.reload.mode="off".
- Binds WebSocket control plane to 127.0.0.1:<port> (default 18789).
- The same port also serves HTTP (control UI, hooks, A2UI). Single-port multiplex.
- Starts a Canvas file server by default on canvasHost.port (default 18793), serving http://<gateway-host>:18793/__clawdbot__/canvas/ from ~/clawd/canvas. Disable with canvasHost.enabled=false or CLAWDBOT_SKIP_CANVAS_HOST=1.
- Logs to stdout; use launchd/systemd to keep it alive and rotate logs.
- Pass --verbose to mirror debug logging (handshakes, req/res, events) from the log file into stdio when troubleshooting.
- --force uses lsof to find listeners on the chosen port, sends SIGTERM, logs what it killed, then starts the gateway (fails fast if lsof is missing).
- If you run under a supervisor (launchd/systemd/mac app child-process mode), a stop/restart typically sends SIGTERM; older builds may surface this as pnpm ELIFECYCLE exit code 143 (SIGTERM), which is a normal shutdown, not a crash.
- SIGUSR1 triggers an in-process restart (no external supervisor required). This is what the gateway agent tool uses.
- Optional shared secret: pass --token <value> or set CLAWDBOT_GATEWAY_TOKEN to require clients to send connect.params.auth.token.
- Port precedence: --port > CLAWDBOT_GATEWAY_PORT > gateway.port > default 18789.
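The port precedence rule can be sketched as a small pure function. This is an illustration only, not the Gateway's actual resolver: the names PortSources and resolveGatewayPort are hypothetical, and skipping invalid values (rather than failing) is an assumption.

```typescript
// Hypothetical inputs: the three places a port can come from.
interface PortSources {
  cliPort?: number;    // value of --port, if given
  envPort?: string;    // raw CLAWDBOT_GATEWAY_PORT, if set
  configPort?: number; // gateway.port from the config file, if set
}

const DEFAULT_GATEWAY_PORT = 18789;

function isValidPort(n: number | undefined): n is number {
  return n !== undefined && Number.isInteger(n) && n > 0 && n <= 65535;
}

// Precedence: --port > CLAWDBOT_GATEWAY_PORT > gateway.port > default.
// Assumption: an unset or invalid source falls through to the next one.
function resolveGatewayPort(src: PortSources): number {
  const envPort =
    src.envPort !== undefined && src.envPort.trim() !== ""
      ? Number(src.envPort)
      : undefined;
  for (const candidate of [src.cliPort, envPort, src.configPort]) {
    if (isValidPort(candidate)) return candidate;
  }
  return DEFAULT_GATEWAY_PORT;
}
```

Taking the sources as parameters instead of reading the environment directly keeps the rule easy to check in isolation.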
Remote access
- Tailscale/VPN preferred; otherwise SSH tunnel: ssh -N -L 18789:127.0.0.1:18789 user@host
- Clients then connect to ws://127.0.0.1:18789 through the tunnel.
- If a token is configured, clients must include it in connect.params.auth.token even over the tunnel.
Multiple gateways (same host)
Supported if you isolate state + config and use unique ports.
Dev profile (--dev)
Fast path: run a fully isolated dev instance (config/state/workspace) without touching your primary setup.
clawdbot --dev setup
clawdbot --dev gateway --allow-unconfigured
# then target the dev instance:
clawdbot --dev status
clawdbot --dev health
Defaults (can be overridden via env/flags/config):
- CLAWDBOT_STATE_DIR=~/.clawdbot-dev
- CLAWDBOT_CONFIG_PATH=~/.clawdbot-dev/clawdbot.json
- CLAWDBOT_GATEWAY_PORT=19001 (Gateway WS + HTTP)
- bridge.port=19002 (derived: gateway.port+1)
- browser.controlUrl=http://127.0.0.1:19003 (derived: gateway.port+2)
- canvasHost.port=19005 (derived: gateway.port+4)
- agent.workspace default becomes ~/clawd-dev when you run setup/onboard under --dev.
Derived ports (rules of thumb):
- Base port = gateway.port (or CLAWDBOT_GATEWAY_PORT / --port)
- bridge.port = base + 1 (or CLAWDBOT_BRIDGE_PORT / config override)
- browser.controlUrl port = base + 2 (or CLAWDBOT_BROWSER_CONTROL_URL / config override)
- canvasHost.port = base + 4 (or CLAWDBOT_CANVAS_HOST_PORT / config override)
- Browser profile CDP ports auto-allocate from browser.controlPort + 9 .. + 108 (persisted per profile).
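As a quick way to check for collisions when planning several instances, the derivation rules can be written out as arithmetic. This is a sketch with hypothetical names (DerivedPorts, derivePorts). It ignores env/config overrides and assumes browser.controlPort is the port of browser.controlUrl.

```typescript
// Hypothetical result shape for the derived-port rules of thumb.
interface DerivedPorts {
  gateway: number;
  bridge: number;
  browserControl: number;
  canvasHost: number;
  cdpFirst: number; // first auto-allocated browser profile CDP port
  cdpLast: number;  // last auto-allocated browser profile CDP port
}

// Applies base + 1 / + 2 / + 4 and the CDP range (controlPort + 9 .. + 108).
// Overrides via env or config are deliberately not modeled here.
function derivePorts(base: number): DerivedPorts {
  const browserControl = base + 2;
  return {
    gateway: base,
    bridge: base + 1,
    browserControl,
    canvasHost: base + 4,
    cdpFirst: browserControl + 9,
    cdpLast: browserControl + 108,
  };
}
```

With the dev base port 19001 this reproduces the dev defaults listed above (19002, 19003, 19005). The CDP range spans about a hundred ports, so base ports spaced only a few apart can overlap in that range.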
Checklist per instance:
- unique gateway.port
- unique CLAWDBOT_CONFIG_PATH
- unique CLAWDBOT_STATE_DIR
- unique agent.workspace
- separate WhatsApp numbers (if using WA)
Example:
CLAWDBOT_CONFIG_PATH=~/.clawdbot/a.json CLAWDBOT_STATE_DIR=~/.clawdbot-a clawdbot gateway --port 19001
CLAWDBOT_CONFIG_PATH=~/.clawdbot/b.json CLAWDBOT_STATE_DIR=~/.clawdbot-b clawdbot gateway --port 19002
Protocol (operator view)
- Mandatory first frame from client: req {type:"req", id, method:"connect", params:{minProtocol,maxProtocol,client:{name,version,platform,deviceFamily?,modelIdentifier?,mode,instanceId}, caps, auth?, locale?, userAgent? } }.
- Gateway replies res {type:"res", id, ok:true, payload:hello-ok } (or ok:false with an error, then closes).
- After handshake:
  - Requests: {type:"req", id, method, params} → {type:"res", id, ok, payload|error}
  - Events: {type:"event", event, payload, seq?, stateVersion?}
- Structured presence entries: {host, ip, version, platform?, deviceFamily?, modelIdentifier?, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }.
- agent responses are two-stage: first res ack {runId,status:"accepted"}, then a final res {runId,status:"ok"|"error",summary} after the run finishes; streamed output arrives as event:"agent".
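For readers writing a client by hand, the frame shapes above can be approximated with the following types. The authoritative types are the generated ones in src/gateway/protocol/*.ts; everything here is a sketch. The field value types (string ids, numeric protocol versions, string[] caps, auth as { token }) are assumptions, and buildConnectFrame and describeFrame are hypothetical helpers.

```typescript
// Assumed value types; only the field names come from the runbook.
interface ClientInfo {
  name: string;
  version: string;
  platform: string;
  deviceFamily?: string;
  modelIdentifier?: string;
  mode: string;
  instanceId: string;
}

interface ConnectParams {
  minProtocol: number;
  maxProtocol: number;
  client: ClientInfo;
  caps: string[];
  auth?: { token: string };
  locale?: string;
  userAgent?: string;
}

interface ReqFrame<P> { type: "req"; id: string; method: string; params: P; }
interface ResFrame { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown; }
interface EventFrame { type: "event"; event: string; payload: unknown; seq?: number; stateVersion?: number; }
type Frame = ReqFrame<unknown> | ResFrame | EventFrame;

// Hypothetical builder for the mandatory first frame.
function buildConnectFrame(
  id: string,
  client: ClientInfo,
  protocol: { min: number; max: number },
  token?: string,
): ReqFrame<ConnectParams> {
  const params: ConnectParams = {
    minProtocol: protocol.min,
    maxProtocol: protocol.max,
    client,
    caps: [],
  };
  // Only attach auth when a shared secret is configured.
  if (token !== undefined) params.auth = { token };
  return { type: "req", id, method: "connect", params };
}

// Short label for logging, discriminating on the frame's type field.
function describeFrame(f: Frame): string {
  switch (f.type) {
    case "req": return `req ${f.method}`;
    case "res": return `res ${f.ok ? "ok" : "error"}`;
    case "event": return `event ${f.event}`;
  }
}
```

Serializing the result of buildConnectFrame with JSON.stringify gives a candidate first frame to send over the WebSocket.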
Methods (initial set)
- health - full health snapshot (same shape as clawdbot health --json).
- status - short summary.
- system-presence - current presence list.
- system-event - post a presence/system note (structured).
- send - send a message via the active provider(s).
- agent - run an agent turn (streams events back on same connection).
- node.list - list paired + currently-connected bridge nodes (includes caps, deviceFamily, modelIdentifier, paired, connected, and advertised commands).
- node.describe - describe a node (capabilities + supported node.invoke commands; works for paired nodes and for currently-connected unpaired nodes).
- node.invoke - invoke a command on a node (e.g. canvas.*, camera.*).
- node.pair.* - pairing lifecycle (request, list, approve, reject, verify).
See also: docs/presence.md for how presence is produced/deduped and why instanceId matters.
Events
- agent - streamed tool/output events from the agent run (seq-tagged).
- presence - presence updates (deltas with stateVersion) pushed to all connected clients.
- tick - periodic keepalive/no-op to confirm liveness.
- shutdown - Gateway is exiting; payload includes reason and optional restartExpectedMs. Clients should reconnect.
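One way a client might act on the shutdown event is to use restartExpectedMs as a reconnect delay hint. The sketch below assumes that field means "milliseconds until the Gateway is expected back", which the runbook does not spell out. ShutdownPayload and reconnectDelayMs are hypothetical names, and the 1000 ms fallback is arbitrary.

```typescript
// Assumed payload shape: reason plus an optional restart hint.
interface ShutdownPayload {
  reason: string;
  restartExpectedMs?: number;
}

// Returns how long to wait before reconnecting after a shutdown event.
// Falls back to a fixed delay when no usable hint is present.
function reconnectDelayMs(payload: ShutdownPayload, fallbackMs = 1000): number {
  const hint = payload.restartExpectedMs;
  return typeof hint === "number" && Number.isFinite(hint) && hint >= 0
    ? hint
    : fallbackMs;
}
```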
WebChat integration
- WebChat is a native SwiftUI UI that talks directly to the Gateway WebSocket for history, sends, abort, and events.
- Remote use goes through the same SSH/Tailscale tunnel; if a gateway token is configured, the client includes it during connect.
- macOS app connects via a single WS (shared connection); it hydrates presence from the initial snapshot and listens for presence events to update the UI.
Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo’s generator).
- Types live in src/gateway/protocol/*.ts; regenerate schemas/models with pnpm protocol:gen (writes dist/protocol.schema.json) and pnpm protocol:gen:swift (writes apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift).
Connection snapshot
- hello-ok includes a snapshot with presence, health, stateVersion, and uptimeMs plus policy {maxPayload,maxBufferedBytes,tickIntervalMs} so clients can render immediately without extra requests.
- health/system-presence remain available for manual refresh, but are not required at connect time.
Error codes (res.error shape)
- Errors use { code, message, details?, retryable?, retryAfterMs? }.
- Standard codes:
  - NOT_LINKED - WhatsApp not authenticated.
  - AGENT_TIMEOUT - agent did not respond within the configured deadline.
  - INVALID_REQUEST - schema/param validation failed.
  - UNAVAILABLE - Gateway is shutting down or a dependency is unavailable.
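The error shape lends itself to a simple retry decision on the client. The sketch below assumes that a missing retryable flag means "do not retry" and that retryAfterMs, when present, is the delay to honor. Neither rule is stated in the runbook, and GatewayError and retryDelayMs are hypothetical names.

```typescript
// Field names from the res.error shape; value types are assumptions.
interface GatewayError {
  code: string;
  message: string;
  details?: unknown;
  retryable?: boolean;
  retryAfterMs?: number;
}

// Returns the delay before a retry, or null when the request should not be retried.
function retryDelayMs(err: GatewayError, defaultMs = 1000): number | null {
  if (err.retryable !== true) return null;
  return err.retryAfterMs ?? defaultMs;
}
```

For example, an UNAVAILABLE error flagged retryable with retryAfterMs of 500 would yield 500, while an INVALID_REQUEST without the flag would yield null.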
Keepalive behavior
- tick events (or WS ping/pong) are emitted periodically so clients know the Gateway is alive even when no traffic occurs.
- Send/agent acknowledgements remain separate responses; do not overload ticks for sends.
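A client-side watchdog can use the tickIntervalMs value from the hello-ok policy to decide when a connection looks dead. The threshold of two missed intervals below is an assumption chosen for illustration, not a documented Gateway rule, and isConnectionStale is a hypothetical helper.

```typescript
// True when no frame (tick or otherwise) has arrived for longer than the
// allowed number of tick intervals. Timestamps are in milliseconds.
function isConnectionStale(
  lastFrameAtMs: number,
  nowMs: number,
  tickIntervalMs: number,
  missedTicksAllowed = 2,
): boolean {
  return nowMs - lastFrameAtMs > tickIntervalMs * missedTicksAllowed;
}
```

A caller would record the arrival time of every inbound frame and poll this check on a timer, closing and reconnecting when it returns true.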
Replay / gaps
- Events are not replayed. Clients detect seq gaps and should refresh (health + system-presence) before continuing. WebChat and macOS clients now auto-refresh on gap.
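Gap detection can be sketched as a small tracker that reports when a sequence number skips ahead. It assumes seq increases by exactly 1 per event on a connection and that events without seq carry no ordering information. Both are assumptions, and SeqTracker is a hypothetical name.

```typescript
// Tracks the last seen seq and flags jumps larger than 1.
class SeqTracker {
  private last: number | undefined = undefined;

  // Returns true when a gap is detected, meaning the caller should refresh
  // (health + system-presence) before continuing.
  observe(seq?: number): boolean {
    if (seq === undefined) return false; // untagged event: nothing to compare
    if (this.last === undefined) {
      this.last = seq; // first tagged event establishes the baseline
      return false;
    }
    if (seq <= this.last) return false; // duplicate or stale: ignore
    const gap = seq > this.last + 1;
    this.last = seq;
    return gap;
  }
}
```

After a reconnect the tracker should be replaced with a fresh instance, since the new connection starts from its own snapshot.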
Supervision (macOS example)
- Use launchd to keep the daemon alive:
  - Program: path to clawdbot
  - Arguments: gateway
  - KeepAlive: true
  - StandardOut/Err: file paths or syslog
- On failure, launchd restarts; fatal misconfig should keep exiting so the operator notices.
- LaunchAgents are per-user and require a logged-in session; for headless setups use a custom LaunchDaemon (not shipped).
Daemon management (CLI)
Use the CLI daemon manager for install/start/stop/restart/status:
clawdbot daemon status
clawdbot daemon install
clawdbot daemon stop
clawdbot daemon restart
clawdbot logs --follow
Notes:
- daemon status probes the Gateway RPC by default (same URL/token defaults as gateway status).
- daemon status --deep adds system-level scans (LaunchDaemons/system units).
- daemon status reports supervisor runtime (launchd/systemd running) separately from RPC reachability (WS connect + status RPC).
- daemon status prints config path + probe target to avoid “localhost vs LAN bind” confusion and profile mismatches.
- logs tails the Gateway file log via RPC (no manual tail/grep needed).
- If other gateway-like services are detected, the CLI warns. We recommend one gateway per machine; one gateway can host multiple agents.
  - Cleanup: clawdbot daemon uninstall (current service) and clawdbot doctor (legacy migrations).
Bundled mac app:
- Clawdbot.app can bundle a bun-compiled gateway binary and install a per-user LaunchAgent labeled com.clawdbot.gateway.
- To stop it cleanly, use clawdbot daemon stop (or launchctl bootout gui/$UID/com.clawdbot.gateway).
- To restart, use clawdbot daemon restart (or launchctl kickstart -k gui/$UID/com.clawdbot.gateway).
- launchctl only works if the LaunchAgent is installed; otherwise use clawdbot daemon install first.
Supervision (systemd user unit)
Create ~/.config/systemd/user/clawdbot-gateway.service:
[Unit]
Description=Clawdbot Gateway
After=network-online.target
Wants=network-online.target
[Service]
ExecStart=/usr/local/bin/clawdbot gateway --port 18789
Restart=always
RestartSec=5
Environment=CLAWDBOT_GATEWAY_TOKEN=
WorkingDirectory=/home/youruser
[Install]
WantedBy=default.target
Enable lingering (required so the user service survives logout/idle):
sudo loginctl enable-linger youruser
Onboarding runs this on Linux/WSL2 (may prompt for sudo; writes /var/lib/systemd/linger).
Then enable the service:
systemctl --user enable --now clawdbot-gateway.service
Alternative (system service) - for always-on or multi-user servers, you can
install a systemd system unit instead of a user unit (no lingering needed).
Create /etc/systemd/system/clawdbot-gateway.service (copy the unit above,
switch WantedBy=multi-user.target, set User= + WorkingDirectory=), then:
sudo systemctl daemon-reload
sudo systemctl enable --now clawdbot-gateway.service
Windows (WSL2)
Windows installs should use WSL2 and follow the Linux systemd section above.
Operational checks
- Liveness: open WS and send req:connect → expect res with payload.type="hello-ok" (with snapshot).
- Readiness: call health → expect ok: true and web.linked=true.
- Debug: subscribe to tick and presence events; ensure status shows linked/auth age; presence entries show Gateway host and connected clients.
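The liveness and readiness checks can be expressed as predicates over response frames, which is handy in a monitoring script. This sketch assumes that ok refers to the res frame's ok field and that web.linked lives inside the health payload; the runbook leaves both slightly ambiguous. ResLike, isLive, and isReady are hypothetical names.

```typescript
// Minimal view of a response frame for these checks.
interface ResLike {
  ok: boolean;
  payload?: unknown;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Liveness: the connect response carries payload.type = "hello-ok".
function isLive(res: ResLike): boolean {
  return res.ok && isRecord(res.payload) && res.payload["type"] === "hello-ok";
}

// Readiness: the health response is ok and reports web.linked = true.
function isReady(res: ResLike): boolean {
  if (!res.ok || !isRecord(res.payload)) return false;
  const web = res.payload["web"];
  return isRecord(web) && web["linked"] === true;
}
```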
Safety guarantees
- Only one Gateway per host; all sends/agent calls must go through it.
- No fallback to direct Baileys connections; if the Gateway is down, sends fail fast.
- Non-connect first frames or malformed JSON are rejected and the socket is closed.
- Graceful shutdown: emit shutdown event before closing; clients must handle close + reconnect.
CLI helpers
- clawdbot gateway health|status - request health/status over the Gateway WS.
- clawdbot send --to <num> --message "hi" [--media ...] - send via Gateway (idempotent for WhatsApp).
- clawdbot agent --message "hi" --to <num> - run an agent turn (waits for final by default).
- clawdbot gateway call <method> --params '{"k":"v"}' - raw method invoker for debugging.
- clawdbot daemon stop|restart - stop/restart the supervised gateway service (launchd/systemd).
- Gateway helper subcommands assume a running gateway on --url; they no longer auto-spawn one.
Migration guidance
- Retire uses of clawdbot gateway and the legacy TCP control port.
- Update clients to speak the WS protocol with mandatory connect and structured presence.