| summary | read_when |
|---|---|
| WhatsApp (web provider) integration: login, inbox, replies, media, and ops | |
WhatsApp (web provider)
Updated: 2025-12-23
Status: WhatsApp Web via Baileys only. Gateway owns the single session.
Goals
- One WhatsApp identity, one gateway session.
- Deterministic routing: replies return to WhatsApp, no model routing.
- Model sees enough context to understand quoted replies.
Architecture (who owns what)
- Gateway owns the Baileys socket and inbox loop.
- CLI / macOS app talk to the gateway; no direct Baileys use.
- Active listener is required for outbound sends; otherwise send fails fast.
Getting a phone number
WhatsApp requires a real mobile number for verification. VoIP and virtual numbers are usually blocked.
Recommended approaches:
- Local eSIM from your country's mobile carrier (most reliable)
- Prepaid SIM — cheap, just needs to receive one SMS for verification
Avoid: TextNow, Google Voice, most "free SMS" services — WhatsApp blocks these aggressively.
Tip: The number only needs to receive one verification SMS. After that, WhatsApp Web sessions persist via creds.json.
WhatsApp Business: You can use WhatsApp Business on the same phone with a different number. This is a great option if you want to keep your personal WhatsApp separate — just install WhatsApp Business and register it with Clawdbot's dedicated number.
Login + credentials
- Login command: `clawdbot login` (QR via Linked Devices).
- Credentials stored in `~/.clawdbot/credentials/creds.json`.
- Backup copy at `creds.json.bak` (restored on corruption).
- Logout: `clawdbot logout` deletes creds and session store.
- Logged-out socket => error instructs re-link.
Inbound flow (DM + group)
- WhatsApp events come from `messages.upsert` (Baileys).
- Inbox listeners are detached on shutdown to avoid accumulating event handlers in tests/restarts.
- Status/broadcast chats are ignored.
- Direct chats use E.164; groups use group JID.
- Allowlist: `whatsapp.allowFrom` enforced for direct chats only.
  - If `whatsapp.allowFrom` is empty, default allowlist = self number (self-chat mode).
- Self-chat mode: avoids auto read receipts and ignores mention JIDs.
- Read receipts sent for non-self-chat DMs.
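The routing rules above can be sketched as a small classifier. This is a minimal illustration, not the gateway's actual code: `routeInbound`, `jidToE164`, and the `InboundRoute` shape are hypothetical names, and treating "self-chat" as "sender equals the gateway's own number" is an assumption. The JID suffixes (`@broadcast`, `@g.us`, `@s.whatsapp.net`) are the ones WhatsApp Web uses.

```typescript
// Hypothetical types and helpers; only the rules themselves come from the notes above.
type InboundRoute =
  | { kind: "ignored" }
  | { kind: "group"; sessionKey: string }
  | { kind: "direct"; sender: string; allowed: boolean; selfChat: boolean };

// User JIDs look like "15551234567@s.whatsapp.net", optionally with a ":device" suffix.
function jidToE164(jid: string): string {
  const [userPart = ""] = jid.split("@");
  const [user = ""] = userPart.split(":");
  return `+${user}`;
}

function routeInbound(remoteJid: string, selfE164: string, allowFrom: string[]): InboundRoute {
  // Status updates and broadcast lists are dropped before any other check.
  if (remoteJid.endsWith("@broadcast")) return { kind: "ignored" };

  // Groups are keyed by their JID; the allowlist does not apply to them.
  if (remoteJid.endsWith("@g.us")) {
    return { kind: "group", sessionKey: `whatsapp:group:${remoteJid}` };
  }

  // Direct chats: an empty allowlist falls back to the gateway's own number.
  const sender = jidToE164(remoteJid);
  const effectiveAllowlist = allowFrom.length > 0 ? allowFrom : [selfE164];
  return {
    kind: "direct",
    sender,
    allowed: effectiveAllowlist.includes(sender),
    selfChat: sender === selfE164, // assumption: self-chat means "message from own number"
  };
}
```

A caller would skip read receipts and mention JIDs when `selfChat` is true, and drop the message when `allowed` is false.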
Message normalization (what the model sees)
- `Body` is the current message body with envelope.
- Quoted reply context is always appended: `[Replying to +1555 id:ABC123] <quoted text or <media:...>> [/Replying]`
- Reply metadata also set:
  - `ReplyToId` = stanzaId
  - `ReplyToBody` = quoted body or media placeholder
  - `ReplyToSender` = E.164 when known
- Media-only inbound messages use placeholders:
  - `<media:image|video|audio|document|sticker>`
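As a rough sketch of the normalization described above, the function below appends the reply block and fills the reply metadata. The `QuotedReply` input shape, the function names, the newline separators inside the block, and the "unknown sender" fallback are all assumptions made for illustration; only the field names and the `[Replying to ...]` tag format come from this document.

```typescript
type MediaKind = "image" | "video" | "audio" | "document" | "sticker";

// Hypothetical input shape for the quoted message.
interface QuotedReply {
  stanzaId: string;
  senderE164?: string; // only set when the sender's number is known
  text?: string;
  mediaKind?: MediaKind;
}

interface NormalizedMessage {
  Body: string;
  ReplyToId?: string;
  ReplyToBody?: string;
  ReplyToSender?: string;
}

function mediaPlaceholder(kind: MediaKind): string {
  return `<media:${kind}>`;
}

function normalizeInbound(body: string, quoted?: QuotedReply): NormalizedMessage {
  if (!quoted) return { Body: body };

  // Quoted text wins; a media-only quote falls back to its placeholder.
  const replyBody =
    quoted.text ?? (quoted.mediaKind ? mediaPlaceholder(quoted.mediaKind) : "");
  const who = quoted.senderE164 ?? "unknown sender"; // assumed fallback label
  const block = `[Replying to ${who} id:${quoted.stanzaId}]\n${replyBody}\n[/Replying]`;

  const out: NormalizedMessage = {
    Body: `${body}\n\n${block}`,
    ReplyToId: quoted.stanzaId,
    ReplyToBody: replyBody,
  };
  if (quoted.senderE164 !== undefined) out.ReplyToSender = quoted.senderE164;
  return out;
}
```

For a media-only inbound message, the caller would pass `mediaPlaceholder("image")` (or the matching kind) as `body`.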
Groups
- Groups map to `whatsapp:group:<jid>` sessions.
- Activation modes:
  - `mention` (default): requires @mention or regex match.
  - `always`: always triggers.
- `/activation mention|always` is owner-only.
- Owner = `whatsapp.allowFrom` (or self E.164 if unset).
- History injection:
  - Recent messages (default 50) inserted under: `[Chat messages since your last reply - for context]`
  - Current message under: `[Current message - respond to this]`
  - Sender suffix appended: `[from: Name (+E164)]`
- Group metadata cached 5 min (subject + participants).
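The activation check and history injection can be illustrated as follows. This is a sketch under stated assumptions: `GroupMessage`, `shouldTrigger`, and `buildGroupPrompt` are hypothetical names, and the exact line layout (one message per line, a blank line between the two sections) is a guess. The section headers and the sender suffix are taken from the list above.

```typescript
type ActivationMode = "mention" | "always";

// Hypothetical shape for a parsed group message.
interface GroupMessage {
  senderName: string;
  senderE164: string;
  text: string;
  mentionsSelf: boolean; // true when the bot's JID is in the message's mention list
}

function shouldTrigger(mode: ActivationMode, msg: GroupMessage, mentionPatterns: RegExp[]): boolean {
  if (mode === "always") return true;
  // Patterns should not use the "g" flag, since test() is stateful on global regexes.
  return msg.mentionsSelf || mentionPatterns.some((re) => re.test(msg.text));
}

function formatLine(m: GroupMessage): string {
  return `${m.text} [from: ${m.senderName} (${m.senderE164})]`;
}

function buildGroupPrompt(history: GroupMessage[], current: GroupMessage, historyLimit = 50): string {
  // slice(-0) would return everything, so a zero limit is handled explicitly.
  const recent = historyLimit > 0 ? history.slice(-historyLimit) : [];
  const parts: string[] = [];
  if (recent.length > 0) {
    parts.push("[Chat messages since your last reply - for context]", ...recent.map(formatLine), "");
  }
  parts.push("[Current message - respond to this]", formatLine(current));
  return parts.join("\n");
}
```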
Reply delivery (threading)
- WhatsApp Web sends standard messages (no quoted reply threading in the current gateway).
- Reply tags are ignored on this surface.
Outbound send (text + media)
- Uses active web listener; error if gateway not running.
- Text chunking: 4k max per message.
- Media:
- Image/video/audio/document supported.
  - Audio sent as PTT; `audio/ogg` => `audio/ogg; codecs=opus`.
  - Caption only on first media item.
  - Media fetch supports HTTP(S) and local paths.
  - Animated GIFs: WhatsApp expects MP4 with `gifPlayback: true` for inline looping.
    - CLI: `clawdbot send --media <mp4> --gif-playback`
    - Gateway: `send` params include `gifPlayback: true`
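To make the chunking and caption rules concrete, here is a minimal sketch. It assumes "4k" means 4000 characters and that chunks prefer to break on whitespace; neither detail is confirmed by this document, and `chunkText` and `planMedia` are hypothetical names.

```typescript
// Split long text into chunks of at most `limit` characters (assumed 4000).
function chunkText(text: string, limit = 4000): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer the last newline or space inside the window; otherwise cut hard at the limit.
    const head = rest.slice(0, limit);
    const breakAt = Math.max(head.lastIndexOf("\n"), head.lastIndexOf(" "));
    const cut = breakAt > 0 ? breakAt : limit;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\s/, ""); // drop the one separator we broke on
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Attach the caption to the first media item only.
function planMedia(urls: string[], caption: string): Array<{ url: string; caption?: string }> {
  return urls.map((url, i) => (i === 0 ? { url, caption } : { url }));
}
```

Breaking on whitespace keeps words intact at chunk boundaries; a hard cut is used only when a window contains no whitespace at all.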
Media limits + optimization
- Default cap: 5 MB (per media item).
- Override: `agent.mediaMaxMb`.
- Images are auto-optimized to JPEG under cap (resize + quality sweep).
- Oversize media => error; media reply falls back to text warning.
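The "resize + quality sweep" idea can be sketched as a nested search over sizes and JPEG qualities that stops at the first result under the cap. Everything specific here is an assumption: the size and quality ladders, the 1 MB = 1024 * 1024 bytes conversion, and the injected `encode` callback, which stands in for whatever image library the gateway actually uses.

```typescript
const DEFAULT_MEDIA_MAX_MB = 5;

// Assumes binary megabytes; the real conversion is not stated in this document.
function maxMediaBytes(mediaMaxMb?: number): number {
  return (mediaMaxMb ?? DEFAULT_MEDIA_MAX_MB) * 1024 * 1024;
}

// Hypothetical encoder: returns JPEG bytes for a given longest side and quality.
type EncodeJpeg = (maxSide: number, quality: number) => Uint8Array;

interface OptimizedImage {
  bytes: Uint8Array;
  maxSide: number;
  quality: number;
}

function optimizeUnderCap(
  encode: EncodeJpeg,
  capBytes: number,
  sides: number[] = [2048, 1536, 1280, 1024, 800],
  qualities: number[] = [80, 70, 60, 50, 40],
): OptimizedImage | null {
  // Try the largest size first and lower quality before shrinking further.
  for (const maxSide of sides) {
    for (const quality of qualities) {
      const bytes = encode(maxSide, quality);
      if (bytes.byteLength <= capBytes) return { bytes, maxSide, quality };
    }
  }
  return null; // still oversize: the caller reports an error or falls back to a text warning
}
```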
Heartbeats
- Gateway heartbeat logs connection health (`web.heartbeatSeconds`, default 60s).
- Agent heartbeat is global (`agent.heartbeat.*`) and runs in the main session.
  - Uses `HEARTBEAT` prompt + `HEARTBEAT_OK` skip behavior.
  - Delivery defaults to the last used channel (or configured target).
Reconnect behavior
- Backoff policy: `web.reconnect`:
  - `initialMs`, `maxMs`, `factor`, `jitter`, `maxAttempts`.
- If maxAttempts reached, web monitoring stops (degraded).
- Logged-out => stop and require re-link.
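One common way to combine these five parameters is exponential backoff with proportional jitter, sketched below. The field names match `web.reconnect`, but their exact semantics are assumptions here: `jitter` is treated as a +/- fraction of the base delay, attempts are 1-based, and exceeding `maxAttempts` returns `null` to signal that monitoring should stop.

```typescript
interface ReconnectPolicy {
  initialMs: number;
  maxMs: number;
  factor: number;
  jitter: number; // assumed: fraction of the base delay, e.g. 0.2 for +/-20%
  maxAttempts: number;
}

// Returns the delay before the given 1-based attempt, or null once attempts are exhausted.
function backoffDelay(
  policy: ReconnectPolicy,
  attempt: number,
  random: () => number = Math.random,
): number | null {
  if (attempt > policy.maxAttempts) return null;
  // Exponential growth, capped at maxMs.
  const base = Math.min(policy.maxMs, policy.initialMs * Math.pow(policy.factor, attempt - 1));
  // random() in [0, 1) maps to a spread in [-jitter, +jitter) of the base delay.
  const spread = base * policy.jitter * (random() * 2 - 1);
  return Math.max(0, Math.round(base + spread));
}
```

Passing a fixed `random` function makes the schedule deterministic, which is convenient in tests.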
Config quick map
- `whatsapp.allowFrom` (DM allowlist).
- `whatsapp.groups` (group mention gating defaults/overrides)
- `routing.groupChat.mentionPatterns`
- `routing.groupChat.historyLimit`
- `messages.messagePrefix` (inbound prefix)
- `messages.responsePrefix` (outbound prefix)
- `agent.mediaMaxMb`
- `agent.heartbeat.every`
- `agent.heartbeat.model` (optional override)
- `agent.heartbeat.target`
- `agent.heartbeat.to`
- `session.*` (scope, idle, store; `mainKey` is ignored)
- `web.enabled` (disable provider startup when false)
- `web.heartbeatSeconds`
- `web.reconnect.*`
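For orientation, a subset of these keys is shown below as a nested object. This is illustrative only: the key names come from the quick map, but the value shapes are assumptions, the phone number and pattern are placeholders, and the `web.reconnect` numbers are invented rather than documented defaults. Keys whose shape is not described here (`whatsapp.groups`, `agent.heartbeat.*`, `session.*`) are left out.

```typescript
// Illustrative values; only historyLimit (50), mediaMaxMb (5) and
// heartbeatSeconds (60) reflect defaults stated in this document.
const exampleConfig = {
  whatsapp: {
    allowFrom: ["+15551234567"], // placeholder E.164 number
  },
  routing: {
    groupChat: {
      mentionPatterns: ["\\bclawd\\b"], // placeholder pattern
      historyLimit: 50,
    },
  },
  agent: {
    mediaMaxMb: 5,
  },
  web: {
    enabled: true,
    heartbeatSeconds: 60,
    reconnect: { initialMs: 2000, maxMs: 30000, factor: 2, jitter: 0.2, maxAttempts: 10 },
  },
};
```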
Logs + troubleshooting
- Subsystems: `whatsapp/inbound`, `whatsapp/outbound`, `web-heartbeat`, `web-reconnect`.
- Log file: `/tmp/clawdbot/clawdbot-YYYY-MM-DD.log` (configurable).
- Troubleshooting guide: `docs/troubleshooting.md`.
Tests
- `src/web/auto-reply.test.ts` (mention gating, history injection, reply flow)
- `src/web/monitor-inbox.test.ts` (inbound parsing + reply context)
- `src/web/outbound.test.ts` (send mapping + media)