---
summary: "Image and media handling rules for send, gateway, and agent replies"
read_when:
  - Modifying media pipeline or attachments
---

# Image & Media Support — 2025-12-05

Clawdbot is now **web-only** (Baileys). This document captures the current media handling rules for send, gateway, and agent replies.

## Goals

- Send media with optional captions via `clawdbot message send --media`.
- Allow auto-replies from the web inbox to include media alongside text.
- Keep per-type limits sane and predictable.

## CLI Surface

- `clawdbot message send --media [--message ]`
  - `--media` optional; caption can be empty for media-only sends.
  - `--dry-run` prints the resolved payload; `--json` emits `{ provider, to, messageId, mediaUrl, caption }`.

## Web Provider Behavior

- Input: local file path **or** HTTP(S) URL.
- Flow: load into a Buffer, detect media kind, and build the correct payload:
  - **Images:** resize & recompress to JPEG (max side 2048px) targeting `agents.defaults.mediaMaxMb` (default 5 MB), capped at 6 MB.
  - **Audio/Voice/Video:** pass-through up to 16 MB; audio is sent as a voice note (`ptt: true`).
  - **Documents:** anything else, up to 100 MB, with filename preserved when available.
- WhatsApp GIF-style playback: send an MP4 with `gifPlayback: true` (CLI: `--gif-playback`) so mobile clients loop inline.
- MIME detection prefers magic bytes, then headers, then file extension.
- Caption comes from `--message` or `reply.text`; empty caption is allowed.
- Logging: non-verbose shows `↩️`/`✅`; verbose includes size and source path/URL.

## Auto-Reply Pipeline

- `getReplyFromConfig` returns `{ text?, mediaUrl?, mediaUrls? }`.
- When media is present, the web sender resolves local paths or URLs using the same pipeline as `clawdbot message send`.
- Multiple media entries are sent sequentially if provided.
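
The fan-out described in the auto-reply bullets can be sketched as a pure planning step that turns one reply into an ordered list of sends. This is a minimal illustration, not Clawdbot's actual implementation: `ReplyPayload`, `OutboundSend`, and `planSends` are hypothetical names, and two behaviors are assumptions the text does not confirm (that `mediaUrls` wins over `mediaUrl` when both are set, and that the text is attached as a caption to the first media item only).

```typescript
// Hypothetical shape mirroring the `{ text?, mediaUrl?, mediaUrls? }` reply.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
}

// Hypothetical unit of work: one outbound message.
interface OutboundSend {
  text?: string;
  mediaUrl?: string;
}

// Turn a reply into an ordered list of sends, to be executed one after another.
function planSends(reply: ReplyPayload): OutboundSend[] {
  const text = reply.text;

  // Assumption: a non-empty `mediaUrls` takes precedence over `mediaUrl`.
  const media: string[] =
    reply.mediaUrls && reply.mediaUrls.length > 0
      ? reply.mediaUrls
      : reply.mediaUrl
        ? [reply.mediaUrl]
        : [];

  if (media.length === 0) {
    // Text-only reply, or nothing to send at all.
    return text ? [{ text }] : [];
  }

  // Assumption: the caption rides on the first media item only.
  return media.map((mediaUrl, i) =>
    i === 0 && text ? { mediaUrl, text } : { mediaUrl }
  );
}
```

Keeping the planning step free of I/O makes the sequential ordering easy to assert on in tests without touching the provider.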

## Inbound Media to Commands (Pi)

- When inbound web messages include media, Clawdbot downloads to a temp file and exposes templating variables:
  - `{{MediaUrl}}` pseudo-URL for the inbound media.
  - `{{MediaPath}}` local temp path written before running the command.
- When a per-session Docker sandbox is enabled, inbound media is copied into the sandbox workspace and `MediaPath`/`MediaUrl` are rewritten to a relative path like `media/inbound/`.
- Audio transcription (if configured via `tools.audio.transcription`) runs before templating and can replace `Body` with the transcript.

## Limits & Errors

- Images: ~6 MB cap after recompression.
- Audio/voice/video: 16 MB cap; documents: 100 MB cap.
- Oversize or unreadable media → clear error in logs and the reply is skipped.

## Notes for Tests

- Cover send + reply flows for image/audio/document cases.
- Validate recompression for images (size bound) and voice-note flag for audio.
- Ensure multi-media replies fan out as sequential sends.
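
For test fixtures, the per-type caps listed under Limits & Errors could be expressed as a small lookup plus a size check. The sketch below is illustrative only: `MediaKind`, `MAX_BYTES`, `kindFromMime`, and `checkSize` are hypothetical names, and it assumes that 1 MB means 1024 × 1024 bytes, that the caps are inclusive, and that the image cap is checked against the recompressed output.

```typescript
type MediaKind = "image" | "audio" | "video" | "document";

// Assumption: 1 MB = 1024 * 1024 bytes.
const MB = 1024 * 1024;

// Caps taken from the Limits & Errors list.
const MAX_BYTES: Record<MediaKind, number> = {
  image: 6 * MB, // applies after recompression
  audio: 16 * MB,
  video: 16 * MB,
  document: 100 * MB,
};

// Hypothetical helper: derive a coarse media kind from a MIME type.
// Anything that is not image/audio/video falls back to "document".
function kindFromMime(mime: string): MediaKind {
  const major = mime.split("/")[0];
  if (major === "image" || major === "audio" || major === "video") {
    return major;
  }
  return "document";
}

type SizeCheck = { ok: true } | { ok: false; reason: string };

// Assumption: a payload exactly at the cap is still accepted.
function checkSize(kind: MediaKind, bytes: number): SizeCheck {
  const limit = MAX_BYTES[kind];
  if (bytes > limit) {
    return {
      ok: false,
      reason: `${kind} is ${bytes} bytes, over the ${limit}-byte cap`,
    };
  }
  return { ok: true };
}
```

A test for the oversize path might call `checkSize(kindFromMime("audio/ogg"), 17 * MB)` and assert that the result is not `ok`, matching the rule that oversize media is logged and the reply skipped.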