---
summary: "Image and media handling rules for send, gateway, and agent replies"
read_when:
  - Modifying media pipeline or attachments
---
<!-- {% raw %} -->
# Image & Media Support — 2025-12-05

CLAWDIS is now **web-only** (Baileys). This document captures the current media handling rules for send, gateway, and agent replies.

## Goals
- Send media with optional captions via `clawdis send --media`.
- Allow auto-replies from the web inbox to include media alongside text.
- Keep per-type limits sane and predictable.

## CLI Surface
- `clawdis send --media <path-or-url> [--message <caption>]`
  - `--media` is optional; the caption can be empty for media-only sends.
  - `--dry-run` prints the resolved payload; `--json` emits `{ provider, to, messageId, mediaUrl, caption }`.

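For scripts that consume the `--json` output, the following is a minimal sketch of typing and validating that object. Only the five field names come from the list above; the field types, the nullable `mediaUrl`/`caption`, and the `parseSendResult` helper are assumptions made for illustration.

```typescript
// Hypothetical shape of the `--json` output. Only the field names are
// documented; the types below are assumptions.
interface SendResult {
  provider: string;
  to: string;
  messageId: string;
  mediaUrl: string | null; // assumed null/absent for text-only sends
  caption: string | null;  // assumed null/absent when no caption was given
}

// Parse one JSON document and check that the expected fields are present.
// (Hypothetical helper, not part of the CLI.)
function parseSendResult(raw: string): SendResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const obj = data as Record<string, unknown>;

  const requireString = (key: string): string => {
    const value = obj[key];
    if (typeof value !== "string") {
      throw new Error(`missing or non-string field: ${key}`);
    }
    return value;
  };

  const optionalString = (key: string): string | null => {
    const value = obj[key];
    if (value === undefined || value === null) return null;
    if (typeof value !== "string") throw new Error(`non-string field: ${key}`);
    return value;
  };

  return {
    provider: requireString("provider"),
    to: requireString("to"),
    messageId: requireString("messageId"),
    mediaUrl: optionalString("mediaUrl"),
    caption: optionalString("caption"),
  };
}
```

Validating at the boundary keeps downstream code from silently handling a payload whose shape has drifted.
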
## Web Provider Behavior
- Input: local file path **or** HTTP(S) URL.
- Flow: load into a Buffer, detect media kind, and build the correct payload:
  - **Images:** resize & recompress to JPEG (max side 2048px) targeting `agent.mediaMaxMb` (default 5 MB), capped at 6 MB.
  - **Audio/Voice/Video:** pass-through up to 16 MB; audio is sent as a voice note (`ptt: true`).
  - **Documents:** anything else, up to 100 MB, with filename preserved when available.
- MIME detection prefers magic bytes, then headers, then file extension.
- Caption comes from `--message` or `reply.text`; empty caption is allowed.
- Logging: non-verbose shows `↩️`/`✅`; verbose includes size and source path/URL.

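The detection order and the kind split described above could be sketched roughly as follows. This is not the project's actual code: `sniffMime`, `pickMime`, `classifyMedia`, and the `MimeHints` shape are hypothetical names, the sniffer recognises only JPEG and PNG signatures, and treating an unknown MIME type as a document is an assumption based on "anything else".

```typescript
type MediaKind = "image" | "audio" | "video" | "document";

// Candidate MIME types from the three sources, in preference order.
interface MimeHints {
  sniffed?: string;   // derived from magic bytes
  header?: string;    // e.g. an HTTP Content-Type value
  extension?: string; // mapped from the file extension
}

// Minimal magic-byte check for two common image formats (illustrative only).
function sniffMime(bytes: Uint8Array): string | undefined {
  const startsWith = (signature: number[]): boolean =>
    bytes.length >= signature.length &&
    signature.every((byte, i) => bytes[i] === byte);
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  return undefined;
}

// Prefer magic bytes, then headers, then file extension.
function pickMime(hints: MimeHints): string | undefined {
  for (const candidate of [hints.sniffed, hints.header, hints.extension]) {
    // Drop parameters such as "; charset=utf-8" and normalise case.
    const mime = ((candidate ?? "").split(";")[0] ?? "").trim().toLowerCase();
    if (mime) return mime;
  }
  return undefined;
}

// Map a MIME type to the payload kind; anything unrecognised is a document.
function classifyMedia(mime: string | undefined): MediaKind {
  if (!mime) return "document"; // assumption: unknown type falls back to document
  if (mime.startsWith("image/")) return "image";
  if (mime.startsWith("audio/")) return "audio";
  if (mime.startsWith("video/")) return "video";
  return "document";
}
```

A caller would combine the two steps, for example `classifyMedia(pickMime({ sniffed: sniffMime(bytes), header, extension }))`, so a misleading file extension cannot override what the bytes say.
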
## Auto-Reply Pipeline
- `getReplyFromConfig` returns `{ text?, mediaUrl?, mediaUrls? }`.
- When media is present, the web sender resolves local paths or URLs using the same pipeline as `clawdis send`.
- Multiple media entries are sent sequentially if provided.

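One way to picture the fan-out is a planning step that turns the reply object into an ordered list of sends, followed by a loop that awaits each send in turn. The sketch below rests on stated assumptions: `planSends`, `sendSequentially`, and `OutboundSend` are hypothetical names, `mediaUrls` is assumed to take precedence over `mediaUrl`, and the text is assumed to ride as the caption of the first media item only.

```typescript
// Shape returned by getReplyFromConfig, per the list above.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
}

// Hypothetical description of one outbound message.
type OutboundSend =
  | { kind: "text"; text: string }
  | { kind: "media"; mediaUrl: string; caption: string };

function planSends(reply: ReplyPayload): OutboundSend[] {
  // Assumption: a non-empty `mediaUrls` wins over a single `mediaUrl`.
  const list = reply.mediaUrls ?? [];
  const media = list.length > 0 ? list : reply.mediaUrl ? [reply.mediaUrl] : [];
  const text = reply.text ?? "";

  if (media.length === 0) {
    return text ? [{ kind: "text", text }] : [];
  }
  // Assumption: the text is attached as a caption to the first item only.
  return media.map((mediaUrl, index): OutboundSend => ({
    kind: "media",
    mediaUrl,
    caption: index === 0 ? text : "",
  }));
}

// Send one message at a time, awaiting each before starting the next.
async function sendSequentially(
  sends: OutboundSend[],
  sendOne: (send: OutboundSend) => Promise<void>,
): Promise<void> {
  for (const send of sends) {
    await sendOne(send);
  }
}
```

Keeping the plan separate from the transport makes the ordering easy to test without a live connection.
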
## Inbound Media to Commands (Pi)
- When inbound web messages include media, CLAWDIS downloads it to a temp file and exposes templating variables:
  - `{{MediaUrl}}`: pseudo-URL for the inbound media.
  - `{{MediaPath}}`: local temp path written before running the command.
- Audio transcription (if configured) runs before templating and can replace `Body` with the transcript.

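The ordering above (transcribe first, then substitute) can be illustrated with a small sketch. `applyTemplate` and `withTranscript` are hypothetical helpers, not the project's templating code; replacing unset variables with an empty string is an assumption, and the sample pseudo-URL and temp path in the usage note are made up.

```typescript
// Variables mentioned in this section; all are optional in this sketch.
type TemplateContext = Partial<Record<"Body" | "MediaUrl" | "MediaPath", string>>;

// If a transcript exists, it replaces Body before templating runs.
function withTranscript(ctx: TemplateContext, transcript?: string): TemplateContext {
  return transcript ? { ...ctx, Body: transcript } : ctx;
}

// Replace {{Name}} placeholders with values from the context.
function applyTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    const value = ctx[name as keyof TemplateContext];
    return value ?? ""; // assumption: unknown or unset variables become empty
  });
}
```

For example, `applyTemplate("transcode {{MediaPath}}", withTranscript({ MediaPath: "/tmp/in.ogg" }, "hello"))` yields `transcode /tmp/in.ogg`, while a template that references `{{Body}}` would see the transcript `hello`.
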
## Limits & Errors
- Images: ~6 MB cap after recompression.
- Audio/voice/video: 16 MB cap; documents: 100 MB cap.
- Oversize or unreadable media: a clear error is logged and the reply is skipped.

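The per-type caps lend themselves to a single lookup table. The sketch below is illustrative only: `MAX_BYTES`, `checkSize`, and the error wording are hypothetical, and interpreting "MB" as 1024 × 1024 bytes is an assumption, since the document does not say which convention applies.

```typescript
type MediaKind = "image" | "audio" | "video" | "document";

const MB = 1024 * 1024; // assumption: binary megabytes

// Caps from this section, keyed by media kind.
const MAX_BYTES: Record<MediaKind, number> = {
  image: 6 * MB, // applies after recompression
  audio: 16 * MB,
  video: 16 * MB,
  document: 100 * MB,
};

type LimitCheck = { ok: true } | { ok: false; error: string };

// Return an error description instead of throwing, so the caller can log it
// and skip the reply.
function checkSize(kind: MediaKind, sizeBytes: number): LimitCheck {
  const limit = MAX_BYTES[kind];
  if (sizeBytes > limit) {
    const actual = (sizeBytes / MB).toFixed(1);
    return { ok: false, error: `${kind} is ${actual} MB, over the ${limit / MB} MB cap` };
  }
  return { ok: true };
}
```

Returning a result object rather than throwing matches the behaviour described above, where an oversize file leads to a logged error and a skipped reply rather than a crash.
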
## Notes for Tests
- Cover send + reply flows for image/audio/document cases.
- Validate recompression for images (size bound) and voice-note flag for audio.
- Ensure multi-media replies fan out as sequential sends.

<!-- {% endraw %} -->