clawdbot/docs/images.md
2025-12-09 17:51:05 +00:00


summary: Image and media handling rules for send, gateway, and agent replies
read_when: Modifying media pipeline or attachments

Image & Media Support — 2025-12-05

CLAWDIS is now web-only (Baileys). This document captures the current media handling rules for send, gateway, and agent replies.

Goals

  • Send media with optional captions via clawdis send --media.
  • Allow auto-replies from the web inbox to include media alongside text.
  • Keep per-type limits sane and predictable.

CLI Surface

  • clawdis send --media <path-or-url> [--message <caption>]
    • --media is optional; the caption can be empty for media-only sends.
    • --dry-run prints the resolved payload; --json emits { provider, to, messageId, mediaUrl, caption }.
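
For scripts that consume the --json output, a minimal sketch of a parser is shown below. Only the five key names come from the list above; the value types are not specified here, so they are left as unknown, and the assumption that every key is always present (even for text-only sends) is a guess. The names parseSendResult and SendResult are hypothetical.

```typescript
// Key names taken from the --json description; everything else is an assumption.
const SEND_RESULT_KEYS = ["provider", "to", "messageId", "mediaUrl", "caption"] as const;
type SendResultKey = (typeof SEND_RESULT_KEYS)[number];
// Value types are not documented, so they stay `unknown` in this sketch.
type SendResult = Record<SendResultKey, unknown>;

// Hypothetical helper: parse the CLI's stdout and check that the expected keys exist.
function parseSendResult(stdout: string): SendResult {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const missing = SEND_RESULT_KEYS.filter((key) => !(key in parsed));
  if (missing.length > 0) {
    throw new Error(`missing keys: ${missing.join(", ")}`);
  }
  return parsed as SendResult;
}
```

If some keys turn out to be omitted for certain sends, the presence check would need to be relaxed accordingly.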

Web Provider Behavior

  • Input: local file path or HTTP(S) URL.
  • Flow: load into a Buffer, detect media kind, and build the correct payload:
    • Images: resize & recompress to JPEG (max side 2048px) targeting inbound.reply.mediaMaxMb (default 5MB), capped at 6MB.
    • Audio/Voice/Video: pass-through up to 16MB; audio is sent as a voice note (ptt: true).
    • Documents: anything else, up to 100MB, with filename preserved when available.
  • MIME detection prefers magic bytes, then headers, then file extension.
  • Caption comes from --message or reply.text; empty caption is allowed.
  • Logging: non-verbose shows ↩️/; verbose includes size and source path/URL.
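
The detection order described above (magic bytes, then headers, then file extension) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: detectMime, sniffMagic, kindFromMime, and both lookup tables are hypothetical, the signature list is deliberately tiny, and treating a generic application/octet-stream header as "no information" is a choice made for this sketch.

```typescript
type MediaKind = "image" | "audio" | "video" | "document";

// A few well-known magic-byte signatures (far from exhaustive).
const SIGNATURES: Array<{ mime: string; bytes: number[] }> = [
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47] },
  { mime: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46] },
];

// Illustrative extension table used only as the last fallback.
const EXTENSIONS: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  mp3: "audio/mpeg",
  ogg: "audio/ogg",
  mp4: "video/mp4",
  pdf: "application/pdf",
};

function sniffMagic(data: Uint8Array): string | undefined {
  for (const sig of SIGNATURES) {
    if (sig.bytes.every((b, i) => data[i] === b)) return sig.mime;
  }
  return undefined;
}

// Priority: magic bytes, then the Content-Type header, then the file extension.
function detectMime(data: Uint8Array, headerType?: string, fileName?: string): string {
  const fromMagic = sniffMagic(data);
  if (fromMagic) return fromMagic;

  // Drop parameters such as "; charset=utf-8" before comparing.
  const fromHeader = headerType?.split(";")[0]?.trim().toLowerCase();
  if (fromHeader && fromHeader !== "application/octet-stream") return fromHeader;

  const ext = fileName?.split(".").pop()?.toLowerCase();
  const fromExt = ext ? EXTENSIONS[ext] : undefined;
  return fromExt ?? "application/octet-stream";
}

// Map a MIME type to the media kinds listed above; anything else is a document.
function kindFromMime(mime: string): MediaKind {
  if (mime.startsWith("image/")) return "image";
  if (mime.startsWith("audio/")) return "audio";
  if (mime.startsWith("video/")) return "video";
  return "document";
}
```

With this ordering, a file named a.png whose first bytes are a JPEG signature is treated as image/jpeg, because the content wins over the name.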

Auto-Reply Pipeline

  • getReplyFromConfig returns { text?, mediaUrl?, mediaUrls? }.
  • When media is present, the web sender resolves local paths or URLs using the same pipeline as clawdis send.
  • Multiple media entries are sent sequentially if provided.
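
A minimal sketch of the sequential fan-out is shown below. The ReplyPayload shape mirrors the fields returned by getReplyFromConfig; everything else is hypothetical: collectMedia, sendReply, and the SendFn callback are illustrative names, and two behaviors are assumptions rather than documented rules (mediaUrl is placed ahead of mediaUrls, and the text is attached as a caption to the first media item only).

```typescript
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
}

// Hypothetical sender callback: one call per outgoing message.
type SendFn = (media: string | undefined, caption: string | undefined) => Promise<void>;

// Merge the single and plural fields into one ordered list (ordering is an assumption).
function collectMedia(reply: ReplyPayload): string[] {
  const single = reply.mediaUrl ? [reply.mediaUrl] : [];
  return single.concat(reply.mediaUrls ?? []);
}

// Send each media entry one after another; returns the number of sends performed.
async function sendReply(reply: ReplyPayload, send: SendFn): Promise<number> {
  const media = collectMedia(reply);
  if (media.length === 0) {
    if (!reply.text) return 0; // nothing to send
    await send(undefined, reply.text); // text-only reply
    return 1;
  }
  for (let i = 0; i < media.length; i++) {
    // Awaiting inside the loop keeps the sends strictly sequential.
    // Assumption: only the first item carries the caption.
    await send(media[i], i === 0 ? reply.text : undefined);
  }
  return media.length;
}
```

Awaiting each send inside the loop, rather than using Promise.all, is what preserves the order of the media entries.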

Inbound Media to Commands (Pi/Tau)

  • When inbound web messages include media, CLAWDIS downloads to a temp file and exposes templating variables:
    • {{MediaUrl}} pseudo-URL for the inbound media.
    • {{MediaPath}} local temp path written before running the command.
  • Audio transcription (if configured) runs before templating and can replace Body with the transcript.
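
The ordering described above (transcription first, then templating) can be illustrated with a small sketch. The variable names {{MediaUrl}}, {{MediaPath}}, and Body come from the text; buildContext and renderTemplate are hypothetical helpers, and rendering an unset variable as an empty string is an assumption made for this example.

```typescript
type TemplateContext = Record<string, string | undefined>;

// Build the variables for one inbound message. If a transcript exists,
// it replaces Body before any template is rendered.
function buildContext(
  body: string,
  media?: { url: string; path: string },
  transcript?: string,
): TemplateContext {
  return {
    Body: transcript ?? body,
    MediaUrl: media?.url,
    MediaPath: media?.path,
  };
}

// Replace {{Name}} placeholders with values from the context.
// Assumption: unknown or unset variables render as an empty string.
function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    const known = Object.prototype.hasOwnProperty.call(ctx, name);
    return known ? ctx[name] ?? "" : "";
  });
}
```

For example, rendering "describe {{MediaPath}}: {{Body}}" for a voice message with a transcript yields the temp path followed by the transcript rather than the original body.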

Limits & Errors

  • Images: ~6MB cap after recompression.
  • Audio/voice/video: 16MB cap; documents: 100MB cap.
  • Oversize or unreadable media → clear error in logs and the reply is skipped.
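
The caps above can be expressed as a small lookup plus a check, sketched below. The numbers are the ones listed in this document; the helper names (imageTargetBytes, checkSize), the result shape, and the choice of 1 MB = 1024 × 1024 bytes are assumptions for illustration.

```typescript
type MediaKind = "image" | "audio" | "video" | "document";

// Assumption: "MB" is interpreted as 1024 * 1024 bytes in this sketch.
const MB = 1024 * 1024;

// Hard caps per media kind, taken from the limits listed above.
const MAX_BYTES: Record<MediaKind, number> = {
  image: 6 * MB,
  audio: 16 * MB,
  video: 16 * MB,
  document: 100 * MB,
};

// Recompression target for images: the configured inbound.reply.mediaMaxMb
// (default 5), never above the 6MB hard cap.
function imageTargetBytes(mediaMaxMb: number = 5): number {
  return Math.min(mediaMaxMb, 6) * MB;
}

type SizeCheck = { ok: true } | { ok: false; reason: string };

// Returns a reason string that a caller could log before skipping the reply.
function checkSize(kind: MediaKind, sizeBytes: number): SizeCheck {
  const cap = MAX_BYTES[kind];
  if (sizeBytes > cap) {
    return { ok: false, reason: `${kind} is ${sizeBytes} bytes, over the ${cap / MB}MB cap` };
  }
  return { ok: true };
}
```

A caller would log the reason and skip the reply when ok is false, matching the behavior described for oversize media.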

Notes for Tests

  • Cover send + reply flows for image/audio/document cases.
  • Validate recompression for images (size bound) and voice-note flag for audio.
  • Ensure multi-media replies fan out as sequential sends.