docs: document mime-first media handling

This commit is contained in:
Peter Steinberger
2025-11-28 08:07:38 +01:00
parent 7d6a4f5204
commit f63bdda628
2 changed files with 15 additions and 1 deletions

View File

@@ -1,5 +1,17 @@
# Changelog
## 1.2.1 — Unreleased
### Changes
- **Manual heartbeat sends:** `warelay heartbeat` now accepts `--message/--body` with `--provider web|twilio` to push real outbound messages through the same plumbing; `--dry-run` previews payloads without sending.
- **Media MIME-first handling:** Media loading now sniffs magic bytes/header before trusting extensions for both providers; local files with the wrong suffix still get correct MIME and image recompression.
- **Hosted media extensions:** Saved/hosted media (web inbound, webhook hosting, Twilio hosting) now writes files with an extension derived from detected MIME (e.g., `.jpg`, `.png`, `.mp4`), so downstream CLI sends carry the right Content-Type. Added tests covering inbound Baileys downloads and buffer saves.
### Planned / in progress
- **Heartbeat targeting quality:** Allow `warelay heartbeat --provider web --all` to fall back to `inbound.allowFrom` when no sessions exist, and surface a clear error when neither sessions nor allow-list entries are present. Add verbose log lines that state exactly which recipients were chosen and why.
- **Heartbeat delivery preview (Claude path):** Add a dry-run mode that resolves the heartbeat reply (text/media) and prints it without sending, to help test Claude prompt changes safely.
- **Simulated inbound hook (debug):** Optional local-only endpoint to inject synthetic inbound messages into the web relay loop, sharing the same command queue and reply path. Useful for testing auto-replies and heartbeats without WhatsApp.
## 1.2.0 — 2025-11-27
### Changes

View File

@@ -26,6 +26,7 @@ This document defines how `warelay` should handle sending and replying with imag
- Images: **resize + recompress to JPEG** (max side 2048px, quality step-down) to fit under `inbound.reply.mediaMaxMb` (default 5MB) but never above the Web hard cap (6MB).
- Audio/voice and video: pass through up to 16MB; set `ptt: true` for audio to send as a voice note.
- Everything else becomes a document with filename, up to 100MB.
- MIME is detected by magic bytes first (then header, then path); wrong file extensions are tolerated and the detected MIME drives payload kind and recompression.
- Caption uses `--message` or `reply.text`; if caption is empty, send media-only.
- Logging: non-verbose shows `↩️`/`✅` with caption; verbose includes `(media, <bytes>B, <ms>ms fetch)` and the local/remote path.
@@ -45,7 +46,7 @@ This document defines how `warelay` should handle sending and replying with imag
- 404/410 if expired or missing.
- Optional `?delete=1` to self-delete after fetch (used by Twilio fetch hook if we detect first hit).
- Temp storage: `~/.warelay/media`; cleaned on startup (remove files older than 15 minutes) and during TTL eviction.
- Security: no directory listing; only UUID file names; CORS open (Twilio fetch); content-type derived from `mime-types` lookup by extension or `content-type` header on download, else `application/octet-stream`.
- Security: no directory listing; only UUID file names; CORS open (Twilio fetch); content-type derived from sniffed bytes (fallback to header, then extension). Saved files are renamed with an extension that matches the detected MIME so downstream fetches present the correct type.
## Auto-Reply Pipeline
- `getReplyFromConfig` returns `{ text?, mediaUrl? }`.
@@ -60,6 +61,7 @@ This document defines how `warelay` should handle sending and replying with imag
- `{{MediaUrl}}` original URL (Twilio) or pseudo-URL (web).
- `{{MediaPath}}` local temp path written before running the command.
- Size guard: only download if ≤5MB; else skip and log (aligns with the temp media store limit).
- Saved inbound media is named with the detected MIME-based extension (e.g., `.jpg`), so later CLI sends reuse a correct filename/content-type even if WhatsApp omitted an extension.
- Audio/voice notes: if you set `inbound.transcribeAudio.command`, warelay will run that CLI (templated with `{{MediaPath}}`) and replace `Body` with the transcript before continuing the reply flow; verbose logs indicate when transcription runs. The command prompt includes the original media path plus a `Transcript:` section so the model sees both.
## Errors & Messaging