- Keep audioAsVoice-only payloads from being filtered out
- Allow empty payloads through when they carry the flag
- Remove temporary debug logs around audioAsVoice buffering
Co-authored-by: Manuel Hettich <17690367+ManuelHettich@users.noreply.github.com>
- Add [[audio_as_voice]] detection to splitMediaFromOutput()
- Pass audioAsVoice through onBlockReply callback chain
- Buffer audio blocks during streaming, flush at end with correct flag
- Non-audio media still streams immediately
- Fix: emit payloads with audioAsVoice flag even if text is empty
Co-authored-by: Manuel Hettich <17690367+ManuelHettich@users.noreply.github.com>
- Fix disableBlockStreaming logic in telegram/bot.ts to properly enable
block streaming when telegram.streamMode is 'block' regardless of
blockStreamingDefault setting
- Set minChars default to 1 for Telegram block mode so chunks send
immediately on newlines/sentences instead of waiting for 800 chars
- Skip coalescing for Telegram block mode when not explicitly configured
to reduce chunk batching delays
- Fix newline preference to wait for actual newlines instead of breaking
on any whitespace when buffer is under maxChars
Fixes issue where all Telegram messages were batched into one message
at the end instead of streaming as separate messages during generation.
- Default: sendAudio (file with metadata) - preserves old behavior
- Opt-in: [[audio_as_voice]] tag for voice bubble
This is non-breaking - existing integrations keep working.
- Add audioAsVoice option to ReplyPayload type
- Update bot.ts to use sendVoice by default for audio (voice bubble)
- When audioAsVoice is false, use sendAudio (file with metadata)
This allows agents to control voice vs file mode via ReplyPayload.
Use Telegram's sendVoice API for audio files by default, displaying them
as round playable voice bubbles instead of file attachments.
Changes:
- Add asVoice option to TelegramSendOpts (defaults to true)
- When asVoice is true (default): use api.sendVoice() for voice bubbles
- When asVoice is false: use api.sendAudio() for traditional audio files
This gives callers control: voice messages for TTS/quick responses,
audio files for music/podcasts with metadata display.