Allow agents to specify audio mode via inline tag:
- Default: voice bubble (sendVoice)
- [[audio_as_file]]: audio file with metadata (sendAudio)
The tag is stripped from the final message text.
Example agent response:
Here's a podcast episode! [[audio_as_file]]
MEDIA:https://example.com/episode.mp3
- Add audioAsVoice option to ReplyPayload type
- Update bot.ts to use sendVoice by default for audio (voice bubble)
- When audioAsVoice is false, use sendAudio (file with metadata)
This allows agents to control voice vs file mode via ReplyPayload.
Use Telegram's sendVoice API for audio files by default, displaying them
as round playable voice bubbles instead of file attachments.
Changes:
- Add asVoice option to TelegramSendOpts (defaults to true)
- When asVoice is true (default): use api.sendVoice() for voice bubbles
- When asVoice is false: use api.sendAudio() for traditional audio files
This gives callers control: voice messages for TTS/quick responses,
audio files for music/podcasts with metadata display.
Add automatic thinking mode support for Z.AI GLM-4.x models:
- GLM-4.7: Preserved thinking (clear_thinking: false)
- GLM-4.5/4.6: Interleaved thinking (clear_thinking: true)
Uses Z.AI Cloud API format: thinking: { type: "enabled", clear_thinking: boolean }
Includes patches for pi-ai, pi-agent-core, and pi-coding-agent to pass
extraParams through the stream pipeline. User can override via config
or disable via --thinking off.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Convert flat bullet list to Mintlify CardGroup/Card components
- Add icons and visual hierarchy
- Include author attribution and tags
- Add 'Submit Your Project' section with Steps component
- Organize by category with emoji headers
Makes the showcase more visually appealing and easier to scan.
- Keep the chosen mic label visible when it disconnects and show a disconnected hint while falling back to system default.
- Avoid clearing the preferred mic on device changes so it auto-restores when available.
- Add audio input change and default-input logs in voice wake runtime/tester/meter to debug routing.
Why: Start Test is a local verification tool, but it was forwarding transcripts to the gateway/chat, which confused users and made tests look like real commands.
What: stop forwarding from VoiceWakeTester and clarify in docs that the tester does not send to the gateway.
Why: voice wake tests often delivered partial/final transcripts without reliable word timings, so trigger matching failed, timeouts overwrote detections, and test runs/mic capture kept running after UI changes.
What: add text-only/prefix fallback and silence-based detection in the test flow, stop/clean up any prior test, cancel timeout on detection/stop, and tear down meter/test when the Voice Wake tab is inactive. Runtime detection now falls back on final text-only matches when timing is missing. UI state now reflects finalizing and prevents hanging tests.