* fix(voice-call): prevent audio overlap with TTS queue Add a TTS queue to serialize audio playback and prevent overlapping speech during voice calls. Previously, concurrent speak() calls could send audio chunks simultaneously, causing garbled/choppy output. Changes: - Add queueTts() to MediaStreamHandler for sequential TTS playback - Wrap playTtsViaStream() audio sending in the queue - Clear queue on barge-in (when user starts speaking) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(voice-call): use iterative queue processing to prevent heap exhaustion The recursive processQueue() pattern accumulated stack frames, causing JavaScript heap out of memory errors on macOS CI. Convert to while loop for constant stack usage regardless of queue depth. * fix: prevent voice-call TTS overlap (#1713) (thanks @dguido) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Peter Steinberger <steipete@gmail.com>
@clawdbot/voice-call
Official Voice Call plugin for Clawdbot.
Providers:
- Twilio (Programmable Voice + Media Streams)
- Telnyx (Call Control v2)
- Plivo (Voice API + XML transfer + GetInput speech)
- Mock (dev/no network)
Docs: https://docs.clawd.bot/plugins/voice-call
Plugin system: https://docs.clawd.bot/plugin
Install (local dev)
Option A: install via Clawdbot (recommended)
clawdbot plugins install @clawdbot/voice-call
Restart the Gateway afterwards.
Option B: copy into your global extensions folder (dev)
mkdir -p ~/.clawdbot/extensions
cp -R extensions/voice-call ~/.clawdbot/extensions/voice-call
cd ~/.clawdbot/extensions/voice-call && pnpm install
Config
Put under plugins.entries.voice-call.config:
{
provider: "twilio", // or "telnyx" | "plivo" | "mock"
fromNumber: "+15550001234",
toNumber: "+15550005678",
twilio: {
accountSid: "ACxxxxxxxx",
authToken: "your_token"
},
plivo: {
authId: "MAxxxxxxxxxxxxxxxxxxxx",
authToken: "your_token"
},
// Webhook server
serve: {
port: 3334,
path: "/voice/webhook"
},
// Public exposure (pick one):
// publicUrl: "https://example.ngrok.app/voice/webhook",
// tunnel: { provider: "ngrok" },
// tailscale: { mode: "funnel", path: "/voice/webhook" }
outbound: {
defaultMode: "notify" // or "conversation"
},
streaming: {
enabled: true,
streamPath: "/voice/stream"
}
}
Notes:
- Twilio/Telnyx/Plivo require a publicly reachable webhook URL.
mockis a local dev provider (no network calls).
TTS for calls
Voice Call uses the core messages.tts configuration (OpenAI or ElevenLabs) for
streaming speech on calls. You can override it under the plugin config with the
same shape — overrides deep-merge with messages.tts.
{
tts: {
provider: "openai",
openai: {
voice: "alloy"
}
}
}
Notes:
- Edge TTS is ignored for voice calls (telephony audio needs PCM; Edge output is unreliable).
- Core TTS is used when Twilio media streaming is enabled; otherwise calls fall back to provider native voices.
CLI
clawdbot voicecall call --to "+15555550123" --message "Hello from Clawdbot"
clawdbot voicecall continue --call-id <id> --message "Any questions?"
clawdbot voicecall speak --call-id <id> --message "One moment"
clawdbot voicecall end --call-id <id>
clawdbot voicecall status --call-id <id>
clawdbot voicecall tail
clawdbot voicecall expose --mode funnel
Tool
Tool name: voice_call
Actions:
initiate_call(message, to?, mode?)continue_call(callId, message)speak_to_user(callId, message)end_call(callId)get_status(callId)
Gateway RPC
voicecall.initiate(to?, message, mode?)voicecall.continue(callId, message)voicecall.speak(callId, message)voicecall.end(callId)voicecall.status(callId)
Notes
- Uses webhook signature verification for Twilio/Telnyx/Plivo.
responseModel/responseSystemPromptcontrol AI auto-responses.- Media streaming requires
wsand OpenAI Realtime API key.