let5see/clawdbot

Peter Steinberger 45c314fbe6 docs(voice-call): expand plugin guide

2026-01-13 11:42:09 +00:00

summary: Voice Call plugin: outbound and inbound calls via Twilio/Telnyx, with CLI, tools, and streaming

read_when:

You want to place an outbound voice call from Clawdbot
You are configuring or developing the voice-call plugin

Voice Call (plugin)

Voice calls for Clawdbot. Use it to place outbound notifications, run multi-turn phone conversations, and accept inbound calls with an explicit policy.

Current providers:

twilio (Programmable Voice + Media Streams)
telnyx (Call Control v2)
mock (dev/no network)

What you get:

Outbound calls in notify or conversation mode
Inbound calls with allowlist or open policies
Provider webhooks with signature verification
Optional streaming (Twilio Media Streams + OpenAI Realtime STT)
CLI commands, a tool surface, and JSONL call logs

Quick mental model:

Install plugin
Restart Gateway
Configure plugins.entries.voice-call.config
Expose a public webhook URL
Call via clawdbot voicecall ... or the voice_call tool

Where it runs (local vs remote)

The Voice Call plugin runs inside the Gateway process.

If you use a remote Gateway, install and configure the plugin on the machine running the Gateway, then restart the Gateway to load it.

Install

Option A: install from npm (recommended)

clawdbot plugins install @clawdbot/voice-call

Restart the Gateway afterwards.

Option B: install from a local folder (dev, no copying)

clawdbot plugins install ./extensions/voice-call
cd ./extensions/voice-call && pnpm install

Restart the Gateway afterwards.

Note: use pnpm for repo work. Bun is not recommended and can cause issues in other Clawdbot channels (especially WhatsApp and Telegram).

Config overview

All config lives under plugins.entries.voice-call.config. Phone numbers must be in E.164 format (+15550001234).
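A quick way to catch malformed numbers before they reach the provider is to check them against the E.164 shape. The following is a minimal sketch, not the plugin's own validation; the helper name is_e164 is hypothetical, and the regular expression is a common approximation of E.164 (a leading +, then up to 15 digits with a non-zero first digit).

```python
import re

# Common approximation of E.164: "+", a non-zero digit, then 1 to 14 more digits.
_E164 = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if `number` has the E.164 shape (hypothetical helper)."""
    return _E164.fullmatch(number) is not None


print(is_e164("+15550001234"))      # True
print(is_e164("1 (555) 000-1234"))  # False
```

Running a check like this over fromNumber, toNumber, and any allowFrom entries would surface formatting mistakes early.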

Minimal example (Twilio outbound only):

{
  plugins: {
    entries: {
      "voice-call": {
        enabled: true,
        config: {
          provider: "twilio",
          fromNumber: "+15550001234",
          toNumber: "+15550005678",
          twilio: {
            accountSid: "ACxxxxxxxx",
            authToken: "..."
          },
          serve: { port: 3334, bind: "127.0.0.1", path: "/voice/webhook" },
          publicUrl: "https://example.ngrok.app/voice/webhook",
          outbound: { defaultMode: "notify", notifyHangupDelaySec: 3 }
        }
      }
    }
  }
}

Notes:

Twilio/Telnyx require a publicly reachable webhook URL.
mock is a local dev provider (no network calls).
skipSignatureVerification is for local testing only.

Public URL and webhook exposure

Providers send webhooks over the public internet, so the URL that maps to serve.path must be reachable by them.

You have three options:

publicUrl: you already have a public HTTPS URL pointing at the Gateway host.
tunnel: use ngrok or Tailscale (recommended for quick setup).
tailscale: legacy Tailscale serve/funnel config (still supported, but tunnel is preferred).

Example using ngrok:

{
  tunnel: {
    provider: "ngrok",
    ngrokAuthToken: "..."
  }
}

Example using Tailscale Funnel:

{
  tunnel: { provider: "tailscale-funnel" }
}

CLI helper (Tailscale only):

clawdbot voicecall expose --mode funnel

If you use Tailscale Serve without Funnel, the URL is private to your tailnet, so Twilio/Telnyx will not be able to reach it.

Providers

Twilio

Twilio uses Programmable Voice with optional Media Streams for real-time audio.

Required config:

twilio.accountSid and twilio.authToken (or TWILIO_ACCOUNT_SID / TWILIO_AUTH_TOKEN)
A Twilio phone number that can reach your webhook
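The config-or-environment lookup described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the precedence (an explicit config value wins over the environment variable) is an assumption, since the guide does not state which source takes priority.

```python
import os
from typing import Mapping, Optional


def resolve_setting(config_value: Optional[str], env_name: str,
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    # Assumption: an explicit config value takes precedence over the environment.
    return config_value or env.get(env_name)


def resolve_twilio_credentials(twilio_config: Mapping[str, str],
                               env: Mapping[str, str] = os.environ) -> dict:
    """Collect accountSid/authToken from config or env; raise if either is missing."""
    creds = {
        "accountSid": resolve_setting(twilio_config.get("accountSid"),
                                      "TWILIO_ACCOUNT_SID", env),
        "authToken": resolve_setting(twilio_config.get("authToken"),
                                     "TWILIO_AUTH_TOKEN", env),
    }
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise ValueError("missing Twilio settings: " + ", ".join(missing))
    return creds


# Example: accountSid from config, authToken from a (fake) environment.
print(resolve_twilio_credentials({"accountSid": "ACxxxxxxxx"},
                                 env={"TWILIO_AUTH_TOKEN": "secret"}))
```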

Inbound setup:

In the Twilio Console for your phone number, set the Voice webhook to your public serve.path URL (HTTP POST).

Outbound setup:

Outbound calls are created via the Twilio API; the plugin supplies the webhook URL per call.

Streaming (optional, Twilio only):

Enable streaming.enabled and set streaming.streamPath
Provide OPENAI_API_KEY or streaming.openaiApiKey
The stream WebSocket URL is derived from your publicUrl host + streamPath (https -> wss)
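The URL derivation in the last item can be illustrated with a small sketch. It assumes the rule is exactly "keep the host of publicUrl, swap in streamPath, change https to wss"; the function name and the /voice/stream path are hypothetical, and rejecting non-HTTPS URLs is a choice made for this example rather than documented plugin behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_stream_url(public_url: str, stream_path: str) -> str:
    """Build the WebSocket URL from the public URL's host and the stream path."""
    parts = urlsplit(public_url)
    if parts.scheme != "https":
        # Sketch-level choice: only the documented https -> wss mapping is handled.
        raise ValueError("publicUrl must use https")
    # Keep the host (and port, if any); replace the path; drop query and fragment.
    return urlunsplit(("wss", parts.netloc, stream_path, "", ""))


print(derive_stream_url("https://example.ngrok.app/voice/webhook", "/voice/stream"))
# wss://example.ngrok.app/voice/stream
```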

Signature verification:

Webhooks are verified by default.
If you are using the ngrok free tier, leave tunnel.allowNgrokFreeTier set to true so that URL rewriting does not break verification.
Use skipSignatureVerification only for local dev.
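To see why the URL matters for verification, it helps to look at how Twilio's request signing works in general: the X-Twilio-Signature header is a base64-encoded HMAC-SHA1, keyed with the auth token, over the full request URL followed by the POST parameters sorted by name and concatenated as name plus value. The sketch below follows that published scheme; it is not the plugin's implementation, and the function names and sample values are illustrative.

```python
import base64
import hashlib
import hmac


def twilio_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Compute the signature Twilio would send for a form-encoded POST."""
    # Full URL, then each parameter name immediately followed by its value, sorted by name.
    payload = url + "".join(name + params[name] for name in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(auth_token: str, url: str, params: dict[str, str],
                            header_signature: str) -> bool:
    expected = twilio_signature(auth_token, url, params)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_signature.encode("utf-8"))


params = {"CallSid": "CA123", "From": "+15550005678"}  # placeholder values
public_url = "https://example.ngrok.app/voice/webhook"
signature = twilio_signature("auth-token", public_url, params)

print(is_valid_twilio_request("auth-token", public_url, params, signature))  # True
# Verifying against the local bind address instead of the public URL fails:
print(is_valid_twilio_request("auth-token", "http://127.0.0.1:3334/voice/webhook",
                              params, signature))  # False
```

Because the URL is part of the signed payload, any proxy or tunnel that changes the URL seen by the server has to be accounted for when reconstructing it.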

Telnyx

Telnyx uses Call Control v2.

Required config:

telnyx.apiKey and telnyx.connectionId (or TELNYX_API_KEY / TELNYX_CONNECTION_ID)

Inbound setup:

In your Telnyx Call Control App, set the webhook URL to your public serve.path.

Signature verification:

Set telnyx.publicKey to enable Ed25519 signature verification.
If you do not set a public key, webhooks are accepted without verification (not recommended for production).

Transcription:

Telnyx uses its own transcription events to capture the caller's reply for continue.

Mock (dev)

mock is for local testing and does not make network calls.

Call modes

Outbound calls support two modes:

notify: speak a message and auto-hangup after notifyHangupDelaySec.
conversation: keep the call open and allow back-and-forth.

Examples:

clawdbot voicecall call --to "+15555550123" --message "Hello" --mode notify
clawdbot voicecall call --to "+15555550123" --message "Ready to talk?" --mode conversation
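If you drive the CLI from a script, building the argument list explicitly avoids shell-quoting problems in the message text. This is a minimal sketch using only the flags shown above; the helper name build_call_command is hypothetical, and the command is printed rather than executed.

```python
import shlex

MODES = ("notify", "conversation")


def build_call_command(to: str, message: str, mode: str) -> list[str]:
    """Return the argv for `clawdbot voicecall call` with the documented flags."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return ["clawdbot", "voicecall", "call",
            "--to", to, "--message", message, "--mode", mode]


cmd = build_call_command("+15555550123", "Ready to talk?", mode="conversation")
# shlex.join renders the list as a shell-safe string for logging or copy/paste.
print(shlex.join(cmd))
```

To actually place the call, the list could be passed to subprocess.run(cmd, check=True) on a machine where the clawdbot CLI is installed.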

Inbound calls and policy

Inbound calls are blocked by default.

Policies:

disabled: block all inbound calls
allowlist: allow only numbers in allowFrom
pairing: currently behaves like allowlist
open: accept all inbound calls
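The four policies above amount to a small decision function. The sketch below restates them in code; it is not the plugin's implementation, the function name is hypothetical, and matching callers by exact string equality against allowFrom is an assumption.

```python
def should_accept_inbound(policy: str, caller: str, allow_from: list[str]) -> bool:
    """Decide whether to accept an inbound call under the given policy."""
    if policy == "disabled":
        return False
    if policy in ("allowlist", "pairing"):
        # "pairing" currently behaves like "allowlist".
        # Assumption: numbers are compared as exact E.164 strings.
        return caller in allow_from
    if policy == "open":
        return True
    raise ValueError(f"unknown inbound policy: {policy!r}")


allow_from = ["+15550005678"]
print(should_accept_inbound("allowlist", "+15550005678", allow_from))  # True
print(should_accept_inbound("allowlist", "+15550009999", allow_from))  # False
print(should_accept_inbound("disabled", "+15550005678", allow_from))   # False
```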

Inbound greeting:

inboundGreeting controls the first message spoken when a call is accepted.

Auto-responses and models

When a caller speaks, the plugin can auto-respond using the embedded Clawdbot agent.

Key settings:

responseModel: model reference for voice responses (default openai/gpt-4o-mini)
responseSystemPrompt: optional override for the voice system prompt
responseTimeoutMs: response generation timeout

Responses use the same agent system as messaging, including tool access. The default system prompt keeps replies short and conversational (about 1-2 sentences).

Streaming (Twilio only)

When streaming.enabled is on:

The webhook server also accepts WebSocket upgrades at streaming.streamPath.
Audio is forwarded to OpenAI Realtime STT.
Final transcripts are fed into the call manager and used by continue and auto-responses.

Required:

A public HTTPS URL for the Gateway (used to derive wss://...).
OPENAI_API_KEY or streaming.openaiApiKey.

If no OpenAI key is available, streaming does not start and real-time transcripts will not arrive.

Limits and timeouts

These settings are enforced by the call manager:

maxDurationSeconds: auto-hangup after this many seconds (starts when answered).
maxConcurrentCalls: max simultaneous active calls.
transcriptTimeoutMs: how long continue waits for a final transcript.
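As a rough illustration of what enforcing these three settings involves, the sketch below models each one as a small check. The class, function names, and numeric values are hypothetical and do not reflect the call manager's actual code or defaults.

```python
import queue
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    max_duration_seconds: float
    max_concurrent_calls: int
    transcript_timeout_ms: int


def can_start_call(active_calls: int, limits: Limits) -> bool:
    # A new call is allowed only while fewer than the maximum are active.
    return active_calls < limits.max_concurrent_calls


def should_hang_up(answered_at: float, now: float, limits: Limits) -> bool:
    # The duration clock starts when the call is answered, not when it is placed.
    return now - answered_at >= limits.max_duration_seconds


def wait_for_transcript(transcripts: "queue.Queue[str]", limits: Limits) -> Optional[str]:
    # Block until a final transcript arrives, or give up after the timeout.
    try:
        return transcripts.get(timeout=limits.transcript_timeout_ms / 1000)
    except queue.Empty:
        return None


limits = Limits(max_duration_seconds=300, max_concurrent_calls=1,
                transcript_timeout_ms=50)  # illustrative values only
print(can_start_call(active_calls=1, limits=limits))              # False
print(should_hang_up(answered_at=0.0, now=301.0, limits=limits))  # True
print(wait_for_transcript(queue.Queue(), limits))                 # None
```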

Logs and debugging

Calls are appended as JSONL to:

${store}/calls.jsonl if store is set
~/clawd/voice-calls/calls.jsonl otherwise (the default)

Set store if you want a different base directory for call logs.

To tail the call log, use:

clawdbot voicecall tail
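Because the log is plain JSONL, it is also easy to read from a script. The sketch below resolves the path as described above and parses one JSON object per line; the helper names are hypothetical, and no record fields are assumed since the guide does not list them.

```python
import json
from pathlib import Path
from typing import Iterator, Optional


def calls_log_path(store: Optional[str] = None) -> Path:
    """Return ${store}/calls.jsonl, or the default under ~/clawd/voice-calls."""
    base = Path(store).expanduser() if store else Path.home() / "clawd" / "voice-calls"
    return base / "calls.jsonl"


def read_call_records(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of the JSONL file."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


log_path = calls_log_path()
if log_path.exists():
    for record in read_call_records(log_path):
        print(record)
else:
    print(f"no call log at {log_path}")
```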

CLI

clawdbot voicecall call --to "+15555550123" --message "Hello from Clawdbot"
clawdbot voicecall continue --call-id <id> --message "Any questions?"
clawdbot voicecall speak --call-id <id> --message "One moment"
clawdbot voicecall end --call-id <id>
clawdbot voicecall status --call-id <id>
clawdbot voicecall tail
clawdbot voicecall expose --mode funnel

Agent tool

Tool name: voice_call

Actions:

initiate_call (message, to?, mode?)
continue_call (callId, message)
speak_to_user (callId, message)
end_call (callId)
get_status (callId)

If you want a ready-made skill entry, grab it from ClawdHub.com.
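The action list above can be turned into a simple pre-flight check for tool calls. This is a sketch under assumptions: the parameter names come from the list (a trailing ? marks an optional parameter), but the validator itself and the way an action name is paired with its parameters are hypothetical, not the tool's actual schema.

```python
# action -> (required parameters, optional parameters), as listed above
ACTIONS = {
    "initiate_call": ({"message"}, {"to", "mode"}),
    "continue_call": ({"callId", "message"}, set()),
    "speak_to_user": ({"callId", "message"}, set()),
    "end_call": ({"callId"}, set()),
    "get_status": ({"callId"}, set()),
}


def validate_tool_call(action: str, params: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks well-formed."""
    if action not in ACTIONS:
        return [f"unknown action: {action}"]
    required, optional = ACTIONS[action]
    given = set(params)
    problems = [f"missing parameter: {name}" for name in sorted(required - given)]
    problems += [f"unexpected parameter: {name}"
                 for name in sorted(given - required - optional)]
    return problems


print(validate_tool_call("initiate_call", {"message": "Hello", "mode": "notify"}))  # []
print(validate_tool_call("end_call", {}))  # ['missing parameter: callId']
```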

Gateway RPC

voicecall.initiate (to?, message, mode?)
voicecall.continue (callId, message)
voicecall.speak (callId, message)
voicecall.end (callId)
voicecall.status (callId)
