clawdbot/docs/plugins/voice-call.md

---
summary: "Voice Call plugin: outbound and inbound calls via Twilio/Telnyx, with CLI, tools, and streaming"
read_when:
  - You want to place an outbound voice call from Clawdbot
  - You are configuring or developing the voice-call plugin
---

# Voice Call (plugin)

Voice calls for Clawdbot. Use it to place outbound notifications, run multi-turn
phone conversations, and accept inbound calls with an explicit policy.

Current providers:
- `twilio` (Programmable Voice + Media Streams)
- `telnyx` (Call Control v2)
- `mock` (dev/no network)

What you get:
- Outbound calls in notify or conversation mode
- Inbound calls with allowlist or open policies
- Provider webhooks with signature verification
- Optional streaming (Twilio Media Streams + OpenAI Realtime STT)
- CLI commands, a tool surface, and JSONL call logs

Quick mental model:
1. Install plugin
2. Restart Gateway
3. Configure `plugins.entries.voice-call.config`
4. Expose a public webhook URL
5. Call via `clawdbot voicecall ...` or the `voice_call` tool

## Where it runs (local vs remote)

The Voice Call plugin runs inside the Gateway process.

If you use a remote Gateway, install and configure the plugin on the machine
running the Gateway, then restart the Gateway to load it.

## Install

### Option A: install from npm (recommended)

```bash
clawdbot plugins install @clawdbot/voice-call
```

Restart the Gateway afterwards.

### Option B: install from a local folder (dev, no copying)

```bash
clawdbot plugins install ./extensions/voice-call
cd ./extensions/voice-call && pnpm install
```

Restart the Gateway afterwards.

Note: use `pnpm` for repo work. Bun is not recommended and can cause issues in
other Clawdbot channels (especially WhatsApp and Telegram).

## Config overview

All config lives under `plugins.entries.voice-call.config`. Phone numbers must
be in E.164 format (`+15550001234`).

Minimal example (Twilio outbound only):

```json5
{
  plugins: {
    entries: {
      "voice-call": {
        enabled: true,
        config: {
          provider: "twilio",
          fromNumber: "+15550001234",
          toNumber: "+15550005678",
          twilio: {
            accountSid: "ACxxxxxxxx",
            authToken: "..."
          },
          serve: { port: 3334, bind: "127.0.0.1", path: "/voice/webhook" },
          publicUrl: "https://example.ngrok.app/voice/webhook",
          outbound: { defaultMode: "notify", notifyHangupDelaySec: 3 }
        }
      }
    }
  }
}
```

Notes:
- Twilio/Telnyx require a publicly reachable webhook URL.
- `mock` is a local dev provider (no network calls).
- `skipSignatureVerification` is for local testing only.

## Public URL and webhook exposure

Providers send webhooks from the public internet. Your `serve.path` must be
reachable from them.

You have three options:
- `publicUrl`: you already have a public HTTPS URL pointing at the Gateway host.
- `tunnel`: use ngrok or Tailscale (recommended for quick setup).
- `tailscale`: legacy Tailscale serve/funnel config (still supported, but
  `tunnel` is preferred).

Example using ngrok:

```json5
{
  tunnel: {
    provider: "ngrok",
    ngrokAuthToken: "..."
  }
}
```

Example using Tailscale Funnel:

```json5
{
  tunnel: { provider: "tailscale-funnel" }
}
```

CLI helper (Tailscale only):

```bash
clawdbot voicecall expose --mode funnel
```

If you use Tailscale Serve without Funnel, the URL is private to your tailnet,
so Twilio/Telnyx will not be able to reach it.

## Providers

### Twilio

Twilio uses Programmable Voice with optional Media Streams for real-time audio.

Required config:
- `twilio.accountSid` and `twilio.authToken`
- (or `TWILIO_ACCOUNT_SID` / `TWILIO_AUTH_TOKEN`)
- A Twilio phone number that can reach your webhook

Inbound setup:
- In the Twilio Console for your phone number, set the Voice webhook to your
  public `serve.path` URL (HTTP POST).

Outbound setup:
- Outbound calls are created via Twilio API; the plugin supplies the webhook URL
  per call.

Streaming (optional, Twilio only):
- Enable `streaming.enabled` and set `streaming.streamPath`
- Provide `OPENAI_API_KEY` or `streaming.openaiApiKey`
- The stream WebSocket URL is derived from your `publicUrl` host + `streamPath`
  (https -> wss)

Signature verification:
- Webhooks are verified by default.
- If you are using ngrok free tier, leave `tunnel.allowNgrokFreeTier` as `true`
  so URL rewriting does not break verification.
- Use `skipSignatureVerification` only for local dev.

### Telnyx

Telnyx uses Call Control v2.

Required config:
- `telnyx.apiKey` and `telnyx.connectionId`
- (or `TELNYX_API_KEY` / `TELNYX_CONNECTION_ID`)

Inbound setup:
- In your Telnyx Call Control App, set the webhook URL to your public
  `serve.path`.

Signature verification:
- Set `telnyx.publicKey` to enable Ed25519 signature verification.
- If you do not set a public key, webhooks are accepted without verification
  (not recommended for production).

Transcription:
- Telnyx uses its own transcription events for `continue` responses.

### Mock (dev)

`mock` is for local testing and does not make network calls.

## Call modes

Outbound calls support two modes:
- `notify`: speak a message and auto-hangup after `notifyHangupDelaySec`.
- `conversation`: keep the call open and allow back-and-forth.

Examples:

```bash
clawdbot voicecall call --to "+15555550123" --message "Hello" --mode notify
clawdbot voicecall call --to "+15555550123" --message "Ready to talk?" --mode conversation
```

## Inbound calls and policy

Inbound calls are blocked by default.

Policies:
- `disabled`: block all inbound calls
- `allowlist`: allow only numbers in `allowFrom`
- `pairing`: currently behaves like `allowlist`
- `open`: accept all inbound calls

Inbound greeting:
- `inboundGreeting` controls the first message spoken when a call is accepted.

## Auto-responses and models

When a caller speaks, the plugin can auto-respond using the embedded Clawdbot
agent.

Key settings:
- `responseModel`: model reference for voice responses (default `openai/gpt-4o-mini`)
- `responseSystemPrompt`: optional override for the voice system prompt
- `responseTimeoutMs`: response generation timeout

Responses use the same agent system as messaging, including tool access.
The default system prompt keeps replies short and conversational (about 1-2 sentences).

## Streaming (Twilio only)

When `streaming.enabled` is on:
- The webhook server also accepts WebSocket upgrades at `streaming.streamPath`.
- Audio is forwarded to OpenAI Realtime STT.
- Final transcripts are fed into the call manager and used by `continue` and
  auto-responses.

Required:
- A public HTTPS URL for the Gateway (used to derive `wss://...`).
- `OPENAI_API_KEY` or `streaming.openaiApiKey`.

If no OpenAI key is available, streaming does not start and real-time transcripts
will not arrive.

## Limits and timeouts

These settings are enforced by the call manager:
- `maxDurationSeconds`: auto-hangup after this many seconds (starts when answered).
- `maxConcurrentCalls`: max simultaneous active calls.
- `transcriptTimeoutMs`: how long `continue` waits for a final transcript.

## Logs and debugging

Calls are appended as JSONL to:
- `${store}/calls.jsonl`, or
- `~/clawd/voice-calls/calls.jsonl` by default

Set `store` if you want a different base directory for call logs.

Use:

```bash
clawdbot voicecall tail
```

## CLI

```bash
clawdbot voicecall call --to "+15555550123" --message "Hello from Clawdbot"
clawdbot voicecall continue --call-id <id> --message "Any questions?"
clawdbot voicecall speak --call-id <id> --message "One moment"
clawdbot voicecall end --call-id <id>
clawdbot voicecall status --call-id <id>
clawdbot voicecall tail
clawdbot voicecall expose --mode funnel
```

## Agent tool

Tool name: `voice_call`

Actions:
- `initiate_call` (message, to?, mode?)
- `continue_call` (callId, message)
- `speak_to_user` (callId, message)
- `end_call` (callId)
- `get_status` (callId)

If you want a ready-made skill entry, grab it from [ClawdHub.com](https://ClawdHub.com).

## Gateway RPC

- `voicecall.initiate` (`to?`, `message`, `mode?`)
- `voicecall.continue` (`callId`, `message`)
- `voicecall.speak` (`callId`, `message`)
- `voicecall.end` (`callId`)
- `voicecall.status` (`callId`)