clawdbot/extensions/telegram-tts/README.md
Glucksberg 104d977d12 feat(telegram-tts): add latency logging, status tracking, and unit tests
- Add latency metrics to summarizeText and textToSpeech functions
- Add /tts_status command showing config and last attempt result
- Add /tts_summary command for feature flag control
- Fix atomic write to clean up temp file on rename failure
- Add timer.unref() to prevent blocking process shutdown
- Add unit tests for validation functions (13 tests)
- Update README with new commands and features

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 08:00:44 +00:00

Telegram TTS Extension

Automatic text-to-speech for chat responses using ElevenLabs or OpenAI.

Features

  • Auto-TTS: Automatically converts all text responses to voice when enabled
  • speak Tool: Converts text to speech and sends it as a voice message
  • RPC Methods: Control TTS via Gateway (tts.status, tts.enable, tts.disable, tts.convert, tts.providers)
  • User Commands: /tts_on, /tts_off, /tts_provider, /tts_limit, /tts_summary, /tts_status
  • Auto-Summarization: Long texts are automatically summarized before TTS conversion
  • Multi-provider: ElevenLabs and OpenAI TTS with automatic fallback
  • Self-contained: No external CLI dependencies; the plugin calls the provider APIs directly
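
The automatic fallback listed above can be pictured with a minimal sketch. It assumes each provider is exposed as an async function that returns audio bytes; the names `providerOrder`, `synthesizeWithFallback`, and `Synthesize` are hypothetical and are not taken from the plugin's source.

```typescript
// Hypothetical types: the plugin's real internals are not shown in this README.
type ProviderName = "elevenlabs" | "openai";
type Synthesize = (text: string) => Promise<Uint8Array>;

interface FallbackResult {
  provider: ProviderName; // the provider that actually produced the audio
  audio: Uint8Array;
}

// Put the preferred provider first, followed by the remaining ones.
function providerOrder(preferred: ProviderName, available: ProviderName[]): ProviderName[] {
  return [preferred, ...available.filter((name) => name !== preferred)];
}

// Try each provider in order and return the first successful result.
async function synthesizeWithFallback(
  text: string,
  preferred: ProviderName,
  providers: Record<ProviderName, Synthesize>,
): Promise<FallbackResult> {
  const order = providerOrder(preferred, Object.keys(providers) as ProviderName[]);
  const errors: string[] = [];
  for (const provider of order) {
    try {
      const audio = await providers[provider](text);
      return { provider, audio };
    } catch (err) {
      // Remember why this provider failed, then move on to the next one.
      errors.push(`${provider}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All TTS providers failed (${errors.join("; ")})`);
}
```

Collecting the per-provider error messages keeps the final failure informative when every provider is down or misconfigured.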

Requirements

  • For TTS: ElevenLabs API key OR OpenAI API key
  • For Auto-Summarization: OpenAI API key (uses gpt-4o-mini to summarize long texts)

Installation

The extension is bundled with Clawdbot. Enable it in your config:

{
  "plugins": {
    "entries": {
      "telegram-tts": {
        "enabled": true,
        "provider": "elevenlabs",
        "elevenlabs": {
          "apiKey": "your-api-key"
        }
      }
    }
  }
}

Or use OpenAI:

{
  "plugins": {
    "entries": {
      "telegram-tts": {
        "enabled": true,
        "provider": "openai",
        "openai": {
          "apiKey": "your-api-key",
          "voice": "nova"
        }
      }
    }
  }
}

Or set API keys via environment variables:

# For ElevenLabs
export ELEVENLABS_API_KEY=your-api-key
# or
export XI_API_KEY=your-api-key

# For OpenAI
export OPENAI_API_KEY=your-api-key
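
As an illustration of how a config key and the environment variables above could be combined, here is a small sketch. The environment variable names come from this README; the precedence (config first, then `ELEVENLABS_API_KEY` before `XI_API_KEY`) and the helper name `resolveApiKey` are assumptions, not confirmed plugin behavior.

```typescript
type ProviderName = "elevenlabs" | "openai";
type Env = Record<string, string | undefined>;

interface KeyConfig {
  elevenlabs?: { apiKey?: string };
  openai?: { apiKey?: string };
}

// Variable names are from the README; the lookup order is an assumption.
const ENV_KEYS: Record<ProviderName, string[]> = {
  elevenlabs: ["ELEVENLABS_API_KEY", "XI_API_KEY"],
  openai: ["OPENAI_API_KEY"],
};

// Return the first non-empty key: config value first, then environment variables.
function resolveApiKey(provider: ProviderName, config: KeyConfig, env: Env): string | undefined {
  const fromConfig = config[provider]?.apiKey?.trim();
  if (fromConfig) return fromConfig;
  for (const name of ENV_KEYS[provider]) {
    const value = env[name]?.trim();
    if (value) return value;
  }
  return undefined; // no key configured for this provider
}
```

In a Node.js process this would typically be called as `resolveApiKey("elevenlabs", config, process.env)`.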

Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable the plugin |
| provider | string | "openai" | TTS provider (elevenlabs or openai) |
| elevenlabs.apiKey | string | - | ElevenLabs API key |
| elevenlabs.voiceId | string | "pMsXgVXv3BLzUgSXRplE" | ElevenLabs Voice ID |
| elevenlabs.modelId | string | "eleven_multilingual_v2" | ElevenLabs Model ID |
| openai.apiKey | string | - | OpenAI API key |
| openai.model | string | "gpt-4o-mini-tts" | OpenAI model (gpt-4o-mini-tts, tts-1, or tts-1-hd) |
| openai.voice | string | "alloy" | OpenAI voice |
| prefsPath | string | ~/clawd/.user-preferences.json | User preferences file |
| maxTextLength | number | 4000 | Max characters for TTS |
| timeoutMs | number | 30000 | API request timeout in milliseconds |
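
The options and defaults above can be written down as a typed shape. This is a sketch only: the option names and default values are copied from the table, while the interface name `TelegramTtsConfig` and the merge helper `withDefaults` are hypothetical.

```typescript
interface TelegramTtsConfig {
  enabled: boolean;
  provider: "elevenlabs" | "openai";
  elevenlabs: { apiKey?: string; voiceId: string; modelId: string };
  openai: { apiKey?: string; model: string; voice: string };
  prefsPath: string;
  maxTextLength: number; // max characters for TTS
  timeoutMs: number;     // API request timeout in milliseconds
}

// User-supplied config: every option, including nested ones, may be omitted.
type PartialConfig = Partial<Omit<TelegramTtsConfig, "elevenlabs" | "openai">> & {
  elevenlabs?: Partial<TelegramTtsConfig["elevenlabs"]>;
  openai?: Partial<TelegramTtsConfig["openai"]>;
};

// Default values as documented in the configuration table.
const DEFAULTS: TelegramTtsConfig = {
  enabled: false,
  provider: "openai",
  elevenlabs: { voiceId: "pMsXgVXv3BLzUgSXRplE", modelId: "eleven_multilingual_v2" },
  openai: { model: "gpt-4o-mini-tts", voice: "alloy" },
  prefsPath: "~/clawd/.user-preferences.json",
  maxTextLength: 4000,
  timeoutMs: 30000,
};

// Merge user config over the defaults, one nesting level deep.
function withDefaults(user: PartialConfig): TelegramTtsConfig {
  return {
    ...DEFAULTS,
    ...user,
    elevenlabs: { ...DEFAULTS.elevenlabs, ...user.elevenlabs },
    openai: { ...DEFAULTS.openai, ...user.openai },
  };
}
```

Merging the nested `elevenlabs` and `openai` objects separately means that setting only `apiKey` does not discard the default voice or model.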

OpenAI Voices

Available voices: alloy, ash, coral, echo, fable, onyx, nova, sage, shimmer
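
A configured voice could be checked against this list before a request is sent. The sketch below assumes such a check is useful; `isOpenAIVoice` is a hypothetical helper, not the plugin's own validation function.

```typescript
// Voice names copied from the list above.
const OPENAI_VOICES = [
  "alloy", "ash", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer",
] as const;

type OpenAIVoice = (typeof OPENAI_VOICES)[number];

// Type guard: narrows an arbitrary string to one of the listed voice names.
function isOpenAIVoice(value: string): value is OpenAIVoice {
  return (OPENAI_VOICES as readonly string[]).indexOf(value) !== -1;
}
```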

Usage

Agent Tool

The agent can use the speak tool to send voice messages:

User: Send me a voice message saying hello
Agent: [calls speak({ text: "Hello! How can I help you today?" })]

RPC Methods

# Check TTS status
clawdbot gateway call tts.status

# Enable/disable TTS
clawdbot gateway call tts.enable
clawdbot gateway call tts.disable

# Convert text to audio
clawdbot gateway call tts.convert '{"text": "Hello world"}'

# List available providers
clawdbot gateway call tts.providers

Telegram Commands

The plugin registers the following commands automatically:

| Command | Description |
| --- | --- |
| /tts_on | Enable auto-TTS for all responses |
| /tts_off | Disable auto-TTS |
| /tts_provider [openai\|elevenlabs] | Switch TTS provider (with fallback) |
| /tts_limit [chars] | Set max text length before summarization (default: 1500) |
| /tts_summary [on\|off] | Enable/disable auto-summarization for long texts |
| /tts_status | Show TTS status, config, and last attempt result |

Auto-Summarization

When enabled (default), texts exceeding the configured limit are automatically summarized using OpenAI's gpt-4o-mini before TTS conversion. This ensures long responses can still be converted to audio.

Requirements: An OpenAI API key must be configured for summarization to work, even if ElevenLabs is used for TTS.

Behavior:

  • Texts under the limit are converted directly
  • Texts over the limit are summarized first, then converted
  • If summarization is disabled (/tts_summary off), long texts are skipped (no audio)
  • After summarization, a hard limit is applied to prevent oversized TTS requests
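
The behavior above amounts to a small decision function. The following is a sketch under stated assumptions: a text exactly at the limit counts as "under", and the hard limit is applied by truncating to `maxTextLength`. Neither detail is confirmed by this README, and the names `planTts` and `applyHardLimit` are hypothetical.

```typescript
type TtsPlan = "convert" | "summarize" | "skip";

interface SummaryPrefs {
  limit: number;           // set via /tts_limit (default 1500 per the README)
  summaryEnabled: boolean; // set via /tts_summary on|off
}

// Decide what to do with a response before any API call is made.
function planTts(text: string, prefs: SummaryPrefs): TtsPlan {
  if (text.length <= prefs.limit) return "convert"; // boundary handling is an assumption
  return prefs.summaryEnabled ? "summarize" : "skip";
}

// Hard cap applied to the summary; truncating to maxTextLength is an assumption.
function applyHardLimit(summary: string, maxTextLength: number): string {
  return summary.length > maxTextLength ? summary.slice(0, maxTextLength) : summary;
}
```

Keeping the decision separate from the API calls makes the three outcomes easy to unit test without network access.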

License

MIT