feat: add Deepgram audio transcription
Co-authored-by: Safzan Pirani <safzanpirani@users.noreply.github.com>
This commit is contained in:
64
docs/providers/deepgram.md
Normal file
64
docs/providers/deepgram.md
Normal file
@@ -0,0 +1,64 @@
|
||||
---
|
||||
summary: "Deepgram transcription for inbound voice notes"
|
||||
read_when:
|
||||
- You want Deepgram speech-to-text for audio attachments
|
||||
- You need a quick Deepgram config example
|
||||
---
|
||||
# Deepgram (Audio Transcription)
|
||||
|
||||
Deepgram is a speech-to-text API. In Clawdbot it is used for **inbound audio/voice note
|
||||
transcription** via `tools.media.audio`.
|
||||
|
||||
When enabled, Clawdbot uploads the audio file to Deepgram and injects the transcript
|
||||
into the reply pipeline (`{{Transcript}}` + `[Audio]` block). This is **not streaming**;
|
||||
it uses the pre-recorded transcription endpoint.
|
||||
|
||||
Website: https://deepgram.com
|
||||
Docs: https://developers.deepgram.com
|
||||
|
||||
## Quick start
|
||||
|
||||
1) Set your API key:
|
||||
```
|
||||
DEEPGRAM_API_KEY=dg_...
|
||||
```
|
||||
|
||||
2) Enable the provider:
|
||||
```json5
|
||||
{
|
||||
tools: {
|
||||
media: {
|
||||
audio: {
|
||||
enabled: true,
|
||||
models: [{ provider: "deepgram", model: "nova-3" }]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Options
|
||||
|
||||
- `model`: Deepgram model id (default: `nova-3`)
|
||||
- `language`: language hint (optional)
|
||||
|
||||
Example with language:
|
||||
```json5
|
||||
{
|
||||
tools: {
|
||||
media: {
|
||||
audio: {
|
||||
enabled: true,
|
||||
models: [
|
||||
{ provider: "deepgram", model: "nova-3", language: "en" }
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
- Authentication follows the standard provider auth order; `DEEPGRAM_API_KEY` is the simplest path.
|
||||
- Output follows the same audio rules as other providers (size caps, timeouts, transcript injection).
|
||||
@@ -34,5 +34,9 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/etc.)? See [Chann
|
||||
- [GLM models](/providers/glm)
|
||||
- [MiniMax](/providers/minimax)
|
||||
|
||||
## Transcription providers
|
||||
|
||||
- [Deepgram (audio transcription)](/providers/deepgram)
|
||||
|
||||
For the full provider catalog (xAI, Groq, Mistral, etc.) and advanced configuration,
|
||||
see [Model providers](/concepts/model-providers).
|
||||
|
||||
Reference in New Issue
Block a user