feat(skills): add media/transcription helpers

This commit is contained in:
Peter Steinberger
2025-12-20 12:53:09 +00:00
parent e0cd5650c5
commit e1a3bab7e5
10 changed files with 579 additions and 31 deletions

View File

@@ -0,0 +1,42 @@
---
name: openai-whisper-api
description: Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
metadata: {"clawdis":{"requires":{"bins":["curl"],"env":["OPENAI_API_KEY"]},"primaryEnv":"OPENAI_API_KEY"}}
---
# OpenAI Whisper API (curl)
Transcribe an audio file via OpenAIs `/v1/audio/transcriptions` endpoint.
## Quick start
```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
```
Defaults:
- Model: `whisper-1`
- Output: `<input>.txt`
## Useful flags
```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model whisper-1 --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language en
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --prompt "Speaker names: Peter, Daniel"
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
```
## API key
Set `OPENAI_API_KEY`, or configure it in `~/.clawdis/clawdis.json`:
```json5
{
skills: {
"openai-whisper-api": {
apiKey: "OPENAI_KEY_HERE"
}
}
}
```