let5see/AI-Video


puke c387137446 Remove/fix old logic

2025-11-07 16:59:12 +08:00


ReelForge Capabilities Guide

A complete guide to ReelForge's LLM, TTS, and image generation capabilities

Overview

ReelForge provides three core AI capabilities:

LLM: Text generation using LiteLLM (supports 100+ models)
TTS: Text-to-speech using Edge TTS (free, 400+ voices)
Image: Image generation using ComfyKit (local or cloud)

Quick Start

import asyncio
from reelforge.service import reelforge

async def main():
    # LLM - Generate text
    answer = await reelforge.llm("Summarize 'Atomic Habits' in 3 sentences")

    # TTS - Generate speech
    audio_path = await reelforge.tts("Hello, world!")

    # Image - Generate images
    image_url = await reelforge.image(
        workflow="workflows/book_cover_simple.json",
        prompt="minimalist book cover design"
    )

# `await` only works inside a coroutine, so run the calls through asyncio
asyncio.run(main())

1. LLM (Large Language Model)

Configuration

Edit config.yaml:

llm:
  default: qwen  # Choose: qwen, openai, deepseek, ollama
  
  qwen:
    api_key: "your-dashscope-api-key"
    base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
    model: "openai/qwen-max"
  
  openai:
    api_key: "your-openai-api-key"
    model: "gpt-4"
  
  deepseek:
    api_key: "your-deepseek-api-key"
    base_url: "https://api.deepseek.com"
    model: "openai/deepseek-chat"
  
  ollama:
    base_url: "http://localhost:11434"
    model: "ollama/llama3.2"
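
The `default` key decides which provider section is used. As a rough illustration of that lookup (not ReelForge's actual loader), the sketch below works on a plain dict that mirrors part of the YAML above. `LLM_CONFIG` and `select_provider` are hypothetical names introduced only for this example.

```python
from typing import Any

# Hypothetical in-memory mirror of part of the `llm` section of config.yaml.
LLM_CONFIG: dict[str, Any] = {
    "default": "qwen",
    "qwen": {
        "api_key": "your-dashscope-api-key",
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "openai/qwen-max",
    },
    "ollama": {
        "base_url": "http://localhost:11434",
        "model": "ollama/llama3.2",
    },
}


def select_provider(llm_config: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (provider_name, settings) for the provider named by `default`."""
    name = llm_config.get("default")
    if not name:
        raise ValueError("llm.default is not set")
    settings = llm_config.get(name)
    if not isinstance(settings, dict):
        # `default` points at a provider that has no section of its own.
        raise KeyError(f"llm.default is {name!r} but no llm.{name} section exists")
    return name, settings


name, settings = select_provider(LLM_CONFIG)
print(name, settings["model"])
```

A check like this catches the common mistake of changing `default` without adding the matching provider section.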

Usage

# Basic usage
answer = await reelforge.llm("What is machine learning?")

# With parameters
answer = await reelforge.llm(
    prompt="Explain atomic habits",
    temperature=0.7,  # 0.0-2.0 (lower = more deterministic)
    max_tokens=2000
)

Environment Variables (Alternative)

Instead of config.yaml, you can use environment variables:

# Qwen
export DASHSCOPE_API_KEY="your-key"

# OpenAI
export OPENAI_API_KEY="your-key"

# DeepSeek
export DEEPSEEK_API_KEY="your-key"
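
The guide does not say which source wins when a key is set in both places. The sketch below assumes the value in config.yaml takes precedence and the environment variable is the fallback; that ordering, and the helper name `resolve_api_key`, are assumptions for illustration rather than documented ReelForge behavior.

```python
import os
from typing import Mapping, Optional

# Environment variable names as listed in this guide.
ENV_VARS = {
    "qwen": "DASHSCOPE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def resolve_api_key(
    provider: str,
    settings: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Look up an API key. Assumed precedence: config value, then environment."""
    key = settings.get("api_key")
    if key:
        return key
    var = ENV_VARS.get(provider)
    # Providers without a listed variable (for example ollama) yield None.
    return env.get(var) if var else None


print(resolve_api_key("openai", {}, {"OPENAI_API_KEY": "from-env"}))
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.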

2. TTS (Text-to-Speech)

Configuration

Edit config.yaml:

tts:
  default: edge
  
  edge:
    # No configuration needed - free to use!

Usage

# Basic usage (auto-generates temp path)
audio_path = await reelforge.tts("Hello, world!")
# Returns: "temp/abc123def456.mp3"

# With Chinese text
audio_path = await reelforge.tts(
    text="你好，世界！",
    voice="zh-CN-YunjianNeural"
)

# With custom parameters
audio_path = await reelforge.tts(
    text="Welcome to ReelForge",
    voice="en-US-JennyNeural",
    rate="+20%",  # Speed: +50% = faster, -20% = slower
    volume="+0%",
    pitch="+0Hz"
)

# Specify output path
audio_path = await reelforge.tts(
    text="Hello",
    output_path="output/greeting.mp3"
)
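
The `rate`, `volume`, and `pitch` values above are signed strings such as "+20%" or "+0Hz", and the sign is present even for zero. If you compute these values in code, small helpers like the following can build them. `rate_from_speed` and `signed` are hypothetical helpers for this example and are not part of ReelForge.

```python
def signed(value: int, unit: str) -> str:
    """Format an integer with an explicit sign, e.g. signed(0, "Hz") gives "+0Hz"."""
    return f"{value:+d}{unit}"


def rate_from_speed(factor: float) -> str:
    """Convert a speed multiplier (1.0 = normal speed) to a signed percent string."""
    if factor <= 0:
        raise ValueError("speed factor must be positive")
    return signed(round((factor - 1.0) * 100), "%")


print(rate_from_speed(1.2))  # 20% faster than normal
print(signed(0, "Hz"))       # unchanged pitch
```

The resulting strings can then be passed as the `rate`, `volume`, or `pitch` arguments shown above.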

Popular Voices

Chinese:

zh-CN-YunjianNeural (male, default)
zh-CN-XiaoxiaoNeural (female)
zh-CN-YunxiNeural (male)
zh-CN-XiaoyiNeural (female)

English:

en-US-JennyNeural (female)
en-US-GuyNeural (male)
en-GB-SoniaNeural (female, British)

List All Voices

# Get all available voices
voices = await reelforge.tts.list_voices()

# Get Chinese voices only
voices = await reelforge.tts.list_voices(locale="zh-CN")

# Get English voices only
voices = await reelforge.tts.list_voices(locale="en-US")
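
This guide does not show the shape of the returned list. The sketch below assumes each voice is a dict with `ShortName`, `Locale`, and `Gender` keys, which is how the underlying edge-tts package describes voices; check the actual return value before relying on it. `SAMPLE_VOICES` and `pick_voices` are made up for the example. The helper filters by language prefix, so "en" matches both en-US and en-GB.

```python
from typing import Optional

# Hand-written sample data; the real list comes from list_voices().
SAMPLE_VOICES = [
    {"ShortName": "zh-CN-YunjianNeural", "Locale": "zh-CN", "Gender": "Male"},
    {"ShortName": "zh-CN-XiaoxiaoNeural", "Locale": "zh-CN", "Gender": "Female"},
    {"ShortName": "en-US-JennyNeural", "Locale": "en-US", "Gender": "Female"},
    {"ShortName": "en-GB-SoniaNeural", "Locale": "en-GB", "Gender": "Female"},
]


def pick_voices(
    voices: list[dict],
    language: Optional[str] = None,
    gender: Optional[str] = None,
) -> list[str]:
    """Return voice names matching a language prefix and/or a gender."""
    names = []
    for voice in voices:
        if language and not voice["Locale"].startswith(language + "-"):
            continue
        if gender and voice["Gender"] != gender:
            continue
        names.append(voice["ShortName"])
    return names


print(pick_voices(SAMPLE_VOICES, language="en"))
```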

3. Image Generation

Configuration

Edit config.yaml:

image:
  default: comfykit
  
  comfykit:
    # Local ComfyUI (optional, default: http://127.0.0.1:8188)
    comfyui_url: "http://127.0.0.1:8188"
    
    # RunningHub cloud (optional)
    runninghub_api_key: "rh-key-xxx"

Usage

# Basic usage (local ComfyUI)
image_url = await reelforge.image(
    workflow="workflows/book_cover_simple.json",
    prompt="minimalist book cover design, blue and white"
)

# With full parameters
image_url = await reelforge.image(
    workflow="workflows/book_cover_simple.json",
    prompt="book cover for 'Atomic Habits', professional, minimalist",
    negative_prompt="ugly, blurry, low quality",
    width=1024,
    height=1536,
    steps=20,
    seed=42
)

# Using RunningHub cloud
image_url = await reelforge.image(
    workflow="12345",  # RunningHub workflow ID
    prompt="a beautiful landscape"
)

# Check available workflows
workflows = reelforge.image.list_workflows()
print(f"Available workflows: {workflows}")

Environment Variables (Alternative)

# Local ComfyUI
export COMFYUI_BASE_URL="http://127.0.0.1:8188"

# RunningHub cloud
export RUNNINGHUB_API_KEY="rh-key-xxx"

Workflow DSL

ReelForge uses ComfyKit's DSL for workflow parameters:

{
  "6": {
    "class_type": "CLIPTextEncode",
    "_meta": {
      "title": "$prompt!"
    },
    "inputs": {
      "text": "default prompt",
      "clip": ["4", 1]
    }
  }
}

DSL Markers:

$param! - Required parameter
$param - Optional parameter
$param~ - Upload parameter (for images/audio/video)
$output.name - Output variable
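
To see which parameters a workflow file exposes, you can scan the node titles for these markers. The sketch below handles only the four marker forms listed above and is not ComfyKit's own parser, whose syntax may be richer. `parse_marker` and `collect_markers` are hypothetical names.

```python
import re
from typing import Optional

_OUTPUT = re.compile(r"^\$output\.(\w+)$")
_PARAM = re.compile(r"^\$(\w+)([!~]?)$")
_KINDS = {"!": "required", "~": "upload", "": "optional"}


def parse_marker(title: str) -> Optional[dict]:
    """Classify a node title as one of the four marker forms, or return None."""
    title = title.strip()
    match = _OUTPUT.match(title)
    if match:
        return {"kind": "output", "name": match.group(1)}
    match = _PARAM.match(title)
    if match:
        return {"kind": _KINDS[match.group(2)], "name": match.group(1)}
    return None


def collect_markers(workflow: dict) -> dict:
    """Map node id -> parsed marker for every node whose title is a marker."""
    found = {}
    for node_id, node in workflow.items():
        marker = parse_marker(node.get("_meta", {}).get("title", ""))
        if marker:
            found[node_id] = marker
    return found


workflow = {
    "6": {
        "class_type": "CLIPTextEncode",
        "_meta": {"title": "$prompt!"},
        "inputs": {"text": "default prompt", "clip": ["4", 1]},
    }
}
print(collect_markers(workflow))
```

For the workflow fragment shown above, this reports node "6" as a required parameter named `prompt`.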

Combined Workflow Example

Generate a book summary, an audio narration of it, and a cover image:

import asyncio
from reelforge.service import reelforge

async def create_book_content(book_title, author):
    """Generate book summary, audio, and cover image"""
    
    # 1. Generate book summary with LLM
    summary = await reelforge.llm(
        prompt=f"Write a compelling 2-sentence summary for a book titled '{book_title}' by {author}",
        max_tokens=100
    )
    print(f"Summary: {summary}")
    
    # 2. Generate audio narration with TTS
    audio_path = await reelforge.tts(
        text=summary,
        voice="en-US-JennyNeural"
    )
    print(f"Audio: {audio_path}")
    
    # 3. Generate book cover image
    image_url = await reelforge.image(
        workflow="workflows/book_cover_simple.json",
        prompt=f"book cover for '{book_title}' by {author}, professional, modern design",
        width=1024,
        height=1536
    )
    print(f"Cover: {image_url}")
    
    return {
        "summary": summary,
        "audio": audio_path,
        "cover": image_url
    }

# Run
result = asyncio.run(create_book_content("Atomic Habits", "James Clear"))
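
In the example above the cover prompt depends only on the title and author, not on the summary, so the image step could in principle run while the summary and narration are produced. The sketch below shows that pattern with `asyncio.gather`. The `fake_llm`, `fake_tts`, and `fake_image` coroutines are stand-ins that only simulate latency; they are not the ReelForge API, and whether the real services tolerate concurrent calls is an assumption to verify.

```python
import asyncio


# Stand-in coroutines (NOT the ReelForge API); each just sleeps briefly.
async def fake_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"summary of: {prompt}"


async def fake_tts(text: str) -> str:
    await asyncio.sleep(0.01)
    return "temp/narration.mp3"


async def fake_image(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return "output/cover.png"


async def narrate(book_title: str) -> tuple[str, str]:
    """Summary then narration: the TTS step needs the summary text."""
    summary = await fake_llm(book_title)
    audio = await fake_tts(summary)
    return summary, audio


async def create_book_content_concurrently(book_title: str) -> dict:
    # The image branch does not wait for the summary, so both branches overlap.
    (summary, audio), cover = await asyncio.gather(
        narrate(book_title),
        fake_image(f"book cover for '{book_title}'"),
    )
    return {"summary": summary, "audio": audio, "cover": cover}


result = asyncio.run(create_book_content_concurrently("Atomic Habits"))
print(result)
```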

Troubleshooting

LLM Issues

"API key not found"

Make sure you've set the API key in config.yaml or environment variables
For Qwen: DASHSCOPE_API_KEY
For OpenAI: OPENAI_API_KEY
For DeepSeek: DEEPSEEK_API_KEY

"Connection error"

Check base_url in config
Verify that the API endpoint is accessible
For Ollama, make sure the server is running (ollama serve)
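
One quick way to check whether an endpoint is accessible is to attempt a plain TCP connection to the host and port in `base_url`. The sketch below uses only the standard library. `is_reachable` is a hypothetical helper, and a successful connection only shows that something is listening, not that the API key or path is valid.

```python
import socket
from urllib.parse import urlsplit


def is_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(base_url)
    host = parts.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_reachable("http://localhost:11434")` checks the Ollama address from the configuration above.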

TTS Issues

"SSL error"

Edge TTS is free but requires an internet connection
SSL verification is disabled by default for development

Image Issues

"ComfyUI connection refused"

Make sure ComfyUI is running at http://127.0.0.1:8188
Or configure RunningHub API key for cloud execution

"Workflow file not found"

Check workflow path is correct
Use relative path from project root: workflows/your_workflow.json
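
A small pre-flight check can make path problems obvious before a workflow is submitted. The sketch below resolves the path against a project root and, if the file is missing, lists the JSON files that do exist in that directory. `load_workflow` is a hypothetical helper and not part of ReelForge.

```python
import json
from pathlib import Path


def load_workflow(path: str, project_root: str = ".") -> dict:
    """Resolve `path` against the project root and parse the file as JSON."""
    full_path = Path(project_root) / path
    if not full_path.is_file():
        parent = full_path.parent
        # List sibling JSON files to help spot typos in the file name.
        available = sorted(p.name for p in parent.glob("*.json")) if parent.is_dir() else []
        raise FileNotFoundError(
            f"{full_path} not found; JSON files in {parent}: {available}"
        )
    with full_path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `load_workflow("workflows/your_workflow.json")` from the project root either returns the parsed workflow or fails with a message naming the files that are present.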

"No images generated"

Check workflow has SaveImage node
Verify workflow parameters are correct
Check ComfyUI logs for errors
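
The first check can be automated by scanning the workflow JSON for a node whose `class_type` is `SaveImage`. This is a minimal sketch over the API-format JSON shown in the Workflow DSL section. `has_save_image_node` is a hypothetical helper, and workflows that save output through a different node type would need a broader check.

```python
def has_save_image_node(workflow: dict) -> bool:
    """Return True if any node in the workflow has class_type "SaveImage"."""
    return any(
        isinstance(node, dict) and node.get("class_type") == "SaveImage"
        for node in workflow.values()
    )


sample = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "default prompt"}},
    "9": {"class_type": "SaveImage", "inputs": {"filename_prefix": "cover"}},
}
print(has_save_image_node(sample))
```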

Next Steps

See /examples/ directory for complete examples
Run python test_integration.py to test all capabilities
Create custom workflows in /workflows/ directory
Check ComfyKit documentation: https://puke3615.github.io/ComfyKit

Happy creating with ReelForge! 📚🎬
