# 🙋‍♀️ Pixelle-Video Frequently Asked Questions

### What is Pixelle-Video and how does it work?

Pixelle-Video is an AI-powered video generation tool that creates complete videos from a single topic input. The workflow is:
- **Script Generation** → **Image Planning** → **Frame Processing** → **Video Synthesis**

Enter a topic keyword, and Pixelle-Video automatically handles scriptwriting, image generation, voice synthesis, background music, and final video compilation. No video editing experience is required.
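
The four-stage flow can be pictured as a chain of functions, each consuming the previous stage's output. The sketch below is purely illustrative: every function name and data shape is a hypothetical placeholder and is not taken from Pixelle-Video's code, where these steps involve an LLM, ComfyUI, a TTS engine, and ffmpeg.

```python
from typing import Callable

# Hypothetical placeholder stages (not Pixelle-Video's real functions).
def generate_script(topic: str) -> list[str]:
    return [f"Intro to {topic}", f"Key points about {topic}", "Closing thoughts"]

def plan_images(scenes: list[str]) -> list[dict]:
    return [{"text": s, "image_prompt": f"illustration of: {s}"} for s in scenes]

def process_frames(plans: list[dict]) -> list[dict]:
    return [{**p, "frame": f"frame_{i:03d}.png"} for i, p in enumerate(plans)]

def synthesize_video(frames: list[dict]) -> dict:
    return {"scenes": len(frames), "frames": [f["frame"] for f in frames]}

# Stage names follow the workflow listed above; order matters.
STAGES: list[tuple[str, Callable]] = [
    ("Script Generation", generate_script),
    ("Image Planning", plan_images),
    ("Frame Processing", process_frames),
    ("Video Synthesis", synthesize_video),
]

def run_pipeline(topic: str, on_progress: Callable[[str], None] = print) -> dict:
    """Run each stage in order, reporting the stage name before it starts."""
    data = topic
    for name, stage in STAGES:
        on_progress(name)
        data = stage(data)
    return data

if __name__ == "__main__":
    print(run_pipeline("Why develop a reading habit"))
```

Keeping the stages in a list is one way to make a single step swappable without touching the others.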

### What installation methods are supported?

Install the prerequisites first, then clone the repository and start the web interface:

1. **Prerequisites**:
   - Install the `uv` package manager (see the official documentation for your system)
   - Install `ffmpeg` for video processing:
     - **macOS**: `brew install ffmpeg`
     - **Ubuntu/Debian**: `sudo apt update && sudo apt install ffmpeg`
     - **Windows**: Download from ffmpeg.org and add to PATH

2. **Standard Installation**:
   ```bash
   git clone https://github.com/AIDC-AI/Pixelle-Video.git
   cd Pixelle-Video
   uv run streamlit run web/app.py
   ```

### What are the system requirements?

- **Basic**: Python 3.10+, uv package manager, ffmpeg
- **For Image Generation**: ComfyUI server (local or cloud)
- **For LLM Integration**: API keys for supported hosted models (not required for local models such as Ollama)
- **Hardware**: GPU recommended for local image generation, but cloud options available
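
A quick way to confirm the basic requirements before the first launch is a small preflight script. This is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is made up for illustration and is not part of Pixelle-Video.

```python
import shutil
import sys

def check_prerequisites(min_python: tuple[int, int] = (3, 10)) -> dict[str, bool]:
    """Report which of the basic requirements are met on this machine."""
    return {
        # Compare (major, minor) of the running interpreter with the minimum.
        "python": sys.version_info[:2] >= min_python,
        # shutil.which returns None when the executable is not on PATH.
        "uv": shutil.which("uv") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

The script does not check for a ComfyUI server or API keys, since those depend on the configuration you choose.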

### How do I configure the system for first use?

1. Open the web interface at http://localhost:8501
2. Expand the "⚙️ System Configuration" panel
3. Configure two main sections:
   - **LLM Configuration**: 
     - Select preset model (Qwen, GPT-4o, DeepSeek, etc.)
     - Enter API key or configure local model (Ollama)
   - **Image Configuration**:
     - **Local deployment**: Set ComfyUI URL (default: http://127.0.0.1:8188)
     - **Cloud deployment**: Enter RunningHub API key
4. Click "Save Configuration" to complete setup

### What generation modes are available?

Pixelle-Video offers two main generation modes:

1. **AI Generated Content**:
   - Input just a topic keyword
   - AI automatically writes the script and creates the video
   - Example: "Why develop a reading habit"

2. **Fixed Script Content**:
   - Provide your complete script text
   - Skip AI scriptwriting, go directly to video generation
   - Ideal when you already have prepared content

### How do I customize the audio settings?

Audio customization options include:

- **Background Music (BGM)**:
  - No BGM: Pure voice narration
  - Built-in music: Select from preset tracks (e.g., default.mp3)
  - Custom music: Place your MP3/WAV files in the `bgm/` folder

- **Text-to-Speech (TTS)**:
  - Select from available TTS workflows (Edge-TTS, Index-TTS, etc.)
  - System automatically scans `workflows/` folder for available options
  - Preview voice effect with test text

- **Voice Cloning**:
  - Upload reference audio (MP3/WAV/FLAC) for supported TTS workflows
  - Use reference audio during preview and generation
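
If you want to check your audio files before opening the interface, the file-type rules above can be sketched as follows. The helper names are hypothetical, and the extension sets simply mirror the formats listed in this answer; Pixelle-Video's own scanning logic may differ.

```python
from pathlib import Path

BGM_EXTENSIONS = {".mp3", ".wav"}                 # formats named for custom BGM
REFERENCE_EXTENSIONS = {".mp3", ".wav", ".flac"}  # formats named for voice cloning

def list_bgm_tracks(folder: str = "bgm") -> list[str]:
    """Return sorted names of MP3/WAV files found directly in the BGM folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in BGM_EXTENSIONS
    )

def is_supported_reference_audio(filename: str) -> bool:
    """Check a reference clip's extension against the formats listed above."""
    return Path(filename).suffix.lower() in REFERENCE_EXTENSIONS

if __name__ == "__main__":
    print(list_bgm_tracks())
    print(is_supported_reference_audio("my_voice.flac"))
```

The check looks only at the file extension, so a mislabeled file would still pass.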

### How do I customize the visual style?

Visual customization includes:

- **Image Generation Workflow**:
  - Select from available ComfyUI workflows (local or cloud)
  - Default workflow: `image_flux.json`
  - Custom workflows can be added to `workflows/` folder

- **Image Dimensions**:
  - Set width and height in pixels (default: 1024x1024)
  - Note: Different models have different size limitations

- **Style Prompt Prefix**:
  - Control overall image style (must be in English)
  - Example: "Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style"
  - Click "Preview Style" to test the effect

- **Video Templates**:
  - Choose from multiple templates grouped by aspect ratio (vertical/horizontal/square)
  - Preview templates with custom parameters
  - Advanced users can create custom HTML templates in `templates/` folder
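
To make the aspect-ratio grouping and the style prefix concrete, here is a small sketch. Both helpers are hypothetical: the way Pixelle-Video actually groups templates and combines the prefix with each scene prompt is not documented here, and the comma join is an assumption.

```python
def classify_orientation(width: int, height: int) -> str:
    """Map pixel dimensions to one of the three template groups."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width == height:
        return "square"
    return "horizontal" if width > height else "vertical"

def build_image_prompt(style_prefix: str, scene_prompt: str) -> str:
    """Prepend the style prefix to a per-scene prompt (joining rule is assumed)."""
    prefix = style_prefix.strip().rstrip(",")
    scene = scene_prompt.strip()
    return f"{prefix}, {scene}" if prefix else scene

if __name__ == "__main__":
    print(classify_orientation(1024, 1024))  # square
    print(build_image_prompt("clean lines, simple sketch style",
                             "a person reading by a window"))
```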

### What AI models are supported?

Pixelle-Video supports multiple AI model providers:

- **LLM Models**: GPT, Qwen (通义千问), DeepSeek, Ollama (local)
- **Image Generation**: ComfyUI with various models (FLUX, SDXL, etc.)
- **TTS Engines**: Edge-TTS, Index-TTS, ChatTTS, and more

The modular architecture lets you replace any component independently. For example, you can switch the image generation model to FLUX or the TTS engine to ChatTTS.

### What are the cost options for running Pixelle-Video?

Pixelle-Video offers three cost tiers:

1. **Completely Free**:
   - LLM: Ollama (local)
   - Image Generation: Local ComfyUI deployment
   - Total cost: $0

2. **Recommended Balanced Option**:
   - LLM: Qwen (通义千问) - very low cost, high value
   - Image Generation: Local ComfyUI deployment
   - Cost: Minimal API fees for text generation only

3. **Cloud-Only Option**:
   - LLM: OpenAI API
   - Image Generation: RunningHub cloud service
   - Cost: Higher but requires no local hardware

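**Recommendation**: If you have a GPU, the completely free local setup is the most economical choice. Otherwise, Qwen keeps LLM costs low; because local ComfyUI image generation benefits from a GPU, consider RunningHub for images if you have no suitable hardware.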

### How long does video generation take?

Generation time depends on several factors:
- Number of scenes in the script
- Network speed for API calls
- AI inference speed (local vs cloud)
- Video length and resolution

Typical generation time: **2-10 minutes** for most videos. The interface shows real-time progress through each stage (script → images → audio → final video).

### What should I do if the video quality is unsatisfactory?

Try these adjustments:

- **Script Quality**:
  - Switch to a different LLM model (different models have different writing styles)
  - Use "Fixed Script Content" mode with your own refined script

- **Image Quality**:
  - Adjust image dimensions to match model requirements
  - Modify the prompt prefix to change visual style
  - Try different ComfyUI workflows

- **Audio Quality**:
  - Switch TTS workflow (Edge-TTS vs Index-TTS vs others)
  - Upload reference audio for voice cloning
  - Adjust TTS parameters

- **Video Layout**:
  - Try different video templates
  - Change video dimensions (vertical/horizontal/square)

### Where are the generated videos saved?

All generated videos are automatically saved to the `output/` folder in the project directory. The interface displays detailed information after generation:
- Video duration
- File size
- Number of scenes
- Download link
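
Outside the interface, you can list what is in the output folder with a few lines of Python. This is a sketch under two assumptions that are not stated in this FAQ: that the videos use the `.mp4` extension, and that they may sit in subfolders of `output/`. The function name is hypothetical.

```python
from pathlib import Path

def list_generated_videos(folder: str = "output", suffix: str = ".mp4") -> list[dict]:
    """Return name and size (MB) of videos under the folder, newest first."""
    root = Path(folder)
    if not root.is_dir():
        return []
    # rglob also searches subfolders, in case each run gets its own directory.
    files = [p for p in root.rglob(f"*{suffix}") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [
        {"name": p.name, "size_mb": round(p.stat().st_size / (1024 * 1024), 2)}
        for p in files
    ]

if __name__ == "__main__":
    for video in list_generated_videos():
        print(f"{video['name']}: {video['size_mb']} MB")
```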

### How do I troubleshoot common errors?

1. **FFmpeg Errors**:
   - Verify ffmpeg installation with `ffmpeg -version`
   - Ensure ffmpeg is in your system PATH

2. **API Connection Issues**:
   - Verify API keys are correct
   - Test LLM connection in system configuration
   - For ComfyUI: Click "Test Connection" in image configuration

3. **Image Generation Failures**:
   - Ensure ComfyUI server is running
   - Check image dimensions are supported by your model
   - Verify workflow files exist in `workflows/` folder

4. **Audio Generation Issues**:
   - Ensure selected TTS workflow is properly configured
   - For voice cloning: verify reference audio format is supported
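
Two of these checks are easy to script when the interface itself is not available. The sketch below uses only the standard library; the helper names are hypothetical. It tests whether anything answers HTTP at the ComfyUI address and whether a workflow file exists and parses as JSON. It does not verify that the server is really ComfyUI or that the workflow is valid for your models.

```python
import json
import urllib.error
import urllib.request

def is_server_reachable(url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at the URL, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a 2xx status
    except (urllib.error.URLError, OSError, ValueError):
        return False  # refused, timed out, or malformed URL

def workflow_is_valid_json(path: str) -> bool:
    """Return True if the file exists and contains parseable JSON."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        return True
    except (OSError, ValueError):
        return False

if __name__ == "__main__":
    print("ComfyUI reachable:", is_server_reachable())
    print("Default workflow OK:", workflow_is_valid_json("workflows/image_flux.json"))
```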

### How do I extend Pixelle-Video with custom features?

Pixelle-Video is built on the ComfyUI architecture, which allows deep customization:

- **Custom Workflows**: Add your own ComfyUI workflows to `workflows/` folder
- **Custom Templates**: Create HTML templates in `templates/` folder
- **Custom BGM**: Add your music files to `bgm/` folder
- **Advanced Integration**: Because it is based on ComfyUI, you can integrate any ComfyUI custom node

The atomic capability design means you can mix and match components: text generation, image models, TTS engines, and video templates can each be replaced independently.
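
After adding custom workflows, a short script can show what is now in the folder. The sketch below groups workflow files by the part of the file name before the first underscore, so `image_flux.json` lands under `image`. That naming convention is an assumption inferred from the default workflow's name, and `group_workflows` is a hypothetical helper, not a Pixelle-Video function.

```python
from collections import defaultdict
from pathlib import Path

def group_workflows(folder: str = "workflows") -> dict[str, list[str]]:
    """Group *.json workflow files by the prefix before the first underscore."""
    groups: dict[str, list[str]] = defaultdict(list)
    root = Path(folder)
    if root.is_dir():
        for path in sorted(root.rglob("*.json")):
            kind = path.stem.split("_", 1)[0]  # assumed "<kind>_<name>.json" pattern
            groups[kind].append(path.name)
    return dict(groups)

if __name__ == "__main__":
    for kind, names in group_workflows().items():
        print(f"{kind}: {', '.join(names)}")
```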

### What community resources are available?

- **GitHub Repository**: https://github.com/AIDC-AI/Pixelle-Video
- **Issue Tracking**: Submit bugs or feature requests via GitHub Issues
- **Community Support**: Join discussion groups for help and sharing
- **Template Gallery**: View all available templates and their effects
- **Contributions**: The project welcomes contributions under the MIT license

💡 **Tip**: If your question isn't answered here, please submit an issue on GitHub or join our community discussions. We continuously update this FAQ based on user feedback!