# 🙋‍♀️ Pixelle-Video Frequently Asked Questions

### What is Pixelle-Video and how does it work?

Pixelle-Video is an AI-powered video generation tool that creates complete videos from a single topic input. The workflow is:

**Script Generation** → **Image Planning** → **Frame Processing** → **Video Synthesis**

Simply input a topic keyword, and Pixelle-Video automatically handles scriptwriting, image generation, voice synthesis, background music, and final video compilation; no video editing experience is required.

### What installation methods are supported?

Pixelle-Video supports the following installation steps:

1. **Prerequisites**:
   - Install the `uv` package manager (see the official documentation for your system)
   - Install `ffmpeg` for video processing:
     - **macOS**: `brew install ffmpeg`
     - **Ubuntu/Debian**: `sudo apt update && sudo apt install ffmpeg`
     - **Windows**: download a build from ffmpeg.org and add it to your PATH

2. **Standard Installation**:

   ```bash
   git clone https://github.com/AIDC-AI/Pixelle-Video.git
   cd Pixelle-Video
   uv run streamlit run web/app.py
   ```

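Before the first run, it can help to confirm that both prerequisites are actually on your PATH. A minimal sketch (this script is not part of Pixelle-Video; it only checks for the two tools named above):

```python
import shutil


def check_prerequisites(tools=("uv", "ffmpeg")):
    """Return a dict mapping each required tool name to whether it is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'MISSING - install it first'}")
```
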
### What are the system requirements?

- **Basic**: Python 3.10+, the `uv` package manager, and `ffmpeg`
- **For Image Generation**: a ComfyUI server (local or cloud)
- **For LLM Integration**: API keys for supported models (optional when using local models)
- **Hardware**: a GPU is recommended for local image generation, but cloud options are available

### How do I configure the system for first use?

1. Open the web interface at http://localhost:8501
2. Expand the "⚙️ System Configuration" panel
3. Configure the two main sections:
   - **LLM Configuration**:
     - Select a preset model (Qwen, GPT-4o, DeepSeek, etc.)
     - Enter an API key, or configure a local model (Ollama)
   - **Image Configuration**:
     - **Local deployment**: set the ComfyUI URL (default: http://127.0.0.1:8188)
     - **Cloud deployment**: enter a RunningHub API key
4. Click "Save Configuration" to complete setup

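The "Test Connection" idea for a local ComfyUI server can also be reproduced outside the UI with a few lines of standard-library Python. This is a sketch that assumes only that the server answers HTTP on its base URL (the default `http://127.0.0.1:8188` from step 3):

```python
import urllib.error
import urllib.request


def comfyui_reachable(base_url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at base_url within the timeout."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, or unresolvable
```

If this returns `False` while ComfyUI appears to be running, check that the port in the configured URL matches the one ComfyUI was started with.
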
### What generation modes are available?

Pixelle-Video offers two main generation modes:

1. **AI Generated Content**:
   - Input just a topic keyword
   - The AI automatically writes the script and creates the video
   - Example: "Why develop a reading habit"

2. **Fixed Script Content**:
   - Provide your complete script text
   - Skip AI scriptwriting and go directly to video generation
   - Ideal when you already have prepared content

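Since the two modes differ only in whether a script is supplied, a generation request can be modeled with a single optional field. A hypothetical sketch (the class and field names are illustrative, not Pixelle-Video's actual API):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    topic: str                    # always required, e.g. "Why develop a reading habit"
    script: Optional[str] = None  # None -> AI writes the script; text -> fixed-script mode

    @property
    def mode(self) -> str:
        """Derive the generation mode from whether a script was provided."""
        return "fixed_script" if self.script else "ai_generated"
```

For example, `GenerationRequest("Why develop a reading habit").mode` yields `"ai_generated"`, while passing `script=` switches to fixed-script mode.
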
### How do I customize the audio settings?

Audio customization options include:

- **Background Music (BGM)**:
  - No BGM: pure voice narration
  - Built-in music: select from preset tracks (e.g., default.mp3)
  - Custom music: place your MP3/WAV files in the `bgm/` folder

- **Text-to-Speech (TTS)**:
  - Select from the available TTS workflows (Edge-TTS, Index-TTS, etc.)
  - The system automatically scans the `workflows/` folder for available options
  - Preview the voice with test text

- **Voice Cloning**:
  - Upload reference audio (MP3/WAV/FLAC) for TTS workflows that support it
  - The reference audio is used during both preview and generation

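The "automatically scans the `workflows/` folder" behavior amounts to a directory listing. A minimal sketch, assuming a hypothetical convention where TTS workflow files are JSON files with "tts" in the filename (Pixelle-Video's real discovery rules may differ):

```python
from pathlib import Path
from typing import List


def list_tts_workflows(folder: str = "workflows") -> List[str]:
    """Return sorted names of workflow JSON files that look like TTS workflows."""
    root = Path(folder)
    if not root.is_dir():
        return []  # missing folder -> nothing to offer in the dropdown
    return sorted(p.name for p in root.glob("*.json") if "tts" in p.name.lower())
```

Dropping a new workflow file into the folder is then enough for it to appear in the list on the next scan.
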
### How do I customize the visual style?

Visual customization includes:

- **Image Generation Workflow**:
  - Select from the available ComfyUI workflows (local or cloud)
  - Default workflow: `image_flux.json`
  - Custom workflows can be added to the `workflows/` folder

- **Image Dimensions**:
  - Set width and height in pixels (default: 1024x1024)
  - Note: different models have different size limitations

- **Style Prompt Prefix**:
  - Controls the overall image style (must be written in English)
  - Example: "Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style"
  - Click "Preview Style" to test the effect

- **Video Templates**:
  - Choose from multiple templates grouped by aspect ratio (vertical/horizontal/square)
  - Preview templates with custom parameters
  - Advanced users can create custom HTML templates in the `templates/` folder

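On the size limitations: many diffusion models (SDXL and FLUX among them) work best when width and height are multiples of a fixed step such as 64. The exact constraint is model-specific, so treat the step value below as an assumption and check your model's documentation; this helper just snaps a requested dimension to the nearest valid value:

```python
def snap_dimension(value: int, step: int = 64, minimum: int = 256) -> int:
    """Round a requested pixel dimension to the nearest multiple of `step`,
    clamped to a lower bound. Step and minimum are illustrative defaults."""
    return max(minimum, round(value / step) * step)
```

For example, a requested width of 1000 would be snapped to 1024 with the default step.
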
### What AI models are supported?

Pixelle-Video supports multiple AI model providers:

- **LLM Models**: GPT, Qwen (通义千问), DeepSeek, Ollama (local)
- **Image Generation**: ComfyUI with various models (FLUX, SDXL, etc.)
- **TTS Engines**: Edge-TTS, Index-TTS, ChatTTS, and more

The modular architecture allows any component to be replaced flexibly: for example, you can swap the image generation model for FLUX, or switch the TTS engine to ChatTTS.

### What are the cost options for running Pixelle-Video?

Pixelle-Video offers three cost tiers:

1. **Completely Free**:
   - LLM: Ollama (local)
   - Image generation: local ComfyUI deployment
   - Total cost: $0

2. **Recommended Balanced Option**:
   - LLM: Qwen (通义千问), which is very low cost for the quality
   - Image generation: local ComfyUI deployment
   - Cost: minimal API fees for text generation only

3. **Cloud-Only Option**:
   - LLM: OpenAI API
   - Image generation: RunningHub cloud service
   - Cost: higher, but no local hardware required

**Recommendation**: use a fully local deployment if you have a GPU; otherwise, Qwen + local ComfyUI offers the best value.

### How long does video generation take?

Generation time depends on several factors:

- Number of scenes in the script
- Network speed for API calls
- AI inference speed (local vs. cloud)
- Video length and resolution

Typical generation time: **2-10 minutes** for most videos. The interface shows real-time progress through each stage (script → images → audio → final video).

### What should I do if the video quality is unsatisfactory?

Try these adjustments:

- **Script Quality**:
  - Switch to a different LLM model (different models have different writing styles)
  - Use "Fixed Script Content" mode with your own refined script

- **Image Quality**:
  - Adjust the image dimensions to match your model's requirements
  - Modify the prompt prefix to change the visual style
  - Try different ComfyUI workflows

- **Audio Quality**:
  - Switch the TTS workflow (Edge-TTS vs. Index-TTS vs. others)
  - Upload reference audio for voice cloning
  - Adjust the TTS parameters

- **Video Layout**:
  - Try different video templates
  - Change the video dimensions (vertical/horizontal/square)

### Where are the generated videos saved?

All generated videos are automatically saved to the `output/` folder in the project directory. The interface displays detailed information after generation:

- Video duration
- File size
- Number of scenes
- Download link

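Part of that summary can be reproduced from the `output/` folder alone with the standard library. A sketch assuming only that finished videos land there as `.mp4` files (duration and scene count would need the project's own metadata or a tool like ffprobe):

```python
from pathlib import Path
from typing import Optional


def newest_video_info(folder: str = "output") -> Optional[dict]:
    """Return the name and size (in MB) of the most recently modified .mp4, or None."""
    root = Path(folder)
    if not root.is_dir():
        return None
    videos = sorted(root.glob("*.mp4"), key=lambda p: p.stat().st_mtime)
    if not videos:
        return None
    latest = videos[-1]
    return {"name": latest.name,
            "size_mb": round(latest.stat().st_size / 1_048_576, 2)}
```
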
### How do I troubleshoot common errors?

1. **FFmpeg Errors**:
   - Verify the ffmpeg installation with `ffmpeg -version`
   - Ensure ffmpeg is on your system PATH

2. **API Connection Issues**:
   - Verify that your API keys are correct
   - Test the LLM connection in the system configuration panel
   - For ComfyUI: click "Test Connection" in the image configuration

3. **Image Generation Failures**:
   - Ensure the ComfyUI server is running
   - Check that the image dimensions are supported by your model
   - Verify that the workflow files exist in the `workflows/` folder

4. **Audio Generation Issues**:
   - Ensure the selected TTS workflow is properly configured
   - For voice cloning: verify that the reference audio format is supported

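The `ffmpeg -version` check from step 1 can be scripted so it degrades gracefully when ffmpeg is missing rather than crashing. A minimal sketch using only the standard library:

```python
import subprocess
from typing import Optional


def ffmpeg_version() -> Optional[str]:
    """Return the first line of `ffmpeg -version` output, or None if unavailable."""
    try:
        out = subprocess.run(
            ["ffmpeg", "-version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None  # not installed, not on PATH, or exited with an error
    return out.stdout.splitlines()[0] if out.stdout else None
```

A `None` result here points at the two fixes above: install ffmpeg, or add its directory to PATH.
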
### How do I extend Pixelle-Video with custom features?

Pixelle-Video is built on the ComfyUI architecture, allowing deep customization:

- **Custom Workflows**: add your own ComfyUI workflows to the `workflows/` folder
- **Custom Templates**: create HTML templates in the `templates/` folder
- **Custom BGM**: add your music files to the `bgm/` folder
- **Advanced Integration**: because it is based on ComfyUI, you can integrate any ComfyUI custom nodes

The atomic capability design means you can mix and match any component: replace the text generation, image models, TTS engines, or video templates independently.

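That mix-and-match design can be pictured as a pipeline of independent, swappable stages. A hypothetical sketch whose stage names mirror the workflow described in this FAQ (none of these are Pixelle-Video's real interfaces):

```python
from typing import Callable, Dict

# Each stage is just a callable; swapping a component means replacing one entry.
Pipeline = Dict[str, Callable[[str], str]]


def default_pipeline() -> Pipeline:
    return {
        "script": lambda topic: f"script for {topic}",      # e.g. Qwen or GPT
        "images": lambda script: f"images for [{script}]",  # e.g. FLUX via ComfyUI
        "tts":    lambda script: f"audio for [{script}]",   # e.g. Edge-TTS or ChatTTS
    }


def run(pipeline: Pipeline, topic: str) -> dict:
    """Run the stages in order, feeding the script to the later stages."""
    script = pipeline["script"](topic)
    return {"script": script,
            "images": pipeline["images"](script),
            "audio": pipeline["tts"](script)}
```

Replacing `pipeline["tts"]` with a different callable swaps the TTS engine without touching the script or image stages, which is the property the "atomic capability" phrasing describes.
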
### What community resources are available?

- **GitHub Repository**: https://github.com/AIDC-AI/Pixelle-Video
- **Issue Tracking**: submit bugs or feature requests via GitHub Issues
- **Community Support**: join the discussion groups for help and sharing
- **Template Gallery**: view all available templates and their effects
- **Contributions**: the project welcomes contributions under the MIT license

💡 **Tip**: if your question isn't answered here, please submit an issue on GitHub or join our community discussions. We continuously update this FAQ based on user feedback!