# 🙋‍♀️ Pixelle-Video Frequently Asked Questions

### What is Pixelle-Video and how does it work?

Pixelle-Video is an AI-powered video generation tool that creates complete videos from a single topic input. The workflow is:

**Script Generation** → **Image Planning** → **Frame Processing** → **Video Synthesis**

Simply input a topic keyword, and Pixelle-Video automatically handles scriptwriting, image generation, voice synthesis, background music, and final video compilation; no video editing experience is required.

### What installation methods are supported?

Pixelle-Video supports the following installation steps:

1. **Prerequisites**:
   - Install the `uv` package manager (see the official documentation for your system)
   - Install `ffmpeg` for video processing:
     - **macOS**: `brew install ffmpeg`
     - **Ubuntu/Debian**: `sudo apt update && sudo apt install ffmpeg`
     - **Windows**: download a build from ffmpeg.org and add it to your PATH

2. **Standard Installation**:

   ```bash
   git clone https://github.com/AIDC-AI/Pixelle-Video.git
   cd Pixelle-Video
   uv run streamlit run web/app.py
   ```

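Before the first run, it can help to confirm that both prerequisites are actually on your PATH. A minimal sketch (this script is not part of Pixelle-Video; it only checks for the two tools named above):

```python
import shutil


def check_prerequisites(tools=("uv", "ffmpeg")):
    """Return a dict mapping each required tool name to whether it is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'MISSING - install it first'}")
```
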
### What are the system requirements?

- **Basic**: Python 3.10+, the `uv` package manager, and `ffmpeg`
- **For Image Generation**: a ComfyUI server (local or cloud)
- **For LLM Integration**: API keys for supported models (optional when using local models)
- **Hardware**: a GPU is recommended for local image generation, but cloud options are available

### How do I configure the system for first use?

1. Open the web interface at http://localhost:8501
2. Expand the "⚙️ System Configuration" panel
3. Configure the two main sections:
   - **LLM Configuration**:
     - Select a preset model (Qwen, GPT-4o, DeepSeek, etc.)
     - Enter an API key, or configure a local model (Ollama)
   - **Image Configuration**:
     - **Local deployment**: set the ComfyUI URL (default: http://127.0.0.1:8188)
     - **Cloud deployment**: enter a RunningHub API key
4. Click "Save Configuration" to complete setup

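The "Test Connection" idea for a local ComfyUI server can also be reproduced outside the UI with a few lines of standard-library Python. This is a sketch that assumes only that the server answers HTTP on its base URL (the default `http://127.0.0.1:8188` from step 3):

```python
import urllib.error
import urllib.request


def comfyui_reachable(base_url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at base_url within the timeout."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, or unresolvable
```

If this returns `False` while ComfyUI appears to be running, check that the port in the configured URL matches the one ComfyUI was started with.
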
### What generation modes are available?

Pixelle-Video offers two main generation modes:

1. **AI Generated Content**:
   - Input just a topic keyword
   - The AI automatically writes the script and creates the video
   - Example: "Why develop a reading habit"

2. **Fixed Script Content**:
   - Provide your complete script text
   - Skip AI scriptwriting and go directly to video generation
   - Ideal when you already have prepared content

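Since the two modes differ only in whether a script is supplied, a generation request can be modeled with a single optional field. A hypothetical sketch (the class and field names are illustrative, not Pixelle-Video's actual API):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    topic: str                    # always required, e.g. "Why develop a reading habit"
    script: Optional[str] = None  # None -> AI writes the script; text -> fixed-script mode

    @property
    def mode(self) -> str:
        """Derive the generation mode from whether a script was provided."""
        return "fixed_script" if self.script else "ai_generated"
```

For example, `GenerationRequest("Why develop a reading habit").mode` yields `"ai_generated"`, while passing `script=` switches to fixed-script mode.
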
### How do I customize the audio settings?

Audio customization options include:

- **Background Music (BGM)**:
  - No BGM: pure voice narration
  - Built-in music: select from preset tracks (e.g., default.mp3)
  - Custom music: place your MP3/WAV files in the `bgm/` folder

- **Text-to-Speech (TTS)**:
  - Select from the available TTS workflows (Edge-TTS, Index-TTS, etc.)
  - The system automatically scans the `workflows/` folder for available options
  - Preview the voice with test text

- **Voice Cloning**:
  - Upload reference audio (MP3/WAV/FLAC) for TTS workflows that support it
  - The reference audio is used during both preview and generation

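The "automatically scans the `workflows/` folder" behavior amounts to a directory listing. A minimal sketch, assuming a hypothetical convention where TTS workflow files are JSON files with "tts" in the filename (Pixelle-Video's real discovery rules may differ):

```python
from pathlib import Path
from typing import List


def list_tts_workflows(folder: str = "workflows") -> List[str]:
    """Return sorted names of workflow JSON files that look like TTS workflows."""
    root = Path(folder)
    if not root.is_dir():
        return []  # missing folder -> nothing to offer in the dropdown
    return sorted(p.name for p in root.glob("*.json") if "tts" in p.name.lower())
```

Dropping a new workflow file into the folder is then enough for it to appear in the list on the next scan.
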
### How do I customize the visual style?

Visual customization includes:

- **Image Generation Workflow**:
  - Select from the available ComfyUI workflows (local or cloud)
  - Default workflow: `image_flux.json`
  - Custom workflows can be added to the `workflows/` folder

- **Image Dimensions**:
  - Set width and height in pixels (default: 1024x1024)
  - Note: different models have different size limitations

- **Style Prompt Prefix**:
  - Controls the overall image style (must be written in English)
  - Example: "Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style"
  - Click "Preview Style" to test the effect

- **Video Templates**:
  - Choose from multiple templates grouped by aspect ratio (vertical/horizontal/square)
  - Preview templates with custom parameters
  - Advanced users can create custom HTML templates in the `templates/` folder

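On the size limitations: many diffusion models (SDXL and FLUX among them) work best when width and height are multiples of a fixed step such as 64. The exact constraint is model-specific, so treat the step value below as an assumption and check your model's documentation; this helper just snaps a requested dimension to the nearest valid value:

```python
def snap_dimension(value: int, step: int = 64, minimum: int = 256) -> int:
    """Round a requested pixel dimension to the nearest multiple of `step`,
    clamped to a lower bound. Step and minimum are illustrative defaults."""
    return max(minimum, round(value / step) * step)
```

For example, a requested width of 1000 would be snapped to 1024 with the default step.
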
### What AI models are supported?

Pixelle-Video supports multiple AI model providers:

- **LLM Models**: GPT, Qwen (通义千问), DeepSeek, Ollama (local)
- **Image Generation**: ComfyUI with various models (FLUX, SDXL, etc.)
- **TTS Engines**: Edge-TTS, Index-TTS, ChatTTS, and more

The modular architecture allows any component to be replaced flexibly: for example, you can swap the image generation model for FLUX, or switch the TTS engine to ChatTTS.

### What are the cost options for running Pixelle-Video?

Pixelle-Video offers three cost tiers:

1. **Completely Free**:
   - LLM: Ollama (local)
   - Image generation: local ComfyUI deployment
   - Total cost: $0

2. **Recommended Balanced Option**:
   - LLM: Qwen (通义千问), which is very low cost for the quality
   - Image generation: local ComfyUI deployment
   - Cost: minimal API fees for text generation only

3. **Cloud-Only Option**:
   - LLM: OpenAI API
   - Image generation: RunningHub cloud service
   - Cost: higher, but no local hardware required

**Recommendation**: use a fully local deployment if you have a GPU; otherwise, Qwen + local ComfyUI offers the best value.

### How long does video generation take?

Generation time depends on several factors:

- Number of scenes in the script
- Network speed for API calls
- AI inference speed (local vs. cloud)
- Video length and resolution

Typical generation time: **2-10 minutes** for most videos. The interface shows real-time progress through each stage (script → images → audio → final video).

### What should I do if the video quality is unsatisfactory?

Try these adjustments:

- **Script Quality**:
  - Switch to a different LLM model (different models have different writing styles)
  - Use "Fixed Script Content" mode with your own refined script

- **Image Quality**:
  - Adjust the image dimensions to match your model's requirements
  - Modify the prompt prefix to change the visual style
  - Try different ComfyUI workflows

- **Audio Quality**:
  - Switch the TTS workflow (Edge-TTS vs. Index-TTS vs. others)
  - Upload reference audio for voice cloning
  - Adjust the TTS parameters

- **Video Layout**:
  - Try different video templates
  - Change the video dimensions (vertical/horizontal/square)

### Where are the generated videos saved?

All generated videos are automatically saved to the `output/` folder in the project directory. The interface displays detailed information after generation:

- Video duration
- File size
- Number of scenes
- Download link

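Part of that summary can be reproduced from the `output/` folder alone with the standard library. A sketch assuming only that finished videos land there as `.mp4` files (duration and scene count would need the project's own metadata or a tool like ffprobe):

```python
from pathlib import Path
from typing import Optional


def newest_video_info(folder: str = "output") -> Optional[dict]:
    """Return the name and size (in MB) of the most recently modified .mp4, or None."""
    root = Path(folder)
    if not root.is_dir():
        return None
    videos = sorted(root.glob("*.mp4"), key=lambda p: p.stat().st_mtime)
    if not videos:
        return None
    latest = videos[-1]
    return {"name": latest.name,
            "size_mb": round(latest.stat().st_size / 1_048_576, 2)}
```
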
### How do I troubleshoot common errors?

1. **FFmpeg Errors**:
   - Verify the ffmpeg installation with `ffmpeg -version`
   - Ensure ffmpeg is on your system PATH

2. **API Connection Issues**:
   - Verify that your API keys are correct
   - Test the LLM connection in the system configuration panel
   - For ComfyUI: click "Test Connection" in the image configuration

3. **Image Generation Failures**:
   - Ensure the ComfyUI server is running
   - Check that the image dimensions are supported by your model
   - Verify that the workflow files exist in the `workflows/` folder

4. **Audio Generation Issues**:
   - Ensure the selected TTS workflow is properly configured
   - For voice cloning: verify that the reference audio format is supported

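The `ffmpeg -version` check from step 1 can be scripted so it degrades gracefully when ffmpeg is missing rather than crashing. A minimal sketch using only the standard library:

```python
import subprocess
from typing import Optional


def ffmpeg_version() -> Optional[str]:
    """Return the first line of `ffmpeg -version` output, or None if unavailable."""
    try:
        out = subprocess.run(
            ["ffmpeg", "-version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None  # not installed, not on PATH, or exited with an error
    return out.stdout.splitlines()[0] if out.stdout else None
```

A `None` result here points at the two fixes above: install ffmpeg, or add its directory to PATH.
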
### How do I extend Pixelle-Video with custom features?

Pixelle-Video is built on the ComfyUI architecture, allowing deep customization:

- **Custom Workflows**: add your own ComfyUI workflows to the `workflows/` folder
- **Custom Templates**: create HTML templates in the `templates/` folder
- **Custom BGM**: add your music files to the `bgm/` folder
- **Advanced Integration**: because it is based on ComfyUI, you can integrate any ComfyUI custom nodes

The atomic capability design means you can mix and match any component: replace the text generation, image models, TTS engines, or video templates independently.

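That mix-and-match design can be pictured as a pipeline of independent, swappable stages. A hypothetical sketch whose stage names mirror the workflow described in this FAQ (none of these are Pixelle-Video's real interfaces):

```python
from typing import Callable, Dict

# Each stage is just a callable; swapping a component means replacing one entry.
Pipeline = Dict[str, Callable[[str], str]]


def default_pipeline() -> Pipeline:
    return {
        "script": lambda topic: f"script for {topic}",      # e.g. Qwen or GPT
        "images": lambda script: f"images for [{script}]",  # e.g. FLUX via ComfyUI
        "tts":    lambda script: f"audio for [{script}]",   # e.g. Edge-TTS or ChatTTS
    }


def run(pipeline: Pipeline, topic: str) -> dict:
    """Run the stages in order, feeding the script to the later stages."""
    script = pipeline["script"](topic)
    return {"script": script,
            "images": pipeline["images"](script),
            "audio": pipeline["tts"](script)}
```

Replacing `pipeline["tts"]` with a different callable swaps the TTS engine without touching the script or image stages, which is the property the "atomic capability" phrasing describes.
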
### What community resources are available?

- **GitHub Repository**: https://github.com/AIDC-AI/Pixelle-Video
- **Issue Tracking**: submit bugs or feature requests via GitHub Issues
- **Community Support**: join the discussion groups for help and sharing
- **Template Gallery**: view all available templates and their effects
- **Contributions**: the project welcomes contributions under the MIT license

💡 **Tip**: if your question isn't answered here, please submit an issue on GitHub or join our community discussions. We continuously update this FAQ based on user feedback!