🙋‍♀️ Pixelle-Video Frequently Asked Questions

What is Pixelle-Video and how does it work?

Pixelle-Video is an AI-powered video generation tool that creates complete videos from a single topic input. The workflow is:

  • Script Generation → Image Planning → Frame Processing → Video Synthesis

Simply input a topic keyword, and Pixelle-Video automatically handles scriptwriting, image generation, voice synthesis, background music, and final video compilation, so no video editing experience is required.

What installation methods are supported?

Pixelle-Video supports the following installation methods:

  1. Standard Installation:

    git clone https://github.com/AIDC-AI/Pixelle-Video.git
    cd Pixelle-Video
    uv run streamlit run web/app.py
    
  2. Prerequisites:

    • Install uv package manager (see official documentation for your system)
    • Install ffmpeg for video processing:
      • macOS: brew install ffmpeg
      • Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg
      • Windows: Download from ffmpeg.org and add to PATH
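
To confirm the prerequisites are installed and on your PATH, you can run these standard checks from any terminal:

    # Verify the uv package manager is available
    uv --version

    # Verify ffmpeg is available
    ffmpeg -version

If either command is not found, revisit the corresponding installation step above.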

What are the system requirements?

  • Basic: Python 3.10+, uv package manager, ffmpeg
  • For Image Generation: ComfyUI server (local or cloud)
  • For LLM Integration: API keys for supported models (optional for local models)
  • Hardware: GPU recommended for local image generation, but cloud options available

How do I configure the system for first use?

  1. Open the web interface at http://localhost:8501
  2. Expand the "⚙️ System Configuration" panel
  3. Configure two main sections:
    • LLM Configuration:
      • Select preset model (Qwen, GPT-4o, DeepSeek, etc.)
      • Enter API key or configure local model (Ollama)
    • Image Configuration:
      • Local deployment: Set ComfyUI URL (default: http://127.0.0.1:8188)
      • Cloud deployment: Enter RunningHub API key
  4. Click "Save Configuration" to complete setup
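
If you chose local deployment, you can also confirm from the command line that the ComfyUI server is reachable before saving. This is a minimal check assuming the default address shown above (adjust the URL if yours differs):

    # Expect an HTTP 200 status code when ComfyUI is running locally
    curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8188/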

What generation modes are available?

Pixelle-Video offers two main generation modes:

  1. AI Generated Content:

    • Input just a topic keyword
    • AI automatically writes the script and creates the video
    • Example: "Why develop a reading habit"
  2. Fixed Script Content:

    • Provide your complete script text
    • Skip AI scriptwriting and go directly to video generation
    • Ideal when you already have prepared content

How do I customize the audio settings?

Audio customization options include:

  • Background Music (BGM):

    • No BGM: Pure voice narration
    • Built-in music: Select from preset tracks (e.g., default.mp3)
    • Custom music: Place your MP3/WAV files in the bgm/ folder
  • Text-to-Speech (TTS):

    • Select from available TTS workflows (Edge-TTS, Index-TTS, etc.)
    • System automatically scans workflows/ folder for available options
    • Preview voice effect with test text
  • Voice Cloning:

    • Upload reference audio (MP3/WAV/FLAC) for supported TTS workflows
    • Use reference audio during preview and generation
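
For example, adding a custom background-music track and inspecting a reference audio clip can be done from the command line; the filenames below are placeholders:

    # Copy your own MP3/WAV into the bgm/ folder so it can be selected as custom music
    cp ~/Music/my_track.mp3 bgm/

    # Inspect a reference audio file (format, sample rate, duration) with ffprobe,
    # which ships with ffmpeg
    ffprobe -hide_banner my_reference.wav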

How do I customize the visual style?

Visual customization includes:

  • Image Generation Workflow:

    • Select from available ComfyUI workflows (local or cloud)
    • Default workflow: image_flux.json
    • Custom workflows can be added to workflows/ folder
  • Image Dimensions:

    • Set width and height in pixels (default: 1024x1024)
    • Note: Different models have different size limitations
  • Style Prompt Prefix:

    • Controls the overall image style (the prefix must be written in English)
    • Example: "Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style"
    • Click "Preview Style" to test the effect
  • Video Templates:

    • Choose from multiple templates grouped by aspect ratio (vertical/horizontal/square)
    • Preview templates with custom parameters
    • Advanced users can create custom HTML templates in templates/ folder
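
To see which workflows and templates are currently available for selection, you can list the folders the interface scans (run from the project root):

    # Image and TTS workflow definitions (e.g. image_flux.json) live here
    ls workflows/

    # HTML video templates live here
    ls templates/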

What AI models are supported?

Pixelle-Video supports multiple AI model providers:

  • LLM Models: GPT, Qwen (通义千问), DeepSeek, Ollama (local)
  • Image Generation: ComfyUI with various models (FLUX, SDXL, etc.)
  • TTS Engines: Edge-TTS, Index-TTS, ChatTTS, and more

The modular architecture allows flexible replacement of any component - for example, you can replace the image generation model with FLUX or switch TTS to ChatTTS.

What are the cost options for running Pixelle-Video?

Pixelle-Video offers three cost tiers:

  1. Completely Free:

    • LLM: Ollama (local)
    • Image Generation: Local ComfyUI deployment
    • Total cost: $0
  2. Recommended Balanced Option:

    • LLM: Qwen (通义千问) - very low cost, high value
    • Image Generation: Local ComfyUI deployment
    • Cost: Minimal API fees for text generation only
  3. Cloud-Only Option:

    • LLM: OpenAI API
    • Image Generation: RunningHub cloud service
    • Cost: Higher but requires no local hardware

Recommendation: Use the fully local setup if you have a GPU; otherwise, Qwen + local ComfyUI offers the best value.

How long does video generation take?

Generation time depends on several factors:

  • Number of scenes in the script
  • Network speed for API calls
  • AI inference speed (local vs cloud)
  • Video length and resolution

Typical generation time: 2-10 minutes for most videos. The interface shows real-time progress through each stage (script → images → audio → final video).

What should I do if the video quality is unsatisfactory?

Try these adjustments:

  • Script Quality:

    • Switch to a different LLM (different models have different writing styles)
    • Use "Fixed Script Content" mode with your own refined script
  • Image Quality:

    • Adjust image dimensions to match model requirements
    • Modify the prompt prefix to change visual style
    • Try different ComfyUI workflows
  • Audio Quality:

    • Switch TTS workflow (Edge-TTS vs Index-TTS vs others)
    • Upload reference audio for voice cloning
    • Adjust TTS parameters
  • Video Layout:

    • Try different video templates
    • Change video dimensions (vertical/horizontal/square)

Where are the generated videos saved?

All generated videos are automatically saved to the output/ folder in the project directory. The interface displays detailed information after generation:

  • Video duration
  • File size
  • Number of scenes
  • Download link
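
You can also inspect the results directly from the command line; the video filename below is a placeholder:

    # List generated videos and their file sizes
    ls -lh output/

    # Check a video's duration and size with ffprobe (bundled with ffmpeg)
    ffprobe -v error -show_entries format=duration,size -of default=noprint_wrappers=1 output/your_video.mp4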

How do I troubleshoot common errors?

  1. FFmpeg Errors:

    • Verify ffmpeg installation with ffmpeg -version
    • Ensure ffmpeg is in your system PATH
  2. API Connection Issues:

    • Verify API keys are correct
    • Test LLM connection in system configuration
    • For ComfyUI: Click "Test Connection" in image configuration
  3. Image Generation Failures:

    • Ensure ComfyUI server is running
    • Check image dimensions are supported by your model
    • Verify workflow files exist in workflows/ folder
  4. Audio Generation Issues:

    • Ensure selected TTS workflow is properly configured
    • For voice cloning: verify reference audio format is supported
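
A few quick command-line checks for the issues above; the ComfyUI address assumes the default local deployment:

    # Confirm ffmpeg is installed and visible on PATH
    ffmpeg -version

    # Confirm the ComfyUI server is reachable (expect an HTTP 200 status code);
    # adjust the URL if you changed it in the image configuration
    curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8188/

    # Confirm the expected workflow files are present
    ls workflows/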

How do I extend Pixelle-Video with custom features?

Pixelle-Video is built on the ComfyUI architecture, which allows deep customization:

  • Custom Workflows: Add your own ComfyUI workflows to workflows/ folder
  • Custom Templates: Create HTML templates in templates/ folder
  • Custom BGM: Add your music files to bgm/ folder
  • Advanced Integration: Since it's based on ComfyUI, you can integrate any ComfyUI custom nodes

The atomic capability design means you can mix and match any component - replace text generation, image models, TTS engines, or video templates independently.
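
As a concrete sketch, the first three extension points above only require dropping files into the right folders; the filenames below are hypothetical:

    # Make a custom ComfyUI workflow selectable in the interface
    cp my_image_workflow.json workflows/

    # Add a custom HTML video template
    cp my_template.html templates/

    # Add custom background music
    cp my_music.mp3 bgm/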

What community resources are available?

  • GitHub Repository: https://github.com/AIDC-AI/Pixelle-Video
  • Issue Tracking: Submit bugs or feature requests via GitHub Issues
  • Community Support: Join discussion groups for help and sharing
  • Template Gallery: View all available templates and their effects
  • Contributions: The project welcomes contributions under the MIT license

💡 Tip: If your question isn't answered here, please submit an issue on GitHub or join our community discussions. We continuously update this FAQ based on user feedback!