Add example gallery logic

This commit is contained in:
puke
2025-11-03 17:44:33 +08:00
parent 6f114b61c7
commit ec395196cd
43 changed files with 2567 additions and 406 deletions


@@ -0,0 +1,54 @@
# Architecture
Technical architecture overview of Pixelle-Video.
---
## Core Architecture
Pixelle-Video uses a layered architecture design:
- **Web Layer**: Streamlit Web interface
- **Service Layer**: Core business logic
- **ComfyUI Layer**: Image and TTS generation
---
## Main Components
### PixelleVideoCore
Core service class coordinating all sub-services.
### LLM Service
Responsible for calling large language models to generate scripts.
### Image Service
Responsible for calling ComfyUI to generate images.
### TTS Service
Responsible for calling ComfyUI to generate speech.
### Video Generator
Responsible for composing the final video.
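The division of labor above can be sketched as a coordinator delegating to sub-services. A minimal illustration (all class and method names here are simplified stand-ins, not the project's actual API):

```python
import asyncio

# Stand-in sub-services; the real ones call an LLM API and ComfyUI.
class LLMService:
    async def generate_script(self, topic: str, n_scenes: int) -> list[str]:
        return [f"Scene {i + 1} about {topic}" for i in range(n_scenes)]

class ImageService:
    async def generate_image(self, prompt: str) -> str:
        return f"image_for({prompt}).png"

class TTSService:
    async def synthesize(self, text: str) -> str:
        return f"audio_for({text}).wav"

class VideoGenerator:
    def compose(self, scenes, images, audios) -> str:
        return f"video_with_{len(scenes)}_scenes.mp4"

class Core:
    """Coordinates script -> images -> speech -> final video."""
    def __init__(self):
        self.llm = LLMService()
        self.image = ImageService()
        self.tts = TTSService()
        self.video = VideoGenerator()

    async def generate_video(self, topic: str, n_scenes: int = 3) -> str:
        scenes = await self.llm.generate_script(topic, n_scenes)
        # Per-scene image and speech generation can run concurrently.
        images = await asyncio.gather(*(self.image.generate_image(s) for s in scenes))
        audios = await asyncio.gather(*(self.tts.synthesize(s) for s in scenes))
        return self.video.compose(scenes, images, audios)

print(asyncio.run(Core().generate_video("reading habits")))  # video_with_3_scenes.mp4
```

The per-scene `asyncio.gather` pattern reflects the AsyncIO-based backend noted in the tech stack.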
---
## Tech Stack
- **Backend**: Python 3.10+, AsyncIO
- **Web**: Streamlit
- **AI**: OpenAI API, ComfyUI
- **Configuration**: YAML
- **Tools**: uv (package management)
---
## More Information
Detailed architecture documentation coming soon.


@@ -0,0 +1,50 @@
# Contributing
Thank you for your interest in contributing to Pixelle-Video!
---
## How to Contribute
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
---
## Development Setup
```bash
# Clone your fork
git clone https://github.com/your-username/Pixelle-Video.git
cd Pixelle-Video
# Install development dependencies
uv sync
# Run tests
pytest
```
---
## Code Standards
- All code and comments in English
- Follow PEP 8 standards
- Add appropriate tests
---
## Submit Issues
Having problems or feature suggestions? Please submit at [GitHub Issues](https://github.com/PixelleLab/Pixelle-Video/issues).
---
## Code of Conduct
Please be friendly and respectful. We are committed to fostering an inclusive community environment.

docs/en/faq.md (new file, 78 lines)

@@ -0,0 +1,78 @@
# FAQ
Frequently Asked Questions.
---
## Installation
### Q: How to install uv?
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### Q: Can I use something other than uv?
Yes, you can use the traditional pip + venv approach instead.
---
## Configuration
### Q: Do I need to configure ComfyUI?
Not necessarily. You can use the RunningHub cloud service without deploying ComfyUI locally.
### Q: Which LLMs are supported?
All OpenAI-compatible LLMs, including:
- Qianwen
- GPT-4o
- DeepSeek
- Ollama (local)
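"OpenAI-compatible" means every provider above accepts the same request shape; only the endpoint, key, and model name change. A stdlib-only sketch of the request being built (the Ollama endpoint shown is its standard OpenAI-compatible address):

```python
def chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build the URL, headers, and JSON payload for an OpenAI-compatible
    chat completion call; only these three values change between providers."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    return url, headers, payload

# Qianwen via DashScope's OpenAI-compatible mode:
url, headers, payload = chat_request(
    "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "sk-your-key", "qwen-plus",
    "Write a short video script about reading habits",
)
print(url)  # https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions

# For local Ollama, only the endpoint, key, and model change, e.g.:
# chat_request("http://localhost:11434/v1", "ollama", "llama3", "...")
```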
---
## Usage
### Q: How long does first-time usage take?
Generating a 3-5 scene video takes approximately 2-5 minutes.
### Q: What if I'm not satisfied with the video?
Try:
1. Change LLM model
2. Adjust image dimensions and prompt prefix
3. Change TTS workflow
4. Try different video templates
### Q: What are the costs?
- **Completely Free**: Ollama + Local ComfyUI = $0
- **Recommended**: Qianwen + Local ComfyUI ≈ $0.01-0.05/video
- **Cloud Solution**: OpenAI + RunningHub (higher cost)
---
## Troubleshooting
### Q: ComfyUI connection failed
1. Confirm ComfyUI is running
2. Check if URL is correct
3. Click "Test Connection" in Web interface
### Q: LLM API call failed
1. Check if API Key is correct
2. Check network connection
3. Review error messages
---
## Other Questions
Have other questions? Check [Troubleshooting](troubleshooting.md) or submit an [Issue](https://github.com/PixelleLab/Pixelle-Video/issues).

docs/en/gallery/index.md (new file, 45 lines)

@@ -0,0 +1,45 @@
# 🎬 Video Gallery
Showcase of videos created with Pixelle-Video. Click on cards to view complete workflows and configuration files.
---
<div class="grid cards" markdown>

-   **Reading Habit**

    ---

    <video controls width="100%" style="border-radius: 8px;">
      <source src="https://your-oss-bucket.oss-cn-hangzhou.aliyuncs.com/pixelle-video/reading-habit/video.mp4" type="video/mp4">
    </video>

    [:octicons-mark-github-16: View Workflows & Config](https://github.com/PixelleLab/Pixelle-Video/tree/main/docs/gallery/reading-habit)

-   **Work Efficiency**

    ---

    <video controls width="100%" style="border-radius: 8px;">
      <source src="https://your-oss-bucket.oss-cn-hangzhou.aliyuncs.com/pixelle-video/work-efficiency/video.mp4" type="video/mp4">
    </video>

    [:octicons-mark-github-16: View Workflows & Config](https://github.com/PixelleLab/Pixelle-Video/tree/main/docs/gallery/work-efficiency)

-   **Healthy Diet**

    ---

    <video controls width="100%" style="border-radius: 8px;">
      <source src="https://your-oss-bucket.oss-cn-hangzhou.aliyuncs.com/pixelle-video/healthy-diet/video.mp4" type="video/mp4">
    </video>

    [:octicons-mark-github-16: View Workflows & Config](https://github.com/PixelleLab/Pixelle-Video/tree/main/docs/gallery/healthy-diet)

</div>
---
!!! tip "How to Use"
    Click on a case card to jump to GitHub, download the workflow files and configuration, and reproduce the video effect with one click.


@@ -0,0 +1,60 @@
# Configuration
After installation, you need to configure services to use Pixelle-Video.
---
## LLM Configuration
LLM (Large Language Model) is used to generate video scripts.
### Quick Preset Selection
1. Select a preset model from the dropdown:
- Qianwen (recommended, great value)
- GPT-4o
- DeepSeek
- Ollama (local, completely free)
2. The system will auto-fill `base_url` and `model`
3. Click "🔑 Get API Key" to register and obtain credentials
4. Enter your API Key
---
## Image Configuration
Two options available:
### Local Deployment (Recommended)
Using local ComfyUI service:
1. Install and start ComfyUI
2. Enter ComfyUI URL (default `http://127.0.0.1:8188`)
3. Click "Test Connection" to verify
### Cloud Deployment
Using RunningHub cloud service:
1. Register for a RunningHub account
2. Obtain API Key
3. Enter API Key in configuration
---
## Save Configuration
After filling in all required configuration, click the "Save Configuration" button.
Configuration will be saved to `config.yaml` file.
---
## Next Steps
- [Quick Start](quick-start.md) - Create your first video


@@ -0,0 +1,115 @@
# Installation
This page will guide you through installing Pixelle-Video.
---
## System Requirements
### Required
- **Python**: 3.10 or higher
- **Operating System**: Windows, macOS, or Linux
- **Package Manager**: uv (recommended) or pip
### Optional
- **GPU**: NVIDIA GPU with 6GB+ VRAM recommended for local ComfyUI
- **Network**: Stable internet connection for LLM API and image generation services
---
## Installation Steps
### Step 1: Clone the Repository
```bash
git clone https://github.com/PixelleLab/Pixelle-Video.git
cd Pixelle-Video
```
### Step 2: Install Dependencies
!!! tip "Recommended: Use uv"
    This project uses `uv` as the package manager, which is faster and more reliable than traditional pip.
#### Using uv (Recommended)
```bash
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install project dependencies (uv will create a virtual environment automatically)
uv sync
```
#### Using pip
```bash
# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -e .
```
---
## Verify Installation
Run the following command to verify the installation:
```bash
# Using uv
uv run streamlit run web/app.py
# Or using pip (activate virtual environment first)
streamlit run web/app.py
```
Your browser should automatically open `http://localhost:8501` and display the Pixelle-Video web interface.
!!! success "Installation Successful!"
    If you can see the web interface, the installation was successful! Next, check out the [Configuration Guide](configuration.md) to set up your services.
---
## Optional: Install ComfyUI (Local Deployment)
If you want to run image generation locally, you'll need to install ComfyUI:
### Quick Install
```bash
# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# Install dependencies
pip install -r requirements.txt
```
### Start ComfyUI
```bash
python main.py
```
ComfyUI runs on `http://127.0.0.1:8188` by default.
!!! info "ComfyUI Models"
    ComfyUI requires downloaded model files to work. Please refer to the [ComfyUI documentation](https://github.com/comfyanonymous/ComfyUI) for information on downloading and configuring models.
---
## Next Steps
- [Configuration](configuration.md) - Configure LLM and image generation services
- [Quick Start](quick-start.md) - Create your first video


@@ -0,0 +1,107 @@
# Quick Start
Already installed and configured? Let's create your first video!
---
## Start the Web Interface
```bash
# Using uv
uv run streamlit run web/app.py
```
Your browser will automatically open `http://localhost:8501`
---
## Create Your First Video
### Step 1: Check Configuration
On first use, expand the "⚙ System Configuration" panel and confirm:
- **LLM Configuration**: Select an AI model (e.g., Qianwen, GPT) and enter API Key
- **Image Configuration**: Configure ComfyUI address or RunningHub API Key
If not yet configured, see the [Configuration Guide](configuration.md).
Click "Save Configuration" when done.
---
### Step 2: Enter a Topic
In the left panel's "📝 Content Input" section:
1. Select "**AI Generate Content**" mode
2. Enter a topic in the text box, for example:
```
Why develop a reading habit
```
3. (Optional) Set the number of scenes (default: 5)
!!! tip "Topic Examples"
    - Why develop a reading habit
    - How to improve work efficiency
    - The importance of healthy eating
    - The meaning of travel
---
### Step 3: Configure Voice and Visuals
In the middle panel:
**Voice Settings**
- Select TTS workflow (default Edge-TTS works well)
- For voice cloning, upload a reference audio file
**Visual Settings**
- Select image generation workflow (default works well)
- Set image dimensions (default 1024x1024)
- Choose video template (recommend portrait 1080x1920)
---
### Step 4: Generate Video
Click the "🎬 Generate Video" button in the right panel!
The system will show real-time progress:
- Generate script
- Generate images (for each scene)
- Synthesize voice
- Compose video
!!! info "Generation Time"
    Generating a 5-scene video takes about 2-5 minutes, depending on LLM API response speed, image generation speed, TTS workflow type, and network conditions.
---
### Step 5: Preview Video
Once complete, the video will automatically play in the right panel!
You'll see:
- 📹 Video preview player
- ⏱️ Video duration
- 📦 File size
- 🎬 Number of scenes
- 📐 Video dimensions
The video file is saved in the `output/` folder.
---
## Next Steps
Congratulations! You've successfully created your first video 🎉
Next, you can:
- **Adjust Styles** - See the [Custom Visual Style](../tutorials/custom-style.md) tutorial
- **Clone Voices** - See the [Voice Cloning with Reference Audio](../tutorials/voice-cloning.md) tutorial
- **Use API** - See the [API Usage Guide](../user-guide/api.md)
- **Develop Templates** - See the [Template Development Guide](../user-guide/templates.md)

docs/en/index.md (new file, 97 lines)

@@ -0,0 +1,97 @@
# Pixelle-Video 🎬
<div align="center" markdown="1">
**AI Video Creator - Generate a short video in 3 minutes**
[![Stars](https://img.shields.io/github/stars/PixelleLab/Pixelle-Video.svg?style=flat-square)](https://github.com/PixelleLab/Pixelle-Video/stargazers)
[![Issues](https://img.shields.io/github/issues/PixelleLab/Pixelle-Video.svg?style=flat-square)](https://github.com/PixelleLab/Pixelle-Video/issues)
[![License](https://img.shields.io/github/license/PixelleLab/Pixelle-Video.svg?style=flat-square)](https://github.com/PixelleLab/Pixelle-Video/blob/main/LICENSE)
</div>
---
## 🎯 Overview
Simply input a **topic**, and Pixelle-Video will automatically:
- ✍️ Write video scripts
- 🎨 Generate AI images
- 🗣️ Synthesize voice narration
- 🎵 Add background music
- 🎬 Create the final video
**No barriers, no video editing experience required** - turn video creation into a one-line task!
---
## ✨ Features
- **Fully Automated** - Input a topic, get a complete video in 3 minutes
- **AI-Powered Scripts** - Intelligently create narration based on your topic
- **AI-Generated Images** - Each sentence comes with beautiful AI illustrations
- **AI Voice Synthesis** - Support for Edge-TTS, Index-TTS, and more mainstream TTS solutions
- **Background Music** - Add BGM for enhanced atmosphere
- **Visual Styles** - Multiple templates to create unique video styles
- **Flexible Dimensions** - Support for portrait, landscape, and more video sizes
- **Multiple AI Models** - Support for GPT, Qianwen, DeepSeek, Ollama, etc.
- **Flexible Composition** - Based on the ComfyUI architecture, use preset workflows or customize any capability
---
## 🎬 Video Examples
!!! info "Sample Videos"
    Coming soon: video examples will be added here.
---
## 🚀 Quick Start
Ready to get started? Just three steps:
1. **[Install Pixelle-Video](getting-started/installation.md)** - Download and install the project
2. **[Configure Services](getting-started/configuration.md)** - Set up LLM and image generation services
3. **[Create Your First Video](getting-started/quick-start.md)** - Start creating your first video
---
## 💰 Pricing
!!! success "Completely free to run!"
    - **Completely Free**: Use Ollama (local) + local ComfyUI = $0
    - **Recommended**: Use Qianwen LLM (≈$0.01-0.05 per 3-scene video) + local ComfyUI
    - **Cloud Solution**: Use OpenAI + RunningHub (higher cost but no local setup required)

    **Recommendation**: If you have a local GPU, go with the completely free solution. Otherwise, we recommend Qianwen for the best value.
---
## 🤝 Acknowledgments
Pixelle-Video was inspired by the following excellent open source projects:
- [Pixelle-MCP](https://github.com/AIDC-AI/Pixelle-MCP) - ComfyUI MCP server
- [MoneyPrinterTurbo](https://github.com/harry0703/MoneyPrinterTurbo) - Excellent video generation tool
- [NarratoAI](https://github.com/linyqh/NarratoAI) - Video narration automation tool
- [MoneyPrinterPlus](https://github.com/ddean2009/MoneyPrinterPlus) - Video creation platform
- [ComfyKit](https://github.com/puke3615/ComfyKit) - ComfyUI workflow wrapper library
Thanks to these projects for their open source spirit! 🙏
---
## 📢 Feedback & Support
- 🐛 **Found a bug**: Submit an [Issue](https://github.com/PixelleLab/Pixelle-Video/issues)
- 💡 **Feature request**: Submit a [Feature Request](https://github.com/PixelleLab/Pixelle-Video/issues)
- **Give us a Star**: If this project helps you, please give us a star!
---
## 📝 License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/PixelleLab/Pixelle-Video/blob/main/LICENSE) file for details.


@@ -0,0 +1,52 @@
# API Overview
Pixelle-Video Python API reference documentation.
---
## Core Classes
### PixelleVideoCore
Main service class providing video generation functionality.
```python
from pixelle_video.service import PixelleVideoCore

pixelle = PixelleVideoCore()
await pixelle.initialize()  # must run inside an async function
```
---
## Main Methods
### generate_video()
Primary method for generating videos.
**Parameters**:
- `text` (str): Topic or complete script
- `mode` (str): Generation mode ("generate" or "fixed")
- `n_scenes` (int): Number of scenes
- `title` (str, optional): Video title
- `tts_workflow` (str): TTS workflow
- `image_workflow` (str): Image generation workflow
- `frame_template` (str): Video template
- `bgm_path` (str, optional): BGM file path
**Returns**: `VideoResult` object
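Putting the documented parameters together, a full call might look like the sketch below. The workflow file names are taken from the config examples elsewhere in these docs; the template name is an assumption. `PixelleVideoCore` and `VideoResult` here are minimal stand-ins so the snippet runs standalone; in the real project you would import them from `pixelle_video.service` instead.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class VideoResult:           # stand-in for the documented return type
    video_path: str
    n_scenes: int

class PixelleVideoCore:      # stand-in mirroring the documented signature
    async def initialize(self):
        pass

    async def generate_video(self, text, mode="generate", n_scenes=5,
                             title=None, tts_workflow=None,
                             image_workflow=None, frame_template=None,
                             bgm_path=None) -> VideoResult:
        return VideoResult(f"output/{mode}_{n_scenes}.mp4", n_scenes)

async def main() -> VideoResult:
    pixelle = PixelleVideoCore()
    await pixelle.initialize()
    return await pixelle.generate_video(
        text="Why develop a reading habit",
        mode="generate",
        n_scenes=5,
        title="Reading Habits",
        tts_workflow="selfhost/tts_edge.json",        # from the config docs
        image_workflow="runninghub/image_flux.json",  # from the config docs
        frame_template="1080x1920",                   # assumed template name
        bgm_path=None,
    )

result = asyncio.run(main())
print(result.video_path)  # output/generate_5.mp4
```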
---
## Examples
Check the `examples/` directory for more examples.
---
## More Information
Detailed API documentation coming soon.


@@ -0,0 +1,60 @@
# Config Schema
Detailed explanation of the `config.yaml` configuration file.
---
## Configuration Structure
```yaml
llm:
  provider: openai
  api_key: "your-api-key"
  base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
  model: "qwen-plus"
comfyui:
  comfyui_url: "http://127.0.0.1:8188"
  runninghub_api_key: ""
image:
  default_workflow: "runninghub/image_flux.json"
  prompt_prefix: "Minimalist illustration style"
tts:
  default_workflow: "selfhost/tts_edge.json"
```
---
## LLM Configuration
- `provider`: Provider (currently only supports openai-compatible interfaces)
- `api_key`: API key
- `base_url`: API service address
- `model`: Model name
---
## ComfyUI Configuration
### Basic Configuration
- `comfyui_url`: Local ComfyUI address
- `runninghub_api_key`: RunningHub API key (optional)
### Image Configuration
- `default_workflow`: Default image generation workflow
- `prompt_prefix`: Prompt prefix
### TTS Configuration
- `default_workflow`: Default TTS workflow
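Once loaded (e.g. with `yaml.safe_load`), the file above is a plain nested dict. A sketch of a pre-flight check over that structure (the `validate` helper is hypothetical, not the project's actual validation):

```python
# Inline dict mirroring config.yaml (as yaml.safe_load would produce).
config = {
    "llm": {
        "provider": "openai",
        "api_key": "your-api-key",
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "qwen-plus",
    },
    "comfyui": {
        "comfyui_url": "http://127.0.0.1:8188",
        "runninghub_api_key": "",
    },
    "image": {
        "default_workflow": "runninghub/image_flux.json",
        "prompt_prefix": "Minimalist illustration style",
    },
    "tts": {"default_workflow": "selfhost/tts_edge.json"},
}

def validate(cfg: dict) -> list[str]:
    """Return a list of missing required settings (hypothetical check)."""
    problems = []
    if not cfg.get("llm", {}).get("api_key"):
        problems.append("llm.api_key")
    # Either a local ComfyUI URL or a RunningHub key must be present.
    comfy = cfg.get("comfyui", {})
    if not comfy.get("comfyui_url") and not comfy.get("runninghub_api_key"):
        problems.append("comfyui: need comfyui_url or runninghub_api_key")
    return problems

print(validate(config))  # [] when the required fields are filled in
```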
---
## More Information
The configuration file is automatically created on first run.

docs/en/troubleshooting.md (new file, 108 lines)

@@ -0,0 +1,108 @@
# Troubleshooting
Having issues? Here are solutions to common problems.
---
## Installation Issues
### Dependency installation failed
```bash
# Clean cache
uv cache clean
# Reinstall
uv sync
```
---
## Configuration Issues
### ComfyUI connection failed
**Possible Causes**:
- ComfyUI not running
- Incorrect URL configuration
- Firewall blocking
**Solutions**:
1. Confirm ComfyUI is running
2. Check URL configuration (default `http://127.0.0.1:8188`)
3. Test by accessing ComfyUI address in browser
4. Check firewall settings
### LLM API call failed
**Possible Causes**:
- Incorrect API Key
- Network issues
- Insufficient balance
**Solutions**:
1. Verify API Key is correct
2. Check network connection
3. Review error message details
4. Check account balance
---
## Generation Issues
### Video generation failed
**Possible Causes**:
- Corrupted workflow file
- Models not downloaded
- Insufficient resources
**Solutions**:
1. Check if workflow file exists
2. Confirm ComfyUI has downloaded required models
3. Check disk space and memory
### Image generation failed
**Solutions**:
1. Check if ComfyUI is running properly
2. Try manually testing workflow in ComfyUI
3. Check workflow configuration
### TTS generation failed
**Solutions**:
1. Check if TTS workflow is correct
2. If using voice cloning, check reference audio format
3. Review error logs
---
## Performance Issues
### Slow generation speed
**Optimization Tips**:
1. Use local ComfyUI (faster than cloud)
2. Reduce number of scenes
3. Use faster LLM (e.g., Qianwen)
4. Check network connection
---
## Other Issues
Still having problems?
1. Check project [GitHub Issues](https://github.com/PixelleLab/Pixelle-Video/issues)
2. Submit a new Issue describing your problem
3. Include error logs and configuration details for quick diagnosis
---
## View Logs
Log files are located in the project root:
- `api_server.log` - API service logs
- `test_output.log` - Test logs


@@ -0,0 +1,36 @@
# Custom Visual Style
Learn how to adjust image generation parameters to create unique visual styles.
---
## Adjust Prompt Prefix
The prompt prefix controls overall visual style:
```
Minimalist black-and-white illustration, clean lines, simple style
```
---
## Adjust Image Dimensions
Different dimensions for different scenarios:
- **1024x1024**: Square, suitable for Xiaohongshu
- **1080x1920**: Portrait, suitable for TikTok, Kuaishou
- **1920x1080**: Landscape, suitable for Bilibili, YouTube
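The mapping above as a small helper (platform keys are illustrative):

```python
# Platform -> (width, height), following the guidance above.
DIMENSIONS = {
    "xiaohongshu": (1024, 1024),  # square
    "tiktok": (1080, 1920),       # portrait
    "kuaishou": (1080, 1920),     # portrait
    "bilibili": (1920, 1080),     # landscape
    "youtube": (1920, 1080),      # landscape
}

def dimensions_for(platform: str) -> tuple[int, int]:
    """Look up dimensions by platform, defaulting to square."""
    return DIMENSIONS.get(platform.lower(), (1024, 1024))

print(dimensions_for("TikTok"))  # (1080, 1920)
```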
---
## Preview Effects
Use the "Preview Style" feature to test different configurations.
---
## More Information
More style customization tips coming soon.


@@ -0,0 +1,35 @@
# Voice Cloning
Use reference audio to implement voice cloning functionality.
---
## Prepare Reference Audio
1. Prepare a clear audio file (MP3/WAV/FLAC)
2. Recommended duration: 10-30 seconds
3. Avoid background noise
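For WAV reference audio, the 10-30 second guideline can be checked with the standard-library `wave` module before uploading (a hypothetical pre-flight check, not a project feature; MP3/FLAC need other tooling):

```python
import io
import wave

def wav_duration_seconds(data: bytes) -> float:
    """Duration of a WAV payload (WAV only)."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return w.getnframes() / w.getframerate()

def is_good_reference(data: bytes) -> bool:
    """Apply the 10-30 second guideline from above."""
    return 10.0 <= wav_duration_seconds(data) <= 30.0

# Demo: build a 15-second silent mono 16 kHz WAV in memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 16-bit samples
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000 * 15)

print(is_good_reference(buf.getvalue()))  # True
```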
---
## Usage Steps
1. Select a TTS workflow that supports voice cloning (e.g., Index-TTS) in voice settings
2. Upload reference audio file
3. Test effects with "Preview Voice"
4. Generate video
---
## Notes
- Not all TTS workflows support voice cloning
- Reference audio quality affects cloning results
- Edge-TTS does not support voice cloning
---
## More Information
Detailed voice cloning tutorial coming soon.


@@ -0,0 +1,33 @@
# Your First Video
Step-by-step guide to creating your first video with Pixelle-Video.
---
## Prerequisites
Make sure you've completed:
- ✅ [Installation](../getting-started/installation.md)
- ✅ [Configuration](../getting-started/configuration.md)
---
## Tutorial Steps
For detailed steps, see [Quick Start](../getting-started/quick-start.md).
---
## Tips
- Choose an appropriate topic for better results
- Start with 3-5 scenes for first generation
- Preview voice and image effects before generating
---
## Troubleshooting
Having issues? Check out [FAQ](../faq.md) or [Troubleshooting](../troubleshooting.md).

docs/en/user-guide/api.md (new file, 42 lines)

@@ -0,0 +1,42 @@
# API Usage
Pixelle-Video provides a complete Python API for easy integration into your projects.
---
## Quick Start
```python
from pixelle_video.service import PixelleVideoCore
import asyncio


async def main():
    # Initialize
    pixelle = PixelleVideoCore()
    await pixelle.initialize()

    # Generate video
    result = await pixelle.generate_video(
        text="Why develop a reading habit",
        mode="generate",
        n_scenes=5
    )
    print(f"Video generated: {result.video_path}")


# Run
asyncio.run(main())
```
---
## API Reference
For detailed API documentation, see [API Overview](../reference/api-overview.md).
---
## Examples
For more usage examples, check the `examples/` directory in the project.


@@ -0,0 +1,48 @@
# Template Development
How to create custom video templates.
---
## Template Introduction
Video templates use HTML to define the layout and style of video frames.
---
## Template Structure
Templates are located in the `templates/` directory, grouped by size:
```
templates/
├── 1080x1920/ # Portrait
├── 1920x1080/ # Landscape
└── 1080x1080/ # Square
```
---
## Creating Templates
1. Copy an existing template file
2. Modify HTML and CSS
3. Save to the corresponding size directory
4. Select and use in Web interface
---
## Template Variables
Templates support the following variables:
- `{{ title }}` - Video title
- `{{ text }}` - Scene text
- `{{ image }}` - Scene image
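A template engine substitutes these placeholders per scene. A minimal stand-in renderer (the real project may use a full template engine; this sketch only handles simple `{{ name }}` substitution):

```python
import re

def render(template: str, context: dict) -> str:
    """Replace {{ name }} placeholders with context values,
    leaving unknown placeholders untouched."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(context.get(m.group(1), m.group(0))),
        template,
    )

html = '<h1>{{ title }}</h1><p>{{ text }}</p><img src="{{ image }}">'
print(render(html, {"title": "Reading", "text": "Scene 1", "image": "s1.png"}))
# <h1>Reading</h1><p>Scene 1</p><img src="s1.png">
```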
---
## More Information
Detailed template development guide coming soon.


@@ -0,0 +1,77 @@
# Web UI Guide
Detailed introduction to the Pixelle-Video Web interface features.
---
## Interface Layout
The Web interface uses a three-column layout:
- **Left Panel**: Content input and audio settings
- **Middle Panel**: Voice and visual settings
- **Right Panel**: Video generation and preview
---
## System Configuration
First-time use requires configuring LLM and image generation services. See [Configuration Guide](../getting-started/configuration.md).
---
## Content Input
### Generation Mode
- **AI Generate Content**: Enter a topic, AI creates script automatically
- **Fixed Script Content**: Enter complete script directly
### Background Music
- Built-in music supported
- Custom music files supported
---
## Voice Settings
### TTS Workflow
- Select TTS workflow
- Supports Edge-TTS, Index-TTS, etc.
### Reference Audio
- Upload reference audio for voice cloning
- Supports MP3/WAV/FLAC formats
---
## Visual Settings
### Image Generation
- Select image generation workflow
- Set image dimensions
- Adjust prompt prefix to control style
### Video Template
- Choose video template
- Supports portrait/landscape/square
- Preview template effects
---
## Generate Video
After clicking "Generate Video", the system will:
1. Generate video script
2. Generate images for each scene
3. Synthesize voice narration
4. Compose final video
The video previews automatically when complete.


@@ -0,0 +1,37 @@
# Workflow Customization
How to customize ComfyUI workflows to achieve specific functionality.
---
## Workflow Introduction
Pixelle-Video is built on the ComfyUI architecture and supports custom workflows.
---
## Workflow Types
### TTS Workflows
Located in `workflows/selfhost/` or `workflows/runninghub/`
### Image Generation Workflows
Located in `workflows/selfhost/` or `workflows/runninghub/`
---
## Custom Workflows
1. Design your workflow in ComfyUI
2. Export as JSON file
3. Place in `workflows/` directory
4. Select and use in Web interface
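ComfyUI's API-format export is a JSON object keyed by node id, each node carrying a `class_type` and its `inputs`. A quick sketch for inspecting which node types an exported workflow uses (the tiny embedded workflow is illustrative):

```python
import json

# A tiny ComfyUI API-format export (node id -> node description).
workflow_json = """
{
  "1": {"class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "model.safetensors"}},
  "2": {"class_type": "CLIPTextEncode",
        "inputs": {"text": "a minimalist illustration", "clip": ["1", 1]}},
  "3": {"class_type": "KSampler",
        "inputs": {"model": ["1", 0], "positive": ["2", 0]}}
}
"""

workflow = json.loads(workflow_json)
# Collect the distinct node types referenced by the workflow.
node_types = sorted({node["class_type"] for node in workflow.values()})
print(node_types)  # ['CLIPTextEncode', 'CheckpointLoaderSimple', 'KSampler']
```

A check like this can confirm the required custom nodes and models are installed before placing a file in `workflows/`.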
---
## More Information
Detailed workflow customization guide coming soon.