
🎬 Pixelle-Video: AI Fully Automated Short Video Engine

English | 中文


https://github.com/user-attachments/assets/a42e7457-fcc8-40da-83fc-784c45a8b95d

Just input a topic, and Pixelle-Video will automatically:

  • ✍️ Write video script
  • 🎨 Generate AI images/videos
  • 🗣️ Synthesize voice narration
  • 🎵 Add background music
  • 🎬 Create video with one click

No barrier to entry and no editing experience needed - make video creation as simple as typing a sentence!

🖥️ Web Interface Preview

Web UI Interface

📋 Recent Updates

  • 2026-01-06: Added RunningHub 48G VRAM machine support
  • 2025-12-28: Configurable RunningHub concurrency limit, improved LLM structured data response handling
  • 2025-12-17: Added ComfyUI API Key configuration, Nano Banana model support, API template custom parameters
  • 2025-12-10: Built-in FAQ in sidebar, fixed edge-tts version to resolve TTS service instability
  • 2025-12-08: Support multiple script split modes (paragraph/line/sentence), improved template selection with direct preview
  • 2025-12-06: Fixed video generation API URL path handling with cross-platform compatibility
  • 2025-12-05: Added Windows all-in-one package download, optimized image and video analysis workflows
  • 2025-12-04: New "Custom Media" feature - upload your photos/videos with AI-powered analysis and script generation
  • 2025-11-18: Parallel processing for RunningHub, added history page, batch video task creation support

Key Features

  • Fully Automatic Generation - Enter a topic and get a complete video automatically
  • AI Smart Copywriting - Creates narration from the topic, so you never have to write a script yourself
  • AI Generated Images - Every sentence is paired with a polished AI illustration
  • AI Generated Videos - Supports AI video generation models (such as WAN 2.1) to create dynamic video content
  • AI Generated Voice - Supports Edge-TTS, Index-TTS, and many other mainstream TTS solutions
  • Background Music - Add BGM to give videos more atmosphere
  • Visual Styles - Multiple templates to choose from for a unique video look
  • Flexible Dimensions - Supports portrait, landscape, and other video dimensions
  • Multiple AI Models - Supports GPT, Qwen, DeepSeek, Ollama, and more
  • Flexible Atomic Capability Combination - Built on the ComfyUI architecture; use the preset workflows or customize any capability (e.g. swap the image generation model for FLUX, or replace the TTS engine with ChatTTS)

📊 Video Generation Pipeline

Pixelle-Video adopts a modular design, so the entire video generation process stays clear and concise:

Video Generation Flow

From input text to final video output, the process is straightforward: Script Generation → Image Planning → Frame-by-Frame Processing → Video Composition

Each step supports flexible customization, allowing you to choose different AI models, audio engines, visual styles, etc., to meet personalized creation needs.
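As a rough sketch, the four stages above compose into a single flow. All function names and return values below are illustrative placeholders, not Pixelle-Video's actual API:

```python
# Illustrative sketch of the four-stage pipeline described above.
# These functions are placeholders, not Pixelle-Video's real API.

def generate_script(topic: str) -> list[str]:
    # Stage 1: an LLM turns the topic into narration sentences.
    return [f"{topic}: opening line", f"{topic}: closing line"]

def plan_images(sentences: list[str]) -> list[str]:
    # Stage 2: derive one image prompt per sentence.
    return [f"illustration for: {s}" for s in sentences]

def render_frame(sentence: str, prompt: str) -> dict:
    # Stage 3: per frame, generate the image and synthesize the voice-over.
    return {"text": sentence, "image": prompt, "audio": f"tts({sentence})"}

def compose_video(frames: list[dict]) -> str:
    # Stage 4: stitch frames, narration, and BGM into the final video.
    return f"video with {len(frames)} frames"

def make_video(topic: str) -> str:
    sentences = generate_script(topic)
    prompts = plan_images(sentences)
    frames = [render_frame(s, p) for s, p in zip(sentences, prompts)]
    return compose_video(frames)
```

Because each stage is a separate step, any one of them can be swapped out (a different LLM, image workflow, or TTS engine) without touching the others.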

🎬 Video Examples

Here are real examples generated with Pixelle-Video, showcasing videos with different themes and styles:

📱 Portrait Video Showcase

🌄 Documentary & Lifestyle Default Template

The Scenery Along the Journey

🔍 Cultural Deconstruction Default Template

Santa ID

🔭 Scientific Inquiry Default Template

Why Haven't We Found Alien Civilizations Yet?

🌱 Personal Growth Cloned Voice

How to Level Up Yourself

🧠 Deep Thinking Default Template

Understanding Antifragility

🏯 History & Culture Static Frame

Zizhi Tongjian (Comprehensive Mirror for Aid in Governance)

☀️ Emotional Storytelling Cloned Voice

Winter Sunlight

📜 Novel Adaptation Custom Script

Doupo Cangqiong (Battle Through the Heavens)

🧬 Knowledge Explainer Qwen Image Generation

Essential Wellness Tips

🖥️ Landscape Video Showcase

💰 Side Hustle Money Making - Movie Template

Side Hustle Money Making

🏛️ Historical Commentary - Custom Template

Insights from Zizhi Tongjian

💡 Tip: All of these videos were generated fully automatically by AI from a single topic keyword, with no video editing experience required!

🚀 Quick Start

No need to install Python, uv, or ffmpeg - ready to use out of the box!

👉 Download Windows All-in-One Package

  1. Download the latest Windows All-in-One Package and extract it
  2. Double-click start.bat to launch the Web interface
  3. Browser will automatically open http://localhost:8501
  4. Configure LLM API and image generation service in "⚙️ System Configuration"
  5. Start generating videos!

💡 Tip: The package includes all dependencies, no need to manually install any environment. On first use, you only need to configure API keys.

Install from Source (For macOS / Linux Users or Users Who Need Customization)

Prerequisites

Before starting, install the Python package manager uv and the video-processing tool ffmpeg:

Install uv

See the official uv documentation for the installation method for your system:
👉 uv Installation Guide

After installation, run uv --version in the terminal to verify successful installation.

Install ffmpeg

macOS

brew install ffmpeg

Ubuntu / Debian

sudo apt update
sudo apt install ffmpeg

Windows

Install ffmpeg with a package manager such as winget (winget install Gyan.FFmpeg) or Chocolatey (choco install ffmpeg), or download a static build from the official ffmpeg site and add its bin directory to your PATH.

After installation, run ffmpeg -version in the terminal to verify successful installation.

Step 1: Clone Project

git clone https://github.com/AIDC-AI/Pixelle-Video.git
cd Pixelle-Video

Step 2: Launch Web Interface

# Run with uv (recommended, will automatically install dependencies)
uv run streamlit run web/app.py

Browser will automatically open http://localhost:8501

Step 3: Configure in Web Interface

On first use, expand the "⚙️ System Configuration" panel and fill in:

  • LLM Configuration: Select AI model (such as Qwen, GPT, etc.) and enter API Key
  • Image Configuration: If you need to generate images, configure ComfyUI address or RunningHub API Key

After configuration, click "Save Configuration", and you can start generating videos!

💻 Usage

After opening the Web interface, you will see a three-column layout. Here's a detailed explanation of each part:

⚙️ System Configuration (Required on First Use)

Configuration is required on first use. Click to expand the "⚙️ System Configuration" panel:

1. LLM Configuration (Large Language Model)

Used for generating video scripts.

Quick Select Preset

  • Select preset model from dropdown menu (Qwen, GPT-4o, DeepSeek, etc.)
  • After selection, base_url and model will be automatically filled
  • Click "🔑 Get API Key" link to register and obtain key

Manual Configuration

  • API Key: Enter your key
  • Base URL: API address
  • Model: Model name
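For example, a manual setup pointed at an OpenAI-compatible endpoint might look like this (the values are illustrative placeholders; use your own provider's key, endpoint, and model name):

```
API Key : sk-...                        (placeholder; your provider's key)
Base URL: https://api.openai.com/v1
Model   : gpt-4o
```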

2. Image Configuration

Used for generating video images.

Local Deployment (Recommended)

  • ComfyUI URL: Local ComfyUI service address (default http://127.0.0.1:8188)
  • Click "Test Connection" to confirm service is available

Cloud Deployment

  • RunningHub API Key: Cloud image generation service key

After configuration, click "Save Configuration".

📝 Content Input (Left Column)

Generation Mode

  • AI Generated Content: Enter a topic and let AI create the script automatically
    • Suitable for: quickly generating a video and letting AI write the script
    • Example: "Why develop a reading habit"
  • Fixed Script Content: Paste a complete script directly and skip AI creation
    • Suitable for: generating a video from a ready-made script

Background Music (BGM)

  • No BGM: Pure voice narration
  • Built-in Music: Select preset background music (such as default.mp3)
  • Custom Music: Put your music files (MP3/WAV, etc.) in the bgm/ folder
  • Click "Preview BGM" to preview music

🎤 Voice Settings (Middle Column)

TTS Workflow

  • Select TTS workflow from dropdown menu (supports Edge-TTS, Index-TTS, etc.)
  • System will automatically scan TTS workflows in the workflows/ folder
  • If you know ComfyUI, you can customize TTS workflows

Reference Audio (Optional)

  • Upload reference audio file for voice cloning (supports MP3/WAV/FLAC and other formats)
  • Suitable for TTS workflows that support voice cloning (such as Index-TTS)
  • Can listen directly after upload

Preview Function

  • Enter test text, click "Preview Voice" to listen to the effect
  • Supports using reference audio for preview

🎨 Visual Settings (Middle Column)

Image Generation

Determine what style of images AI generates.

ComfyUI Workflow

  • Select image generation workflow from dropdown menu
  • Supports local deployment (selfhost) and cloud (RunningHub) workflows
  • Default uses image_flux.json
  • If you know ComfyUI, you can put your own workflows in the workflows/ folder

Image Dimensions

  • Set width and height of generated images (unit: pixels)
  • Default 1024x1024, can be adjusted as needed
  • Note: Different models have different dimension limitations

Prompt Prefix

  • Controls the overall image style (must be written in English)
  • Example: Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style
  • Click "Preview Style" to test effect

Video Template

Determines video layout and design.

Template Naming Convention

  • static_*.html: Static templates (no AI-generated media, text-only styles)
  • image_*.html: Image templates (uses AI-generated images as background)
  • video_*.html: Video templates (uses AI-generated videos as background)
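As a sketch, grouping a templates/ folder by this naming convention could look like the following (the file names are made-up examples, not actual project files):

```python
# Group template file names by the static_*/image_*/video_* convention above.
# The names listed here are made-up examples for illustration.
templates = [
    "static_quote.html",
    "image_portrait_default.html",
    "video_landscape_movie.html",
    "image_square_minimal.html",
]

groups: dict[str, list[str]] = {}
for name in templates:
    kind = name.split("_", 1)[0]  # "static", "image", or "video"
    groups.setdefault(kind, []).append(name)
```

After this loop, groups["image"] holds both image_* templates, and the prefix alone tells you whether a template needs AI-generated media.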

Usage

  • Select a template from the dropdown menu; templates are grouped by dimension (portrait/landscape/square)
  • Click "Preview Template" to test effect with custom parameters
  • If you know HTML, you can create your own templates in the templates/ folder
  • 🔗 View All Template Previews

🎬 Generate Video (Right Column)

Generate Button

  • After configuring all parameters, click "🎬 Generate Video"
  • Shows real-time progress (generating script → generating images → synthesizing voice → composing video)
  • Automatically shows video preview after completion

Progress Display

  • Shows current step in real-time
  • Example: "Frame 3/5 - Generating Image"

Video Preview

  • Automatically plays after generation
  • Shows video duration, file size, number of frames, etc.
  • Video files are saved in the output/ folder

FAQ

Q: How long does generating a video take?
A: Generation time depends on the number of video frames, network conditions, and AI inference speed; a video typically completes within a few minutes.

Q: What if I'm not satisfied with the video?
A: You can try:

  1. Change LLM model (different models have different script styles)
  2. Adjust image dimensions and prompt prefix (change image style)
  3. Change TTS workflow or upload reference audio (change voice effect)
  4. Try different video templates and dimensions

Q: What about the cost?
A: This project fully supports free operation!

  • Completely Free Solution: LLM using Ollama (local) + ComfyUI local deployment = 0 cost
  • Recommended Solution: LLM using Qwen (extremely low cost, highly cost-effective) + ComfyUI local deployment
  • Cloud Solution: LLM using OpenAI + Image using RunningHub (higher cost but no need for local environment)

Selection Suggestion: If you have a local GPU, the completely free solution is recommended; otherwise, use Qwen (cost-effective).

🤝 Referenced Projects

Pixelle-Video's design is inspired by the following excellent open-source projects:

Thanks to these projects for their open-source spirit! 🙏

💬 Community

Scan the QR codes below to join our communities for the latest updates and technical support:

Discord Community WeChat Group

📢 Feedback and Support

  • 🐛 Encountered Issues: Submit Issue
  • 💡 Feature Suggestions: Submit Feature Request
  • Give a Star: If this project helps you, feel free to give a Star for support!

📝 License

This project is released under the Apache License 2.0. For details, please see the LICENSE file.

Star History

Star History Chart
