From aae4004254f61f214e313cb0fda60bfe955de9b2 Mon Sep 17 00:00:00 2001
From: puke <1129090915@qq.com>
Date: Fri, 7 Nov 2025 17:28:27 +0800
Subject: [PATCH] Add English README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
README.md | 10 +-
README_EN.md | 292 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 298 insertions(+), 4 deletions(-)
create mode 100644 README_EN.md
diff --git a/README.md b/README.md
index c865cc3..8c6d12d 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
Pixelle-Video 🎬
+English | 中文
+
@@ -7,7 +9,7 @@
-🚀 AI Video Creation Tool - Generate a short video in 3 minutes
+🚀 AI Fully Automated Short Video Engine
Just input a **topic**, and Pixelle-Video will automatically:
- ✍️ Write the video script
@@ -23,7 +25,7 @@
## ✨ Key Features
-- ✅ **Fully Automatic Generation** - Input a topic and a complete video is generated automatically in 3 minutes
+- ✅ **Fully Automatic Generation** - Input a topic and a complete video is generated automatically
- ✅ **AI Smart Copywriting** - Creates narration from your topic, so you don't have to write a script yourself
- ✅ **AI-Generated Images** - Every sentence gets a beautiful AI illustration
- ✅ **AI-Generated Voice** - Supports Edge-TTS, Index-TTS, and other mainstream TTS solutions
@@ -236,7 +238,7 @@ uv run streamlit run web/app.py
### ❓ FAQ
**Q: How long does the first run take?**
-A: Generating a 3-scene video takes about 2-5 minutes, depending on your network and AI inference speed.
+A: Generation time depends on the number of scenes, network conditions, and AI inference speed; it usually finishes within a few minutes.
**Q: What if I'm not satisfied with the video?**
A: You can try:
@@ -249,7 +251,7 @@ A: You can try:
A: **This project can run completely free!**
- **Completely Free**: LLM via Ollama (runs locally) + local ComfyUI deployment = zero cost
-- **Recommended**: LLM via Qwen (about 0.01-0.05 CNY per 3-scene video) + local ComfyUI deployment
+- **Recommended**: LLM via Qwen (extremely low cost, great value) + local ComfyUI deployment
- **Cloud**: LLM via OpenAI + images via RunningHub (higher cost, but no local environment needed)
**Suggestion**: If you have a local GPU, use the completely free option; otherwise Qwen is recommended (great value)
diff --git a/README_EN.md b/README_EN.md
new file mode 100644
index 0000000..63885cd
--- /dev/null
+++ b/README_EN.md
@@ -0,0 +1,292 @@
+Pixelle-Video 🎬
+
+English | 中文
+
+🚀 AI Fully Automated Short Video Engine
+
+Just input a **topic**, and Pixelle-Video will automatically:
+- ✍️ Write the video script
+- 🎨 Generate AI images
+- 🗣️ Synthesize the voice narration
+- 🎵 Add background music
+- 🎬 Compose the final video in one click
+
+
+**Zero barriers, zero editing experience required** - video creation becomes as simple as typing a sentence!
+
+---
+
+## ✨ Key Features
+
+- ✅ **Fully Automatic Generation** - Input a topic and a complete video is generated automatically
+- ✅ **AI Smart Copywriting** - Creates narration from your topic, so you don't have to write a script yourself
+- ✅ **AI-Generated Images** - Every sentence gets a beautiful AI illustration
+- ✅ **AI-Generated Voice** - Supports Edge-TTS, Index-TTS, and other mainstream TTS solutions
+- ✅ **Background Music** - Supports adding BGM for a better atmosphere
+- ✅ **Visual Styles** - Multiple templates to choose from for a unique look
+- ✅ **Flexible Dimensions** - Supports portrait, landscape, and other video dimensions
+- ✅ **Multiple AI Models** - Supports GPT, Qwen, DeepSeek, Ollama, and more
+- ✅ **Flexible Atomic Capabilities** - Built on ComfyUI, so you can use the preset workflows or customize any capability (e.g., swap the image model for FLUX or the TTS engine for ChatTTS)
+
+---
+
+## 📊 Video Generation Pipeline
+
+Pixelle-Video adopts a modular design, so the entire video generation process is clear and concise:
+
+
+
+From input text to final video, the process runs through four stages: **Script Generation → Image Planning → Frame-by-Frame Processing → Video Composition**
+
+Each step supports flexible customization, allowing you to choose different AI models, audio engines, visual styles, etc., to meet personalized creation needs.
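
The four stages above can be sketched in Python. The function names, signatures, and return values below are purely illustrative and are not the project's actual API:

```python
# Illustrative sketch of the four pipeline stages; not the project's real code.

def generate_script(topic: str) -> list[str]:
    """Stage 1: turn a topic into per-frame narration lines (placeholder logic)."""
    return [f"{topic}: point {i}" for i in range(1, 4)]

def plan_images(script: list[str]) -> list[str]:
    """Stage 2: derive one image prompt per narration line."""
    return [f"illustration of: {line}" for line in script]

def render_frame(line: str, prompt: str) -> dict:
    """Stage 3: per-frame processing (image + TTS audio for one line)."""
    return {"text": line, "image_prompt": prompt, "audio": f"tts({line})"}

def compose_video(frames: list[dict]) -> str:
    """Stage 4: stitch frames and audio into the final video."""
    return f"video with {len(frames)} frames"

def pipeline(topic: str) -> str:
    script = generate_script(topic)
    prompts = plan_images(script)
    frames = [render_frame(s, p) for s, p in zip(script, prompts)]
    return compose_video(frames)
```

Because each stage is a separate function, any one of them can be swapped out (a different LLM, image model, or TTS engine) without touching the others.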
+
+---
+
+## 🎬 Video Examples
+
+> Coming soon: video examples will be added here
+
+---
+
+## 🚀 Quick Start
+
+### Prerequisites
+
+Before starting, you need to install the Python package manager `uv` and the video processing tool `ffmpeg`:
+
+#### Install uv
+
+See the official uv documentation for installation instructions for your system:
+👉 **[uv Installation Guide](https://docs.astral.sh/uv/getting-started/installation/)**
+
+After installation, run `uv --version` in the terminal to verify the installation.
+
+#### Install ffmpeg
+
+**macOS**
+```bash
+brew install ffmpeg
+```
+
+**Ubuntu / Debian**
+```bash
+sudo apt update
+sudo apt install ffmpeg
+```
+
+**Windows**
+- Download URL: https://ffmpeg.org/download.html
+- After downloading, extract the archive and add its `bin` directory to the system `PATH` environment variable
+
+After installation, run `ffmpeg -version` in the terminal to verify the installation.
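
If you prefer a single cross-platform check, the following Python snippet (an illustration, not part of the project) verifies that both tools are discoverable on your `PATH`:

```python
import shutil

def check_prerequisites() -> dict:
    """Report whether each required CLI tool is discoverable on PATH."""
    return {tool: shutil.which(tool) is not None for tool in ("uv", "ffmpeg")}

print(check_prerequisites())  # e.g. {'uv': True, 'ffmpeg': True} when both are installed
```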
+
+---
+
+### Step 1: Clone Project
+
+```bash
+git clone https://github.com/AIDC-AI/Pixelle-Video.git
+cd Pixelle-Video
+```
+
+### Step 2: Launch Web Interface
+
+```bash
+# Run with uv (recommended, will automatically install dependencies)
+uv run streamlit run web/app.py
+```
+
+Your browser will open http://localhost:8501 automatically
+
+### Step 3: Configure in Web Interface
+
+On first use, expand the "⚙️ System Configuration" panel and fill in:
+- **LLM Configuration**: Select an AI model (such as Qwen or GPT) and enter your API Key
+- **Image Configuration**: To generate images, configure the ComfyUI address or a RunningHub API Key
+
+After configuration, click "Save Configuration", and you can start generating videos!
+
+---
+
+## 💻 Usage
+
+After opening the Web interface, you will see a three-column layout. Here's a detailed explanation of each part:
+
+---
+
+### ⚙️ System Configuration (Required on First Use)
+
+Click to expand the "⚙️ System Configuration" panel and fill in the following:
+
+#### 1. LLM Configuration (Large Language Model)
+Used for generating video scripts.
+
+**Quick Select Preset**
+- Select a preset model from the dropdown menu (Qwen, GPT-4o, DeepSeek, etc.)
+- After selection, base_url and model are filled in automatically
+- Click the "🔑 Get API Key" link to register and obtain a key
+
+**Manual Configuration**
+- API Key: Enter your key
+- Base URL: API address
+- Model: Model name
+
+#### 2. Image Configuration
+Used for generating video images.
+
+**Local Deployment (Recommended)**
+- ComfyUI URL: Local ComfyUI service address (default http://127.0.0.1:8188)
+- Click "Test Connection" to confirm the service is reachable
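
The "Test Connection" button can be approximated from the command line. The sketch below (an assumption about how such a check might look, not the app's actual implementation) asks ComfyUI's `/system_stats` HTTP endpoint whether the server is up:

```python
# Hypothetical connectivity check against a local ComfyUI instance.
from urllib import request, error

def comfyui_available(base_url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if a ComfyUI server answers its /system_stats endpoint."""
    try:
        with request.urlopen(f"{base_url}/system_stats", timeout=timeout) as resp:
            return resp.status == 200
    except (error.URLError, OSError):
        return False
```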
+
+**Cloud Deployment**
+- RunningHub API Key: Cloud image generation service key
+
+After configuration, click "Save Configuration".
+
+---
+
+### 📝 Content Input (Left Column)
+
+#### Generation Mode
+- **AI Generated Content**: Input a topic and AI creates the script automatically
+  - Best for: quickly generating a video and letting AI write the script
+  - Example: "Why develop a reading habit"
+- **Fixed Script Content**: Paste a complete script and skip AI creation
+  - Best for: you already have a ready-made script and just want the video
+
+#### Background Music (BGM)
+- **No BGM**: Pure voice narration
+- **Built-in Music**: Select preset background music (such as default.mp3)
+- **Custom Music**: Put your music files (MP3/WAV, etc.) in the `bgm/` folder
+- Click "Preview BGM" to audition the music
+
+---
+
+### 🎤 Voice Settings (Middle Column)
+
+#### TTS Workflow
+- Select a TTS workflow from the dropdown menu (Edge-TTS, Index-TTS, and others are supported)
+- The system automatically scans the `workflows/` folder for TTS workflows
+- If you know ComfyUI, you can add your own TTS workflows
+
+#### Reference Audio (Optional)
+- Upload a reference audio file for voice cloning (MP3/WAV/FLAC and other formats are supported)
+- Applies to TTS workflows that support voice cloning (such as Index-TTS)
+- You can play it back right after uploading
+
+#### Preview Function
+- Enter test text and click "Preview Voice" to hear the result
+- The preview can also use the uploaded reference audio
+
+---
+
+### 🎨 Visual Settings (Middle Column)
+
+#### Image Generation
+Determines the style of the images AI generates.
+
+**ComfyUI Workflow**
+- Select image generation workflow from dropdown menu
+- Supports local deployment (selfhost) and cloud (RunningHub) workflows
+- Default uses `image_flux.json`
+- If you know ComfyUI, you can put your own workflows in the `workflows/` folder
+
+**Image Dimensions**
+- Set the width and height of generated images (in pixels)
+- Defaults to 1024x1024; adjust as needed
+- Note: different models have different size limits
+
+**Prompt Prefix**
+- Controls the overall image style (must be written in English)
+- Example: Minimalist black-and-white matchstick figure style illustration, clean lines, simple sketch style
+- Click "Preview Style" to test the effect
+
+#### Video Template
+Determines video layout and design.
+
+- Select a template from the dropdown menu, grouped by orientation (portrait/landscape/square)
+- Click "Preview Template" to test the effect with custom parameters
+- If you know HTML, you can create your own templates in the `templates/` folder
+
+---
+
+### 🎬 Generate Video (Right Column)
+
+#### Generate Button
+- After configuring all parameters, click "🎬 Generate Video"
+- Shows real-time progress (generating script → generating images → synthesizing voice → composing video)
+- A video preview is shown automatically on completion
+
+#### Progress Display
+- Shows current step in real-time
+- Example: "Frame 3/5 - Generating Image"
+
+#### Video Preview
+- Automatically plays after generation
+- Shows video duration, file size, number of frames, etc.
+- Video files are saved in the `output/` folder
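
If you want to inspect the results outside the Web interface, a small helper like this (hypothetical, not part of the project; it assumes MP4 files in `output/`) lists the generated videos with their sizes:

```python
from pathlib import Path

def list_videos(output_dir: str = "output") -> list:
    """Return (filename, size-in-MB) pairs for mp4 files, newest first."""
    out = Path(output_dir)
    if not out.is_dir():
        return []
    videos = sorted(out.glob("*.mp4"), key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, f"{p.stat().st_size / 1_000_000:.1f} MB") for p in videos]
```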
+
+---
+
+### ❓ FAQ
+
+**Q: How long does it take to use for the first time?**
+A: Generation time depends on the number of video frames, network conditions, and AI inference speed, typically completed within a few minutes.
+
+**Q: What if I'm not satisfied with the video?**
+A: You can try:
+1. Change the LLM model (different models produce different script styles)
+2. Adjust the image dimensions and prompt prefix (to change the image style)
+3. Change the TTS workflow or upload a reference audio (to change the voice)
+4. Try different video templates and dimensions
+
+**Q: What about the cost?**
+A: **This project can run completely free!**
+
+- **Completely Free**: LLM via Ollama (runs locally) + local ComfyUI deployment = zero cost
+- **Recommended**: LLM via Qwen (extremely low cost, great value) + local ComfyUI deployment
+- **Cloud**: LLM via OpenAI + images via RunningHub (higher cost, but no local environment needed)
+
+**Suggestion**: If you have a local GPU, use the completely free option; otherwise, Qwen is recommended for its cost-effectiveness
+
+---
+
+## 🤝 Referenced Projects
+
+Pixelle-Video's design is inspired by the following excellent open-source projects:
+
+- [Pixelle-MCP](https://github.com/AIDC-AI/Pixelle-MCP) - ComfyUI MCP server, allows AI assistants to directly call ComfyUI
+- [MoneyPrinterTurbo](https://github.com/harry0703/MoneyPrinterTurbo) - Excellent video generation tool
+- [NarratoAI](https://github.com/linyqh/NarratoAI) - Film commentary automation tool
+- [MoneyPrinterPlus](https://github.com/ddean2009/MoneyPrinterPlus) - Video creation platform
+- [ComfyKit](https://github.com/puke3615/ComfyKit) - ComfyUI workflow wrapper library
+
+Thanks to these projects for their open-source spirit! 🙏
+
+---
+
+## 📢 Feedback and Support
+
+- 🐛 **Found a Bug**: Submit an [Issue](https://github.com/AIDC-AI/Pixelle-Video/issues)
+- 💡 **Feature Suggestions**: Submit a [Feature Request](https://github.com/AIDC-AI/Pixelle-Video/issues)
+- ⭐ **Give a Star**: If this project helps you, please give it a Star!
+
+---
+
+## 📝 License
+
+This project is released under the MIT License. For details, please see the [LICENSE](LICENSE) file.
+
+---
+
+## ⭐ Star History
+
+[Star History](https://star-history.com/#AIDC-AI/Pixelle-Video&Date)
+