Commit Graph

102 Commits

Author SHA1 Message Date
let5sne.win10
a356c481ca Add task list page for Video Learning sessions
This commit adds a dedicated task list page to view and manage all video
learning sessions, solving the issue where users couldn't find their
background tasks after navigating away.

Features:
- New sessions.html page with card-based layout for all sessions
- Real-time polling for session status updates (every 3 seconds)
- Session control buttons (pause/resume/stop/delete)
- localStorage integration for session persistence across page refreshes
- Navigation links added to main page and video learning page
- Empty state UI when no sessions exist

New files:
- dashboard/static/sessions.html - Task list page
- dashboard/static/js/sessions.js - Sessions module with API calls
- dashboard/static/css/sessions.css - Styling for sessions page

Modified files:
- dashboard/api/video_learning.py - Added /sessions/list endpoint
- dashboard/static/index.html - Added "任务列表" (Task List) button
- dashboard/static/video-learning.html - Added "任务列表" (Task List) button and localStorage

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-10 02:22:42 +08:00
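
The 3-second status polling described in this commit lives in the browser (sessions.js). The loop below is only a rough Python sketch of the same idea, not the project's code: `fetch_sessions`, `on_update`, the `status` field, and the set of terminal states are all assumptions made for illustration.

```python
import time

POLL_INTERVAL_SECONDS = 3  # interval taken from the commit message
TERMINAL_STATES = {"completed", "stopped", "failed"}  # hypothetical status names


def poll_sessions(fetch_sessions, on_update, interval=POLL_INTERVAL_SECONDS,
                  sleep=time.sleep, max_polls=None):
    """Call fetch_sessions() repeatedly until every session is terminal.

    fetch_sessions and on_update are caller-supplied callables (hypothetical).
    An empty session list also ends the loop. Returns the number of polls made.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        sessions = fetch_sessions()
        on_update(sessions)
        polls += 1
        if all(s["status"] in TERMINAL_STATES for s in sessions):
            break
        sleep(interval)
    return polls


# Usage with canned responses and a fake sleep, so nothing actually waits.
responses = iter([
    [{"id": "a", "status": "running"}],
    [{"id": "a", "status": "running"}],
    [{"id": "a", "status": "completed"}],
])
sleeps = []
updates = []
poll_count = poll_sessions(lambda: next(responses), updates.append, sleep=sleeps.append)
```

Injecting `sleep` keeps the loop testable without real delays.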
let5sne.win10
a223d63088 Add multi-device support for Video Learning Agent
- Add device-session bidirectional mapping (_device_sessions)
- Integrate DeviceManager lock mechanism (acquire/release)
- Add device-level API endpoints:
  - GET /devices/{device_id}/sessions - List sessions on device
  - POST /devices/{device_id}/stop-all - Stop all sessions on device
  - GET /devices/sessions - Overview of all devices
- Update session cleanup to maintain mapping consistency
- Prevent concurrent sessions on same device

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-10 01:55:16 +08:00
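
A minimal sketch of how a bidirectional device-session mapping can block a second session on the same device. Only the name `_device_sessions` comes from the commit message; the class, its methods, and the one-session-per-device rule as written here are assumptions, and the real DeviceManager lock is not reproduced.

```python
import threading


class DeviceSessionRegistry:
    """Hypothetical registry keeping device -> session and session -> device in sync."""

    def __init__(self):
        self._lock = threading.Lock()
        self._device_sessions = {}   # device_id -> session_id
        self._session_devices = {}   # session_id -> device_id (reverse direction)

    def acquire(self, device_id, session_id):
        """Bind a session to a device; return False if the device is already in use."""
        with self._lock:
            if device_id in self._device_sessions:
                return False
            self._device_sessions[device_id] = session_id
            self._session_devices[session_id] = device_id
            return True

    def release(self, session_id):
        """Drop both directions of the mapping; return the freed device_id or None."""
        with self._lock:
            device_id = self._session_devices.pop(session_id, None)
            if device_id is not None:
                self._device_sessions.pop(device_id, None)
            return device_id


registry = DeviceSessionRegistry()
first = registry.acquire("emulator-5554", "session-a")    # device is free
second = registry.acquire("emulator-5554", "session-b")   # rejected: device busy
freed = registry.release("session-a")                     # cleanup frees the device
third = registry.acquire("emulator-5554", "session-b")    # now succeeds
```

Updating both dictionaries under one lock is what keeps the mapping consistent during cleanup.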
let5sne.win10
b97d3f3a9f Improve Video Learning Agent with action-based detection and analysis toggle
- Change video detection from screenshot hash to action-based (Swipe detection)
- Add enable_analysis toggle to disable VLM screenshot analysis
- Improve task prompt to prevent VLM from stopping prematurely
- Add debug logging for action detection troubleshooting
- Fix ModelResponse attribute error (content -> raw_content)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-10 01:47:09 +08:00
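
As an illustration of action-based detection, the sketch below counts a new video whenever the agent performs a Swipe, instead of comparing screenshot hashes. The action representation (a dict with an `"action"` key) is an assumption; the project's actual action objects may differ.

```python
def is_swipe(action):
    """Return True when the (assumed) action dict describes a Swipe."""
    return isinstance(action, dict) and action.get("action") == "Swipe"


def count_new_videos(actions):
    """Count one new video per Swipe action in the step history."""
    return sum(1 for action in actions if is_swipe(action))


steps = [
    {"action": "Tap"},
    {"action": "Swipe"},
    {"action": "Wait"},
    {"action": "Swipe"},
]
new_videos = count_new_videos(steps)
```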
let5sne.win10
6b770832aa Skip app startup screens with warmup logic
- Added warmup counter to skip first 3 steps after entering app
- Reset counter when leaving target app
- Prevents recording splash screens as videos

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 23:56:21 +08:00
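
The warmup logic can be sketched roughly as follows. The threshold of 3 steps and the reset on leaving the app come from the commit message; the class name, method name, and app names are hypothetical.

```python
WARMUP_STEPS = 3  # from the commit message: skip the first 3 steps after entering the app


class WarmupTracker:
    """Hypothetical helper that suppresses recording during app startup."""

    def __init__(self, target_app, warmup_steps=WARMUP_STEPS):
        self.target_app = target_app
        self.warmup_steps = warmup_steps
        self.steps_in_app = 0

    def should_record(self, current_app):
        """Return True only once the warmup steps inside the target app have passed."""
        if current_app != self.target_app:
            self.steps_in_app = 0  # leaving the app resets the counter
            return False
        self.steps_in_app += 1
        return self.steps_in_app > self.warmup_steps


tracker = WarmupTracker("Douyin")
observed = ["Douyin"] * 4 + ["Launcher"] + ["Douyin"] * 4
decisions = [tracker.should_record(app) for app in observed]
```

In this run only the fourth consecutive step inside the app is recorded, both before and after the detour to the launcher.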
let5sne.win10
a823c03788 Improve screenshot analysis prompt and add debug logs
- Simplified prompt to force JSON-only response
- Added debug logs to track VLM response and parsing
- Better error messages for troubleshooting

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 23:37:35 +08:00
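
Even with a JSON-only prompt, a model may wrap its answer in extra text, so a tolerant parser is useful. The following is a sketch under that assumption, not the project's parser; the function name and the sample fields are illustrative.

```python
import json
import re


def parse_vlm_json(raw_text):
    """Return the JSON object embedded in raw_text, or None if parsing fails."""
    # Greedy match from the first "{" to the last "}" in the response.
    match = re.search(r"\{.*\}", raw_text, re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


result = parse_vlm_json('Here is the result: {"likes": 12, "tags": ["food"]} Done.')
missing = parse_vlm_json("I could not read the screenshot.")
```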
let5sne.win10
c4325d57d4 Fix record_video method to accept analysis fields
Added missing parameters: shares, tags, category, elements
Now analysis results are properly saved to VideoRecord

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 23:23:57 +08:00
let5sne.win10
195a93b7e0 Add screenshot content analysis using VLM
Features:
- ScreenshotAnalyzer class for VLM-based image analysis
- Real-time analysis during video recording
- Extract likes, comments, tags, category from screenshots
- Frontend display for category badges and tags
- Batch analysis API endpoint

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 23:20:52 +08:00
let5sne.win10
5b3f214e20 Add Video Learning Agent for short video platforms
Features:
- VideoLearningAgent for automated video watching on Douyin/Kuaishou/TikTok
- Web dashboard UI for video learning sessions
- Real-time progress tracking with screenshot capture
- App detection using get_current_app() for accurate recording
- Session management with pause/resume/stop controls

Technical improvements:
- Simplified video detection logic using direct app detection
- Full base64 hash for sensitive screenshot change detection
- Immediate stop when target video count is reached
- Fixed circular import issues with ModelConfig

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 22:54:57 +08:00
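
Hashing the full base64 string, rather than a prefix, makes change detection sensitive to any pixel difference. A minimal sketch follows; the commit does not name the hash function, so SHA-256 and the function names are assumptions.

```python
import base64
import hashlib


def screenshot_fingerprint(screenshot_b64):
    """Hash the entire base64 payload so any change alters the digest."""
    return hashlib.sha256(screenshot_b64.encode("ascii")).hexdigest()


def has_changed(previous_b64, current_b64):
    """Compare two screenshots by fingerprint."""
    return screenshot_fingerprint(previous_b64) != screenshot_fingerprint(current_b64)


# Stand-in image bytes; real screenshots would come from the device.
frame_a = base64.b64encode(b"\x89PNG-frame-a").decode("ascii")
frame_b = base64.b64encode(b"\x89PNG-frame-b").decode("ascii")
same = has_changed(frame_a, frame_a)
different = has_changed(frame_a, frame_b)
```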
let5sne.win10
3552df23d6 Add Web Dashboard with multi-device control and callback hooks
Features:
- Web Dashboard: FastAPI-based dashboard with Vue.js frontend
  - Multi-device support (ADB, HDC, iOS)
  - Real-time WebSocket updates for task progress
  - Device management with status tracking
  - Task queue with execution controls (start/stop/re-execute)
  - Detailed task information display (thinking, actions, completion messages)
  - Screenshot viewing per device
  - LAN deployment support with configurable CORS

- Callback Hooks: Interrupt and modify task execution
  - step_callback: Called after each step with StepResult
  - before_action_callback: Called before executing action
  - Support for task interruption and dynamic task switching
  - Example scripts demonstrating callback usage

- Configuration: Environment-based configuration
  - .env file support for all settings
  - .env.example template with documentation
  - Model API configuration (base URL, model name, API key)
  - Dashboard configuration (host, port, CORS, device type)
  - Phone agent configuration (delays, max steps, language)

Technical improvements:
- Fixed forward reference issue with StepResult
- Added package exports for callback types and configs
- Enhanced dependencies with FastAPI, WebSocket support
- Thread-safe task execution with device locking
- Async WebSocket broadcasting from sync thread pool

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 02:20:06 +08:00
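
One common way to broadcast asynchronously from a synchronous thread pool is `asyncio.run_coroutine_threadsafe`. The sketch below shows that pattern with a stand-in `broadcast` coroutine; it is an assumption about the approach, and no FastAPI or WebSocket objects from the project are used.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

received = []


async def broadcast(message):
    """Stand-in for sending a message to all WebSocket clients (hypothetical)."""
    received.append(message)


def run_task(loop, task_id):
    """Runs in a worker thread and hands the coroutine to the event loop."""
    future = asyncio.run_coroutine_threadsafe(broadcast(f"task {task_id} done"), loop)
    future.result(timeout=5)  # wait until the loop has run the broadcast


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as pool:
        jobs = [loop.run_in_executor(pool, run_task, loop, i) for i in range(3)]
        await asyncio.gather(*jobs)


asyncio.run(main())
```

The worker threads never touch the event loop directly; they only schedule work on it, which keeps the loop's state single-threaded.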
yongbin-buaa
9fe189a8f8 Merge pull request #263 from floatingstarZ/hzy_1220_hdc
Fix the get_current_app function for HDC
2026-01-05 15:26:53 +08:00
zRzRzRzRzRzRzR
53bf7d2644 Update wechat.jpeg 2025-12-31 18:33:15 +08:00
zRzRzRzRzRzRzR
2140330a98 Update wechat.jpeg 2025-12-31 13:56:27 +08:00
yongbin-buaa
326cadd5d3 Merge pull request #257 from floatingstarZ/hzy_1220
Improve the iOS configuration documentation
2025-12-22 13:01:48 +08:00
floatingstarZ
75f0e26ae4 Fix the get_current_app function for HDC 2025-12-22 10:29:57 +08:00
floatingstarZ
ab66e47906 Add docs and improve the iOS README 2025-12-20 22:12:53 +08:00
yongbin-buaa
1ab6e1edf6 Merge pull request #249 from zai-org/support-ios
support ios in main.py
2025-12-19 19:00:20 +08:00
liuyongbin
780b756e21 fix format 2025-12-19 18:56:43 +08:00
yongbin-buaa
7729568ae0 Merge pull request #143 from gekowa/ios-support-3
feat: Added iOS support
2025-12-19 18:15:50 +08:00
yongbin-buaa
5fcb2f5146 Merge pull request #237 from zai-org/update-hdc-readme
Update README.md and README_en.md to support HarmonyOS devices
2025-12-18 15:55:05 +08:00
liuyongbin
4b30344f15 fix precommit 2025-12-18 15:52:13 +08:00
zRzRzRzRzRzRzR
34457ded4c new group 2025-12-17 17:25:14 +08:00
yongbin-buaa
728c53f512 Merge pull request #221 from floatingstarZ/hzy_1216_v4_ok
Support HDC for HarmonyOS NEXT
2025-12-17 17:11:54 +08:00
floatingstarZ
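9bbf112dda Improve HDC text input: support multi-line text and simplify the interface
- Implement multi-line text support in hdc/input.py, using HarmonyOS keyEvent 2054 for line breaks
- Remove the x/y coordinate parameters from type_text to simplify the interface
- Move the multi-line text handling from handler.py into hdc/input.py so it is handled in one place
- Improve parse_action to extract the text parameter of Type actions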

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 19:54:59 +08:00
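
The multi-line handling can be pictured as splitting the text on newlines and sending an Enter key event between the pieces. The sketch below only builds the sequence of operations; key code 2054 comes from the commit message, while the function name and the `("text", ...)` / `("key", ...)` tuples are hypothetical, and no real hdc commands are issued.

```python
ENTER_KEY_EVENT = 2054  # HarmonyOS keyEvent used for line breaks, per the commit message


def plan_multiline_input(text):
    """Turn text into an ordered list of ("text", str) and ("key", int) operations."""
    operations = []
    lines = text.split("\n")
    for index, line in enumerate(lines):
        if line:  # an empty line needs no text input, only the key event
            operations.append(("text", line))
        if index < len(lines) - 1:
            operations.append(("key", ENTER_KEY_EVENT))
    return operations


plan = plan_multiline_input("hello\nworld")
blank_line_plan = plan_multiline_input("a\n\nb")
```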
floatingstarZ
c0573c097f fix the entry ability of apps 2025-12-16 16:38:09 +08:00
floatingstarZ
4a258c1284 Support HarmonyOS NEXT HDC 2025-12-16 14:56:22 +08:00
floatingstarZ
95f5921887 Support HarmonyOS NEXT HDC 2025-12-16 14:28:49 +08:00
yongbin-buaa
6855503f20 Merge pull request #212 from zai-org/fix-multiline-type
fix adb multiline input
2025-12-16 14:22:24 +08:00
liuyongbin
707b7f43f8 fix adb multiline input 2025-12-16 14:14:46 +08:00
floatingstarZ
4d427bcd31 Fix the line breaks in model_output reported in issue 154 and multi-line text output; fix the subprocess encoding problem 2025-12-15 18:10:46 +08:00
yongbin-buaa
b873c32917 Merge pull request #196 from xiaoman-kb/didi_error
Fix Didi error (correct the DiDi Chuxing package name)
2025-12-15 16:50:10 +08:00
xiaoman-kb
172d3e8e51 Fix Didi error 2025-12-15 15:17:53 +08:00
yongbin-buaa
7eda6372b8 Merge pull request #192 from zai-org/support-delay-config
support delay config
2025-12-15 13:17:10 +08:00
liuyongbin
430c13d22d support delay config 2025-12-15 11:56:48 +08:00
zRzRzRzRzRzRzR
61c1522174 use 5th wechat group (25-30) 2025-12-14 16:29:49 +08:00
yongbin-buaa
fefbed00c4 Merge pull request #179 from zai-org/add-latency-log
fix format
2025-12-14 14:04:31 +08:00
liuyongbin
b1ddd98552 fix format 2025-12-14 14:03:41 +08:00
xhguo
483b4f3bff chore: Added iOS guide into README 2025-12-13 16:39:03 +08:00
Yuxuan Zhang
b18993adf7 Merge pull request #113 from ksDreamer/main
docs(README): typos on line 54 “Downloads”
2025-12-13 12:23:00 +08:00
yongbin-buaa
0cf1fe746a Merge pull request #152 from zai-org/update-eval-security
replace eval with ast
2025-12-13 00:59:16 +08:00
yongbin-buaa
8c10bf7e99 replace eval with ast 2025-12-13 00:57:22 +08:00
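
Replacing `eval` with the `ast` module typically means parsing the text into a syntax tree and accepting only literal values, so nothing in the model output is executed. The sketch below assumes the model emits a call-like string such as `do(action="Tap", ...)`; that format and the function name `parse_call` are assumptions, not the project's confirmed parser.

```python
import ast


def parse_call(text):
    """Parse 'name(key=literal, ...)' into (name, params) without executing it."""
    tree = ast.parse(text.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a simple function call")
    if call.args:
        raise ValueError("positional arguments are not supported")
    params = {}
    for keyword in call.keywords:
        if keyword.arg is None:
            raise ValueError("** unpacking is not supported")
        # literal_eval accepts only literals (strings, numbers, lists, dicts, ...).
        params[keyword.arg] = ast.literal_eval(keyword.value)
    return call.func.id, params


name, params = parse_call('do(action="Tap", element=[500, 300])')

try:
    parse_call('do(action=__import__("os").system("ls"))')
    rejected = False
except ValueError:
    rejected = True
```

A non-literal argument, such as the `__import__` call above, raises `ValueError` instead of running.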
yongbin-buaa
b37eef6506 Merge pull request #151 from zai-org/support-stream-thinking
support stream thinking
2025-12-13 00:44:34 +08:00
yongbin-buaa
0653d6ea65 support stream thinking 2025-12-13 00:41:40 +08:00
xhguo
e500501635 chore: Deleted local config file 2025-12-12 19:45:36 +08:00
xhguo
57ed003a30 chore: Removed local settings 2025-12-12 18:10:28 +08:00
xhguo
7c23ca549b feat: Added iOS support 2025-12-12 17:58:20 +08:00
yongbin-buaa
b2e985a790 Merge pull request #129 from zai-org/update-check-deployment-en
add en deployment check
2025-12-12 13:04:22 +08:00
yongbin-buaa
d4fc4dd2ad add en deployment check 2025-12-12 13:02:36 +08:00
zRzRzRzRzRzRzR
1fa7348905 Merge branch 'main' of https://github.com/zai-org/Open-AutoGLM 2025-12-12 12:14:59 +08:00
zRzRzRzRzRzRzR
44bef33d47 Update wechat.jpeg 2025-12-12 12:13:59 +08:00
Shaw
ac3b223c28 Merge pull request #123 from zai-org/Xiao9905-patch-1
Add link to Voice Input Method App
2025-12-12 11:03:27 +08:00