Add Web Dashboard with multi-device control and callback hooks

Features:
- Web Dashboard: FastAPI-based dashboard with Vue.js frontend
  - Multi-device support (ADB, HDC, iOS)
  - Real-time WebSocket updates for task progress
  - Device management with status tracking
  - Task queue with execution controls (start/stop/re-execute)
  - Detailed task information display (thinking, actions, completion messages)
  - Screenshot viewing per device
  - LAN deployment support with configurable CORS

- Callback Hooks: Interrupt and modify task execution
  - step_callback: Called after each step with StepResult
  - before_action_callback: Called before executing action
  - Support for task interruption and dynamic task switching
  - Example scripts demonstrating callback usage

- Configuration: Environment-based configuration
  - .env file support for all settings
  - .env.example template with documentation
  - Model API configuration (base URL, model name, API key)
  - Dashboard configuration (host, port, CORS, device type)
  - Phone agent configuration (delays, max steps, language)

Technical improvements:
- Fixed forward reference issue with StepResult
- Added package exports for callback types and configs
- Enhanced dependencies with FastAPI, WebSocket support
- Thread-safe task execution with device locking
- Async WebSocket broadcasting from sync thread pool

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
let5sne.win10
2026-01-09 02:20:06 +08:00
parent 9fe189a8f8
commit 3552df23d6
31 changed files with 4221 additions and 8 deletions

110
.env.example Normal file
View File

@@ -0,0 +1,110 @@
# Phone Agent Environment Configuration
# Copy this file to .env and fill in your values
# ============================================================================
# Model Configuration / 模型配置
# ============================================================================
# Model API base URL / 模型 API 地址
# Default: http://localhost:8000/v1
# For BigModel (智谱AI): https://open.bigmodel.cn/api/paas/v4
# For ModelScope: https://api-inference.modelscope.cn/v1
PHONE_AGENT_BASE_URL=http://localhost:8000/v1
# Model name / 模型名称
# Default: autoglm-phone-9b
PHONE_AGENT_MODEL=autoglm-phone-9b
# API key for authentication / API 密钥
# Default: EMPTY (no authentication)
# For BigModel: your actual API key
PHONE_AGENT_API_KEY=EMPTY
# ============================================================================
# Agent Configuration / Agent 配置
# ============================================================================
# Maximum steps per task / 每个任务的最大步数
PHONE_AGENT_MAX_STEPS=100
# Language for system prompt / 系统提示词语言
# Options: cn (Chinese), en (English)
PHONE_AGENT_LANG=cn
# Verbose output / 详细输出
# Options: true, false
PHONE_AGENT_VERBOSE=true
# ============================================================================
# Device Configuration / 设备配置
# ============================================================================
# Device ID (for multi-device setups) / 设备 ID多设备时使用
# Leave empty to auto-detect / 留空则自动检测
PHONE_AGENT_DEVICE_ID=
# Device type / 设备类型
# Options: adb (Android), hdc (HarmonyOS), ios (iPhone)
PHONE_AGENT_DEVICE_TYPE=adb
# ============================================================================
# iOS Specific Configuration / iOS 专用配置
# ============================================================================
# WebDriverAgent URL for iOS / iOS WebDriverAgent 地址
PHONE_AGENT_WDA_URL=http://localhost:8100
# ============================================================================
# Timing Configuration (Optional) / 时序配置(可选)
# ============================================================================
# Action delays in seconds / 动作延迟(秒)
PHONE_AGENT_KEYBOARD_SWITCH_DELAY=1.0
PHONE_AGENT_TEXT_INPUT_DELAY=1.0
# Device delays in seconds / 设备延迟(秒)
PHONE_AGENT_TAP_DELAY=1.0
PHONE_AGENT_SWIPE_DELAY=1.0
PHONE_AGENT_LAUNCH_DELAY=1.0
# Connection delays in seconds / 连接延迟(秒)
PHONE_AGENT_ADB_RESTART_DELAY=2.0
# ============================================================================
# Web Dashboard Configuration / Web 仪表板配置
# ============================================================================
# Dashboard server host / 仪表板服务器主机
# Use 0.0.0.0 for LAN access / 使用 0.0.0.0 以支持局域网访问
DASHBOARD_HOST=0.0.0.0
# Dashboard server port / 仪表板服务器端口
DASHBOARD_PORT=8080
# Dashboard debug mode / 仪表板调试模式
# Options: true, false
DASHBOARD_DEBUG=false
# CORS allowed origins / CORS 允许的来源
# Comma-separated list, use * for all / 逗号分隔列表,使用 * 表示全部
# Examples: http://localhost:3000,http://192.168.1.*
DASHBOARD_CORS_ORIGINS=*
# Default device type for dashboard / 仪表板默认设备类型
# Options: adb, hdc, ios
DEFAULT_DEVICE_TYPE=adb
# Maximum concurrent tasks / 最大并发任务数
MAX_CONCURRENT_TASKS=10
# Task timeout in seconds / 任务超时时间(秒)
TASK_TIMEOUT_SECONDS=300
# Screenshot quality (1-100) / 截图质量 (1-100)
SCREENSHOT_QUALITY=80
# Screenshot throttle delay in milliseconds / 截图节流延迟(毫秒)
SCREENSHOT_THROTTLE_MS=500
# Maximum task history to keep / 保留的最大任务历史数
MAX_TASK_HISTORY=100

5
.gitignore vendored
View File

@@ -54,6 +54,11 @@ Thumbs.db
*.log
/tmp/
screenshots/
*.apk
# Environment variables (keep .env.example as template)
.env
.env.local
# Keep old files during transition
call_model.py

8
dashboard/__init__.py Normal file
View File

@@ -0,0 +1,8 @@
"""
AutoGLM Dashboard - Web-based multi-device control interface.
This package provides a FastAPI web application for controlling multiple
Android (ADB), HarmonyOS (HDC), and iOS devices through a web interface.
"""
__version__ = "0.1.0"

13
dashboard/api/__init__.py Normal file
View File

@@ -0,0 +1,13 @@
"""
API endpoints for the dashboard.
"""
from dashboard.api.devices import router as devices_router
from dashboard.api.tasks import router as tasks_router
from dashboard.api.websocket import router as websocket_router
__all__ = [
"devices_router",
"tasks_router",
"websocket_router",
]

193
dashboard/api/devices.py Normal file
View File

@@ -0,0 +1,193 @@
"""
Device management API endpoints.
"""
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from dashboard.dependencies import get_device_manager, get_ws_manager
from dashboard.models.device import DeviceSchema, DeviceStatus
from dashboard.services.device_manager import DeviceManager
from dashboard.services.websocket_manager import WebSocketManager
router = APIRouter(prefix="/devices", tags=["devices"])
@router.get("", response_model=List[DeviceSchema])
async def list_devices(
device_manager: DeviceManager = Depends(get_device_manager),
):
"""List all connected devices.
Returns:
List of device schemas
"""
devices = await device_manager.refresh_devices()
return [
DeviceSchema(
device_id=d.device_id,
status=d.status,
device_type=d.device_type,
model=d.model,
android_version=d.android_version,
current_app=d.current_app,
last_seen=d.last_seen,
is_connected=d.is_connected,
)
for d in devices
]
@router.get("/refresh")
async def refresh_devices(
device_manager: DeviceManager = Depends(get_device_manager),
ws_manager: WebSocketManager = Depends(get_ws_manager),
):
"""Rescan for connected devices.
Returns:
Refresh confirmation
"""
devices = await device_manager.refresh_devices()
# Broadcast device update
for device in devices:
await ws_manager.broadcast_device_update(
device.device_id,
{
"status": device.status,
"is_connected": device.is_connected,
"model": device.model,
"current_app": device.current_app,
"last_seen": device.last_seen.isoformat(),
},
)
return {"message": "Devices refreshed", "count": len(devices)}
@router.get("/{device_id}", response_model=DeviceSchema)
async def get_device(
device_id: str,
device_manager: DeviceManager = Depends(get_device_manager),
):
"""Get device details.
Args:
device_id: Device identifier
Returns:
Device schema
Raises:
HTTPException: If device not found
"""
device = await device_manager.get_device(device_id)
if device is None:
raise HTTPException(status_code=404, detail=f"Device {device_id} not found")
return DeviceSchema(
device_id=device.device_id,
status=device.status,
device_type=device.device_type,
model=device.model,
android_version=device.android_version,
current_app=device.current_app,
last_seen=device.last_seen,
is_connected=device.is_connected,
)
@router.post("/{device_id}/connect")
async def connect_device(
device_id: str,
address: str | None = None,
device_manager: DeviceManager = Depends(get_device_manager),
):
"""Connect to device via WiFi.
Args:
device_id: Device identifier (for route matching)
address: Device address (IP:PORT)
Returns:
Connection result
"""
if not address:
raise HTTPException(status_code=400, detail="Address is required")
success = await device_manager.connect_device(address)
if success:
return {"message": f"Connected to {address}"}
else:
raise HTTPException(status_code=500, detail=f"Failed to connect to {address}")
@router.post("/{device_id}/disconnect")
async def disconnect_device(
device_id: str,
address: str | None = None,
device_manager: DeviceManager = Depends(get_device_manager),
):
"""Disconnect from device.
Args:
device_id: Device identifier (for route matching)
address: Device address (IP:PORT)
Returns:
Disconnection result
"""
if not address:
raise HTTPException(status_code=400, detail="Address is required")
success = await device_manager.disconnect_device(address)
if success:
return {"message": f"Disconnected from {address}"}
else:
raise HTTPException(
status_code=500, detail=f"Failed to disconnect from {address}"
)
@router.get("/{device_id}/screenshot")
async def get_device_screenshot(
device_id: str,
device_manager: DeviceManager = Depends(get_device_manager),
):
"""Get current device screenshot.
Args:
device_id: Device identifier
Returns:
Screenshot data (base64 encoded)
"""
screenshot = await device_manager.get_screenshot(device_id)
if screenshot is None:
raise HTTPException(status_code=500, detail="Failed to capture screenshot")
return {"device_id": device_id, "screenshot": screenshot}
@router.get("/{device_id}/current-app")
async def get_current_app(
device_id: str,
device_manager: DeviceManager = Depends(get_device_manager),
):
"""Get current app for device.
Args:
device_id: Device identifier
Returns:
Current app package name
"""
app = await device_manager.get_current_app(device_id)
return {"device_id": device_id, "current_app": app}

156
dashboard/api/tasks.py Normal file
View File

@@ -0,0 +1,156 @@
"""
Task management API endpoints.
"""
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from dashboard.config import config
from dashboard.dependencies import get_task_executor
from dashboard.models.task import TaskCreateRequest, TaskSchema, TaskStatus
from dashboard.services.task_executor import TaskExecutor
router = APIRouter(prefix="/tasks", tags=["tasks"])
@router.post("/execute", response_model=dict)
async def execute_task(
request: TaskCreateRequest,
executor: TaskExecutor = Depends(get_task_executor),
):
"""Execute task on device.
Args:
request: Task creation request
Returns:
Task ID
"""
# Fill in model config from environment if using defaults
if request.base_url == "http://localhost:8000/v1":
request.base_url = config.MODEL_BASE_URL
if request.model_name == "autoglm-phone-9b":
request.model_name = config.MODEL_NAME
if request.api_key == "EMPTY":
request.api_key = config.MODEL_API_KEY
task_id = await executor.execute_task(request)
return {"task_id": task_id, "message": "Task started"}
@router.post("/{task_id}/stop")
async def stop_task(
task_id: str,
executor: TaskExecutor = Depends(get_task_executor),
):
"""Stop running task.
Args:
task_id: Task identifier
Returns:
Stop confirmation
"""
await executor.stop_task(task_id)
return {"message": f"Task {task_id} stop requested"}
@router.get("", response_model=List[TaskSchema])
async def list_tasks(
executor: TaskExecutor = Depends(get_task_executor),
):
"""List all tasks (active and recent).
Returns:
List of task schemas
"""
return await executor.list_tasks()
@router.get("/{task_id}", response_model=TaskSchema)
async def get_task_status(
task_id: str,
executor: TaskExecutor = Depends(get_task_executor),
):
"""Get task status.
Args:
task_id: Task identifier
Returns:
Task schema
Raises:
HTTPException: If task not found
"""
task = await executor.get_task_status(task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
return task
@router.get("/{task_id}/screenshot")
async def get_task_screenshot(
task_id: str,
executor: TaskExecutor = Depends(get_task_executor),
):
"""Get latest screenshot from task execution.
Args:
task_id: Task identifier
Returns:
Screenshot data (base64 encoded)
Raises:
HTTPException: If task not found or device unavailable
"""
task = await executor.get_task_status(task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
# Get screenshot from device manager
from dashboard.dependencies import get_device_manager
device_manager = get_device_manager()
screenshot = await device_manager.get_screenshot(task.device_id)
if screenshot is None:
raise HTTPException(
status_code=500, detail="Failed to capture screenshot from device"
)
return {"task_id": task_id, "device_id": task.device_id, "screenshot": screenshot}
@router.get("/stats/summary")
async def get_task_stats(
executor: TaskExecutor = Depends(get_task_executor),
):
"""Get task execution statistics.
Returns:
Task statistics summary
"""
tasks = await executor.list_tasks()
total = len(tasks)
running = sum(1 for t in tasks if t.status == TaskStatus.RUNNING)
completed = sum(1 for t in tasks if t.status == TaskStatus.COMPLETED)
failed = sum(1 for t in tasks if t.status == TaskStatus.FAILED)
stopped = sum(1 for t in tasks if t.status == TaskStatus.STOPPED)
return {
"total": total,
"running": running,
"completed": completed,
"failed": failed,
"stopped": stopped,
"active_count": executor.get_active_task_count(),
}

124
dashboard/api/websocket.py Normal file
View File

@@ -0,0 +1,124 @@
"""
WebSocket API endpoints for real-time updates.
"""
from fastapi import APIRouter, WebSocket, WebSocketDisconnect, Depends, Query
from dashboard.dependencies import get_ws_manager, get_device_manager
from dashboard.services.websocket_manager import WebSocketManager
from dashboard.services.device_manager import DeviceManager
router = APIRouter(prefix="/ws", tags=["websocket"])
@router.websocket("")
async def websocket_endpoint(
websocket: WebSocket,
client_id: str | None = Query(None),
ws_manager: WebSocketManager = Depends(get_ws_manager),
device_manager: DeviceManager = Depends(get_device_manager),
):
"""WebSocket endpoint for real-time updates.
This endpoint provides real-time updates for:
- Device connection/disconnection
- Task execution progress
- Screenshot updates
- Task completion
Query parameters:
client_id: Optional client ID (auto-generated if not provided)
Message types (client -> server):
- {"type": "subscribe", "device_id": "device_id"} - Subscribe to device updates
- {"type": "unsubscribe", "device_id": "device_id"} - Unsubscribe from device updates
- {"type": "ping"} - Ping server
Message types (server -> client):
- {"type": "device_update", "data": {...}} - Device status update
- {"type": "task_started", "data": {...}} - Task started
- {"type": "task_step", "data": {...}} - Task step update
- {"type": "task_completed", "data": {...}} - Task completed
- {"type": "task_failed", "data": {...}} - Task failed
- {"type": "task_stopped", "data": {...}} - Task stopped
- {"type": "screenshot", "data": {...}} - Screenshot update
- {"type": "error", "data": {...}} - Error occurred
- {"type": "pong"} - Pong response
"""
# Accept connection
client_id = await ws_manager.connect(websocket, client_id)
try:
# Send initial devices
devices = await device_manager.refresh_devices()
for device in devices:
await websocket.send_json({
"type": "device_update",
"data": {
"device_id": device.device_id,
"status": device.status.value,
"device_type": device.device_type.value,
"model": device.model,
"android_version": device.android_version,
"current_app": device.current_app,
"is_connected": device.is_connected,
}
})
# Message loop
while True:
data = await websocket.receive_json()
msg_type = data.get("type")
if msg_type == "subscribe":
# Subscribe to device updates
device_id = data.get("device_id", "*")
ws_manager.subscribe_to_device(client_id, device_id)
elif msg_type == "unsubscribe":
# Unsubscribe from device updates
device_id = data.get("device_id")
if device_id:
ws_manager.unsubscribe_from_device(client_id, device_id)
elif msg_type == "ping":
# Respond to ping
await websocket.send_json({"type": "pong"})
except WebSocketDisconnect:
ws_manager.disconnect(client_id)
except Exception:
ws_manager.disconnect(client_id)
@router.websocket("/device/{device_id}")
async def device_websocket(
device_id: str,
websocket: WebSocket,
ws_manager: WebSocketManager = Depends(get_ws_manager),
):
"""Device-specific WebSocket endpoint for real-time updates.
This endpoint provides real-time updates for a specific device.
Automatically subscribes to the device's updates.
Args:
device_id: Device identifier
"""
# Accept connection and auto-subscribe to device
client_id = await ws_manager.connect(websocket)
ws_manager.subscribe_to_device(client_id, device_id)
try:
# Keep connection alive
while True:
data = await websocket.receive_json()
# Handle client messages
if data.get("type") == "ping":
await websocket.send_json({"type": "pong"})
except WebSocketDisconnect:
ws_manager.disconnect(client_id)
except Exception:
ws_manager.disconnect(client_id)

52
dashboard/config.py Normal file
View File

@@ -0,0 +1,52 @@
"""
Dashboard configuration settings.
"""
import os
from typing import List
from dotenv import load_dotenv
# Load .env file
load_dotenv()
class DashboardConfig:
"""Dashboard configuration."""
# Server settings
HOST: str = os.getenv("DASHBOARD_HOST", "0.0.0.0")
PORT: int = int(os.getenv("DASHBOARD_PORT", "8080"))
DEBUG: bool = os.getenv("DASHBOARD_DEBUG", "false").lower() == "true"
# CORS settings
CORS_ORIGINS: List[str] = os.getenv(
"DASHBOARD_CORS_ORIGINS", "*"
).split(",") if os.getenv("DASHBOARD_CORS_ORIGINS") else ["*"]
# Device settings
DEFAULT_DEVICE_TYPE: str = os.getenv("DEFAULT_DEVICE_TYPE", "adb")
# Task settings
MAX_CONCURRENT_TASKS: int = int(os.getenv("MAX_CONCURRENT_TASKS", "10"))
TASK_TIMEOUT_SECONDS: int = int(os.getenv("TASK_TIMEOUT_SECONDS", "300"))
# Screenshot settings
SCREENSHOT_QUALITY: int = int(os.getenv("SCREENSHOT_QUALITY", "80"))
SCREENSHOT_THROTTLE_MS: int = int(os.getenv("SCREENSHOT_THROTTLE_MS", "500"))
# Model settings (defaults from environment)
MODEL_BASE_URL: str = os.getenv("PHONE_AGENT_BASE_URL", "http://localhost:8000/v1")
MODEL_NAME: str = os.getenv("PHONE_AGENT_MODEL", "autoglm-phone-9b")
MODEL_API_KEY: str = os.getenv("PHONE_AGENT_API_KEY", "EMPTY")
# Task history
MAX_TASK_HISTORY: int = int(os.getenv("MAX_TASK_HISTORY", "100"))
class Config:
env_file = ".env"
env_file_encoding = "utf-8"
# Global config instance
config = DashboardConfig()

39
dashboard/dependencies.py Normal file
View File

@@ -0,0 +1,39 @@
"""
Dependency injection for the dashboard.
"""
from typing import AsyncGenerator
from dashboard.config import config
from dashboard.services.device_manager import DeviceManager
from dashboard.services.task_executor import TaskExecutor
from dashboard.services.websocket_manager import WebSocketManager
# Global service instances
_device_manager: DeviceManager | None = None
_task_executor: TaskExecutor | None = None
_ws_manager: WebSocketManager | None = None
def get_device_manager() -> DeviceManager:
"""Get the device manager instance."""
global _device_manager
if _device_manager is None:
_device_manager = DeviceManager(device_type=config.DEFAULT_DEVICE_TYPE)
return _device_manager
def get_task_executor() -> TaskExecutor:
"""Get the task executor instance."""
global _task_executor
if _task_executor is None:
_task_executor = TaskExecutor(get_device_manager())
return _task_executor
def get_ws_manager() -> WebSocketManager:
"""Get the WebSocket manager instance."""
global _ws_manager
if _ws_manager is None:
_ws_manager = WebSocketManager()
return _ws_manager

175
dashboard/main.py Normal file
View File

@@ -0,0 +1,175 @@
"""
AutoGLM Dashboard - FastAPI Main Application.
This is the main entry point for the web dashboard.
Run with: uvicorn dashboard.main:app --host 0.0.0.0 --port 8080 --reload
"""
import os
from contextlib import asynccontextmanager
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, JSONResponse
from fastapi.staticfiles import StaticFiles
from dashboard.api import devices_router, tasks_router, websocket_router
from dashboard.config import config
from dashboard.dependencies import (
get_device_manager,
get_task_executor,
get_ws_manager,
)
from dashboard.services.device_manager import DeviceManager
from dashboard.services.task_executor import TaskExecutor
from dashboard.services.websocket_manager import WebSocketManager
# Load .env file
load_dotenv()
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Application lifespan manager.
Handles startup and shutdown events.
"""
# Startup
print("=" * 50)
print("AutoGLM Dashboard Starting...")
print("=" * 50)
# Initialize services
device_manager = get_device_manager()
task_executor = get_task_executor()
ws_manager = get_ws_manager()
# Link services
task_executor.set_ws_manager(ws_manager)
# Scan for devices on startup
print("Scanning for devices...")
try:
devices = await device_manager.refresh_devices()
print(f"Found {len(devices)} device(s)")
for device in devices:
status = "connected" if device.is_connected else "disconnected"
print(f" - {device.device_id} ({status})")
except Exception as e:
print(f"Error scanning devices: {e}")
print("=" * 50)
print(f"Dashboard running on http://{config.HOST}:{config.PORT}")
print(f"API docs: http://{config.HOST}:{config.PORT}/docs")
print("=" * 50)
yield
# Shutdown
print("Shutting down dashboard...")
# Create FastAPI app
app = FastAPI(
title="AutoGLM Dashboard",
description="Web-based multi-device control interface for AutoGLM",
version="0.1.0",
lifespan=lifespan,
)
# Configure CORS for LAN access
app.add_middleware(
CORSMiddleware,
allow_origins=config.CORS_ORIGINS,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Exception handlers
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
"""Global exception handler."""
return JSONResponse(
status_code=500,
content={"error": str(exc), "type": type(exc).__name__},
)
# Include routers
app.include_router(devices_router, prefix="/api")
app.include_router(tasks_router, prefix="/api")
app.include_router(websocket_router)
# Health check
@app.get("/health")
async def health_check():
"""Health check endpoint."""
device_manager = get_device_manager()
task_executor = get_task_executor()
ws_manager = get_ws_manager()
devices = device_manager.list_all_devices()
connected_devices = sum(1 for d in devices if d.is_connected)
return {
"status": "healthy",
"timestamp": datetime.now().isoformat(),
"devices": {
"total": len(devices),
"connected": connected_devices,
},
"tasks": {
"active": task_executor.get_active_task_count(),
},
"websocket": {
"connections": ws_manager.get_connection_count(),
},
}
# Root endpoint - serve dashboard or API info
@app.get("/")
async def root():
"""Root endpoint - returns API info or serves dashboard."""
# Check if static files exist
static_path = Path(__file__).parent / "static" / "index.html"
if static_path.exists():
return FileResponse(static_path)
# Return API info if dashboard not built
return {
"name": "AutoGLM Dashboard API",
"version": "0.1.0",
"docs": "/docs",
"endpoints": {
"devices": "/api/devices",
"tasks": "/api/tasks",
"websocket": "/ws",
"health": "/health",
},
}
# Mount static files for dashboard (if exists)
static_path = Path(__file__).parent / "static"
if static_path.exists():
app.mount("/static", StaticFiles(directory=str(static_path)), name="static")
# Run script entry point
if __name__ == "__main__":
import uvicorn
uvicorn.run(
"dashboard.main:app",
host=config.HOST,
port=config.PORT,
reload=config.DEBUG,
)

View File

@@ -0,0 +1,37 @@
"""
Data models for the dashboard API.
Includes Pydantic schemas for devices, tasks, and WebSocket messages.
"""
from dashboard.models.device import (
DeviceType,
DeviceStatus,
DeviceSchema,
DeviceInfo,
)
from dashboard.models.task import (
TaskStatus,
TaskRequest,
TaskSchema,
TaskCreateRequest,
)
from dashboard.models.ws_messages import (
WSMessageType,
WSMessage,
StepUpdate,
)
__all__ = [
"DeviceType",
"DeviceStatus",
"DeviceSchema",
"DeviceInfo",
"TaskStatus",
"TaskRequest",
"TaskSchema",
"TaskCreateRequest",
"WSMessageType",
"WSMessage",
"StepUpdate",
]

View File

@@ -0,0 +1,67 @@
"""
Device data models for the dashboard.
"""
from datetime import datetime
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
class DeviceType(str, Enum):
"""Device connection type."""
ADB = "adb"
HDC = "hdc"
IOS = "ios"
class DeviceStatus(str, Enum):
"""Device status."""
ONLINE = "online"
OFFLINE = "offline"
BUSY = "busy"
ERROR = "error"
class DeviceSchema(BaseModel):
"""Device schema for API responses."""
device_id: str = Field(..., description="Unique device identifier")
status: DeviceStatus = Field(default=DeviceStatus.OFFLINE, description="Device status")
device_type: DeviceType = Field(..., description="Device connection type")
model: Optional[str] = Field(None, description="Device model name")
android_version: Optional[str] = Field(None, description="Android/iOS version")
current_app: Optional[str] = Field(None, description="Currently active app")
last_seen: datetime = Field(default_factory=datetime.now, description="Last connection time")
is_connected: bool = Field(True, description="Whether device is connected")
screenshot: Optional[str] = Field(None, description="Base64 encoded screenshot")
class Config:
json_schema_extra = {
"example": {
"device_id": "emulator-5554",
"status": "online",
"device_type": "adb",
"model": "sdk_gphone64_x86_64",
"android_version": "14",
"current_app": "com.android.launcher3",
"is_connected": True,
}
}
class DeviceInfo(BaseModel):
"""Extended device information."""
device_id: str
status: DeviceStatus
device_type: DeviceType
model: Optional[str] = None
android_version: Optional[str] = None
current_app: Optional[str] = None
last_seen: datetime
screenshot: Optional[str] = None
is_connected: bool = True

81
dashboard/models/task.py Normal file
View File

@@ -0,0 +1,81 @@
"""
Task data models for the dashboard.
"""
from datetime import datetime
from enum import Enum
from typing import Any, Dict, Optional
from pydantic import BaseModel, Field
from phone_agent.model import ModelConfig
class TaskStatus(str, Enum):
"""Task execution status."""
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
STOPPED = "stopped"
class TaskCreateRequest(BaseModel):
"""Request to create a new task."""
device_id: str = Field(..., description="Target device ID")
task: str = Field(..., description="Task description")
max_steps: int = Field(100, description="Maximum execution steps")
lang: str = Field("cn", description="Language (cn or en)")
# Model config - use dict to avoid validation issues with ModelConfig
base_url: str = Field(
default="http://localhost:8000/v1", description="Model API base URL"
)
model_name: str = Field(default="autoglm-phone-9b", description="Model name")
api_key: str = Field(default="EMPTY", description="API key")
max_tokens: int = Field(default=3000, description="Max tokens per response")
temperature: float = Field(default=0.0, description="Sampling temperature")
top_p: float = Field(default=0.85, description="Top-p sampling parameter")
frequency_penalty: float = Field(default=0.2, description="Frequency penalty")
class TaskSchema(BaseModel):
"""Task schema for API responses."""
task_id: str = Field(..., description="Unique task identifier")
device_id: str = Field(..., description="Target device ID")
task: str = Field(..., description="Task description")
status: TaskStatus = Field(..., description="Task status")
current_step: int = Field(0, description="Current step number")
max_steps: int = Field(100, description="Maximum steps")
current_action: Optional[Dict[str, Any]] = Field(None, description="Current action")
thinking: Optional[str] = Field(None, description="Current thinking/reasoning")
started_at: datetime = Field(..., description="Task start time")
updated_at: datetime = Field(..., description="Last update time")
finished_at: Optional[datetime] = Field(None, description="Task completion time")
error: Optional[str] = Field(None, description="Error message if failed")
completion_message: Optional[str] = Field(
None, description="Task completion message with details"
)
class Config:
json_schema_extra = {
"example": {
"task_id": "task_123456",
"device_id": "emulator-5554",
"task": "Open WeChat",
"status": "running",
"current_step": 3,
"max_steps": 100,
"current_action": {"action": "Tap", "element": "WeChat icon"},
"thinking": "Looking for WeChat icon on home screen",
"started_at": "2024-01-09T10:00:00",
"updated_at": "2024-01-09T10:00:15",
}
}
# For backward compatibility
TaskRequest = TaskCreateRequest

View File

@@ -0,0 +1,70 @@
"""
WebSocket message models for real-time updates.
"""
from datetime import datetime
from enum import Enum
from typing import Any, Dict, Optional
from pydantic import BaseModel, Field
class WSMessageType(str, Enum):
"""WebSocket message types."""
DEVICE_UPDATE = "device_update"
TASK_STARTED = "task_started"
TASK_STEP = "task_step"
TASK_COMPLETED = "task_completed"
TASK_FAILED = "task_failed"
TASK_STOPPED = "task_stopped"
SCREENSHOT = "screenshot"
ERROR = "error"
PING = "ping"
PONG = "pong"
class WSMessage(BaseModel):
"""Base WebSocket message."""
type: WSMessageType = Field(..., description="Message type")
data: Dict[str, Any] = Field(default_factory=dict, description="Message data")
timestamp: datetime = Field(default_factory=datetime.now, description="Message timestamp")
class Config:
json_schema_extra = {
"example": {
"type": "task_step",
"data": {
"task_id": "task_123",
"device_id": "emulator-5554",
"step": 5,
"action": {"action": "Tap"},
"thinking": "Tapping on button",
},
"timestamp": "2024-01-09T10:00:00",
}
}
class StepUpdate(BaseModel):
"""Step update message data."""
task_id: str = Field(..., description="Task ID")
device_id: str = Field(..., description="Device ID")
step: int = Field(..., description="Step number")
action: Optional[Dict[str, Any]] = Field(None, description="Action taken")
thinking: Optional[str] = Field(None, description="AI reasoning")
finished: bool = Field(False, description="Whether task is finished")
success: bool = Field(True, description="Whether step succeeded")
message: Optional[str] = Field(None, description="Status message")
class ScreenshotUpdate(BaseModel):
"""Screenshot update message data."""
device_id: str = Field(..., description="Device ID")
screenshot: str = Field(..., description="Base64 encoded screenshot")
width: int = Field(..., description="Screenshot width")
height: int = Field(..., description="Screenshot height")
timestamp: datetime = Field(default_factory=datetime.now)

View File

@@ -0,0 +1,15 @@
"""
Services for the dashboard.
Includes device management, task execution, and WebSocket management.
"""
from dashboard.services.device_manager import DeviceManager
from dashboard.services.task_executor import TaskExecutor
from dashboard.services.websocket_manager import WebSocketManager
__all__ = [
"DeviceManager",
"TaskExecutor",
"WebSocketManager",
]

View File

@@ -0,0 +1,336 @@
"""
Device manager for handling device pool and connections.
"""
import asyncio
from datetime import datetime
from enum import Enum
from threading import Lock
from typing import Dict, List, Optional
from phone_agent.adb.connection import ADBConnection
from phone_agent.adb.screenshot import get_screenshot as adb_screenshot
from phone_agent.device_factory import (
DeviceFactory,
DeviceType,
get_device_factory,
set_device_type,
)
from phone_agent.hdc.connection import HDCConnection
from phone_agent.hdc.screenshot import get_screenshot as hdc_screenshot
from dashboard.models.device import DeviceInfo, DeviceStatus, DeviceSchema
class DeviceConnectionError(Exception):
"""Raised when device connection fails."""
pass
class DeviceManager:
"""Manage device pool and connections."""
def __init__(self, device_type: str = "adb"):
"""Initialize the device manager.
Args:
device_type: Default device type (adb, hdc, ios)
"""
self._devices: Dict[str, DeviceInfo] = {}
self._device_locks: Dict[str, Lock] = {}
self._device_type = self._parse_device_type(device_type)
self._screenshot_cache: Dict[str, tuple[str, datetime]] = {} # (base64, timestamp)
self._cache_ttl_seconds = 2.0 # Cache screenshots for 2 seconds
# Set the global device type
set_device_type(self._device_type)
def _parse_device_type(self, device_type: str) -> DeviceType:
"""Parse device type string to enum."""
try:
return DeviceType[device_type.upper()]
except KeyError:
return DeviceType.ADB
@property
def factory(self) -> DeviceFactory:
"""Get the device factory instance."""
return get_device_factory()
async def refresh_devices(self) -> List[DeviceInfo]:
"""Scan and return all connected devices.
Returns:
List of device info objects
"""
try:
# Run device listing in thread pool to avoid blocking
loop = asyncio.get_event_loop()
devices = await loop.run_in_executor(None, self.factory.list_devices)
# Update device cache
current_time = datetime.now()
for device in devices:
device_id = device.device_id
# Initialize lock if needed
if device_id not in self._device_locks:
self._device_locks[device_id] = Lock()
# Update or create device info
if device_id in self._devices:
# Update existing device
self._devices[device_id].last_seen = current_time
self._devices[device_id].is_connected = True
self._devices[device_id].status = (
DeviceStatus.BUSY
if not self._is_device_available(device_id)
else DeviceStatus.ONLINE
)
else:
# New device
self._devices[device_id] = DeviceInfo(
device_id=device_id,
status=DeviceStatus.ONLINE,
device_type=self._device_type_to_schema(device.connection_type),
model=device.model,
android_version=device.android_version,
current_app=None,
last_seen=current_time,
is_connected=True,
)
# Mark disconnected devices
connected_ids = {d.device_id for d in devices}
for device_id, device_info in self._devices.items():
if device_id not in connected_ids:
device_info.is_connected = False
device_info.status = DeviceStatus.OFFLINE
return list(self._devices.values())
except Exception as e:
raise DeviceConnectionError(f"Failed to refresh devices: {e}")
async def get_device(self, device_id: str) -> Optional[DeviceInfo]:
"""Get device info by ID.
Args:
device_id: Device identifier
Returns:
Device info or None if not found
"""
return self._devices.get(device_id)
def acquire_device(self, device_id: str) -> bool:
"""Acquire lock on device.
Args:
device_id: Device identifier
Returns:
True if acquired, False if device is busy
"""
if device_id not in self._device_locks:
return False
lock = self._device_locks[device_id]
acquired = lock.acquire(blocking=False)
if acquired:
# Update device status
if device_id in self._devices:
self._devices[device_id].status = DeviceStatus.BUSY
return acquired
def release_device(self, device_id: str):
"""Release device lock.
Args:
device_id: Device identifier
"""
if device_id in self._device_locks:
self._device_locks[device_id].release()
# Update device status
if device_id in self._devices:
device = self._devices[device_id]
if device.is_connected:
device.status = DeviceStatus.ONLINE
else:
device.status = DeviceStatus.OFFLINE
def is_device_available(self, device_id: str) -> bool:
"""Check if device is available for task execution.
Args:
device_id: Device identifier
Returns:
True if available, False if busy or offline
"""
if device_id not in self._devices:
return False
device = self._devices[device_id]
if not device.is_connected:
return False
if device.status == DeviceStatus.BUSY:
return False
return True
def _is_device_available(self, device_id: str) -> bool:
"""Internal check if device lock is available."""
if device_id not in self._device_locks:
return True
lock = self._device_locks[device_id]
return not lock.locked()
async def get_screenshot(self, device_id: str) -> Optional[str]:
"""Get screenshot for device.
Args:
device_id: Device identifier
Returns:
Base64 encoded screenshot or None
"""
# Check cache
if device_id in self._screenshot_cache:
screenshot, timestamp = self._screenshot_cache[device_id]
age = (datetime.now() - timestamp).total_seconds()
if age < self._cache_ttl_seconds:
return screenshot
try:
loop = asyncio.get_event_loop()
if self._device_type == DeviceType.HDC:
result = await loop.run_in_executor(
None, hdc_screenshot, device_id, 10
)
else: # ADB or default
result = await loop.run_in_executor(
None, adb_screenshot, device_id, 10
)
if result:
# Cache the screenshot
self._screenshot_cache[device_id] = (result.base64_data, datetime.now())
return result.base64_data
except Exception:
pass
return None
async def get_current_app(self, device_id: str) -> Optional[str]:
"""Get current app for device.
Args:
device_id: Device identifier
Returns:
Current app package name or None
"""
try:
loop = asyncio.get_event_loop()
app = await loop.run_in_executor(
None, self.factory.get_current_app, device_id
)
# Update device info
if device_id in self._devices and app:
self._devices[device_id].current_app = app
return app
except Exception:
return None
async def connect_device(self, address: str) -> bool:
"""Connect to device via WiFi.
Args:
address: Device address (IP:PORT)
Returns:
True if connected successfully
"""
try:
loop = asyncio.get_event_loop()
if self._device_type == DeviceType.HDC:
conn = HDCConnection()
else:
conn = ADBConnection()
success, _ = await loop.run_in_executor(None, conn.connect, address)
if success:
# Refresh devices after connecting
await self.refresh_devices()
return success
except Exception:
return False
async def disconnect_device(self, address: str) -> bool:
"""Disconnect from device.
Args:
address: Device address (IP:PORT)
Returns:
True if disconnected successfully
"""
try:
loop = asyncio.get_event_loop()
if self._device_type == DeviceType.HDC:
conn = HDCConnection()
else:
conn = ADBConnection()
success, _ = await loop.run_in_executor(None, conn.disconnect, address)
if success:
# Refresh devices after disconnecting
await self.refresh_devices()
return success
except Exception:
return False
def _device_type_to_schema(self, connection_type) -> "DeviceSchema":
"""Convert connection type to DeviceType enum."""
from dashboard.models.device import DeviceType as SchemaDeviceType
# Mapping from phone_agent connection types to schema types
type_map = {
"USB": SchemaDeviceType.ADB,
"WIFI": SchemaDeviceType.ADB,
"REMOTE": SchemaDeviceType.ADB,
}
# Try to get string value from enum
try:
type_str = connection_type.value if hasattr(connection_type, "value") else str(connection_type)
return type_map.get(type_str.upper(), SchemaDeviceType.ADB)
except (AttributeError, KeyError):
return SchemaDeviceType.ADB
def list_all_devices(self) -> List[DeviceInfo]:
"""Get all cached devices.
Returns:
List of all device info
"""
return list(self._devices.values())

View File

@@ -0,0 +1,410 @@
"""
Task executor for running tasks on devices with thread pool.
"""
import asyncio
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor, Future
from copy import deepcopy
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, Optional
from phone_agent import AgentConfig, PhoneAgent
from phone_agent.agent import StepResult
from phone_agent.model import ModelConfig
from dashboard.config import config
from dashboard.models.task import TaskCreateRequest, TaskSchema, TaskStatus
from dashboard.services.device_manager import DeviceManager
from dashboard.services.websocket_manager import WebSocketManager
def _run_async(coro):
"""Run async coroutine from sync context and wait for completion."""
try:
loop = asyncio.get_running_loop()
# We're already in an async context, use run_coroutine_threadsafe
import concurrent.futures
future = asyncio.run_coroutine_threadsafe(coro, loop)
# Wait for completion with timeout to avoid hanging
try:
future.result(timeout=5)
except concurrent.futures.TimeoutError:
pass
except Exception:
pass
except RuntimeError:
# No running loop, use asyncio.run
asyncio.run(coro)
@dataclass
class ActiveTask:
"""An active task being executed."""
task_id: str
device_id: str
task: str
status: TaskStatus
current_step: int
max_steps: int
started_at: datetime
updated_at: datetime
finished_at: Optional[datetime] = None
future: Optional[Future] = None
stop_requested: bool = False
current_action: Optional[dict] = None
thinking: Optional[str] = None
error: Optional[str] = None
completion_message: Optional[str] = None
class TaskExecutor:
"""Execute tasks on devices with thread pool."""
def __init__(
self,
device_manager: DeviceManager,
ws_manager: Optional[WebSocketManager] = None,
max_workers: int = 10,
):
"""Initialize the task executor.
Args:
device_manager: Device manager instance
ws_manager: WebSocket manager for real-time updates (optional)
max_workers: Maximum number of concurrent tasks
"""
self.device_manager = device_manager
self.ws_manager = ws_manager
self.executor = ThreadPoolExecutor(max_workers=max_workers)
self.active_tasks: Dict[str, ActiveTask] = {}
self.task_history: Dict[str, TaskSchema] = {}
self._lock = threading.Lock()
def set_ws_manager(self, ws_manager: WebSocketManager):
"""Set the WebSocket manager.
Args:
ws_manager: WebSocket manager instance
"""
self.ws_manager = ws_manager
async def execute_task(self, request: TaskCreateRequest) -> str:
"""Execute task on device.
Args:
request: Task creation request
Returns:
Task ID
"""
task_id = f"task_{uuid.uuid4().hex[:8]}"
# Create active task
active_task = ActiveTask(
task_id=task_id,
device_id=request.device_id,
task=request.task,
status=TaskStatus.RUNNING,
current_step=0,
max_steps=request.max_steps,
started_at=datetime.now(),
updated_at=datetime.now(),
)
with self._lock:
self.active_tasks[task_id] = active_task
# Notify WebSocket
if self.ws_manager:
await self.ws_manager.broadcast_task_started(task_id, request.dict())
# Submit to thread pool
future = self.executor.submit(
self._run_task,
task_id,
request,
)
active_task.future = future
return task_id
def _run_task(self, task_id: str, request: TaskCreateRequest):
"""Run task in thread pool.
Args:
task_id: Task ID
request: Task creation request
"""
import os
# Get model config from request
model_config = ModelConfig(
base_url=request.base_url,
model_name=request.model_name,
api_key=request.api_key,
max_tokens=request.max_tokens,
temperature=request.temperature,
top_p=request.top_p,
frequency_penalty=request.frequency_penalty,
lang=request.lang,
)
# Get agent config
agent_config = AgentConfig(
max_steps=request.max_steps,
device_id=request.device_id,
lang=request.lang,
step_callback=lambda result: self._step_callback(task_id, result),
before_action_callback=lambda action: self._before_action_callback(
task_id, action
),
)
# Create agent
agent = PhoneAgent(model_config=model_config, agent_config=agent_config)
try:
# Acquire device
if not self.device_manager.acquire_device(request.device_id):
raise Exception(f"Device {request.device_id} is not available")
try:
# Run task
result = agent.run(request.task)
# Update task status
with self._lock:
if task_id in self.active_tasks:
task = self.active_tasks[task_id]
task.status = (
TaskStatus.COMPLETED if result else TaskStatus.FAILED
)
task.finished_at = datetime.now()
task.updated_at = datetime.now()
finally:
# Release device
self.device_manager.release_device(request.device_id)
# Broadcast device status update
if self.ws_manager:
device = self.device_manager._devices.get(request.device_id)
if device:
_run_async(
self.ws_manager.broadcast_device_update(
request.device_id,
{
"status": device.status.value,
"is_connected": device.is_connected,
"current_app": device.current_app,
},
)
)
# Notify completion
if self.ws_manager:
with self._lock:
task = self.active_tasks.get(task_id)
task_status = task.status if task else TaskStatus.COMPLETED
message = result.message if hasattr(result, "message") else None
_run_async(
self.ws_manager.broadcast_task_completed(
task_id,
request.device_id,
task_status,
message,
)
)
except Exception as e:
# Release device on error
self.device_manager.release_device(request.device_id)
# Broadcast device status update
if self.ws_manager:
device = self.device_manager._devices.get(request.device_id)
if device:
_run_async(
self.ws_manager.broadcast_device_update(
request.device_id,
{
"status": device.status.value,
"is_connected": device.is_connected,
},
)
)
# Update task status
with self._lock:
if task_id in self.active_tasks:
task = self.active_tasks[task_id]
task.status = TaskStatus.FAILED
task.error = str(e)
task.finished_at = datetime.now()
task.updated_at = datetime.now()
# Notify error
if self.ws_manager:
_run_async(
self.ws_manager.broadcast_task_failed(
task_id, request.device_id, str(e)
)
)
finally:
# Move to history
with self._lock:
if task_id in self.active_tasks:
active_task = self.active_tasks.pop(task_id)
task_schema = self._active_task_to_schema(active_task)
self.task_history[task_id] = task_schema
# Trim history
if len(self.task_history) > config.MAX_TASK_HISTORY:
oldest = min(self.task_history.items(), key=lambda x: x[1].started_at)
del self.task_history[oldest[0]]
def _step_callback(
self, task_id: str, result: StepResult
) -> Optional[str]:
"""Callback after each step.
Args:
task_id: Task ID
result: Step result
Returns:
"stop" to interrupt, new task to switch, None to continue
"""
with self._lock:
if task_id not in self.active_tasks:
return None
task = self.active_tasks[task_id]
# Check if stop was requested
if task.stop_requested:
return "stop"
# Update task state
task.current_step = result.step_count
task.updated_at = datetime.now()
task.thinking = result.thinking
task.current_action = result.action
# Store completion message when finished
if result.finished and result.message:
task.completion_message = result.message
# Notify WebSocket
if self.ws_manager:
_run_async(
self.ws_manager.broadcast_step_update(
task_id=task_id,
device_id=task.device_id,
step=result.step_count,
action=result.action,
thinking=result.thinking,
finished=result.finished,
success=result.success,
message=result.message,
)
)
return None
def _before_action_callback(self, task_id: str, action: dict) -> Optional[dict]:
"""Callback before executing action.
Args:
task_id: Task ID
action: Action to execute
Returns:
Modified action or None to proceed as-is
"""
# Can be used for action validation/logging
return None
async def stop_task(self, task_id: str):
"""Stop running task.
Args:
task_id: Task ID
"""
with self._lock:
if task_id not in self.active_tasks:
return
task = self.active_tasks[task_id]
task.stop_requested = True
async def get_task_status(self, task_id: str) -> Optional[TaskSchema]:
"""Get task status.
Args:
task_id: Task ID
Returns:
Task schema or None if not found
"""
with self._lock:
if task_id in self.active_tasks:
return self._active_task_to_schema(self.active_tasks[task_id])
elif task_id in self.task_history:
return self.task_history[task_id]
return None
async def list_tasks(self) -> list[TaskSchema]:
"""List all tasks (active and recent).
Returns:
List of task schemas
"""
with self._lock:
active_schemas = [
self._active_task_to_schema(t) for t in self.active_tasks.values()
]
history_schemas = list(self.task_history.values())
return active_schemas + history_schemas
def _active_task_to_schema(self, active_task: ActiveTask) -> TaskSchema:
"""Convert ActiveTask to TaskSchema.
Args:
active_task: Active task
Returns:
Task schema
"""
return TaskSchema(
task_id=active_task.task_id,
device_id=active_task.device_id,
task=active_task.task,
status=active_task.status,
current_step=active_task.current_step,
max_steps=active_task.max_steps,
current_action=active_task.current_action,
thinking=active_task.thinking,
started_at=active_task.started_at,
updated_at=active_task.updated_at,
finished_at=active_task.finished_at,
error=active_task.error,
completion_message=active_task.completion_message,
)
def get_active_task_count(self) -> int:
"""Get number of active tasks.
Returns:
Active task count
"""
with self._lock:
return len(self.active_tasks)

View File

@@ -0,0 +1,338 @@
"""
WebSocket manager for real-time updates.
"""
import asyncio
import json
import uuid
from datetime import datetime
from typing import Any, Dict, Optional
from fastapi import WebSocket
from dashboard.models.ws_messages import (
ScreenshotUpdate,
WSMessage,
WSMessageType,
)
class WebSocketManager:
"""Manage WebSocket connections for real-time updates."""
def __init__(self):
"""Initialize the WebSocket manager."""
self.active_connections: Dict[str, WebSocket] = {}
self.client_subscriptions: Dict[str, set[str]] = {} # client_id -> set of device_ids
async def connect(self, websocket: WebSocket, client_id: Optional[str] = None) -> str:
"""Accept and store connection.
Args:
websocket: WebSocket connection
client_id: Optional client ID (auto-generated if not provided)
Returns:
Client ID
"""
await websocket.accept()
if client_id is None:
client_id = f"client_{uuid.uuid4().hex[:8]}"
self.active_connections[client_id] = websocket
self.client_subscriptions[client_id] = set()
return client_id
def disconnect(self, client_id: str):
"""Remove connection.
Args:
client_id: Client ID
"""
if client_id in self.active_connections:
del self.active_connections[client_id]
if client_id in self.client_subscriptions:
del self.client_subscriptions[client_id]
async def send_to_client(
self, client_id: str, message: WSMessage | Dict[str, Any]
):
"""Send message to specific client.
Args:
client_id: Client ID
message: Message to send
"""
if client_id not in self.active_connections:
return
websocket = self.active_connections[client_id]
# Convert dict to WSMessage if needed
if isinstance(message, dict):
message = WSMessage(
type=WSMessageType(message.get("type", "error")),
data=message.get("data", {}),
timestamp=message.get("timestamp", datetime.now()),
)
try:
await websocket.send_json(message.model_dump(mode="json"))
except Exception:
# Connection may be closed
self.disconnect(client_id)
async def broadcast(self, message: WSMessage | Dict[str, Any]):
"""Broadcast message to all connected clients.
Args:
message: Message to broadcast
"""
# Convert dict to WSMessage if needed
if isinstance(message, dict):
message = WSMessage(
type=WSMessageType(message.get("type", "error")),
data=message.get("data", {}),
timestamp=message.get("timestamp", datetime.now()),
)
# Create list of clients to avoid modification during iteration
clients = list(self.active_connections.items())
for client_id, websocket in clients:
try:
await websocket.send_json(message.model_dump(mode="json"))
except Exception:
self.disconnect(client_id)
async def broadcast_to_device_subscribers(
self, device_id: str, message: WSMessage | Dict[str, Any]
):
"""Broadcast message to clients subscribed to a device.
Args:
device_id: Device ID
message: Message to broadcast
"""
# Convert dict to WSMessage if needed
if isinstance(message, dict):
message = WSMessage(
type=WSMessageType(message.get("type", "error")),
data=message.get("data", {}),
timestamp=message.get("timestamp", datetime.now()),
)
# Find clients subscribed to this device
for client_id, subscriptions in self.client_subscriptions.items():
if device_id in subscriptions or "*" in subscriptions:
await self.send_to_client(client_id, message)
def subscribe_to_device(self, client_id: str, device_id: str):
"""Subscribe client to device updates.
Args:
client_id: Client ID
device_id: Device ID (use "*" for all devices)
"""
if client_id in self.client_subscriptions:
self.client_subscriptions[client_id].add(device_id)
def unsubscribe_from_device(self, client_id: str, device_id: str):
"""Unsubscribe client from device updates.
Args:
client_id: Client ID
device_id: Device ID
"""
if client_id in self.client_subscriptions:
self.client_subscriptions[client_id].discard(device_id)
# Convenience methods for specific message types
async def broadcast_device_update(self, device_id: str, device_data: Dict[str, Any]):
"""Broadcast device update.
Args:
device_id: Device ID
device_data: Device data
"""
await self.broadcast_to_device_subscribers(
device_id,
WSMessage(
type=WSMessageType.DEVICE_UPDATE,
data={"device_id": device_id, **device_data},
),
)
async def broadcast_task_started(self, task_id: str, task_data: Dict[str, Any]):
"""Broadcast task started.
Args:
task_id: Task ID
task_data: Task data
"""
await self.broadcast(
WSMessage(
type=WSMessageType.TASK_STARTED,
data={"task_id": task_id, **task_data},
)
)
async def broadcast_step_update(
self,
task_id: str,
device_id: str,
step: int,
action: Optional[Dict] = None,
thinking: Optional[str] = None,
finished: bool = False,
success: bool = True,
message: Optional[str] = None,
):
"""Broadcast task step update.
Args:
task_id: Task ID
device_id: Device ID
step: Step number
action: Current action
thinking: AI reasoning
finished: Whether task is finished
success: Whether step succeeded
message: Status message
"""
await self.broadcast_to_device_subscribers(
device_id,
WSMessage(
type=WSMessageType.TASK_STEP,
data={
"task_id": task_id,
"device_id": device_id,
"step": step,
"action": action,
"thinking": thinking,
"finished": finished,
"success": success,
"message": message,
},
),
)
async def broadcast_task_completed(
self, task_id: str, device_id: str, status: str, message: Optional[str] = None
):
"""Broadcast task completed.
Args:
task_id: Task ID
device_id: Device ID
status: Task status
message: Completion message
"""
await self.broadcast(
WSMessage(
type=WSMessageType.TASK_COMPLETED,
data={
"task_id": task_id,
"device_id": device_id,
"status": status,
"message": message,
},
)
)
async def broadcast_task_failed(self, task_id: str, device_id: str, error: str):
"""Broadcast task failed.
Args:
task_id: Task ID
device_id: Device ID
error: Error message
"""
await self.broadcast(
WSMessage(
type=WSMessageType.TASK_FAILED,
data={
"task_id": task_id,
"device_id": device_id,
"error": error,
},
)
)
async def broadcast_task_stopped(self, task_id: str, device_id: str):
"""Broadcast task stopped.
Args:
task_id: Task ID
device_id: Device ID
"""
await self.broadcast(
WSMessage(
type=WSMessageType.TASK_STOPPED,
data={
"task_id": task_id,
"device_id": device_id,
},
)
)
async def broadcast_screenshot(
self, device_id: str, screenshot: str, width: int, height: int
):
"""Broadcast screenshot update.
Args:
device_id: Device ID
screenshot: Base64 encoded screenshot
width: Image width
height: Image height
"""
await self.broadcast_to_device_subscribers(
device_id,
WSMessage(
type=WSMessageType.SCREENSHOT,
data={
"device_id": device_id,
"screenshot": screenshot,
"width": width,
"height": height,
},
),
)
async def broadcast_error(self, error: str, details: Optional[Dict] = None):
"""Broadcast error.
Args:
error: Error message
details: Additional error details
"""
await self.broadcast(
WSMessage(
type=WSMessageType.ERROR,
data={
"error": error,
"details": details,
},
)
)
def get_connection_count(self) -> int:
"""Get number of active connections.
Returns:
Connection count
"""
return len(self.active_connections)
def get_client_ids(self) -> list[str]:
"""Get list of connected client IDs.
Returns:
List of client IDs
"""
return list(self.active_connections.keys())

View File

@@ -0,0 +1,539 @@
/* AutoGLM Dashboard Styles */
:root {
--primary-color: #6366f1;
--primary-hover: #4f46e5;
--success-color: #10b981;
--warning-color: #f59e0b;
--danger-color: #ef4444;
--bg-color: #0f172a;
--card-bg: #1e293b;
--border-color: #334155;
--text-primary: #f1f5f9;
--text-secondary: #94a3b8;
--shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.3);
--shadow-lg: 0 10px 15px -3px rgba(0, 0, 0, 0.4);
}
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
background-color: var(--bg-color);
color: var(--text-primary);
line-height: 1.6;
min-height: 100vh;
}
/* Header */
.header {
background-color: var(--card-bg);
border-bottom: 1px solid var(--border-color);
padding: 1rem 2rem;
display: flex;
justify-content: space-between;
align-items: center;
position: sticky;
top: 0;
z-index: 100;
box-shadow: var(--shadow);
}
.header-content {
display: flex;
align-items: center;
gap: 2rem;
}
.header h1 {
font-size: 1.5rem;
font-weight: 600;
color: var(--text-primary);
}
.stats {
display: flex;
gap: 1.5rem;
}
.stat {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 0.875rem;
color: var(--text-secondary);
}
.stat svg {
opacity: 0.7;
}
.ws-status {
color: var(--danger-color);
}
.ws-status.connected {
color: var(--success-color);
}
.header-actions {
display: flex;
gap: 0.75rem;
}
/* Main Content */
.main-content {
padding: 2rem;
max-width: 1600px;
margin: 0 auto;
}
.main-content h2 {
font-size: 1.25rem;
font-weight: 600;
margin-bottom: 1rem;
color: var(--text-primary);
}
/* Device Grid */
.device-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(350px, 1fr));
gap: 1.5rem;
margin-bottom: 3rem;
}
.device-card {
background-color: var(--card-bg);
border: 1px solid var(--border-color);
border-radius: 12px;
padding: 1.25rem;
transition: box-shadow 0.2s, border-color 0.2s;
}
.device-card:hover {
box-shadow: var(--shadow-lg);
}
.device-card.busy {
border-color: var(--warning-color);
}
.device-card.offline {
opacity: 0.6;
}
.device-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 1rem;
}
.device-header h3 {
font-size: 1rem;
font-weight: 600;
color: var(--text-primary);
word-break: break-all;
}
.status-badge {
padding: 0.25rem 0.75rem;
border-radius: 9999px;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
}
.status-badge.online {
background-color: rgba(16, 185, 129, 0.2);
color: var(--success-color);
}
.status-badge.offline {
background-color: rgba(239, 68, 68, 0.2);
color: var(--danger-color);
}
.status-badge.busy {
background-color: rgba(245, 158, 11, 0.2);
color: var(--warning-color);
}
.device-info {
margin-bottom: 1rem;
}
.device-info p {
font-size: 0.875rem;
color: var(--text-secondary);
margin-bottom: 0.25rem;
display: flex;
align-items: center;
gap: 0.5rem;
}
.device-info p svg {
flex-shrink: 0;
}
.screenshot {
margin-bottom: 1rem;
border-radius: 8px;
overflow: hidden;
background-color: #000;
aspect-ratio: 9/16;
}
.screenshot img {
width: 100%;
height: 100%;
object-fit: contain;
}
.device-actions {
display: flex;
gap: 0.5rem;
}
.device-actions input {
flex: 1;
padding: 0.625rem 0.875rem;
background-color: var(--bg-color);
border: 1px solid var(--border-color);
border-radius: 8px;
color: var(--text-primary);
font-size: 0.875rem;
}
.device-actions input:focus {
outline: none;
border-color: var(--primary-color);
}
.device-actions input:disabled {
opacity: 0.5;
cursor: not-allowed;
}
/* Task List */
.task-list {
display: flex;
flex-direction: column;
gap: 0.75rem;
}
.task-item {
background-color: var(--card-bg);
border: 1px solid var(--border-color);
border-radius: 8px;
padding: 1rem;
display: flex;
justify-content: space-between;
align-items: start;
gap: 1rem;
transition: border-color 0.2s;
}
.task-item.active {
border-color: var(--primary-color);
box-shadow: 0 0 0 1px var(--primary-color);
}
.task-info {
flex: 1;
}
.task-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 0.5rem;
}
.task-id {
font-size: 0.75rem;
color: var(--text-secondary);
font-family: monospace;
}
.task-status {
padding: 0.125rem 0.5rem;
border-radius: 4px;
font-size: 0.7rem;
font-weight: 600;
text-transform: uppercase;
}
.task-status.running {
background-color: rgba(99, 102, 241, 0.2);
color: var(--primary-color);
}
.task-status.completed {
background-color: rgba(16, 185, 129, 0.2);
color: var(--success-color);
}
.task-status.failed {
background-color: rgba(239, 68, 68, 0.2);
color: var(--danger-color);
}
.task-status.stopped {
background-color: rgba(148, 163, 184, 0.2);
color: var(--text-secondary);
}
.task-description {
font-size: 0.95rem;
color: var(--text-primary);
margin-bottom: 0.5rem;
}
.task-meta {
display: flex;
gap: 1rem;
font-size: 0.8rem;
color: var(--text-secondary);
}
.task-progress {
height: 4px;
background-color: var(--bg-color);
border-radius: 2px;
overflow: hidden;
margin-top: 0.5rem;
}
.progress-bar {
height: 100%;
background-color: var(--primary-color);
transition: width 0.3s ease;
}
.task-thinking {
display: flex;
align-items: start;
gap: 0.5rem;
margin-top: 0.5rem;
padding: 0.5rem;
background-color: var(--bg-color);
border-radius: 6px;
font-size: 0.85rem;
color: var(--text-secondary);
}
.task-thinking svg {
flex-shrink: 0;
margin-top: 2px;
}
.task-action {
display: flex;
align-items: center;
gap: 0.5rem;
margin-top: 0.5rem;
padding: 0.5rem;
background-color: rgba(99, 102, 241, 0.1);
border-radius: 6px;
font-size: 0.85rem;
color: var(--primary-color);
}
.task-action svg {
flex-shrink: 0;
}
.task-action strong {
color: var(--text-primary);
}
.task-message {
display: flex;
align-items: start;
gap: 0.5rem;
margin-top: 0.5rem;
padding: 0.5rem;
border-radius: 6px;
font-size: 0.85rem;
line-height: 1.4;
}
.task-message svg {
flex-shrink: 0;
margin-top: 2px;
}
.task-message.completed {
background-color: rgba(16, 185, 129, 0.1);
color: var(--success-color);
}
.task-message.failed {
background-color: rgba(239, 68, 68, 0.1);
color: var(--danger-color);
}
.task-message.stopped {
background-color: rgba(148, 163, 184, 0.1);
color: var(--text-secondary);
}
.task-actions {
flex-shrink: 0;
}
/* Buttons */
.btn {
padding: 0.625rem 1rem;
border: none;
border-radius: 8px;
font-size: 0.875rem;
font-weight: 500;
cursor: pointer;
transition: all 0.2s;
display: inline-flex;
align-items: center;
gap: 0.5rem;
}
.btn:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.btn-primary {
background-color: var(--primary-color);
color: white;
}
.btn-primary:hover:not(:disabled) {
background-color: var(--primary-hover);
}
.btn-secondary {
background-color: var(--card-bg);
color: var(--text-primary);
border: 1px solid var(--border-color);
}
.btn-secondary:hover:not(:disabled) {
background-color: var(--border-color);
}
.btn-danger {
background-color: var(--danger-color);
color: white;
}
.btn-danger:hover:not(:disabled) {
background-color: #dc2626;
}
.btn-sm {
padding: 0.5rem 0.75rem;
font-size: 0.8rem;
}
.spinning {
animation: spin 1s linear infinite;
}
@keyframes spin {
from { transform: rotate(0deg); }
to { transform: rotate(360deg); }
}
/* Empty State */
.empty-state {
text-align: center;
padding: 4rem 2rem;
color: var(--text-secondary);
}
.empty-state svg {
margin-bottom: 1rem;
opacity: 0.3;
}
.empty-state p {
margin-bottom: 1.5rem;
}
.empty-state .hint {
font-size: 0.875rem;
opacity: 0.7;
}
/* Toast */
.toast-container {
position: fixed;
bottom: 2rem;
right: 2rem;
z-index: 1000;
display: flex;
flex-direction: column;
gap: 0.5rem;
}
.toast {
padding: 0.75rem 1rem;
border-radius: 8px;
font-size: 0.875rem;
box-shadow: var(--shadow-lg);
animation: slideIn 0.3s ease;
}
.toast.success {
background-color: var(--success-color);
color: white;
}
.toast.error {
background-color: var(--danger-color);
color: white;
}
.toast.info {
background-color: var(--primary-color);
color: white;
}
@keyframes slideIn {
from {
transform: translateX(100%);
opacity: 0;
}
to {
transform: translateX(0);
opacity: 1;
}
}
/* Responsive */
@media (max-width: 768px) {
.header {
flex-direction: column;
gap: 1rem;
padding: 1rem;
}
.header-content {
flex-direction: column;
gap: 0.75rem;
align-items: start;
}
.device-grid {
grid-template-columns: 1fr;
}
.main-content {
padding: 1rem;
}
}

219
dashboard/static/index.html Normal file
View File

@@ -0,0 +1,219 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AutoGLM Dashboard</title>
<!-- Vue.js 3 -->
<script src="https://unpkg.com/vue@3/dist/vue.global.js"></script>
<!-- Axios for API requests -->
<script src="https://unpkg.com/axios/dist/axios.min.js"></script>
<!-- CSS -->
<link rel="stylesheet" href="/static/css/dashboard.css">
</head>
<body>
<div id="app">
<!-- Header -->
<header class="header">
<div class="header-content">
<h1>AutoGLM Dashboard</h1>
<div class="stats">
<span class="stat" title="Connected Devices">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<rect x="5" y="2" width="14" height="20" rx="2" ry="2"></rect>
<line x1="12" y1="18" x2="12" y2="18"></line>
</svg>
{{ connectedDeviceCount }} / {{ devices.length }}
</span>
<span class="stat" title="Active Tasks">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<polyline points="22 12 18 12 15 21 9 3 6 12 2 12"></polyline>
</svg>
{{ activeTasks.length }}
</span>
<span class="stat ws-status" :class="{ connected: wsConnected }" title="WebSocket">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<circle cx="12" cy="12" r="10"></circle>
<line x1="2" y1="12" x2="22" y2="12"></line>
<path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"></path>
</svg>
</span>
</div>
</div>
<div class="header-actions">
<button @click="refreshDevices" class="btn btn-secondary" :disabled="refreshing">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" :class="{ spinning: refreshing }">
<polyline points="23 4 23 10 17 10"></polyline>
<polyline points="1 20 1 14 7 14"></polyline>
<path d="M3.51 9a9 9 0 0 1 14.85-3.36L23 10M1 14l4.64 4.36A9 9 0 0 0 20.49 15"></path>
</svg>
Refresh
</button>
</div>
</header>
<!-- Main Content -->
<main class="main-content">
<!-- Device Grid -->
<section class="devices-section">
<h2>Devices</h2>
<div class="device-grid" v-if="devices.length > 0">
<div
class="device-card"
v-for="device in devices"
:key="device.device_id"
:class="{ busy: device.status === 'busy', offline: !device.is_connected }"
>
<div class="device-header">
<h3>{{ device.device_id }}</h3>
<span class="status-badge" :class="device.status">
{{ device.status }}
</span>
</div>
<div class="device-info">
<p v-if="device.model">{{ device.model }}</p>
<p v-if="device.android_version" class="version">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M12 2L2 7l10 5 10-5-10-5z"></path>
<path d="M2 17l10 5 10-5"></path>
<path d="M2 12l10 5 10-5"></path>
</svg>
{{ device.android_version }}
</p>
<p v-if="device.current_app" class="app">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<rect x="3" y="3" width="18" height="18" rx="2"></rect>
</svg>
{{ formatAppName(device.current_app) }}
</p>
</div>
<div class="screenshot" v-if="device.screenshot">
<img :src="'data:image/png;base64,' + device.screenshot" alt="Screen">
</div>
<div class="device-actions">
<input
v-model="device.taskInput"
type="text"
placeholder="Enter task..."
@keyup.enter="executeTask(device)"
:disabled="device.status === 'busy' || !device.is_connected"
>
<button
@click="executeTask(device)"
class="btn btn-primary"
:disabled="device.status === 'busy' || !device.is_connected || !device.taskInput"
>
{{ device.status === 'busy' ? 'Running...' : 'Execute' }}
</button>
</div>
</div>
</div>
<div class="empty-state" v-else>
<svg width="64" height="64" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1">
<rect x="5" y="2" width="14" height="20" rx="2" ry="2"></rect>
<line x1="12" y1="18" x2="12" y2="18"></line>
</svg>
<p>No devices found</p>
<button @click="refreshDevices" class="btn btn-secondary">Scan for Devices</button>
</div>
</section>
<!-- Task Queue -->
<section class="tasks-section">
<h2>Task Queue</h2>
<div class="task-list" v-if="tasks.length > 0">
<div
class="task-item"
v-for="task in tasks"
:key="task.task_id"
:class="{ active: task.status === 'running' }"
>
<div class="task-info">
<div class="task-header">
<span class="task-id">{{ task.task_id }}</span>
<span class="task-status" :class="task.status">{{ task.status }}</span>
</div>
<p class="task-description">{{ task.task }}</p>
<div class="task-meta">
<span>Device: {{ task.device_id }}</span>
<span>Step: {{ task.current_step }}/{{ task.max_steps }}</span>
</div>
<div class="task-progress" v-if="task.status === 'running'">
<div class="progress-bar" :style="{ width: (task.current_step / task.max_steps * 100) + '%' }"></div>
</div>
<!-- Current Action -->
<div class="task-action" v-if="task.current_action && task.status === 'running'">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<polyline points="22 12 18 12 15 21 9 3 6 12 2 12"></polyline>
</svg>
<strong>{{ task.current_action.action }}</strong>
<span v-if="task.current_action.element"> on {{ task.current_action.element }}</span>
<span v-if="task.current_action.text">: "{{ task.current_action.text }}"</span>
</div>
<!-- Thinking -->
<div class="task-thinking" v-if="task.thinking">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<circle cx="12" cy="12" r="10"></circle>
<line x1="12" y1="16" x2="12" y2="12"></line>
<line x1="12" y1="8" x2="12.01" y2="8"></line>
</svg>
{{ task.thinking }}
</div>
<!-- Completion Message -->
<div class="task-message" v-if="task.completion_message && (task.status === 'completed' || task.status === 'failed' || task.status === 'stopped')">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path v-if="task.status === 'completed'" d="M22 11.08V12a10 10 0 1 1-5.93-9.14"></path>
<polyline v-if="task.status === 'completed'" points="22 4 12 14.01 9 11.01"></polyline>
<circle v-if="task.status === 'failed'" cx="12" cy="12" r="10"></circle>
<line v-if="task.status === 'failed'" x1="15" y1="9" x2="9" y2="15"></line>
<line v-if="task.status === 'failed'" x1="9" y1="9" x2="15" y2="15"></line>
</svg>
{{ task.completion_message }}
</div>
</div>
<div class="task-actions" v-if="task.status === 'running'">
<button @click="stopTask(task)" class="btn btn-danger btn-sm">
<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<rect x="6" y="6" width="12" height="12"></rect>
</svg>
Stop
</button>
</div>
<div class="task-actions" v-else>
<button @click="reExecuteTask(task)" class="btn btn-secondary btn-sm">
<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<polyline points="23 4 23 10 17 10"></polyline>
<polyline points="1 20 1 14 7 14"></polyline>
<path d="M3.51 9a9 9 0 0 1 14.85-3.36L23 10M1 14l4.64 4.36A9 9 0 0 0 20.49 15"></path>
</svg>
Re-execute
</button>
</div>
</div>
</div>
<div class="empty-state" v-else>
<svg width="64" height="64" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1">
<polyline points="22 12 18 12 15 21 9 3 6 12 2 12"></polyline>
</svg>
<p>No tasks yet</p>
<p class="hint">Enter a task above to get started</p>
</div>
</section>
</main>
<!-- Toast notifications -->
<div class="toast-container">
<div
class="toast"
v-for="toast in toasts"
:key="toast.id"
:class="toast.type"
>
{{ toast.message }}
</div>
</div>
</div>
<script src="/static/js/dashboard.js"></script>
</body>
</html>

View File

@@ -0,0 +1,404 @@
/**
* AutoGLM Dashboard - Vue.js Application
*/
const { createApp } = Vue;
createApp({
data() {
return {
devices: [],
tasks: [],
ws: null,
wsConnected: false,
refreshing: false,
toasts: [],
toastIdCounter: 0,
reconnectAttempts: 0,
maxReconnectAttempts: 5,
reconnectDelay: 2000,
};
},
computed: {
activeTasks() {
return this.tasks.filter(t => t.status === 'running');
},
connectedDeviceCount() {
return this.devices.filter(d => d.is_connected).length;
},
},
methods: {
/* API Methods */
async loadDevices() {
try {
const response = await axios.get('/api/devices');
this.devices = response.data.map(d => ({
...d,
taskInput: '',
screenshot: null,
}));
} catch (error) {
this.showToast('Failed to load devices', 'error');
console.error('Error loading devices:', error);
}
},
async loadTasks() {
try {
const response = await axios.get('/api/tasks');
this.tasks = response.data;
} catch (error) {
console.error('Error loading tasks:', error);
}
},
async refreshDevices() {
this.refreshing = true;
try {
await axios.get('/api/devices/refresh');
await this.loadDevices();
this.showToast('Devices refreshed', 'success');
} catch (error) {
this.showToast('Failed to refresh devices', 'error');
} finally {
this.refreshing = false;
}
},
async executeTask(device) {
if (!device.taskInput || !device.taskInput.trim()) {
return;
}
const task = device.taskInput.trim();
device.taskInput = '';
try {
const response = await axios.post('/api/tasks/execute', {
device_id: device.device_id,
task: task,
max_steps: 100,
lang: 'cn',
// Backend will use config from environment
});
this.showToast(`Task started: ${task}`, 'success');
await this.loadTasks();
} catch (error) {
this.showToast('Failed to start task', 'error');
console.error('Error executing task:', error);
}
},
async stopTask(task) {
try {
await axios.post(`/api/tasks/${task.task_id}/stop`);
this.showToast('Stopping task...', 'info');
} catch (error) {
this.showToast('Failed to stop task', 'error');
}
},
async reExecuteTask(task) {
const device = this.devices.find(d => d.device_id === task.device_id);
if (!device) {
this.showToast('Device not found', 'error');
return;
}
if (!device.is_connected) {
this.showToast('Device is not connected', 'error');
return;
}
if (device.status === 'busy') {
this.showToast('Device is busy', 'error');
return;
}
try {
const response = await axios.post('/api/tasks/execute', {
device_id: task.device_id,
task: task.task,
max_steps: task.max_steps || 100,
lang: 'cn',
});
this.showToast(`Task re-executed: ${task.task}`, 'success');
await this.loadTasks();
} catch (error) {
this.showToast('Failed to re-execute task', 'error');
console.error('Error re-executing task:', error);
}
},
async captureScreenshot(deviceId) {
try {
const response = await axios.get(`/api/devices/${deviceId}/screenshot`);
const device = this.devices.find(d => d.device_id === deviceId);
if (device) {
device.screenshot = response.data.screenshot;
}
} catch (error) {
console.error('Error capturing screenshot:', error);
}
},
/* WebSocket Methods */
connectWebSocket() {
const protocol = location.protocol === 'https:' ? 'wss:' : 'ws:';
const wsUrl = `${protocol}//${location.host}/ws`;
this.ws = new WebSocket(wsUrl);
this.ws.onopen = () => {
this.wsConnected = true;
this.reconnectAttempts = 0;
console.log('WebSocket connected');
// Subscribe to all devices updates
this.sendWebSocketMessage({
type: 'subscribe',
device_id: '*', // Subscribe to all devices
});
};
this.ws.onmessage = (event) => {
try {
const msg = JSON.parse(event.data);
this.handleWSMessage(msg);
} catch (error) {
console.error('Error parsing WebSocket message:', error);
}
};
this.ws.onclose = () => {
this.wsConnected = false;
console.log('WebSocket disconnected');
// Attempt to reconnect
if (this.reconnectAttempts < this.maxReconnectAttempts) {
this.reconnectAttempts++;
console.log(`Reconnecting... (${this.reconnectAttempts}/${this.maxReconnectAttempts})`);
setTimeout(() => this.connectWebSocket(), this.reconnectDelay);
} else {
this.showToast('WebSocket connection lost', 'error');
}
};
this.ws.onerror = (error) => {
console.error('WebSocket error:', error);
};
},
sendWebSocketMessage(data) {
if (this.ws && this.ws.readyState === WebSocket.OPEN) {
this.ws.send(JSON.stringify(data));
}
},
handleWSMessage(msg) {
switch (msg.type) {
case 'device_update':
this.updateDevice(msg.data);
break;
case 'task_started':
this.handleTaskStarted(msg.data);
break;
case 'task_step':
this.handleTaskStep(msg.data);
break;
case 'task_completed':
this.handleTaskCompleted(msg.data);
break;
case 'task_failed':
this.handleTaskFailed(msg.data);
break;
case 'task_stopped':
this.handleTaskStopped(msg.data);
break;
case 'screenshot':
this.handleScreenshot(msg.data);
break;
case 'error':
this.showToast(msg.data.error || 'An error occurred', 'error');
break;
case 'pong':
// Ping response, ignore
break;
default:
console.log('Unknown WebSocket message type:', msg.type);
}
},
/* Message Handlers */
updateDevice(data) {
const device = this.devices.find(d => d.device_id === data.device_id);
if (device) {
Object.assign(device, {
status: data.status,
is_connected: data.is_connected,
model: data.model,
current_app: data.current_app,
});
}
},
handleTaskStarted(data) {
this.tasks.unshift({
task_id: data.task_id,
device_id: data.device_id,
task: data.task,
status: 'running',
current_step: 0,
max_steps: data.max_steps || 100,
current_action: null,
thinking: null,
started_at: new Date(),
});
// Update device status
const device = this.devices.find(d => d.device_id === data.device_id);
if (device) {
device.status = 'busy';
}
},
handleTaskStep(data) {
const task = this.tasks.find(t => t.task_id === data.task_id);
if (task) {
task.current_step = data.step;
task.current_action = data.action;
task.thinking = data.thinking;
// Store the completion message
if (data.message && data.finished) {
task.completion_message = data.message;
}
if (data.finished) {
task.status = data.success ? 'completed' : 'failed';
this.releaseDevice(data.device_id);
}
}
},
handleTaskCompleted(data) {
const task = this.tasks.find(t => t.task_id === data.task_id);
if (task) {
task.status = 'completed';
task.finished_at = new Date();
// Store completion message
if (data.message) {
task.completion_message = data.message;
}
this.releaseDevice(data.device_id);
this.showToast(`Task completed: ${task.task}`, 'success');
}
},
handleTaskFailed(data) {
const task = this.tasks.find(t => t.task_id === data.task_id);
if (task) {
task.status = 'failed';
task.error = data.error;
task.finished_at = new Date();
this.releaseDevice(data.device_id);
this.showToast(`Task failed: ${data.error}`, 'error');
}
},
handleTaskStopped(data) {
const task = this.tasks.find(t => t.task_id === data.task_id);
if (task) {
task.status = 'stopped';
task.finished_at = new Date();
this.releaseDevice(data.device_id);
this.showToast('Task stopped', 'info');
}
},
handleScreenshot(data) {
const device = this.devices.find(d => d.device_id === data.device_id);
if (device) {
device.screenshot = data.screenshot;
}
},
releaseDevice(deviceId) {
const device = this.devices.find(d => d.device_id === deviceId);
if (device && device.status === 'busy') {
device.status = 'online';
}
},
/* Utility Methods */
formatAppName(packageName) {
if (!packageName) return '';
// Format package name for display
const parts = packageName.split('.');
return parts[parts.length - 1] || packageName;
},
showToast(message, type = 'info') {
const id = this.toastIdCounter++;
this.toasts.push({ id, message, type });
// Auto-remove after 3 seconds
setTimeout(() => {
this.toasts = this.toasts.filter(t => t.id !== id);
}, 3000);
},
/* Heartbeat */
startHeartbeat() {
setInterval(() => {
if (this.ws && this.ws.readyState === WebSocket.OPEN) {
this.sendWebSocketMessage({ type: 'ping' });
}
}, 30000); // Ping every 30 seconds
},
},
mounted() {
// Load initial data
this.loadDevices();
this.loadTasks();
// Connect WebSocket
this.connectWebSocket();
// Start heartbeat
this.startHeartbeat();
// Refresh devices periodically
setInterval(() => {
if (!this.refreshing) {
this.loadDevices();
}
}, 10000); // Every 10 seconds
// Refresh tasks periodically
setInterval(() => {
this.loadTasks();
}, 5000); // Every 5 seconds
},
beforeUnmount() {
// Close WebSocket on unmount
if (this.ws) {
this.ws.close();
}
},
}).mount('#app');

415
examples/callback_hooks.py Normal file
View File

@@ -0,0 +1,415 @@
#!/usr/bin/env python3
"""
Phone Agent Callback Hooks Examples / Phone Agent 回调钩子示例
Demonstrates how to use callback hooks to control task execution.
演示如何使用回调钩子来控制任务执行。
Features / 功能:
- Interrupt tasks during execution / 在执行过程中中断任务
- Switch to new tasks dynamically / 动态切换到新任务
- Modify actions before execution / 在执行前修改操作
- Collect execution statistics / 收集执行统计信息
Configuration / 配置:
This script loads settings from .env file (if present).
本脚本会从 .env 文件加载配置(如果存在)。
"""
import threading
import time
from collections import defaultdict
from dotenv import load_dotenv
# Load .env file for configuration
# 加载 .env 配置文件
load_dotenv()
from phone_agent import PhoneAgent
from phone_agent.agent import AgentConfig, StepResult
from phone_agent.config import get_messages
from phone_agent.model import ModelConfig
def _create_agent_with_callbacks(
lang: str = "cn",
step_callback=None,
before_action_callback=None,
) -> PhoneAgent:
"""
Helper function to create PhoneAgent with config from environment variables.
从环境变量创建配置的 PhoneAgent 辅助函数。
"""
import os
model_config = ModelConfig(
base_url=os.getenv("PHONE_AGENT_BASE_URL", "http://localhost:8000/v1"),
model_name=os.getenv("PHONE_AGENT_MODEL", "autoglm-phone-9b"),
api_key=os.getenv("PHONE_AGENT_API_KEY", "EMPTY"),
)
agent_config = AgentConfig(
max_steps=int(os.getenv("PHONE_AGENT_MAX_STEPS", "100")),
device_id=os.getenv("PHONE_AGENT_DEVICE_ID"),
lang=lang,
step_callback=step_callback,
before_action_callback=before_action_callback,
)
return PhoneAgent(model_config=model_config, agent_config=agent_config)
# ============================================================================
# Example 1: Interrupt after max steps / 示例1达到最大步数后中断
# ============================================================================
def example_interrupt_after_steps(lang: str = "cn"):
"""Interrupt task after reaching maximum steps / 达到最大步数后中断任务"""
msgs = get_messages(lang)
max_allowed_steps = 5
steps_taken = []
def step_callback(result: StepResult) -> str | None:
"""Called after each step / 每步执行后调用"""
steps_taken.append(result.step_count)
print(f"[Callback] Step {result.step_count}: {result.action}")
# Interrupt after reaching max steps
if result.step_count >= max_allowed_steps:
return "stop"
return None
agent = _create_agent_with_callbacks(lang=lang, step_callback=step_callback)
print(f"\n{'='*50}")
print(f"Task will be interrupted after {max_allowed_steps} steps")
print(f"{'='*50}\n")
result = agent.run("打开小红书浏览美食内容")
print(f"\n{'='*50}")
print(f"Task interrupted after {len(steps_taken)} steps")
print(f"{'='*50}")
# ============================================================================
# Example 2: Switch tasks dynamically / 示例2动态切换任务
# ============================================================================
def example_switch_tasks(lang: str = "cn"):
"""Switch to new task based on conditions / 根据条件切换到新任务"""
msgs = get_messages(lang)
task_queue = ["打开微信", "打开淘宝", "打开美团"]
current_task_index = [0] # Use list to allow modification in closure
def step_callback(result: StepResult) -> str | None:
"""Switch to next task after current one completes"""
# Check if current task is done
if result.finished and current_task_index[0] < len(task_queue):
next_task = task_queue[current_task_index[0]]
current_task_index[0] += 1
print(f"\n[Callback] Switching to task: {next_task}\n")
return next_task
return None
agent = _create_agent_with_callbacks(lang=lang, step_callback=step_callback)
print(f"\n{'='*50}")
print(f"Task queue: {' -> '.join(task_queue)}")
print(f"{'='*50}\n")
# Start with first task
result = agent.run(task_queue.pop(0))
# ============================================================================
# Example 3: Interactive task control / 示例3交互式任务控制
# ============================================================================
def example_interactive_control(lang: str = "cn"):
"""Control task execution with user input / 通过用户输入控制任务执行"""
import sys
from select import select
msgs = get_messages(lang)
class TaskController:
def __init__(self):
self.new_task = None
self.should_stop = False
self._thread = threading.Thread(target=self._input_listener, daemon=True)
self._thread.start()
def _input_listener(self):
"""Listen for user input in background"""
print("\n[Interactive Control]")
print(" Enter 's' + Enter to stop")
print(" Enter 'n:<task>' + Enter to switch to new task")
print("-" * 40)
while True:
try:
# Non-blocking input check
if select([sys.stdin], [], [], 0.1)[0]:
cmd = sys.stdin.readline().strip()
if cmd.lower() == 's':
self.should_stop = True
print("\n[Control] STOP requested\n")
elif cmd.lower().startswith('n:'):
self.new_task = cmd[2:]
print(f"\n[Control] New task: {self.new_task}\n")
except (IOError, EOFError):
break
def step_callback(self, result: StepResult) -> str | None:
"""Check control flags after each step"""
if self.should_stop:
return "stop"
if self.new_task:
task = self.new_task
self.new_task = None
return task
return None
controller = TaskController()
agent = _create_agent_with_callbacks(lang=lang, step_callback=controller.step_callback)
print(f"\n{'='*50}")
print(f"Interactive Control Mode")
print(f"{'='*50}\n")
result = agent.run("打开抖音浏览视频")
# ============================================================================
# Example 4: Modify actions before execution / 示例4在执行前修改操作
# ============================================================================
def example_modify_actions(lang: str = "cn"):
"""Modify actions before they are executed / 在操作执行前进行修改"""
msgs = get_messages(lang)
action_log = []
def before_action_callback(action: dict) -> dict | None:
"""Called before executing each action / 每个操作执行前调用"""
action_log.append(action.copy())
# Example: Prevent launching certain apps
if action.get("action") == "Launch":
app = action.get("app", "")
if "游戏" in app or "Game" in app:
print(f"[Callback] Blocked launching: {app}")
# Replace with a safe action
return {"action": "Note", "message": f"Skipped {app}"}
# Example: Add delay after Tap actions
if action.get("action") == "Tap":
print(f"[Callback] Tap at {action.get('element')}")
# Return None to proceed with original action
return None
agent = _create_agent_with_callbacks(lang=lang, before_action_callback=before_action_callback)
print(f"\n{'='*50}")
print(f"Actions will be logged and filtered")
print(f"{'='*50}\n")
result = agent.run("打开微信查看朋友圈")
print(f"\n{'='*50}")
print(f"Total actions executed: {len(action_log)}")
print(f"{'='*50}")
# ============================================================================
# Example 5: Collect statistics / 示例5收集统计信息
# ============================================================================
def example_collect_statistics(lang: str = "cn"):
"""Collect execution statistics / 收集执行统计信息"""
msgs = get_messages(lang)
stats = {
"actions": defaultdict(int),
"total_steps": 0,
"errors": 0,
}
def before_action_callback(action: dict) -> dict | None:
"""Count action types"""
action_type = action.get("action", "Unknown")
stats["actions"][action_type] += 1
return None
def step_callback(result: StepResult) -> str | None:
"""Collect step statistics"""
stats["total_steps"] = result.step_count
if not result.success:
stats["errors"] += 1
return None
agent = _create_agent_with_callbacks(
lang=lang,
before_action_callback=before_action_callback,
step_callback=step_callback,
)
print(f"\n{'='*50}")
print(f"Collecting execution statistics...")
print(f"{'='*50}\n")
result = agent.run("打开美团搜索附近的餐厅")
print(f"\n{'='*50}")
print(f"Execution Statistics:")
print(f" Total steps: {stats['total_steps']}")
print(f" Errors: {stats['errors']}")
print(f"\n Action breakdown:")
for action_type, count in sorted(stats["actions"].items()):
print(f" {action_type}: {count}")
print(f"{'='*50}")
# ============================================================================
# Example 6: Task queue with priority / 示例6优先级任务队列
# ============================================================================
def example_priority_task_queue(lang: str = "cn"):
"""Implement priority-based task switching / 实现基于优先级的任务切换"""
from queue import PriorityQueue
from dataclasses import dataclass, field
@dataclass(order=True)
class PrioritizedTask:
priority: int
task: str = field(compare=False)
task_queue = PriorityQueue()
task_queue.put(PrioritizedTask(priority=2, task="打开微信发消息"))
task_queue.put(PrioritizedTask(priority=1, task="打开淘宝")) # Higher priority
task_queue.put(PrioritizedTask(priority=3, task="打开美团"))
def step_callback(result: StepResult) -> str | None:
"""Switch to higher priority task when available"""
if result.finished and not task_queue.empty():
next_task = task_queue.get()
print(f"\n[Callback] Switching to priority {next_task.priority} task: {next_task.task}\n")
return next_task.task
return None
agent = _create_agent_with_callbacks(lang=lang, step_callback=step_callback)
print(f"\n{'='*50}")
print(f"Priority Task Queue Mode")
print(f"{'='*50}\n")
# Start with initial task
result = agent.run("打开抖音")
# ============================================================================
# Example 7: Timeout control / 示例7超时控制
# ============================================================================
def example_timeout_control(lang: str = "cn"):
"""Interrupt task if it takes too long / 任务耗时过长则中断"""
msgs = get_messages(lang)
start_time = [time.time()]
timeout_seconds = 30
def step_callback(result: StepResult) -> str | None:
"""Check if task has timed out"""
elapsed = time.time() - start_time[0]
print(f"[Callback] Elapsed time: {elapsed:.1f}s")
if elapsed > timeout_seconds:
print(f"\n[Callback] Task timeout after {elapsed:.1f}s\n")
return "stop"
return None
agent = _create_agent_with_callbacks(lang=lang, step_callback=step_callback)
print(f"\n{'='*50}")
print(f"Task will timeout after {timeout_seconds} seconds")
print(f"{'='*50}\n")
result = agent.run("打开小红书浏览内容")
# ============================================================================
# Example 8: Conditional action blocking / 示例8条件性操作拦截
# ============================================================================
def example_conditional_blocking(lang: str = "cn"):
"""Block actions based on conditions / 根据条件拦截操作"""
msgs = get_messages(lang)
blocked_apps = ["王者荣耀", "和平精英", "抖音"] # Apps to block
def before_action_callback(action: dict) -> dict | None:
"""Block launching certain apps"""
if action.get("action") == "Launch":
app = action.get("app", "")
if any(blocked in app for blocked in blocked_apps):
print(f"[Callback] BLOCKED app launch: {app}")
# Return finish action to stop
return {"action": "finish", "message": f"App {app} is blocked"}
return None
agent = _create_agent_with_callbacks(lang=lang, before_action_callback=before_action_callback)
print(f"\n{'='*50}")
print(f"Blocked apps: {', '.join(blocked_apps)}")
print(f"{'='*50}\n")
result = agent.run("打开抖音刷视频")
# ============================================================================
# Main / 主程序
# ============================================================================
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(description="Phone Agent Callback Hooks Examples")
parser.add_argument(
"--example", "-e",
type=int,
default=1,
choices=range(1, 9),
help="Example number to run (1-8)",
)
parser.add_argument(
"--lang",
type=str,
default="cn",
choices=["cn", "en"],
help="Language for UI messages (cn=Chinese, en=English)",
)
args = parser.parse_args()
examples = {
1: ("Interrupt after max steps", example_interrupt_after_steps),
2: ("Switch tasks dynamically", example_switch_tasks),
3: ("Interactive task control", example_interactive_control),
4: ("Modify actions before execution", example_modify_actions),
5: ("Collect statistics", example_collect_statistics),
6: ("Priority task queue", example_priority_task_queue),
7: ("Timeout control", example_timeout_control),
8: ("Conditional action blocking", example_conditional_blocking),
}
name, func = examples[args.example]
print(f"\n{'='*50}")
print(f"Example {args.example}: {name}")
print(f"{'='*50}")
func(args.lang)
print(f"\n✓ Example {args.example} completed\n")

View File

@@ -0,0 +1,199 @@
#!/usr/bin/env python3
"""
带回调钩子的命令行工具 / Command-line tool with callback hooks
支持通过命令行参数配置回调钩子,实现任务中断和切换。
Configuration / 配置:
Loads settings from .env file (if present).
从 .env 文件加载配置(如果存在)。
"""
import argparse
import os
import sys
from dotenv import load_dotenv
# Load .env file for configuration
# 加载 .env 配置文件
load_dotenv()
from phone_agent import PhoneAgent, AgentConfig
from phone_agent.model import ModelConfig
from phone_agent.config import get_messages
def create_step_callback(max_steps: int | None = None, lang: str = "cn"):
"""创建步数限制回调"""
if max_steps is None:
return None
def callback(result):
if result.step_count >= max_steps:
return "stop"
return None
return callback
def main():
parser = argparse.ArgumentParser(
description="Phone Agent with Callback Hooks",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# 基本用法(与 main.py 相同)
python run_with_callbacks.py "打开微信"
# 限制最大执行步数
python run_with_callbacks.py "打开微信" --max-steps 5
# 使用智谱 API
python run_with_callbacks.py "打开微信" \\
--base-url https://open.bigmodel.cn/api/paas/v4 \\
--model autoglm-phone \\
--apikey your-key
# 交互模式(输入新任务可切换)
python run_with_callbacks.py --interactive
"""
)
# 模型配置
parser.add_argument("--base-url", default=os.getenv("PHONE_AGENT_BASE_URL", "http://localhost:8000/v1"), help="模型 API 地址")
parser.add_argument("--model", default=os.getenv("PHONE_AGENT_MODEL", "autoglm-phone-9b"), help="模型名称")
parser.add_argument("--apikey", default=os.getenv("PHONE_AGENT_API_KEY", "EMPTY"), help="API 密钥")
# Agent 配置
parser.add_argument("--max-steps", type=int, default=int(os.getenv("PHONE_AGENT_MAX_STEPS", "100")), help="最大执行步数")
parser.add_argument("--device-id", default=os.getenv("PHONE_AGENT_DEVICE_ID"), help="设备 ID")
parser.add_argument("--lang", choices=["cn", "en"], default=os.getenv("PHONE_AGENT_LANG", "cn"), help="语言")
# 回调配置
parser.add_argument("--callback-max-steps", type=int, help="回调强制中断的步数")
parser.add_argument("--interactive", action="store_true", help="交互模式(支持动态切换任务)")
# 任务参数
parser.add_argument("task", nargs="?", help="要执行的任务")
args = parser.parse_args()
# 创建模型配置
model_config = ModelConfig(
base_url=args.base_url,
model_name=args.model,
api_key=args.apikey,
lang=args.lang,
)
# 创建回调
step_callback = create_step_callback(args.callback_max_steps, args.lang)
# 交互模式回调
if args.interactive:
import threading
import queue
task_queue = queue.Queue()
input_ready = threading.Event()
def input_listener():
"""后台监听用户输入"""
print("\n[交互模式] 在任务执行时可以输入:")
print(" 's' + Enter - 停止当前任务")
print(" 'n:新任务' + Enter - 切换到新任务")
print("-" * 50)
while True:
try:
cmd = input().strip()
if cmd.lower() == 's':
task_queue.put("stop")
elif cmd.lower().startswith('n:'):
task_queue.put(cmd[2:])
except (EOFError, KeyboardInterrupt):
break
threading.Thread(target=input_listener, daemon=True).start()
def interactive_callback(result):
"""交互式回调"""
try:
cmd = task_queue.get_nowait()
if cmd == "stop":
return "stop"
return cmd
except queue.Empty:
return None
step_callback = interactive_callback
# 创建 Agent 配置
agent_config = AgentConfig(
max_steps=args.max_steps,
device_id=args.device_id,
lang=args.lang,
step_callback=step_callback,
)
# 创建 Agent
agent = PhoneAgent(model_config=model_config, agent_config=agent_config)
# 打印配置信息
msgs = get_messages(args.lang)
print("=" * 50)
print("Phone Agent with Callback Hooks")
print("=" * 50)
print(f"Model: {args.model}")
print(f"Base URL: {args.base_url}")
print(f"Max Steps: {args.max_steps}")
if args.callback_max_steps:
print(f"Callback Max Steps: {args.callback_max_steps} (强制中断)")
print("=" * 50)
# 执行任务
if args.task:
print(f"\nTask: {args.task}\n")
result = agent.run(args.task)
print(f"\nResult: {result}")
elif args.interactive:
# 交互模式循环
print("\n输入任务 (或 'quit' 退出):\n")
while True:
try:
task = input("> ").strip()
if task.lower() in ("quit", "exit", "q"):
break
if not task:
continue
print(f"\n执行: {task}\n")
result = agent.run(task)
print(f"\n结果: {result}\n")
agent.reset()
except KeyboardInterrupt:
print("\n\nGoodbye!")
break
else:
# 默认交互模式
print("\n输入任务 (或 'quit' 退出):\n")
while True:
try:
task = input("> ").strip()
if task.lower() in ("quit", "exit", "q"):
break
if not task:
continue
result = agent.run(task)
print(f"\n结果: {result}\n")
agent.reset()
except KeyboardInterrupt:
print("\n\nGoodbye!")
break
if __name__ == "__main__":
main()

View File

@@ -11,6 +11,9 @@ Environment Variables:
PHONE_AGENT_API_KEY: API key for model authentication (default: EMPTY)
PHONE_AGENT_MAX_STEPS: Maximum steps per task (default: 100)
PHONE_AGENT_DEVICE_ID: ADB device ID for multi-device setups
Configuration Files:
.env: Automatically loaded from current directory or project root
"""
import argparse
@@ -20,8 +23,12 @@ import subprocess
import sys
from urllib.parse import urlparse
from dotenv import load_dotenv
from openai import OpenAI
# Load .env file if exists (search in current directory and project root)
load_dotenv()
from phone_agent import PhoneAgent
from phone_agent.agent import AgentConfig
from phone_agent.agent_ios import IOSAgentConfig, IOSPhoneAgent

View File

@@ -5,8 +5,14 @@ This package provides tools for automating Android and iOS phone interactions
using AI models for visual understanding and decision making.
"""
from phone_agent.agent import PhoneAgent
from phone_agent.agent_ios import IOSPhoneAgent
from phone_agent.agent import AgentConfig, PhoneAgent, StepResult
from phone_agent.agent_ios import IOSAgentConfig, IOSPhoneAgent
__version__ = "0.1.0"
__all__ = ["PhoneAgent", "IOSPhoneAgent"]
__all__ = [
"PhoneAgent",
"IOSPhoneAgent",
"AgentConfig",
"IOSAgentConfig",
"StepResult",
]

View File

@@ -22,6 +22,10 @@ class AgentConfig:
lang: str = "cn"
system_prompt: str | None = None
verbose: bool = True
step_callback: Callable[["StepResult"], str | None] | None = None
"""Callback after each step. Return 'stop' to interrupt, or a new task string to switch."""
before_action_callback: Callable[[dict[str, Any]], dict[str, Any] | None] | None = None
"""Callback before executing action. Return modified action dict, or None to proceed as-is."""
def __post_init__(self):
if self.system_prompt is None:
@@ -37,6 +41,7 @@ class StepResult:
action: dict[str, Any] | None
thinking: str
message: str | None = None
step_count: int = 0
class PhoneAgent:
@@ -52,12 +57,29 @@ class PhoneAgent:
confirmation_callback: Optional callback for sensitive action confirmation.
takeover_callback: Optional callback for takeover requests.
Callbacks in agent_config:
step_callback: Called after each step with StepResult.
- Return 'stop' to interrupt the task
- Return a new task string to switch tasks
- Return None to continue normally
before_action_callback: Called before executing an action with the action dict.
- Return modified action dict to override
- Return None to execute the original action
Example:
>>> from phone_agent import PhoneAgent
>>> from phone_agent import PhoneAgent, AgentConfig
>>> from phone_agent.model import ModelConfig
>>>
>>> # With callback
>>> def on_step(result):
... if result.step_count > 10:
... return "stop" # Interrupt after 10 steps
... return None
>>>
>>> model_config = ModelConfig(base_url="http://localhost:8000/v1")
>>> agent = PhoneAgent(model_config)
>>> agent_config = AgentConfig(step_callback=on_step)
>>> agent = PhoneAgent(model_config, agent_config)
>>> agent.run("Open WeChat and send a message to John")
"""
@@ -184,6 +206,7 @@ class PhoneAgent:
action=None,
thinking="",
message=f"Model error: {e}",
step_count=self._step_count,
)
# Parse action from response
@@ -204,6 +227,16 @@ class PhoneAgent:
# Remove image from context to save space
self._context[-1] = MessageBuilder.remove_images_from_message(self._context[-1])
# Before action callback - allow modifying or intercepting action
if self.agent_config.before_action_callback is not None:
try:
modified_action = self.agent_config.before_action_callback(action)
if modified_action is not None:
action = modified_action
except Exception as e:
if self.agent_config.verbose:
print(f"Warning: before_action_callback error: {e}")
# Execute action
try:
result = self.action_handler.execute(
@@ -234,14 +267,38 @@ class PhoneAgent:
)
print("=" * 50 + "\n")
return StepResult(
# Build step result
step_result = StepResult(
success=result.success,
finished=finished,
action=action,
thinking=response.thinking,
message=result.message or action.get("message"),
step_count=self._step_count,
)
# Step callback - allow interrupting or switching tasks
if self.agent_config.step_callback is not None and not finished:
try:
callback_result = self.agent_config.step_callback(step_result)
if callback_result == "stop":
# Interrupt the task
if self.agent_config.verbose:
print("\n⏹ Task interrupted by callback\n")
step_result.finished = True
return step_result
elif isinstance(callback_result, str):
# Switch to new task
if self.agent_config.verbose:
print(f"\n🔄 Switching to new task: {callback_result}\n")
self.reset()
return self._execute_step(callback_result, is_first=True)
except Exception as e:
if self.agent_config.verbose:
print(f"Warning: step_callback error: {e}")
return step_result
@property
def context(self) -> list[dict[str, Any]]:
"""Get the current conversation context."""

View File

@@ -24,6 +24,10 @@ class IOSAgentConfig:
lang: str = "cn"
system_prompt: str | None = None
verbose: bool = True
step_callback: Callable[[Any], str | None] | None = None
"""Callback after each step. Return 'stop' to interrupt, or a new task string to switch."""
before_action_callback: Callable[[dict[str, Any]], dict[str, Any] | None] | None = None
"""Callback before executing action. Return modified action dict, or None to proceed as-is."""
def __post_init__(self):
if self.system_prompt is None:
@@ -39,6 +43,7 @@ class StepResult:
action: dict[str, Any] | None
thinking: str
message: str | None = None
step_count: int = 0
class IOSPhoneAgent:
@@ -54,12 +59,28 @@ class IOSPhoneAgent:
confirmation_callback: Optional callback for sensitive action confirmation.
takeover_callback: Optional callback for takeover requests.
Callbacks in agent_config:
step_callback: Called after each step with StepResult.
- Return 'stop' to interrupt the task
- Return a new task string to switch tasks
- Return None to continue normally
before_action_callback: Called before executing an action with the action dict.
- Return modified action dict to override
- Return None to execute the original action
Example:
>>> from phone_agent.agent_ios import IOSPhoneAgent, IOSAgentConfig
>>> from phone_agent.model import ModelConfig
>>>
>>> # With callback
>>> def on_step(result):
... if result.step_count > 10:
... return "stop" # Interrupt after 10 steps
... return None
>>>
>>> model_config = ModelConfig(base_url="http://localhost:8000/v1")
>>> agent_config = IOSAgentConfig(wda_url="http://localhost:8100")
>>> agent_config = IOSAgentConfig(wda_url="http://localhost:8100", step_callback=on_step)
>>> agent = IOSPhoneAgent(model_config, agent_config)
>>> agent.run("Open Safari and search for Apple")
"""
@@ -203,6 +224,7 @@ class IOSPhoneAgent:
action=None,
thinking="",
message=f"Model error: {e}",
step_count=self._step_count,
)
# Parse action from response
@@ -228,6 +250,16 @@ class IOSPhoneAgent:
# Remove image from context to save space
self._context[-1] = MessageBuilder.remove_images_from_message(self._context[-1])
# Before action callback - allow modifying or intercepting action
if self.agent_config.before_action_callback is not None:
try:
modified_action = self.agent_config.before_action_callback(action)
if modified_action is not None:
action = modified_action
except Exception as e:
if self.agent_config.verbose:
print(f"Warning: before_action_callback error: {e}")
# Execute action
try:
result = self.action_handler.execute(
@@ -258,14 +290,38 @@ class IOSPhoneAgent:
)
print("=" * 50 + "\n")
return StepResult(
# Build step result
step_result = StepResult(
success=result.success,
finished=finished,
action=action,
thinking=response.thinking,
message=result.message or action.get("message"),
step_count=self._step_count,
)
# Step callback - allow interrupting or switching tasks
if self.agent_config.step_callback is not None and not finished:
try:
callback_result = self.agent_config.step_callback(step_result)
if callback_result == "stop":
# Interrupt the task
if self.agent_config.verbose:
print("\n⏹ Task interrupted by callback\n")
step_result.finished = True
return step_result
elif isinstance(callback_result, str):
# Switch to new task
if self.agent_config.verbose:
print(f"\n🔄 Switching to new task: {callback_result}\n")
self.reset()
return self._execute_step(callback_result, is_first=True)
except Exception as e:
if self.agent_config.verbose:
print(f"Warning: step_callback error: {e}")
return step_result
@property
def context(self) -> list[dict[str, Any]]:
"""Get the current conversation context."""

View File

@@ -1,9 +1,18 @@
Pillow>=12.0.0
openai>=2.9.0
python-dotenv>=1.0.0
# For iOS Support
requests>=2.31.0
# For Web Dashboard
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
websockets>=12.0
pydantic>=2.5.0
python-multipart>=0.0.6
aiofiles>=23.2.0
# For Model Deployment
## After installing sglang or vLLM, please run pip install -U transformers again to upgrade to 5.0.0rc0.

28
scripts/run_dashboard.bat Normal file
View File

@@ -0,0 +1,28 @@
@echo off
REM AutoGLM Dashboard - Run Script for Windows
REM Change to project root
cd /d "%~dp0.."
REM Load .env file if it exists (use set command to parse)
if exist .env (
for /f "tokens=*" %%a in ('type .env ^| findstr /v "^#"') do set %%a
)
REM Set defaults
if not defined DASHBOARD_HOST set DASHBOARD_HOST=0.0.0.0
if not defined DASHBOARD_PORT set DASHBOARD_PORT=8080
if not defined DASHBOARD_DEBUG set DASHBOARD_DEBUG=false
REM Print configuration
echo ==========================================
echo AutoGLM Dashboard
echo ==========================================
echo Host: %DASHBOARD_HOST%
echo Port: %DASHBOARD_PORT%
echo Debug: %DASHBOARD_DEBUG%
echo ==========================================
echo.
REM Run the dashboard
python -m uvicorn dashboard.main:app --host %DASHBOARD_HOST% --port %DASHBOARD_PORT% --reload

34
scripts/run_dashboard.sh Normal file
View File

@@ -0,0 +1,34 @@
#!/bin/bash
# AutoGLM Dashboard - Run Script
# Get the directory where this script is located
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
# Change to project root
cd "$PROJECT_ROOT"
# Load .env file if it exists
if [ -f .env ]; then
export $(grep -v '^#' .env | xargs)
fi
# Set defaults
DASHBOARD_HOST=${DASHBOARD_HOST:-0.0.0.0}
DASHBOARD_PORT=${DASHBOARD_PORT:-8080}
DASHBOARD_DEBUG=${DASHBOARD_DEBUG:-false}
# Print configuration
echo "=========================================="
echo " AutoGLM Dashboard"
echo "=========================================="
echo "Host: $DASHBOARD_HOST"
echo "Port: $DASHBOARD_PORT"
echo "Debug: $DASHBOARD_DEBUG"
echo "=========================================="
# Run the dashboard
python -m uvicorn dashboard.main:app \
--host "$DASHBOARD_HOST" \
--port "$DASHBOARD_PORT" \
--reload

View File

@@ -32,6 +32,8 @@ setup(
install_requires=[
"Pillow>=12.0.0",
"openai>=2.9.0",
"python-dotenv>=1.0.0",
"requests>=2.31.0",
],
extras_require={
"dev": [
@@ -40,10 +42,19 @@ setup(
"mypy>=1.0.0",
"ruff>=0.1.0",
],
"dashboard": [
"fastapi>=0.104.0",
"uvicorn[standard]>=0.24.0",
"websockets>=12.0",
"pydantic>=2.5.0",
"python-multipart>=0.0.6",
"aiofiles>=23.2.0",
],
},
entry_points={
"console_scripts": [
"phone-agent=main:main",
"autoglm-dashboard=dashboard.main:main",
],
},
)