diff --git a/apps/macos/Sources/Clawdis/ControlSocketServer.swift b/apps/macos/Sources/Clawdis/ControlSocketServer.swift
index 5eed65d30..4758ef5d5 100644
--- a/apps/macos/Sources/Clawdis/ControlSocketServer.swift
+++ b/apps/macos/Sources/Clawdis/ControlSocketServer.swift
@@ -225,12 +225,30 @@ final actor ControlSocketServer {
         let r = getsockopt(fd, SOL_LOCAL, LOCAL_PEERPID, &pid, &pidSize)
         guard r == 0, pid > 0 else { return false }
 
-        // Same-user quick check
-        if let callerUID = self.uid(for: pid), callerUID == getuid() {
+        // Always require a valid code signature match (TeamID).
+        // This prevents any same-UID process from driving the app's privileged surface.
+        if self.teamIDMatches(pid: pid, allowedTeamIDs: allowedTeamIDs) {
             return true
         }
 
-        return self.teamIDMatches(pid: pid, allowedTeamIDs: allowedTeamIDs)
+        #if DEBUG
+        // Debug-only escape hatch: allow unsigned/same-UID clients when explicitly opted in.
+        // This keeps local development workable (e.g. a SwiftPM-built `clawdis-mac` binary).
+        let env = ProcessInfo.processInfo.environment["CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS"]
+        if env == "1", let callerUID = self.uid(for: pid), callerUID == getuid() {
+            self.logger.warning(
+                "allowing unsigned same-UID socket client pid=\(pid, privacy: .public) due to CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1")
+            return true
+        }
+        #endif
+
+        if let callerUID = self.uid(for: pid) {
+            self.logger.error(
+                "socket client rejected pid=\(pid, privacy: .public) uid=\(callerUID, privacy: .public)")
+        } else {
+            self.logger.error("socket client rejected pid=\(pid, privacy: .public) (uid unknown)")
+        }
+        return false
     }
 
     private nonisolated static func uid(for pid: pid_t) -> uid_t? {
diff --git a/docs/mac/peekaboo.md b/docs/mac/peekaboo.md
new file mode 100644
index 000000000..ce47a8aa9
--- /dev/null
+++ b/docs/mac/peekaboo.md
@@ -0,0 +1,134 @@
+---
+summary: "Plan for integrating Peekaboo automation + visualizer into Clawdis macOS app (via clawdis-mac)"
+read_when:
+  - Adding UI automation commands
+  - Integrating Peekaboo as a submodule
+  - Changing clawdis-mac IPC/output formats
+---
+# Peekaboo in Clawdis (macOS UI automation + visualizer)
+
+## Goal
+Reuse Peekaboo’s mac automation “core” inside **Clawdis.app** so we piggyback on Clawdis’ existing TCC grants (Screen Recording, Accessibility, etc.). The CLI (`clawdis-mac`) stays a thin synchronous trigger surface for **single actions** (no batches), returning errors cleanly.
+
+Non-goals:
+- No AI/agent runtime parts from Peekaboo (no Tachikoma/MCP/Commander entrypoints).
+- No auto-onboarding or System Settings deep-linking from the automation layer (Clawdis onboarding already handles that).
+
+## Where code lives
+- **Clawdis.app (macOS)**: owns all automation + visualization + TCC prompts.
+- **`clawdis-mac` CLI**: sends one request, waits, prints result, exits non-zero on failure.
+- **Gateway/Node/TS**: shells out to `clawdis-mac` when it needs TCC-backed actions.
+
+Transport: existing UNIX domain socket (`controlSocketPath`) already used by `clawdis-mac`.
+
+## Dependencies (submodule strategy)
+Integrate Peekaboo via git submodule (nested submodules OK).
+
+Consume only:
+- `PeekabooAutomationKit` (AX automation, element detection, capture helpers; no Tachikoma/MCP).
+- `AXorcist` (input driving / AX helpers).
+- `PeekabooVisualizer` (overlay visualizations).
+
+Important nuance:
+- `PeekabooVisualizer` currently ships as the `PeekabooVisualizer` product inside `PeekabooCore/Package.swift`. That package declares other dependencies (including a path dependency to Tachikoma). SwiftPM will still need those paths to exist during dependency resolution even if we don’t build those targets.
+  - If this is too annoying for Clawdis, the follow-up is to extract `PeekabooVisualizer` into its own standalone Swift package that depends only on `PeekabooFoundation`/`PeekabooProtocols`/`PeekabooExternalDependencies`.
+
+## IPC / CLI surface
+### Namespacing
+Add new automation commands behind a `ui` prefix:
+- `clawdis-mac ui …` for UI automation + visualization-related actions.
+- Keep existing top-level commands (`notify`, `run`, `canvas …`, etc.) for compatibility, but `screenshot` should become an alias of `ui screenshot` once Peekaboo takes it over.
+
+### Output format
+Change `clawdis-mac` to default to human text output:
+- **Default**: plain text; errors are string messages to stderr; exit codes indicate success/failure.
+- **`--json`**: structured output (for agents/scripts) with stable schemas.
+
+This applies globally, not only `ui` commands.
+
+### Timeouts
+Default timeout for UI actions: **10 seconds** end-to-end (CLI already defaults to 10s).
+- CLI: keep the fail-fast default at 10s (unless a command explicitly requests longer).
+- Server: only has a ~5s read/decode timeout today; UI operations must also enforce their own per-action timeout so “wait for element” can fail deterministically.
+
+## Coordinate model (multi-display)
+Requirement: coordinates are **per screen**, not global.
+
+Proposed API shape:
+- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
+- Clawdis.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
+- Responses should echo both:
+  - The resolved `screenIndex`
+  - The local `{x, y}` and bounds
+  - Optionally the global `{x, y}` for debugging
+
+Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
+
+## Targeting (per app/window)
+Expose window/app targeting in the IPC surface (based on Peekaboo’s existing `WindowTarget` model):
+- frontmost
+- by app name / bundle id
+- by window title substring
+- by (app, index)
+- by window id
+
+All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
+
+## “See” + click packs (Playwright-style)
+Peekaboo already has the core ingredients:
+- element detection yielding stable IDs (e.g., `B1`, `T3`)
+- bounds + labels/values
+- session IDs to allow follow-up actions without re-scanning
+
+Clawdis’s `ui see` should:
+- capture (optionally targeted) window/screen
+- return a **session id**
+- return a list of elements with `{id, type, label/value?, bounds}`
+- optionally return screenshot path/bytes (pref: path)
+
+## Visualizer integration
+Visualizer must be user-toggleable via a Clawdis setting.
+
+Implementation sketch:
+- Add a Clawdis UserDefaults-backed setting (e.g. `clawdis.ui.visualizerEnabled`).
+- Implement Peekaboo’s `VisualizerSettingsProviding` in Clawdis (`visualizerEnabled`, animation speed, and per-effect toggles).
+- Create a Clawdis-specific `AutomationFeedbackClient` that forwards PeekabooAutomationKit feedback events into a shared `VisualizerCoordinator`.
+
+Current state:
+- `PeekabooVisualizer` already includes the visualization implementation (SwiftUI overlay views + coordinator).
+
+Open requirement:
+- “Any AX event should be clickable.” Today the visualizer is display-only; the likely follow-up is:
+  - make the annotated element overlays tappable (debug tool)
+  - surface tap → element id → send a `ui click --element --session ` request back through Clawdis’ control channel (or a local callback if the visualizer runs inside the app)
+
+## Screenshots (legacy → Peekaboo takeover)
+Clawdis currently has a legacy `screenshot` request returning raw PNG bytes in `Response.payload`.
+
+Migration plan:
+- Replace capture implementation with PeekabooAutomationKit’s capture service so we share:
+  - per-screen mapping
+  - window/app targeting
+  - visual feedback (flash / watch HUD) when enabled
+- Prefer writing images to a file path on the app side and returning the path (text-friendly), with `--json` providing the structured metadata.
+
+## Permissions behavior
+If required permissions are missing:
+- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
+- do not try to open System Settings from the automation endpoint
+
+## Security (socket auth)
+Clawdis’ socket is protected by:
+- filesystem perms on the socket path (owner read/write only)
+- server-side caller check:
+  - requires the caller’s code signature TeamID to be `Y5PE65HELJ`
+  - in `DEBUG` builds only, an explicit escape hatch allows same-UID clients when `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` is set (development convenience)
+
+This ensures “any local process” can’t drive the privileged surface just because it runs under the same macOS user.
+
+## Next integration steps (after this doc)
+1. Add Peekaboo as a git submodule (and required nested submodules).
+2. Wire SwiftPM deps in `apps/macos/Package.swift` to import `PeekabooAutomationKit` + `PeekabooVisualizer`.
+3. Extend `ClawdisIPC.Request` with `ui.*` commands (`see/click/type/scroll/wait/screenshot/windows/screens`).
+4. Implement handlers in Clawdis.app and route through PeekabooAutomationKit services.
+5. Update `clawdis-mac` output defaults (text + `--json`), and adjust any internal call sites that relied on JSON-by-default.
diff --git a/docs/mac/xpc.md b/docs/mac/xpc.md
index 02fe528f2..b17030061 100644
--- a/docs/mac/xpc.md
+++ b/docs/mac/xpc.md
@@ -9,7 +9,7 @@ read_when:
 - Single GUI app instance that owns all TCC-facing work (notifications, screen recording, mic, speech, AppleScript).
 - A small surface for automation: the `clawdis-mac` CLI and the Node gateway talk to the app via a local XPC channel.
 - Predictable permissions: always the same signed bundle ID, launched by launchd, so TCC grants stick.
-- Limit who can connect: only signed clients from our team (with a same-UID fallback for development).
+- Limit who can connect: only signed clients from our team (with an explicit DEBUG-only escape hatch for development).
 
 ## How it works
 - The app registers a Mach service named `com.steipete.clawdis.xpc` via a user LaunchAgent at `~/Library/LaunchAgents/com.steipete.clawdis.plist`.
@@ -17,8 +17,8 @@ read_when:
 - The app hosts the XPC listener (`NSXPCListener(machServiceName:)`) and exports `ClawdisXPCService`.
 - The CLI (`clawdis-mac`) connects with `NSXPCConnection(machServiceName:)`; the Node gateway shells out to the CLI.
 - Security: on incoming connections we read the audit token (or PID) and allow only:
-  - Code-signed clients with team ID `Y5PE65HELJ`; or
-  - Same-UID processes (fallback to avoid blocking local dev).
+  - Code-signed clients with team ID `Y5PE65HELJ`.
+  - In `DEBUG` builds only, you can opt into allowing same-UID clients by setting `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`.
 
 ## Operational flows
 - Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
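
The caller check introduced in `ControlSocketServer.swift` reduces to a small decision table. A minimal Swift sketch of that table follows, under stated assumptions: `CallerFacts` and `isAllowed` are hypothetical names that do not appear in the patch, the real `getsockopt`/code-signature lookups are replaced by plain booleans, and the `#if DEBUG` gate is modelled as a `debugBuild` parameter so the logic can be exercised outside the app.

```swift
/// Hypothetical, simplified view of what the server learns about a peer.
/// In the patch these facts come from LOCAL_PEERPID, the TeamID check, and uid(for:).
struct CallerFacts {
    var teamIDMatches: Bool   // code signature TeamID is in the allowed set
    var isSameUID: Bool       // caller runs under the same macOS user
}

/// Sketch of the allow/reject decision, assuming the rules described in the patch:
/// a TeamID match is always sufficient; same-UID alone is accepted only in a
/// debug build AND only when the opt-in environment variable is exactly "1".
func isAllowed(
    _ caller: CallerFacts,
    debugBuild: Bool,
    environment: [String: String]
) -> Bool {
    if caller.teamIDMatches {
        return true
    }
    let optedIn = environment["CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS"] == "1"
    return debugBuild && optedIn && caller.isSameUID
}
```

Modelling the build flag as a parameter is only a testing convenience for this sketch; the patch compiles the escape hatch out of release builds entirely, which is the stronger guarantee.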