---
summary: "Plan for integrating Peekaboo automation into Clawdis via PeekabooBridge (socket-based TCC broker)"
read_when:
  - Adding UI automation commands
  - Integrating Peekaboo as a submodule
  - Changing clawdis-mac IPC/output formats
---
# Peekaboo Bridge in Clawdis (macOS UI automation broker)

## TL;DR

- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`).
- Clawdis integrates by **hosting the same bridge** inside **Clawdis.app** (optional, user-toggleable), and by making `clawdis-mac ui …` act as a **bridge client**.
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdis stays a thin broker host. No visualizer toggle in Clawdis.

Non-goals:

- No auto-launching Peekaboo.app.
- No onboarding deep links from the automation endpoint (Clawdis onboarding already handles permissions).
- No AI provider/agent runtime dependencies in Clawdis (avoid pulling Tachikoma/MCP into the Clawdis app/CLI).

## Big refactor (Dec 2025): XPC → Bridge

Peekaboo’s privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdis this is a win:

- It matches the existing “local socket + codesign checks” approach.
- It lets us piggyback on **either** Peekaboo.app’s permissions **or** Clawdis.app’s permissions (whichever is running).
- It avoids “two apps with two TCC bubbles” unless needed.

Reference (Peekaboo submodule): `docs/bridge-host.md`.

## Architecture

### Processes

- **Bridge hosts** (provide TCC-backed automation):
  - **Peekaboo.app** (preferred; also provides visualizations + controls)
  - **Clawdis.app** (secondary; “thin host” only)
- **Bridge clients** (trigger single actions):
  - `clawdis-mac ui …`
  - `clawdis ui …` (Node/TS convenience wrapper; shells out to `clawdis-mac ui …`)
  - Node/Gateway shells out to `clawdis-mac`

### Host discovery (client-side)

Order is deliberate:

1. Peekaboo.app host (full UX)
2. Clawdis.app host (piggyback on Clawdis permissions)

Socket paths (convention; exact paths must match Peekaboo):

- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
- Clawdis: `~/Library/Application Support/clawdis/bridge.sock`

No auto-launch: if a host isn’t reachable, the command fails with a clear error (start Peekaboo.app or Clawdis.app).

Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`.

### Protocol shape

- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close.
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
- **Errors**: human-readable string by default; structured envelope in `--json`.

## Dependency strategy (submodule)

Integrate Peekaboo via git submodule (nested submodules are OK).

Path in Clawdis repo:

- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps don’t churn).

What Clawdis should use:

- **Client side**: `PeekabooBridge` (socket client + protocol models).
- **Host side (Clawdis.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations.

What Clawdis should *not* embed:

- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
- **XPC**: don’t reintroduce helper targets; use the bridge.

## IPC / CLI surface

### Namespacing

Add new automation commands behind a `ui` prefix:

- `clawdis-mac ui …` for UI automation + visualization-related actions.
- Keep existing top-level commands (`notify`, `run`, `canvas …`, etc.) for compatibility.

Screenshot cutover:

- Remove legacy screenshot endpoints/commands.
- Ship only `clawdis-mac ui screenshot` (no aliases).

### Output format

Change `clawdis-mac` to default to human text output:

- **Default**: plain text; errors are string messages to stderr; exit codes indicate success/failure.
- **`--json`**: structured output (for agents/scripts) with stable schemas.
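
As a rough illustration of the text-versus-`--json` split described above, the following TypeScript sketch renders one result in both modes. The `ok` flag mirrors the `ok=false` convention this document uses for permission errors; the `UiResult` type, the `message`/`error` field names, the exit code values, and the `render` helper are hypothetical and not taken from `clawdis-mac` or Peekaboo.

```typescript
// Hypothetical result shape; only the `ok` flag comes from this document.
type UiResult =
  | { ok: true; message: string }
  | { ok: false; error: string };

// What a single CLI invocation would emit: stream contents plus exit code.
interface Rendered {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Render a result as plain text (default) or as a JSON envelope (--json).
function render(result: UiResult, json: boolean): Rendered {
  // Assumption: 0 for success, 1 for any failure.
  const exitCode = result.ok ? 0 : 1;
  if (json) {
    // Assumption of this sketch: in --json mode the envelope goes to stdout
    // even on failure, so scripts only have to parse one stream.
    return { stdout: JSON.stringify(result), stderr: "", exitCode };
  }
  if (result.ok) {
    return { stdout: result.message, stderr: "", exitCode };
  }
  // Text mode: errors are plain strings on stderr.
  return { stdout: "", stderr: result.error, exitCode };
}
```

In this sketch, callers that only care about success can check the exit code in either mode, while agents that need details would pass `--json` and parse the envelope.
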

This output behavior applies globally, not only to `ui` commands.

Note (current state as of 2025-12-13): `clawdis-mac` prints text by default; use `--json` for structured output.

### Timeouts

Default timeout for UI actions: **10 seconds** end-to-end.

## Coordinate model (multi-display)

Requirement: coordinates are **per screen**, not global.

Standardize for the CLI (agent-friendly): **top-left origin per screen**.

Proposed request shape:

- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
- Clawdis.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`. Because `NSScreen` frames use a bottom-left origin while global CG coordinates use a top-left origin, the conversion also has to flip the y-axis.
- Responses should echo:
  - The resolved `screenIndex`
  - The local `{x, y}` and bounds
  - Optionally the global `{x, y}` for debugging

Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).

## Targeting (per app/window)

Expose window/app targeting in the UI surface (align with Peekaboo targeting):

- frontmost
- by app name / bundle id
- by window title substring
- by (app, index)

Current `clawdis-mac ui …` support:

- `--bundle-id ` for app targeting
- `--window-index ` (0-based) for disambiguating within an app when capturing (see/screenshot)

All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).

## “See” + click packs (Playwright-style)

Behavior stays aligned with Peekaboo:

- `ui see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
- Follow-up actions reference those IDs without re-scanning.

`clawdis-mac ui see` should:

- capture (optionally targeted) window/screen
- return a screenshot **file path** (default: temp directory)
- return a list of elements (text or JSON)

Snapshot lifecycle requirement:

- Host apps are long-lived, so snapshot state should be **in-memory by default**.
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
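
The implicit-snapshot rule above can be sketched as a small in-memory store keyed by bundle id. This is a minimal TypeScript illustration, not Peekaboo's implementation: `Snapshot`, `SnapshotStore`, and their method names are hypothetical, and the element IDs simply follow the `B1`/`T3` examples in this document.

```typescript
// Hypothetical shapes for illustration; not Peekaboo's actual models.
interface Snapshot {
  id: string;
  bundleId: string;
  elementIds: string[]; // e.g. ["B1", "T3"]
}

class SnapshotStore {
  // Prototype-free objects used as plain dictionaries (in-memory only).
  private byId: Record<string, Snapshot | undefined> = Object.create(null);
  private latestByBundle: Record<string, string | undefined> = Object.create(null);

  // Called after a `ui see`: store the snapshot and make it the implicit
  // snapshot for its bundle id.
  save(snapshot: Snapshot): void {
    this.byId[snapshot.id] = snapshot;
    this.latestByBundle[snapshot.bundleId] = snapshot.id;
  }

  // Called by follow-up actions: an explicit snapshot id wins; otherwise
  // fall back to the most recent snapshot saved for the target bundle id.
  resolve(bundleId: string, snapshotId?: string): Snapshot | undefined {
    const id = snapshotId ?? this.latestByBundle[bundleId];
    return id === undefined ? undefined : this.byId[id];
  }
}
```

A real host would presumably also evict old snapshots, since host apps are long-lived; the sketch leaves that out.
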

Practical flow (agent-friendly):

- `clawdis-mac ui frontmost` returns the focused app (bundle id) + focused window (title/id) so follow-up calls can pass `--bundle-id …`.
- `clawdis-mac ui see --bundle-id X` updates the implicit snapshot for `X`.
- `clawdis-mac ui click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.

## Visualizer integration

Keep visualizations in **Peekaboo.app** for now.

- Clawdis hosts the bridge, but does not render overlays.
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.

## Screenshots (legacy → Peekaboo takeover)

Clawdis uses `clawdis-mac ui screenshot` and returns a file path (default location: temp directory) instead of raw image bytes.

Migration plan:

- Bridge host performs capture and returns a temp file path.
- No legacy aliases; make the old screenshot surface disappear cleanly.

## Permissions behavior

If required permissions are missing:

- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
- do not try to open System Settings from the automation endpoint

## Security (socket auth)

Both hosts must enforce:

- filesystem perms on the socket path (owner read/write only)
- server-side caller validation:
  - require the caller’s code signature TeamID to be `Y5PE65HELJ`
  - optional bundle-id allowlist for tighter scoping

Debug-only escape hatch (development convenience):

- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).

## Current `clawdis-mac ui` commands (Dec 2025)

All commands default to text output. Add `--json` right after `clawdis-mac` for a structured envelope.

- `clawdis-mac ui permissions status`
- `clawdis-mac ui frontmost`
- `clawdis-mac ui apps`
- `clawdis-mac ui windows [--bundle-id ]`
- `clawdis-mac ui screenshot [--screen-index ] [--bundle-id ] [--window-index ] [--watch] [--scale native|1x]`
- `clawdis-mac ui see [--bundle-id ] [--window-index ] [--snapshot-id ]`
- `clawdis-mac ui click --on [--bundle-id ] [--snapshot-id ] [--double|--right]`
- `clawdis-mac ui type --text [--into ] [--bundle-id ] [--snapshot-id ] [--clear] [--delay-ms ]`
- `clawdis-mac ui wait --on [--bundle-id ] [--snapshot-id ] [--timeout ]`

## Next integration steps (after this doc)

1. Add Peekaboo as a git submodule (nested submodules OK).
2. Add a small `clawdis-mac ui …` surface that speaks PeekabooBridge (text by default, `--json` for structured).
3. Host `PeekabooBridgeHost` inside Clawdis.app behind a single setting (“Enable Peekaboo Bridge”, default on).
4. Implement the minimum operation set needed for agents (see/click/type/scroll/wait/screenshot, plus list apps/windows/screens).
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).
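
Step 5 asks that the coordinate system stay aligned with Peekaboo. For reference, the per-screen, top-left-origin model proposed in this document can be sketched in TypeScript as below. The `Point` and `ScreenFrame` types and the `toGlobal` helper are hypothetical, and the sketch assumes each screen frame is already expressed in the global top-left-origin space; a real host reading frames from `NSScreen`, which uses a bottom-left origin, would first have to flip the y-axis.

```typescript
interface Point {
  x: number;
  y: number;
}

// Assumption: frames are already in the global top-left-origin space.
interface ScreenFrame {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert a screen-local point (top-left origin per screen) to a global point.
function toGlobal(screens: ScreenFrame[], screenIndex: number, local: Point): Point {
  const screen = screens[screenIndex];
  if (screen === undefined) {
    throw new RangeError("no screen at index " + screenIndex);
  }
  // Reject points that fall outside the addressed screen.
  if (local.x < 0 || local.y < 0 || local.x >= screen.width || local.y >= screen.height) {
    throw new RangeError("point lies outside the screen bounds");
  }
  return { x: screen.x + local.x, y: screen.y + local.y };
}

// Two side-by-side displays; the second starts where the first ends.
const exampleScreens: ScreenFrame[] = [
  { x: 0, y: 0, width: 1920, height: 1080 },
  { x: 1920, y: 0, width: 1280, height: 720 },
];
const exampleGlobal = toGlobal(exampleScreens, 1, { x: 10, y: 20 }); // { x: 1930, y: 20 }
```

Echoing both the local input and a value like `exampleGlobal` in responses would give agents the debugging aid described in the coordinate model section.
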