docs: refresh and simplify docs
This commit is contained in:
@@ -1,170 +1,62 @@
|
||||
---
|
||||
summary: "Plan for integrating Peekaboo automation into Clawdbot via PeekabooBridge (socket-based TCC broker)"
|
||||
summary: "PeekabooBridge integration for macOS UI automation"
|
||||
read_when:
|
||||
- Hosting PeekabooBridge in Clawdbot.app
|
||||
- Integrating Peekaboo as a submodule
|
||||
- Changing PeekabooBridge protocol/paths
|
||||
---
|
||||
# Peekaboo Bridge in Clawdbot (macOS UI automation broker)
|
||||
# Peekaboo Bridge (macOS UI automation)
|
||||
|
||||
## TL;DR
|
||||
- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`).
|
||||
- Clawdbot integrates by **optionally hosting the same bridge** inside **Clawdbot.app** (user-toggleable). The primary client is the **`peekaboo` CLI** (installed via npm); Clawdbot does not need its own `ui …` CLI surface.
|
||||
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdbot stays a thin broker host. No visualizer toggle in Clawdbot.
|
||||
Clawdbot can host **PeekabooBridge** as a local, permission‑aware UI automation
|
||||
broker. This lets the `peekaboo` CLI drive UI automation while reusing the
|
||||
macOS app’s TCC permissions.
|
||||
|
||||
Non-goals:
|
||||
- No auto-launching Peekaboo.app.
|
||||
- No onboarding deep links from the automation endpoint (Clawdbot onboarding already handles permissions).
|
||||
- No AI provider/agent runtime dependencies in Clawdbot (avoid pulling Tachikoma/MCP into the Clawdbot app/CLI).
|
||||
## What this is (and isn’t)
|
||||
|
||||
## Big refactor (Dec 2025): XPC → Bridge
|
||||
Peekaboo’s privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdbot this is a win:
|
||||
- It matches the existing “local socket + codesign checks” approach.
|
||||
- It lets us piggyback on **either** Peekaboo.app’s permissions **or** Clawdbot.app’s permissions (whichever is running).
|
||||
- It avoids “two apps with two TCC bubbles” unless needed.
|
||||
- **Host**: Clawdbot.app can act as a PeekabooBridge host.
|
||||
- **Client**: use the `peekaboo` CLI (no separate `clawdbot ui ...` surface).
|
||||
- **UI**: visual overlays stay in Peekaboo.app; Clawdbot is a thin broker host.
|
||||
|
||||
Reference (Peekaboo submodule): `Peekaboo/docs/bridge-host.md`.
|
||||
## Enable the bridge
|
||||
|
||||
## Architecture
|
||||
### Processes
|
||||
- **Bridge hosts** (provide TCC-backed automation):
|
||||
- **Peekaboo.app** (preferred; also provides visualizations + controls)
|
||||
- **Claude.app** (secondary; lets `peekaboo` reuse Claude Desktop’s granted permissions)
|
||||
- **Clawdbot.app** (secondary; “thin host” only)
|
||||
- **Bridge clients** (trigger single actions):
|
||||
- `peekaboo …` (preferred; humans + agents)
|
||||
- Optional: Clawdbot/Node shells out to `peekaboo` when it needs UI automation/capture
|
||||
In the macOS app:
|
||||
- Settings → **Enable Peekaboo Bridge**
|
||||
|
||||
### Host discovery (client-side)
|
||||
Order is deliberate:
|
||||
1. Peekaboo.app host (full UX)
|
||||
2. Claude.app host (piggyback on Claude Desktop permissions)
|
||||
3. Clawdbot.app host (piggyback on Clawdbot permissions)
|
||||
When enabled, Clawdbot starts a local UNIX socket server. If disabled, the host
|
||||
is stopped and `peekaboo` will fall back to other available hosts.
|
||||
|
||||
Socket paths (convention; exact paths must match Peekaboo):
|
||||
- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
|
||||
- Claude: `~/Library/Application Support/Claude/bridge.sock`
|
||||
- Clawdbot: `~/Library/Application Support/clawdbot/bridge.sock`
|
||||
## Client discovery order
|
||||
|
||||
No auto-launch: if a host isn’t reachable, the command fails with a clear error (start Peekaboo.app, Claude.app, or Clawdbot.app).
|
||||
Peekaboo clients typically try hosts in this order:
|
||||
|
||||
Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`.
|
||||
1. Peekaboo.app (full UX)
|
||||
2. Claude.app (if installed)
|
||||
3. Clawdbot.app (thin broker)
|
||||
|
||||
### Protocol shape
|
||||
- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close.
|
||||
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
|
||||
- **Errors**: human-readable string by default; structured envelope in `--json`.
|
||||
Use `peekaboo bridge status --verbose` to see which host is active and which
|
||||
socket path is in use. You can override with:
|
||||
|
||||
## Dependency strategy (submodule)
|
||||
Integrate Peekaboo via git submodule (nested submodules are OK).
|
||||
```bash
|
||||
export PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock
|
||||
```
|
||||
|
||||
Path in Clawdbot repo:
|
||||
- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps don’t churn).
|
||||
## Security & permissions
|
||||
|
||||
What Clawdbot should use:
|
||||
- **Client side**: `PeekabooBridge` (socket client + protocol models).
|
||||
- **Host side (Clawdbot.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations.
|
||||
- The bridge validates **caller code signatures**; TeamID `Y5PE65HELJ` is
|
||||
allowed by default (Peekaboo’s signing team), plus the Clawdbot app’s TeamID.
|
||||
- Requests time out after ~10 seconds.
|
||||
- If required permissions are missing, the bridge returns a clear error message
|
||||
rather than launching System Settings.
|
||||
|
||||
What Clawdbot should *not* embed:
|
||||
- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
|
||||
- **XPC**: don’t reintroduce helper targets; use the bridge.
|
||||
## Snapshot behavior (automation)
|
||||
|
||||
## IPC / CLI surface
|
||||
### No `clawdbot ui …`
|
||||
We avoid a parallel “Clawdbot UI automation CLI”. Instead:
|
||||
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
|
||||
- Clawdbot.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdbot permissions when Peekaboo.app isn’t running.
|
||||
Snapshots are stored in memory and expire automatically after a short window.
|
||||
If you need longer retention, re‑capture from the client.
|
||||
|
||||
### Diagnostics
|
||||
Use Peekaboo’s built-in diagnostics to see which host would be used:
|
||||
- `peekaboo bridge status`
|
||||
- `peekaboo bridge status --verbose`
|
||||
- `peekaboo bridge status --json`
|
||||
## Troubleshooting
|
||||
|
||||
### Output format
|
||||
Peekaboo commands default to human text output. Add `--json` for a structured envelope.
|
||||
|
||||
### Timeouts
|
||||
Default timeout for UI actions: **10 seconds** end-to-end (client enforced; host should also enforce per-operation).
|
||||
|
||||
## Coordinate model (multi-display)
|
||||
Requirement: coordinates are **per screen**, not global.
|
||||
|
||||
Standardize for the CLI (agent-friendly): **top-left origin per screen**.
|
||||
|
||||
Proposed request shape:
|
||||
- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
|
||||
- Clawdbot.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
|
||||
- Responses should echo both:
|
||||
- The resolved `screenIndex`
|
||||
- The local `{x, y}` and bounds
|
||||
- Optionally the global `{x, y}` for debugging
|
||||
|
||||
Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
|
||||
|
||||
## Targeting (per app/window)
|
||||
Expose window/app targeting in the UI surface (align with Peekaboo targeting):
|
||||
- frontmost
|
||||
- by app name / bundle id
|
||||
- by window title substring
|
||||
- by (app, index)
|
||||
|
||||
Peekaboo CLI targeting (agent-friendly):
|
||||
- `--bundle-id <id>` for app targeting
|
||||
- `--window-index <n>` (0-based) for disambiguating within an app when capturing
|
||||
|
||||
All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
|
||||
|
||||
## “See” + click packs (Playwright-style)
|
||||
Behavior stays aligned with Peekaboo:
|
||||
- `peekaboo see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
|
||||
- Follow-up actions reference those IDs without re-scanning.
|
||||
|
||||
`peekaboo see` should:
|
||||
- capture (optionally targeted) window/screen
|
||||
- return a screenshot **file path** (default: temp directory)
|
||||
- return a list of elements (text or JSON)
|
||||
|
||||
Snapshot lifecycle requirement:
|
||||
- Host apps are long-lived, so snapshot state should be **in-memory by default**.
|
||||
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
|
||||
|
||||
Practical flow (agent-friendly):
|
||||
- `peekaboo list apps` / `peekaboo list windows` provide bundle-id context for targeting.
|
||||
- `peekaboo see --bundle-id X` updates the implicit snapshot for `X`.
|
||||
- `peekaboo click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.
|
||||
|
||||
## Visualizer integration
|
||||
Keep visualizations in **Peekaboo.app** for now.
|
||||
- Clawdbot hosts the bridge, but does not render overlays.
|
||||
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.
|
||||
|
||||
## Screenshots (legacy → Peekaboo takeover)
|
||||
Clawdbot should not grow a separate screenshot CLI surface.
|
||||
|
||||
Migration plan:
|
||||
- Use `peekaboo capture …` / `peekaboo see …` (returns a file path, default temp directory).
|
||||
- Once Clawdbot’ legacy screenshot plumbing is replaced, remove it cleanly (no aliases).
|
||||
|
||||
## Permissions behavior
|
||||
If required permissions are missing:
|
||||
- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
|
||||
- do not try to open System Settings from the automation endpoint
|
||||
|
||||
## Security (socket auth)
|
||||
Both hosts must enforce:
|
||||
- filesystem perms on the socket path (owner read/write only)
|
||||
- server-side caller validation:
|
||||
- require the caller’s code signature TeamID to be `Y5PE65HELJ`
|
||||
- optional bundle-id allowlist for tighter scoping
|
||||
|
||||
Debug-only escape hatch (development convenience):
|
||||
- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
|
||||
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
|
||||
|
||||
## Next integration steps (after this doc)
|
||||
1. Add Peekaboo as a git submodule (nested submodules OK).
|
||||
2. Host `PeekabooBridgeHost` inside Clawdbot.app behind a single setting (“Enable Peekaboo Bridge”, default on).
|
||||
3. Ensure Clawdbot hosts the bridge at `~/Library/Application Support/clawdbot/bridge.sock` and speaks the PeekabooBridge JSON protocol.
|
||||
4. Validate with `peekaboo bridge status --verbose` that Peekaboo can select Clawdbot as the fallback host (no auto-launch).
|
||||
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).
|
||||
- If `peekaboo` reports “bridge client is not authorized”, ensure the client is
|
||||
properly signed or run the host with `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`
|
||||
in **debug** mode only.
|
||||
- If no hosts are found, open one of the host apps (Peekaboo.app or Clawdbot.app)
|
||||
and confirm permissions are granted.
|
||||
|
||||
Reference in New Issue
Block a user