refactor(cli): unify on clawdis CLI + node permissions

Peter Steinberger
2025-12-20 02:08:04 +00:00
parent 479720c169
commit 849446ae17
49 changed files with 1205 additions and 2735 deletions

@@ -63,7 +63,7 @@ git commit -m "Add Clawd workspace"
## What Clawdis Does
- Runs WhatsApp gateway + Pi coding agent so the assistant can read/write chats, fetch context, and run tools via the host Mac.
- macOS app manages permissions (screen recording, notifications, microphone) and exposes a CLI helper `clawdis-mac` for scripts.
- macOS app manages permissions (screen recording, notifications, microphone) and exposes the `clawdis` CLI via its bundled binary.
- Direct chats collapse into the shared `main` session by default; groups stay isolated as `group:<jid>`; heartbeats keep background tasks alive.
## Core Tools (enable in Settings → Tools)
@@ -91,7 +91,7 @@ git commit -m "Add Clawd workspace"
- **Google Calendar MCP** (`google-calendar`) — List, create, and update events.
## Usage Notes
- Prefer the `clawdis-mac` CLI for scripting; mac app handles permissions.
- Prefer the `clawdis` CLI for scripting; mac app handles permissions.
- Run installs from the Tools tab; it hides the button if a tool is already present.
- For MCPs, mcporter writes to the home-scope config; re-run installs if you rotate tokens.
- Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures.

@@ -11,7 +11,7 @@ Clawdis supports **camera capture** for agent workflows:
- **iOS node** (paired via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
- **Android node** (paired via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
- **macOS app** (local control socket): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `clawdis-mac`.
- **macOS app** (node via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
All camera access is gated behind **user-controlled settings**.
@@ -100,22 +100,22 @@ The macOS companion app exposes a checkbox:
- Default: **off**
- When off: camera requests return “Camera disabled by user”.
### CLI helper (local control socket)
### CLI helper (node invoke)
The `clawdis-mac` helper talks to the running menu bar app over the local control socket.
Use the main `clawdis` CLI to invoke camera commands on the macOS node.
Examples:
```bash
clawdis-mac camera snap # prints MEDIA:<path>
clawdis-mac camera snap --max-width 1280
clawdis-mac camera clip --duration 10s # prints MEDIA:<path>
clawdis-mac camera clip --duration-ms 3000 # prints MEDIA:<path> (legacy flag)
clawdis-mac camera clip --no-audio
clawdis nodes camera snap --node <id> # prints MEDIA:<path>
clawdis nodes camera snap --node <id> --max-width 1280
clawdis nodes camera clip --node <id> --duration 10s # prints MEDIA:<path>
clawdis nodes camera clip --node <id> --duration-ms 3000 # prints MEDIA:<path> (legacy flag)
clawdis nodes camera clip --node <id> --no-audio
```
Notes:
- `clawdis-mac camera snap` defaults to `maxWidth=1600` unless overridden.
- `clawdis nodes camera snap` defaults to `maxWidth=1600` unless overridden.
## Safety + practical limits
@@ -127,7 +127,7 @@ Notes:
For *screen* video (not camera), use the macOS companion:
```bash
clawdis-mac screen record --duration 10s --fps 15 # prints MEDIA:<path>
clawdis nodes screen record --node <id> --duration 10s --fps 15 # prints MEDIA:<path>
```
Notes:

@@ -1,95 +1,57 @@
---
summary: "Spec for the Clawdis macOS companion menu bar app and local broker (control socket + PeekabooBridge)"
summary: "Spec for the Clawdis macOS companion menu bar app (gateway + node broker)"
read_when:
- Implementing macOS app features
- Touching broker/CLI bridging
- Changing gateway lifecycle or node bridging on macOS
---
# Clawdis macOS Companion (menu bar + local broker)
# Clawdis macOS Companion (menu bar + gateway broker)
Author: steipete · Status: draft spec · Date: 2025-12-05
Author: steipete · Status: draft spec · Date: 2025-12-20
## Purpose
- Single macOS menu-bar app named **Clawdis** that:
- Shows native notifications for Clawdis/clawdis events.
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Automation/AppleScript, Microphone, Speech Recognition).
- Brokers privileged actions via local IPC:
- Clawdis control socket (app-specific actions like notify/run)
- PeekabooBridge socket (`bridge.sock`) for UI automation brokering (consumed by `peekaboo`; see `docs/mac/peekaboo.md`)
- Provides a tiny CLI (`clawdis-mac`) that talks to the app; Node/TS shells out to it.
- Replace the separate notifier helper pattern (Oracle) with a built-in notifier.
- Offer a first-run experience similar to VibeTunnel's onboarding (permissions + CLI install).
- Runs (or connects to) the **Gateway** and exposes itself as a **node** so agents can reach macOS-only features.
- Hosts **PeekabooBridge** for UI automation (consumed by `peekaboo`; see `docs/mac/peekaboo.md`).
- Installs a single CLI (`clawdis`) by symlinking the bundled binary.
## High-level design
- SwiftPM package in `apps/macos/` (macOS 15+, Swift 6).
- Targets:
- `ClawdisIPC` (shared Codable types + helpers for app-specific commands).
- `Clawdis` (LSUIElement MenuBarExtra app; hosts control socket + optional PeekabooBridgeHost).
- `ClawdisCLI` (`clawdis-mac`; prints text by default, `--json` for scripts).
- `ClawdisIPC` (shared Codable types + helpers for app-internal actions).
- `Clawdis` (LSUIElement MenuBarExtra app; hosts Gateway + node bridge + PeekabooBridgeHost).
- Bundle ID: `com.steipete.clawdis`.
- The CLI lives in the app bundle `Contents/Helpers/clawdis-mac`; dev symlink `bin/clawdis-mac` points there.
- Node/TS layer calls the CLI; no direct privileged API calls from Node.
- Bundled runtime binaries live under `Contents/Resources/Relay/`:
- `clawdis-gateway` (bun-compiled Gateway)
- `clawdis` (bun-compiled CLI)
- The app symlinks `clawdis` into `/usr/local/bin` and `/opt/homebrew/bin`.
Note: `docs/mac/xpc.md` describes an aspirational long-term Mach/XPC architecture. The current direction for UI automation is PeekabooBridge (socket-based).
## Gateway + node bridge
- The mac app runs the Gateway in **local** mode (unless configured remote).
- The mac app connects to the bridge as a **node** and advertises capabilities/commands.
- Agent-facing actions are exposed via `node.invoke` (no local control socket).
## IPC contract (ClawdisIPC)
- Codable enums; small payloads (<1 MB enforced in listener):
### Node commands (mac)
- Canvas: `canvas.present|navigate|eval|snapshot|a2ui.*`
- Camera: `camera.snap|camera.clip`
- Screen: `screen.record`
- System: `system.run` (shell) and `system.notify`
```
enum Capability { notifications, accessibility, screenRecording, appleScript, microphone, speechRecognition }
enum Request {
notify(title, body, sound?)
ensurePermissions([Capability], interactive: Bool)
runShell(command:[String], cwd?, env?, timeoutSec?, needsScreenRecording: Bool)
status
}
struct Response { ok: Bool; message?: String; payload?: Data }
```
- The control-socket server rejects oversize/unknown cases and validates the caller by code signature TeamID (with a `DEBUG`-only same-UID escape hatch controlled by `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
### Permission advertising
- Nodes include a `permissions` map in hello/pairing.
- The Gateway surfaces it via `node.list` / `node.describe` so agents can decide what to run.
UI automation is not part of `ClawdisIPC.Request`:
- UI automation is handled via the separate PeekabooBridge socket and is surfaced by the `peekaboo` CLI (see `docs/mac/peekaboo.md`).
## CLI (`clawdis`)
- The **only** CLI is `clawdis` (TS/bun). There is no `clawdis-mac` helper.
- For mac-specific actions, the CLI uses `node.invoke`:
- `clawdis canvas present|navigate|eval|snapshot|a2ui push|a2ui reset`
- `clawdis nodes run --node <id> -- <command...>`
- `clawdis nodes notify --node <id> --title ...`
## App UX (Clawdis)
- MenuBarExtra icon only (LSUIElement; no Dock).
- Menu items: Status, Permissions…, **Pause Clawdis** toggle (temporarily deny privileged actions/notifications without quitting), Quit.
- Settings window (Trimmy-style tabs):
- General: launch at login toggle and debug/visibility toggles (no per-user default sound; pass sounds per notification via CLI).
- Permissions: live status + “Request” buttons for Notifications/Accessibility/Screen Recording; links to System Settings.
- Debug (when enabled): PID/log links, restart/reveal app shortcuts, manual test notification.
- About: version, links, license.
- Pause behavior: matches Trimmy's “Auto Trim” toggle. When paused, the broker returns `ok=false, message="clawdis paused"` for actions that would touch TCC. State is persisted (UserDefaults) and surfaced in menu and status view.
- Onboarding (VibeTunnel-inspired): Welcome → What it does → Install CLI (shows `ln -s .../clawdis-mac /usr/local/bin`) → Permissions checklist with live status → Test notification → Done. Re-show when `welcomeVersion` bumps or CLI/app version mismatch.
## Built-in services
- NotificationManager: UNUserNotificationCenter primary; AppleScript `display notification` fallback; respects the `--sound` value on each request.
- PermissionManager: checks/requests Notifications, Accessibility (AX), Screen Recording (capture probe); publishes changes for UI.
- UI automation + capture: provided by **PeekabooBridgeHost** when enabled (see `docs/mac/peekaboo.md`).
- ShellExecutor: executes `Process` with timeout; rejects when `needsScreenRecording` and permission missing; returns stdout/stderr in payload.
- ControlSocketServer actor: routes Request → managers; logs via OSLog.
## CLI (`clawdis-mac`)
- Subcommands (text by default; `--json` for machine output; non-zero exit on failure):
- `notify --title --body [--sound] [--priority passive|active|timeSensitive] [--delivery system|overlay|auto]`
- `ensure-permissions --cap accessibility --cap screenRecording [--interactive]`
- UI automation + capture: use `peekaboo …` (Clawdis hosts PeekabooBridge; see `docs/mac/peekaboo.md`)
- `run -- cmd args... [--cwd] [--env KEY=VAL] [--timeout 30] [--needs-screen-recording]`
- `status`
- Nodes (bridge-connected companions):
- `node list` — lists paired + currently connected nodes, including advertised capabilities (e.g. `canvas`, `camera`) and hardware identifiers (`deviceFamily`, `modelIdentifier`).
- `node invoke --node <id> --command <name> [--params-json <json>]`
- Sounds: supply any macOS alert name with `--sound` per notification; omit the flag to use the system default. There is no longer a persisted “default sound” in the app UI.
- Priority: `timeSensitive` is best-effort and falls back to `active` unless the app is signed with the Time Sensitive Notifications entitlement.
- Delivery: `overlay` and `auto` show an in-app toast panel (bypasses Notification Center/Focus).
- Internals:
- For app-specific commands (`notify`, `ensure-permissions`, `run`, `status`): build `ClawdisIPC.Request`, send over the control socket.
- UI automation is intentionally not exposed via `clawdis-mac`; it lives behind PeekabooBridge and is surfaced by the `peekaboo` CLI.
## Integration with clawdis/Clawdis (Node/TS)
- Add helper module that shells to `clawdis-mac`:
- Prefer `ensure-permissions` before actions that need TCC.
- Use `notify` for desktop toasts; fall back to JS notifier only if CLI missing or platform ≠ macOS.
- Use `run` for tasks requiring privileged UI context (screen-recorded terminal runs, etc.).
- For UI automation, shell out to `peekaboo …` (text by default; add `--json` for structured output) and rely on PeekabooBridge host selection (Peekaboo.app → Clawdis.app → local).
## Onboarding
- Install CLI (symlink) → Permissions checklist → Test notification → Done.
- Remote mode skips local gateway/CLI steps.
## Deep links (URL scheme)
@@ -127,24 +89,12 @@ Notes:
- In local mode, Clawdis will start the local Gateway if needed before issuing the request.
- In remote mode, Clawdis will use the configured remote tunnel/endpoint.
## Permissions strategy
- All TCC prompts originate from the app bundle; CLI and Node stay headless.
- Permission checks are idempotent; onboarding surfaces missing grants and provides one-click request buttons.
## Build & dev workflow (native)
- `cd native && swift build` (debug) / `swift build -c release`.
- Run app for dev: `swift run Clawdis` (or Xcode scheme).
- Package app + helper: `swift build -c release && swift package --allow-writing-to-directory ../dist` (tbd exact script).
- Tests: add Swift Testing suites under `apps/macos/Tests` (especially IPC round-trips and permission probing fakes).
## Icon pipeline
- Source asset lives at `apps/macos/Icon.icon` (glass .icon bundle).
- Regenerate the bundled icns via `scripts/build_icon.sh` (uses ictool/icontool + sips), which outputs to
`apps/macos/Sources/Clawdis/Resources/Clawdis.icns` by default. Override `DEST_ICNS` to change the target.
The script also writes intermediate renders to `apps/macos/build/icon/`.
- Package app + CLI: `scripts/package-mac-app.sh` (builds bun CLI + gateway).
- Tests: add Swift Testing suites under `apps/macos/Tests`.
## Open questions / decisions
- Where to place the dev symlink `bin/clawdis-mac` (repo root vs. `apps/macos/bin`)?
- Should `runShell` support streaming stdout/stderr (IPC with AsyncSequence) or just buffered? (Start buffered; streaming later.)
- Icon: reuse Clawdis lobster or new mac-specific glyph?
- Sparkle updates: bundled via Sparkle; release builds point at `https://raw.githubusercontent.com/steipete/clawdis/main/appcast.xml` and enable auto-checks, while debug builds leave the feed blank and disable checks.
- Should `system.run` support streaming stdout/stderr or keep buffered responses only?
- Should we allow node-side permission prompts, or always require explicit app UI action?

@@ -154,6 +154,27 @@ Defaults:
}
```
### `gateway` (Gateway server mode + bind)
Use `gateway.mode` to explicitly declare whether this machine should run the Gateway.
Defaults:
- mode: **unset** (treated as “do not auto-start”)
- bind: `loopback`
```json5
{
gateway: {
mode: "local", // or "remote"
bind: "loopback",
// controlUi: { enabled: true }
}
}
```
Notes:
- `clawdis gateway` refuses to start unless `gateway.mode` is set to `local` (or you pass the override flag).
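The start rule described in the note can be sketched as a small decision function. This is only an illustration, not the actual implementation: the `GatewayConfig` and `StartDecision` types, the `decideGatewayStart` name, and the `force` parameter (standing in for the override flag, whose real name is not given here) are all assumptions.

```typescript
// Hypothetical types; only `mode` and `bind` come from the config shown above.
type GatewayMode = "local" | "remote";

interface GatewayConfig {
  mode?: GatewayMode;
  bind?: string;
}

type StartDecision =
  | { start: true; bind: string }
  | { start: false; reason: string };

// `force` stands in for the override flag (its real name is an assumption).
function decideGatewayStart(
  gateway: GatewayConfig | undefined,
  force: boolean,
): StartDecision {
  const bind = gateway?.bind ?? "loopback"; // documented default
  if (gateway?.mode === "local" || force) {
    return { start: true, bind };
  }
  // Unset mode is treated as "do not auto-start", same as remote.
  const reason =
    gateway?.mode === "remote"
      ? "gateway.mode is remote"
      : "gateway.mode is unset";
  return { start: false, reason };
}
```

Under these assumptions, an unset or `remote` mode only starts when the override is passed, and `bind` falls back to `loopback` when omitted.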
### `canvasHost` (LAN/tailnet Canvas file server + live reload)
The Gateway serves a directory of HTML/CSS/JS over HTTP so iOS/Android nodes can simply `canvas.navigate` to it.

@@ -31,7 +31,7 @@ Non-goals (v1):
## Current repo reality (constraints we respect)
- The Gateway WebSocket server binds to `127.0.0.1:18789` (`src/gateway/server.ts`) with an optional `CLAWDIS_GATEWAY_TOKEN`.
- The Gateway exposes a LAN/tailnet Canvas file server (`canvasHost`) by default so nodes can `canvas.navigate` to `http://<lanHost>:<canvasPort>/` and auto-reload when files change (`docs/configuration.md`).
- macOS “Canvas” exists today, but is **mac-only** and controlled via mac app IPC (`clawdis-mac canvas ...`) rather than the Gateway protocol (`docs/mac/canvas.md`).
- macOS “Canvas” is controlled via the Gateway node protocol (`canvas.*`), matching iOS/Android (`docs/mac/canvas.md`).
- Voice wake forwards via `GatewayChannel` to Gateway `agent` (mac app: `VoiceWakeForwarder` → `GatewayConnection.sendAgent`).
## Recommended topology (B): Gateway-owned Bridge + loopback Gateway

@@ -16,6 +16,8 @@ App bundle layout:
- `Clawdis.app/Contents/Resources/Relay/clawdis-gateway`
- bun `--compile` executable built from `dist/macos/gateway-daemon.js`
- `Clawdis.app/Contents/Resources/Relay/clawdis`
- bun `--compile` CLI executable built from `dist/index.js`
- `Clawdis.app/Contents/Resources/Relay/package.json`
- tiny “Pi compatibility” file (see below)
- `Clawdis.app/Contents/Resources/Relay/theme/`

@@ -77,9 +77,9 @@ Implementation notes:
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`.
- Optionally show close/drag affordances only while hovered.
## Agent API surface (proposed)
## Agent API surface (current)
Expose Canvas via the existing `clawdis-mac` → control socket → app routing so the agent can:
Canvas is exposed via the Gateway **node bridge**, so the agent can:
- Show/hide the panel.
- Navigate to a path (relative to the session root).
- Evaluate JavaScript and optionally return results.
@@ -94,21 +94,21 @@ Related:
## Agent commands (current)
`clawdis-mac` exposes Canvas via the control socket. For agent use, prefer `--json` so you can read the structured `CanvasShowResult` (including `status`).
Use the main `clawdis` CLI; it invokes canvas commands via `node.invoke`.
- `clawdis-mac canvas present [--session <key>] [--target <...>] [--x/--y/--width/--height]`
- `clawdis canvas present [--node <id>] [--target <...>] [--x/--y/--width/--height]`
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
- If `/` has no index file, Canvas shows the built-in A2UI shell and returns `status: "a2uiShell"`.
- `clawdis-mac canvas hide [--session <key>]`
- `clawdis-mac canvas eval --js <code> [--session <key>]`
- `clawdis-mac canvas snapshot [--out <path>] [--session <key>]`
- `clawdis canvas hide [--node <id>]`
- `clawdis canvas eval --js <code> [--node <id>]`
- `clawdis canvas snapshot [--node <id>]`
### Canvas A2UI
Canvas includes a built-in **A2UI v0.8** renderer (Lit-based). The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line):
- `clawdis-mac canvas a2ui push --jsonl <path> [--session <key>]`
- `clawdis-mac canvas a2ui reset [--session <key>]`
- `clawdis canvas a2ui push --jsonl <path> [--node <id>]`
- `clawdis canvas a2ui reset [--node <id>]`
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer).
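As a rough sketch of the "one JSON object per line" rule, a JSONL file could be validated before pushing it. The `parseJsonl` helper below is hypothetical and is not the CLI's own parser; skipping blank lines is an assumption.

```typescript
// Parse JSONL text: one JSON object per non-empty line.
// Hypothetical helper for pre-validating a file before `a2ui push`.
function parseJsonl(text: string): Record<string, unknown>[] {
  const messages: Record<string, unknown>[] = [];
  const lines = text.split(/\r?\n/);
  lines.forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines (an assumption)
    let value: unknown;
    try {
      value = JSON.parse(line);
    } catch {
      throw new Error(`line ${index + 1}: invalid JSON`);
    }
    // Each line must be a JSON object, not an array or a scalar.
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      throw new Error(`line ${index + 1}: expected a JSON object`);
    }
    messages.push(value as Record<string, unknown>);
  });
  return messages;
}
```

Errors carry the 1-based line number so a malformed message is easy to locate in the file.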
@@ -120,7 +120,7 @@ cat > /tmp/a2ui-v0.8.jsonl <<'EOF'
{"beginRendering":{"surfaceId":"main","root":"root"}}
EOF
clawdis-mac canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --session main
clawdis canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
```
Notes:

@@ -23,13 +23,11 @@ Run the Node-based Clawdis/clawdis gateway as a direct child of the LSUIElement
- **TCC:** behaviorally, child processes often inherit the parent app's “responsible process” for TCC, but this is *not a contract*. Continue to route all protected actions through the Swift app/broker so prompts stay tied to the signed app bundle.
## TCC guardrails (must keep)
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; use the CLI broker (`clawdis-mac`) for:
- `ensure-permissions`
- `ui screenshot` (via PeekabooBridge host)
- other `ui …` automation (see/click/type/scroll/wait) when implemented
- mic/speech permission checks
- notifications
- shell runs that need `needs-screen-recording`
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; route through the app's node commands (via Gateway `node.invoke`) for:
- `system.notify`
- `system.run` (including `needsScreenRecording`)
- `screen.record` / `camera.*`
- PeekabooBridge UI automation (`peekaboo …`)
- Usage strings (`NSMicrophoneUsageDescription`, `NSSpeechRecognitionUsageDescription`, etc.) stay in the app target's Info.plist; a bare Node binary has none and would fail.
- If you ever embed Node that *must* touch TCC, wrap that call in a tiny signed helper target inside the app bundle and have Node exec that helper instead of calling the API directly.
@@ -69,6 +67,6 @@ Run the Node-based Clawdis/clawdis gateway as a direct child of the LSUIElement
- Do we want a tiny signed helper for rare TCC actions that cannot be brokered via the Swift app/broker?
## Decision snapshot (current recommendation)
- Keep all TCC surfaces in the Swift app/broker (control socket + PeekabooBridgeHost).
- Keep all TCC surfaces in the Swift app/broker (node commands + PeekabooBridgeHost).
- Implement `GatewayProcessManager` with Swift Subprocess to start/stop the gateway on the “Clawdis Active” toggle.
- Maintain the launchd path as a fallback for uptime/login persistence until child-mode proves stable.

@@ -67,7 +67,7 @@ What Clawdis should *not* embed:
- **XPC**: don't reintroduce helper targets; use the bridge.
## IPC / CLI surface
### No `clawdis-mac ui …`
### No `clawdis ui …`
We avoid a parallel “Clawdis UI automation CLI”. Instead:
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
- Clawdis.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdis permissions when Peekaboo.app isn't running.

@@ -7,7 +7,7 @@ read_when:
Updated: 2025-12-08
This flow lets the macOS app act as a full remote control for a Clawdis gateway running on another host (e.g. a Mac Studio). All features—health checks, permissions bootstrapping via the helper CLI, Voice Wake forwarding, and Web Chat—reuse the same remote SSH configuration from *Settings → General*.
This flow lets the macOS app act as a full remote control for a Clawdis gateway running on another host (e.g. a Mac Studio). All features—health checks, Voice Wake forwarding, and Web Chat—reuse the same remote SSH configuration from *Settings → General*.
## Modes
- **Local (this Mac)**: Everything runs on the laptop. No SSH involved.
@@ -15,7 +15,7 @@ This flow lets the macOS app act as a full remote control for a Clawdis gateway
## Prereqs on the remote host
1) Install Node + pnpm and build/install the Clawdis CLI (`pnpm install && pnpm build && pnpm link --global`).
2) Ensure `clawdis` is on PATH for non-interactive shells. If you prefer, symlink `clawdis-mac` too so TCC-capable actions can run remotely when needed.
2) Ensure `clawdis` is on PATH for non-interactive shells (symlink into `/usr/local/bin` or `/opt/homebrew/bin` if needed).
3) Open SSH with key auth. We recommend **Tailscale** IPs for stable reachability off-LAN.
## macOS app setup
@@ -34,7 +34,7 @@ This flow lets the macOS app act as a full remote control for a Clawdis gateway
## Permissions
- The remote host needs the same TCC approvals as local (Automation, Accessibility, Screen Recording, Microphone, Speech Recognition, Notifications). Run onboarding on that machine to grant them once.
- When remote commands need local TCC (e.g., screenshots on the remote Mac), ensure `clawdis-mac` is installed there so the helper can request/hold those permissions.
- Nodes advertise their permission state via `node.list` / `node.describe` so agents know what's available.
## WhatsApp login flow (remote)
- Run `clawdis login --verbose` **on the remote host**. Scan the QR with WhatsApp on your phone.
@@ -47,10 +47,10 @@ This flow lets the macOS app act as a full remote control for a Clawdis gateway
- **Voice Wake**: trigger phrases are forwarded automatically in remote mode; no separate forwarder is needed.
## Notification sounds
Pick sounds per notification from scripts with the helper CLI, e.g.:
Pick sounds per notification from scripts with `clawdis` and `node.invoke`, e.g.:
```bash
clawdis-mac notify --title "Ping" --body "Remote gateway ready" --sound Glass
clawdis nodes notify --node <id> --title "Ping" --body "Remote gateway ready" --sound Glass
```
There is no global “default sound” toggle in the app anymore; callers choose a sound (or none) per request.

@@ -1,21 +1,21 @@
---
summary: "macOS IPC architecture for Clawdis app, CLI helper, and gateway bridge (control socket + XPC + PeekabooBridge)"
summary: "macOS IPC architecture for Clawdis app, gateway node bridge, and PeekabooBridge"
read_when:
- Editing IPC contracts or menu bar app IPC
---
# Clawdis macOS IPC architecture (Dec 2025)
Note: the current implementation primarily uses a local UNIX-domain control socket (`controlSocketPath`) between `clawdis-mac` and the app. This doc captures the intended long-term Mach/XPC direction and the security constraints, and also documents the separate PeekabooBridge socket used for UI automation.
**Current model:** there is **no local control socket** and no `clawdis-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge.
## Goals
- Single GUI app instance that owns all TCC-facing work (notifications, screen recording, mic, speech, AppleScript).
- A small surface for automation: the `clawdis-mac` CLI and the Node gateway talk to the app via local IPC.
- A small surface for automation: Gateway + node commands, plus PeekabooBridge for UI automation.
- Predictable permissions: always the same signed bundle ID, launched by launchd, so TCC grants stick.
- Limit who can connect: only signed clients from our team (with an explicit DEBUG-only escape hatch for development).
## How it works
### Control socket (current)
- `clawdis-mac` talks to the app via a local UNIX socket (`controlSocketPath`) for app-specific requests (notify, status, ensure-permissions, run, etc.).
### Gateway + node bridge (current)
- The app runs the Gateway (local mode) and connects to it as a node.
- Agent actions are performed via `node.invoke` (e.g. `system.run`, `system.notify`, `canvas.*`).
### PeekabooBridge (UI automation)
- UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol.
@@ -24,29 +24,17 @@ Note: the current implementation primarily uses a local UNIX-domain control sock
- See: `docs/mac/peekaboo.md` for the Clawdis plan and naming.
### Mach/XPC (future direction)
- The app registers a Mach service named `com.steipete.clawdis.xpc` via a user LaunchAgent at `~/Library/LaunchAgents/com.steipete.clawdis.plist`.
- The launch agent runs `dist/Clawdis.app/Contents/MacOS/Clawdis` with `RunAtLoad=true`, `KeepAlive=false`, and a `MachServices` entry for the XPC name.
- The app hosts the XPC listener (`NSXPCListener(machServiceName:)`) and exports `ClawdisXPCService`.
- The CLI (`clawdis-mac`) connects with `NSXPCConnection(machServiceName:)`; the Node gateway shells out to the CLI.
- Security: on incoming connections we read the audit token (or PID) and allow only:
- Code-signed clients with team ID `Y5PE65HELJ`.
- In `DEBUG` builds only, you can opt into allowing same-UID clients by setting `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`.
- Still optional for internal app services, but **not required** for automation now that node.invoke is the surface.
## Operational flows
- Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
- Kills existing instances
- Swift build + package
- Writes/bootstraps/kickstarts the LaunchAgent
- CLI version: `clawdis-mac --version` (pulled from `package.json` during packaging)
- Single instance: app exits early if another instance with the same bundle ID is running.
## Why launchd (not anonymous endpoints)
- A Mach service avoids brittle endpoint handoffs and lets the CLI/Node connect even if the app was started by launchd.
- RunAtLoad without KeepAlive means the app starts once; if it crashes it stays down (no unwanted respawn), but CLI calls will re-spawn via launchd.
## Hardening notes
- Prefer requiring a TeamID match for all privileged surfaces.
- Clawdis control socket: `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
- PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
- All communication remains local-only; no network sockets are exposed.
- TCC prompts originate only from the GUI app bundle; run scripts/package-mac-app.sh so the signed bundle ID stays stable.
- TCC prompts originate only from the GUI app bundle; run `scripts/package-mac-app.sh` so the signed bundle ID stays stable.

@@ -1,5 +1,5 @@
---
summary: "Nodes: pairing, capabilities (canvas/camera), and the CLI helpers for screenshots + clips"
summary: "Nodes: pairing, capabilities, permissions, and CLI helpers for canvas/camera/screen/system"
read_when:
- Pairing iOS/Android nodes to a gateway
- Using node canvas/camera for agent context
@@ -8,7 +8,7 @@ read_when:
# Nodes
A **node** is a companion device (iOS/Android today) that connects to the Gateway over the **Bridge** and exposes a small command surface (e.g. `canvas.*`, `camera.*`) via `node.invoke`.
A **node** is a companion device (iOS/Android today) that connects to the Gateway over the **Bridge** and exposes a command surface (e.g. `canvas.*`, `camera.*`, `system.*`) via `node.invoke`.
macOS can also run in **node mode**: the menubar app connects to the Gateway's bridge and exposes its local canvas/camera commands as a node (so `clawdis nodes …` works against this Mac).
@@ -90,6 +90,25 @@ Notes:
- Screen recordings are clamped to `<= 60s`.
- `--no-audio` disables microphone capture (supported on iOS/Android; macOS uses system capture audio).
## System commands (mac node)
The macOS node exposes `system.run` and `system.notify`.
Examples:
```bash
clawdis nodes run --node <idOrNameOrIp> -- echo "Hello from mac node"
clawdis nodes notify --node <idOrNameOrIp> --title "Ping" --body "Gateway ready"
```
Notes:
- `system.run` returns stdout/stderr/exit code in the payload.
- `system.notify` respects notification permission state on the macOS app.
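A caller might consume the `system.run` payload roughly as follows. This is a sketch only: the `SystemRunResult` field names (`stdout`, `stderr`, `exitCode`) and the `describeRun` helper are assumptions, since the notes above say what the payload contains but not its exact shape.

```typescript
// Hypothetical payload shape; the field names are assumptions.
interface SystemRunResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Turn a run result into a single line a script or agent can report.
function describeRun(result: SystemRunResult): string {
  if (result.exitCode === 0) {
    return result.stdout.trimEnd();
  }
  const detail = result.stderr.trim() || "(no stderr)";
  return `command failed with exit code ${result.exitCode}: ${detail}`;
}
```

Keeping the exit code in the message lets the caller distinguish a command that failed from one that produced no output.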
## Permissions map
Nodes may include a `permissions` map in `node.list` / `node.describe`, keyed by permission name (e.g. `screenRecording`, `accessibility`) with boolean values (`true` = granted).
## Mac node mode
- The macOS menubar app connects to the Gateway bridge as a node (so `clawdis nodes …` works against this Mac).

@@ -20,7 +20,7 @@
- Avoid double-sending actions when the bundled A2UI shell is present (let the shell forward clicks so it can resolve richer context).
- Intercept `clawdis://…` navigations inside the Canvas WKWebView and route them through `DeepLinkHandler` (no NSWorkspace bounce).
- `GatewayConnection` auto-starts the local gateway (and retries briefly) when a request fails in `.local` mode, so Canvas actions don't silently fail if the gateway isn't running yet.
- Fix a crash that made `clawdis-mac canvas present`/`eval` look “hung”:
- Fix a crash that made `clawdis canvas present`/`eval` look “hung”:
- `VoicePushToTalkHotkey`'s NSEvent monitor could call `@MainActor` code off-main, triggering executor checks / EXC_BAD_ACCESS on macOS 26.2.
- Now it hops back to the main actor before mutating state.
- Preserve in-page state when closing Canvas (hide the window instead of closing the `WKWebView`).

@@ -0,0 +1,64 @@
---
summary: "Refactor: unify on the clawdis CLI + gateway-first control; retire clawdis-mac"
read_when:
- Removing or replacing the macOS CLI helper
- Adding node capabilities or permissions metadata
- Updating macOS app packaging/install flows
---
# CLI unification (clawdis-only)
Status: active refactor · Date: 2025-12-20
## Goals
- **Single CLI**: use `clawdis` for all automation (local + remote). Retire `clawdis-mac`.
- **Gateway-first**: all agent actions flow through the Gateway WebSocket + node.invoke.
- **Permission awareness**: nodes advertise permission state so the agent can decide what to run.
- **No duplicate paths**: remove macOS control socket + Swift CLI surface.
## Non-goals
- Keep legacy `clawdis-mac` compatibility.
- Support agent control when no Gateway is running.
## Key decisions
1) **No Gateway → no control**
- If the macOS app is running but the Gateway is not, remote commands (canvas/run/notify) are unavailable.
- This is acceptable to keep one network surface.
2) **Remove ensure-permissions CLI**
- Permissions are **advertised by the node** (e.g., screen recording granted/denied).
- Commands will still fail with explicit errors when permissions are missing.
3) **Mac app installs/symlinks `clawdis`**
- Bundle a standalone `clawdis` binary in the app (bun-compiled).
- Install/symlink that binary to `/usr/local/bin/clawdis` and `/opt/homebrew/bin/clawdis`.
- No `clawdis-mac` helper remains.
4) **Canvas parity across node types**
- Use `node.invoke` commands consistently (`canvas.present|navigate|eval|snapshot|a2ui.*`).
- The TS CLI provides convenient wrappers so agents never have to craft raw `node.invoke` calls.
## Command surface (new/normalized)
- `clawdis nodes invoke --command canvas.*` remains valid.
- New CLI wrappers for convenience:
- `clawdis canvas present|navigate|eval|snapshot|a2ui push|a2ui reset`
- New node commands (mac-only initially):
- `system.run` (shell execution)
- `system.notify` (local notifications)
## Permission advertising
- Node hello/pairing includes a `permissions` map:
- Example keys: `screenRecording`, `accessibility`, `microphone`, `notifications`, `speechRecognition`.
- Values: boolean (`true` = granted, `false` = not granted).
- Gateway `node.list` / `node.describe` surfaces the map.
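One way an agent could use the advertised map is to check it before invoking a command. The sketch below is an assumption throughout: the `REQUIRED_PERMISSIONS` table (which command needs which key) and the `missingPermissions` helper are hypothetical, and treating an absent key as "not granted" is a design choice, not something this doc specifies.

```typescript
// Shape described above: permission name -> granted?
type PermissionsMap = Record<string, boolean>;

// Hypothetical mapping from node command to the permission keys it needs.
const REQUIRED_PERMISSIONS: Record<string, string[]> = {
  "screen.record": ["screenRecording"],
  "system.notify": ["notifications"],
};

// Return the permission keys a node has not granted for a command.
// A missing map or a missing key counts as "not granted" (an assumption).
function missingPermissions(
  command: string,
  permissions: PermissionsMap | undefined,
): string[] {
  const required = REQUIRED_PERMISSIONS[command] ?? [];
  return required.filter((key) => permissions?.[key] !== true);
}
```

An empty result means the command can be attempted; the command itself still reports an explicit error if a permission turns out to be missing.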
## Gateway mode + config
- Gateways should only auto-start when explicitly configured for **local** mode.
- When config is missing or explicitly remote, `clawdis gateway` should refuse to auto-start unless forced.
## Implementation checklist
- Add bun-compiled `clawdis` binary to macOS app bundle; update codesign + install flows.
- Remove `ClawdisCLI` target and control socket server.
- Add node command(s) for `system.run` and `system.notify` on macOS.
- Add permission map to node hello/pairing + gateway responses.
- Update TS CLI + docs to use `clawdis` only.