security(macos): require TeamID for control socket

2025-12-13 02:50:20 +00:00
parent e95fdbbc37
commit 36b93c8dc7
3 changed files with 158 additions and 6 deletions
--- a/apps/macos/Sources/Clawdis/ControlSocketServer.swift
+++ b/apps/macos/Sources/Clawdis/ControlSocketServer.swift
@@ -225,12 +225,30 @@ final actor ControlSocketServer {
        let r = getsockopt(fd, SOL_LOCAL, LOCAL_PEERPID, &pid, &pidSize)
        guard r == 0, pid > 0 else { return false }

-        // Same-user quick check
-        if let callerUID = self.uid(for: pid), callerUID == getuid() {
+        // Always require a valid code signature match (TeamID).
+        // This prevents any same-UID process from driving the app's privileged surface.
+        if self.teamIDMatches(pid: pid, allowedTeamIDs: allowedTeamIDs) {
            return true
        }

-        return self.teamIDMatches(pid: pid, allowedTeamIDs: allowedTeamIDs)
+        #if DEBUG
+        // Debug-only escape hatch: allow unsigned/same-UID clients when explicitly opted in.
+        // This keeps local development workable (e.g. a SwiftPM-built `clawdis-mac` binary).
+        let env = ProcessInfo.processInfo.environment["CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS"]
+        if env == "1", let callerUID = self.uid(for: pid), callerUID == getuid() {
+            self.logger.warning(
+                "allowing unsigned same-UID socket client pid=\(pid, privacy: .public) due to CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1")
+            return true
+        }
+        #endif
+
+        if let callerUID = self.uid(for: pid) {
+            self.logger.error(
+                "socket client rejected pid=\(pid, privacy: .public) uid=\(callerUID, privacy: .public)")
+        } else {
+            self.logger.error("socket client rejected pid=\(pid, privacy: .public) (uid unknown)")
+        }
+        return false
    }

    private nonisolated static func uid(for pid: pid_t) -> uid_t? {
--- a/docs/mac/peekaboo.md
+++ b/docs/mac/peekaboo.md
@@ -0,0 +1,134 @@
+---
+summary: "Plan for integrating Peekaboo automation + visualizer into Clawdis macOS app (via clawdis-mac)"
+read_when:
+  - Adding UI automation commands
+  - Integrating Peekaboo as a submodule
+  - Changing clawdis-mac IPC/output formats
+---
+# Peekaboo in Clawdis (macOS UI automation + visualizer)
+
+## Goal
+Reuse Peekaboo’s mac automation “core” inside **Clawdis.app** so we piggyback on Clawdis’ existing TCC grants (Screen Recording, Accessibility, etc.). The CLI (`clawdis-mac`) stays a thin synchronous trigger surface for **single actions** (no batches), returning errors cleanly.
+
+Non-goals:
+- No AI/agent runtime parts from Peekaboo (no Tachikoma/MCP/Commander entrypoints).
+- No auto-onboarding or System Settings deep-linking from the automation layer (Clawdis onboarding already handles that).
+
+## Where code lives
+- **Clawdis.app (macOS)**: owns all automation + visualization + TCC prompts.
+- **`clawdis-mac` CLI**: sends one request, waits, prints result, exits non-zero on failure.
+- **Gateway/Node/TS**: shells out to `clawdis-mac` when it needs TCC-backed actions.
+
+Transport: existing UNIX domain socket (`controlSocketPath`) already used by `clawdis-mac`.
+
+## Dependencies (submodule strategy)
+Integrate Peekaboo via git submodule (nested submodules OK).
+
+Consume only:
+- `PeekabooAutomationKit` (AX automation, element detection, capture helpers; no Tachikoma/MCP).
+- `AXorcist` (input driving / AX helpers).
+- `PeekabooVisualizer` (overlay visualizations).
+
+Important nuance:
+- `PeekabooVisualizer` currently ships as the `PeekabooVisualizer` product inside `PeekabooCore/Package.swift`. That package declares other dependencies (including a path dependency to Tachikoma). SwiftPM will still need those paths to exist during dependency resolution even if we don’t build those targets.
+  - If this is too annoying for Clawdis, the follow-up is to extract `PeekabooVisualizer` into its own standalone Swift package that depends only on `PeekabooFoundation`/`PeekabooProtocols`/`PeekabooExternalDependencies`.
+
+## IPC / CLI surface
+### Namespacing
+Add new automation commands behind a `ui` prefix:
+- `clawdis-mac ui …` for UI automation + visualization-related actions.
+- Keep existing top-level commands (`notify`, `run`, `canvas …`, etc.) for compatibility, but `screenshot` should become an alias of `ui screenshot` once Peekaboo takes it over.
+
+### Output format
+Change `clawdis-mac` to default to human text output:
+- **Default**: plain text; errors are string messages to stderr; exit codes indicate success/failure.
+- **`--json`**: structured output (for agents/scripts) with stable schemas.
+
+This applies globally, not only `ui` commands.
+
+### Timeouts
+Default timeout for UI actions: **10 seconds** end-to-end (CLI already defaults to 10s).
+- CLI: keep the fail-fast default at 10s (unless a command explicitly requests longer).
+- Server: only has a ~5s read/decode timeout today; UI operations must also enforce their own per-action timeout so “wait for element” can fail deterministically.
+
+## Coordinate model (multi-display)
+Requirement: coordinates are **per screen**, not global.
+
+Proposed API shape:
+- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
+- Clawdis.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
+- Responses should echo both:
+  - The resolved `screenIndex`
+  - The local `{x, y}` and bounds
+  - Optionally the global `{x, y}` for debugging
+
+Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
+
+## Targeting (per app/window)
+Expose window/app targeting in the IPC surface (based on Peekaboo’s existing `WindowTarget` model):
+- frontmost
+- by app name / bundle id
+- by window title substring
+- by (app, index)
+- by window id
+
+All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
+
+## “See” + click packs (Playwright-style)
+Peekaboo already has the core ingredients:
+- element detection yielding stable IDs (e.g., `B1`, `T3`)
+- bounds + labels/values
+- session IDs to allow follow-up actions without re-scanning
+
+Clawdis’s `ui see` should:
+- capture (optionally targeted) window/screen
+- return a **session id**
+- return a list of elements with `{id, type, label/value?, bounds}`
+- optionally return screenshot path/bytes (pref: path)
+
+## Visualizer integration
+Visualizer must be user-toggleable via a Clawdis setting.
+
+Implementation sketch:
+- Add a Clawdis UserDefaults-backed setting (e.g. `clawdis.ui.visualizerEnabled`).
+- Implement Peekaboo’s `VisualizerSettingsProviding` in Clawdis (`visualizerEnabled`, animation speed, and per-effect toggles).
+- Create a Clawdis-specific `AutomationFeedbackClient` that forwards PeekabooAutomationKit feedback events into a shared `VisualizerCoordinator`.
+
+Current state:
+- `PeekabooVisualizer` already includes the visualization implementation (SwiftUI overlay views + coordinator).
+
+Open requirement:
+- “Any AX event should be clickable.” Today the visualizer is display-only; the likely follow-up is:
+  - make the annotated element overlays tappable (debug tool)
+  - surface tap → element id → send a `ui click --element <id> --session <sid>` request back through Clawdis’ control channel (or a local callback if the visualizer runs inside the app)
+
+## Screenshots (legacy → Peekaboo takeover)
+Clawdis currently has a legacy `screenshot` request returning raw PNG bytes in `Response.payload`.
+
+Migration plan:
+- Replace capture implementation with PeekabooAutomationKit’s capture service so we share:
+  - per-screen mapping
+  - window/app targeting
+  - visual feedback (flash / watch HUD) when enabled
+- Prefer writing images to a file path on the app side and returning the path (text-friendly), with `--json` providing the structured metadata.
+
+## Permissions behavior
+If required permissions are missing:
+- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
+- do not try to open System Settings from the automation endpoint
+
+## Security (socket auth)
+Clawdis’ socket is protected by:
+- filesystem perms on the socket path (owner read/write only)
+- server-side caller check:
+  - requires the caller’s code signature TeamID to be `Y5PE65HELJ`
+  - in `DEBUG` builds only, an explicit escape hatch allows same-UID clients when `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` is set (development convenience)
+
+This ensures “any local process” can’t drive the privileged surface just because it runs under the same macOS user.
+
+## Next integration steps (after this doc)
+1. Add Peekaboo as a git submodule (and required nested submodules).
+2. Wire SwiftPM deps in `apps/macos/Package.swift` to import `PeekabooAutomationKit` + `PeekabooVisualizer`.
+3. Extend `ClawdisIPC.Request` with `ui.*` commands (`see/click/type/scroll/wait/screenshot/windows/screens`).
+4. Implement handlers in Clawdis.app and route through PeekabooAutomationKit services.
+5. Update `clawdis-mac` output defaults (text + `--json`), and adjust any internal call sites that relied on JSON-by-default.
--- a/docs/mac/xpc.md
+++ b/docs/mac/xpc.md
@@ -9,7 +9,7 @@ read_when:
 - Single GUI app instance that owns all TCC-facing work (notifications, screen recording, mic, speech, AppleScript).
 - A small surface for automation: the `clawdis-mac` CLI and the Node gateway talk to the app via a local XPC channel.
 - Predictable permissions: always the same signed bundle ID, launched by launchd, so TCC grants stick.
- Limit who can connect: only signed clients from our team (with a same-UID fallback for development).
+- Limit who can connect: only signed clients from our team (with an explicit DEBUG-only escape hatch for development).

 ## How it works
 - The app registers a Mach service named `com.steipete.clawdis.xpc` via a user LaunchAgent at `~/Library/LaunchAgents/com.steipete.clawdis.plist`.
@@ -17,8 +17,8 @@ read_when:
 - The app hosts the XPC listener (`NSXPCListener(machServiceName:)`) and exports `ClawdisXPCService`.
 - The CLI (`clawdis-mac`) connects with `NSXPCConnection(machServiceName:)`; the Node gateway shells out to the CLI.
 - Security: on incoming connections we read the audit token (or PID) and allow only:
-  - Code-signed clients with team ID `Y5PE65HELJ`; or
-  - Same-UID processes (fallback to avoid blocking local dev).
+  - Code-signed clients with team ID `Y5PE65HELJ`.
+  - In `DEBUG` builds only, you can opt into allowing same-UID clients by setting `CLAWDIS_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`.

 ## Operational flows
 - Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`