docs: reorganize documentation structure
This commit is contained in:
133
docs/platforms/mac/bun.md
Normal file
133
docs/platforms/mac/bun.md
Normal file
@@ -0,0 +1,133 @@
|
||||
---
|
||||
summary: "Bundled bun gateway: packaging, launchd, signing, and bytecode"
|
||||
read_when:
|
||||
- Packaging Clawdbot.app
|
||||
- Debugging the bundled gateway binary
|
||||
- Changing bun build flags or codesigning
|
||||
---
|
||||
|
||||
# Bundled bun Gateway (macOS)
|
||||
|
||||
Goal: ship **Clawdbot.app** with a self-contained relay binary that can run both the CLI and the Gateway daemon. No global `npm install -g clawdbot`, no system Node requirement.
|
||||
|
||||
## What gets bundled
|
||||
|
||||
App bundle layout:
|
||||
|
||||
- `Clawdbot.app/Contents/Resources/Relay/clawdbot`
|
||||
- bun `--compile` relay executable built from [`dist/macos/relay.js`](https://github.com/clawdbot/clawdbot/blob/main/dist/macos/relay.js)
|
||||
- Supports:
|
||||
- `clawdbot …` (CLI)
|
||||
- `clawdbot gateway-daemon …` (LaunchAgent daemon)
|
||||
- `Clawdbot.app/Contents/Resources/Relay/package.json`
|
||||
- tiny “p runtime compatibility” file (see below)
|
||||
- `Clawdbot.app/Contents/Resources/Relay/theme/`
|
||||
- p TUI theme payload (optional, but strongly recommended)
|
||||
|
||||
Why the sidecar files matter:
|
||||
- The embedded p runtime detects “bun binary mode” and then looks for `package.json` + `theme/` **next to `process.execPath`** (i.e. next to `clawdbot`).
|
||||
- So even if bun can embed assets, the runtime expects filesystem paths. Keep the sidecar files.
|
||||
|
||||
## Build pipeline
|
||||
|
||||
Packaging script:
|
||||
- [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh)
|
||||
|
||||
It builds:
|
||||
- TS: `pnpm exec tsc`
|
||||
- Swift app + helper: `swift build …`
|
||||
- bun relay: `bun build dist/macos/relay.js --compile --bytecode …`
|
||||
|
||||
Important bundler flags:
|
||||
- `--compile`: produces a standalone executable
|
||||
- `--bytecode`: reduces startup time / parsing overhead (works here)
|
||||
- externals:
|
||||
- `-e electron`
|
||||
- Reason: avoid bundling Electron stubs in the relay binary
|
||||
|
||||
Version injection:
|
||||
- `--define "__CLAWDBOT_VERSION__=\"<pkg version>\""`
|
||||
- [`src/version.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/version.ts) also supports `__CLAWDBOT_VERSION__` (and `CLAWDBOT_BUNDLED_VERSION`) so `--version` doesn’t depend on reading `package.json` at runtime.
|
||||
|
||||
## Launchd (Gateway as LaunchAgent)
|
||||
|
||||
Label:
|
||||
- `com.clawdbot.gateway`
|
||||
|
||||
Plist location (per-user):
|
||||
- `~/Library/LaunchAgents/com.clawdbot.gateway.plist`
|
||||
|
||||
Manager:
|
||||
- [`apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift)
|
||||
|
||||
Behavior:
|
||||
- “Clawdbot Active” enables/disables the LaunchAgent.
|
||||
- App quit does **not** stop the gateway (launchd keeps it alive).
|
||||
|
||||
Logging:
|
||||
- launchd stdout/err: `/tmp/clawdbot/clawdbot-gateway.log`
|
||||
|
||||
Default LaunchAgent env:
|
||||
- `CLAWDBOT_IMAGE_BACKEND=sips` (avoid sharp native addon under bun)
|
||||
|
||||
## Codesigning (hardened runtime + bun)
|
||||
|
||||
Symptom (when mis-signed):
|
||||
- `Ran out of executable memory …` on launch
|
||||
|
||||
Fix:
|
||||
- The bun executable needs JIT-ish permissions under hardened runtime.
|
||||
- [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) signs `Relay/clawdbot` with:
|
||||
- `com.apple.security.cs.allow-jit`
|
||||
- `com.apple.security.cs.allow-unsigned-executable-memory`
|
||||
|
||||
## Image processing under bun
|
||||
|
||||
Problem:
|
||||
- bun can’t load some native Node addons like `sharp` (and we don’t want to ship native addon trees for the gateway).
|
||||
|
||||
Solution:
|
||||
- Central helper [`src/media/image-ops.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/media/image-ops.ts)
|
||||
- Prefers `/usr/bin/sips` on macOS (esp. when running under bun)
|
||||
- Falls back to `sharp` when available (Node/dev)
|
||||
- Used by:
|
||||
- [`src/web/media.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/media.ts) (optimize inbound/outbound images)
|
||||
- [`src/browser/screenshot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/browser/screenshot.ts)
|
||||
- [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts) (image sanitization)
|
||||
|
||||
## Browser control server
|
||||
|
||||
The Gateway starts the browser control server (loopback only) from [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts).
|
||||
It’s started from the relay daemon process, so the relay binary includes Playwright deps.
|
||||
|
||||
## Tests / smoke checks
|
||||
|
||||
From a packaged app (local build):
|
||||
|
||||
```bash
|
||||
dist/Clawdbot.app/Contents/Resources/Relay/clawdbot --version
|
||||
|
||||
CLAWDBOT_SKIP_PROVIDERS=1 \
|
||||
CLAWDBOT_SKIP_CANVAS_HOST=1 \
|
||||
dist/Clawdbot.app/Contents/Resources/Relay/clawdbot gateway-daemon --port 18999 --bind loopback
|
||||
```
|
||||
|
||||
Then, in another shell:
|
||||
|
||||
```bash
|
||||
pnpm -s clawdbot gateway call health --url ws://127.0.0.1:18999 --timeout 3000
|
||||
```
|
||||
|
||||
## Repo hygiene
|
||||
|
||||
Bun may leave dotfiles like `*.bun-build` in the repo root or subfolders.
|
||||
- These are ignored via `.gitignore` (`*.bun-build`).
|
||||
|
||||
## DMG styling (human installer)
|
||||
|
||||
[`scripts/create-dmg.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/create-dmg.sh) styles the DMG via Finder AppleScript.
|
||||
|
||||
Rules of thumb:
|
||||
- Use a **72dpi** background image that matches the Finder window size in points.
|
||||
- Preferred asset: `assets/dmg-background-small.png` (**500×320**).
|
||||
- Default icon positions: app `{125,160}`, Applications `{375,160}`.
|
||||
161
docs/platforms/mac/canvas.md
Normal file
161
docs/platforms/mac/canvas.md
Normal file
@@ -0,0 +1,161 @@
|
||||
---
|
||||
summary: "Agent-controlled Canvas panel embedded via WKWebView + custom URL scheme"
|
||||
read_when:
|
||||
- Implementing the macOS Canvas panel
|
||||
- Adding agent controls for visual workspace
|
||||
- Debugging WKWebView canvas loads
|
||||
---
|
||||
|
||||
# Canvas (macOS app)
|
||||
|
||||
Status: draft spec · Date: 2025-12-12
|
||||
|
||||
Note: for iOS/Android nodes that should render agent-edited HTML/CSS/JS over the network, prefer the Gateway `canvasHost` (serves `~/clawd/canvas` over LAN/tailnet with live reload). A2UI is also **hosted by the Gateway** over HTTP. This doc focuses on the macOS in-app canvas panel. See [`docs/configuration.md`](/configuration).
|
||||
|
||||
Clawdbot can embed an agent-controlled “visual workspace” panel (“Canvas”) inside the macOS app using `WKWebView`, served via a **custom URL scheme** (no loopback HTTP port required).
|
||||
|
||||
This is designed for:
|
||||
- Agent-written HTML/CSS/JS on disk (per-session directory).
|
||||
- A real browser engine for layout, rendering, and basic interactivity.
|
||||
- Agent-driven visibility (show/hide), navigation, DOM/JS queries, and snapshots.
|
||||
- Minimal chrome: borderless panel; bezel/chrome appears only on hover.
|
||||
|
||||
## Why a custom scheme (vs. loopback HTTP)
|
||||
|
||||
Using `WKURLSchemeHandler` keeps Canvas entirely in-process:
|
||||
- No port conflicts and no extra local server lifecycle.
|
||||
- Easier to sandbox: only serve files we explicitly map.
|
||||
- Works offline and can use an ephemeral data store (no persistent cookies/cache).
|
||||
|
||||
If a Canvas page truly needs “real web” semantics (CORS, fetch to loopback endpoints, service workers), consider the loopback-server variant instead (out of scope for this doc).
|
||||
|
||||
## URL ↔ directory mapping
|
||||
|
||||
The Canvas scheme is:
|
||||
- `clawdbot-canvas://<session>/<path>`
|
||||
|
||||
Routing model:
|
||||
- `clawdbot-canvas://main/` → `<canvasRoot>/main/index.html` (or `index.htm`)
|
||||
- `clawdbot-canvas://main/yolo` → `<canvasRoot>/main/yolo/index.html` (or `index.htm`)
|
||||
- `clawdbot-canvas://main/assets/app.css` → `<canvasRoot>/main/assets/app.css`
|
||||
|
||||
Directory listings are not served.
|
||||
|
||||
When `/` has no `index.html` yet, the handler serves a **built-in scaffold page** (bundled with the macOS app).
|
||||
This is a visual placeholder only (no A2UI renderer).
|
||||
|
||||
### Suggested on-disk location
|
||||
|
||||
Store Canvas state under the app support directory:
|
||||
- `~/Library/Application Support/Clawdbot/canvas/<session>/…`
|
||||
|
||||
This keeps it alongside other app-owned state and avoids mixing with `~/.clawdbot/` gateway config.
|
||||
|
||||
## Panel behavior (agent-controlled)
|
||||
|
||||
Canvas is presented as a borderless `NSPanel` (similar to the existing WebChat panel):
|
||||
- Can be shown/hidden at any time by the agent.
|
||||
- Supports an “anchored” presentation (near the menu bar icon or another anchor rect).
|
||||
- Uses a rounded container; shadow stays on, but **chrome/bezel only appears on hover**.
|
||||
- Default position is the **top-right corner** of the current screen’s visible frame (unless the user moved/resized it previously).
|
||||
- The panel is **user-resizable** (edge resize + hover resize handle) and the last frame is persisted per session.
|
||||
|
||||
### Hover-only chrome
|
||||
|
||||
Implementation notes:
|
||||
- Keep the window borderless at all times (don’t toggle `styleMask`).
|
||||
- Add an overlay view inside the content container for chrome (stroke + subtle gradient/material).
|
||||
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`.
|
||||
- Optionally show close/drag affordances only while hovered.
|
||||
|
||||
## Agent API surface (current)
|
||||
|
||||
Canvas is exposed via the Gateway **node bridge**, so the agent can:
|
||||
- Show/hide the panel.
|
||||
- Navigate to a path (relative to the session root).
|
||||
- Evaluate JavaScript and optionally return results.
|
||||
- Query/modify DOM (helpers mirroring “dom query/all/attr/click/type/wait” patterns).
|
||||
- Capture a snapshot image of the current canvas view.
|
||||
- Optionally set panel placement (screen `x/y` + `width/height`) when showing/navigating.
|
||||
|
||||
This should be modeled after `WebChatManager`/`WebChatSwiftUIWindowController` but targeting `clawdbot-canvas://…` URLs.
|
||||
|
||||
Related:
|
||||
- For “invoke the agent again from UI” flows, prefer the macOS deep link scheme (`clawdbot://agent?...`) so *any* UI surface (Canvas, WebChat, native views) can trigger a new agent run. See [`docs/macos.md`](/macos).
|
||||
|
||||
## Agent commands (current)
|
||||
|
||||
Use the main `clawdbot` CLI; it invokes canvas commands via `node.invoke`.
|
||||
|
||||
- `clawdbot canvas present [--node <id>] [--target <...>] [--x/--y/--width/--height]`
|
||||
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
|
||||
- If `/` has no index file, Canvas shows the built-in scaffold page and returns `status: "welcome"`.
|
||||
- `clawdbot canvas hide [--node <id>]`
|
||||
- `clawdbot canvas eval --js <code> [--node <id>]`
|
||||
- `clawdbot canvas snapshot [--node <id>]`
|
||||
|
||||
### Canvas A2UI
|
||||
|
||||
Canvas A2UI is hosted by the **Gateway canvas host** at:
|
||||
|
||||
```
|
||||
http://<gateway-host>:18793/__clawdbot__/a2ui/
|
||||
```
|
||||
|
||||
The macOS app simply renders that page in the Canvas panel. The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line):
|
||||
|
||||
- `clawdbot canvas a2ui push --jsonl <path> [--node <id>]`
|
||||
- `clawdbot canvas a2ui reset [--node <id>]`
|
||||
|
||||
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer).
|
||||
|
||||
Minimal example (v0.8):
|
||||
|
||||
```bash
|
||||
cat > /tmp/a2ui-v0.8.jsonl <<'EOF'
|
||||
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, `canvas a2ui push` works."},"usageHint":"body"}}}]}}
|
||||
{"beginRendering":{"surfaceId":"main","root":"root"}}
|
||||
EOF
|
||||
|
||||
clawdbot canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
|
||||
```
|
||||
|
||||
Notes:
|
||||
- This does **not** support the A2UI v0.9 examples using `createSurface`.
|
||||
- A2UI **fails** if the Gateway canvas host is unreachable (no local fallback).
|
||||
- `canvas a2ui push` validates JSONL (line numbers on errors) and rejects v0.9 payloads.
|
||||
- Quick smoke: `clawdbot canvas a2ui push --text "Hello from A2UI"` renders a minimal v0.8 view.
|
||||
|
||||
## Triggering agent runs from Canvas (deep links)
|
||||
|
||||
Canvas can trigger new agent runs via the macOS app deep-link scheme:
|
||||
- `clawdbot://agent?...`
|
||||
|
||||
This is intentionally separate from `clawdbot-canvas://…` (which is only for serving local Canvas files into the `WKWebView`).
|
||||
|
||||
Suggested patterns:
|
||||
- HTML: render links/buttons that navigate to `clawdbot://agent?message=...`.
|
||||
- JS: set `window.location.href = 'clawdbot://agent?...'` for “run this now” actions.
|
||||
|
||||
Implementation note (important):
|
||||
- In `WKWebView`, intercept `clawdbot://…` navigations in `WKNavigationDelegate` and forward them to the app, e.g. by calling `DeepLinkHandler.shared.handle(url:)` and returning `.cancel` for the navigation.
|
||||
|
||||
Safety:
|
||||
- Deep links (`clawdbot://agent?...`) are always enabled.
|
||||
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
|
||||
- With a valid `key`, the run is unattended (no prompt). For Canvas-originated actions, the app injects an internal key automatically.
|
||||
|
||||
## Security / guardrails
|
||||
|
||||
Recommended defaults:
|
||||
- `WKWebsiteDataStore.nonPersistent()` for Canvas (ephemeral).
|
||||
- Navigation policy: allow only `clawdbot-canvas://…` (and optionally `about:blank`); open `http/https` externally.
|
||||
- Scheme handler must prevent directory traversal: resolved file paths must stay under `<canvasRoot>/<session>/`.
|
||||
- Disable or tightly scope any JS bridge; prefer query-string/bootstrap config over `window.webkit.messageHandlers` for sensitive data.
|
||||
|
||||
## Debugging
|
||||
|
||||
Suggested debugging hooks:
|
||||
- Enable Web Inspector for Canvas builds (same approach as WebChat).
|
||||
- Log scheme requests + resolution decisions to OSLog (subsystem `com.clawdbot`, category `Canvas`).
|
||||
- Provide a “copy canvas dir” action in debug settings to quickly reveal the session directory in Finder.
|
||||
72
docs/platforms/mac/child-process.md
Normal file
72
docs/platforms/mac/child-process.md
Normal file
@@ -0,0 +1,72 @@
|
||||
---
|
||||
summary: "Running the gateway as a child process of the macOS app and why"
|
||||
read_when:
|
||||
- Integrating the mac app with the gateway lifecycle
|
||||
---
|
||||
# Clawdbot gateway as a child process of the macOS app
|
||||
|
||||
Date: 2025-12-06 · Status: draft · Owner: steipete
|
||||
|
||||
Note (2025-12-19): the current implementation prefers a **launchd LaunchAgent** that runs the **bundled bun-compiled gateway**. This doc remains as an alternative mode for tighter coupling to the UI.
|
||||
|
||||
## Goal
|
||||
Run the Node-based Clawdbot/clawdbot gateway as a direct child of the LSUIElement app (instead of a launchd agent) while keeping all TCC-sensitive work inside the Swift app/broker layer and wiring the existing “Clawdbot Active” toggle to start/stop the child.
|
||||
|
||||
## When to prefer the child-process mode
|
||||
- You want gateway lifetime strictly coupled to the menu-bar app (dies when the app quits) and controlled by the “Clawdbot Active” toggle without touching launchd.
|
||||
- You’re okay giving up login persistence/auto-restart that launchd provides, or you’ll add your own backoff loop.
|
||||
- You want simpler log capture and supervision inside the app (no external plist or user-visible LaunchAgent).
|
||||
|
||||
## Tradeoffs vs. launchd
|
||||
- **Pros:** tighter coupling to UI state; simpler surface (no plist install/bootout); easier to stream stdout/stderr; fewer moving parts for beta users.
|
||||
- **Cons:** no built-in KeepAlive/login auto-start; app crash kills gateway; you must build your own restart/backoff; Activity Monitor will show both processes under the app; still need correct TCC handling (see below).
|
||||
- **TCC:** behaviorally, child processes often inherit the parent app’s “responsible process” for TCC, but this is *not a contract*. Continue to route all protected actions through the Swift app/broker so prompts stay tied to the signed app bundle.
|
||||
|
||||
## TCC guardrails (must keep)
|
||||
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; route through the app’s node commands (via Gateway `node.invoke`) for:
|
||||
- `system.notify`
|
||||
- `system.run` (including `needsScreenRecording`)
|
||||
- `screen.record` / `camera.*`
|
||||
- PeekabooBridge UI automation (`peekaboo …`)
|
||||
- Usage strings (`NSMicrophoneUsageDescription`, `NSSpeechRecognitionUsageDescription`, etc.) stay in the app target’s Info.plist; a bare Node binary has none and would fail.
|
||||
- If you ever embed Node that *must* touch TCC, wrap that call in a tiny signed helper target inside the app bundle and have Node exec that helper instead of calling the API directly.
|
||||
|
||||
## Process manager design (Swift Subprocess)
|
||||
- Add a small `GatewayProcessManager` (Swift) that owns:
|
||||
- `execution: Execution?` from `Swift Subprocess` to track the child.
|
||||
- `start(config)` called when “Clawdbot Active” flips ON:
|
||||
- binary: host Node running the bundled gateway under `Clawdbot.app/Contents/Resources/Gateway/`
|
||||
- args: current clawdbot entrypoint and flags
|
||||
- cwd/env: point to `~/.clawdbot` as today; inject the expanded PATH so Homebrew Node resolves under launchd
|
||||
- output: stream stdout/stderr to `/tmp/clawdbot-gateway.log` (cap buffer via Subprocess OutputLimits)
|
||||
- restart: optional linear/backoff restart if exit was non-zero and Active is still true
|
||||
- `stop()` called when Active flips OFF or app terminates: cancel the execution and `waitUntilExit`.
|
||||
- Wire SwiftUI toggle:
|
||||
- ON: `GatewayProcessManager.start(...)`
|
||||
- OFF: `GatewayProcessManager.stop()` (no launchctl calls in this mode)
|
||||
- Keep the existing `LaunchdManager` around so we can switch back if needed; the toggle can choose between launchd or child mode with a flag if we want both.
|
||||
|
||||
## Packaging and signing
|
||||
- Bundle the gateway payload (dist + production node_modules) under `Contents/Resources/Gateway/`; rely on host Node ≥22 instead of embedding a runtime.
|
||||
- Codesign native addons and dylibs inside the bundle; no nested runtime binary to sign now.
|
||||
- Host runtime should not call TCC APIs directly; keep privileged work inside the app/broker.
|
||||
|
||||
## Logging and observability
|
||||
- Stream child stdout/stderr to `/tmp/clawdbot-gateway.log`; surface the last N lines in the Debug tab.
|
||||
- Emit a user notification (via existing NotificationManager) on crash/exit while Active is true.
|
||||
- Add a lightweight heartbeat from Node → app (e.g., ping over stdout) so the app can show status in the menu.
|
||||
|
||||
## Failure/edge cases
|
||||
- App crash/quit kills the gateway. Decide if that is acceptable for the deployment tier; otherwise, stick with launchd for production and keep child-process for dev/experiments.
|
||||
- If the gateway exits repeatedly, back off (e.g., 1s/2s/5s/10s) and give up after N attempts with a menu warning.
|
||||
- Respect the existing pause semantics: when paused, the broker should return `ok=false, "clawdbot paused"`; the gateway should avoid calling privileged routes while paused.
|
||||
|
||||
## Open questions / follow-ups
|
||||
- Do we need dual-mode (launchd for prod, child for dev)? If yes, gate via a setting or build flag.
|
||||
- Embedding a runtime is off the table for now; we rely on host Node for size/simplicity. Revisit only if host PATH drift becomes painful.
|
||||
- Do we want a tiny signed helper for rare TCC actions that cannot be brokered via the Swift app/broker?
|
||||
|
||||
## Decision snapshot (current recommendation)
|
||||
- Keep all TCC surfaces in the Swift app/broker (node commands + PeekabooBridgeHost).
|
||||
- Implement `GatewayProcessManager` with Swift Subprocess to start/stop the gateway on the “Clawdbot Active” toggle.
|
||||
- Maintain the launchd path as a fallback for uptime/login persistence until child-mode proves stable.
|
||||
96
docs/platforms/mac/dev-setup.md
Normal file
96
docs/platforms/mac/dev-setup.md
Normal file
@@ -0,0 +1,96 @@
|
||||
---
|
||||
summary: "Setup guide for developers working on the Clawdbot macOS app"
|
||||
read_when:
|
||||
- Setting up the macOS development environment
|
||||
---
|
||||
# macOS Developer Setup
|
||||
|
||||
This guide covers the necessary steps to build and run the Clawdbot macOS application from source.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before building the app, ensure you have the following installed:
|
||||
|
||||
1. **Xcode 26.2+**: Required for Swift development.
|
||||
2. **Node.js & pnpm**: Required for the gateway and CLI components.
|
||||
3. **Bun**: Required to package the embedded gateway relay.
|
||||
```bash
|
||||
curl -fsSL https://bun.sh/install | bash
|
||||
```
|
||||
|
||||
## 1. Initialize Submodules
|
||||
|
||||
Clawdbot depends on several submodules (like `Peekaboo`). You must initialize these recursively:
|
||||
|
||||
```bash
|
||||
git submodule update --init --recursive
|
||||
```
|
||||
|
||||
## 2. Install Dependencies
|
||||
|
||||
Install the project-wide dependencies:
|
||||
|
||||
```bash
|
||||
pnpm install
|
||||
```
|
||||
|
||||
## 3. Build and Package the App
|
||||
|
||||
To build the macOS app and package it into `dist/Clawdbot.app`, run:
|
||||
|
||||
```bash
|
||||
./scripts/package-mac-app.sh
|
||||
```
|
||||
|
||||
If you don't have an Apple Developer ID certificate, the script will automatically use **ad-hoc signing** (`-`).
|
||||
|
||||
> **Note**: Ad-hoc signed apps may trigger security prompts. If the app crashes immediately with "Abort trap 6", see the [Troubleshooting](#troubleshooting) section.
|
||||
|
||||
## 4. Install the CLI Helper
|
||||
|
||||
The macOS app requires a symlink named `clawdbot` in `/usr/local/bin` or `/opt/homebrew/bin` to manage background tasks.
|
||||
|
||||
**To install it:**
|
||||
1. Open the Clawdbot app.
|
||||
2. Go to the **General** settings tab.
|
||||
3. Click **"Install CLI helper"** (requires administrator privileges).
|
||||
|
||||
Alternatively, you can manually link it from your Admin account:
|
||||
```bash
|
||||
sudo ln -sf "/Users/$(whoami)/Projects/clawdbot/dist/Clawdbot.app/Contents/Resources/Relay/clawdbot" /usr/local/bin/clawdbot
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Build Fails: Toolchain or SDK Mismatch
|
||||
The macOS app build expects the latest macOS SDK and Swift 6.2 toolchain.
|
||||
|
||||
**System dependencies (required):**
|
||||
- **Latest macOS version available in Software Update** (required by Xcode 26.2 SDKs)
|
||||
- **Xcode 26.2** (Swift 6.2 toolchain)
|
||||
|
||||
**Checks:**
|
||||
```bash
|
||||
xcodebuild -version
|
||||
xcrun swift --version
|
||||
```
|
||||
|
||||
If versions don’t match, update macOS/Xcode and re-run the build.
|
||||
|
||||
### App Crashes on Permission Grant
|
||||
If the app crashes when you try to allow **Speech Recognition** or **Microphone** access, it may be due to a corrupted TCC cache or signature mismatch.
|
||||
|
||||
**Fix:**
|
||||
1. Reset the TCC permissions:
|
||||
```bash
|
||||
tccutil reset All com.clawdbot.mac.debug
|
||||
```
|
||||
2. If that fails, change the `BUNDLE_ID` temporarily in [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) to force a "clean slate" from macOS.
|
||||
|
||||
### Gateway "Starting..." indefinitely
|
||||
If the gateway status stays on "Starting...", check if a zombie process is holding the port:
|
||||
|
||||
```bash
|
||||
lsof -nP -i :18789
|
||||
```
|
||||
Kill any existing `node` or `clawdbot` processes listening on that port and restart the app.
|
||||
28
docs/platforms/mac/health.md
Normal file
28
docs/platforms/mac/health.md
Normal file
@@ -0,0 +1,28 @@
|
||||
---
|
||||
summary: "How the macOS app reports gateway/Baileys health states"
|
||||
read_when:
|
||||
- Debugging mac app health indicators
|
||||
---
|
||||
# Health Checks on macOS
|
||||
|
||||
How to see whether the WhatsApp Web/Baileys bridge is healthy from the menu bar app.
|
||||
|
||||
## Menu bar
|
||||
- Status dot now reflects Baileys health:
|
||||
- Green: linked + socket opened recently.
|
||||
- Orange: connecting/retrying.
|
||||
- Red: logged out or probe failed.
|
||||
- Secondary line reads "Web: linked · auth 12m · socket ok" or shows the failure reason.
|
||||
- "Run Health Check" menu item triggers an on-demand probe.
|
||||
|
||||
## Settings
|
||||
- General tab gains a Health card showing: linked auth age, session-store path/count, last check time, last error/status code, and buttons for Run Health Check / Reveal Logs.
|
||||
- Uses a cached snapshot so the UI loads instantly and falls back gracefully when offline.
|
||||
- **Connections tab** surfaces provider status + controls for WhatsApp/Telegram (login QR, logout, probe, last disconnect/error).
|
||||
|
||||
## How the probe works
|
||||
- App runs `clawdbot health --json` via `ShellExecutor` every ~60s and on demand. The probe loads creds, attempts a short Baileys connect, and reports status without sending messages.
|
||||
- Cache the last good snapshot and the last error separately to avoid flicker; show the timestamp of each.
|
||||
|
||||
## When in doubt
|
||||
- You can still use the CLI flow in [`docs/health.md`](/health) (`clawdbot status`, `clawdbot status --deep`, `clawdbot health --json`) and tail `/tmp/clawdbot/clawdbot-*.log` for `web-heartbeat` / `web-reconnect`.
|
||||
26
docs/platforms/mac/icon.md
Normal file
26
docs/platforms/mac/icon.md
Normal file
@@ -0,0 +1,26 @@
|
||||
---
|
||||
summary: "Menu bar icon states and animations for Clawdbot on macOS"
|
||||
read_when:
|
||||
- Changing menu bar icon behavior
|
||||
---
|
||||
# Menu Bar Icon States
|
||||
|
||||
Author: steipete · Updated: 2025-12-06 · Scope: macOS app (`apps/macos`)
|
||||
|
||||
- **Idle:** Normal icon animation (blink, occasional wiggle).
|
||||
- **Paused:** Status item uses `appearsDisabled`; no motion.
|
||||
- **Voice trigger (big ears):** Voice wake detector calls `AppState.triggerVoiceEars(ttl: nil)` when the wake word is heard, keeping `earBoostActive=true` while the utterance is captured. Ears scale up (1.9x), get circular ear holes for readability, then drop via `stopVoiceEars()` after 1s of silence. Only fired from the in-app voice pipeline.
|
||||
- **Working (agent running):** `AppState.isWorking=true` drives a “tail/leg scurry” micro-motion: faster leg wiggle and slight offset while work is in-flight. Currently toggled around WebChat agent runs; add the same toggle around other long tasks when you wire them.
|
||||
|
||||
Wiring points
|
||||
- Voice wake: runtime/tester call `AppState.triggerVoiceEars(ttl: nil)` on trigger and `stopVoiceEars()` after 1s of silence to match the capture window.
|
||||
- Agent activity: set `AppStateStore.shared.setWorking(true/false)` around work spans (already done in WebChat agent call). Keep spans short and reset in `defer` blocks to avoid stuck animations.
|
||||
|
||||
Shapes & sizes
|
||||
- Base icon drawn in `CritterIconRenderer.makeIcon(blink:legWiggle:earWiggle:earScale:earHoles:)`.
|
||||
- Ear scale defaults to `1.0`; voice boost sets `earScale=1.9` and toggles `earHoles=true` without changing overall frame (18×18 pt template image rendered into a 36×36 px Retina backing store).
|
||||
- Scurry uses leg wiggle up to ~1.0 with a small horizontal jiggle; it’s additive to any existing idle wiggle.
|
||||
|
||||
Behavioral notes
|
||||
- No external CLI/broker toggle for ears/working; keep it internal to the app’s own signals to avoid accidental flapping.
|
||||
- Keep TTLs short (<10s) so the icon returns to baseline quickly if a job hangs.
|
||||
51
docs/platforms/mac/logging.md
Normal file
51
docs/platforms/mac/logging.md
Normal file
@@ -0,0 +1,51 @@
|
||||
---
|
||||
summary: "Clawdbot logging: rolling diagnostics file log + unified log privacy flags"
|
||||
read_when:
|
||||
- Capturing macOS logs or investigating private data logging
|
||||
- Debugging voice wake/session lifecycle issues
|
||||
---
|
||||
# Logging (macOS)
|
||||
|
||||
## Rolling diagnostics file log (Debug pane)
|
||||
Clawdbot routes macOS app logs through swift-log (unified logging by default) and can write a local, rotating file log to disk when you need a durable capture.
|
||||
|
||||
- Verbosity: **Debug pane → Logs → App logging → Verbosity**
|
||||
- Enable: **Debug pane → Logs → App logging → “Write rolling diagnostics log (JSONL)”**
|
||||
- Location: `~/Library/Logs/Clawdbot/diagnostics.jsonl` (rotates automatically; old files are suffixed with `.1`, `.2`, …)
|
||||
- Clear: **Debug pane → Logs → App logging → “Clear”**
|
||||
|
||||
Notes:
|
||||
- This is **off by default**. Enable only while actively debugging.
|
||||
- Treat the file as sensitive; don’t share it without review.
|
||||
|
||||
## Unified logging private data on macOS
|
||||
|
||||
Unified logging redacts most payloads unless a subsystem opts into `privacy -off`. Per Peter's write-up on macOS [logging privacy shenanigans](https://steipete.me/posts/2025/logging-privacy-shenanigans) (2025) this is controlled by a plist in `/Library/Preferences/Logging/Subsystems/` keyed by the subsystem name. Only new log entries pick up the flag, so enable it before reproducing an issue.
|
||||
|
||||
## Enable for Clawdbot (`com.clawdbot`)
|
||||
- Write the plist to a temp file first, then install it atomically as root:
|
||||
|
||||
```bash
|
||||
cat <<'EOF' >/tmp/com.clawdbot.plist
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>DEFAULT-OPTIONS</key>
|
||||
<dict>
|
||||
<key>Enable-Private-Data</key>
|
||||
<true/>
|
||||
</dict>
|
||||
</dict>
|
||||
</plist>
|
||||
EOF
|
||||
sudo install -m 644 -o root -g wheel /tmp/com.clawdbot.plist /Library/Preferences/Logging/Subsystems/com.clawdbot.plist
|
||||
```
|
||||
|
||||
- No reboot is required; logd notices the file quickly, but only new log lines will include private payloads.
|
||||
- View the richer output with the existing helper, e.g. `./scripts/clawlog.sh --category WebChat --last 5m`.
|
||||
|
||||
## Disable after debugging
|
||||
- Remove the override: `sudo rm /Library/Preferences/Logging/Subsystems/com.clawdbot.plist`.
|
||||
- Optionally run `sudo log config --reload` to force logd to drop the override immediately.
|
||||
- Remember this surface can include phone numbers and message bodies; keep the plist in place only while you actively need the extra detail.
|
||||
69
docs/platforms/mac/menu-bar.md
Normal file
69
docs/platforms/mac/menu-bar.md
Normal file
@@ -0,0 +1,69 @@
|
||||
---
|
||||
summary: "Menu bar status logic and what is surfaced to users"
|
||||
read_when:
|
||||
- Tweaking mac menu UI or status logic
|
||||
---
|
||||
# Menu Bar Status Logic
|
||||
|
||||
## What is shown
|
||||
- We surface the current agent work state in the menu bar icon and in the first status row of the menu.
|
||||
- Health status is hidden while work is active; it returns when all sessions are idle.
|
||||
- The “Nodes” block in the menu lists **devices** only (gateway bridge nodes via `node.list`), not client/presence entries.
|
||||
|
||||
## State model
|
||||
- Sessions: events arrive with `runId` (per-run) plus `sessionKey` in the payload. The “main” session is the key `main`; if absent, we fall back to the most recently updated session.
|
||||
- Priority: main always wins. If main is active, its state is shown immediately. If main is idle, the most recently active non‑main session is shown. We do not flip‑flop mid‑activity; we only switch when the current session goes idle or main becomes active.
|
||||
- Activity kinds:
|
||||
- `job`: high‑level command execution (`state: started|streaming|done|error`).
|
||||
- `tool`: `phase: start|result` with `toolName` and `meta/args`.
|
||||
|
||||
## IconState enum (Swift)
|
||||
- `idle`
|
||||
- `workingMain(ActivityKind)`
|
||||
- `workingOther(ActivityKind)`
|
||||
- `overridden(ActivityKind)` (debug override)
|
||||
|
||||
### ActivityKind → glyph
|
||||
- `bash` → 💻
|
||||
- `read` → 📄
|
||||
- `write` → ✍️
|
||||
- `edit` → 📝
|
||||
- `attach` → 📎
|
||||
- default → 🛠️
|
||||
|
||||
### Visual mapping
|
||||
- `idle`: normal critter.
|
||||
- `workingMain`: badge with glyph, full tint, leg “working” animation.
|
||||
- `workingOther`: badge with glyph, muted tint, no scurry.
|
||||
- `overridden`: uses the chosen glyph/tint regardless of activity.
|
||||
|
||||
## Status row text (menu)
|
||||
- While work is active: `<Session role> · <activity label>`
|
||||
- Examples: `Main · bash: pnpm test`, `Other · read: apps/macos/Sources/Clawdbot/AppState.swift`.
|
||||
- When idle: falls back to the health summary.
|
||||
|
||||
## Event ingestion
|
||||
- Source: control‑channel `agent` events (`ControlChannel.handleAgentEvent`).
|
||||
- Parsed fields:
|
||||
- `stream: "job"` with `data.state` for start/stop.
|
||||
- `stream: "tool"` with `data.phase`, `name`, optional `meta`/`args`.
|
||||
- Labels:
|
||||
- `bash`: first line of `args.command`.
|
||||
- `read`/`write`: shortened path.
|
||||
- `edit`: path plus inferred change kind from `meta`/diff counts.
|
||||
- fallback: tool name.
|
||||
|
||||
## Debug override
|
||||
- Settings ▸ Debug ▸ “Icon override” picker:
|
||||
- `System (auto)` (default)
|
||||
- `Working: main` (per tool kind)
|
||||
- `Working: other` (per tool kind)
|
||||
- `Idle`
|
||||
- Stored via `@AppStorage("iconOverride")`; mapped to `IconState.overridden`.
|
||||
|
||||
## Testing checklist
|
||||
- Trigger main session job: verify icon switches immediately and status row shows main label.
|
||||
- Trigger non‑main session job while main idle: icon/status shows non‑main; stays stable until it finishes.
|
||||
- Start main while other active: icon flips to main instantly.
|
||||
- Rapid tool bursts: ensure badge does not flicker (TTL grace on tool results).
|
||||
- Health row reappears once all sessions idle.
|
||||
170
docs/platforms/mac/peekaboo.md
Normal file
170
docs/platforms/mac/peekaboo.md
Normal file
@@ -0,0 +1,170 @@
|
||||
---
|
||||
summary: "Plan for integrating Peekaboo automation into Clawdbot via PeekabooBridge (socket-based TCC broker)"
|
||||
read_when:
|
||||
- Hosting PeekabooBridge in Clawdbot.app
|
||||
- Integrating Peekaboo as a submodule
|
||||
- Changing PeekabooBridge protocol/paths
|
||||
---
|
||||
# Peekaboo Bridge in Clawdbot (macOS UI automation broker)
|
||||
|
||||
## TL;DR
|
||||
- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`).
|
||||
- Clawdbot integrates by **optionally hosting the same bridge** inside **Clawdbot.app** (user-toggleable). The primary client is the **`peekaboo` CLI** (installed via npm); Clawdbot does not need its own `ui …` CLI surface.
|
||||
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdbot stays a thin broker host. No visualizer toggle in Clawdbot.
|
||||
|
||||
Non-goals:
|
||||
- No auto-launching Peekaboo.app.
|
||||
- No onboarding deep links from the automation endpoint (Clawdbot onboarding already handles permissions).
|
||||
- No AI provider/agent runtime dependencies in Clawdbot (avoid pulling Tachikoma/MCP into the Clawdbot app/CLI).
|
||||
|
||||
## Big refactor (Dec 2025): XPC → Bridge
|
||||
Peekaboo’s privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdbot this is a win:
|
||||
- It matches the existing “local socket + codesign checks” approach.
|
||||
- It lets us piggyback on **either** Peekaboo.app’s permissions **or** Clawdbot.app’s permissions (whichever is running).
|
||||
- It avoids “two apps with two TCC bubbles” unless needed.
|
||||
|
||||
Reference (Peekaboo submodule): `Peekaboo/docs/bridge-host.md`.
|
||||
|
||||
## Architecture
|
||||
### Processes
|
||||
- **Bridge hosts** (provide TCC-backed automation):
|
||||
- **Peekaboo.app** (preferred; also provides visualizations + controls)
|
||||
- **Claude.app** (secondary; lets `peekaboo` reuse Claude Desktop’s granted permissions)
|
||||
- **Clawdbot.app** (secondary; “thin host” only)
|
||||
- **Bridge clients** (trigger single actions):
|
||||
- `peekaboo …` (preferred; humans + agents)
|
||||
- Optional: Clawdbot/Node shells out to `peekaboo` when it needs UI automation/capture
|
||||
|
||||
### Host discovery (client-side)
|
||||
Order is deliberate:
|
||||
1. Peekaboo.app host (full UX)
|
||||
2. Claude.app host (piggyback on Claude Desktop permissions)
|
||||
3. Clawdbot.app host (piggyback on Clawdbot permissions)
|
||||
|
||||
Socket paths (convention; exact paths must match Peekaboo):
|
||||
- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
|
||||
- Claude: `~/Library/Application Support/Claude/bridge.sock`
|
||||
- Clawdbot: `~/Library/Application Support/clawdbot/bridge.sock`
|
||||
|
||||
No auto-launch: if a host isn’t reachable, the command fails with a clear error (start Peekaboo.app, Claude.app, or Clawdbot.app).
|
||||
|
||||
Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`.
|
||||
|
||||
### Protocol shape
|
||||
- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close.
|
||||
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
|
||||
- **Errors**: human-readable string by default; structured envelope in `--json`.
|
||||
|
||||
## Dependency strategy (submodule)
|
||||
Integrate Peekaboo via git submodule (nested submodules are OK).
|
||||
|
||||
Path in Clawdbot repo:
|
||||
- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps don’t churn).
|
||||
|
||||
What Clawdbot should use:
|
||||
- **Client side**: `PeekabooBridge` (socket client + protocol models).
|
||||
- **Host side (Clawdbot.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations.
|
||||
|
||||
What Clawdbot should *not* embed:
|
||||
- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
|
||||
- **XPC**: don’t reintroduce helper targets; use the bridge.
|
||||
|
||||
## IPC / CLI surface
|
||||
### No `clawdbot ui …`
|
||||
We avoid a parallel “Clawdbot UI automation CLI”. Instead:
|
||||
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
|
||||
- Clawdbot.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdbot permissions when Peekaboo.app isn’t running.
|
||||
|
||||
### Diagnostics
|
||||
Use Peekaboo’s built-in diagnostics to see which host would be used:
|
||||
- `peekaboo bridge status`
|
||||
- `peekaboo bridge status --verbose`
|
||||
- `peekaboo bridge status --json`
|
||||
|
||||
### Output format
|
||||
Peekaboo commands default to human text output. Add `--json` for a structured envelope.
|
||||
|
||||
### Timeouts
|
||||
Default timeout for UI actions: **10 seconds** end-to-end (client enforced; host should also enforce per-operation).
|
||||
|
||||
## Coordinate model (multi-display)
|
||||
Requirement: coordinates are **per screen**, not global.
|
||||
|
||||
Standardize for the CLI (agent-friendly): **top-left origin per screen**.
|
||||
|
||||
Proposed request shape:
|
||||
- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
|
||||
- Clawdbot.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
|
||||
- Responses should echo both:
|
||||
- The resolved `screenIndex`
|
||||
- The local `{x, y}` and bounds
|
||||
- Optionally the global `{x, y}` for debugging
|
||||
|
||||
Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
|
||||
|
||||
## Targeting (per app/window)
|
||||
Expose window/app targeting in the UI surface (align with Peekaboo targeting):
|
||||
- frontmost
|
||||
- by app name / bundle id
|
||||
- by window title substring
|
||||
- by (app, index)
|
||||
|
||||
Peekaboo CLI targeting (agent-friendly):
|
||||
- `--bundle-id <id>` for app targeting
|
||||
- `--window-index <n>` (0-based) for disambiguating within an app when capturing
|
||||
|
||||
All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
|
||||
|
||||
## “See” + click packs (Playwright-style)
|
||||
Behavior stays aligned with Peekaboo:
|
||||
- `peekaboo see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
|
||||
- Follow-up actions reference those IDs without re-scanning.
|
||||
|
||||
`peekaboo see` should:
|
||||
- capture (optionally targeted) window/screen
|
||||
- return a screenshot **file path** (default: temp directory)
|
||||
- return a list of elements (text or JSON)
|
||||
|
||||
Snapshot lifecycle requirement:
|
||||
- Host apps are long-lived, so snapshot state should be **in-memory by default**.
|
||||
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
|
||||
|
||||
Practical flow (agent-friendly):
|
||||
- `peekaboo list apps` / `peekaboo list windows` provide bundle-id context for targeting.
|
||||
- `peekaboo see --bundle-id X` updates the implicit snapshot for `X`.
|
||||
- `peekaboo click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.
|
||||
|
||||
## Visualizer integration
|
||||
Keep visualizations in **Peekaboo.app** for now.
|
||||
- Clawdbot hosts the bridge, but does not render overlays.
|
||||
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.
|
||||
|
||||
## Screenshots (legacy → Peekaboo takeover)
|
||||
Clawdbot should not grow a separate screenshot CLI surface.
|
||||
|
||||
Migration plan:
|
||||
- Use `peekaboo capture …` / `peekaboo see …` (returns a file path, default temp directory).
|
||||
- Once Clawdbot’ legacy screenshot plumbing is replaced, remove it cleanly (no aliases).
|
||||
|
||||
## Permissions behavior
|
||||
If required permissions are missing:
|
||||
- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
|
||||
- do not try to open System Settings from the automation endpoint
|
||||
|
||||
## Security (socket auth)
|
||||
Both hosts must enforce:
|
||||
- filesystem perms on the socket path (owner read/write only)
|
||||
- server-side caller validation:
|
||||
- require the caller’s code signature TeamID to be `Y5PE65HELJ`
|
||||
- optional bundle-id allowlist for tighter scoping
|
||||
|
||||
Debug-only escape hatch (development convenience):
|
||||
- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
|
||||
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
|
||||
|
||||
## Next integration steps (after this doc)
|
||||
1. Add Peekaboo as a git submodule (nested submodules OK).
|
||||
2. Host `PeekabooBridgeHost` inside Clawdbot.app behind a single setting (“Enable Peekaboo Bridge”, default on).
|
||||
3. Ensure Clawdbot hosts the bridge at `~/Library/Application Support/clawdbot/bridge.sock` and speaks the PeekabooBridge JSON protocol.
|
||||
4. Validate with `peekaboo bridge status --verbose` that Peekaboo can select Clawdbot as the fallback host (no auto-launch).
|
||||
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).
|
||||
40
docs/platforms/mac/permissions.md
Normal file
40
docs/platforms/mac/permissions.md
Normal file
@@ -0,0 +1,40 @@
|
||||
---
|
||||
summary: "macOS permission persistence (TCC) and signing requirements"
|
||||
read_when:
|
||||
- Debugging missing or stuck macOS permission prompts
|
||||
- Packaging or signing the macOS app
|
||||
- Changing bundle IDs or app install paths
|
||||
---
|
||||
# macOS permissions (TCC)
|
||||
|
||||
macOS permission grants are fragile. TCC associates a permission grant with the
|
||||
app's code signature, bundle identifier, and on-disk path. If any of those change,
|
||||
macOS treats the app as new and may drop or hide prompts.
|
||||
|
||||
## Requirements for stable permissions
|
||||
- Same path: run the app from a fixed location (for Clawdbot, `dist/Clawdbot.app`).
|
||||
- Same bundle identifier: changing the bundle ID creates a new permission identity.
|
||||
- Signed app: unsigned or ad-hoc signed builds do not persist permissions.
|
||||
- Consistent signature: use a real Apple Development or Developer ID certificate
|
||||
so the signature stays stable across rebuilds.
|
||||
|
||||
Ad-hoc signatures generate a new identity every build. macOS will forget previous
|
||||
grants, and prompts can disappear entirely until the stale entries are cleared.
|
||||
|
||||
## Recovery checklist when prompts disappear
|
||||
1. Quit the app.
|
||||
2. Remove the app entry in System Settings -> Privacy & Security.
|
||||
3. Relaunch the app from the same path and re-grant permissions.
|
||||
4. If the prompt still does not appear, reset TCC entries with `tccutil` and try again.
|
||||
5. Some permissions only reappear after a full macOS restart.
|
||||
|
||||
Example resets (replace bundle ID as needed):
|
||||
|
||||
```bash
|
||||
sudo tccutil reset Accessibility com.clawdbot.mac
|
||||
sudo tccutil reset ScreenCapture com.clawdbot.mac
|
||||
sudo tccutil reset AppleEvents
|
||||
```
|
||||
|
||||
If you are testing permissions, always sign with a real certificate. Ad-hoc
|
||||
builds are only acceptable for quick local runs where permissions do not matter.
|
||||
76
docs/platforms/mac/release.md
Normal file
76
docs/platforms/mac/release.md
Normal file
@@ -0,0 +1,76 @@
|
||||
---
|
||||
summary: "Clawdbot macOS release checklist (Sparkle feed, packaging, signing)"
|
||||
read_when:
|
||||
- Cutting or validating a Clawdbot macOS release
|
||||
- Updating the Sparkle appcast or feed assets
|
||||
---
|
||||
|
||||
# Clawdbot macOS release (Sparkle)
|
||||
|
||||
This app now ships Sparkle auto-updates. Release builds must be Developer ID–signed, zipped, and published with a signed appcast entry.
|
||||
|
||||
## Prereqs
|
||||
- Developer ID Application cert installed (`Developer ID Application: Peter Steinberger (Y5PE65HELJ)` is expected).
|
||||
- Sparkle private key path set in the environment as `SPARKLE_PRIVATE_KEY_FILE`; key lives in `/Users/steipete/Library/CloudStorage/Dropbox/Backup/Sparkle` (same key as Trimmy; public key baked into Info.plist).
|
||||
- Notary credentials (keychain profile or API key) for `xcrun notarytool` if you want Gatekeeper-safe DMG/zip distribution.
|
||||
- We use a Keychain profile named `clawdbot-notary`, created from App Store Connect API key env vars in your shell profile:
|
||||
- `APP_STORE_CONNECT_API_KEY_P8`, `APP_STORE_CONNECT_KEY_ID`, `APP_STORE_CONNECT_ISSUER_ID`
|
||||
- `echo "$APP_STORE_CONNECT_API_KEY_P8" | sed 's/\\n/\n/g' > /tmp/clawdbot-notary.p8`
|
||||
- `xcrun notarytool store-credentials "clawdbot-notary" --key /tmp/clawdbot-notary.p8 --key-id "$APP_STORE_CONNECT_KEY_ID" --issuer "$APP_STORE_CONNECT_ISSUER_ID"`
|
||||
- `pnpm` deps installed (`pnpm install --config.node-linker=hoisted`).
|
||||
- Sparkle tools are fetched automatically via SwiftPM at `apps/macos/.build/artifacts/sparkle/Sparkle/bin/` (`sign_update`, `generate_appcast`, etc.).
|
||||
|
||||
## Build & package
|
||||
Notes:
|
||||
- `APP_BUILD` maps to `CFBundleVersion`/`sparkle:version`; keep it numeric + monotonic (no `-beta`), or Sparkle compares it as equal.
|
||||
- Defaults to the current architecture (`$(uname -m)`). For release/universal builds, set `BUILD_ARCHS="arm64 x86_64"` (or `BUILD_ARCHS=all`).
|
||||
|
||||
```bash
|
||||
# From repo root; set release IDs so Sparkle feed is enabled.
|
||||
# APP_BUILD must be numeric + monotonic for Sparkle compare.
|
||||
BUNDLE_ID=com.clawdbot.mac \
|
||||
APP_VERSION=0.1.0 \
|
||||
APP_BUILD="$(git rev-list --count HEAD)" \
|
||||
BUILD_CONFIG=release \
|
||||
SIGN_IDENTITY="Developer ID Application: Peter Steinberger (Y5PE65HELJ)" \
|
||||
scripts/package-mac-app.sh
|
||||
|
||||
# Zip for distribution (includes resource forks for Sparkle delta support)
|
||||
ditto -c -k --sequesterRsrc --keepParent dist/Clawdbot.app dist/Clawdbot-0.1.0.zip
|
||||
|
||||
# Optional: also build a styled DMG for humans (drag to /Applications)
|
||||
scripts/create-dmg.sh dist/Clawdbot.app dist/Clawdbot-0.1.0.dmg
|
||||
|
||||
# Recommended: build + notarize/staple zip + DMG
|
||||
# First, create a keychain profile once:
|
||||
# xcrun notarytool store-credentials "clawdbot-notary" \
|
||||
# --apple-id "<apple-id>" --team-id "<team-id>" --password "<app-specific-password>"
|
||||
NOTARIZE=1 NOTARYTOOL_PROFILE=clawdbot-notary \
|
||||
BUNDLE_ID=com.clawdbot.mac \
|
||||
APP_VERSION=0.1.0 \
|
||||
APP_BUILD="$(git rev-list --count HEAD)" \
|
||||
BUILD_CONFIG=release \
|
||||
SIGN_IDENTITY="Developer ID Application: Peter Steinberger (Y5PE65HELJ)" \
|
||||
scripts/package-mac-dist.sh
|
||||
|
||||
# Optional: ship dSYM alongside the release
|
||||
ditto -c -k --keepParent apps/macos/.build/release/Clawdbot.app.dSYM dist/Clawdbot-0.1.0.dSYM.zip
|
||||
```
|
||||
|
||||
## Appcast entry
|
||||
Use the release note generator so Sparkle renders formatted HTML notes:
|
||||
```bash
|
||||
SPARKLE_PRIVATE_KEY_FILE=/Users/steipete/Library/CloudStorage/Dropbox/Backup/Sparkle/ed25519-private-key scripts/make_appcast.sh dist/Clawdbot-0.1.0.zip https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml
|
||||
```
|
||||
Generates HTML release notes from `CHANGELOG.md` (via [`scripts/changelog-to-html.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/changelog-to-html.sh)) and embeds them in the appcast entry.
|
||||
Commit the updated `appcast.xml` alongside the release assets (zip + dSYM) when publishing.
|
||||
|
||||
## Publish & verify
|
||||
- Upload `Clawdbot-0.1.0.zip` (and `Clawdbot-0.1.0.dSYM.zip`) to the GitHub release for tag `v0.1.0`.
|
||||
- Ensure the raw appcast URL matches the baked feed: `https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml`.
|
||||
- Sanity checks:
|
||||
- `curl -I https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml` returns 200.
|
||||
- `curl -I <enclosure url>` returns 200 after assets upload.
|
||||
- On a previous public build, run “Check for Updates…” from the About tab and verify Sparkle installs the new build cleanly.
|
||||
|
||||
Definition of done: signed app + appcast are published, update flow works from an older installed version, and release assets are attached to the GitHub release.
|
||||
57
docs/platforms/mac/remote.md
Normal file
57
docs/platforms/mac/remote.md
Normal file
@@ -0,0 +1,57 @@
|
||||
---
|
||||
summary: "macOS app flow for controlling a remote Clawdbot gateway over SSH"
|
||||
read_when:
|
||||
- Setting up or debugging remote mac control
|
||||
---
|
||||
# Remote Clawdbot (macOS ⇄ remote host)
|
||||
|
||||
Updated: 2025-12-08
|
||||
|
||||
This flow lets the macOS app act as a full remote control for a Clawdbot gateway running on another host (e.g. a Mac Studio). All features—health checks, Voice Wake forwarding, and Web Chat—reuse the same remote SSH configuration from *Settings → General*.
|
||||
|
||||
## Modes
|
||||
- **Local (this Mac)**: Everything runs on the laptop. No SSH involved.
|
||||
- **Remote over SSH**: Clawdbot commands are executed on the remote host. The mac app opens an SSH connection with `-o BatchMode` plus your chosen identity/key.
|
||||
|
||||
## Prereqs on the remote host
|
||||
1) Install Node + pnpm and build/install the Clawdbot CLI (`pnpm install && pnpm build && pnpm link --global`).
|
||||
2) Ensure `clawdbot` is on PATH for non-interactive shells (symlink into `/usr/local/bin` or `/opt/homebrew/bin` if needed).
|
||||
3) Open SSH with key auth. We recommend **Tailscale** IPs for stable reachability off-LAN.
|
||||
|
||||
## macOS app setup
|
||||
1) Open *Settings → General*.
|
||||
2) Under **Clawdbot runs**, pick **Remote over SSH** and set:
|
||||
- **SSH target**: `user@host` (optional `:port`).
|
||||
- If the gateway is on the same LAN and advertises Bonjour, pick it from the discovered list to auto-fill this field.
|
||||
- **Identity file** (advanced): path to your key.
|
||||
- **Project root** (advanced): remote checkout path used for commands.
|
||||
- **CLI path** (advanced): optional path to a runnable `clawdbot` entrypoint/binary (auto-filled when advertised).
|
||||
3) Hit **Test remote**. Success indicates the remote `clawdbot status --json` runs correctly. Failures usually mean PATH/CLI issues; exit 127 means the CLI isn’t found remotely.
|
||||
4) Health checks and Web Chat will now run through this SSH tunnel automatically.
|
||||
|
||||
## Web Chat over SSH
|
||||
- Web Chat connects to the gateway over the forwarded WebSocket control port (default 18789).
|
||||
- There is no separate WebChat HTTP server anymore.
|
||||
|
||||
## Permissions
|
||||
- The remote host needs the same TCC approvals as local (Automation, Accessibility, Screen Recording, Microphone, Speech Recognition, Notifications). Run onboarding on that machine to grant them once.
|
||||
- Nodes advertise their permission state via `node.list` / `node.describe` so agents know what’s available.
|
||||
|
||||
## WhatsApp login flow (remote)
|
||||
- Run `clawdbot login --verbose` **on the remote host**. Scan the QR with WhatsApp on your phone.
|
||||
- Re-run login on that host if auth expires. Health check will surface link problems.
|
||||
|
||||
## Troubleshooting
|
||||
- **exit 127 / not found**: `clawdbot` isn’t on PATH for non-login shells. Add it to `/etc/paths`, your shell rc, or symlink into `/usr/local/bin`/`/opt/homebrew/bin`.
|
||||
- **Health probe failed**: check SSH reachability, PATH, and that Baileys is logged in (`clawdbot status --json`).
|
||||
- **Web Chat stuck**: confirm the gateway is running on the remote host and the forwarded port matches the gateway WS port; the UI requires a healthy WS connection.
|
||||
- **Voice Wake**: trigger phrases are forwarded automatically in remote mode; no separate forwarder is needed.
|
||||
|
||||
## Notification sounds
|
||||
Pick sounds per notification from scripts with `clawdbot` and `node.invoke`, e.g.:
|
||||
|
||||
```bash
|
||||
clawdbot nodes notify --node <id> --title "Ping" --body "Remote gateway ready" --sound Glass
|
||||
```
|
||||
|
||||
There is no global “default sound” toggle in the app anymore; callers choose a sound (or none) per request.
|
||||
41
docs/platforms/mac/signing.md
Normal file
41
docs/platforms/mac/signing.md
Normal file
@@ -0,0 +1,41 @@
|
||||
---
|
||||
summary: "Signing steps for macOS debug builds generated by packaging scripts"
|
||||
read_when:
|
||||
- Building or signing mac debug builds
|
||||
---
|
||||
# mac signing (debug builds)
|
||||
|
||||
This app is usually built from [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh), which now:
|
||||
|
||||
- sets a stable debug bundle identifier: `com.clawdbot.mac.debug`
|
||||
- writes the Info.plist with that bundle id (override via `BUNDLE_ID=...`)
|
||||
- calls [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) to sign the main binary, bundled CLI, and app bundle so macOS treats each rebuild as the same signed bundle and keeps TCC permissions (notifications, accessibility, screen recording, mic, speech). For stable permissions, use a real signing identity; ad-hoc is opt-in and fragile (see [`docs/mac/permissions.md`](/mac/permissions)).
|
||||
- uses `CODESIGN_TIMESTAMP=auto` by default; it enables trusted timestamps for Developer ID signatures. Set `CODESIGN_TIMESTAMP=off` to skip timestamping (offline debug builds).
|
||||
- inject build metadata into Info.plist: `ClawdbotBuildTimestamp` (UTC) and `ClawdbotGitCommit` (short hash) so the About pane can show build, git, and debug/release channel.
|
||||
- **Packaging requires Bun**: The embedded gateway relay is compiled using `bun`. Ensure it is installed (`curl -fsSL https://bun.sh/install | bash`).
|
||||
- reads `SIGN_IDENTITY` from the environment. Add `export SIGN_IDENTITY="Apple Development: Your Name (TEAMID)"` (or your Developer ID Application cert) to your shell rc to always sign with your cert. Ad-hoc signing requires explicit opt-in via `ALLOW_ADHOC_SIGNING=1` or `SIGN_IDENTITY="-"` (not recommended for permission testing).
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
# from repo root
|
||||
scripts/package-mac-app.sh # auto-selects identity; errors if none found
|
||||
SIGN_IDENTITY="Developer ID Application: Your Name" scripts/package-mac-app.sh # real cert
|
||||
ALLOW_ADHOC_SIGNING=1 scripts/package-mac-app.sh # ad-hoc (permissions will not stick)
|
||||
SIGN_IDENTITY="-" scripts/package-mac-app.sh # explicit ad-hoc (same caveat)
|
||||
```
|
||||
|
||||
### Ad-hoc Signing Note
|
||||
When signing with `SIGN_IDENTITY="-"` (ad-hoc), the script automatically disables the **Hardened Runtime** (`--options runtime`). This is necessary to prevent crashes when the app attempts to load embedded frameworks (like Sparkle) that do not share the same Team ID. Ad-hoc signatures also break TCC permission persistence; see [`docs/mac/permissions.md`](/mac/permissions) for recovery steps.
|
||||
|
||||
## Build metadata for About
|
||||
|
||||
`package-mac-app.sh` stamps the bundle with:
|
||||
- `ClawdbotBuildTimestamp`: ISO8601 UTC at package time
|
||||
- `ClawdbotGitCommit`: short git hash (or `unknown` if unavailable)
|
||||
|
||||
The About tab reads these keys to show version, build date, git commit, and whether it’s a debug build (via `#if DEBUG`). Run the packager to refresh these values after code changes.
|
||||
|
||||
## Why
|
||||
|
||||
TCC permissions are tied to the bundle identifier *and* code signature. Unsigned debug builds with changing UUIDs were causing macOS to forget grants after each rebuild. Signing the binaries (ad‑hoc by default) and keeping a fixed bundle id/path (`dist/Clawdbot.app`) preserves the grants between builds, matching the VibeTunnel approach.
|
||||
27
docs/platforms/mac/skills.md
Normal file
27
docs/platforms/mac/skills.md
Normal file
@@ -0,0 +1,27 @@
|
||||
---
|
||||
summary: "macOS Skills settings UI and gateway-backed status"
|
||||
read_when:
|
||||
- Updating the macOS Skills settings UI
|
||||
- Changing skills gating or install behavior
|
||||
---
|
||||
# Skills (macOS)
|
||||
|
||||
The macOS app surfaces Clawdbot skills via the gateway; it does not parse skills locally.
|
||||
|
||||
## Data source
|
||||
- `skills.status` (gateway) returns all skills plus eligibility and missing requirements
|
||||
(including allowlist blocks for bundled skills).
|
||||
- Requirements are derived from `metadata.clawdbot.requires` in each `SKILL.md`.
|
||||
|
||||
## Install actions
|
||||
- `metadata.clawdbot.install` defines install options (brew/node/go/uv).
|
||||
- The app calls `skills.install` to run installers on the gateway host.
|
||||
- The gateway surfaces only one preferred installer when multiple are provided
|
||||
(brew when available, otherwise node manager from `skills.install`, default npm).
|
||||
|
||||
## Env/API keys
|
||||
- The app stores keys in `~/.clawdbot/clawdbot.json` under `skills.entries.<skillKey>`.
|
||||
- `skills.update` patches `enabled`, `apiKey`, and `env`.
|
||||
|
||||
## Remote mode
|
||||
- Install + config updates happen on the gateway host (not the local Mac).
|
||||
52
docs/platforms/mac/voice-overlay.md
Normal file
52
docs/platforms/mac/voice-overlay.md
Normal file
@@ -0,0 +1,52 @@
|
||||
---
|
||||
summary: "Voice overlay lifecycle when wake-word and push-to-talk overlap"
|
||||
read_when:
|
||||
- Adjusting voice overlay behavior
|
||||
---
|
||||
## Voice Overlay Lifecycle (macOS)
|
||||
|
||||
Audience: macOS app contributors. Goal: keep the voice overlay predictable when wake-word and push-to-talk overlap.
|
||||
|
||||
### Current intent
|
||||
- If the overlay is already visible from wake-word and the user presses the hotkey, the hotkey session *adopts* the existing text instead of resetting it. The overlay stays up while the hotkey is held. When the user releases: send if there is trimmed text, otherwise dismiss.
|
||||
- Wake-word alone still auto-sends on silence; push-to-talk sends immediately on release.
|
||||
|
||||
### Implemented (Dec 9, 2025)
|
||||
- Overlay sessions now carry a token per capture (wake-word or push-to-talk). Partial/final/send/dismiss/level updates are dropped when the token doesn’t match, avoiding stale callbacks.
|
||||
- Push-to-talk adopts any visible overlay text as a prefix (so pressing the hotkey while the wake overlay is up keeps the text and appends new speech). It waits up to 1.5s for a final transcript before falling back to the current text.
|
||||
- Chime/overlay logging is emitted at `info` in categories `voicewake.overlay`, `voicewake.ptt`, and `voicewake.chime` (session start, partial, final, send, dismiss, chime reason).
|
||||
|
||||
### Next steps
|
||||
1. **VoiceSessionCoordinator (actor)**
|
||||
- Owns exactly one `VoiceSession` at a time.
|
||||
- API (token-based): `beginWakeCapture`, `beginPushToTalk`, `updatePartial`, `endCapture`, `cancel`, `applyCooldown`.
|
||||
- Drops callbacks that carry stale tokens (prevents old recognizers from reopening the overlay).
|
||||
2. **VoiceSession (model)**
|
||||
- Fields: `token`, `source` (wakeWord|pushToTalk), committed/volatile text, chime flags, timers (auto-send, idle), `overlayMode` (display|editing|sending), cooldown deadline.
|
||||
3. **Overlay binding**
|
||||
- `VoiceSessionPublisher` (`ObservableObject`) mirrors the active session into SwiftUI.
|
||||
- `VoiceWakeOverlayView` renders only via the publisher; it never mutates global singletons directly.
|
||||
- Overlay user actions (`sendNow`, `dismiss`, `edit`) call back into the coordinator with the session token.
|
||||
4. **Unified send path**
|
||||
- On `endCapture`: if trimmed text is empty → dismiss; else `performSend(session:)` (plays send chime once, forwards, dismisses).
|
||||
- Push-to-talk: no delay; wake-word: optional delay for auto-send.
|
||||
- Apply a short cooldown to the wake runtime after push-to-talk finishes so wake-word doesn’t immediately retrigger.
|
||||
5. **Logging**
|
||||
- Coordinator emits `.info` logs in subsystem `com.clawdbot`, categories `voicewake.overlay` and `voicewake.chime`.
|
||||
- Key events: `session_started`, `adopted_by_push_to_talk`, `partial`, `finalized`, `send`, `dismiss`, `cancel`, `cooldown`.
|
||||
|
||||
### Debugging checklist
|
||||
- Stream logs while reproducing a sticky overlay:
|
||||
|
||||
```bash
|
||||
sudo log stream --predicate 'subsystem == "com.clawdbot" AND category CONTAINS "voicewake"' --level info --style compact
|
||||
```
|
||||
- Verify only one active session token; stale callbacks should be dropped by the coordinator.
|
||||
- Ensure push-to-talk release always calls `endCapture` with the active token; if text is empty, expect `dismiss` without chime or send.
|
||||
|
||||
### Migration steps (suggested)
|
||||
1. Add `VoiceSessionCoordinator`, `VoiceSession`, and `VoiceSessionPublisher`.
|
||||
2. Refactor `VoiceWakeRuntime` to create/update/end sessions instead of touching `VoiceWakeOverlayController` directly.
|
||||
3. Refactor `VoicePushToTalk` to adopt existing sessions and call `endCapture` on release; apply runtime cooldown.
|
||||
4. Wire `VoiceWakeOverlayController` to the publisher; remove direct calls from runtime/PTT.
|
||||
5. Add integration tests for session adoption, cooldown, and empty-text dismissal.
|
||||
56
docs/platforms/mac/voicewake.md
Normal file
56
docs/platforms/mac/voicewake.md
Normal file
@@ -0,0 +1,56 @@
|
||||
---
|
||||
summary: "Voice wake and push-to-talk modes plus routing details in the mac app"
|
||||
read_when:
|
||||
- Working on voice wake or PTT pathways
|
||||
---
|
||||
# Voice Wake & Push-to-Talk
|
||||
|
||||
Updated: 2025-12-23 · Owners: mac app
|
||||
|
||||
## Modes
|
||||
- **Wake-word mode** (default): always-on Speech recognizer waits for trigger tokens (`swabbleTriggerWords`). On match it starts capture, shows the overlay with partial text, and auto-sends after silence.
|
||||
- **Push-to-talk (Right Option hold)**: hold the right Option key to capture immediately—no trigger needed. The overlay appears while held; releasing finalizes and forwards after a short delay so you can tweak text.
|
||||
|
||||
## Runtime behavior (wake-word)
|
||||
- Speech recognizer lives in `VoiceWakeRuntime`.
|
||||
- Trigger only fires when there’s a **meaningful pause** between the wake word and the next word (~0.45s gap).
|
||||
- Silence windows: 2.0s when speech is flowing, 5.0s if only the trigger was heard.
|
||||
- Hard stop: 120s to prevent runaway sessions.
|
||||
- Debounce between sessions: 350ms.
|
||||
- Overlay is driven via `VoiceWakeOverlayController` with committed/volatile coloring.
|
||||
- After send, recognizer restarts cleanly to listen for the next trigger.
|
||||
|
||||
## Lifecycle invariants
|
||||
- If Voice Wake is enabled and permissions are granted, the wake-word recognizer should be listening (except during an explicit push-to-talk capture).
|
||||
- Overlay visibility (including manual dismiss via the X button) must never prevent the recognizer from resuming.
|
||||
|
||||
## Sticky overlay failure mode (previous)
|
||||
Previously, if the overlay got stuck visible and you manually closed it, Voice Wake could appear “dead” because the runtime’s restart attempt could be blocked by overlay visibility and no subsequent restart was scheduled.
|
||||
|
||||
Hardening:
|
||||
- Wake runtime restart is no longer blocked by overlay visibility.
|
||||
- Overlay dismiss completion triggers a `VoiceWakeRuntime.refresh(...)` via `VoiceSessionCoordinator`, so manual X-dismiss always resumes listening.
|
||||
|
||||
## Push-to-talk specifics
|
||||
- Hotkey detection uses a global `.flagsChanged` monitor for **right Option** (`keyCode 61` + `.option`). We only observe events (no swallowing).
|
||||
- Capture pipeline lives in `VoicePushToTalk`: starts Speech immediately, streams partials to the overlay, and calls `VoiceWakeForwarder` on release.
|
||||
- When push-to-talk starts we pause the wake-word runtime to avoid dueling audio taps; it restarts automatically after release.
|
||||
- Permissions: requires Microphone + Speech; seeing events needs Accessibility/Input Monitoring approval.
|
||||
- External keyboards: some may not expose right Option as expected—offer a fallback shortcut if users report misses.
|
||||
|
||||
## User-facing settings
|
||||
- **Voice Wake** toggle: enables wake-word runtime.
|
||||
- **Hold Cmd+Fn to talk**: enables the push-to-talk monitor. Disabled on macOS < 26.
|
||||
- Language & mic pickers, live level meter, trigger-word table, tester.
|
||||
- **Sounds**: chimes on trigger detect and on send; defaults to the macOS “Glass” system sound. You can pick any `NSSound`-loadable file (e.g. MP3/WAV/AIFF) for each event or choose **No Sound**.
|
||||
|
||||
## Forwarding behavior
|
||||
- When Voice Wake is enabled, transcripts are forwarded to the active gateway/agent (the same local vs remote mode used by the rest of the mac app).
|
||||
- Replies are delivered to the **last-used main provider** (WhatsApp/Telegram/Discord/WebChat). If delivery fails, the error is logged and the run is still visible via WebChat/session logs.
|
||||
|
||||
## Forwarding payload
|
||||
- `VoiceWakeForwarder.prefixedTranscript(_:)` prepends the machine hint before sending. Shared between wake-word and push-to-talk paths.
|
||||
|
||||
## Quick verification
|
||||
- Toggle push-to-talk on, hold Cmd+Fn, speak, release: overlay should show partials then send.
|
||||
- While holding, menu-bar ears should stay enlarged (uses `triggerVoiceEars(ttl:nil)`); they drop after release.
|
||||
27
docs/platforms/mac/webchat.md
Normal file
27
docs/platforms/mac/webchat.md
Normal file
@@ -0,0 +1,27 @@
|
||||
---
|
||||
summary: "How the mac app embeds the gateway WebChat and how to debug it"
|
||||
read_when:
|
||||
- Debugging mac WebChat view or loopback port
|
||||
---
|
||||
# Web Chat (macOS app)
|
||||
|
||||
The macOS menu bar app shows the WebChat UI as a native SwiftUI view and reuses the **primary Clawd session** (`main`, or `global` when scope is global).
|
||||
|
||||
- **Local mode**: connects directly to the local Gateway WebSocket.
|
||||
- **Remote mode**: forwards the Gateway WebSocket control port over SSH and uses that as the data plane.
|
||||
|
||||
## Launch & debugging
|
||||
- Manual: Lobster menu → “Open Chat”.
|
||||
- Auto-open for testing: run `dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat` (or pass `--webchat` to the binary launched by launchd). The window opens on startup.
|
||||
- Logs: see [`./scripts/clawlog.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/clawlog.sh) (subsystem `com.clawdbot`, category `WebChatSwiftUI`).
|
||||
|
||||
## How it’s wired
|
||||
- Implementation: [`apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift) hosts `ClawdbotChatUI` and speaks to the Gateway over `GatewayConnection`.
|
||||
- Data plane: Gateway WebSocket methods `chat.history`, `chat.send`, `chat.abort`; events `chat`, `agent`, `presence`, `tick`, `health`.
|
||||
- Session: usually primary (`main`); multiple transports (WhatsApp/Telegram/Discord/Desktop) share the same key. The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
|
||||
|
||||
## Security / surface area
|
||||
- Remote mode forwards only the Gateway WebSocket control port over SSH.
|
||||
|
||||
## Known limitations
|
||||
- The UI is optimized for the primary session and typical “chat” usage (not a full browser-based sandbox surface).
|
||||
40
docs/platforms/mac/xpc.md
Normal file
40
docs/platforms/mac/xpc.md
Normal file
@@ -0,0 +1,40 @@
|
||||
---
|
||||
summary: "macOS IPC architecture for Clawdbot app, gateway node bridge, and PeekabooBridge"
|
||||
read_when:
|
||||
- Editing IPC contracts or menu bar app IPC
|
||||
---
|
||||
# Clawdbot macOS IPC architecture (Dec 2025)
|
||||
|
||||
**Current model:** there is **no local control socket** and no `clawdbot-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge.
|
||||
|
||||
## Goals
|
||||
- Single GUI app instance that owns all TCC-facing work (notifications, screen recording, mic, speech, AppleScript).
|
||||
- A small surface for automation: Gateway + node commands, plus PeekabooBridge for UI automation.
|
||||
- Predictable permissions: always the same signed bundle ID, launched by launchd, so TCC grants stick.
|
||||
|
||||
## How it works
|
||||
### Gateway + node bridge (current)
|
||||
- The app runs the Gateway (local mode) and connects to it as a node.
|
||||
- Agent actions are performed via `node.invoke` (e.g. `system.run`, `system.notify`, `canvas.*`).
|
||||
|
||||
### PeekabooBridge (UI automation)
|
||||
- UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol.
|
||||
- Host preference order (client-side): Peekaboo.app → Claude.app → Clawdbot.app → local execution.
|
||||
- Security: bridge hosts require TeamID `Y5PE65HELJ`; DEBUG-only same-UID escape hatch is guarded by `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (Peekaboo convention).
|
||||
- See: [`docs/mac/peekaboo.md`](/mac/peekaboo) for the Clawdbot plan and naming.
|
||||
|
||||
### Mach/XPC (future direction)
|
||||
- Still optional for internal app services, but **not required** for automation now that node.invoke is the surface.
|
||||
|
||||
## Operational flows
|
||||
- Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
|
||||
- Kills existing instances
|
||||
- Swift build + package
|
||||
- Writes/bootstraps/kickstarts the LaunchAgent
|
||||
- Single instance: app exits early if another instance with the same bundle ID is running.
|
||||
|
||||
## Hardening notes
|
||||
- Prefer requiring a TeamID match for all privileged surfaces.
|
||||
- PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
|
||||
- All communication remains local-only; no network sockets are exposed.
|
||||
- TCC prompts originate only from the GUI app bundle; run [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) so the signed bundle ID stays stable.
|
||||
Reference in New Issue
Block a user