docs: reorganize documentation structure

This commit is contained in:
Peter Steinberger
2026-01-07 00:41:31 +01:00
parent b8db8502aa
commit db4d0b8e75
126 changed files with 881 additions and 270 deletions

133
docs/platforms/android.md Normal file
View File

@@ -0,0 +1,133 @@
---
summary: "Android app (node): connection runbook + Canvas/Chat/Camera"
read_when:
- Pairing or reconnecting the Android node
- Debugging Android bridge discovery or auth
- Verifying chat history parity across clients
---
# Android App (Node)
## Connection Runbook
Android node app ⇄ (mDNS/NSD + TCP bridge) ⇄ **Gateway bridge** ⇄ (loopback WS) ⇄ **Gateway**
The Gateway WebSocket stays loopback-only (`ws://127.0.0.1:18789`). Android talks to the LAN-facing **bridge** (default `tcp://0.0.0.0:18790`) and uses Gateway-owned pairing.
### Prerequisites
- You can run the Gateway on the “master” machine.
- Android device/emulator can reach the gateway bridge:
- Same LAN with mDNS/NSD, **or**
- Same Tailscale tailnet using Wide-Area Bonjour / unicast DNS-SD (see below), **or**
- Manual bridge host/port (fallback)
- You can run the CLI (`clawdbot`) on the gateway machine (or via SSH).
### 1) Start the Gateway (with bridge enabled)
Bridge is enabled by default (disable via `CLAWDBOT_BRIDGE_ENABLED=0`).
```bash
clawdbot gateway --port 18789 --verbose
```
Confirm in logs you see something like:
- `bridge listening on tcp://0.0.0.0:18790 (node)`
For tailnet-only setups (recommended for Vienna ⇄ London), bind the bridge to the gateway machines Tailscale IP instead:
- Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json` on the gateway host.
- Restart the Gateway / macOS menubar app.
### 2) Verify discovery (optional)
From the gateway machine:
```bash
dns-sd -B _clawdbot-bridge._tcp local.
```
More debugging notes: [`docs/bonjour.md`](/bonjour).
#### Tailnet (Vienna ⇄ London) discovery via unicast DNS-SD
Android NSD/mDNS discovery wont cross networks. If your Android node and the gateway are on different networks but connected via Tailscale, use Wide-Area Bonjour / unicast DNS-SD instead:
1) Set up a DNS-SD zone (example `clawdbot.internal.`) on the gateway host and publish `_clawdbot-bridge._tcp` records.
2) Configure Tailscale split DNS for `clawdbot.internal` pointing at that DNS server.
Details and example CoreDNS config: [`docs/bonjour.md`](/bonjour).
### 3) Connect from Android
In the Android app:
- The app keeps its bridge connection alive via a **foreground service** (persistent notification).
- Open **Settings**.
- Under **Discovered Bridges**, select your gateway and hit **Connect**.
- If mDNS is blocked, use **Advanced → Manual Bridge** (host + port) and **Connect (Manual)**.
After the first successful pairing, Android auto-reconnects on launch:
- Manual endpoint (if enabled), otherwise
- The last discovered bridge (best-effort).
### 4) Approve pairing (CLI)
On the gateway machine:
```bash
clawdbot nodes pending
clawdbot nodes approve <requestId>
```
Pairing details: [`docs/gateway/pairing.md`](/gateway/pairing).
### 5) Verify the node is connected
- Via nodes status:
```bash
clawdbot nodes status
```
- Via Gateway:
```bash
clawdbot gateway call node.list --params "{}"
```
### 6) Chat + history
The Android nodes Chat sheet uses the gateways **primary session key** (`main`), so history and replies are shared with WebChat and other clients:
- History: `chat.history`
- Send: `chat.send`
- Push updates (best-effort): `chat.subscribe` → `event:"chat"`
### 7) Canvas + camera
#### Gateway Canvas Host (recommended for web content)
If you want the node to show real HTML/CSS/JS that the agent can edit on disk, point the node at the Gateway canvas host.
Note: nodes always use the standalone canvas host on `canvasHost.port` (default `18793`), bound to the bridge interface.
1) Create `~/clawd/canvas/index.html` on the gateway host.
2) Navigate the node to it (LAN):
```bash
clawdbot nodes invoke --node "<Android Node>" --command canvas.navigate --params '{"url":"http://<gateway-hostname>.local:18793/__clawdbot__/canvas/"}'
```
Tailnet (optional): if both devices are on Tailscale, use a MagicDNS name or tailnet IP instead of `.local`, e.g. `http://<gateway-magicdns>:18793/__clawdbot__/canvas/`.
This server injects a live-reload client into HTML and reloads on file changes.
The A2UI host lives at `http://<gateway-host>:18793/__clawdbot__/a2ui/`.
Canvas commands (foreground only):
- `canvas.eval`, `canvas.snapshot`, `canvas.navigate` (use `{"url":""}` or `{"url":"/"}` to return to the default scaffold). `canvas.snapshot` returns `{ format, base64 }` (default `format="jpeg"`).
- A2UI: `canvas.a2ui.push`, `canvas.a2ui.reset` (`canvas.a2ui.pushJSONL` legacy alias)
Camera commands (foreground only; permission-gated):
- `camera.snap` (jpg)
- `camera.clip` (mp4)
See [`docs/camera.md`](/camera) for parameters and CLI helpers.

372
docs/platforms/ios.md Normal file
View File

@@ -0,0 +1,372 @@
---
summary: "iOS app (node): architecture + connection runbook"
read_when:
- Pairing or reconnecting the iOS node
- Debugging iOS bridge discovery or auth
- Sending screen/canvas commands to iOS
- Designing iOS node + gateway integration
- Extending the Gateway protocol for node/canvas commands
- Implementing Bonjour pairing or transport security
---
# iOS App (Node)
Status: prototype implemented (internal) · Date: 2025-12-13
## Connection Runbook
This is the practical “how do I connect the iOS node” guide:
**iOS app** ⇄ (Bonjour + TCP bridge) ⇄ **Gateway bridge** ⇄ (loopback WS) ⇄ **Gateway**
The Gateway WebSocket stays loopback-only (`ws://127.0.0.1:18789`). The iOS node talks to the LAN-facing **bridge** (default `tcp://0.0.0.0:18790`) and uses Gateway-owned pairing.
### Prerequisites
- You can run the Gateway on the “master” machine.
- iOS node app can reach the gateway bridge:
- Same LAN with Bonjour/mDNS, **or**
- Same Tailscale tailnet using Wide-Area Bonjour / unicast DNS-SD (see below), **or**
- Manual bridge host/port (fallback)
- You can run the CLI (`clawdbot`) on the gateway machine (or via SSH).
### 1) Start the Gateway (with bridge enabled)
Bridge is enabled by default (disable via `CLAWDBOT_BRIDGE_ENABLED=0`).
```bash
clawdbot gateway --port 18789 --verbose
```
Confirm in logs you see something like:
- `bridge listening on tcp://0.0.0.0:18790 (node)`
For tailnet-only setups (recommended for Vienna ⇄ London), bind the bridge to the gateway machines Tailscale IP instead:
- Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json` on the gateway host.
- Restart the Gateway / macOS menubar app.
### 2) Verify Bonjour discovery (optional but recommended)
From the gateway machine:
```bash
dns-sd -B _clawdbot-bridge._tcp local.
```
You should see your gateway advertising `_clawdbot-bridge._tcp`.
If browse works, but the iOS node cant connect, try resolving one instance:
```bash
dns-sd -L "<instance name>" _clawdbot-bridge._tcp local.
```
More debugging notes: [`docs/bonjour.md`](/bonjour).
#### Tailnet (Vienna ⇄ London) discovery via unicast DNS-SD
If the iOS node and the gateway are on different networks but connected via Tailscale, multicast mDNS wont cross the boundary. Use Wide-Area Bonjour / unicast DNS-SD instead:
1) Set up a DNS-SD zone (example `clawdbot.internal.`) on the gateway host and publish `_clawdbot-bridge._tcp` records.
2) Configure Tailscale split DNS for `clawdbot.internal` pointing at that DNS server.
Details and example CoreDNS config: [`docs/bonjour.md`](/bonjour).
### 3) Connect from the iOS node app
In the iOS node app:
- Pick the discovered bridge (or hit refresh).
- If not paired yet, it will initiate pairing automatically.
- After the first successful pairing, it will auto-reconnect **strictly to the last discovered gateway** on launch (including after reinstall), as long as the iOS Keychain entry is still present.
#### Connection indicator (always visible)
The Settings tab icon shows a small status dot:
- **Green**: connected to the bridge
- **Yellow**: connecting (subtle pulse)
- **Red**: not connected / error
### 4) Approve pairing (CLI)
On the gateway machine:
```bash
clawdbot nodes pending
```
Approve the request:
```bash
clawdbot nodes approve <requestId>
```
After approval, the iOS node receives/stores the token and reconnects authenticated.
Pairing details: [`docs/gateway/pairing.md`](/gateway/pairing).
### 5) Verify the node is connected
- In the macOS app: **Instances** tab should show something like `iOS Node (...)` with a green “Active” presence dot shortly after connect.
- Via nodes status (paired + connected):
```bash
clawdbot nodes status
```
- Via Gateway (paired + connected):
```bash
clawdbot gateway call node.list --params "{}"
```
- Via Gateway presence (legacy-ish, still useful):
```bash
clawdbot gateway call system-presence --params "{}"
```
Look for the node `instanceId` (often a UUID).
### 6) Drive the iOS Canvas (draw / snapshot)
The iOS node runs a WKWebView “Canvas” scaffold which exposes:
- `window.__clawdbot.canvas`
- `window.__clawdbot.ctx` (2D context)
- `window.__clawdbot.setStatus(title, subtitle)`
#### Gateway Canvas Host (recommended for web content)
If you want the node to show real HTML/CSS/JS that the agent can edit on disk, point it at the Gateway canvas host.
Note: nodes always use the standalone canvas host on `canvasHost.port` (default `18793`), bound to the bridge interface.
1) Create `~/clawd/canvas/index.html` on the gateway host.
2) Navigate the node to it (LAN):
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.navigate --params '{"url":"http://<gateway-hostname>.local:18793/__clawdbot__/canvas/"}'
```
Notes:
- The server injects a live-reload client into HTML and reloads on file changes.
- A2UI is hosted on the same canvas host at `http://<gateway-host>:18793/__clawdbot__/a2ui/`.
- Tailnet (optional): if both devices are on Tailscale, use a MagicDNS name or tailnet IP instead of `.local`, e.g. `http://<gateway-magicdns>:18793/__clawdbot__/canvas/`.
- iOS may require App Transport Security allowances to load plain `http://` URLs; if it fails to load, prefer HTTPS or adjust the iOS apps ATS config.
#### Draw with `canvas.eval`
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.eval --params "$(cat <<'JSON'
{"javaScript":"(() => { const {ctx,setStatus} = window.__clawdbot; setStatus('Drawing','…'); ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle='#ff2d55'; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); setStatus(null,null); return 'ok'; })()"}
JSON
)"
```
#### Snapshot with `canvas.snapshot`
```bash
clawdbot nodes invoke --node 192.168.0.88 --command canvas.snapshot --params '{"maxWidth":900}'
```
The response includes `{ format, base64 }` image data (default `format="jpeg"`; pass `{"format":"png"}` when you specifically need lossless PNG).
### Common gotchas
- **iOS in background:** all `canvas.*` commands fail fast with `NODE_BACKGROUND_UNAVAILABLE` (bring the iOS node app to foreground).
- **Return to default scaffold:** `canvas.navigate` with `{"url":""}` or `{"url":"/"}` returns to the built-in scaffold page.
- **mDNS blocked:** some networks block multicast; use a different LAN or plan a tailnet-capable bridge (see [`docs/discovery.md`](/discovery)).
- **Wrong node selector:** `--node` can be the node id (UUID), display name (e.g. `iOS Node`), IP, or an unambiguous prefix. If its ambiguous, the CLI will tell you.
- **Stale pairing / Keychain cleared:** if the pairing token is missing (or iOS Keychain was wiped), the node must pair again; approve a new pending request.
- **App reinstall but no reconnect:** the node restores `instanceId` + last bridge preference from Keychain; if it still comes up “unpaired”, verify Keychain persistence on your device/simulator and re-pair once.
## Design + Architecture
### Goals
- Build an **iOS app** that acts as a **remote node** for Clawdbot:
- **Voice trigger** (wake-word / always-listening intent) that forwards transcripts to the Gateway `agent` method.
- **Canvas** surface that the agent can control: navigate, draw/render, evaluate JS, snapshot.
- **Dead-simple setup**:
- Auto-discover the host on the local network via **Bonjour**.
- One-tap pairing with an approval prompt on the Mac.
- iOS is **never** a local gateway; it is always a remote node.
- Operational clarity:
- When iOS is backgrounded, voice may still run; **canvas commands must fail fast** with a structured error.
- Provide **settings**: node display name, enable/disable voice wake, pairing status.
Non-goals (v1):
- Exposing the Node Gateway directly on the LAN.
- Supporting arbitrary third-party “plugins” on iOS.
- Perfect App Store compliance; this is **internal-only** initially.
### Current repo reality (constraints we respect)
- The Gateway WebSocket server binds to `127.0.0.1:18789` ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts)) with an optional `CLAWDBOT_GATEWAY_TOKEN`.
- The Gateway exposes a Canvas file server (`canvasHost`) on `canvasHost.port` (default `18793`), so nodes can `canvas.navigate` to `http://<lanHost>:18793/__clawdbot__/canvas/` and auto-reload on file changes ([`docs/configuration.md`](/configuration)).
- macOS “Canvas” is controlled via the Gateway node protocol (`canvas.*`), matching iOS/Android ([`docs/mac/canvas.md`](/mac/canvas)).
- Voice wake forwards via `GatewayChannel` to Gateway `agent` (mac app: `VoiceWakeForwarder` → `GatewayConnection.sendAgent`).
### Recommended topology (B): Gateway-owned Bridge + loopback Gateway
Keep the Node gateway loopback-only; expose a dedicated **gateway-owned bridge** to the LAN/tailnet.
**iOS App** ⇄ (TLS + pairing) ⇄ **Bridge (in gateway)** ⇄ (loopback) ⇄ **Gateway WS** (`ws://127.0.0.1:18789`)
Why:
- Preserves current threat model: Gateway remains local-only.
- Centralizes auth, rate limiting, and allowlisting in the bridge.
- Lets us unify “canvas node” semantics across mac + iOS without exposing raw gateway methods.
### Security plan (internal, but still robust)
#### Transport
- **Current (v0):** bridge is a LAN-facing **TCP** listener with token-based auth after pairing.
- **Next:** wrap the bridge in **TLS** and prefer key-pinned or mTLS-like auth after pairing.
#### Pairing
- Bonjour discovery shows a candidate “Clawdbot Bridge” on the LAN.
- First connection:
1) iOS generates a keypair (Secure Enclave if available).
2) iOS connects to the bridge and requests pairing.
3) The bridge forwards the pairing request to the **Gateway** as a *pending request*.
4) Approval can happen via:
- **macOS UI** (Clawdbot shows an alert with Approve/Reject/Later, including the node IP), or
- **Terminal/CLI** (headless flows).
5) Once approved, the bridge returns a token to iOS; iOS stores it in Keychain.
- Subsequent connections:
- The bridge requires the paired identity. Unpaired clients get a structured “not paired” error and no access.
##### Gateway-owned pairing (Option B details)
Pairing decisions must be owned by the Gateway (`clawd` / Node) so nodes can be approved without the macOS app running.
Key idea:
- The Swift app may still show an alert, but it is only a **frontend** for pending requests stored in the Gateway.
Desired behavior:
- If the Swift UI is present: show alert with Approve/Reject/Later.
- If the Swift UI is not present: `clawdbot` CLI can list pending requests and approve/reject.
See [`docs/gateway/pairing.md`](/gateway/pairing) for the API/events and storage.
CLI (headless approvals):
- `clawdbot nodes pending`
- `clawdbot nodes approve <requestId>`
- `clawdbot nodes reject <requestId>`
#### Authorization / scope control (bridge-side ACL)
The bridge must not be a raw proxy to every gateway method.
- Allow by default:
- `agent` (with guardrails; idempotency required)
- minimal `system-event` beacons (presence updates for the node)
- node/canvas methods defined below (new protocol surface)
- Deny by default:
- anything that widens control without explicit intent (future “shell”, “files”, etc.)
- Rate limit:
- handshake attempts
- voice forwards per minute
- snapshot frequency / payload size
### Protocol unification: add “node/canvas” to Gateway protocol
#### Principle
Unify mac Canvas + iOS Canvas under a single conceptual surface:
- The agent talks to the Gateway using a stable method set (typed protocol).
- The Gateway routes node-targeted requests to:
- local mac Canvas implementation, or
- remote iOS node via the bridge
#### Minimal protocol additions (v1)
Add to [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) (and regenerate Swift models):
**Identity**
- Node identity comes from `connect.params.client.instanceId` (stable), and `connect.params.client.mode = "node"` (or `"ios-node"`).
**Methods**
- `node.list` → list paired/connected nodes + capabilities
- `node.describe` → describe a node (capabilities + supported `node.invoke` commands)
- `node.invoke` → send a command to a specific node
- Params: `{ nodeId, command, params?, timeoutMs? }`
**Events**
- `node.event` → async node status/errors
- e.g. background/foreground transitions, voice availability, canvas availability
#### Node command set (canvas)
These are values for `node.invoke.command`:
- `canvas.present` / `canvas.hide`
- `canvas.navigate` with `{ url }` (loads a URL; use `""` or `"/"` to return to the default scaffold)
- `canvas.eval` with `{ javaScript }`
- `canvas.snapshot` with `{ maxWidth?, quality?, format? }`
- A2UI (mobile + macOS canvas):
- `canvas.a2ui.push` with `{ messages: [...] }` (A2UI v0.8 server→client messages)
- `canvas.a2ui.pushJSONL` with `{ jsonl: "..." }` (legacy alias)
- `canvas.a2ui.reset`
- A2UI is hosted by the Gateway canvas host (`/__clawdbot__/a2ui/`) on `canvasHost.port`. Commands fail if the host is unreachable.
Result pattern:
- Request is a standard `req/res` with `ok` / `error`.
- Long operations (loads, streaming drawing, etc.) may also emit `node.event` progress.
##### Current (implemented)
As of 2025-12-13, the Gateway supports `node.invoke` for bridge-connected nodes.
Example: draw a diagonal line on the iOS Canvas:
```bash
clawdbot nodes invoke --node ios-node --command canvas.eval --params '{"javaScript":"(() => { const {ctx} = window.__clawdbot; ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle=\"#ff2d55\"; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); return \"ok\"; })()"}'
```
### Background behavior requirement
When iOS is backgrounded:
- Voice may still be active (subject to iOS suspension).
- **All `canvas.*` commands must fail** with a stable error code, e.g.:
- `NODE_BACKGROUND_UNAVAILABLE`
- Include `retryable: true` and `retryAfterMs` if we want the agent to wait.
## iOS app architecture (SwiftUI)
### App structure
- Single fullscreen Canvas surface (WKWebView).
- One settings entry point: a **gear button** that opens a settings sheet.
- All navigation is **agent-driven** (no local URL bar).
### Components
- `BridgeDiscovery`: Bonjour browse + resolve (Network.framework `NWBrowser`)
- `BridgeConnection`: TCP session + pairing handshake + reconnect (TLS planned)
- `NodeRuntime`:
- Voice pipeline (wake-word + capture + forward)
- Canvas pipeline (WKWebView controller + snapshot + eval)
- Background state tracking; enforces “canvas unavailable in background”
### Voice in background (internal)
- Enable background audio mode (and required session configuration) so the mic pipeline can keep running when the user switches apps.
- If iOS suspends the app anyway, surface a clear node status (`node.event`) so operators can see voice is unavailable.
## Code sharing (macOS + iOS)
Create/expand SwiftPM targets so both apps share:
- `ClawdbotProtocol` (generated models; platform-neutral)
- `ClawdbotGatewayClient` (shared WS framing + connect/req/res + seq-gap handling)
- `ClawdbotKit` (node/canvas command types + deep links + shared utilities)
macOS continues to own:
- local Canvas implementation details (custom scheme handler serving on-disk HTML, window/panel presentation)
iOS owns:
- iOS-specific audio/speech + WKWebView presentation and lifecycle
## Repo layout
- iOS app: `apps/ios/` (XcodeGen `project.yml`)
- Shared Swift packages: `apps/shared/`
- Lint/format: iOS target runs `swiftformat --lint` + `swiftlint lint` using repo configs (`.swiftformat`, `.swiftlint.yml`).
Generate the Xcode project:
```bash
cd apps/ios
xcodegen generate
open Clawdbot.xcodeproj
```
## Storage plan (private by default)
### iOS
- Canvas/workspace files (persistent, private):
- `Application Support/Clawdbot/canvas/<sessionKey>/...`
- Snapshots / temp exports (evictable):
- `Library/Caches/Clawdbot/canvas-snapshots/<sessionKey>/...`
- Credentials:
- Keychain (paired identity + bridge trust anchor)
## Related docs
- [`docs/gateway.md`](/gateway) (gateway runbook)
- [`docs/gateway/pairing.md`](/gateway/pairing) (approval + storage)
- [`docs/bonjour.md`](/bonjour) (discovery debugging)
- [`docs/discovery.md`](/discovery) (LAN vs tailnet vs SSH)

11
docs/platforms/linux.md Normal file
View File

@@ -0,0 +1,11 @@
---
summary: "Linux app status + contribution call"
read_when:
- Looking for Linux companion app status
- Planning platform coverage or contributions
---
# Linux App
Clawdbot core is fully supported on Linux. The core is written in TypeScript, so it runs anywhere Node runs.
We do not have a Linux companion app yet. It is planned, and we would love contributions to make it happen.

133
docs/platforms/mac/bun.md Normal file
View File

@@ -0,0 +1,133 @@
---
summary: "Bundled bun gateway: packaging, launchd, signing, and bytecode"
read_when:
- Packaging Clawdbot.app
- Debugging the bundled gateway binary
- Changing bun build flags or codesigning
---
# Bundled bun Gateway (macOS)
Goal: ship **Clawdbot.app** with a self-contained relay binary that can run both the CLI and the Gateway daemon. No global `npm install -g clawdbot`, no system Node requirement.
## What gets bundled
App bundle layout:
- `Clawdbot.app/Contents/Resources/Relay/clawdbot`
- bun `--compile` relay executable built from [`dist/macos/relay.js`](https://github.com/clawdbot/clawdbot/blob/main/dist/macos/relay.js)
- Supports:
- `clawdbot …` (CLI)
- `clawdbot gateway-daemon …` (LaunchAgent daemon)
- `Clawdbot.app/Contents/Resources/Relay/package.json`
- tiny “p runtime compatibility” file (see below)
- `Clawdbot.app/Contents/Resources/Relay/theme/`
- p TUI theme payload (optional, but strongly recommended)
Why the sidecar files matter:
- The embedded p runtime detects “bun binary mode” and then looks for `package.json` + `theme/` **next to `process.execPath`** (i.e. next to `clawdbot`).
- So even if bun can embed assets, the runtime expects filesystem paths. Keep the sidecar files.
## Build pipeline
Packaging script:
- [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh)
It builds:
- TS: `pnpm exec tsc`
- Swift app + helper: `swift build …`
- bun relay: `bun build dist/macos/relay.js --compile --bytecode …`
Important bundler flags:
- `--compile`: produces a standalone executable
- `--bytecode`: reduces startup time / parsing overhead (works here)
- externals:
- `-e electron`
- Reason: avoid bundling Electron stubs in the relay binary
Version injection:
- `--define "__CLAWDBOT_VERSION__=\"<pkg version>\""`
- [`src/version.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/version.ts) also supports `__CLAWDBOT_VERSION__` (and `CLAWDBOT_BUNDLED_VERSION`) so `--version` doesnt depend on reading `package.json` at runtime.
## Launchd (Gateway as LaunchAgent)
Label:
- `com.clawdbot.gateway`
Plist location (per-user):
- `~/Library/LaunchAgents/com.clawdbot.gateway.plist`
Manager:
- [`apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift)
Behavior:
- “Clawdbot Active” enables/disables the LaunchAgent.
- App quit does **not** stop the gateway (launchd keeps it alive).
Logging:
- launchd stdout/err: `/tmp/clawdbot/clawdbot-gateway.log`
Default LaunchAgent env:
- `CLAWDBOT_IMAGE_BACKEND=sips` (avoid sharp native addon under bun)
## Codesigning (hardened runtime + bun)
Symptom (when mis-signed):
- `Ran out of executable memory …` on launch
Fix:
- The bun executable needs JIT-ish permissions under hardened runtime.
- [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) signs `Relay/clawdbot` with:
- `com.apple.security.cs.allow-jit`
- `com.apple.security.cs.allow-unsigned-executable-memory`
## Image processing under bun
Problem:
- bun cant load some native Node addons like `sharp` (and we dont want to ship native addon trees for the gateway).
Solution:
- Central helper [`src/media/image-ops.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/media/image-ops.ts)
- Prefers `/usr/bin/sips` on macOS (esp. when running under bun)
- Falls back to `sharp` when available (Node/dev)
- Used by:
- [`src/web/media.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/media.ts) (optimize inbound/outbound images)
- [`src/browser/screenshot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/browser/screenshot.ts)
- [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts) (image sanitization)
## Browser control server
The Gateway starts the browser control server (loopback only) from [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts).
Its started from the relay daemon process, so the relay binary includes Playwright deps.
## Tests / smoke checks
From a packaged app (local build):
```bash
dist/Clawdbot.app/Contents/Resources/Relay/clawdbot --version
CLAWDBOT_SKIP_PROVIDERS=1 \
CLAWDBOT_SKIP_CANVAS_HOST=1 \
dist/Clawdbot.app/Contents/Resources/Relay/clawdbot gateway-daemon --port 18999 --bind loopback
```
Then, in another shell:
```bash
pnpm -s clawdbot gateway call health --url ws://127.0.0.1:18999 --timeout 3000
```
## Repo hygiene
Bun may leave dotfiles like `*.bun-build` in the repo root or subfolders.
- These are ignored via `.gitignore` (`*.bun-build`).
## DMG styling (human installer)
[`scripts/create-dmg.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/create-dmg.sh) styles the DMG via Finder AppleScript.
Rules of thumb:
- Use a **72dpi** background image that matches the Finder window size in points.
- Preferred asset: `assets/dmg-background-small.png` (**500×320**).
- Default icon positions: app `{125,160}`, Applications `{375,160}`.

View File

@@ -0,0 +1,161 @@
---
summary: "Agent-controlled Canvas panel embedded via WKWebView + custom URL scheme"
read_when:
- Implementing the macOS Canvas panel
- Adding agent controls for visual workspace
- Debugging WKWebView canvas loads
---
# Canvas (macOS app)
Status: draft spec · Date: 2025-12-12
Note: for iOS/Android nodes that should render agent-edited HTML/CSS/JS over the network, prefer the Gateway `canvasHost` (serves `~/clawd/canvas` over LAN/tailnet with live reload). A2UI is also **hosted by the Gateway** over HTTP. This doc focuses on the macOS in-app canvas panel. See [`docs/configuration.md`](/configuration).
Clawdbot can embed an agent-controlled “visual workspace” panel (“Canvas”) inside the macOS app using `WKWebView`, served via a **custom URL scheme** (no loopback HTTP port required).
This is designed for:
- Agent-written HTML/CSS/JS on disk (per-session directory).
- A real browser engine for layout, rendering, and basic interactivity.
- Agent-driven visibility (show/hide), navigation, DOM/JS queries, and snapshots.
- Minimal chrome: borderless panel; bezel/chrome appears only on hover.
## Why a custom scheme (vs. loopback HTTP)
Using `WKURLSchemeHandler` keeps Canvas entirely in-process:
- No port conflicts and no extra local server lifecycle.
- Easier to sandbox: only serve files we explicitly map.
- Works offline and can use an ephemeral data store (no persistent cookies/cache).
If a Canvas page truly needs “real web” semantics (CORS, fetch to loopback endpoints, service workers), consider the loopback-server variant instead (out of scope for this doc).
## URL ↔ directory mapping
The Canvas scheme is:
- `clawdbot-canvas://<session>/<path>`
Routing model:
- `clawdbot-canvas://main/``<canvasRoot>/main/index.html` (or `index.htm`)
- `clawdbot-canvas://main/yolo``<canvasRoot>/main/yolo/index.html` (or `index.htm`)
- `clawdbot-canvas://main/assets/app.css``<canvasRoot>/main/assets/app.css`
Directory listings are not served.
When `/` has no `index.html` yet, the handler serves a **built-in scaffold page** (bundled with the macOS app).
This is a visual placeholder only (no A2UI renderer).
### Suggested on-disk location
Store Canvas state under the app support directory:
- `~/Library/Application Support/Clawdbot/canvas/<session>/…`
This keeps it alongside other app-owned state and avoids mixing with `~/.clawdbot/` gateway config.
## Panel behavior (agent-controlled)
Canvas is presented as a borderless `NSPanel` (similar to the existing WebChat panel):
- Can be shown/hidden at any time by the agent.
- Supports an “anchored” presentation (near the menu bar icon or another anchor rect).
- Uses a rounded container; shadow stays on, but **chrome/bezel only appears on hover**.
- Default position is the **top-right corner** of the current screens visible frame (unless the user moved/resized it previously).
- The panel is **user-resizable** (edge resize + hover resize handle) and the last frame is persisted per session.
### Hover-only chrome
Implementation notes:
- Keep the window borderless at all times (dont toggle `styleMask`).
- Add an overlay view inside the content container for chrome (stroke + subtle gradient/material).
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`.
- Optionally show close/drag affordances only while hovered.
## Agent API surface (current)
Canvas is exposed via the Gateway **node bridge**, so the agent can:
- Show/hide the panel.
- Navigate to a path (relative to the session root).
- Evaluate JavaScript and optionally return results.
- Query/modify DOM (helpers mirroring “dom query/all/attr/click/type/wait” patterns).
- Capture a snapshot image of the current canvas view.
- Optionally set panel placement (screen `x/y` + `width/height`) when showing/navigating.
This should be modeled after `WebChatManager`/`WebChatSwiftUIWindowController` but targeting `clawdbot-canvas://…` URLs.
Related:
- For “invoke the agent again from UI” flows, prefer the macOS deep link scheme (`clawdbot://agent?...`) so *any* UI surface (Canvas, WebChat, native views) can trigger a new agent run. See [`docs/macos.md`](/macos).
## Agent commands (current)
Use the main `clawdbot` CLI; it invokes canvas commands via `node.invoke`.
- `clawdbot canvas present [--node <id>] [--target <...>] [--x/--y/--width/--height]`
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
- If `/` has no index file, Canvas shows the built-in scaffold page and returns `status: "welcome"`.
- `clawdbot canvas hide [--node <id>]`
- `clawdbot canvas eval --js <code> [--node <id>]`
- `clawdbot canvas snapshot [--node <id>]`
### Canvas A2UI
Canvas A2UI is hosted by the **Gateway canvas host** at:
```
http://<gateway-host>:18793/__clawdbot__/a2ui/
```
The macOS app simply renders that page in the Canvas panel. The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line):
- `clawdbot canvas a2ui push --jsonl <path> [--node <id>]`
- `clawdbot canvas a2ui reset [--node <id>]`
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer).
Minimal example (v0.8):
```bash
cat > /tmp/a2ui-v0.8.jsonl <<'EOF'
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, `canvas a2ui push` works."},"usageHint":"body"}}}]}}
{"beginRendering":{"surfaceId":"main","root":"root"}}
EOF
clawdbot canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
```
Notes:
- This does **not** support the A2UI v0.9 examples using `createSurface`.
- A2UI **fails** if the Gateway canvas host is unreachable (no local fallback).
- `canvas a2ui push` validates JSONL (line numbers on errors) and rejects v0.9 payloads.
- Quick smoke: `clawdbot canvas a2ui push --text "Hello from A2UI"` renders a minimal v0.8 view.
## Triggering agent runs from Canvas (deep links)
Canvas can trigger new agent runs via the macOS app deep-link scheme:
- `clawdbot://agent?...`
This is intentionally separate from `clawdbot-canvas://…` (which is only for serving local Canvas files into the `WKWebView`).
Suggested patterns:
- HTML: render links/buttons that navigate to `clawdbot://agent?message=...`.
- JS: set `window.location.href = 'clawdbot://agent?...'` for “run this now” actions.
Implementation note (important):
- In `WKWebView`, intercept `clawdbot://…` navigations in `WKNavigationDelegate` and forward them to the app, e.g. by calling `DeepLinkHandler.shared.handle(url:)` and returning `.cancel` for the navigation.
Safety:
- Deep links (`clawdbot://agent?...`) are always enabled.
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
- With a valid `key`, the run is unattended (no prompt). For Canvas-originated actions, the app injects an internal key automatically.
## Security / guardrails
Recommended defaults:
- `WKWebsiteDataStore.nonPersistent()` for Canvas (ephemeral).
- Navigation policy: allow only `clawdbot-canvas://…` (and optionally `about:blank`); open `http/https` externally.
- Scheme handler must prevent directory traversal: resolved file paths must stay under `<canvasRoot>/<session>/`.
- Disable or tightly scope any JS bridge; prefer query-string/bootstrap config over `window.webkit.messageHandlers` for sensitive data.
## Debugging
Suggested debugging hooks:
- Enable Web Inspector for Canvas builds (same approach as WebChat).
- Log scheme requests + resolution decisions to OSLog (subsystem `com.clawdbot`, category `Canvas`).
- Provide a “copy canvas dir” action in debug settings to quickly reveal the session directory in Finder.

View File

@@ -0,0 +1,72 @@
---
summary: "Running the gateway as a child process of the macOS app and why"
read_when:
- Integrating the mac app with the gateway lifecycle
---
# Clawdbot gateway as a child process of the macOS app
Date: 2025-12-06 · Status: draft · Owner: steipete
Note (2025-12-19): the current implementation prefers a **launchd LaunchAgent** that runs the **bundled bun-compiled gateway**. This doc remains as an alternative mode for tighter coupling to the UI.
## Goal
Run the Node-based Clawdbot/clawdbot gateway as a direct child of the LSUIElement app (instead of a launchd agent) while keeping all TCC-sensitive work inside the Swift app/broker layer and wiring the existing “Clawdbot Active” toggle to start/stop the child.
## When to prefer the child-process mode
- You want gateway lifetime strictly coupled to the menu-bar app (dies when the app quits) and controlled by the “Clawdbot Active” toggle without touching launchd.
- Youre okay giving up login persistence/auto-restart that launchd provides, or youll add your own backoff loop.
- You want simpler log capture and supervision inside the app (no external plist or user-visible LaunchAgent).
## Tradeoffs vs. launchd
- **Pros:** tighter coupling to UI state; simpler surface (no plist install/bootout); easier to stream stdout/stderr; fewer moving parts for beta users.
- **Cons:** no built-in KeepAlive/login auto-start; app crash kills gateway; you must build your own restart/backoff; Activity Monitor will show both processes under the app; still need correct TCC handling (see below).
- **TCC:** behaviorally, child processes often inherit the parent apps “responsible process” for TCC, but this is *not a contract*. Continue to route all protected actions through the Swift app/broker so prompts stay tied to the signed app bundle.
## TCC guardrails (must keep)
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; route through the apps node commands (via Gateway `node.invoke`) for:
- `system.notify`
- `system.run` (including `needsScreenRecording`)
- `screen.record` / `camera.*`
- PeekabooBridge UI automation (`peekaboo …`)
- Usage strings (`NSMicrophoneUsageDescription`, `NSSpeechRecognitionUsageDescription`, etc.) stay in the app targets Info.plist; a bare Node binary has none and would fail.
- If you ever embed Node that *must* touch TCC, wrap that call in a tiny signed helper target inside the app bundle and have Node exec that helper instead of calling the API directly.
## Process manager design (Swift Subprocess)
- Add a small `GatewayProcessManager` (Swift) that owns:
- `execution: Execution?` from `Swift Subprocess` to track the child.
- `start(config)` called when “Clawdbot Active” flips ON:
- binary: host Node running the bundled gateway under `Clawdbot.app/Contents/Resources/Gateway/`
- args: current clawdbot entrypoint and flags
- cwd/env: point to `~/.clawdbot` as today; inject the expanded PATH so Homebrew Node resolves under launchd
- output: stream stdout/stderr to `/tmp/clawdbot-gateway.log` (cap buffer via Subprocess OutputLimits)
- restart: optional linear/backoff restart if exit was non-zero and Active is still true
- `stop()` called when Active flips OFF or app terminates: cancel the execution and `waitUntilExit`.
- Wire SwiftUI toggle:
- ON: `GatewayProcessManager.start(...)`
- OFF: `GatewayProcessManager.stop()` (no launchctl calls in this mode)
- Keep the existing `LaunchdManager` around so we can switch back if needed; the toggle can choose between launchd or child mode with a flag if we want both.
## Packaging and signing
- Bundle the gateway payload (dist + production node_modules) under `Contents/Resources/Gateway/`; rely on host Node ≥22 instead of embedding a runtime.
- Codesign native addons and dylibs inside the bundle; no nested runtime binary to sign now.
- Host runtime should not call TCC APIs directly; keep privileged work inside the app/broker.
## Logging and observability
- Stream child stdout/stderr to `/tmp/clawdbot-gateway.log`; surface the last N lines in the Debug tab.
- Emit a user notification (via existing NotificationManager) on crash/exit while Active is true.
- Add a lightweight heartbeat from Node → app (e.g., ping over stdout) so the app can show status in the menu.
## Failure/edge cases
- App crash/quit kills the gateway. Decide if that is acceptable for the deployment tier; otherwise, stick with launchd for production and keep child-process for dev/experiments.
- If the gateway exits repeatedly, back off (e.g., 1s/2s/5s/10s) and give up after N attempts with a menu warning.
- Respect the existing pause semantics: when paused, the broker should return `ok=false, "clawdbot paused"`; the gateway should avoid calling privileged routes while paused.
## Open questions / follow-ups
- Do we need dual-mode (launchd for prod, child for dev)? If yes, gate via a setting or build flag.
- Embedding a runtime is off the table for now; we rely on host Node for size/simplicity. Revisit only if host PATH drift becomes painful.
- Do we want a tiny signed helper for rare TCC actions that cannot be brokered via the Swift app/broker?
## Decision snapshot (current recommendation)
- Keep all TCC surfaces in the Swift app/broker (node commands + PeekabooBridgeHost).
- Implement `GatewayProcessManager` with Swift Subprocess to start/stop the gateway on the “Clawdbot Active” toggle.
- Maintain the launchd path as a fallback for uptime/login persistence until child-mode proves stable.

View File

@@ -0,0 +1,96 @@
---
summary: "Setup guide for developers working on the Clawdbot macOS app"
read_when:
- Setting up the macOS development environment
---
# macOS Developer Setup
This guide covers the necessary steps to build and run the Clawdbot macOS application from source.
## Prerequisites
Before building the app, ensure you have the following installed:
1. **Xcode 26.2+**: Required for Swift development.
2. **Node.js & pnpm**: Required for the gateway and CLI components.
3. **Bun**: Required to package the embedded gateway relay.
```bash
curl -fsSL https://bun.sh/install | bash
```
## 1. Initialize Submodules
Clawdbot depends on several submodules (like `Peekaboo`). You must initialize these recursively:
```bash
git submodule update --init --recursive
```
## 2. Install Dependencies
Install the project-wide dependencies:
```bash
pnpm install
```
## 3. Build and Package the App
To build the macOS app and package it into `dist/Clawdbot.app`, run:
```bash
./scripts/package-mac-app.sh
```
If you don't have an Apple Developer ID certificate, the script will automatically use **ad-hoc signing** (`-`).
> **Note**: Ad-hoc signed apps may trigger security prompts. If the app crashes immediately with "Abort trap 6", see the [Troubleshooting](#troubleshooting) section.
## 4. Install the CLI Helper
The macOS app requires a symlink named `clawdbot` in `/usr/local/bin` or `/opt/homebrew/bin` to manage background tasks.
**To install it:**
1. Open the Clawdbot app.
2. Go to the **General** settings tab.
3. Click **"Install CLI helper"** (requires administrator privileges).
Alternatively, you can manually link it from your Admin account:
```bash
sudo ln -sf "/Users/$(whoami)/Projects/clawdbot/dist/Clawdbot.app/Contents/Resources/Relay/clawdbot" /usr/local/bin/clawdbot
```
## Troubleshooting
### Build Fails: Toolchain or SDK Mismatch
The macOS app build expects the latest macOS SDK and Swift 6.2 toolchain.
**System dependencies (required):**
- **Latest macOS version available in Software Update** (required by Xcode 26.2 SDKs)
- **Xcode 26.2** (Swift 6.2 toolchain)
**Checks:**
```bash
xcodebuild -version
xcrun swift --version
```
If versions dont match, update macOS/Xcode and re-run the build.
### App Crashes on Permission Grant
If the app crashes when you try to allow **Speech Recognition** or **Microphone** access, it may be due to a corrupted TCC cache or signature mismatch.
**Fix:**
1. Reset the TCC permissions:
```bash
tccutil reset All com.clawdbot.mac.debug
```
2. If that fails, change the `BUNDLE_ID` temporarily in [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) to force a "clean slate" from macOS.
### Gateway "Starting..." indefinitely
If the gateway status stays on "Starting...", check if a zombie process is holding the port:
```bash
lsof -nP -i :18789
```
Kill any existing `node` or `clawdbot` processes listening on that port and restart the app.

View File

@@ -0,0 +1,28 @@
---
summary: "How the macOS app reports gateway/Baileys health states"
read_when:
- Debugging mac app health indicators
---
# Health Checks on macOS
How to see whether the WhatsApp Web/Baileys bridge is healthy from the menu bar app.
## Menu bar
- Status dot now reflects Baileys health:
- Green: linked + socket opened recently.
- Orange: connecting/retrying.
- Red: logged out or probe failed.
- Secondary line reads "Web: linked · auth 12m · socket ok" or shows the failure reason.
- "Run Health Check" menu item triggers an on-demand probe.
## Settings
- General tab gains a Health card showing: linked auth age, session-store path/count, last check time, last error/status code, and buttons for Run Health Check / Reveal Logs.
- Uses a cached snapshot so the UI loads instantly and falls back gracefully when offline.
- **Connections tab** surfaces provider status + controls for WhatsApp/Telegram (login QR, logout, probe, last disconnect/error).
## How the probe works
- App runs `clawdbot health --json` via `ShellExecutor` every ~60s and on demand. The probe loads creds, attempts a short Baileys connect, and reports status without sending messages.
- Cache the last good snapshot and the last error separately to avoid flicker; show the timestamp of each.
## When in doubt
- You can still use the CLI flow in [`docs/health.md`](/health) (`clawdbot status`, `clawdbot status --deep`, `clawdbot health --json`) and tail `/tmp/clawdbot/clawdbot-*.log` for `web-heartbeat` / `web-reconnect`.

View File

@@ -0,0 +1,26 @@
---
summary: "Menu bar icon states and animations for Clawdbot on macOS"
read_when:
- Changing menu bar icon behavior
---
# Menu Bar Icon States
Author: steipete · Updated: 2025-12-06 · Scope: macOS app (`apps/macos`)
- **Idle:** Normal icon animation (blink, occasional wiggle).
- **Paused:** Status item uses `appearsDisabled`; no motion.
- **Voice trigger (big ears):** Voice wake detector calls `AppState.triggerVoiceEars(ttl: nil)` when the wake word is heard, keeping `earBoostActive=true` while the utterance is captured. Ears scale up (1.9x), get circular ear holes for readability, then drop via `stopVoiceEars()` after 1s of silence. Only fired from the in-app voice pipeline.
- **Working (agent running):** `AppState.isWorking=true` drives a “tail/leg scurry” micro-motion: faster leg wiggle and slight offset while work is in-flight. Currently toggled around WebChat agent runs; add the same toggle around other long tasks when you wire them.
Wiring points
- Voice wake: runtime/tester call `AppState.triggerVoiceEars(ttl: nil)` on trigger and `stopVoiceEars()` after 1s of silence to match the capture window.
- Agent activity: set `AppStateStore.shared.setWorking(true/false)` around work spans (already done in WebChat agent call). Keep spans short and reset in `defer` blocks to avoid stuck animations.
Shapes & sizes
- Base icon drawn in `CritterIconRenderer.makeIcon(blink:legWiggle:earWiggle:earScale:earHoles:)`.
- Ear scale defaults to `1.0`; voice boost sets `earScale=1.9` and toggles `earHoles=true` without changing overall frame (18×18pt template image rendered into a 36×36px Retina backing store).
- Scurry uses leg wiggle up to ~1.0 with a small horizontal jiggle; its additive to any existing idle wiggle.
Behavioral notes
- No external CLI/broker toggle for ears/working; keep it internal to the apps own signals to avoid accidental flapping.
- Keep TTLs short (&lt;10s) so the icon returns to baseline quickly if a job hangs.

View File

@@ -0,0 +1,51 @@
---
summary: "Clawdbot logging: rolling diagnostics file log + unified log privacy flags"
read_when:
- Capturing macOS logs or investigating private data logging
- Debugging voice wake/session lifecycle issues
---
# Logging (macOS)
## Rolling diagnostics file log (Debug pane)
Clawdbot routes macOS app logs through swift-log (unified logging by default) and can write a local, rotating file log to disk when you need a durable capture.
- Verbosity: **Debug pane → Logs → App logging → Verbosity**
- Enable: **Debug pane → Logs → App logging → “Write rolling diagnostics log (JSONL)”**
- Location: `~/Library/Logs/Clawdbot/diagnostics.jsonl` (rotates automatically; old files are suffixed with `.1`, `.2`, …)
- Clear: **Debug pane → Logs → App logging → “Clear”**
Notes:
- This is **off by default**. Enable only while actively debugging.
- Treat the file as sensitive; dont share it without review.
## Unified logging private data on macOS
Unified logging redacts most payloads unless a subsystem opts into `privacy -off`. Per Peter's write-up on macOS [logging privacy shenanigans](https://steipete.me/posts/2025/logging-privacy-shenanigans) (2025) this is controlled by a plist in `/Library/Preferences/Logging/Subsystems/` keyed by the subsystem name. Only new log entries pick up the flag, so enable it before reproducing an issue.
## Enable for Clawdbot (`com.clawdbot`)
- Write the plist to a temp file first, then install it atomically as root:
```bash
cat <<'EOF' >/tmp/com.clawdbot.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>DEFAULT-OPTIONS</key>
<dict>
<key>Enable-Private-Data</key>
<true/>
</dict>
</dict>
</plist>
EOF
sudo install -m 644 -o root -g wheel /tmp/com.clawdbot.plist /Library/Preferences/Logging/Subsystems/com.clawdbot.plist
```
- No reboot is required; logd notices the file quickly, but only new log lines will include private payloads.
- View the richer output with the existing helper, e.g. `./scripts/clawlog.sh --category WebChat --last 5m`.
## Disable after debugging
- Remove the override: `sudo rm /Library/Preferences/Logging/Subsystems/com.clawdbot.plist`.
- Optionally run `sudo log config --reload` to force logd to drop the override immediately.
- Remember this surface can include phone numbers and message bodies; keep the plist in place only while you actively need the extra detail.

View File

@@ -0,0 +1,69 @@
---
summary: "Menu bar status logic and what is surfaced to users"
read_when:
- Tweaking mac menu UI or status logic
---
# Menu Bar Status Logic
## What is shown
- We surface the current agent work state in the menu bar icon and in the first status row of the menu.
- Health status is hidden while work is active; it returns when all sessions are idle.
- The “Nodes” block in the menu lists **devices** only (gateway bridge nodes via `node.list`), not client/presence entries.
## State model
- Sessions: events arrive with `runId` (per-run) plus `sessionKey` in the payload. The “main” session is the key `main`; if absent, we fall back to the most recently updated session.
- Priority: main always wins. If main is active, its state is shown immediately. If main is idle, the most recently active nonmain session is shown. We do not flipflop midactivity; we only switch when the current session goes idle or main becomes active.
- Activity kinds:
- `job`: highlevel command execution (`state: started|streaming|done|error`).
- `tool`: `phase: start|result` with `toolName` and `meta/args`.
## IconState enum (Swift)
- `idle`
- `workingMain(ActivityKind)`
- `workingOther(ActivityKind)`
- `overridden(ActivityKind)` (debug override)
### ActivityKind → glyph
- `bash` → 💻
- `read` → 📄
- `write` → ✍️
- `edit` → 📝
- `attach` → 📎
- default → 🛠️
### Visual mapping
- `idle`: normal critter.
- `workingMain`: badge with glyph, full tint, leg “working” animation.
- `workingOther`: badge with glyph, muted tint, no scurry.
- `overridden`: uses the chosen glyph/tint regardless of activity.
## Status row text (menu)
- While work is active: `<Session role> · <activity label>`
- Examples: `Main · bash: pnpm test`, `Other · read: apps/macos/Sources/Clawdbot/AppState.swift`.
- When idle: falls back to the health summary.
## Event ingestion
- Source: controlchannel `agent` events (`ControlChannel.handleAgentEvent`).
- Parsed fields:
- `stream: "job"` with `data.state` for start/stop.
- `stream: "tool"` with `data.phase`, `name`, optional `meta`/`args`.
- Labels:
- `bash`: first line of `args.command`.
- `read`/`write`: shortened path.
- `edit`: path plus inferred change kind from `meta`/diff counts.
- fallback: tool name.
## Debug override
- Settings ▸ Debug ▸ “Icon override” picker:
- `System (auto)` (default)
- `Working: main` (per tool kind)
- `Working: other` (per tool kind)
- `Idle`
- Stored via `@AppStorage("iconOverride")`; mapped to `IconState.overridden`.
## Testing checklist
- Trigger main session job: verify icon switches immediately and status row shows main label.
- Trigger nonmain session job while main idle: icon/status shows nonmain; stays stable until it finishes.
- Start main while other active: icon flips to main instantly.
- Rapid tool bursts: ensure badge does not flicker (TTL grace on tool results).
- Health row reappears once all sessions idle.

View File

@@ -0,0 +1,170 @@
---
summary: "Plan for integrating Peekaboo automation into Clawdbot via PeekabooBridge (socket-based TCC broker)"
read_when:
- Hosting PeekabooBridge in Clawdbot.app
- Integrating Peekaboo as a submodule
- Changing PeekabooBridge protocol/paths
---
# Peekaboo Bridge in Clawdbot (macOS UI automation broker)
## TL;DR
- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`).
- Clawdbot integrates by **optionally hosting the same bridge** inside **Clawdbot.app** (user-toggleable). The primary client is the **`peekaboo` CLI** (installed via npm); Clawdbot does not need its own `ui …` CLI surface.
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdbot stays a thin broker host. No visualizer toggle in Clawdbot.
Non-goals:
- No auto-launching Peekaboo.app.
- No onboarding deep links from the automation endpoint (Clawdbot onboarding already handles permissions).
- No AI provider/agent runtime dependencies in Clawdbot (avoid pulling Tachikoma/MCP into the Clawdbot app/CLI).
## Big refactor (Dec 2025): XPC → Bridge
Peekaboos privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdbot this is a win:
- It matches the existing “local socket + codesign checks” approach.
- It lets us piggyback on **either** Peekaboo.apps permissions **or** Clawdbot.apps permissions (whichever is running).
- It avoids “two apps with two TCC bubbles” unless needed.
Reference (Peekaboo submodule): `Peekaboo/docs/bridge-host.md`.
## Architecture
### Processes
- **Bridge hosts** (provide TCC-backed automation):
- **Peekaboo.app** (preferred; also provides visualizations + controls)
- **Claude.app** (secondary; lets `peekaboo` reuse Claude Desktops granted permissions)
- **Clawdbot.app** (secondary; “thin host” only)
- **Bridge clients** (trigger single actions):
- `peekaboo …` (preferred; humans + agents)
- Optional: Clawdbot/Node shells out to `peekaboo` when it needs UI automation/capture
### Host discovery (client-side)
Order is deliberate:
1. Peekaboo.app host (full UX)
2. Claude.app host (piggyback on Claude Desktop permissions)
3. Clawdbot.app host (piggyback on Clawdbot permissions)
Socket paths (convention; exact paths must match Peekaboo):
- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
- Claude: `~/Library/Application Support/Claude/bridge.sock`
- Clawdbot: `~/Library/Application Support/clawdbot/bridge.sock`
No auto-launch: if a host isnt reachable, the command fails with a clear error (start Peekaboo.app, Claude.app, or Clawdbot.app).
Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`.
### Protocol shape
- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close.
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
- **Errors**: human-readable string by default; structured envelope in `--json`.
## Dependency strategy (submodule)
Integrate Peekaboo via git submodule (nested submodules are OK).
Path in Clawdbot repo:
- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps dont churn).
What Clawdbot should use:
- **Client side**: `PeekabooBridge` (socket client + protocol models).
- **Host side (Clawdbot.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations.
What Clawdbot should *not* embed:
- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
- **XPC**: dont reintroduce helper targets; use the bridge.
## IPC / CLI surface
### No `clawdbot ui …`
We avoid a parallel “Clawdbot UI automation CLI”. Instead:
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
- Clawdbot.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdbot permissions when Peekaboo.app isnt running.
### Diagnostics
Use Peekaboos built-in diagnostics to see which host would be used:
- `peekaboo bridge status`
- `peekaboo bridge status --verbose`
- `peekaboo bridge status --json`
### Output format
Peekaboo commands default to human text output. Add `--json` for a structured envelope.
### Timeouts
Default timeout for UI actions: **10 seconds** end-to-end (client enforced; host should also enforce per-operation).
## Coordinate model (multi-display)
Requirement: coordinates are **per screen**, not global.
Standardize for the CLI (agent-friendly): **top-left origin per screen**.
Proposed request shape:
- Requests accept `screenIndex` + `{x, y}` in that screens local coordinate space.
- Clawdbot.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
- Responses should echo both:
- The resolved `screenIndex`
- The local `{x, y}` and bounds
- Optionally the global `{x, y}` for debugging
Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
## Targeting (per app/window)
Expose window/app targeting in the UI surface (align with Peekaboo targeting):
- frontmost
- by app name / bundle id
- by window title substring
- by (app, index)
Peekaboo CLI targeting (agent-friendly):
- `--bundle-id <id>` for app targeting
- `--window-index <n>` (0-based) for disambiguating within an app when capturing
All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
## “See” + click packs (Playwright-style)
Behavior stays aligned with Peekaboo:
- `peekaboo see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
- Follow-up actions reference those IDs without re-scanning.
`peekaboo see` should:
- capture (optionally targeted) window/screen
- return a screenshot **file path** (default: temp directory)
- return a list of elements (text or JSON)
Snapshot lifecycle requirement:
- Host apps are long-lived, so snapshot state should be **in-memory by default**.
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
Practical flow (agent-friendly):
- `peekaboo list apps` / `peekaboo list windows` provide bundle-id context for targeting.
- `peekaboo see --bundle-id X` updates the implicit snapshot for `X`.
- `peekaboo click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.
## Visualizer integration
Keep visualizations in **Peekaboo.app** for now.
- Clawdbot hosts the bridge, but does not render overlays.
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.
## Screenshots (legacy → Peekaboo takeover)
Clawdbot should not grow a separate screenshot CLI surface.
Migration plan:
- Use `peekaboo capture …` / `peekaboo see …` (returns a file path, default temp directory).
- Once Clawdbot legacy screenshot plumbing is replaced, remove it cleanly (no aliases).
## Permissions behavior
If required permissions are missing:
- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
- do not try to open System Settings from the automation endpoint
## Security (socket auth)
Both hosts must enforce:
- filesystem perms on the socket path (owner read/write only)
- server-side caller validation:
- require the callers code signature TeamID to be `Y5PE65HELJ`
- optional bundle-id allowlist for tighter scoping
Debug-only escape hatch (development convenience):
- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
## Next integration steps (after this doc)
1. Add Peekaboo as a git submodule (nested submodules OK).
2. Host `PeekabooBridgeHost` inside Clawdbot.app behind a single setting (“Enable Peekaboo Bridge”, default on).
3. Ensure Clawdbot hosts the bridge at `~/Library/Application Support/clawdbot/bridge.sock` and speaks the PeekabooBridge JSON protocol.
4. Validate with `peekaboo bridge status --verbose` that Peekaboo can select Clawdbot as the fallback host (no auto-launch).
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).

View File

@@ -0,0 +1,40 @@
---
summary: "macOS permission persistence (TCC) and signing requirements"
read_when:
- Debugging missing or stuck macOS permission prompts
- Packaging or signing the macOS app
- Changing bundle IDs or app install paths
---
# macOS permissions (TCC)
macOS permission grants are fragile. TCC associates a permission grant with the
app's code signature, bundle identifier, and on-disk path. If any of those change,
macOS treats the app as new and may drop or hide prompts.
## Requirements for stable permissions
- Same path: run the app from a fixed location (for Clawdbot, `dist/Clawdbot.app`).
- Same bundle identifier: changing the bundle ID creates a new permission identity.
- Signed app: unsigned or ad-hoc signed builds do not persist permissions.
- Consistent signature: use a real Apple Development or Developer ID certificate
so the signature stays stable across rebuilds.
Ad-hoc signatures generate a new identity every build. macOS will forget previous
grants, and prompts can disappear entirely until the stale entries are cleared.
## Recovery checklist when prompts disappear
1. Quit the app.
2. Remove the app entry in System Settings -> Privacy & Security.
3. Relaunch the app from the same path and re-grant permissions.
4. If the prompt still does not appear, reset TCC entries with `tccutil` and try again.
5. Some permissions only reappear after a full macOS restart.
Example resets (replace bundle ID as needed):
```bash
sudo tccutil reset Accessibility com.clawdbot.mac
sudo tccutil reset ScreenCapture com.clawdbot.mac
sudo tccutil reset AppleEvents
```
If you are testing permissions, always sign with a real certificate. Ad-hoc
builds are only acceptable for quick local runs where permissions do not matter.

View File

@@ -0,0 +1,76 @@
---
summary: "Clawdbot macOS release checklist (Sparkle feed, packaging, signing)"
read_when:
- Cutting or validating a Clawdbot macOS release
- Updating the Sparkle appcast or feed assets
---
# Clawdbot macOS release (Sparkle)
This app now ships Sparkle auto-updates. Release builds must be Developer IDsigned, zipped, and published with a signed appcast entry.
## Prereqs
- Developer ID Application cert installed (`Developer ID Application: Peter Steinberger (Y5PE65HELJ)` is expected).
- Sparkle private key path set in the environment as `SPARKLE_PRIVATE_KEY_FILE`; key lives in `/Users/steipete/Library/CloudStorage/Dropbox/Backup/Sparkle` (same key as Trimmy; public key baked into Info.plist).
- Notary credentials (keychain profile or API key) for `xcrun notarytool` if you want Gatekeeper-safe DMG/zip distribution.
- We use a Keychain profile named `clawdbot-notary`, created from App Store Connect API key env vars in your shell profile:
- `APP_STORE_CONNECT_API_KEY_P8`, `APP_STORE_CONNECT_KEY_ID`, `APP_STORE_CONNECT_ISSUER_ID`
- `echo "$APP_STORE_CONNECT_API_KEY_P8" | sed 's/\\n/\n/g' > /tmp/clawdbot-notary.p8`
- `xcrun notarytool store-credentials "clawdbot-notary" --key /tmp/clawdbot-notary.p8 --key-id "$APP_STORE_CONNECT_KEY_ID" --issuer "$APP_STORE_CONNECT_ISSUER_ID"`
- `pnpm` deps installed (`pnpm install --config.node-linker=hoisted`).
- Sparkle tools are fetched automatically via SwiftPM at `apps/macos/.build/artifacts/sparkle/Sparkle/bin/` (`sign_update`, `generate_appcast`, etc.).
## Build & package
Notes:
- `APP_BUILD` maps to `CFBundleVersion`/`sparkle:version`; keep it numeric + monotonic (no `-beta`), or Sparkle compares it as equal.
- Defaults to the current architecture (`$(uname -m)`). For release/universal builds, set `BUILD_ARCHS="arm64 x86_64"` (or `BUILD_ARCHS=all`).
```bash
# From repo root; set release IDs so Sparkle feed is enabled.
# APP_BUILD must be numeric + monotonic for Sparkle compare.
BUNDLE_ID=com.clawdbot.mac \
APP_VERSION=0.1.0 \
APP_BUILD="$(git rev-list --count HEAD)" \
BUILD_CONFIG=release \
SIGN_IDENTITY="Developer ID Application: Peter Steinberger (Y5PE65HELJ)" \
scripts/package-mac-app.sh
# Zip for distribution (includes resource forks for Sparkle delta support)
ditto -c -k --sequesterRsrc --keepParent dist/Clawdbot.app dist/Clawdbot-0.1.0.zip
# Optional: also build a styled DMG for humans (drag to /Applications)
scripts/create-dmg.sh dist/Clawdbot.app dist/Clawdbot-0.1.0.dmg
# Recommended: build + notarize/staple zip + DMG
# First, create a keychain profile once:
# xcrun notarytool store-credentials "clawdbot-notary" \
# --apple-id "<apple-id>" --team-id "<team-id>" --password "<app-specific-password>"
NOTARIZE=1 NOTARYTOOL_PROFILE=clawdbot-notary \
BUNDLE_ID=com.clawdbot.mac \
APP_VERSION=0.1.0 \
APP_BUILD="$(git rev-list --count HEAD)" \
BUILD_CONFIG=release \
SIGN_IDENTITY="Developer ID Application: Peter Steinberger (Y5PE65HELJ)" \
scripts/package-mac-dist.sh
# Optional: ship dSYM alongside the release
ditto -c -k --keepParent apps/macos/.build/release/Clawdbot.app.dSYM dist/Clawdbot-0.1.0.dSYM.zip
```
## Appcast entry
Use the release note generator so Sparkle renders formatted HTML notes:
```bash
SPARKLE_PRIVATE_KEY_FILE=/Users/steipete/Library/CloudStorage/Dropbox/Backup/Sparkle/ed25519-private-key scripts/make_appcast.sh dist/Clawdbot-0.1.0.zip https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml
```
Generates HTML release notes from `CHANGELOG.md` (via [`scripts/changelog-to-html.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/changelog-to-html.sh)) and embeds them in the appcast entry.
Commit the updated `appcast.xml` alongside the release assets (zip + dSYM) when publishing.
## Publish & verify
- Upload `Clawdbot-0.1.0.zip` (and `Clawdbot-0.1.0.dSYM.zip`) to the GitHub release for tag `v0.1.0`.
- Ensure the raw appcast URL matches the baked feed: `https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml`.
- Sanity checks:
- `curl -I https://raw.githubusercontent.com/clawdbot/clawdbot/main/appcast.xml` returns 200.
- `curl -I <enclosure url>` returns 200 after assets upload.
- On a previous public build, run “Check for Updates…” from the About tab and verify Sparkle installs the new build cleanly.
Definition of done: signed app + appcast are published, update flow works from an older installed version, and release assets are attached to the GitHub release.

View File

@@ -0,0 +1,57 @@
---
summary: "macOS app flow for controlling a remote Clawdbot gateway over SSH"
read_when:
- Setting up or debugging remote mac control
---
# Remote Clawdbot (macOS ⇄ remote host)
Updated: 2025-12-08
This flow lets the macOS app act as a full remote control for a Clawdbot gateway running on another host (e.g. a Mac Studio). All features—health checks, Voice Wake forwarding, and Web Chat—reuse the same remote SSH configuration from *Settings → General*.
## Modes
- **Local (this Mac)**: Everything runs on the laptop. No SSH involved.
- **Remote over SSH**: Clawdbot commands are executed on the remote host. The mac app opens an SSH connection with `-o BatchMode` plus your chosen identity/key.
## Prereqs on the remote host
1) Install Node + pnpm and build/install the Clawdbot CLI (`pnpm install && pnpm build && pnpm link --global`).
2) Ensure `clawdbot` is on PATH for non-interactive shells (symlink into `/usr/local/bin` or `/opt/homebrew/bin` if needed).
3) Open SSH with key auth. We recommend **Tailscale** IPs for stable reachability off-LAN.
## macOS app setup
1) Open *Settings → General*.
2) Under **Clawdbot runs**, pick **Remote over SSH** and set:
- **SSH target**: `user@host` (optional `:port`).
- If the gateway is on the same LAN and advertises Bonjour, pick it from the discovered list to auto-fill this field.
- **Identity file** (advanced): path to your key.
- **Project root** (advanced): remote checkout path used for commands.
- **CLI path** (advanced): optional path to a runnable `clawdbot` entrypoint/binary (auto-filled when advertised).
3) Hit **Test remote**. Success indicates the remote `clawdbot status --json` runs correctly. Failures usually mean PATH/CLI issues; exit 127 means the CLI isnt found remotely.
4) Health checks and Web Chat will now run through this SSH tunnel automatically.
## Web Chat over SSH
- Web Chat connects to the gateway over the forwarded WebSocket control port (default 18789).
- There is no separate WebChat HTTP server anymore.
## Permissions
- The remote host needs the same TCC approvals as local (Automation, Accessibility, Screen Recording, Microphone, Speech Recognition, Notifications). Run onboarding on that machine to grant them once.
- Nodes advertise their permission state via `node.list` / `node.describe` so agents know whats available.
## WhatsApp login flow (remote)
- Run `clawdbot login --verbose` **on the remote host**. Scan the QR with WhatsApp on your phone.
- Re-run login on that host if auth expires. Health check will surface link problems.
## Troubleshooting
- **exit 127 / not found**: `clawdbot` isnt on PATH for non-login shells. Add it to `/etc/paths`, your shell rc, or symlink into `/usr/local/bin`/`/opt/homebrew/bin`.
- **Health probe failed**: check SSH reachability, PATH, and that Baileys is logged in (`clawdbot status --json`).
- **Web Chat stuck**: confirm the gateway is running on the remote host and the forwarded port matches the gateway WS port; the UI requires a healthy WS connection.
- **Voice Wake**: trigger phrases are forwarded automatically in remote mode; no separate forwarder is needed.
## Notification sounds
Pick sounds per notification from scripts with `clawdbot` and `node.invoke`, e.g.:
```bash
clawdbot nodes notify --node <id> --title "Ping" --body "Remote gateway ready" --sound Glass
```
There is no global “default sound” toggle in the app anymore; callers choose a sound (or none) per request.

View File

@@ -0,0 +1,41 @@
---
summary: "Signing steps for macOS debug builds generated by packaging scripts"
read_when:
- Building or signing mac debug builds
---
# mac signing (debug builds)
This app is usually built from [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh), which now:
- sets a stable debug bundle identifier: `com.clawdbot.mac.debug`
- writes the Info.plist with that bundle id (override via `BUNDLE_ID=...`)
- calls [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) to sign the main binary, bundled CLI, and app bundle so macOS treats each rebuild as the same signed bundle and keeps TCC permissions (notifications, accessibility, screen recording, mic, speech). For stable permissions, use a real signing identity; ad-hoc is opt-in and fragile (see [`docs/mac/permissions.md`](/mac/permissions)).
- uses `CODESIGN_TIMESTAMP=auto` by default; it enables trusted timestamps for Developer ID signatures. Set `CODESIGN_TIMESTAMP=off` to skip timestamping (offline debug builds).
- inject build metadata into Info.plist: `ClawdbotBuildTimestamp` (UTC) and `ClawdbotGitCommit` (short hash) so the About pane can show build, git, and debug/release channel.
- **Packaging requires Bun**: The embedded gateway relay is compiled using `bun`. Ensure it is installed (`curl -fsSL https://bun.sh/install | bash`).
- reads `SIGN_IDENTITY` from the environment. Add `export SIGN_IDENTITY="Apple Development: Your Name (TEAMID)"` (or your Developer ID Application cert) to your shell rc to always sign with your cert. Ad-hoc signing requires explicit opt-in via `ALLOW_ADHOC_SIGNING=1` or `SIGN_IDENTITY="-"` (not recommended for permission testing).
## Usage
```bash
# from repo root
scripts/package-mac-app.sh # auto-selects identity; errors if none found
SIGN_IDENTITY="Developer ID Application: Your Name" scripts/package-mac-app.sh # real cert
ALLOW_ADHOC_SIGNING=1 scripts/package-mac-app.sh # ad-hoc (permissions will not stick)
SIGN_IDENTITY="-" scripts/package-mac-app.sh # explicit ad-hoc (same caveat)
```
### Ad-hoc Signing Note
When signing with `SIGN_IDENTITY="-"` (ad-hoc), the script automatically disables the **Hardened Runtime** (`--options runtime`). This is necessary to prevent crashes when the app attempts to load embedded frameworks (like Sparkle) that do not share the same Team ID. Ad-hoc signatures also break TCC permission persistence; see [`docs/mac/permissions.md`](/mac/permissions) for recovery steps.
## Build metadata for About
`package-mac-app.sh` stamps the bundle with:
- `ClawdbotBuildTimestamp`: ISO8601 UTC at package time
- `ClawdbotGitCommit`: short git hash (or `unknown` if unavailable)
The About tab reads these keys to show version, build date, git commit, and whether its a debug build (via `#if DEBUG`). Run the packager to refresh these values after code changes.
## Why
TCC permissions are tied to the bundle identifier *and* code signature. Unsigned debug builds with changing UUIDs were causing macOS to forget grants after each rebuild. Signing the binaries (adhoc by default) and keeping a fixed bundle id/path (`dist/Clawdbot.app`) preserves the grants between builds, matching the VibeTunnel approach.

View File

@@ -0,0 +1,27 @@
---
summary: "macOS Skills settings UI and gateway-backed status"
read_when:
- Updating the macOS Skills settings UI
- Changing skills gating or install behavior
---
# Skills (macOS)
The macOS app surfaces Clawdbot skills via the gateway; it does not parse skills locally.
## Data source
- `skills.status` (gateway) returns all skills plus eligibility and missing requirements
(including allowlist blocks for bundled skills).
- Requirements are derived from `metadata.clawdbot.requires` in each `SKILL.md`.
## Install actions
- `metadata.clawdbot.install` defines install options (brew/node/go/uv).
- The app calls `skills.install` to run installers on the gateway host.
- The gateway surfaces only one preferred installer when multiple are provided
(brew when available, otherwise node manager from `skills.install`, default npm).
## Env/API keys
- The app stores keys in `~/.clawdbot/clawdbot.json` under `skills.entries.<skillKey>`.
- `skills.update` patches `enabled`, `apiKey`, and `env`.
## Remote mode
- Install + config updates happen on the gateway host (not the local Mac).

View File

@@ -0,0 +1,52 @@
---
summary: "Voice overlay lifecycle when wake-word and push-to-talk overlap"
read_when:
- Adjusting voice overlay behavior
---
## Voice Overlay Lifecycle (macOS)
Audience: macOS app contributors. Goal: keep the voice overlay predictable when wake-word and push-to-talk overlap.
### Current intent
- If the overlay is already visible from wake-word and the user presses the hotkey, the hotkey session *adopts* the existing text instead of resetting it. The overlay stays up while the hotkey is held. When the user releases: send if there is trimmed text, otherwise dismiss.
- Wake-word alone still auto-sends on silence; push-to-talk sends immediately on release.
### Implemented (Dec 9, 2025)
- Overlay sessions now carry a token per capture (wake-word or push-to-talk). Partial/final/send/dismiss/level updates are dropped when the token doesnt match, avoiding stale callbacks.
- Push-to-talk adopts any visible overlay text as a prefix (so pressing the hotkey while the wake overlay is up keeps the text and appends new speech). It waits up to 1.5s for a final transcript before falling back to the current text.
- Chime/overlay logging is emitted at `info` in categories `voicewake.overlay`, `voicewake.ptt`, and `voicewake.chime` (session start, partial, final, send, dismiss, chime reason).
### Next steps
1. **VoiceSessionCoordinator (actor)**
- Owns exactly one `VoiceSession` at a time.
- API (token-based): `beginWakeCapture`, `beginPushToTalk`, `updatePartial`, `endCapture`, `cancel`, `applyCooldown`.
- Drops callbacks that carry stale tokens (prevents old recognizers from reopening the overlay).
2. **VoiceSession (model)**
- Fields: `token`, `source` (wakeWord|pushToTalk), committed/volatile text, chime flags, timers (auto-send, idle), `overlayMode` (display|editing|sending), cooldown deadline.
3. **Overlay binding**
- `VoiceSessionPublisher` (`ObservableObject`) mirrors the active session into SwiftUI.
- `VoiceWakeOverlayView` renders only via the publisher; it never mutates global singletons directly.
- Overlay user actions (`sendNow`, `dismiss`, `edit`) call back into the coordinator with the session token.
4. **Unified send path**
- On `endCapture`: if trimmed text is empty → dismiss; else `performSend(session:)` (plays send chime once, forwards, dismisses).
- Push-to-talk: no delay; wake-word: optional delay for auto-send.
- Apply a short cooldown to the wake runtime after push-to-talk finishes so wake-word doesnt immediately retrigger.
5. **Logging**
- Coordinator emits `.info` logs in subsystem `com.clawdbot`, categories `voicewake.overlay` and `voicewake.chime`.
- Key events: `session_started`, `adopted_by_push_to_talk`, `partial`, `finalized`, `send`, `dismiss`, `cancel`, `cooldown`.
### Debugging checklist
- Stream logs while reproducing a sticky overlay:
```bash
sudo log stream --predicate 'subsystem == "com.clawdbot" AND category CONTAINS "voicewake"' --level info --style compact
```
- Verify only one active session token; stale callbacks should be dropped by the coordinator.
- Ensure push-to-talk release always calls `endCapture` with the active token; if text is empty, expect `dismiss` without chime or send.
### Migration steps (suggested)
1. Add `VoiceSessionCoordinator`, `VoiceSession`, and `VoiceSessionPublisher`.
2. Refactor `VoiceWakeRuntime` to create/update/end sessions instead of touching `VoiceWakeOverlayController` directly.
3. Refactor `VoicePushToTalk` to adopt existing sessions and call `endCapture` on release; apply runtime cooldown.
4. Wire `VoiceWakeOverlayController` to the publisher; remove direct calls from runtime/PTT.
5. Add integration tests for session adoption, cooldown, and empty-text dismissal.

View File

@@ -0,0 +1,56 @@
---
summary: "Voice wake and push-to-talk modes plus routing details in the mac app"
read_when:
- Working on voice wake or PTT pathways
---
# Voice Wake & Push-to-Talk
Updated: 2025-12-23 · Owners: mac app
## Modes
- **Wake-word mode** (default): always-on Speech recognizer waits for trigger tokens (`swabbleTriggerWords`). On match it starts capture, shows the overlay with partial text, and auto-sends after silence.
- **Push-to-talk (Right Option hold)**: hold the right Option key to capture immediately—no trigger needed. The overlay appears while held; releasing finalizes and forwards after a short delay so you can tweak text.
## Runtime behavior (wake-word)
- Speech recognizer lives in `VoiceWakeRuntime`.
- Trigger only fires when theres a **meaningful pause** between the wake word and the next word (~0.45s gap).
- Silence windows: 2.0s when speech is flowing, 5.0s if only the trigger was heard.
- Hard stop: 120s to prevent runaway sessions.
- Debounce between sessions: 350ms.
- Overlay is driven via `VoiceWakeOverlayController` with committed/volatile coloring.
- After send, recognizer restarts cleanly to listen for the next trigger.
## Lifecycle invariants
- If Voice Wake is enabled and permissions are granted, the wake-word recognizer should be listening (except during an explicit push-to-talk capture).
- Overlay visibility (including manual dismiss via the X button) must never prevent the recognizer from resuming.
## Sticky overlay failure mode (previous)
Previously, if the overlay got stuck visible and you manually closed it, Voice Wake could appear “dead” because the runtimes restart attempt could be blocked by overlay visibility and no subsequent restart was scheduled.
Hardening:
- Wake runtime restart is no longer blocked by overlay visibility.
- Overlay dismiss completion triggers a `VoiceWakeRuntime.refresh(...)` via `VoiceSessionCoordinator`, so manual X-dismiss always resumes listening.
## Push-to-talk specifics
- Hotkey detection uses a global `.flagsChanged` monitor for **right Option** (`keyCode 61` + `.option`). We only observe events (no swallowing).
- Capture pipeline lives in `VoicePushToTalk`: starts Speech immediately, streams partials to the overlay, and calls `VoiceWakeForwarder` on release.
- When push-to-talk starts we pause the wake-word runtime to avoid dueling audio taps; it restarts automatically after release.
- Permissions: requires Microphone + Speech; seeing events needs Accessibility/Input Monitoring approval.
- External keyboards: some may not expose right Option as expected—offer a fallback shortcut if users report misses.
## User-facing settings
- **Voice Wake** toggle: enables wake-word runtime.
- **Hold Cmd+Fn to talk**: enables the push-to-talk monitor. Disabled on macOS < 26.
- Language & mic pickers, live level meter, trigger-word table, tester.
- **Sounds**: chimes on trigger detect and on send; defaults to the macOS “Glass” system sound. You can pick any `NSSound`-loadable file (e.g. MP3/WAV/AIFF) for each event or choose **No Sound**.
## Forwarding behavior
- When Voice Wake is enabled, transcripts are forwarded to the active gateway/agent (the same local vs remote mode used by the rest of the mac app).
- Replies are delivered to the **last-used main provider** (WhatsApp/Telegram/Discord/WebChat). If delivery fails, the error is logged and the run is still visible via WebChat/session logs.
## Forwarding payload
- `VoiceWakeForwarder.prefixedTranscript(_:)` prepends the machine hint before sending. Shared between wake-word and push-to-talk paths.
## Quick verification
- Toggle push-to-talk on, hold Cmd+Fn, speak, release: overlay should show partials then send.
- While holding, menu-bar ears should stay enlarged (uses `triggerVoiceEars(ttl:nil)`); they drop after release.

View File

@@ -0,0 +1,27 @@
---
summary: "How the mac app embeds the gateway WebChat and how to debug it"
read_when:
- Debugging mac WebChat view or loopback port
---
# Web Chat (macOS app)
The macOS menu bar app shows the WebChat UI as a native SwiftUI view and reuses the **primary Clawd session** (`main`, or `global` when scope is global).
- **Local mode**: connects directly to the local Gateway WebSocket.
- **Remote mode**: forwards the Gateway WebSocket control port over SSH and uses that as the data plane.
## Launch & debugging
- Manual: Lobster menu → “Open Chat”.
- Auto-open for testing: run `dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat` (or pass `--webchat` to the binary launched by launchd). The window opens on startup.
- Logs: see [`./scripts/clawlog.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/clawlog.sh) (subsystem `com.clawdbot`, category `WebChatSwiftUI`).
## How its wired
- Implementation: [`apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift) hosts `ClawdbotChatUI` and speaks to the Gateway over `GatewayConnection`.
- Data plane: Gateway WebSocket methods `chat.history`, `chat.send`, `chat.abort`; events `chat`, `agent`, `presence`, `tick`, `health`.
- Session: usually primary (`main`); multiple transports (WhatsApp/Telegram/Discord/Desktop) share the same key. The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
## Security / surface area
- Remote mode forwards only the Gateway WebSocket control port over SSH.
## Known limitations
- The UI is optimized for the primary session and typical “chat” usage (not a full browser-based sandbox surface).

40
docs/platforms/mac/xpc.md Normal file
View File

@@ -0,0 +1,40 @@
---
summary: "macOS IPC architecture for Clawdbot app, gateway node bridge, and PeekabooBridge"
read_when:
- Editing IPC contracts or menu bar app IPC
---
# Clawdbot macOS IPC architecture (Dec 2025)
**Current model:** there is **no local control socket** and no `clawdbot-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge.
## Goals
- Single GUI app instance that owns all TCC-facing work (notifications, screen recording, mic, speech, AppleScript).
- A small surface for automation: Gateway + node commands, plus PeekabooBridge for UI automation.
- Predictable permissions: always the same signed bundle ID, launched by launchd, so TCC grants stick.
## How it works
### Gateway + node bridge (current)
- The app runs the Gateway (local mode) and connects to it as a node.
- Agent actions are performed via `node.invoke` (e.g. `system.run`, `system.notify`, `canvas.*`).
### PeekabooBridge (UI automation)
- UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol.
- Host preference order (client-side): Peekaboo.app → Claude.app → Clawdbot.app → local execution.
- Security: bridge hosts require TeamID `Y5PE65HELJ`; DEBUG-only same-UID escape hatch is guarded by `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (Peekaboo convention).
- See: [`docs/mac/peekaboo.md`](/mac/peekaboo) for the Clawdbot plan and naming.
### Mach/XPC (future direction)
- Still optional for internal app services, but **not required** for automation now that node.invoke is the surface.
## Operational flows
- Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
- Kills existing instances
- Swift build + package
- Writes/bootstraps/kickstarts the LaunchAgent
- Single instance: app exits early if another instance with the same bundle ID is running.
## Hardening notes
- Prefer requiring a TeamID match for all privileged surfaces.
- PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
- All communication remains local-only; no network sockets are exposed.
- TCC prompts originate only from the GUI app bundle; run [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) so the signed bundle ID stays stable.

104
docs/platforms/macos.md Normal file
View File

@@ -0,0 +1,104 @@
---
summary: "Spec for the Clawdbot macOS companion menu bar app (gateway + node broker)"
read_when:
- Implementing macOS app features
- Changing gateway lifecycle or node bridging on macOS
---
# Clawdbot macOS Companion (menu bar + gateway broker)
Author: steipete · Status: draft spec · Date: 2025-12-20
## Purpose
- Single macOS menu-bar app named **Clawdbot** that:
- Shows native notifications for Clawdbot/clawdbot events.
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Automation/AppleScript, Microphone, Speech Recognition).
- Runs (or connects to) the **Gateway** and exposes itself as a **node** so agents can reach macOSonly features.
- Hosts **PeekabooBridge** for UI automation (consumed by `peekaboo`; see [`docs/mac/peekaboo.md`](/mac/peekaboo)).
- Installs a single CLI (`clawdbot`) by symlinking the bundled binary.
## High-level design
- SwiftPM package in `apps/macos/` (macOS 15+, Swift 6).
- Targets:
- `ClawdbotIPC` (shared Codable types + helpers for appinternal actions).
- `Clawdbot` (LSUIElement MenuBarExtra app; hosts Gateway + node bridge + PeekabooBridgeHost).
- Bundle ID: `com.clawdbot.mac`.
- Bundled runtime binaries live under `Contents/Resources/Relay/`:
- `clawdbot` (buncompiled relay: CLI + gateway-daemon)
- The app symlinks `clawdbot` into `/usr/local/bin` and `/opt/homebrew/bin`.
## Gateway + node bridge
- The mac app runs the Gateway in **local** mode (unless configured remote).
- The gateway port is configurable via `gateway.port` or `CLAWDBOT_GATEWAY_PORT` (default 18789). The mac app reads that value for launchd, probes, and remote SSH tunnels.
- The mac app connects to the bridge as a **node** and advertises capabilities/commands.
- Agentfacing actions are exposed via `node.invoke` (no local control socket).
- The mac app watches `~/.clawdbot/clawdbot.json` and switches modes live when `gateway.mode` or `gateway.remote.url` changes.
- If `gateway.mode` is unset but `gateway.remote.url` is set, the mac app treats it as remote mode.
- Changing connection mode in the mac app writes `gateway.mode` (and `gateway.remote.url` in remote mode) back to the config file.
### Node commands (mac)
- Canvas: `canvas.present|navigate|eval|snapshot|a2ui.*`
- Camera: `camera.snap|camera.clip`
- Screen: `screen.record`
- System: `system.run` (shell) and `system.notify`
### Permission advertising
- Nodes include a `permissions` map in hello/pairing.
- The Gateway surfaces it via `node.list` / `node.describe` so agents can decide what to run.
## CLI (`clawdbot`)
- The **only** CLI is `clawdbot` (TS/bun). There is no `clawdbot-mac` helper.
- For macspecific actions, the CLI uses `node.invoke`:
- `clawdbot canvas present|navigate|eval|snapshot|a2ui push|a2ui reset`
- `clawdbot nodes run --node <id> -- <command...>`
- `clawdbot nodes notify --node <id> --title ...`
## Onboarding
- Install CLI (symlink) → Permissions checklist → Test notification → Done.
- Remote mode skips local gateway/CLI steps.
- Selecting Local auto-enables the bundled Gateway via launchd (unless “Attach only” debug mode is enabled).
## Deep links (URL scheme)
Clawdbot (the macOS app) registers a URL scheme for triggering local actions from anywhere (browser, Shortcuts, CLI, etc.).
Scheme:
- `clawdbot://…`
### `clawdbot://agent`
Triggers a Gateway `agent` request (same machinery as WebChat/agent runs).
Example:
```bash
open 'clawdbot://agent?message=Hello%20from%20deep%20link'
```
Query parameters:
- `message` (required): the agent prompt (URL-encoded).
- `sessionKey` (optional): explicit session key to use.
- `thinking` (optional): thinking hint (e.g. `low`; omit for default).
- `deliver` (optional): `true|false` (default: false).
- `to` / `provider` (optional): forwarded to the Gateway `agent` method (only meaningful with `deliver=true`).
- `timeoutSeconds` (optional): timeout hint forwarded to the Gateway.
- `key` (optional): unattended mode key (see below).
Safety/guardrails:
- Always enabled.
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
- With `key=<value>`, Clawdbot runs without prompting (intended for personal automations).
- The current key is shown in Debug Settings and stored locally in UserDefaults.
Notes:
- In local mode, Clawdbot will start the local Gateway if needed before issuing the request.
- In remote mode, Clawdbot will use the configured remote tunnel/endpoint.
## Build & dev workflow (native)
- `cd apps/macos && swift build` (debug) / `swift build -c release`.
- Run app for dev: `swift run Clawdbot` (or Xcode scheme).
- Package app + CLI: [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) (builds bun CLI + gateway).
- Tests: add Swift Testing suites under `apps/macos/Tests`.
## Open questions / decisions
- Should `system.run` support streaming stdout/stderr or keep buffered responses only?
- Should we allow nodeside permission prompts, or always require explicit app UI action?

11
docs/platforms/windows.md Normal file
View File

@@ -0,0 +1,11 @@
---
summary: "Windows app status + contribution call"
read_when:
- Looking for Windows companion app status
- Planning platform coverage or contributions
---
# Windows App
Clawdbot core is fully supported on Windows. The core is written in TypeScript, so it runs anywhere Node runs.
We do not have a Windows companion app yet. It is planned, and we would love contributions to make it happen.