docs: refresh and simplify docs
@@ -1,381 +1,105 @@
|
||||
---
|
||||
summary: "iOS app (node): architecture + connection runbook"
|
||||
summary: "iOS node app: connect to the Gateway, pairing, canvas, and troubleshooting"
|
||||
read_when:
|
||||
- Pairing or reconnecting the iOS node
|
||||
- Debugging iOS bridge discovery or auth
|
||||
- Sending screen/canvas commands to iOS
|
||||
- Designing iOS node + gateway integration
|
||||
- Extending the Gateway protocol for node/canvas commands
|
||||
- Implementing Bonjour pairing or transport security
|
||||
- Running the iOS app from source
|
||||
- Debugging bridge discovery or canvas commands
|
||||
---
|
||||
# iOS App (Node)
|
||||
|
||||
Status: prototype implemented (internal) · Date: 2025-12-13
|
||||
Availability: internal preview. The iOS app is not publicly distributed yet.
|
||||
|
||||
## Support snapshot
|
||||
- Role: companion node app (iOS does not host the Gateway).
|
||||
- Gateway required: yes (run it on macOS, Linux, or Windows via WSL2).
|
||||
- Install: [Getting Started](/start/getting-started) + [Pairing](/gateway/pairing).
|
||||
- Gateway: [Runbook](/gateway) + [Configuration](/gateway/configuration).
|
||||
## What it does
|
||||
|
||||
## System control
|
||||
System control (launchd/systemd) lives on the Gateway host. See [Gateway](/gateway).
|
||||
- Connects to a Gateway over the bridge (LAN or tailnet).
|
||||
- Exposes node capabilities: Canvas, Screen snapshot, Camera capture, Location, Talk mode, Voice wake.
|
||||
- Receives `node.invoke` commands and reports node status events.
|
||||
|
||||
## Connection Runbook
|
||||
## Requirements
|
||||
|
||||
This is the practical “how do I connect the iOS node” guide:
|
||||
- Gateway running on another device (macOS, Linux, or Windows via WSL2).
|
||||
- Bridge enabled (default).
|
||||
- Network path:
|
||||
- Same LAN via Bonjour, **or**
|
||||
- Tailnet via unicast DNS-SD (`clawdbot.internal.`), **or**
|
||||
- Manual host/port (fallback).
|
||||
|
||||
**iOS app** ⇄ (Bonjour + TCP bridge) ⇄ **Gateway bridge** ⇄ (loopback WS) ⇄ **Gateway**
|
||||
## Quick start (pair + connect)
|
||||
|
||||
The Gateway WebSocket stays loopback-only (`ws://127.0.0.1:18789`). The iOS node talks to the LAN-facing **bridge** (default `tcp://0.0.0.0:18790`) and uses Gateway-owned pairing.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- You can run the Gateway on the “master” machine.
|
||||
- iOS node app can reach the gateway bridge:
|
||||
- Same LAN with Bonjour/mDNS, **or**
|
||||
- Same Tailscale tailnet using Wide-Area Bonjour / unicast DNS-SD (see below), **or**
|
||||
- Manual bridge host/port (fallback)
|
||||
- You can run the CLI (`clawdbot`) on the gateway machine (or via SSH).
|
||||
|
||||
### 1) Start the Gateway (with bridge enabled)
|
||||
|
||||
Bridge is enabled by default (disable via `CLAWDBOT_BRIDGE_ENABLED=0`).
|
||||
1) Start the Gateway (bridge enabled by default):
|
||||
|
||||
```bash
clawdbot gateway --port 18789 --verbose
clawdbot gateway --port 18789
```
|
||||
|
||||
Confirm that the logs show something like:
|
||||
- `bridge listening on tcp://0.0.0.0:18790 (node)`
|
||||
2) In the iOS app, open Settings and pick a discovered gateway (or enable Manual Bridge and enter host/port).
|
||||
|
||||
For tailnet-only setups (recommended for Vienna ⇄ London), bind the bridge to the gateway machine’s Tailscale IP instead:
|
||||
|
||||
- Set `bridge.bind: "tailnet"` in `~/.clawdbot/clawdbot.json` on the gateway host.
|
||||
- Restart the Gateway / macOS menubar app.
|
||||
|
||||
### 2) Verify Bonjour discovery (optional but recommended)
|
||||
|
||||
From the gateway machine:
|
||||
|
||||
```bash
dns-sd -B _clawdbot-bridge._tcp local.
```
|
||||
|
||||
You should see your gateway advertising `_clawdbot-bridge._tcp`.
|
||||
|
||||
If browse works, but the iOS node can’t connect, try resolving one instance:
|
||||
|
||||
```bash
dns-sd -L "<instance name>" _clawdbot-bridge._tcp local.
```
|
||||
|
||||
More debugging notes: [`docs/bonjour.md`](/gateway/bonjour).
|
||||
|
||||
#### Tailnet (Vienna ⇄ London) discovery via unicast DNS-SD
|
||||
|
||||
If the iOS node and the gateway are on different networks but connected via Tailscale, multicast mDNS won’t cross the boundary. Use Wide-Area Bonjour / unicast DNS-SD instead:
|
||||
|
||||
1) Set up a DNS-SD zone (example `clawdbot.internal.`) on the gateway host and publish `_clawdbot-bridge._tcp` records.
|
||||
2) Configure Tailscale split DNS for `clawdbot.internal` pointing at that DNS server.
|
||||
|
||||
Details and example CoreDNS config: [`docs/bonjour.md`](/gateway/bonjour).
|
||||
|
||||
### 3) Connect from the iOS node app
|
||||
|
||||
In the iOS node app:
|
||||
- Pick the discovered bridge (or hit refresh).
|
||||
- If not paired yet, it will initiate pairing automatically.
|
||||
- After the first successful pairing, it will auto-reconnect **strictly to the last discovered gateway** on launch (including after reinstall), as long as the iOS Keychain entry is still present.
|
||||
|
||||
#### Connection indicator (always visible)
|
||||
|
||||
The Settings tab icon shows a small status dot:
|
||||
- **Green**: connected to the bridge
|
||||
- **Yellow**: connecting (subtle pulse)
|
||||
- **Red**: not connected / error
|
||||
|
||||
### 4) Approve pairing (CLI)
|
||||
|
||||
On the gateway machine:
|
||||
3) Approve the pairing request on the gateway host:
|
||||
|
||||
```bash
clawdbot nodes pending
```
|
||||
|
||||
Approve the request:
|
||||
|
||||
```bash
clawdbot nodes approve <requestId>
```
|
||||
|
||||
After approval, the iOS node receives/stores the token and reconnects authenticated.
|
||||
|
||||
Pairing details: [`docs/gateway/pairing.md`](/gateway/pairing).
|
||||
|
||||
### 5) Verify the node is connected
|
||||
|
||||
- In the macOS app: **Instances** tab should show something like `iOS Node (...)` with a green “Active” presence dot shortly after connect.
|
||||
- Via nodes status (paired + connected):
|
||||
```bash
clawdbot nodes status
```
|
||||
- Via Gateway (paired + connected):
|
||||
```bash
clawdbot gateway call node.list --params "{}"
```
|
||||
- Via Gateway presence (legacy, but still useful):
|
||||
```bash
clawdbot gateway call system-presence --params "{}"
```
|
||||
Look for the node `instanceId` (often a UUID).
|
||||
|
||||
### 6) Drive the iOS Canvas (draw / snapshot)
|
||||
|
||||
The iOS node runs a WKWebView “Canvas” scaffold which exposes:
|
||||
- `window.__clawdbot.canvas`
|
||||
- `window.__clawdbot.ctx` (2D context)
|
||||
- `window.__clawdbot.setStatus(title, subtitle)`
|
||||
|
||||
#### Gateway Canvas Host (recommended for web content)
|
||||
|
||||
If you want the node to show real HTML/CSS/JS that the agent can edit on disk, point it at the Gateway canvas host.
|
||||
|
||||
Note: nodes always use the standalone canvas host on `canvasHost.port` (default `18793`), bound to the bridge interface.
|
||||
|
||||
1) Create `~/clawd/canvas/index.html` on the gateway host.
|
||||
|
||||
2) Navigate the node to it (LAN):
|
||||
4) Verify connection:
|
||||
|
||||
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.navigate --params '{"url":"http://<gateway-hostname>.local:18793/__clawdbot__/canvas/"}'
clawdbot nodes status
clawdbot gateway call node.list --params "{}"
```
|
||||
|
||||
## Discovery paths
|
||||
|
||||
### Bonjour (LAN)
|
||||
|
||||
The Gateway advertises `_clawdbot-bridge._tcp` on `local.`. The iOS app lists these automatically.
|
||||
|
||||
### Tailnet (cross-network)
|
||||
|
||||
If mDNS is blocked, use a unicast DNS-SD zone (recommended domain: `clawdbot.internal.`) and Tailscale split DNS.
|
||||
See [`docs/bonjour.md`](/gateway/bonjour) for the CoreDNS example.
|
||||
|
||||
### Manual host/port
|
||||
|
||||
In Settings, enable **Manual Bridge** and enter the gateway host + port (default `18790`).
|
||||
|
||||
## Canvas + A2UI
|
||||
|
||||
The iOS node renders a WKWebView canvas. Use `node.invoke` to drive it:
|
||||
|
||||
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.navigate --params '{"url":"http://<gateway-host>:18793/__clawdbot__/canvas/"}'
```
|
||||
|
||||
Notes:
|
||||
- The server injects a live-reload client into HTML and reloads on file changes.
|
||||
- A2UI is hosted on the same canvas host at `http://<gateway-host>:18793/__clawdbot__/a2ui/`.
|
||||
- Tailnet (optional): if both devices are on Tailscale, use a MagicDNS name or tailnet IP instead of `.local`, e.g. `http://<gateway-magicdns>:18793/__clawdbot__/canvas/`.
|
||||
- iOS may require App Transport Security allowances to load plain `http://` URLs; if it fails to load, prefer HTTPS or adjust the iOS app’s ATS config.
|
||||
- The Gateway canvas host serves `/__clawdbot__/canvas/` and `/__clawdbot__/a2ui/`.
|
||||
- The iOS node auto-navigates to A2UI on connect when a canvas host URL is advertised.
|
||||
- Return to the built-in scaffold with `canvas.navigate` and `{"url":""}`.
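The canvas host URLs in these notes all follow one pattern: host, `canvasHost.port` (default `18793`), then `/__clawdbot__/canvas/` or `/__clawdbot__/a2ui/`. As a minimal sketch, the helpers below build that URL and wrap it in `canvas.navigate` params. The function names `canvas_url` and `canvas_navigate_params` are hypothetical and not part of the `clawdbot` CLI, and the host is assumed to contain no characters that need JSON escaping.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the clawdbot CLI).

# Build a canvas host URL.
# Usage: canvas_url <gateway-host> [surface] [port]
#   surface: "canvas" (default) or "a2ui"
#   port:    canvasHost.port, default 18793
canvas_url() {
  local host="$1" surface="${2:-canvas}" port="${3:-18793}"
  case "$surface" in
    canvas|a2ui) ;;
    *) echo "unknown surface: $surface" >&2; return 1 ;;
  esac
  printf 'http://%s:%s/__clawdbot__/%s/\n' "$host" "$port" "$surface"
}

# Wrap the URL in the JSON params that canvas.navigate takes.
# Assumes the host needs no JSON escaping.
canvas_navigate_params() {
  local url
  url="$(canvas_url "$@")" || return 1
  printf '{"url":"%s"}\n' "$url"
}
```

With these defined, the LAN and tailnet cases differ only in the host argument, for example `--params "$(canvas_navigate_params <gateway-magicdns>)"` in place of a hand-written JSON string.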
|
||||
|
||||
#### Draw with `canvas.eval`
|
||||
### Canvas eval / snapshot
|
||||
|
||||
```bash
clawdbot nodes invoke --node "iOS Node" --command canvas.eval --params "$(cat <<'JSON'
{"javaScript":"(() => { const {ctx,setStatus} = window.__clawdbot; setStatus('Drawing','…'); ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle='#ff2d55'; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); setStatus(null,null); return 'ok'; })()"}
JSON
)"
clawdbot nodes invoke --node "iOS Node" --command canvas.eval --params '{"javaScript":"(() => { const {ctx} = window.__clawdbot; ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle=\"#ff2d55\"; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); return \"ok\"; })()"}'
```
|
||||
|
||||
#### Snapshot with `canvas.snapshot`
|
||||
|
||||
```bash
clawdbot nodes invoke --node 192.168.0.88 --command canvas.snapshot --params '{"maxWidth":900}'
clawdbot nodes invoke --node "iOS Node" --command canvas.snapshot --params '{"maxWidth":900,"format":"jpeg"}'
```
|
||||
|
||||
The response includes `{ format, base64 }` image data (default `format="jpeg"`; pass `{"format":"png"}` when you specifically need lossless PNG).
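Since the snapshot comes back as base64 text, it has to be decoded before it can be viewed. The sketch below shows only that decoding step. `save_snapshot` is a hypothetical helper, the output file name is an arbitrary choice, and pulling `format` and `base64` out of the CLI output is left to the caller because the exact output layout is not described here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decode a snapshot's base64 payload into an image file.
# Usage: save_snapshot <format> <base64> [out-dir]
# Prints the path of the written file.
save_snapshot() {
  local format="$1" data="$2" dir="${3:-.}" ext out
  case "$format" in
    jpeg) ext="jpg" ;;
    png)  ext="png" ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac
  out="$dir/canvas-snapshot.$ext"   # file name is an assumption, pick any
  printf '%s' "$data" | base64 --decode > "$out" || return 1
  printf '%s\n' "$out"
}
```

The extension follows the `format` field, so a default JPEG snapshot and an explicit PNG one land in different files.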
|
||||
## Voice wake + talk mode
|
||||
|
||||
### Common gotchas
|
||||
- Voice wake and talk mode are available in Settings.
|
||||
- iOS may suspend background audio; treat voice features as best-effort when the app is not active.
|
||||
|
||||
- **iOS in background:** all `canvas.*` commands fail fast with `NODE_BACKGROUND_UNAVAILABLE` (bring the iOS node app to foreground).
|
||||
- **Return to default scaffold:** `canvas.navigate` with `{"url":""}` or `{"url":"/"}` returns to the built-in scaffold page.
|
||||
- **mDNS blocked:** some networks block multicast; use a different LAN or plan a tailnet-capable bridge (see [`docs/discovery.md`](/gateway/discovery)).
|
||||
- **Wrong node selector:** `--node` can be the node id (UUID), display name (e.g. `iOS Node`), IP, or an unambiguous prefix. If it’s ambiguous, the CLI will tell you.
|
||||
- **Stale pairing / Keychain cleared:** if the pairing token is missing (or iOS Keychain was wiped), the node must pair again; approve a new pending request.
|
||||
- **App reinstall but no reconnect:** the node restores `instanceId` + last bridge preference from Keychain; if it still comes up “unpaired”, verify Keychain persistence on your device/simulator and re-pair once.
|
||||
## Common errors
|
||||
|
||||
## Design + Architecture
|
||||
|
||||
### Goals
|
||||
- Build an **iOS app** that acts as a **remote node** for Clawdbot:
|
||||
- **Voice trigger** (wake-word / always-listening intent) that forwards transcripts to the Gateway `agent` method.
|
||||
- **Canvas** surface that the agent can control: navigate, draw/render, evaluate JS, snapshot.
|
||||
- **Dead-simple setup**:
|
||||
- Auto-discover the host on the local network via **Bonjour**.
|
||||
- One-tap pairing with an approval prompt on the Mac.
|
||||
- iOS is **never** a local gateway; it is always a remote node.
|
||||
- Operational clarity:
|
||||
- When iOS is backgrounded, voice may still run; **canvas commands must fail fast** with a structured error.
|
||||
- Provide **settings**: node display name, enable/disable voice wake, pairing status.
|
||||
|
||||
Non-goals (v1):
|
||||
- Exposing the Node Gateway directly on the LAN.
|
||||
- Supporting arbitrary third-party “plugins” on iOS.
|
||||
- Perfect App Store compliance; this is **internal-only** initially.
|
||||
|
||||
### Current repo reality (constraints we respect)
|
||||
- The Gateway WebSocket server binds to `127.0.0.1:18789` ([`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts)) with an optional `CLAWDBOT_GATEWAY_TOKEN`.
|
||||
- The Gateway exposes a Canvas file server (`canvasHost`) on `canvasHost.port` (default `18793`), so nodes can `canvas.navigate` to `http://<lanHost>:18793/__clawdbot__/canvas/` and auto-reload on file changes ([`docs/configuration.md`](/gateway/configuration)).
|
||||
- macOS “Canvas” is controlled via the Gateway node protocol (`canvas.*`), matching iOS/Android ([`docs/mac/canvas.md`](/platforms/mac/canvas)).
|
||||
- Voice wake forwards via `GatewayChannel` to Gateway `agent` (mac app: `VoiceWakeForwarder` → `GatewayConnection.sendAgent`).
|
||||
|
||||
### Recommended topology (B): Gateway-owned Bridge + loopback Gateway
|
||||
Keep the Node gateway loopback-only; expose a dedicated **gateway-owned bridge** to the LAN/tailnet.
|
||||
|
||||
**iOS App** ⇄ (TLS + pairing) ⇄ **Bridge (in gateway)** ⇄ (loopback) ⇄ **Gateway WS** (`ws://127.0.0.1:18789`)
|
||||
|
||||
Why:
|
||||
- Preserves current threat model: Gateway remains local-only.
|
||||
- Centralizes auth, rate limiting, and allowlisting in the bridge.
|
||||
- Lets us unify “canvas node” semantics across mac + iOS without exposing raw gateway methods.
|
||||
|
||||
### Security plan (internal, but still robust)
|
||||
#### Transport
|
||||
- **Current (v0):** bridge is a LAN-facing **TCP** listener with token-based auth after pairing.
|
||||
- **Next:** wrap the bridge in **TLS** and prefer key-pinned or mTLS-like auth after pairing.
|
||||
|
||||
#### Pairing
|
||||
- Bonjour discovery shows a candidate “Clawdbot Bridge” on the LAN.
|
||||
- First connection:
|
||||
1) iOS generates a keypair (Secure Enclave if available).
|
||||
2) iOS connects to the bridge and requests pairing.
|
||||
3) The bridge forwards the pairing request to the **Gateway** as a *pending request*.
|
||||
4) Approval can happen via:
|
||||
- **macOS UI** (Clawdbot shows an alert with Approve/Reject/Later, including the node IP), or
|
||||
- **Terminal/CLI** (headless flows).
|
||||
5) Once approved, the bridge returns a token to iOS; iOS stores it in Keychain.
|
||||
- Subsequent connections:
|
||||
- The bridge requires the paired identity. Unpaired clients get a structured “not paired” error and no access.
|
||||
|
||||
##### Gateway-owned pairing (Option B details)
|
||||
Pairing decisions must be owned by the Gateway (`clawd` / Node) so nodes can be approved without the macOS app running.
|
||||
|
||||
Key idea:
|
||||
- The Swift app may still show an alert, but it is only a **frontend** for pending requests stored in the Gateway.
|
||||
|
||||
Desired behavior:
|
||||
- If the Swift UI is present: show alert with Approve/Reject/Later.
|
||||
- If the Swift UI is not present: `clawdbot` CLI can list pending requests and approve/reject.
|
||||
|
||||
See [`docs/gateway/pairing.md`](/gateway/pairing) for the API/events and storage.
|
||||
|
||||
CLI (headless approvals):
|
||||
- `clawdbot nodes pending`
|
||||
- `clawdbot nodes approve <requestId>`
|
||||
- `clawdbot nodes reject <requestId>`
|
||||
|
||||
#### Authorization / scope control (bridge-side ACL)
|
||||
The bridge must not be a raw proxy to every gateway method.
|
||||
|
||||
- Allow by default:
|
||||
- `agent` (with guardrails; idempotency required)
|
||||
- minimal `system-event` beacons (presence updates for the node)
|
||||
- node/canvas methods defined below (new protocol surface)
|
||||
- Deny by default:
|
||||
- anything that widens control without explicit intent (future “shell”, “files”, etc.)
|
||||
- Rate limit:
|
||||
- handshake attempts
|
||||
- voice forwards per minute
|
||||
- snapshot frequency / payload size
|
||||
|
||||
### Protocol unification: add “node/canvas” to Gateway protocol
|
||||
#### Principle
|
||||
Unify mac Canvas + iOS Canvas under a single conceptual surface:
|
||||
- The agent talks to the Gateway using a stable method set (typed protocol).
|
||||
- The Gateway routes node-targeted requests to:
|
||||
- local mac Canvas implementation, or
|
||||
- remote iOS node via the bridge
|
||||
|
||||
#### Minimal protocol additions (v1)
|
||||
Add to [`src/gateway/protocol/schema.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/protocol/schema.ts) (and regenerate Swift models):
|
||||
|
||||
**Identity**
|
||||
- Node identity comes from `connect.params.client.instanceId` (stable), and `connect.params.client.mode = "node"` (or `"ios-node"`).
|
||||
|
||||
**Methods**
|
||||
- `node.list` → list paired/connected nodes + capabilities
|
||||
- `node.describe` → describe a node (capabilities + supported `node.invoke` commands)
|
||||
- `node.invoke` → send a command to a specific node
|
||||
- Params: `{ nodeId, command, params?, timeoutMs? }`
|
||||
|
||||
**Events**
|
||||
- `node.event` → async node status/errors
|
||||
- e.g. background/foreground transitions, voice availability, canvas availability
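To make the `node.invoke` params shape concrete, here is a minimal sketch that assembles `{ nodeId, command, params?, timeoutMs? }` as JSON. `node_invoke_params` is a hypothetical helper, not an existing command. It assumes the node id and command contain no characters that need JSON escaping, and that the optional `params` argument is already valid JSON.

```bash
#!/usr/bin/env bash
# Hypothetical helper: assemble node.invoke params as a JSON object.
# Usage: node_invoke_params <nodeId> <command> [params-json] [timeoutMs]
node_invoke_params() {
  local node_id="$1" command="$2" params="${3:-}" timeout_ms="${4:-}"
  local json
  json="$(printf '{"nodeId":"%s","command":"%s"' "$node_id" "$command")"
  # Optional fields are appended only when provided.
  if [ -n "$params" ]; then
    json="$json,\"params\":$params"
  fi
  if [ -n "$timeout_ms" ]; then
    json="$json,\"timeoutMs\":$timeout_ms"
  fi
  printf '%s}\n' "$json"
}
```

The result could be passed as the `--params` value of a `clawdbot gateway call` invocation, by analogy with the `node.list` call shown in this doc. Whether `node.invoke` is callable that way is an assumption.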
|
||||
|
||||
#### Node command set (canvas)
|
||||
These are values for `node.invoke.command`:
|
||||
- `canvas.present` / `canvas.hide`
|
||||
- `canvas.navigate` with `{ url }` (loads a URL; use `""` or `"/"` to return to the default scaffold)
|
||||
- `canvas.eval` with `{ javaScript }`
|
||||
- `canvas.snapshot` with `{ maxWidth?, quality?, format? }`
|
||||
- A2UI (mobile + macOS canvas):
|
||||
- `canvas.a2ui.push` with `{ messages: [...] }` (A2UI v0.8 server→client messages)
|
||||
- `canvas.a2ui.pushJSONL` with `{ jsonl: "..." }` (legacy alias)
|
||||
- `canvas.a2ui.reset`
|
||||
- A2UI is hosted by the Gateway canvas host (`/__clawdbot__/a2ui/`) on `canvasHost.port`. Commands fail if the host is unreachable.
|
||||
|
||||
Result pattern:
|
||||
- Request is a standard `req/res` with `ok` / `error`.
|
||||
- Long operations (loads, streaming drawing, etc.) may also emit `node.event` progress.
|
||||
|
||||
##### Current (implemented)
|
||||
As of 2025-12-13, the Gateway supports `node.invoke` for bridge-connected nodes.
|
||||
|
||||
Example: draw a diagonal line on the iOS Canvas:
|
||||
```bash
clawdbot nodes invoke --node ios-node --command canvas.eval --params '{"javaScript":"(() => { const {ctx} = window.__clawdbot; ctx.clearRect(0,0,innerWidth,innerHeight); ctx.lineWidth=6; ctx.strokeStyle=\"#ff2d55\"; ctx.beginPath(); ctx.moveTo(40,40); ctx.lineTo(innerWidth-40, innerHeight-40); ctx.stroke(); return \"ok\"; })()"}'
```
|
||||
|
||||
### Background behavior requirement
|
||||
When iOS is backgrounded:
|
||||
- Voice may still be active (subject to iOS suspension).
|
||||
- **All `canvas.*` commands must fail** with a stable error code, e.g.:
|
||||
- `NODE_BACKGROUND_UNAVAILABLE`
|
||||
- Include `retryable: true` and `retryAfterMs` if we want the agent to wait.
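On the caller side, a retryable error code means a script can wait and try again instead of failing outright. The wrapper below is a rough sketch of that idea. `retry_while_backgrounded` is a hypothetical name, and it only checks whether the command's output contains `NODE_BACKGROUND_UNAVAILABLE`. It uses a fixed delay and does not parse `retryAfterMs`, since the CLI's error output format is not specified here.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: rerun a command while its output reports the
# background error code.
# Usage: retry_while_backgrounded <attempts> <delay-seconds> <command...>
retry_while_backgrounded() {
  local attempts="$1" delay="$2" output status i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    # Capture stdout+stderr and the exit status of this attempt.
    output="$("$@" 2>&1)" && status=0 || status=$?
    if [[ "$output" != *NODE_BACKGROUND_UNAVAILABLE* ]]; then
      # Success or an unrelated error: pass it through unchanged.
      printf '%s\n' "$output"
      return "$status"
    fi
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  # Still backgrounded after the last attempt.
  printf '%s\n' "$output" >&2
  return 1
}
```

For example, `retry_while_backgrounded 5 2 clawdbot nodes invoke --node "iOS Node" --command canvas.snapshot --params '{"maxWidth":900}'` would give an operator about ten seconds to bring the app to the foreground.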
|
||||
|
||||
## iOS app architecture (SwiftUI)
|
||||
### App structure
|
||||
- Single fullscreen Canvas surface (WKWebView).
|
||||
- One settings entry point: a **gear button** that opens a settings sheet.
|
||||
- All navigation is **agent-driven** (no local URL bar).
|
||||
|
||||
### Components
|
||||
- `BridgeDiscovery`: Bonjour browse + resolve (Network.framework `NWBrowser`)
|
||||
- `BridgeConnection`: TCP session + pairing handshake + reconnect (TLS planned)
|
||||
- `NodeRuntime`:
|
||||
- Voice pipeline (wake-word + capture + forward)
|
||||
- Canvas pipeline (WKWebView controller + snapshot + eval)
|
||||
- Background state tracking; enforces “canvas unavailable in background”
|
||||
|
||||
### Voice in background (internal)
|
||||
- Enable background audio mode (and required session configuration) so the mic pipeline can keep running when the user switches apps.
|
||||
- If iOS suspends the app anyway, surface a clear node status (`node.event`) so operators can see voice is unavailable.
|
||||
|
||||
## Code sharing (macOS + iOS)
|
||||
Create/expand SwiftPM targets so both apps share:
|
||||
- `ClawdbotProtocol` (generated models; platform-neutral)
|
||||
- `ClawdbotGatewayClient` (shared WS framing + connect/req/res + seq-gap handling)
|
||||
- `ClawdbotKit` (node/canvas command types + deep links + shared utilities)
|
||||
|
||||
macOS continues to own:
|
||||
- local Canvas implementation details (custom scheme handler serving on-disk HTML, window/panel presentation)
|
||||
|
||||
iOS owns:
|
||||
- iOS-specific audio/speech + WKWebView presentation and lifecycle
|
||||
|
||||
## Repo layout
|
||||
- iOS app: `apps/ios/` (XcodeGen `project.yml`)
|
||||
- Shared Swift packages: `apps/shared/`
|
||||
- Lint/format: iOS target runs `swiftformat --lint` + `swiftlint lint` using repo configs (`.swiftformat`, `.swiftlint.yml`).
|
||||
|
||||
Generate the Xcode project:
|
||||
```bash
cd apps/ios
xcodegen generate
open Clawdbot.xcodeproj
```
|
||||
|
||||
## Storage plan (private by default)
|
||||
### iOS
|
||||
- Canvas/workspace files (persistent, private):
|
||||
- `Application Support/Clawdbot/canvas/<sessionKey>/...`
|
||||
- Snapshots / temp exports (evictable):
|
||||
- `Library/Caches/Clawdbot/canvas-snapshots/<sessionKey>/...`
|
||||
- Credentials:
|
||||
- Keychain (paired identity + bridge trust anchor)
|
||||
- `NODE_BACKGROUND_UNAVAILABLE`: bring the iOS app to the foreground (canvas/camera/screen commands require it).
|
||||
- `A2UI_HOST_NOT_CONFIGURED`: the Gateway did not advertise a canvas host URL; check `canvasHost` in [`docs/configuration.md`](/gateway/configuration).
|
||||
- Pairing prompt never appears: run `clawdbot nodes pending` and approve manually.
|
||||
- Reconnect fails after reinstall: the Keychain pairing token was cleared; re-pair the node.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [`docs/gateway.md`](/gateway) (gateway runbook)
|
||||
- [`docs/gateway/pairing.md`](/gateway/pairing) (approval + storage)
|
||||
- [`docs/bonjour.md`](/gateway/bonjour) (discovery debugging)
|
||||
- [`docs/discovery.md`](/gateway/discovery) (LAN vs tailnet vs SSH)
|
||||
- [Pairing](/gateway/pairing)
|
||||
- [Discovery](/gateway/discovery)
|
||||
- [Bonjour](/gateway/bonjour)
|
||||
|
||||
@@ -15,7 +15,7 @@ Goal: ship **Clawdbot.app** with a self-contained relay binary that can run both
|
||||
App bundle layout:
|
||||
|
||||
- `Clawdbot.app/Contents/Resources/Relay/clawdbot`
|
||||
- bun `--compile` relay executable built from [`dist/macos/relay.js`](https://github.com/clawdbot/clawdbot/blob/main/dist/macos/relay.js)
|
||||
- bun `--compile` relay executable built from `dist/macos/relay.js`
|
||||
- Supports:
|
||||
- `clawdbot …` (CLI)
|
||||
- `clawdbot gateway …` (LaunchAgent daemon)
|
||||
@@ -47,7 +47,7 @@ Important bundler flags:
|
||||
|
||||
Version injection:
|
||||
- `--define "__CLAWDBOT_VERSION__=\"<pkg version>\""`
|
||||
- [`src/version.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/version.ts) also supports `__CLAWDBOT_VERSION__` (and `CLAWDBOT_BUNDLED_VERSION`) so `--version` doesn’t depend on reading `package.json` at runtime.
|
||||
- The relay honors `__CLAWDBOT_VERSION__` / `CLAWDBOT_BUNDLED_VERSION` so `--version` doesn’t depend on reading `package.json` at runtime.
|
||||
|
||||
## Launchd (Gateway as LaunchAgent)
|
||||
|
||||
@@ -58,7 +58,7 @@ Plist location (per-user):
|
||||
- `~/Library/LaunchAgents/com.clawdbot.gateway.plist`
|
||||
|
||||
Manager:
|
||||
- [`apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/GatewayLaunchAgentManager.swift)
|
||||
- The macOS app owns LaunchAgent install/update for the bundled gateway.
|
||||
|
||||
Behavior:
|
||||
- “Clawdbot Active” enables/disables the LaunchAgent.
|
||||
@@ -79,7 +79,7 @@ Symptom (when mis-signed):
|
||||
|
||||
Fix:
|
||||
- The bun executable needs JIT-related entitlements under the hardened runtime.
|
||||
- [`scripts/codesign-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/codesign-mac-app.sh) signs `Relay/clawdbot` with:
|
||||
- `scripts/codesign-mac-app.sh` signs `Relay/clawdbot` with:
|
||||
- `com.apple.security.cs.allow-jit`
|
||||
- `com.apple.security.cs.allow-unsigned-executable-memory`
|
||||
|
||||
@@ -89,18 +89,14 @@ Problem:
|
||||
- bun can’t load some native Node addons like `sharp` (and we don’t want to ship native addon trees for the gateway).
|
||||
|
||||
Solution:
|
||||
- Central helper [`src/media/image-ops.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/media/image-ops.ts)
|
||||
- Prefers `/usr/bin/sips` on macOS (esp. when running under bun)
|
||||
- Falls back to `sharp` when available (Node/dev)
|
||||
- Used by:
|
||||
- [`src/web/media.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/web/media.ts) (optimize inbound/outbound images)
|
||||
- [`src/browser/screenshot.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/browser/screenshot.ts)
|
||||
- [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts) (image sanitization)
|
||||
- Image operations prefer `/usr/bin/sips` on macOS (especially under bun).
|
||||
- When running in Node/dev, `sharp` is used when available.
|
||||
- This affects inbound/outbound media, screenshots, and tool image sanitization.
|
||||
|
||||
## Browser control server
|
||||
|
||||
The Gateway starts the browser control server (loopback only) from [`src/gateway/server.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/gateway/server.ts).
|
||||
It’s started from the relay daemon process, so the relay binary includes Playwright deps.
|
||||
The Gateway starts the browser control server (loopback only) from the relay daemon process,
|
||||
so the relay binary includes Playwright deps.
|
||||
|
||||
## Tests / smoke checks
|
||||
|
||||
@@ -127,7 +123,7 @@ Bun may leave dotfiles like `*.bun-build` in the repo root or subfolders.
|
||||
|
||||
## DMG styling (human installer)
|
||||
|
||||
[`scripts/create-dmg.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/create-dmg.sh) styles the DMG via Finder AppleScript.
|
||||
`scripts/create-dmg.sh` styles the DMG via Finder AppleScript.
|
||||
|
||||
Rules of thumb:
|
||||
- Use a **72dpi** background image that matches the Finder window size in points.
|
||||
|
||||
@@ -5,157 +5,117 @@ read_when:
|
||||
- Adding agent controls for visual workspace
|
||||
- Debugging WKWebView canvas loads
|
||||
---
|
||||
|
||||
# Canvas (macOS app)
|
||||
|
||||
Status: draft spec · Date: 2025-12-12
|
||||
The macOS app embeds an agent‑controlled **Canvas panel** using `WKWebView`. It
|
||||
is a lightweight visual workspace for HTML/CSS/JS, A2UI, and small interactive
|
||||
UI surfaces.
|
||||
|
||||
Note: for iOS/Android nodes that should render agent-edited HTML/CSS/JS over the network, prefer the Gateway `canvasHost` (serves `~/clawd/canvas` over LAN/tailnet with live reload). A2UI is also **hosted by the Gateway** over HTTP. This doc focuses on the macOS in-app canvas panel. See [`docs/configuration.md`](/gateway/configuration).
|
||||
## Where Canvas lives
|
||||
|
||||
Clawdbot can embed an agent-controlled “visual workspace” panel (“Canvas”) inside the macOS app using `WKWebView`, served via a **custom URL scheme** (no loopback HTTP port required).
|
||||
Canvas state is stored under Application Support:
|
||||
|
||||
This is designed for:
|
||||
- Agent-written HTML/CSS/JS on disk (per-session directory).
|
||||
- A real browser engine for layout, rendering, and basic interactivity.
|
||||
- Agent-driven visibility (show/hide), navigation, DOM/JS queries, and snapshots.
|
||||
- Minimal chrome: borderless panel; bezel/chrome appears only on hover.
|
||||
- `~/Library/Application Support/Clawdbot/canvas/<session>/...`
|
||||
|
||||
## Why a custom scheme (vs. loopback HTTP)
|
||||
The Canvas panel serves those files via a **custom URL scheme**:
|
||||
|
||||
Using `WKURLSchemeHandler` keeps Canvas entirely in-process:
|
||||
- No port conflicts and no extra local server lifecycle.
|
||||
- Easier to sandbox: only serve files we explicitly map.
|
||||
- Works offline and can use an ephemeral data store (no persistent cookies/cache).
|
||||
|
||||
If a Canvas page truly needs “real web” semantics (CORS, fetch to loopback endpoints, service workers), consider the loopback-server variant instead (out of scope for this doc).
|
||||
|
||||
## URL ↔ directory mapping
|
||||
|
||||
The Canvas scheme is:
|
||||
- `clawdbot-canvas://<session>/<path>`
|
||||
|
||||
Routing model:
|
||||
- `clawdbot-canvas://main/` → `<canvasRoot>/main/index.html` (or `index.htm`)
|
||||
- `clawdbot-canvas://main/yolo` → `<canvasRoot>/main/yolo/index.html` (or `index.htm`)
|
||||
Examples:
|
||||
- `clawdbot-canvas://main/` → `<canvasRoot>/main/index.html`
|
||||
- `clawdbot-canvas://main/assets/app.css` → `<canvasRoot>/main/assets/app.css`
|
||||
- `clawdbot-canvas://main/widgets/todo/` → `<canvasRoot>/main/widgets/todo/index.html`
|
||||
|
||||
Directory listings are not served.
|
||||
If no `index.html` exists at the root, the app shows a **built‑in scaffold page**.
|
||||
|
||||
When `/` has no `index.html` yet, the handler serves a **built-in scaffold page** (bundled with the macOS app).
|
||||
This is a visual placeholder only (no A2UI renderer).
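The URL-to-directory mapping can be illustrated as a pure string transformation. This is only a sketch in shell; the real handler lives in the macOS app. `canvas_path` is a hypothetical function, it does not touch the filesystem (so it cannot fall back to the scaffold or detect directories without a trailing slash), and the `..` rejection is an assumption added for illustration.

```bash
#!/usr/bin/env bash
# Illustrative only: mirrors the documented routing examples as string rules.
# Usage: canvas_path <canvasRoot> <clawdbot-canvas URL>
canvas_path() {
  local root="$1" url="$2" rest session path=""
  case "$url" in
    clawdbot-canvas://*) rest="${url#clawdbot-canvas://}" ;;
    *) echo "not a canvas URL: $url" >&2; return 1 ;;
  esac
  session="${rest%%/*}"            # first segment is the session
  if [[ "$rest" == */* ]]; then
    path="${rest#*/}"              # everything after the session
  fi
  # Assumption: refuse parent-directory segments.
  case "/$path/" in
    */../*) echo "path escapes session root: $path" >&2; return 1 ;;
  esac
  # Directory-style paths resolve to index.html.
  if [[ -z "$path" || "$path" == */ ]]; then
    path="${path}index.html"
  fi
  printf '%s/%s/%s\n' "$root" "$session" "$path"
}
```

Calling `canvas_path "<canvasRoot>" clawdbot-canvas://main/widgets/todo/` yields `<canvasRoot>/main/widgets/todo/index.html`, matching the examples above.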
|
||||
## Panel behavior
|
||||
|
||||
### Suggested on-disk location
|
||||
- Borderless, resizable panel anchored near the menu bar (or mouse cursor).
|
||||
- Remembers size/position per session.
|
||||
- Auto‑reloads when local canvas files change.
|
||||
- Only one Canvas panel is visible at a time (session is switched as needed).
|
||||
|
||||
Store Canvas state under the app support directory:
|
||||
- `~/Library/Application Support/Clawdbot/canvas/<session>/…`
|
||||
Canvas can be disabled from Settings → **Allow Canvas**. When disabled, canvas
|
||||
node commands return `CANVAS_DISABLED`.
|
||||
|
||||
This keeps it alongside other app-owned state and avoids mixing with `~/.clawdbot/` gateway config.
|
||||
## Agent API surface
|
||||
|
||||
## Panel behavior (agent-controlled)
|
||||
Canvas is exposed via the **node bridge**, so the agent can:
|
||||
|
||||
Canvas is presented as a borderless `NSPanel` (similar to the existing WebChat panel):
|
||||
- Can be shown/hidden at any time by the agent.
|
||||
- Supports an “anchored” presentation (near the menu bar icon or another anchor rect).
|
||||
- Uses a rounded container; shadow stays on, but **chrome/bezel only appears on hover**.
|
||||
- Default position is the **top-right corner** of the current screen’s visible frame (unless the user moved/resized it previously).
|
||||
- The panel is **user-resizable** (edge resize + hover resize handle) and the last frame is persisted per session.
|
||||
- show/hide the panel
|
||||
- navigate to a path or URL
|
||||
- evaluate JavaScript
|
||||
- capture a snapshot image
|
||||
|
||||
### Hover-only chrome
|
||||
CLI examples:
|
||||
|
||||
Implementation notes:
|
||||
- Keep the window borderless at all times (don’t toggle `styleMask`).
|
||||
- Add an overlay view inside the content container for chrome (stroke + subtle gradient/material).
|
||||
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`.
|
||||
- Optionally show close/drag affordances only while hovered.
|
||||
```bash
|
||||
clawdbot nodes canvas present --node <id>
|
||||
clawdbot nodes canvas navigate --node <id> --url "/"
|
||||
clawdbot nodes canvas eval --node <id> --js "document.title"
|
||||
clawdbot nodes canvas snapshot --node <id>
|
||||
```
|
||||
|
||||
## Agent API surface (current)
|
||||
Notes:
|
||||
- `canvas.navigate` accepts **local canvas paths**, `http(s)` URLs, and `file://` URLs.
|
||||
- If you pass `"/"`, the Canvas shows the local scaffold or `index.html`.
|
||||
|
||||
Canvas is exposed via the Gateway **node bridge**, so the agent can:
|
||||
- Show/hide the panel.
|
||||
- Navigate to a path (relative to the session root).
|
||||
- Evaluate JavaScript and optionally return results.
|
||||
- Query/modify DOM (helpers mirroring “dom query/all/attr/click/type/wait” patterns).
|
||||
- Capture a snapshot image of the current canvas view.
|
||||
- Optionally set panel placement (screen `x/y` + `width/height`) when showing/navigating.
|
||||
## A2UI in Canvas
|
||||
|
||||
This should be modeled after `WebChatManager`/`WebChatSwiftUIWindowController` but targeting `clawdbot-canvas://…` URLs.
|
||||
A2UI is hosted by the Gateway canvas host and rendered inside the Canvas panel.
|
||||
When the Gateway advertises a Canvas host, the macOS app auto‑navigates to the
|
||||
A2UI host page on first open.
|
||||
|
||||
Related:
|
||||
- For “invoke the agent again from UI” flows, prefer the macOS deep link scheme (`clawdbot://agent?...`) so *any* UI surface (Canvas, WebChat, native views) can trigger a new agent run. See [`docs/macos.md`](/platforms/macos).
|
||||
|
||||
## Agent commands (current)
|
||||
|
||||
Use the main `clawdbot` CLI; it invokes canvas commands via `node.invoke`.
|
||||
|
||||
- `clawdbot nodes canvas present --node <id> [--target <...>] [--x/--y/--width/--height]`
|
||||
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
|
||||
- If `/` has no index file, Canvas shows the built-in scaffold page and returns `status: "welcome"`.
|
||||
- `clawdbot nodes canvas hide --node <id>`
|
||||
- `clawdbot nodes canvas eval --js <code> --node <id>`
|
||||
- `clawdbot nodes canvas snapshot --node <id>`
|
||||
|
||||
### Canvas A2UI
|
||||
|
||||
Canvas A2UI is hosted by the **Gateway canvas host** at:
|
||||
Default A2UI host URL:
|
||||
|
||||
```
|
||||
http://<gateway-host>:18793/__clawdbot__/a2ui/
|
||||
```
|
||||
|
||||
The macOS app simply renders that page in the Canvas panel. The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line):
|
||||
### A2UI commands (v0.8)
|
||||
|
||||
- `clawdbot nodes canvas a2ui push --jsonl <path> --node <id>`
|
||||
- `clawdbot nodes canvas a2ui reset --node <id>`
|
||||
Canvas currently accepts **A2UI v0.8** server→client messages:
|
||||
|
||||
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer).
|
||||
- `beginRendering`
|
||||
- `surfaceUpdate`
|
||||
- `dataModelUpdate`
|
||||
- `deleteSurface`
|
||||
|
||||
Minimal example (v0.8):
|
||||
`createSurface` (v0.9) is not supported.
|
||||
|
||||
CLI example:
|
||||
|
||||
```bash
|
||||
cat > /tmp/a2ui-v0.8.jsonl <<'EOF'
|
||||
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, `nodes canvas a2ui push` works."},"usageHint":"body"}}}]}}
|
||||
cat > /tmp/a2ui-v0.8.jsonl <<'EOFA2'
|
||||
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, A2UI push works."},"usageHint":"body"}}}]}}
|
||||
{"beginRendering":{"surfaceId":"main","root":"root"}}
|
||||
EOF
|
||||
EOFA2
|
||||
|
||||
clawdbot nodes canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
|
||||
```
|
||||
|
||||
Notes:
|
||||
- This does **not** support the A2UI v0.9 examples using `createSurface`.
|
||||
- A2UI **fails** if the Gateway canvas host is unreachable (no local fallback).
|
||||
- `nodes canvas a2ui push` validates JSONL (line numbers on errors) and rejects v0.9 payloads.
|
||||
- Quick smoke: `clawdbot nodes canvas a2ui push --node <id> --text "Hello from A2UI"` renders a minimal v0.8 view.
|
||||
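The validation described in the notes above (line-numbered errors, v0.9 payloads rejected) can be sketched roughly as follows. This is an illustration only, not the CLI's actual implementation: `validateA2uiJsonl` and `V08_MESSAGES` are hypothetical names, and the check looks only at top-level message keys.

```js
// Hypothetical sketch; the real `nodes canvas a2ui push` validator may differ.
// The four v0.8 server→client message keys listed in this doc.
const V08_MESSAGES = ["beginRendering", "surfaceUpdate", "dataModelUpdate", "deleteSurface"];

// Returns a list of error strings, each prefixed with a 1-based line number.
function validateA2uiJsonl(text) {
  const errors = [];
  text.split("\n").forEach((line, index) => {
    const lineNo = index + 1;
    if (line.trim() === "") return; // ignore blank lines (e.g. trailing newline)
    let message;
    try {
      message = JSON.parse(line);
    } catch (err) {
      errors.push(`line ${lineNo}: invalid JSON`);
      return;
    }
    if (message === null || typeof message !== "object" || Array.isArray(message)) {
      errors.push(`line ${lineNo}: expected a single JSON object`);
      return;
    }
    if ("createSurface" in message) {
      errors.push(`line ${lineNo}: createSurface (v0.9) is not supported`);
      return;
    }
    if (!V08_MESSAGES.some((name) => name in message)) {
      errors.push(`line ${lineNo}: no known v0.8 message key`);
    }
  });
  return errors;
}
```

An empty result means every non-blank line parsed as an object carrying one of the v0.8 keys; anything else is reported with its line number.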
Quick smoke:
|
||||
|
||||
## Triggering agent runs from Canvas (deep links)
|
||||
```bash
|
||||
clawdbot nodes canvas a2ui push --node <id> --text "Hello from A2UI"
|
||||
```
|
||||
|
||||
## Triggering agent runs from Canvas
|
||||
|
||||
Canvas can trigger new agent runs via deep links:
|
||||
|
||||
Canvas can trigger new agent runs via the macOS app deep-link scheme:
|
||||
- `clawdbot://agent?...`
|
||||
|
||||
This is intentionally separate from `clawdbot-canvas://…` (which is only for serving local Canvas files into the `WKWebView`).
|
||||
Example (in JS):
|
||||
|
||||
Suggested patterns:
|
||||
- HTML: render links/buttons that navigate to `clawdbot://agent?message=...`.
|
||||
- JS: set `window.location.href = 'clawdbot://agent?...'` for “run this now” actions.
|
||||
```js
|
||||
window.location.href = "clawdbot://agent?message=Review%20this%20design";
|
||||
```
|
||||
|
||||
Implementation note (important):
|
||||
- In `WKWebView`, intercept `clawdbot://…` navigations in `WKNavigationDelegate` and forward them to the app, e.g. by calling `DeepLinkHandler.shared.handle(url:)` and returning `.cancel` for the navigation.
|
||||
The app prompts for confirmation unless a valid key is provided.
|
||||
|
||||
Safety:
|
||||
- Deep links (`clawdbot://agent?...`) are always enabled.
|
||||
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
|
||||
- With a valid `key`, the run is unattended (no prompt). For Canvas-originated actions, the app injects an internal key automatically.
|
||||
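A minimal sketch of building such a deep link from Canvas JavaScript, assuming only the `message` and `key` query parameters mentioned above. `buildAgentDeepLink` is a hypothetical helper, not part of the app.

```js
// Hypothetical helper; only `message` and `key` come from this doc.
function buildAgentDeepLink(message, key) {
  // encodeURIComponent encodes spaces as %20, matching the example in this doc.
  let url = `clawdbot://agent?message=${encodeURIComponent(message)}`;
  if (key !== undefined) {
    // With a valid key the run is unattended; omit it to get the confirmation prompt.
    url += `&key=${encodeURIComponent(key)}`;
  }
  return url;
}
```

In a Canvas page this would typically be used as `window.location.href = buildAgentDeepLink("Review this design")`.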
## Security notes
|
||||
|
||||
## Security / guardrails
|
||||
|
||||
Recommended defaults:
|
||||
- `WKWebsiteDataStore.nonPersistent()` for Canvas (ephemeral).
|
||||
- Navigation policy: allow only `clawdbot-canvas://…` (and optionally `about:blank`); open `http/https` externally.
|
||||
- Scheme handler must prevent directory traversal: resolved file paths must stay under `<canvasRoot>/<session>/`.
|
||||
- Disable or tightly scope any JS bridge; prefer query-string/bootstrap config over `window.webkit.messageHandlers` for sensitive data.
|
||||
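The traversal guard in the list above can be illustrated with a small path resolver. This is a sketch under stated assumptions, not the app's scheme handler: `resolveCanvasPath` is a hypothetical name, the request path is assumed to be percent-decoded already, and symlinks are not considered.

```js
// Hypothetical sketch of "resolved paths must stay under <canvasRoot>/<session>/".
// Returns the resolved path, or null when the request tries to escape the root.
function resolveCanvasPath(sessionRoot, requestPath) {
  const segments = [];
  for (const part of requestPath.split("/")) {
    if (part === "" || part === ".") continue; // skip empty and current-dir segments
    if (part === "..") {
      if (segments.length === 0) return null; // would climb above the session root
      segments.pop();
      continue;
    }
    segments.push(part);
  }
  // Strip trailing slashes from the root before joining.
  return [sessionRoot.replace(/\/+$/, ""), ...segments].join("/");
}
```

Rejecting the request outright, rather than clamping it to the root, keeps an escape attempt visible in logs instead of silently serving a different file.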
|
||||
## Debugging
|
||||
|
||||
Suggested debugging hooks:
|
||||
- Enable Web Inspector for Canvas builds (same approach as WebChat).
|
||||
- Log scheme requests + resolution decisions to OSLog (subsystem `com.clawdbot`, category `Canvas`).
|
||||
- Provide a “copy canvas dir” action in debug settings to quickly reveal the session directory in Finder.
|
||||
- Canvas scheme blocks directory traversal; files must live under the session root.
|
||||
- Local Canvas content uses a custom scheme (no loopback server required).
|
||||
- External `http(s)` URLs are allowed only when explicitly navigated.
|
||||
|
||||
@@ -1,72 +1,56 @@
|
||||
---
|
||||
summary: "Running the gateway as a child process of the macOS app and why"
|
||||
summary: "Gateway lifecycle on macOS (launchd + attach-only)"
|
||||
read_when:
|
||||
- Integrating the mac app with the gateway lifecycle
|
||||
---
|
||||
# Clawdbot gateway as a child process of the macOS app
|
||||
# Gateway lifecycle on macOS
|
||||
|
||||
Date: 2025-12-06 · Status: draft · Owner: steipete
|
||||
The macOS app **manages the Gateway via launchd** by default. This gives you
|
||||
reliable auto‑start at login and restart on crashes.
|
||||
|
||||
Note (2025-12-19): the current implementation prefers a **launchd LaunchAgent** that runs the **bundled bun-compiled gateway**. This doc remains as an alternative mode for tighter coupling to the UI.
|
||||
Child‑process mode (Gateway spawned directly by the app) is **not in use** today.
|
||||
If you need tighter coupling to the UI, use **Attach‑only** and run the Gateway
|
||||
manually in a terminal.
|
||||
|
||||
## Goal
|
||||
Run the Node-based Clawdbot gateway as a direct child of the LSUIElement app (instead of a launchd agent) while keeping all TCC-sensitive work inside the Swift app/broker layer and wiring the existing “Clawdbot Active” toggle to start/stop the child.
|
||||
## Default behavior (launchd)
|
||||
|
||||
## When to prefer the child-process mode
|
||||
- You want gateway lifetime strictly coupled to the menu-bar app (dies when the app quits) and controlled by the “Clawdbot Active” toggle without touching launchd.
|
||||
- You’re okay giving up login persistence/auto-restart that launchd provides, or you’ll add your own backoff loop.
|
||||
- You want simpler log capture and supervision inside the app (no external plist or user-visible LaunchAgent).
|
||||
- The app installs a per‑user LaunchAgent labeled `com.clawdbot.gateway`.
|
||||
- When Local mode is enabled, the app ensures the LaunchAgent is loaded and
|
||||
starts the Gateway if needed.
|
||||
- Logs are written to the launchd gateway log path (visible in Debug Settings).
|
||||
|
||||
## Tradeoffs vs. launchd
|
||||
- **Pros:** tighter coupling to UI state; simpler surface (no plist install/bootout); easier to stream stdout/stderr; fewer moving parts for beta users.
|
||||
- **Cons:** no built-in KeepAlive/login auto-start; app crash kills gateway; you must build your own restart/backoff; Activity Monitor will show both processes under the app; still need correct TCC handling (see below).
|
||||
- **TCC:** behaviorally, child processes often inherit the parent app’s “responsible process” for TCC, but this is *not a contract*. Continue to route all protected actions through the Swift app/broker so prompts stay tied to the signed app bundle.
|
||||
Common commands:
|
||||
|
||||
## TCC guardrails (must keep)
|
||||
- Screen Recording, Accessibility, mic, and speech prompts must originate from the signed Swift app/broker. The Node child should never call these APIs directly; route through the app’s node commands (via Gateway `node.invoke`) for:
|
||||
- `system.notify`
|
||||
- `system.run` (including `needsScreenRecording`)
|
||||
- `screen.record` / `camera.*`
|
||||
- PeekabooBridge UI automation (`peekaboo …`)
|
||||
- Usage strings (`NSMicrophoneUsageDescription`, `NSSpeechRecognitionUsageDescription`, etc.) stay in the app target’s Info.plist; a bare Node binary has none and would fail.
|
||||
- If you ever embed Node that *must* touch TCC, wrap that call in a tiny signed helper target inside the app bundle and have Node exec that helper instead of calling the API directly.
|
||||
```bash
|
||||
launchctl kickstart -k gui/$UID/com.clawdbot.gateway
|
||||
launchctl bootout gui/$UID/com.clawdbot.gateway
|
||||
```
|
||||
|
||||
## Process manager design (Swift Subprocess)
|
||||
- Add a small `GatewayProcessManager` (Swift) that owns:
|
||||
- `execution: Execution?` from `Swift Subprocess` to track the child.
|
||||
- `start(config)` called when “Clawdbot Active” flips ON:
|
||||
- binary: host Node running the bundled gateway under `Clawdbot.app/Contents/Resources/Gateway/`
|
||||
- args: current clawdbot entrypoint and flags
|
||||
- cwd/env: point to `~/.clawdbot` as today; inject the expanded PATH so Homebrew Node resolves under launchd
|
||||
- output: stream stdout/stderr to `/tmp/clawdbot-gateway.log` (cap buffer via Subprocess OutputLimits)
|
||||
- restart: optional linear/backoff restart if exit was non-zero and Active is still true
|
||||
- `stop()` called when Active flips OFF or app terminates: cancel the execution and `waitUntilExit`.
|
||||
- Wire SwiftUI toggle:
|
||||
- ON: `GatewayProcessManager.start(...)`
|
||||
- OFF: `GatewayProcessManager.stop()` (no launchctl calls in this mode)
|
||||
- Keep the existing `LaunchdManager` around so we can switch back if needed; the toggle can choose between launchd or child mode with a flag if we want both.
|
||||
## Attach‑only (developer mode)
|
||||
|
||||
## Packaging and signing
|
||||
- Bundle the gateway payload (dist + production node_modules) under `Contents/Resources/Gateway/`; rely on host Node ≥22 instead of embedding a runtime.
|
||||
- Codesign native addons and dylibs inside the bundle; no nested runtime binary to sign now.
|
||||
- Host runtime should not call TCC APIs directly; keep privileged work inside the app/broker.
|
||||
Attach‑only tells the app to **connect to an existing Gateway** without spawning
|
||||
one. This is ideal for local dev (hot‑reload, custom flags).
|
||||
|
||||
## Logging and observability
|
||||
- Stream child stdout/stderr to `/tmp/clawdbot-gateway.log`; surface the last N lines in the Debug tab.
|
||||
- Emit a user notification (via existing NotificationManager) on crash/exit while Active is true.
|
||||
- Add a lightweight heartbeat from Node → app (e.g., ping over stdout) so the app can show status in the menu.
|
||||
Steps:
|
||||
|
||||
## Failure/edge cases
|
||||
- App crash/quit kills the gateway. Decide if that is acceptable for the deployment tier; otherwise, stick with launchd for production and keep child-process for dev/experiments.
|
||||
- If the gateway exits repeatedly, back off (e.g., 1s/2s/5s/10s) and give up after N attempts with a menu warning.
|
||||
- Respect the existing pause semantics: when paused, the broker should return `ok=false, "clawdbot paused"`; the gateway should avoid calling privileged routes while paused.
|
||||
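The backoff policy from the list above (1s/2s/5s/10s, then give up after N attempts) could look roughly like this. The function name and the 0-based attempt convention are assumptions for illustration; the app itself would implement this in Swift.

```js
// Hypothetical sketch of the restart schedule described above.
const BACKOFF_SECONDS = [1, 2, 5, 10];

// `attempt` is the number of restarts already tried (0-based).
// Returns the delay in seconds, or null when the caller should give up
// and surface a menu warning instead.
function nextRestartDelay(attempt, maxAttempts) {
  if (attempt >= maxAttempts) return null;
  // Stay at the last (largest) delay once the schedule is exhausted.
  const index = Math.min(attempt, BACKOFF_SECONDS.length - 1);
  return BACKOFF_SECONDS[index];
}
```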
1) Start the Gateway yourself:
|
||||
```bash
|
||||
pnpm gateway:watch
|
||||
```
|
||||
2) In the macOS app: Debug Settings → Gateway → **Attach only**.
|
||||
|
||||
## Open questions / follow-ups
|
||||
- Do we need dual-mode (launchd for prod, child for dev)? If yes, gate via a setting or build flag.
|
||||
- Embedding a runtime is off the table for now; we rely on host Node for size/simplicity. Revisit only if host PATH drift becomes painful.
|
||||
- Do we want a tiny signed helper for rare TCC actions that cannot be brokered via the Swift app/broker?
|
||||
The UI should show “Using existing gateway …” once connected.
|
||||
|
||||
## Decision snapshot (current recommendation)
|
||||
- Keep all TCC surfaces in the Swift app/broker (node commands + PeekabooBridgeHost).
|
||||
- Implement `GatewayProcessManager` with Swift Subprocess to start/stop the gateway on the “Clawdbot Active” toggle.
|
||||
- Maintain the launchd path as a fallback for uptime/login persistence until child-mode proves stable.
|
||||
## Remote mode
|
||||
|
||||
Remote mode never starts a local Gateway. The app uses an SSH tunnel to the
|
||||
remote host and connects over that tunnel.
|
||||
|
||||
## Why we prefer launchd
|
||||
|
||||
- Auto‑start at login.
|
||||
- Built‑in restart/KeepAlive semantics.
|
||||
- Predictable logs and supervision.
|
||||
|
||||
If a true child‑process mode is ever needed again, it should be documented as a
|
||||
separate, explicit dev‑only mode.
|
||||
|
||||
@@ -1,170 +1,62 @@
|
||||
---
|
||||
summary: "Plan for integrating Peekaboo automation into Clawdbot via PeekabooBridge (socket-based TCC broker)"
|
||||
summary: "PeekabooBridge integration for macOS UI automation"
|
||||
read_when:
|
||||
- Hosting PeekabooBridge in Clawdbot.app
|
||||
- Integrating Peekaboo as a submodule
|
||||
- Changing PeekabooBridge protocol/paths
|
||||
---
|
||||
# Peekaboo Bridge in Clawdbot (macOS UI automation broker)
|
||||
# Peekaboo Bridge (macOS UI automation)
|
||||
|
||||
## TL;DR
|
||||
- **Peekaboo removed its XPC helper** and now exposes privileged automation via a **UNIX domain socket bridge** (`PeekabooBridge` / `PeekabooBridgeHost`, socket name `bridge.sock`).
|
||||
- Clawdbot integrates by **optionally hosting the same bridge** inside **Clawdbot.app** (user-toggleable). The primary client is the **`peekaboo` CLI** (installed via npm); Clawdbot does not need its own `ui …` CLI surface.
|
||||
- For **visualizations**, we keep them in **Peekaboo.app** (best UX); Clawdbot stays a thin broker host. No visualizer toggle in Clawdbot.
|
||||
Clawdbot can host **PeekabooBridge** as a local, permission‑aware UI automation
|
||||
broker. This lets the `peekaboo` CLI drive UI automation while reusing the
|
||||
macOS app’s TCC permissions.
|
||||
|
||||
Non-goals:
|
||||
- No auto-launching Peekaboo.app.
|
||||
- No onboarding deep links from the automation endpoint (Clawdbot onboarding already handles permissions).
|
||||
- No AI provider/agent runtime dependencies in Clawdbot (avoid pulling Tachikoma/MCP into the Clawdbot app/CLI).
|
||||
## What this is (and isn’t)
|
||||
|
||||
## Big refactor (Dec 2025): XPC → Bridge
|
||||
Peekaboo’s privileged execution moved from “CLI → XPC helper” to “CLI → socket bridge host”. For Clawdbot this is a win:
|
||||
- It matches the existing “local socket + codesign checks” approach.
|
||||
- It lets us piggyback on **either** Peekaboo.app’s permissions **or** Clawdbot.app’s permissions (whichever is running).
|
||||
- It avoids “two apps with two TCC bubbles” unless needed.
|
||||
- **Host**: Clawdbot.app can act as a PeekabooBridge host.
|
||||
- **Client**: use the `peekaboo` CLI (no separate `clawdbot ui ...` surface).
|
||||
- **UI**: visual overlays stay in Peekaboo.app; Clawdbot is a thin broker host.
|
||||
|
||||
Reference (Peekaboo submodule): `Peekaboo/docs/bridge-host.md`.
|
||||
## Enable the bridge
|
||||
|
||||
## Architecture
|
||||
### Processes
|
||||
- **Bridge hosts** (provide TCC-backed automation):
|
||||
- **Peekaboo.app** (preferred; also provides visualizations + controls)
|
||||
- **Claude.app** (secondary; lets `peekaboo` reuse Claude Desktop’s granted permissions)
|
||||
- **Clawdbot.app** (secondary; “thin host” only)
|
||||
- **Bridge clients** (trigger single actions):
|
||||
- `peekaboo …` (preferred; humans + agents)
|
||||
- Optional: Clawdbot/Node shells out to `peekaboo` when it needs UI automation/capture
|
||||
In the macOS app:
|
||||
- Settings → **Enable Peekaboo Bridge**
|
||||
|
||||
### Host discovery (client-side)
|
||||
Order is deliberate:
|
||||
1. Peekaboo.app host (full UX)
|
||||
2. Claude.app host (piggyback on Claude Desktop permissions)
|
||||
3. Clawdbot.app host (piggyback on Clawdbot permissions)
|
||||
When enabled, Clawdbot starts a local UNIX socket server. If disabled, the host
|
||||
is stopped and `peekaboo` will fall back to other available hosts.
|
||||
|
||||
Socket paths (convention; exact paths must match Peekaboo):
|
||||
- Peekaboo: `~/Library/Application Support/Peekaboo/bridge.sock`
|
||||
- Claude: `~/Library/Application Support/Claude/bridge.sock`
|
||||
- Clawdbot: `~/Library/Application Support/clawdbot/bridge.sock`
|
||||
## Client discovery order
|
||||
|
||||
No auto-launch: if a host isn’t reachable, the command fails with a clear error (start Peekaboo.app, Claude.app, or Clawdbot.app).
|
||||
Peekaboo clients typically try hosts in this order:
|
||||
|
||||
Override (debugging): set `PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock`.
|
||||
1. Peekaboo.app (full UX)
|
||||
2. Claude.app (if installed)
|
||||
3. Clawdbot.app (thin broker)
|
||||
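The discovery order can be sketched as a small selection function. This is an illustration, not Peekaboo's client code: `pickBridgeSocket` and the injected `socketExists` predicate are hypothetical, the socket paths follow the convention listed in this doc, and treating the `PEEKABOO_BRIDGE_SOCKET` override as bypassing the existence check is an assumption.

```js
// Hypothetical sketch of client-side host discovery.
function pickBridgeSocket(env, homeDir, socketExists) {
  // Debug override wins over discovery (assumed to skip the existence check).
  if (env.PEEKABOO_BRIDGE_SOCKET) return env.PEEKABOO_BRIDGE_SOCKET;
  const support = `${homeDir}/Library/Application Support`;
  // Order is deliberate: Peekaboo.app, then Claude.app, then Clawdbot.app.
  const candidates = [
    `${support}/Peekaboo/bridge.sock`,
    `${support}/Claude/bridge.sock`,
    `${support}/clawdbot/bridge.sock`,
  ];
  // No auto-launch: null means the caller should fail with a clear error.
  return candidates.find((path) => socketExists(path)) ?? null;
}
```

Passing `socketExists` in as a parameter keeps the ordering logic separate from filesystem access, which makes it easy to exercise without real sockets.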
|
||||
### Protocol shape
|
||||
- **Single request per connection**: connect → write one JSON request → half-close → read one JSON response → close.
|
||||
- **Timeout**: 10 seconds end-to-end per action (client enforced; host should also enforce per-operation).
|
||||
- **Errors**: human-readable string by default; structured envelope in `--json`.
|
||||
Use `peekaboo bridge status --verbose` to see which host is active and which
|
||||
socket path is in use. You can override with:
|
||||
|
||||
## Dependency strategy (submodule)
|
||||
Integrate Peekaboo via git submodule (nested submodules are OK).
|
||||
```bash
|
||||
export PEEKABOO_BRIDGE_SOCKET=/path/to/bridge.sock
|
||||
```
|
||||
|
||||
Path in Clawdbot repo:
|
||||
- `./Peekaboo` (Swabble-style; keep stable so SwiftPM path deps don’t churn).
|
||||
## Security & permissions
|
||||
|
||||
What Clawdbot should use:
|
||||
- **Client side**: `PeekabooBridge` (socket client + protocol models).
|
||||
- **Host side (Clawdbot.app)**: `PeekabooBridgeHost` + the minimal Peekaboo services needed to implement operations.
|
||||
- The bridge validates **caller code signatures**; TeamID `Y5PE65HELJ` is
|
||||
allowed by default (Peekaboo’s signing team), plus the Clawdbot app’s TeamID.
|
||||
- Requests time out after ~10 seconds.
|
||||
- If required permissions are missing, the bridge returns a clear error message
|
||||
rather than launching System Settings.
|
||||
|
||||
What Clawdbot should *not* embed:
|
||||
- **Visualizer UI**: keep it in Peekaboo.app for now (toggle + controls live there).
|
||||
- **XPC**: don’t reintroduce helper targets; use the bridge.
|
||||
## Snapshot behavior (automation)
|
||||
|
||||
## IPC / CLI surface
|
||||
### No `clawdbot ui …`
|
||||
We avoid a parallel “Clawdbot UI automation CLI”. Instead:
|
||||
- `peekaboo` is the user/agent-facing CLI surface for automation and capture.
|
||||
- Clawdbot.app can host PeekabooBridge as a **thin TCC broker** so Peekaboo can piggyback on Clawdbot permissions when Peekaboo.app isn’t running.
|
||||
Snapshots are stored in memory and expire automatically after a short window.
|
||||
If you need longer retention, re‑capture from the client.
|
||||
|
||||
### Diagnostics
|
||||
Use Peekaboo’s built-in diagnostics to see which host would be used:
|
||||
- `peekaboo bridge status`
|
||||
- `peekaboo bridge status --verbose`
|
||||
- `peekaboo bridge status --json`
|
||||
## Troubleshooting
|
||||
|
||||
### Output format
|
||||
Peekaboo commands default to human text output. Add `--json` for a structured envelope.
|
||||
|
||||
### Timeouts
|
||||
Default timeout for UI actions: **10 seconds** end-to-end (client enforced; host should also enforce per-operation).
|
||||
|
||||
## Coordinate model (multi-display)
|
||||
Requirement: coordinates are **per screen**, not global.
|
||||
|
||||
Standardize for the CLI (agent-friendly): **top-left origin per screen**.
|
||||
|
||||
Proposed request shape:
|
||||
- Requests accept `screenIndex` + `{x, y}` in that screen’s local coordinate space.
|
||||
- Clawdbot.app converts to global CG coordinates using `NSScreen.screens[screenIndex].frame.origin`.
|
||||
- Responses should echo:
|
||||
- The resolved `screenIndex`
|
||||
- The local `{x, y}` and bounds
|
||||
- Optionally the global `{x, y}` for debugging
|
||||
|
||||
Ordering: use `NSScreen.screens` ordering consistently (documented in the CLI help + JSON schema).
|
||||
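As a rough illustration of the proposed request shape, the conversion from a screen-local point to a global one is an offset by the screen's origin. The sketch below assumes screen origins are already expressed in a top-left global space; AppKit's `NSScreen` frames use a bottom-left origin, so a real host would likely also need a Y flip. `toGlobalPoint` and the `screens` array shape are hypothetical.

```js
// Hypothetical sketch: `screens` mirrors NSScreen.screens ordering, with each
// entry carrying an `origin` already converted to a top-left global space.
function toGlobalPoint(screens, screenIndex, local) {
  const screen = screens[screenIndex];
  if (!screen) throw new RangeError(`unknown screenIndex ${screenIndex}`);
  return {
    screenIndex, // echo the resolved screen
    local, // echo the screen-local point
    global: { x: screen.origin.x + local.x, y: screen.origin.y + local.y },
  };
}
```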
|
||||
## Targeting (per app/window)
|
||||
Expose window/app targeting in the UI surface (align with Peekaboo targeting):
|
||||
- frontmost
|
||||
- by app name / bundle id
|
||||
- by window title substring
|
||||
- by (app, index)
|
||||
|
||||
Peekaboo CLI targeting (agent-friendly):
|
||||
- `--bundle-id <id>` for app targeting
|
||||
- `--window-index <n>` (0-based) for disambiguating within an app when capturing
|
||||
|
||||
All “see/click/type/scroll/wait” requests should accept a target (default: frontmost).
|
||||
|
||||
## “See” + click packs (Playwright-style)
|
||||
Behavior stays aligned with Peekaboo:
|
||||
- `peekaboo see` returns element IDs (e.g. `B1`, `T3`) with bounds/labels.
|
||||
- Follow-up actions reference those IDs without re-scanning.
|
||||
|
||||
`peekaboo see` should:
|
||||
- capture (optionally targeted) window/screen
|
||||
- return a screenshot **file path** (default: temp directory)
|
||||
- return a list of elements (text or JSON)
|
||||
|
||||
Snapshot lifecycle requirement:
|
||||
- Host apps are long-lived, so snapshot state should be **in-memory by default**.
|
||||
- Snapshot scoping: “implicit snapshot” is **per target bundle id** (reuse last snapshot for that app when snapshot id is omitted).
|
||||
|
||||
Practical flow (agent-friendly):
|
||||
- `peekaboo list apps` / `peekaboo list windows` provide bundle-id context for targeting.
|
||||
- `peekaboo see --bundle-id X` updates the implicit snapshot for `X`.
|
||||
- `peekaboo click --bundle-id X --on B1` reuses the most recent snapshot for `X` when `--snapshot-id` is omitted.
|
||||
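The implicit-snapshot rule in the flow above (reuse the latest snapshot for a bundle id when no snapshot id is given) can be modeled with two in-memory maps. This is a sketch only; `SnapshotStore` and its methods are hypothetical and expiry is left out.

```js
// Hypothetical in-memory store illustrating per-bundle-id implicit snapshots.
class SnapshotStore {
  constructor() {
    this.snapshots = new Map(); // snapshotId -> { id, bundleId, elements }
    this.latestByBundle = new Map(); // bundleId -> most recent snapshotId
  }

  // Called after a `see`: records the snapshot and makes it the implicit one.
  save(bundleId, snapshotId, elements) {
    this.snapshots.set(snapshotId, { id: snapshotId, bundleId, elements });
    this.latestByBundle.set(bundleId, snapshotId);
  }

  // Called by follow-up actions: an explicit id wins, else the latest for the app.
  resolve(bundleId, snapshotId) {
    const id = snapshotId ?? this.latestByBundle.get(bundleId);
    if (id === undefined) return null;
    return this.snapshots.get(id) ?? null;
  }
}
```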
|
||||
## Visualizer integration
|
||||
Keep visualizations in **Peekaboo.app** for now.
|
||||
- Clawdbot hosts the bridge, but does not render overlays.
|
||||
- Any “visualizer enabled/disabled” setting is controlled in Peekaboo.app.
|
||||
|
||||
## Screenshots (legacy → Peekaboo takeover)
|
||||
Clawdbot should not grow a separate screenshot CLI surface.
|
||||
|
||||
Migration plan:
|
||||
- Use `peekaboo capture …` / `peekaboo see …` (returns a file path, default temp directory).
|
||||
- Once Clawdbot’s legacy screenshot plumbing is replaced, remove it cleanly (no aliases).
|
||||
|
||||
## Permissions behavior
|
||||
If required permissions are missing:
|
||||
- return `ok=false` with a short human error message (e.g., “Accessibility permission missing”)
|
||||
- do not try to open System Settings from the automation endpoint
|
||||
|
||||
## Security (socket auth)
|
||||
Both hosts must enforce:
|
||||
- filesystem perms on the socket path (owner read/write only)
|
||||
- server-side caller validation:
|
||||
- require the caller’s code signature TeamID to be `Y5PE65HELJ`
|
||||
- optional bundle-id allowlist for tighter scoping
|
||||
|
||||
Debug-only escape hatch (development convenience):
|
||||
- “allow same-UID callers” means: *skip codesign checks for clients running under the same Unix user*.
|
||||
- This must be **opt-in**, **DEBUG-only**, and guarded by an env var (Peekaboo uses `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`).
|
||||
|
||||
## Next integration steps (after this doc)
|
||||
1. Add Peekaboo as a git submodule (nested submodules OK).
|
||||
2. Host `PeekabooBridgeHost` inside Clawdbot.app behind a single setting (“Enable Peekaboo Bridge”, default on).
|
||||
3. Ensure Clawdbot hosts the bridge at `~/Library/Application Support/clawdbot/bridge.sock` and speaks the PeekabooBridge JSON protocol.
|
||||
4. Validate with `peekaboo bridge status --verbose` that Peekaboo can select Clawdbot as the fallback host (no auto-launch).
|
||||
5. Keep all protocol decisions aligned with Peekaboo (coordinate system, element IDs, snapshot scoping, error envelopes).
|
||||
- If `peekaboo` reports “bridge client is not authorized”, ensure the client is
|
||||
properly signed or run the host with `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1`
|
||||
in **debug** mode only.
|
||||
- If no hosts are found, open one of the host apps (Peekaboo.app or Clawdbot.app)
|
||||
and confirm permissions are granted.
|
||||
|
||||
@@ -3,25 +3,37 @@ summary: "How the mac app embeds the gateway WebChat and how to debug it"
|
||||
read_when:
|
||||
- Debugging mac WebChat view or loopback port
|
||||
---
|
||||
# Web Chat (macOS app)
|
||||
# WebChat (macOS app)
|
||||
|
||||
The macOS menu bar app shows the WebChat UI as a native SwiftUI view and reuses the **primary Clawd session** (`main`, or `global` when scope is global).
|
||||
The macOS menu bar app embeds the WebChat UI as a native SwiftUI view. It
|
||||
connects to the Gateway and defaults to the **main session** for the selected
|
||||
agent (with a session switcher for other sessions).
|
||||
|
||||
- **Local mode**: connects directly to the local Gateway WebSocket.
|
||||
- **Remote mode**: forwards the Gateway WebSocket control port over SSH and uses that as the data plane.
|
||||
- **Remote mode**: forwards the Gateway control port over SSH and uses that
|
||||
tunnel as the data plane.
|
||||
|
||||
## Launch & debugging
|
||||
|
||||
- Manual: Lobster menu → “Open Chat”.
|
||||
- Auto-open for testing: run `dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat` (or pass `--webchat` to the binary launched by launchd). The window opens on startup.
|
||||
- Logs: see [`./scripts/clawlog.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/clawlog.sh) (subsystem `com.clawdbot`, category `WebChatSwiftUI`).
|
||||
- Auto‑open for testing:
|
||||
```bash
|
||||
dist/Clawdbot.app/Contents/MacOS/Clawdbot --webchat
|
||||
```
|
||||
- Logs: `./scripts/clawlog.sh` (subsystem `com.clawdbot`, category `WebChatSwiftUI`).
|
||||
|
||||
## How it’s wired
|
||||
- Implementation: [`apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift`](https://github.com/clawdbot/clawdbot/blob/main/apps/macos/Sources/Clawdbot/WebChatSwiftUI.swift) hosts `ClawdbotChatUI` and speaks to the Gateway over `GatewayConnection`.
|
||||
- Data plane: Gateway WebSocket methods `chat.history`, `chat.send`, `chat.abort`; events `chat`, `agent`, `presence`, `tick`, `health`.
|
||||
- Session: usually primary (`main`); multiple transports (WhatsApp/Telegram/Discord/Desktop) share the same key. The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
|
||||
|
||||
## Security / surface area
|
||||
- Data plane: Gateway WS methods `chat.history`, `chat.send`, `chat.abort` and
|
||||
events `chat`, `agent`, `presence`, `tick`, `health`.
|
||||
- Session: defaults to the primary session (`main`, or `global` when scope is
|
||||
global). The UI can switch between sessions.
|
||||
- Onboarding uses a dedicated session to keep first‑run setup separate.
|
||||
|
||||
## Security surface
|
||||
|
||||
- Remote mode forwards only the Gateway WebSocket control port over SSH.
|
||||
|
||||
## Known limitations
|
||||
- The UI is optimized for the primary session and typical “chat” usage (not a full browser-based sandbox surface).
|
||||
|
||||
- The UI is optimized for chat sessions (not a full browser sandbox).
|
||||
|
||||
@@ -3,7 +3,7 @@ summary: "macOS IPC architecture for Clawdbot app, gateway node bridge, and Peek
|
||||
read_when:
|
||||
- Editing IPC contracts or menu bar app IPC
|
||||
---
|
||||
# Clawdbot macOS IPC architecture (Dec 2025)
|
||||
# Clawdbot macOS IPC architecture
|
||||
|
||||
**Current model:** there is **no local control socket** and no `clawdbot-mac` CLI. All agent actions go through the Gateway WebSocket and `node.invoke`. UI automation still uses PeekabooBridge.
|
||||
|
||||
@@ -21,10 +21,10 @@ read_when:
|
||||
- UI automation uses a separate UNIX socket named `bridge.sock` and the PeekabooBridge JSON protocol.
|
||||
- Host preference order (client-side): Peekaboo.app → Claude.app → Clawdbot.app → local execution.
|
||||
- Security: bridge hosts require TeamID `Y5PE65HELJ`; DEBUG-only same-UID escape hatch is guarded by `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (Peekaboo convention).
|
||||
- See: [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo) for the Clawdbot plan and naming.
|
||||
- See: [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo) for PeekabooBridge usage.
|
||||
|
||||
### Mach/XPC (future direction)
|
||||
- Still optional for internal app services, but **not required** for automation now that node.invoke is the surface.
|
||||
### Mach/XPC
|
||||
- Not required for automation; `node.invoke` + PeekabooBridge cover current needs.
|
||||
|
||||
## Operational flows
|
||||
- Restart/rebuild: `SIGN_IDENTITY="Apple Development: Peter Steinberger (2ZAC4GM7GD)" scripts/restart-mac.sh`
|
||||
@@ -37,4 +37,4 @@ read_when:
|
||||
- Prefer requiring a TeamID match for all privileged surfaces.
|
||||
- PeekabooBridge: `PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1` (DEBUG-only) may allow same-UID callers for local development.
|
||||
- All communication remains local-only; no network sockets are exposed.
|
||||
- TCC prompts originate only from the GUI app bundle; run [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) so the signed bundle ID stays stable.
|
||||
- TCC prompts originate only from the GUI app bundle; keep the signed bundle ID stable across rebuilds.
|
||||
|
||||
@@ -1,123 +1,97 @@
|
||||
---
|
||||
summary: "Spec for the Clawdbot macOS companion menu bar app (gateway + node broker)"
|
||||
summary: "Clawdbot macOS companion app (menu bar + gateway broker)"
|
||||
read_when:
|
||||
- Implementing macOS app features
|
||||
- Changing gateway lifecycle or node bridging on macOS
|
||||
---
|
||||
# Clawdbot macOS Companion (menu bar + gateway broker)
|
||||
|
||||
Author: steipete · Status: draft spec · Date: 2025-12-20
|
||||
The macOS app is the **menu‑bar companion** for Clawdbot. It owns permissions,
|
||||
manages the Gateway locally, and exposes macOS capabilities to the agent as a
|
||||
node.
|
||||
|
||||
## Support snapshot
|
||||
- Core Gateway: supported (TypeScript on Node/Bun).
|
||||
- Companion app: macOS menu bar app with permissions + node bridge.
|
||||
- Install: [Getting Started](/start/getting-started) or [Install & updates](/install/updating).
|
||||
- Gateway: [Runbook](/gateway) + [Configuration](/gateway/configuration).
|
||||
## What it does
|
||||
|
||||
## System control (launchd)
|
||||
If you run the bundled macOS app, it installs a per-user LaunchAgent labeled `com.clawdbot.gateway`.
|
||||
CLI-only installs can use `clawdbot onboard --install-daemon`, `clawdbot daemon install`, or `clawdbot configure` → **Gateway daemon**.
|
||||
- Shows native notifications and status in the menu bar.
|
||||
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Microphone,
|
||||
Speech Recognition, Automation/AppleScript).
|
||||
- Runs or connects to the Gateway (local or remote).
|
||||
- Exposes macOS‑only tools (Canvas, Camera, Screen Recording, `system.run`).
|
||||
- Optionally hosts **PeekabooBridge** for UI automation.
|
||||
- Installs a helper CLI (`clawdbot`) into `/usr/local/bin` and
|
||||
`/opt/homebrew/bin` on request.
|
||||
|
||||
## Local vs remote mode
|
||||
|
||||
- **Local** (default): the app ensures a local Gateway is running via launchd.
|
||||
- **Remote**: the app connects to a Gateway over SSH/Tailscale and never starts
|
||||
a local process.
|
||||
- **Attach‑only** (debug): the app connects to an already‑running local Gateway
|
||||
and never spawns its own.
|
||||
|
||||
## Launchd control
|
||||
|
||||
The app manages a per‑user LaunchAgent labeled `com.clawdbot.gateway`.
|
||||
|
||||
```bash
|
||||
launchctl kickstart -k gui/$UID/com.clawdbot.gateway
|
||||
launchctl bootout gui/$UID/com.clawdbot.gateway
|
||||
```
|
||||
|
||||
`launchctl` only works if the LaunchAgent is installed; otherwise run `clawdbot daemon install` first.
|
||||
If the LaunchAgent isn’t installed, enable it from the app or run
|
||||
`clawdbot daemon install`.
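
The check described above can be scripted. The following is a minimal sketch, not part of the app or CLI: it assumes `launchctl print` exits non-zero when the service is not loaded, and `restart_gateway`, `GATEWAY_LABEL`, and the `LAUNCHCTL` override are hypothetical names introduced only for illustration.

```bash
#!/usr/bin/env bash
# Sketch: restart the Gateway LaunchAgent only if launchd knows about it.
# LAUNCHCTL is a hypothetical override so the function can be tried with a stub.
LAUNCHCTL="${LAUNCHCTL:-launchctl}"
GATEWAY_LABEL="com.clawdbot.gateway"

restart_gateway() {
  # Same target as gui/$UID/<label>; `id -u` prints the current user's UID.
  local target
  target="gui/$(id -u)/${GATEWAY_LABEL}"

  # Assumption: `launchctl print` fails when the service is not loaded.
  if "$LAUNCHCTL" print "$target" >/dev/null 2>&1; then
    "$LAUNCHCTL" kickstart -k "$target"
  else
    echo "LaunchAgent ${GATEWAY_LABEL} is not installed; run: clawdbot daemon install" >&2
    return 1
  fi
}
```

Calling `restart_gateway` on a machine without the LaunchAgent prints the hint and returns a non-zero status instead of surfacing a raw `launchctl` error.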
|
||||
|
||||
Details: [Gateway runbook](/gateway) and [Bundled bun Gateway](/platforms/mac/bun).
|
||||
## Node capabilities (mac)
|
||||
|
||||
## Purpose
|
||||
- Single macOS menu-bar app named **Clawdbot** that:
|
||||
- Shows native notifications for Clawdbot/clawdbot events.
|
||||
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Automation/AppleScript, Microphone, Speech Recognition).
|
||||
- Runs (or connects to) the **Gateway** and exposes itself as a **node** so agents can reach macOS‑only features.
|
||||
- Hosts **PeekabooBridge** for UI automation (consumed by `peekaboo`; see [`docs/mac/peekaboo.md`](/platforms/mac/peekaboo)).
|
||||
- Installs a single CLI (`clawdbot`) by symlinking the bundled binary.
|
||||
The macOS app presents itself as a node. Common commands:
|
||||
|
||||
## High-level design
|
||||
- SwiftPM package in `apps/macos/` (macOS 15+, Swift 6).
|
||||
- Targets:
|
||||
- `ClawdbotIPC` (shared Codable types + helpers for app‑internal actions).
|
||||
- `Clawdbot` (LSUIElement MenuBarExtra app; hosts Gateway + node bridge + PeekabooBridgeHost).
|
||||
- Bundle ID: `com.clawdbot.mac`.
|
||||
- Bundled runtime binaries live under `Contents/Resources/Relay/`:
|
||||
- `clawdbot` (bun‑compiled relay: CLI + gateway)
|
||||
- The app symlinks `clawdbot` into `/usr/local/bin` and `/opt/homebrew/bin`.
|
||||
|
||||
## Gateway + node bridge
|
||||
- The mac app runs the Gateway in **local** mode (unless configured remote).
|
||||
- The gateway port is configurable via `gateway.port` or `CLAWDBOT_GATEWAY_PORT` (default 18789). The mac app reads that value for launchd, probes, and remote SSH tunnels.
|
||||
- The mac app connects to the bridge as a **node** and advertises capabilities/commands.
|
||||
- Agent‑facing actions are exposed via `node.invoke` (no local control socket).
|
||||
- The mac app watches `~/.clawdbot/clawdbot.json` and switches modes live when `gateway.mode` or `gateway.remote.url` changes.
|
||||
- If `gateway.mode` is unset but `gateway.remote.url` is set, the mac app treats it as remote mode.
|
||||
- Changing connection mode in the mac app writes `gateway.mode` (and `gateway.remote.url` in remote mode) back to the config file.
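
The mode rules in the list above can be summarized in a small function. This is only a sketch of the described behavior, not the app's implementation: `resolve_gateway_mode` is a hypothetical helper, and two things are assumptions rather than statements from this document: an explicit `gateway.mode` always wins, and the fallback is local mode when neither value is set.

```bash
#!/usr/bin/env bash
# Sketch: decide the connection mode from two config values.
#   $1 = value of gateway.mode (may be empty)
#   $2 = value of gateway.remote.url (may be empty)
resolve_gateway_mode() {
  local mode="$1" remote_url="$2"
  if [ -n "$mode" ]; then
    # Assumption: an explicit gateway.mode is used as-is.
    printf '%s\n' "$mode"
  elif [ -n "$remote_url" ]; then
    # gateway.mode unset but gateway.remote.url set: treated as remote.
    printf 'remote\n'
  else
    # Assumption: fall back to local mode when nothing is configured.
    printf 'local\n'
  fi
}
```

For example, `resolve_gateway_mode "" "ws://gateway.example"` selects remote mode even though no mode was written to the config (the URL here is a placeholder).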
|
||||
|
||||
### Node commands (mac)
|
||||
- Canvas: `canvas.present|navigate|eval|snapshot|a2ui.*`
|
||||
- Camera: `camera.snap|camera.clip`
|
||||
- Canvas: `canvas.present`, `canvas.navigate`, `canvas.eval`, `canvas.snapshot`, `canvas.a2ui.*`
|
||||
- Camera: `camera.snap`, `camera.clip`
|
||||
- Screen: `screen.record`
|
||||
- System: `system.run` (shell) and `system.notify`
|
||||
- System: `system.run`, `system.notify`
|
||||
|
||||
### Permission advertising
|
||||
- Nodes include a `permissions` map in hello/pairing.
|
||||
- The Gateway surfaces it via `node.list` / `node.describe` so agents can decide what to run.
|
||||
The node reports a `permissions` map so agents can decide what’s allowed.
|
||||
|
||||
## CLI (`clawdbot`)
|
||||
- The **only** CLI is `clawdbot` (TS/bun). There is no `clawdbot-mac` helper.
|
||||
- For mac‑specific actions, the CLI uses `node.invoke`:
|
||||
- `clawdbot nodes canvas present|navigate|eval|snapshot|a2ui push|a2ui reset`
|
||||
- `clawdbot nodes run --node <id> -- <command...>`
|
||||
- `clawdbot nodes notify --node <id> --title ...`
|
||||
## Deep links
|
||||
|
||||
## Onboarding
|
||||
- Install CLI (symlink) → Permissions checklist → Test notification → Done.
|
||||
- Remote mode skips local gateway/CLI steps.
|
||||
- Selecting Local auto-enables the bundled Gateway via launchd (unless “Attach only” debug mode is enabled).
|
||||
|
||||
## Deep links (URL scheme)
|
||||
|
||||
Clawdbot (the macOS app) registers a URL scheme for triggering local actions from anywhere (browser, Shortcuts, CLI, etc.).
|
||||
|
||||
Scheme:
|
||||
- `clawdbot://…`
|
||||
The app registers the `clawdbot://` URL scheme for local actions.
|
||||
|
||||
### `clawdbot://agent`
|
||||
|
||||
Triggers a Gateway `agent` request (same machinery as WebChat/agent runs).
|
||||
|
||||
Example:
|
||||
Triggers a Gateway `agent` request.
|
||||
|
||||
```bash
|
||||
open 'clawdbot://agent?message=Hello%20from%20deep%20link'
|
||||
```
|
||||
|
||||
Query parameters:
|
||||
- `message` (required): the agent prompt (URL-encoded).
|
||||
- `sessionKey` (optional): explicit session key to use.
|
||||
- `thinking` (optional): thinking hint (e.g. `low`; omit for default).
|
||||
- `deliver` (optional): `true|false` (default: false).
|
||||
- `to` / `provider` (optional): forwarded to the Gateway `agent` method (only meaningful with `deliver=true`).
|
||||
- `timeoutSeconds` (optional): timeout hint forwarded to the Gateway.
|
||||
- `key` (optional): unattended mode key (see below).
|
||||
- `message` (required)
|
||||
- `sessionKey` (optional)
|
||||
- `thinking` (optional)
|
||||
- `deliver` / `to` / `provider` (optional)
|
||||
- `timeoutSeconds` (optional)
|
||||
- `key` (optional unattended mode key)
|
||||
|
||||
Safety/guardrails:
|
||||
- Always enabled.
|
||||
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
|
||||
- With `key=<value>`, Clawdbot runs without prompting (intended for personal automations).
|
||||
- The current key is shown in Debug Settings and stored locally in UserDefaults.
|
||||
Safety:
|
||||
- Without `key`, the app prompts for confirmation.
|
||||
- With a valid `key`, the run is unattended (intended for personal automations).
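
Because `message` must be URL-encoded, it can help to build the link with a helper rather than by hand. The sketch below is illustrative only: `urlencode` and `clawdbot_agent_url` are hypothetical bash functions (not part of the `clawdbot` CLI), the encoder handles ASCII input only, and only the `message` and `key` parameters are covered.

```bash
#!/usr/bin/env bash
# Sketch: build a clawdbot://agent deep link with a percent-encoded message.

urlencode() {
  # Percent-encode everything except RFC 3986 unreserved characters.
  # Limitation: intended for ASCII input only.
  local s="$1" out="" c hex i
  for ((i = 0; i < ${#s}; i++)); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) printf -v hex '%%%02X' "'$c"; out+="$hex" ;;
    esac
  done
  printf '%s' "$out"
}

clawdbot_agent_url() {
  #   $1 = message (required), $2 = unattended-mode key (optional)
  local message="$1" key="${2:-}"
  local url
  url="clawdbot://agent?message=$(urlencode "$message")"
  if [ -n "$key" ]; then
    url+="&key=$(urlencode "$key")"
  fi
  printf '%s\n' "$url"
}
```

On macOS the result can be passed straight to `open`, for example `open "$(clawdbot_agent_url 'Hello from deep link')"`; without a key, the app still prompts for confirmation.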
|
||||
|
||||
Notes:
|
||||
- In local mode, Clawdbot will start the local Gateway if needed before issuing the request.
|
||||
- In remote mode, Clawdbot will use the configured remote tunnel/endpoint.
|
||||
## Onboarding flow (typical)
|
||||
|
||||
1) Install and launch **Clawdbot.app**.
|
||||
2) Complete the permissions checklist (TCC prompts).
|
||||
3) Ensure **Local** mode is active and the Gateway is running.
|
||||
4) Install the CLI helper if you want terminal access.
|
||||
|
||||
## Build & dev workflow (native)
|
||||
- `cd apps/macos && swift build` (debug) / `swift build -c release`.
|
||||
- Run app for dev: `swift run Clawdbot` (or Xcode scheme).
|
||||
- Package app + CLI: [`scripts/package-mac-app.sh`](https://github.com/clawdbot/clawdbot/blob/main/scripts/package-mac-app.sh) (builds bun CLI + gateway).
|
||||
- Tests: add Swift Testing suites under `apps/macos/Tests`.
|
||||
|
||||
## Open questions / decisions
|
||||
- Should `system.run` support streaming stdout/stderr or keep buffered responses only?
|
||||
- Should we allow node‑side permission prompts, or always require explicit app UI action?
|
||||
- `cd apps/macos && swift build`
|
||||
- `swift run Clawdbot` (or Xcode)
|
||||
- Package app + CLI: `scripts/package-mac-app.sh`
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Gateway runbook](/gateway)
|
||||
- [Bundled bun Gateway](/platforms/mac/bun)
|
||||
- [macOS permissions](/platforms/mac/permissions)
|
||||
- [Canvas](/platforms/mac/canvas)
|
||||
|
||||