Files
clawdbot/docs/mac/canvas.md
2025-12-27 01:36:29 +01:00

162 lines
7.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
summary: "Agent-controlled Canvas panel embedded via WKWebView + custom URL scheme"
read_when:
- Implementing the macOS Canvas panel
- Adding agent controls for visual workspace
- Debugging WKWebView canvas loads
---
# Canvas (macOS app)
Status: draft spec · Date: 2025-12-12
Note: for iOS/Android nodes that should render agent-edited HTML/CSS/JS over the network, prefer the Gateway `canvasHost` (serves `~/clawd/canvas` over LAN/tailnet with live reload). A2UI is also **hosted by the Gateway** over HTTP. This doc focuses on the macOS in-app canvas panel. See `docs/configuration.md`.
Clawdis can embed an agent-controlled “visual workspace” panel (“Canvas”) inside the macOS app using `WKWebView`, served via a **custom URL scheme** (no loopback HTTP port required).
This is designed for:
- Agent-written HTML/CSS/JS on disk (per-session directory).
- A real browser engine for layout, rendering, and basic interactivity.
- Agent-driven visibility (show/hide), navigation, DOM/JS queries, and snapshots.
- Minimal chrome: borderless panel; bezel/chrome appears only on hover.
## Why a custom scheme (vs. loopback HTTP)
Using `WKURLSchemeHandler` keeps Canvas entirely in-process:
- No port conflicts and no extra local server lifecycle.
- Easier to sandbox: only serve files we explicitly map.
- Works offline and can use an ephemeral data store (no persistent cookies/cache).
If a Canvas page truly needs “real web” semantics (CORS, fetch to loopback endpoints, service workers), consider the loopback-server variant instead (out of scope for this doc).
## URL ↔ directory mapping
The Canvas scheme is:
- `clawdis-canvas://<session>/<path>`
Routing model:
- `clawdis-canvas://main/``<canvasRoot>/main/index.html` (or `index.htm`)
- `clawdis-canvas://main/yolo``<canvasRoot>/main/yolo/index.html` (or `index.htm`)
- `clawdis-canvas://main/assets/app.css``<canvasRoot>/main/assets/app.css`
Directory listings are not served.
When `/` has no `index.html` yet, the handler serves a **built-in scaffold page** (bundled with the macOS app).
This is a visual placeholder only (no A2UI renderer).
### Suggested on-disk location
Store Canvas state under the app support directory:
- `~/Library/Application Support/Clawdis/canvas/<session>/…`
This keeps it alongside other app-owned state and avoids mixing with `~/.clawdis/` gateway config.
## Panel behavior (agent-controlled)
Canvas is presented as a borderless `NSPanel` (similar to the existing WebChat panel):
- Can be shown/hidden at any time by the agent.
- Supports an “anchored” presentation (near the menu bar icon or another anchor rect).
- Uses a rounded container; shadow stays on, but **chrome/bezel only appears on hover**.
- Default position is the **top-right corner** of the current screens visible frame (unless the user moved/resized it previously).
- The panel is **user-resizable** (edge resize + hover resize handle) and the last frame is persisted per session.
### Hover-only chrome
Implementation notes:
- Keep the window borderless at all times (dont toggle `styleMask`).
- Add an overlay view inside the content container for chrome (stroke + subtle gradient/material).
- Use an `NSTrackingArea` to fade the chrome in/out on `mouseEntered/mouseExited`.
- Optionally show close/drag affordances only while hovered.
## Agent API surface (current)
Canvas is exposed via the Gateway **node bridge**, so the agent can:
- Show/hide the panel.
- Navigate to a path (relative to the session root).
- Evaluate JavaScript and optionally return results.
- Query/modify DOM (helpers mirroring “dom query/all/attr/click/type/wait” patterns).
- Capture a snapshot image of the current canvas view.
- Optionally set panel placement (screen `x/y` + `width/height`) when showing/navigating.
This should be modeled after `WebChatManager`/`WebChatSwiftUIWindowController` but targeting `clawdis-canvas://…` URLs.
Related:
- For “invoke the agent again from UI” flows, prefer the macOS deep link scheme (`clawdis://agent?...`) so *any* UI surface (Canvas, WebChat, native views) can trigger a new agent run. See `docs/clawdis-mac.md`.
## Agent commands (current)
Use the main `clawdis` CLI; it invokes canvas commands via `node.invoke`.
- `clawdis canvas present [--node <id>] [--target <...>] [--x/--y/--width/--height]`
- Local targets map into the session directory via the custom scheme (directory targets resolve `index.html|index.htm`).
- If `/` has no index file, Canvas shows the built-in scaffold page and returns `status: "welcome"`.
- `clawdis canvas hide [--node <id>]`
- `clawdis canvas eval --js <code> [--node <id>]`
- `clawdis canvas snapshot [--node <id>]`
### Canvas A2UI
Canvas A2UI is hosted by the **Gateway canvas host** at:
```
http://<gateway-host>:18793/__clawdis__/a2ui/
```
The macOS app simply renders that page in the Canvas panel. The agent can drive it with JSONL **server→client protocol messages** (one JSON object per line):
- `clawdis canvas a2ui push --jsonl <path> [--node <id>]`
- `clawdis canvas a2ui reset [--node <id>]`
`push` expects a JSONL file where **each line is a single JSON object** (parsed and forwarded to the in-page A2UI renderer).
Minimal example (v0.8):
```bash
cat > /tmp/a2ui-v0.8.jsonl <<'EOF'
{"surfaceUpdate":{"surfaceId":"main","components":[{"id":"root","component":{"Column":{"children":{"explicitList":["title","content"]}}}},{"id":"title","component":{"Text":{"text":{"literalString":"Canvas (A2UI v0.8)"},"usageHint":"h1"}}},{"id":"content","component":{"Text":{"text":{"literalString":"If you can read this, `canvas a2ui push` works."},"usageHint":"body"}}}]}}
{"beginRendering":{"surfaceId":"main","root":"root"}}
EOF
clawdis canvas a2ui push --jsonl /tmp/a2ui-v0.8.jsonl --node <id>
```
Notes:
- This does **not** support the A2UI v0.9 examples using `createSurface`.
- A2UI **fails** if the Gateway canvas host is unreachable (no local fallback).
- `canvas a2ui push` validates JSONL (line numbers on errors) and rejects v0.9 payloads.
- Quick smoke: `clawdis canvas a2ui push --text "Hello from A2UI"` renders a minimal v0.8 view.
## Triggering agent runs from Canvas (deep links)
Canvas can trigger new agent runs via the macOS app deep-link scheme:
- `clawdis://agent?...`
This is intentionally separate from `clawdis-canvas://…` (which is only for serving local Canvas files into the `WKWebView`).
Suggested patterns:
- HTML: render links/buttons that navigate to `clawdis://agent?message=...`.
- JS: set `window.location.href = 'clawdis://agent?...'` for “run this now” actions.
Implementation note (important):
- In `WKWebView`, intercept `clawdis://…` navigations in `WKNavigationDelegate` and forward them to the app, e.g. by calling `DeepLinkHandler.shared.handle(url:)` and returning `.cancel` for the navigation.
Safety:
- Deep links (`clawdis://agent?...`) are always enabled.
- Without a `key` query param, the app will prompt for confirmation before invoking the agent.
- With a valid `key`, the run is unattended (no prompt). For Canvas-originated actions, the app injects an internal key automatically.
## Security / guardrails
Recommended defaults:
- `WKWebsiteDataStore.nonPersistent()` for Canvas (ephemeral).
- Navigation policy: allow only `clawdis-canvas://…` (and optionally `about:blank`); open `http/https` externally.
- Scheme handler must prevent directory traversal: resolved file paths must stay under `<canvasRoot>/<session>/`.
- Disable or tightly scope any JS bridge; prefer query-string/bootstrap config over `window.webkit.messageHandlers` for sensitive data.
## Debugging
Suggested debugging hooks:
- Enable Web Inspector for Canvas builds (same approach as WebChat).
- Log scheme requests + resolution decisions to OSLog (subsystem `com.steipete.clawdis`, category `Canvas`).
- Provide a “copy canvas dir” action in debug settings to quickly reveal the session directory in Finder.