--- summary: "Integrated browser control server + action commands" read_when: - Adding agent-controlled browser automation - Debugging why clawd is interfering with your own Chrome - Implementing browser settings + lifecycle in the macOS app --- # Browser (clawd-managed) Clawdbot can run a **dedicated Chrome/Brave/Edge/Chromium profile** that the agent controls. It is isolated from your personal browser and is managed through a small local control server. Beginner view: - Think of it as a **separate, agent-only browser**. - The `clawd` profile does **not** touch your personal browser profile. - The agent can **open tabs, read pages, click, and type** in a safe lane. - The default `chrome` profile uses the **system default Chromium browser** via the extension relay; switch to `clawd` for the isolated managed browser. ## What you get - A separate browser profile named **clawd** (orange accent by default). - Deterministic tab control (list/open/focus/close). - Agent actions (click/type/drag/select), snapshots, screenshots, PDFs. - Optional multi-profile support (`clawd`, `work`, `remote`, ...). This browser is **not** your daily driver. It is a safe, isolated surface for agent automation and verification. ## Quick start ```bash clawdbot browser --browser-profile clawd status clawdbot browser --browser-profile clawd start clawdbot browser --browser-profile clawd open https://example.com clawdbot browser --browser-profile clawd snapshot ``` If you get “Browser disabled”, enable it in config (see below) and restart the Gateway. ## Profiles: `clawd` vs `chrome` - `clawd`: managed, isolated browser (no extension required). - `chrome`: extension relay to your **system browser** (requires the Clawdbot extension to be attached to a tab). Set `browser.defaultProfile: "clawd"` if you want managed mode by default. ## Configuration Browser settings live in `~/.clawdbot/clawdbot.json`. ```json5 { browser: { enabled: true, // default: true controlUrl: "http://127.0.0.1:18791", cdpUrl: "http://127.0.0.1:18792", // defaults to controlUrl + 1 remoteCdpTimeoutMs: 1500, // remote CDP HTTP timeout (ms) remoteCdpHandshakeTimeoutMs: 3000, // remote CDP WebSocket handshake timeout (ms) defaultProfile: "chrome", color: "#FF4500", headless: false, noSandbox: false, attachOnly: false, executablePath: "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser", profiles: { clawd: { cdpPort: 18800, color: "#FF4500" }, work: { cdpPort: 18801, color: "#0066CC" }, remote: { cdpUrl: "http://10.0.0.42:9222", color: "#00AA00" } } } } ``` Notes: - `controlUrl` defaults to `http://127.0.0.1:18791`. - If you override the Gateway port (`gateway.port` or `CLAWDBOT_GATEWAY_PORT`), the default browser ports shift to stay in the same “family” (control = gateway + 2). - `cdpUrl` defaults to `controlUrl + 1` when unset. - `remoteCdpTimeoutMs` applies to remote (non-loopback) CDP reachability checks. - `remoteCdpHandshakeTimeoutMs` applies to remote CDP WebSocket reachability checks. - `attachOnly: true` means “never launch a local browser; only attach if it is already running.” - `color` + per-profile `color` tint the browser UI so you can see which profile is active. - Default profile is `chrome` (extension relay). Use `defaultProfile: "clawd"` for the managed browser. - Auto-detect order: system default browser if Chromium-based; otherwise Chrome → Brave → Edge → Chromium → Chrome Canary. - Local `clawd` profiles auto-assign `cdpPort`/`cdpUrl` — set those only for remote CDP. ## Use Brave (or another Chromium-based browser) If your **system default** browser is Chromium-based (Chrome/Brave/Edge/etc), Clawdbot uses it automatically. Set `browser.executablePath` to override auto-detection: CLI example: ```bash clawdbot config set browser.executablePath "/usr/bin/google-chrome" ``` ```json5 // macOS { browser: { executablePath: "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser" } } // Windows { browser: { executablePath: "C:\\Program Files\\BraveSoftware\\Brave-Browser\\Application\\brave.exe" } } // Linux { browser: { executablePath: "/usr/bin/brave-browser" } } ``` ## Local vs remote control - **Local control (default):** `controlUrl` is loopback (`127.0.0.1`/`localhost`). The Gateway starts the control server and can launch a local browser. - **Remote control:** `controlUrl` is non-loopback. The Gateway **does not** start a local server; it assumes you are pointing at an existing server elsewhere. - **Remote CDP:** set `browser.profiles..cdpUrl` (or `browser.cdpUrl`) to attach to a remote Chromium-based browser. In this case, Clawdbot will not launch a local browser. ## Remote browser (control server) You can run the **browser control server** on another machine and point your Gateway at it with a remote `controlUrl`. This lets the agent drive a browser outside the host (lab box, VM, remote desktop, etc.). Key points: - The **control server** speaks to Chromium-based browsers (Chrome/Brave/Edge/Chromium) via **CDP**. - The **Gateway** only needs the HTTP control URL. - Profiles are resolved on the **control server** side. Example: ```json5 { browser: { enabled: true, controlUrl: "http://10.0.0.42:18791", defaultProfile: "work" } } ``` Use `profiles..cdpUrl` for **remote CDP** if you want the Gateway to talk directly to a Chromium-based browser instance without a remote control server. Remote CDP URLs can include auth: - Query tokens (e.g., `https://provider.example?token=`) - HTTP Basic auth (e.g., `https://user:pass@provider.example`) Clawdbot preserves the auth when calling `/json/*` endpoints and when connecting to the CDP WebSocket. Prefer environment variables or secrets managers for tokens instead of committing them to config files. ### Browserless (hosted remote CDP) [Browserless](https://browserless.io) is a hosted Chromium service that exposes CDP endpoints over HTTPS. You can point a Clawdbot browser profile at a Browserless region endpoint and authenticate with your API key. Example: ```json5 { browser: { enabled: true, defaultProfile: "browserless", remoteCdpTimeoutMs: 2000, remoteCdpHandshakeTimeoutMs: 4000, profiles: { browserless: { cdpUrl: "https://production-sfo.browserless.io?token=", color: "#00AA00" } } } } ``` Notes: - Replace `` with your real Browserless token. - Choose the region endpoint that matches your Browserless account (see their docs). ### Running the control server on the browser machine Run a standalone browser control server (recommended when your Gateway is remote): ```bash # on the machine that runs Chrome/Brave/Edge clawdbot browser serve --bind --port 18791 --token ``` Then point your Gateway at it: ```json5 { browser: { enabled: true, controlUrl: "http://:18791", // Option A (recommended): keep token in env on the Gateway // (avoid writing secrets into config files) // controlToken: "" } } ``` And set the auth token in the Gateway environment: ```bash export CLAWDBOT_BROWSER_CONTROL_TOKEN="" ``` Option B: store the token in the Gateway config instead (same shared token): ```json5 { browser: { enabled: true, controlUrl: "http://:18791", controlToken: "" } } ``` ## Security This section covers the **browser control server** (`browser.controlUrl`) used for agent browser automation. Key ideas: - Treat the browser control server like an admin API: **private network only**. - Use **token auth** always when the server is reachable off-machine. - Prefer **Tailnet-only** connectivity over LAN exposure. ### Tokens (what is shared with what?) - `browser.controlToken` / `CLAWDBOT_BROWSER_CONTROL_TOKEN` is **only** for authenticating browser control HTTP requests to `browser.controlUrl`. - It is **not** the Gateway token (`gateway.auth.token`) and **not** a node pairing token. - You *can* reuse the same string value, but it’s better to keep them separate to reduce blast radius. ### Binding (don’t expose to your LAN by accident) Recommended: - Keep `clawdbot browser serve` bound to loopback (`127.0.0.1`) and publish it via Tailscale. - Or bind to a Tailnet IP only (never `0.0.0.0`) and require a token. Avoid: - `--bind 0.0.0.0` (LAN-visible). Even with token auth, traffic is plain HTTP unless you also add TLS. ### TLS / HTTPS (recommended approach: terminate in front) Best practice here: keep `clawdbot browser serve` on HTTP and terminate TLS in front. If you’re already using Tailscale, you have two good options: 1) **Tailnet-only, still HTTP** (transport is encrypted by Tailscale): - Keep `controlUrl` as `http://…` but ensure it’s only reachable over your tailnet. 2) **Serve HTTPS via Tailscale** (nice UX: `https://…` URL): ```bash # on the browser machine clawdbot browser serve --bind 127.0.0.1 --port 18791 --token tailscale serve https / http://127.0.0.1:18791 ``` Then set your Gateway config `browser.controlUrl` to the HTTPS URL (MagicDNS/ts.net) and keep using the same token. Notes: - Do **not** use Tailscale Funnel for this unless you explicitly want to make the endpoint public. - For Tailnet setup/background, see [Gateway web surfaces](/web/index) and the [Gateway CLI](/cli/gateway). ## Profiles (multi-browser) Clawdbot supports multiple named profiles (routing configs). Profiles can be: - **clawd-managed**: a dedicated Chromium-based browser instance with its own user data directory + CDP port - **remote**: an explicit CDP URL (Chromium-based browser running elsewhere) - **extension relay**: your existing Chrome tab(s) via the local relay + Chrome extension Defaults: - The `clawd` profile is auto-created if missing. - The `chrome` profile is built-in for the Chrome extension relay (points at `http://127.0.0.1:18792` by default). - Local CDP ports allocate from **18800–18899** by default. - Deleting a profile moves its local data directory to Trash. All control endpoints accept `?profile=`; the CLI uses `--browser-profile`. ## Chrome extension relay (use your existing Chrome) Clawdbot can also drive **your existing Chrome tabs** (no separate “clawd” Chrome instance) via a local CDP relay + a Chrome extension. Full guide: [Chrome extension](/tools/chrome-extension) Flow: - You run a **browser control server** (Gateway on the same machine, or `clawdbot browser serve`). - A local **relay server** listens at a loopback `cdpUrl` (default: `http://127.0.0.1:18792`). - You click the **Clawdbot Browser Relay** extension icon on a tab to attach (it does not auto-attach). - The agent controls that tab via the normal `browser` tool, by selecting the right profile. If the Gateway runs on the same machine as Chrome (default setup), you usually **do not** need `clawdbot browser serve`. Use `browser serve` only when the Gateway runs elsewhere (remote mode). ### Sandboxed sessions If the agent session is sandboxed, the `browser` tool may default to `target="sandbox"` (sandbox browser). Chrome extension relay takeover requires host browser control, so either: - run the session unsandboxed, or - set `agents.defaults.sandbox.browser.allowHostControl: true` and use `target="host"` when calling the tool. ### Setup 1) Load the extension (dev/unpacked): ```bash clawdbot browser extension install ``` - Chrome → `chrome://extensions` → enable “Developer mode” - “Load unpacked” → select the directory printed by `clawdbot browser extension path` - Pin the extension, then click it on the tab you want to control (badge shows `ON`). 2) Use it: - CLI: `clawdbot browser --browser-profile chrome tabs` - Agent tool: `browser` with `profile="chrome"` Optional: if you want a different name or relay port, create your own profile: ```bash clawdbot browser create-profile \ --name my-chrome \ --driver extension \ --cdp-url http://127.0.0.1:18792 \ --color "#00AA00" ``` Notes: - This mode relies on Playwright-on-CDP for most operations (screenshots/snapshots/actions). - Detach by clicking the extension icon again. ## Isolation guarantees - **Dedicated user data dir**: never touches your personal browser profile. - **Dedicated ports**: avoids `9222` to prevent collisions with dev workflows. - **Deterministic tab control**: target tabs by `targetId`, not “last tab”. ## Browser selection When launching locally, Clawdbot picks the first available: 1. Chrome 2. Brave 3. Edge 4. Chromium 5. Chrome Canary You can override with `browser.executablePath`. Platforms: - macOS: checks `/Applications` and `~/Applications`. - Linux: looks for `google-chrome`, `brave`, `microsoft-edge`, `chromium`, etc. - Windows: checks common install locations. ## Control API (optional) If you want to integrate directly, the browser control server exposes a small HTTP API: - Status/start/stop: `GET /`, `POST /start`, `POST /stop` - Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId` - Snapshot/screenshot: `GET /snapshot`, `POST /screenshot` - Actions: `POST /navigate`, `POST /act` - Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog` - Downloads: `POST /download`, `POST /wait/download` - Debugging: `GET /console`, `POST /pdf` - Debugging: `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight` - Network: `POST /response/body` - State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear` - State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear` - Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device` All endpoints accept `?profile=`. ### Playwright requirement Some features (navigate/act/AI snapshot/role snapshot, element screenshots, PDF) require Playwright. If Playwright isn’t installed, those endpoints return a clear 501 error. ARIA snapshots and basic screenshots still work for clawd-managed Chrome. For the Chrome extension relay driver, ARIA snapshots and screenshots require Playwright. If you see `Playwright is not available in this gateway build`, install the full Playwright package (not `playwright-core`) and restart the gateway, or reinstall Clawdbot with browser support. ## How it works (internal) High-level flow: - A small **control server** accepts HTTP requests. - It connects to Chromium-based browsers (Chrome/Brave/Edge/Chromium) via **CDP**. - For advanced actions (click/type/snapshot/PDF), it uses **Playwright** on top of CDP. - When Playwright is missing, only non-Playwright operations are available. This design keeps the agent on a stable, deterministic interface while letting you swap local/remote browsers and profiles. ## CLI quick reference All commands accept `--browser-profile ` to target a specific profile. All commands also accept `--json` for machine-readable output (stable payloads). Basics: - `clawdbot browser status` - `clawdbot browser start` - `clawdbot browser stop` - `clawdbot browser tabs` - `clawdbot browser tab` - `clawdbot browser tab new` - `clawdbot browser tab select 2` - `clawdbot browser tab close 2` - `clawdbot browser open https://example.com` - `clawdbot browser focus abcd1234` - `clawdbot browser close abcd1234` Inspection: - `clawdbot browser screenshot` - `clawdbot browser screenshot --full-page` - `clawdbot browser screenshot --ref 12` - `clawdbot browser screenshot --ref e12` - `clawdbot browser snapshot` - `clawdbot browser snapshot --format aria --limit 200` - `clawdbot browser snapshot --interactive --compact --depth 6` - `clawdbot browser snapshot --efficient` - `clawdbot browser snapshot --labels` - `clawdbot browser snapshot --selector "#main" --interactive` - `clawdbot browser snapshot --frame "iframe#main" --interactive` - `clawdbot browser console --level error` - `clawdbot browser errors --clear` - `clawdbot browser requests --filter api --clear` - `clawdbot browser pdf` - `clawdbot browser responsebody "**/api" --max-chars 5000` Actions: - `clawdbot browser navigate https://example.com` - `clawdbot browser resize 1280 720` - `clawdbot browser click 12 --double` - `clawdbot browser click e12 --double` - `clawdbot browser type 23 "hello" --submit` - `clawdbot browser press Enter` - `clawdbot browser hover 44` - `clawdbot browser scrollintoview e12` - `clawdbot browser drag 10 11` - `clawdbot browser select 9 OptionA OptionB` - `clawdbot browser download e12 /tmp/report.pdf` - `clawdbot browser waitfordownload /tmp/report.pdf` - `clawdbot browser upload /tmp/file.pdf` - `clawdbot browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'` - `clawdbot browser dialog --accept` - `clawdbot browser wait --text "Done"` - `clawdbot browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"` - `clawdbot browser evaluate --fn '(el) => el.textContent' --ref 7` - `clawdbot browser highlight e12` - `clawdbot browser trace start` - `clawdbot browser trace stop` State: - `clawdbot browser cookies` - `clawdbot browser cookies set session abc123 --url "https://example.com"` - `clawdbot browser cookies clear` - `clawdbot browser storage local get` - `clawdbot browser storage local set theme dark` - `clawdbot browser storage session clear` - `clawdbot browser set offline on` - `clawdbot browser set headers --json '{"X-Debug":"1"}'` - `clawdbot browser set credentials user pass` - `clawdbot browser set credentials --clear` - `clawdbot browser set geo 37.7749 -122.4194 --origin "https://example.com"` - `clawdbot browser set geo --clear` - `clawdbot browser set media dark` - `clawdbot browser set timezone America/New_York` - `clawdbot browser set locale en-US` - `clawdbot browser set device "iPhone 14"` Notes: - `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog. - `upload` can also set file inputs directly via `--input-ref` or `--element`. - `snapshot`: - `--format ai` (default when Playwright is installed): returns an AI snapshot with numeric refs (`aria-ref=""`). - `--format aria`: returns the accessibility tree (no refs; inspection only). - `--efficient` (or `--mode efficient`): compact role snapshot preset (interactive + compact + depth + lower maxChars). - Role snapshot options (`--interactive`, `--compact`, `--depth`, `--selector`) force a role-based snapshot with refs like `ref=e12`. - `--frame "