docs: reorganize documentation structure
21
docs/tools/agent-send.md
Normal file
@@ -0,0 +1,21 @@
---
summary: "Design notes for a direct `clawdbot agent` CLI subcommand without WhatsApp delivery"
read_when:
  - Adding or modifying the agent CLI entrypoint
---
# `clawdbot agent` (direct-to-agent invocation)

`clawdbot agent` lets you talk to the **embedded** agent runtime directly (no chat send unless you opt in), while reusing the same session store and thinking/verbose persistence as inbound auto-replies.

## Behavior
- Required: `--message <text>`
- Session selection:
  - If `--session-id` is given, reuse it.
  - Else if `--to <e164>` is given, derive the session key from `session.scope` (direct chats collapse to `main`, or `global` when scope is global).
- Runs the embedded Pi agent (configured via `agent`).
- Thinking/verbose:
  - Flags `--thinking <off|minimal|low|medium|high>` and `--verbose <on|off>` persist into the session store.
- Output:
  - Default: prints text (and `MEDIA:<url>` lines) to stdout.
  - `--json`: prints structured payloads + meta.
- Optional: `--deliver` sends the reply back to the selected provider (`whatsapp`, `telegram`, `discord`, `signal`, `imessage`).
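
The session-selection rules above can be sketched as a small function. This is only an illustration, not the actual implementation: the function name, the `"per-sender"` scope value, and the error raised when neither flag is given are assumptions that the notes above do not specify.

```python
from typing import Optional


def resolve_session_key(
    session_id: Optional[str],
    to: Optional[str],
    scope: str = "per-sender",  # hypothetical non-global scope name
) -> str:
    """Pick the session for one `clawdbot agent` run (illustrative only)."""
    if session_id:
        # An explicit --session-id always wins and is reused as-is.
        return session_id
    if to:
        # With --to, the key follows session.scope: a global scope maps to
        # "global"; otherwise a direct chat collapses to "main".
        return "global" if scope == "global" else "main"
    # Assumption: the notes do not say what happens when neither flag is given.
    raise ValueError("need --session-id or --to")
```

For example, `resolve_session_key(None, "+15551234567")` yields `main`, while passing `scope="global"` yields `global`.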
32
docs/tools/bash.md
Normal file
@@ -0,0 +1,32 @@
---
summary: "Bash tool usage, stdin modes, and TTY support"
read_when:
  - Using or modifying the bash tool
  - Debugging stdin or TTY behavior
---

# Bash tool

Run shell commands in the workspace. Supports foreground + background execution via `process`.

## Parameters

- `command` (required)
- `yieldMs` (default 10000): auto-background after delay
- `background` (bool): background immediately
- `timeout` (seconds, default 1800): kill on expiry
- `elevated` (bool): run on host if elevated mode is enabled/allowed
- Need a real TTY? Use the tmux skill.

## Examples

Foreground:
```json
{"tool":"bash","command":"ls -la"}
```

Background + poll:
```json
{"tool":"bash","command":"npm run build","yieldMs":1000}
{"tool":"process","action":"poll","sessionId":"<id>"}
```
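
To make the `yieldMs` / `background` interplay concrete, here is a rough Python sketch of how a runner might wait for the yield window and then hand the process over to a background session. It is an assumption-laden illustration, not the tool's code: `run_bash`, the `SESSIONS` registry, and the `completed` / `exitCode` / `output` field names are hypothetical, the command is taken as an argument list rather than a shell string, and the `timeout` kill is left out for brevity.

```python
import subprocess
import uuid
from typing import Dict, List

# Hypothetical in-memory registry of backgrounded processes, keyed by session id.
SESSIONS: Dict[str, subprocess.Popen] = {}


def run_bash(command: List[str], yield_ms: int = 10000, background: bool = False) -> dict:
    """Run a command; return a session id if it outlives the yield window."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if not background:
        try:
            # Wait up to yieldMs for the command to finish in the foreground.
            output, _ = proc.communicate(timeout=yield_ms / 1000)
            return {"status": "completed", "exitCode": proc.returncode, "output": output}
        except subprocess.TimeoutExpired:
            pass  # Still running: fall through and background it.
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = proc
    return {"status": "running", "sessionId": session_id}
```

A caller that receives `status: "running"` would then poll the session, as in the `process` example above.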
114
docs/tools/browser-linux-troubleshooting.md
Normal file
@@ -0,0 +1,114 @@
---
summary: "Fix Chrome/Chromium CDP startup issues for Clawdbot browser control on Linux"
read_when: "Browser control fails on Linux, especially with snap Chromium"
---

# Browser Troubleshooting (Linux)

## Problem: "Failed to start Chrome CDP on port 18800"

Clawdbot's browser control server fails to launch Chrome/Chromium with the error:
```
{"error":"Error: Failed to start Chrome CDP on port 18800 for profile \"clawd\"."}
```

### Root Cause

On Ubuntu (and many Linux distros), the default Chromium installation is a **snap package**. Snap's AppArmor confinement interferes with how Clawdbot spawns and monitors the browser process.

The `apt install chromium` command installs a stub package that redirects to snap:
```
Note, selecting 'chromium-browser' instead of 'chromium'
chromium-browser is already the newest version (2:1snap1-0ubuntu2).
```

This is NOT a real browser; it's just a wrapper.

### Solution 1: Install Google Chrome (Recommended)

Install the official Google Chrome `.deb` package, which is not sandboxed by snap:

```bash
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
sudo apt --fix-broken install -y  # if there are dependency errors
```

Then update your Clawdbot config (`~/.clawdbot/clawdbot.json`):

```json
{
  "browser": {
    "enabled": true,
    "executablePath": "/usr/bin/google-chrome-stable",
    "headless": true,
    "noSandbox": true
  }
}
```

### Solution 2: Use Snap Chromium with Attach-Only Mode

If you must use snap Chromium, configure Clawdbot to attach to a manually started browser:

1. Update config:
```json
{
  "browser": {
    "enabled": true,
    "attachOnly": true,
    "headless": true,
    "noSandbox": true
  }
}
```

2. Start Chromium manually:
```bash
chromium-browser --headless --no-sandbox --disable-gpu \
  --remote-debugging-port=18800 \
  --user-data-dir=$HOME/.clawdbot/browser/clawd/user-data \
  about:blank &
```

3. Optionally create a systemd user service to auto-start Chromium:
```ini
# ~/.config/systemd/user/clawd-browser.service
[Unit]
Description=Clawd Browser (Chrome CDP)
After=network.target

[Service]
ExecStart=/snap/bin/chromium --headless --no-sandbox --disable-gpu --remote-debugging-port=18800 --user-data-dir=%h/.clawdbot/browser/clawd/user-data about:blank
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
```

Enable with: `systemctl --user enable --now clawd-browser.service`

### Verifying the Browser Works

Check status:
```bash
curl -s http://127.0.0.1:18791/ | jq '{running, pid, chosenBrowser}'
```

Test browsing:
```bash
curl -s -X POST http://127.0.0.1:18791/start
curl -s http://127.0.0.1:18791/tabs
```
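
If `jq` is not installed, the same status check can be approximated in Python. The sketch below only mirrors the `{running, pid, chosenBrowser}` projection from the status command; the function name is hypothetical, and fields beyond those three are not assumed to exist.

```python
import json
from typing import Any, Dict


def summarize_status(raw: str) -> Dict[str, Any]:
    """Keep only the fields the jq filter selects; missing keys become None."""
    payload = json.loads(raw)
    return {key: payload.get(key) for key in ("running", "pid", "chosenBrowser")}
```

Feed it the body returned by `curl -s http://127.0.0.1:18791/`; a `running` value of `false` (or `None`) suggests the browser did not start.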

### Config Reference

| Option | Description | Default |
|--------|-------------|---------|
| `browser.enabled` | Enable browser control | `true` |
| `browser.executablePath` | Path to Chrome/Chromium binary | auto-detected |
| `browser.headless` | Run without GUI | `false` |
| `browser.noSandbox` | Add `--no-sandbox` flag (needed for some Linux setups) | `false` |
| `browser.attachOnly` | Don't launch browser, only attach to existing | `false` |
| `browser.cdpPort` | Chrome DevTools Protocol port | `18800` |
309
docs/tools/browser.md
Normal file
@@ -0,0 +1,309 @@
---
summary: "Spec: integrated browser control server + action commands"
read_when:
  - Adding agent-controlled browser automation
  - Debugging why clawd is interfering with your own Chrome
  - Implementing browser settings + lifecycle in the macOS app
---

# Browser (integrated): clawd-managed Chrome

Status: draft spec · Date: 2025-12-20

Goal: give the **clawd** persona its own browser that is:
- Visually distinct (lobster-orange, profile labeled "clawd").
- Fully agent-manageable (start/stop, list tabs, focus/close tabs, open URLs, screenshot).
- Non-interfering with the user's own browser (separate profile + dedicated ports).

This doc covers the macOS app/gateway side. It intentionally does not mandate
Playwright vs Puppeteer; the key is the **contract** and the **separation guarantees**.

## User-facing settings

Add a dedicated settings section (preferably under **Skills** or its own "Browser" tab):

- **Enable clawd browser** (`default: on`)
  - When off: no browser is launched, and browser tools return "disabled".
- **Browser control URL** (`default: http://127.0.0.1:18791`)
  - Interpreted as the base URL of the local/remote browser-control server.
  - If the URL host is not loopback, Clawdbot must **not** attempt to launch a local
    browser; it only connects.
- **CDP URL** (`default: controlUrl + 1`)
  - Base URL for Chrome DevTools Protocol (e.g. `http://127.0.0.1:18792`).
  - Set this to a non-loopback host to attach the local control server to a remote
    Chrome/Chromium CDP endpoint (SSH/Tailscale tunnel recommended).
  - If the CDP URL host is non-loopback, clawd does **not** auto-launch a local browser.
  - If you tunnel a remote CDP to `localhost`, set **Attach to existing only** to
    avoid accidentally launching a local browser.
- **Accent color** (`default: #FF4500`, "lobster-orange")
  - Used to theme the clawd browser profile (best-effort) and to tint UI indicators
    in Clawdbot.

Optional (advanced, can be hidden behind Debug initially):
- **Use headless browser** (`default: off`)
- **Attach to existing only** (`default: off`): if on, never launch; only connect if
  already running.
- **Browser executable path** (override, optional)
- **No sandbox** (`default: off`): adds `--no-sandbox` + `--disable-setuid-sandbox`

### Port convention

Clawdbot already uses:
- Gateway WebSocket: `18789`
- Bridge (voice/node): `18790`

For the clawd browser-control server, use "family" ports:
- Browser control HTTP API: `18791` (bridge + 1)
- Browser CDP/debugging port: `18792` (control + 1)
- Canvas host HTTP: `18793` by default, mounted at `/__clawdbot__/canvas/`

The user usually only configures the **control URL** (port `18791`). CDP is an
internal detail.
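
The "control + 1" convention and the loopback rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped logic: the function names are hypothetical, and hostnames other than `localhost` are treated as remote.

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit


def default_cdp_url(control_url: str) -> str:
    """Derive the default CDP URL by bumping the control URL's port by one."""
    parts = urlsplit(control_url)
    if parts.port is None:
        raise ValueError("control URL needs an explicit port")
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # Re-bracket IPv6 literals.
    return urlunsplit((parts.scheme, f"{host}:{parts.port + 1}", parts.path, "", ""))


def is_loopback(url: str) -> bool:
    """True when the URL host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # Assumption: other hostnames count as remote.


def may_launch_local_browser(control_url: str, cdp_url: str, attach_only: bool = False) -> bool:
    """Launch locally only if both URLs are loopback and attach-only is off."""
    return not attach_only and is_loopback(control_url) and is_loopback(cdp_url)
```

With the defaults, `default_cdp_url("http://127.0.0.1:18791")` gives `http://127.0.0.1:18792`, and a tunneled remote CDP on `localhost` still needs `attach_only=True` to suppress a local launch.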

## Browser isolation guarantees (non-negotiable)

1) **Dedicated user data dir**
   - Never attach to or reuse the user's default Chrome profile.
   - Store clawd browser state under an app-owned directory, e.g.:
     - `~/Library/Application Support/Clawdbot/browser/clawd/` (mac app)
     - or `~/.clawdbot/browser/clawd/` (gateway/CLI)

2) **Dedicated ports**
   - Never use `9222` (reserved for ad-hoc dev workflows; avoids colliding with
     `agent-tools/browser-tools`).
   - Default ports are `18791/18792` unless overridden.

3) **Named tab/page management**
   - The agent must be able to enumerate and target tabs deterministically (by
     stable `targetId` or equivalent), not "last tab".

## Browser selection (macOS + Linux)

On startup (when enabled + local URL), Clawdbot chooses the browser executable
in this order:
1) **Google Chrome Canary** (if installed)
2) **Chromium** (if installed)
3) **Google Chrome** (fallback)

Linux:
- Looks for `google-chrome` / `chromium` in common system paths.
- Use **Browser executable path** to force a specific binary.

Implementation detail:
- macOS: detection is by existence of the `.app` bundle under `/Applications`
  (and optionally `~/Applications`), then using the resolved executable path.
- Linux: common `/usr/bin`/`/snap/bin` paths.

Rationale:
- Canary/Chromium are easy to visually distinguish from the user's daily driver.
- Chrome fallback ensures the feature works on a stock machine.
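
The selection order above amounts to a first-match lookup. The sketch below assumes the caller has already detected which browsers are installed; the function name and the plain-string identifiers are illustrative only.

```python
from typing import Iterable, Optional

# Preference order from the spec: Canary, then Chromium, then stable Chrome.
PREFERENCE = ("Google Chrome Canary", "Chromium", "Google Chrome")


def choose_browser(installed: Iterable[str], override: Optional[str] = None) -> Optional[str]:
    """Return the override if set, else the first preferred browser that is installed."""
    if override:
        # "Browser executable path" forces a specific binary.
        return override
    available = set(installed)
    for name in PREFERENCE:
        if name in available:
            return name
    return None
```

On a stock machine with only Google Chrome installed, this falls through to the third entry, which matches the fallback rationale.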

## Visual differentiation ("lobster-orange")

The clawd browser should be obviously different at a glance:
- Profile name: **clawd**
- Profile color: **#FF4500**

Preferred behavior:
- Seed/patch the profile's preferences on first launch so the color + name persist.

Fallback behavior:
- If preferences patching is not reliable, open with the dedicated profile and let
  the user set the profile color/name once via Chrome UI; it must persist because
  the `userDataDir` is persistent.

## Control server contract (vNext)

Expose a small local HTTP API (and/or gateway RPC surface) so the agent can manage
state without touching the user's Chrome.

Basics:
- `GET /` status payload (enabled/running/pid/cdpPort/etc)
- `POST /start` start browser
- `POST /stop` stop browser
- `GET /tabs` list tabs
- `POST /tabs/open` open a new tab
- `POST /tabs/focus` focus a tab by id/prefix
- `DELETE /tabs/:targetId` close a tab by id/prefix

Inspection:
- `POST /screenshot` `{ targetId?, fullPage?, ref?, element?, type? }`
- `GET /snapshot` `?format=aria|ai&targetId?&limit?`
- `GET /console` `?level?&targetId?`
- `POST /pdf` `{ targetId? }`

Actions:
- `POST /navigate`
- `POST /act` `{ kind, targetId?, ... }` where `kind` is one of:
  - `click`, `type`, `press`, `hover`, `drag`, `select`, `fill`, `wait`, `resize`, `close`, `evaluate`

Hooks (arming):
- `POST /hooks/file-chooser` `{ targetId?, paths, timeoutMs? }`
- `POST /hooks/dialog` `{ targetId?, accept, promptText?, timeoutMs? }`

### "Is it open or closed?"

"Open" means:
- the control server is reachable at the configured URL **and**
- it reports a live browser connection.

"Closed" means:
- control server not reachable, or server reports no browser.

Clawdbot should determine "open/closed" through this health check (fast path), not by
scanning global Chrome processes, which risks false positives.
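
The open/closed definition reduces to a two-condition check. In this sketch, `None` stands for "control server unreachable", and the use of the status payload's `running` field as the signal for a live browser connection is an assumption.

```python
from typing import Optional


def browser_state(status: Optional[dict]) -> str:
    """Map a status payload (None when the server is unreachable) to open/closed."""
    if status is None:
        return "closed"  # Control server not reachable.
    # Assumption: `running` signals a live browser connection.
    return "open" if status.get("running") else "closed"
```

A caller would pass the parsed `GET /` response, or `None` after a connection error or timeout.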

## Multi-profile support

Clawdbot supports multiple named browser profiles, each with:
- Dedicated CDP port (auto-allocated from 18800-18899) **or** a per-profile CDP URL
- Persistent user data directory (`~/.clawdbot/browser/<name>/user-data/`)
- Unique color for visual distinction

### Configuration

```json
{
  "browser": {
    "enabled": true,
    "defaultProfile": "clawd",
    "profiles": {
      "clawd": { "cdpPort": 18800, "color": "#FF4500" },
      "work": { "cdpPort": 18801, "color": "#0066CC" },
      "remote": { "cdpUrl": "http://10.0.0.42:9222", "color": "#00AA00" }
    }
  }
}
```

### Profile actions

- `GET /profiles`: list all profiles with status
- `POST /profiles/create` `{ name, color?, cdpUrl? }`: create new profile (auto-allocates port if no `cdpUrl`)
- `DELETE /profiles/:name`: delete profile (stops browser + removes user data for local profiles)
- `POST /reset-profile?profile=<name>`: kill orphan process on profile's port (local profiles only)

### Profile parameter

All existing endpoints accept an optional `?profile=<name>` query parameter:
- `GET /?profile=work`: status for work profile
- `POST /start?profile=work`: start work profile browser
- `GET /tabs?profile=work`: list tabs for work profile
- `POST /tabs/open?profile=work`: open tab in work profile
- etc.

When `profile` is omitted, the server uses `browser.defaultProfile` (defaults to "clawd").

### Agent browser tool

The `browser` tool accepts an optional `profile` parameter for all actions:

```json
{
  "action": "open",
  "targetUrl": "https://example.com",
  "profile": "work"
}
```

This routes the operation to the specified profile's browser instance. Omitting
`profile` uses the default profile.

### Profile naming rules

- Lowercase alphanumeric characters and hyphens only
- Must start with a letter or number (not a hyphen)
- Maximum 64 characters
- Examples: `clawd`, `work`, `my-project-1`
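
The naming rules can be captured in a single regular expression. This is one possible encoding of the three rules, not necessarily the validator the project uses; the names below are illustrative.

```python
import re

# One leading [a-z0-9], then up to 63 more of [a-z0-9-]: 64 characters at most.
PROFILE_NAME = re.compile(r"[a-z0-9][a-z0-9-]{0,63}")


def is_valid_profile_name(name: str) -> bool:
    """Check a profile name against the documented rules."""
    return PROFILE_NAME.fullmatch(name) is not None
```

Names such as `clawd` and `my-project-1` pass, while `-work`, `Work`, and `my_project` are rejected.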

### Port allocation

Ports are allocated from the range 18800-18899 (100 ports, so at most 100 local profiles).
This is far more than practical use requires; memory and CPU exhaustion occur well before
port exhaustion.
Ports are allocated once at profile creation and persisted permanently.
Remote profiles are attach-only and do **not** use the local port range.
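
A possible allocation routine, assuming a "lowest free port wins" policy (the spec states the range but not the selection order). Profiles that only carry a `cdpUrl` are remote and therefore skipped; the function name is hypothetical.

```python
from typing import Dict, Optional

PORT_RANGE = range(18800, 18900)  # 18800-18899 inclusive


def allocate_port(profiles: Dict[str, dict]) -> Optional[int]:
    """Return the lowest unused local CDP port, or None if the range is exhausted."""
    # Remote profiles have no cdpPort and do not consume the local range.
    used = {p["cdpPort"] for p in profiles.values() if "cdpPort" in p}
    for port in PORT_RANGE:
        if port not in used:
            return port
    return None
```

Given the `clawd`, `work`, and `remote` profiles from the configuration example, the next local profile would receive `18802`.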

## Interaction with the agent (clawd)

The agent should use browser tools only when:
- the browser is enabled in settings
- a control URL is configured

If disabled, tools must fail fast with a friendly error ("Browser disabled in settings").

The agent should not assume tabs are ephemeral. It should:
- call `browser.tabs.list` to discover existing tabs first
- reuse an existing tab when appropriate (e.g. a persistent "main" tab)
- avoid opening duplicate tabs unless asked

## CLI quick reference (one example each)

All commands accept `--browser-profile <name>` to target a specific profile (default: `clawd`).

Profile management:
- `clawdbot browser profiles`
- `clawdbot browser create-profile --name work`
- `clawdbot browser create-profile --name remote --cdp-url http://10.0.0.42:9222`
- `clawdbot browser delete-profile --name work`

Basics:
- `clawdbot browser status`
- `clawdbot browser start`
- `clawdbot browser stop`
- `clawdbot browser reset-profile`
- `clawdbot browser tabs`
- `clawdbot browser open https://example.com`
- `clawdbot browser focus abcd1234`
- `clawdbot browser close abcd1234`

Inspection:
- `clawdbot browser screenshot`
- `clawdbot browser screenshot --full-page`
- `clawdbot browser screenshot --ref 12`
- `clawdbot browser snapshot`
- `clawdbot browser snapshot --format aria --limit 200`

Actions:
- `clawdbot browser navigate https://example.com`
- `clawdbot browser resize 1280 720`
- `clawdbot browser click 12 --double`
- `clawdbot browser type 23 "hello" --submit`
- `clawdbot browser press Enter`
- `clawdbot browser hover 44`
- `clawdbot browser drag 10 11`
- `clawdbot browser select 9 OptionA OptionB`
- `clawdbot browser upload /tmp/file.pdf`
- `clawdbot browser fill --fields '[{"ref":"1","value":"Ada"}]'`
- `clawdbot browser dialog --accept`
- `clawdbot browser wait --text "Done"`
- `clawdbot browser evaluate --fn '(el) => el.textContent' --ref 7`
- `clawdbot browser evaluate --fn "document.querySelector('.my-class').click()"`
- `clawdbot browser console --level error`
- `clawdbot browser pdf`

Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
- `upload` can take a `ref` to auto-click after arming (useful for single-step file uploads).
- `upload` can also take `inputRef` (aria ref) or `element` (CSS selector) to set `<input type="file">` directly without waiting for a file chooser.
- The default arm timeout is **2 minutes** (values are clamped to a 2-minute maximum); pass `timeoutMs` if you need a shorter window.
- `snapshot` defaults to `ai`; `aria` returns an accessibility tree for debugging.
- `click`/`type` require `ref` from `snapshot --format ai`; use `evaluate` for rare CSS selector one-offs.
- Avoid `wait` by default; use it only in exceptional cases when there is no reliable UI state to wait on.
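
The arm-timeout rule from the notes (default of 2 minutes, clamped to a 2-minute maximum) might look like this. The function name is hypothetical, and handling of zero or negative values is deliberately left out because the notes do not cover it.

```python
from typing import Optional

MAX_ARM_TIMEOUT_MS = 2 * 60 * 1000  # 2 minutes


def arm_timeout_ms(requested: Optional[int] = None) -> int:
    """Default to the 2-minute maximum and clamp larger requests down to it."""
    if requested is None:
        return MAX_ARM_TIMEOUT_MS
    return min(requested, MAX_ARM_TIMEOUT_MS)
```

Passing `timeoutMs: 5000` keeps the 5-second window; asking for 10 minutes still yields 2 minutes.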

## Security & privacy notes

- The clawd browser profile is app-owned; it may contain logged-in sessions.
  Treat it as sensitive data.
- The control server must bind to loopback only by default (`127.0.0.1`) unless the
  user explicitly configures a non-loopback URL.
- Never reuse or copy the user's default Chrome profile.
- Remote CDP endpoints should be tunneled or protected; CDP is highly privileged.

## Non-goals (for the first cut)

- Cross-device "sync" of tabs between Mac and Pi.
- Sharing the user's logged-in Chrome sessions automatically.
- General-purpose web scraping; this is primarily for "close-the-loop" verification
  and interaction.

## Troubleshooting

For Linux-specific issues (especially Ubuntu with snap Chromium), see [browser-linux-troubleshooting](/browser-linux-troubleshooting).
203
docs/tools/clawdhub.md
Normal file
@@ -0,0 +1,203 @@
---
summary: "ClawdHub guide: public skills registry + CLI workflows"
read_when:
  - Introducing ClawdHub to new users
  - Installing, searching, or publishing skills
  - Explaining ClawdHub CLI flags and sync behavior
---

# ClawdHub

ClawdHub is the **public skill registry for Clawdbot**. It is a free service: all skills are public, open, and visible to everyone for sharing and reuse. A skill is just a folder with a `SKILL.md` file (plus supporting text files). You can browse skills in the web app or use the CLI to search, install, update, and publish skills.

Site: [clawdhub.com](https://clawdhub.com)

## Who this is for (beginner-friendly)

If you want to add new capabilities to your Clawdbot agent, ClawdHub is the easiest way to find and install skills. You do not need to know how the backend works. You can:

- Search for skills in plain language.
- Install a skill into your workspace.
- Update skills later with one command.
- Back up your own skills by publishing them.

## Quick start (non-technical)

1) Install the CLI (see next section).
2) Search for something you need:
   - `clawdhub search "calendar"`
3) Install a skill:
   - `clawdhub install <skill-slug>`
4) Start a new Clawdbot session so it picks up the new skill.

## Install the CLI

Pick one:

```bash
npm i -g clawdhub
```

```bash
pnpm add -g clawdhub
```

```bash
bun add -g clawdhub
```

## How it fits into Clawdbot

By default, the CLI installs skills into `./skills` under your current working directory. Clawdbot loads workspace skills from `<workspace>/skills` and will pick them up in the **next** session. If you already use `~/.clawdbot/skills` or bundled skills, workspace skills take precedence.

For more detail on how skills are loaded and gated, see `docs/skills.md`.

## What the service provides (features)

- **Public browsing** of skills and their `SKILL.md` content.
- **Search** powered by embeddings (vector search), not just keywords.
- **Versioning** with semver, changelogs, and tags (including `latest`).
- **Downloads** as a zip per version.
- **Stars and comments** for community feedback.
- **Moderation** hooks for approvals and audits.
- **CLI-friendly API** for automation and scripting.

## CLI commands and parameters

Global options (apply to all commands):

- `--workdir <dir>`: Working directory (default: current dir).
- `--dir <dir>`: Skills directory, relative to workdir (default: `skills`).
- `--site <url>`: Site base URL (browser login).
- `--registry <url>`: Registry API base URL.
- `--no-input`: Disable prompts (non-interactive).
- `-V, --cli-version`: Print CLI version.

Auth:

- `clawdhub login` (browser flow) or `clawdhub login --token <token>`
- `clawdhub logout`
- `clawdhub whoami`

Options:

- `--token <token>`: Paste an API token.
- `--label <label>`: Label stored for browser login tokens (default: `CLI token`).
- `--no-browser`: Do not open a browser (requires `--token`).

Search:

- `clawdhub search "query"`
- `--limit <n>`: Max results.

Install:

- `clawdhub install <slug>`
- `--version <version>`: Install a specific version.
- `--force`: Overwrite if the folder already exists.

Update:

- `clawdhub update <slug>`
- `clawdhub update --all`
- `--version <version>`: Update to a specific version (single slug only).
- `--force`: Overwrite when local files do not match any published version.

List:

- `clawdhub list` (reads `.clawdhub/lock.json`)

Publish:

- `clawdhub publish <path>`
- `--slug <slug>`: Skill slug.
- `--name <name>`: Display name.
- `--version <version>`: Semver version.
- `--changelog <text>`: Changelog text (can be empty).
- `--tags <tags>`: Comma-separated tags (default: `latest`).

Delete/undelete (owner/admin only):

- `clawdhub delete <slug> --yes`
- `clawdhub undelete <slug> --yes`

Sync (scan local skills + publish new/updated):

- `clawdhub sync`
- `--root <dir...>`: Extra scan roots.
- `--all`: Upload everything without prompts.
- `--dry-run`: Show what would be uploaded.
- `--bump <type>`: `patch|minor|major` for updates (default: `patch`).
- `--changelog <text>`: Changelog for non-interactive updates.
- `--tags <tags>`: Comma-separated tags (default: `latest`).
- `--concurrency <n>`: Number of concurrent registry checks (default: 4).
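
For readers unfamiliar with semver bumps, the `--bump` types behave roughly as in the sketch below. It follows the standard semantic-versioning rules for plain `MAJOR.MINOR.PATCH` strings and is not taken from the CLI's source; pre-release and build suffixes are not handled.

```python
def bump(version: str, kind: str = "patch") -> str:
    """Return the next semver for a `patch`, `minor`, or `major` bump."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"  # Major resets minor and patch.
    if kind == "minor":
        return f"{major}.{minor + 1}.0"  # Minor resets patch.
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {kind}")
```

So a skill at `1.4.2` would move to `1.4.3` under the default `patch` bump, `1.5.0` under `minor`, and `2.0.0` under `major`.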

## Common workflows for agents

### Search for skills

```bash
clawdhub search "postgres backups"
```

### Download new skills

```bash
clawdhub install my-skill-pack
```

### Update installed skills

```bash
clawdhub update --all
```

### Back up your skills (publish or sync)

For a single skill folder:

```bash
clawdhub publish ./my-skill --slug my-skill --name "My Skill" --version 1.0.0 --tags latest
```

To scan and back up many skills at once:

```bash
clawdhub sync --all
```

## Advanced details (technical)

### Versioning and tags

- Each publish creates a new **semver** `SkillVersion`.
- Tags (like `latest`) point to a version; moving tags lets you roll back.
- Changelogs are attached per version and can be empty when syncing or publishing updates.

### Local changes vs registry versions

Updates compare the local skill contents to registry versions using a content hash. If local files do not match any published version, the CLI asks before overwriting (or requires `--force` in non-interactive runs).
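
The content-hash comparison could work along these lines. The actual hashing scheme is not documented here, so this sketch assumes SHA-256 over each file's relative path and bytes in sorted order; both function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def skill_fingerprint(skill_dir: Path) -> str:
    """Hash every file's relative path and bytes in a stable (sorted) order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(skill_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_confirmation(skill_dir: Path, published_hashes: Iterable[str]) -> bool:
    """True when local files match no published version, so overwriting should ask first."""
    return skill_fingerprint(skill_dir) not in set(published_hashes)
```

Any local edit changes the fingerprint, which is what would trigger the prompt (or the `--force` requirement) described above.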

### Sync scanning and fallback roots

`clawdhub sync` scans your current workdir first. If no skills are found, it falls back to known legacy locations (for example `~/clawdbot/skills` and `~/.clawdbot/skills`). This is designed to find older skill installs without extra flags.
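
A rough sketch of the scan-then-fallback behavior follows. Several details are assumptions rather than documented behavior: that the scan looks in `<workdir>/skills`, that a skill is any direct subfolder containing `SKILL.md`, and that the first legacy root with skills wins.

```python
from pathlib import Path
from typing import Iterable, List


def find_skills(root: Path) -> List[Path]:
    """A skill is a folder that contains a SKILL.md file."""
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if (p / "SKILL.md").is_file())


def scan_for_sync(workdir: Path, legacy_roots: Iterable[Path]) -> List[Path]:
    """Scan <workdir>/skills first; try legacy roots only if nothing is found."""
    found = find_skills(workdir / "skills")
    if found:
        return found
    for root in legacy_roots:
        found = find_skills(root)
        if found:
            return found
    return []
```

Extra roots passed with `--root` could be appended to `legacy_roots` in the same way.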

### Storage and lockfile

- Installed skills are recorded in `.clawdhub/lock.json` under your workdir.
- Auth tokens are stored in the ClawdHub CLI config file (override via `CLAWDHUB_CONFIG_PATH`).

### Telemetry (install counts)

When you run `clawdhub sync` while logged in, the CLI sends a minimal snapshot to compute install counts. You can disable this entirely:

```bash
export CLAWDHUB_DISABLE_TELEMETRY=1
```

## Environment variables

- `CLAWDHUB_SITE`: Override the site URL.
- `CLAWDHUB_REGISTRY`: Override the registry API URL.
- `CLAWDHUB_CONFIG_PATH`: Override where the CLI stores the token/config.
- `CLAWDHUB_DISABLE_TELEMETRY=1`: Disable telemetry on `sync`.
31
docs/tools/elevated.md
Normal file
@@ -0,0 +1,31 @@
---
summary: "Elevated bash mode and /elevated directives"
read_when:
  - Adjusting elevated mode defaults, allowlists, or slash command behavior
---
# Elevated Mode (/elevated directives)

## What it does
- Elevated mode allows the bash tool to run with elevated privileges when the feature is available and the sender is approved.
- Directive forms: `/elevated on`, `/elevated off`, `/elev on`, `/elev off`.
- Only `on|off` are accepted; anything else returns a hint and does not change state.

## Resolution order
1. Inline directive on the message (applies only to that message).
2. Session override (set by sending a directive-only message).
3. Global default (`agent.elevatedDefault` in config).
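
The resolution order is a first-present-wins lookup. In this sketch `None` means "not set at this level"; the function name and the `False` fallback for a missing global default are assumptions.

```python
from typing import Optional


def resolve_elevated(
    inline: Optional[bool],
    session: Optional[bool],
    default: bool = False,
) -> bool:
    """Return the first setting that is present: inline, then session, then default."""
    if inline is not None:
        return inline  # Applies only to the current message.
    if session is not None:
        return session  # Set earlier by a directive-only message.
    return default  # agent.elevatedDefault
```

An inline `/elevated off` therefore overrides a session that was switched on, but only for that one message.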

## Setting a session default
- Send a message that is **only** the directive (whitespace allowed), e.g. `/elevated on`.
- A confirmation reply is sent (`Elevated mode enabled.` / `Elevated mode disabled.`).
- If elevated access is disabled or the sender is not on the approved allowlist, the directive replies `elevated is not available right now.` and does not change session state.

## Availability + allowlists
- Feature gate: `agent.elevated.enabled` (can be switched off in config even when the code supports it).
- Sender allowlist: `agent.elevated.allowFrom` with per-provider allowlists (e.g. `discord`, `whatsapp`).
- Both must pass; otherwise elevated is treated as unavailable.
- Discord fallback: if `agent.elevated.allowFrom.discord` is omitted, the `discord.dm.allowFrom` list is used as a fallback. Set `agent.elevated.allowFrom.discord` (even `[]`) to override.
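
The two gates and the Discord fallback can be sketched as below. The config keys come from the bullets above; exact-match comparison of sender IDs and treating a missing `enabled` key as off are assumptions of this sketch.

```python
from typing import Any, Dict


def elevated_available(config: Dict[str, Any], provider: str, sender: str) -> bool:
    """Both the feature gate and the per-provider sender allowlist must pass."""
    elevated = config.get("agent", {}).get("elevated", {})
    if not elevated.get("enabled", False):
        return False
    allowlist = elevated.get("allowFrom", {}).get(provider)
    if allowlist is None and provider == "discord":
        # Fallback applies only when the discord key is omitted;
        # an explicit [] overrides it.
        allowlist = config.get("discord", {}).get("dm", {}).get("allowFrom")
    return sender in (allowlist or [])
```

Setting `agent.elevated.allowFrom.discord` to `[]` thus blocks every Discord sender, even those listed in `discord.dm.allowFrom`.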

## Logging + status
- Elevated bash calls are logged at info level.
- Session status includes elevated mode (e.g. `elevated=on`).
263
docs/tools/index.md
Normal file
@@ -0,0 +1,263 @@
---
summary: "Agent tool surface for Clawdbot (browser, canvas, nodes, cron) replacing clawdbot-* skills"
read_when:
  - Adding or modifying agent tools
  - Retiring or changing clawdbot-* skills
---

# Tools (Clawdbot)

Clawdbot exposes **first-class agent tools** for browser, canvas, nodes, and cron.
These replace the old `clawdbot-*` skills: the tools are typed and need no shelling out,
and the agent should rely on them directly.

## Disabling tools

You can globally allow/deny tools via `agent.tools` in `clawdbot.json`
(deny wins). This prevents disallowed tools from being sent to providers.

```json5
{
  agent: {
    tools: {
      deny: ["browser"]
    }
  }
}
```
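
The "deny wins" rule can be illustrated with a small filter. The config example shows only a `deny` list, so the shape of the allow list and the "no allow list means everything is allowed" behavior are assumptions of this sketch.

```python
from typing import Iterable, List, Optional


def filter_tools(
    tools: Iterable[str],
    allow: Optional[Iterable[str]] = None,
    deny: Iterable[str] = (),
) -> List[str]:
    """Keep tools that are allowed (if an allow list is given) and not denied."""
    allowed = None if allow is None else set(allow)
    denied = set(deny)
    return [
        name
        for name in tools
        # Deny is checked first, so it wins even if the tool is also allowed.
        if name not in denied and (allowed is None or name in allowed)
    ]
```

With `deny: ["browser"]`, the `browser` tool is dropped from the list before it is sent to a provider, even if an allow list names it.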
|
||||
## Tool inventory
|
||||
|
||||
### `bash`
|
||||
Run shell commands in the workspace.
|
||||
|
||||
Core parameters:
|
||||
- `command` (required)
|
||||
- `yieldMs` (auto-background after timeout, default 10000)
|
||||
- `background` (immediate background)
|
||||
- `timeout` (seconds; kills the process if exceeded, default 1800)
|
||||
- `elevated` (bool; run on host if elevated mode is enabled/allowed)
|
||||
- Need a real TTY? Use the tmux skill.
|
||||
|
||||
Notes:
|
||||
- Returns `status: "running"` with a `sessionId` when backgrounded.
|
||||
- Use `process` to poll/log/write/kill/clear background sessions.
|
||||
|
||||
### `process`
|
||||
Manage background bash sessions.
|
||||
|
||||
Core actions:
|
||||
- `list`, `poll`, `log`, `write`, `kill`, `clear`, `remove`
|
||||
|
||||
Notes:
|
||||
- `poll` returns new output and exit status when complete.
|
||||
- `log` supports line-based `offset`/`limit` (omit `offset` to grab the last N lines).
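
The `offset`/`limit` behavior of `log` can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's implementation: `sliceLog` and `LogWindow` are hypothetical names, `offset` is assumed to be a 0-based line index, and returning everything when both options are omitted is a guess.

```typescript
// Hypothetical helper illustrating line-based offset/limit on a log buffer.
interface LogWindow {
  offset?: number; // assumed 0-based line index to start from
  limit?: number; // maximum number of lines to return
}

function sliceLog(lines: string[], { offset, limit }: LogWindow = {}): string[] {
  if (offset === undefined) {
    // No offset: return the last `limit` lines (or everything if no limit is given).
    return limit === undefined
      ? lines.slice()
      : lines.slice(Math.max(0, lines.length - limit));
  }
  // Offset given: return up to `limit` lines starting at `offset`.
  const end = limit === undefined ? lines.length : offset + limit;
  return lines.slice(offset, end);
}
```

For example, `sliceLog(lines, { limit: 20 })` would act like a tail, while `sliceLog(lines, { offset: 0, limit: 20 })` would page from the start.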
|
||||
|
||||
### `browser`
|
||||
Control the dedicated clawd browser.
|
||||
|
||||
Core actions:
|
||||
- `status`, `start`, `stop`, `tabs`, `open`, `focus`, `close`
|
||||
- `snapshot` (aria/ai)
|
||||
- `screenshot` (returns image block + `MEDIA:<path>`)
|
||||
- `act` (UI actions: click/type/press/hover/drag/select/fill/resize/wait/evaluate)
|
||||
- `navigate`, `console`, `pdf`, `upload`, `dialog`
|
||||
|
||||
Profile management:
|
||||
- `profiles` — list all browser profiles with status
|
||||
- `create-profile` — create new profile with auto-allocated port (or `cdpUrl`)
|
||||
- `delete-profile` — stop browser, delete user data, remove from config (local only)
|
||||
- `reset-profile` — kill orphan process on profile's port (local only)
|
||||
|
||||
Common parameters:
|
||||
- `controlUrl` (defaults from config)
|
||||
- `profile` (optional; defaults to `browser.defaultProfile`)
|
||||
Notes:
|
||||
- Requires `browser.enabled=true` (default is `true`; set `false` to disable).
|
||||
- Uses `browser.controlUrl` unless `controlUrl` is passed explicitly.
|
||||
- All actions accept optional `profile` parameter for multi-instance support.
|
||||
- When `profile` is omitted, uses `browser.defaultProfile` (defaults to "clawd").
|
||||
- Profile names: lowercase alphanumeric + hyphens only (max 64 chars).
|
||||
- Port range: 18800-18899 (~100 profiles max).
|
||||
- Remote profiles are attach-only (no start/stop/reset).
|
||||
- `snapshot` defaults to `ai`; use `aria` for the accessibility tree.
|
||||
- `act` requires `ref` from `snapshot --format ai`; use `evaluate` for rare CSS selector needs.
|
||||
- Avoid `act` → `wait` by default; use it only in exceptional cases (no reliable UI state to wait on).
|
||||
- `upload` can optionally pass a `ref` to auto-click after arming.
|
||||
- `upload` also supports `inputRef` (aria ref) or `element` (CSS selector) to set `<input type="file">` directly.
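
The profile naming and port rules in the notes above could be checked along these lines. This is a minimal sketch, not Clawdbot's code: the function names are hypothetical, and picking the lowest free port is an assumption (the docs only say ports are auto-allocated from the range).

```typescript
// Hypothetical validators mirroring the documented profile constraints.
const PROFILE_NAME_RE = /^[a-z0-9-]{1,64}$/; // lowercase alphanumeric + hyphens, max 64 chars
const PORT_MIN = 18800;
const PORT_MAX = 18899;

function isValidProfileName(name: string): boolean {
  return PROFILE_NAME_RE.test(name);
}

// Assumption: pick the lowest free port in the range; returns undefined when exhausted.
function allocateProfilePort(usedPorts: number[]): number | undefined {
  for (let port = PORT_MIN; port <= PORT_MAX; port++) {
    if (!usedPorts.includes(port)) return port;
  }
  return undefined;
}
```

The 100-port range is what bounds the profile count at roughly 100.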
|
||||
|
||||
### `canvas`
|
||||
Drive the node Canvas (present, eval, snapshot, A2UI).
|
||||
|
||||
Core actions:
|
||||
- `present`, `hide`, `navigate`, `eval`
|
||||
- `snapshot` (returns image block + `MEDIA:<path>`)
|
||||
- `a2ui_push`, `a2ui_reset`
|
||||
|
||||
Notes:
|
||||
- Uses gateway `node.invoke` under the hood.
|
||||
- If no `node` is provided, the tool picks a default (single connected node or local mac node).
|
||||
- A2UI is v0.8 only (no `createSurface`); the CLI rejects v0.9 JSONL with line errors.
|
||||
- Quick smoke: `clawdbot canvas a2ui push --text "Hello from A2UI"`.
|
||||
|
||||
### `nodes`
|
||||
Discover and target paired nodes; send notifications; capture camera/screen.
|
||||
|
||||
Core actions:
|
||||
- `status`, `describe`
|
||||
- `pending`, `approve`, `reject` (pairing)
|
||||
- `notify` (macOS `system.notify`)
|
||||
- `camera_snap`, `camera_clip`, `screen_record`
|
||||
- `location_get`
|
||||
|
||||
Notes:
|
||||
- Camera/screen commands require the node app to be foregrounded.
|
||||
- Images return image blocks + `MEDIA:<path>`.
|
||||
- Videos return `FILE:<path>` (mp4).
|
||||
- Location returns a JSON payload (lat/lon/accuracy/timestamp).
|
||||
|
||||
### `image`
|
||||
Analyze an image with the configured image model.
|
||||
|
||||
Core parameters:
|
||||
- `image` (required path or URL)
|
||||
- `prompt` (optional; defaults to "Describe the image.")
|
||||
- `model` (optional override)
|
||||
- `maxBytesMb` (optional size cap)
|
||||
|
||||
Notes:
|
||||
- Only available when `agent.imageModel` is configured (primary or fallbacks).
|
||||
- Uses the image model directly (independent of the main chat model).
|
||||
|
||||
### `cron`
|
||||
Manage Gateway cron jobs and wakeups.
|
||||
|
||||
Core actions:
|
||||
- `status`, `list`
|
||||
- `add`, `update`, `remove`, `run`, `runs`
|
||||
- `wake` (enqueue system event + optional immediate heartbeat)
|
||||
|
||||
Notes:
|
||||
- `add` expects a full cron job object (same schema as `cron.add` RPC).
|
||||
- `update` uses `{ id, patch }`.
|
||||
|
||||
### `gateway`
|
||||
Restart the running Gateway process (in-place).
|
||||
|
||||
Core actions:
|
||||
- `restart` (sends `SIGUSR1` to the current process; `clawdbot gateway`/`gateway-daemon` restart in-place)
|
||||
|
||||
Notes:
|
||||
- Use `delayMs` (defaults to 2000) to avoid interrupting an in-flight reply.
|
||||
|
||||
### `sessions_list` / `sessions_history` / `sessions_send`
|
||||
List sessions, inspect transcript history, or send to another session.
|
||||
|
||||
Core parameters:
|
||||
- `sessions_list`: `kinds?`, `limit?`, `activeMinutes?`, `messageLimit?` (0 = none)
|
||||
- `sessions_history`: `sessionKey`, `limit?`, `includeTools?`
|
||||
- `sessions_send`: `sessionKey`, `message`, `timeoutSeconds?` (0 = fire-and-forget)
|
||||
|
||||
Notes:
|
||||
- `main` is the canonical direct-chat key; global/unknown are hidden.
|
||||
- `messageLimit > 0` fetches last N messages per session (tool messages filtered).
|
||||
- `sessions_send` waits for final completion when `timeoutSeconds > 0`.
|
||||
- `sessions_send` runs a reply‑back ping‑pong (reply `REPLY_SKIP` to stop; max turns via `session.agentToAgent.maxPingPongTurns`, 0–5).
|
||||
- After the ping‑pong, the target agent runs an **announce step**; reply `ANNOUNCE_SKIP` to suppress the announcement.
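
The reply-back loop and announce step described above might look roughly like this. It is a simplified, synchronous sketch: `runPingPong`, `Responder`, and the choice of which side speaks first are hypothetical; only the `REPLY_SKIP` / `ANNOUNCE_SKIP` sentinels and the 0 to 5 turn limit come from the notes.

```typescript
// Hypothetical model of the ping-pong between requester and target agents.
type Responder = (incoming: string) => string;

interface PingPongResult {
  transcript: string[];
  announcement?: string;
}

function runPingPong(
  firstReply: string,
  requester: Responder,
  target: Responder,
  announce: () => string,
  maxTurns: number,
): PingPongResult {
  const turns = Math.min(5, Math.max(0, Math.floor(maxTurns))); // clamp to 0..5
  const transcript = [firstReply];
  let last = firstReply;
  for (let i = 0; i < turns; i++) {
    // Assumption: the requester answers first, then the two sides alternate.
    const speaker = i % 2 === 0 ? requester : target;
    const reply = speaker(last);
    if (reply.trim() === "REPLY_SKIP") break; // either side can stop the exchange
    transcript.push(reply);
    last = reply;
  }
  // The announce step always runs; ANNOUNCE_SKIP suppresses the post.
  const note = announce();
  return note.trim() === "ANNOUNCE_SKIP" ? { transcript } : { transcript, announcement: note };
}
```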
|
||||
|
||||
### `discord`
|
||||
Send Discord reactions, stickers, or polls.
|
||||
|
||||
Core actions:
|
||||
- `react` (`channelId`, `messageId`, `emoji`)
|
||||
- `reactions` (`channelId`, `messageId`, optional `limit`)
|
||||
- `sticker` (`to`, `stickerIds`, optional `content`)
|
||||
- `poll` (`to`, `question`, `answers`, optional `allowMultiselect`, `durationHours`, `content`)
|
||||
- `permissions` (`channelId`)
|
||||
- `readMessages` (`channelId`, optional `limit`/`before`/`after`/`around`)
|
||||
- `sendMessage` (`to`, `content`, optional `mediaUrl`, `replyTo`)
|
||||
- `editMessage` (`channelId`, `messageId`, `content`)
|
||||
- `deleteMessage` (`channelId`, `messageId`)
|
||||
- `threadCreate` (`channelId`, `name`, optional `messageId`, `autoArchiveMinutes`)
|
||||
- `threadList` (`guildId`, optional `channelId`, `includeArchived`, `before`, `limit`)
|
||||
- `threadReply` (`channelId`, `content`, optional `mediaUrl`, `replyTo`)
|
||||
- `pinMessage`/`unpinMessage` (`channelId`, `messageId`)
|
||||
- `listPins` (`channelId`)
|
||||
- `searchMessages` (`guildId`, `content`, optional `channelId`/`channelIds`, `authorId`/`authorIds`, `limit`)
|
||||
- `memberInfo` (`guildId`, `userId`)
|
||||
- `roleInfo` (`guildId`)
|
||||
- `emojiList` (`guildId`)
|
||||
- `roleAdd`/`roleRemove` (`guildId`, `userId`, `roleId`)
|
||||
- `channelInfo` (`channelId`)
|
||||
- `channelList` (`guildId`)
|
||||
- `voiceStatus` (`guildId`, `userId`)
|
||||
- `eventList` (`guildId`)
|
||||
- `eventCreate` (`guildId`, `name`, `startTime`, optional `endTime`, `description`, `channelId`, `entityType`, `location`)
|
||||
- `timeout` (`guildId`, `userId`, optional `durationMinutes`, `until`, `reason`)
|
||||
- `kick` (`guildId`, `userId`, optional `reason`)
|
||||
- `ban` (`guildId`, `userId`, optional `reason`, `deleteMessageDays`)
|
||||
|
||||
Notes:
|
||||
- `to` accepts `channel:<id>` or `user:<id>`.
|
||||
- Polls require 2–10 answers and default to 24 hours.
|
||||
- `reactions` returns per-emoji user lists (limited to 100 per reaction).
|
||||
- `discord.actions.*` gates Discord tool actions; `roles` + `moderation` default to `false`.
|
||||
- `searchMessages` follows the Discord preview spec (limit max 25, channel/author filters accept arrays).
|
||||
- The tool is only exposed when the current provider is Discord.
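
Two of the rules in these notes, the `to` target format and the poll limits, can be illustrated with a small sketch. The helper names and types are hypothetical and are not part of the Discord tool; the sketch also assumes the id is any non-empty string.

```typescript
// Hypothetical helpers for the documented `to` format and poll constraints.
type DiscordTarget = { kind: "channel" | "user"; id: string };

function parseDiscordTarget(to: string): DiscordTarget | undefined {
  const match = /^(channel|user):(.+)$/.exec(to);
  if (!match) return undefined; // anything else is not a valid target
  return { kind: match[1] as "channel" | "user", id: match[2] };
}

interface PollInput {
  question: string;
  answers: string[];
  durationHours?: number;
}

function normalizePoll(input: PollInput): Required<PollInput> {
  if (input.answers.length < 2 || input.answers.length > 10) {
    throw new RangeError("Polls require 2-10 answers");
  }
  // Documented default: polls last 24 hours unless a duration is given.
  return { ...input, durationHours: input.durationHours ?? 24 };
}
```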
|
||||
|
||||
## Parameters (common)
|
||||
|
||||
Gateway-backed tools (`canvas`, `nodes`, `cron`):
|
||||
- `gatewayUrl` (default `ws://127.0.0.1:18789`)
|
||||
- `gatewayToken` (if auth enabled)
|
||||
- `timeoutMs`
|
||||
|
||||
Browser tool:
|
||||
- `controlUrl` (defaults from config)
|
||||
|
||||
## Recommended agent flows
|
||||
|
||||
Browser automation:
|
||||
1) `browser` → `status` / `start`
|
||||
2) `snapshot` (ai or aria)
|
||||
3) `act` (click/type/press)
|
||||
4) `screenshot` if you need visual confirmation
|
||||
|
||||
Canvas render:
|
||||
1) `canvas` → `present`
|
||||
2) `a2ui_push` (optional)
|
||||
3) `snapshot`
|
||||
|
||||
Node targeting:
|
||||
1) `nodes` → `status`
|
||||
2) `describe` on the chosen node
|
||||
3) `notify` / `camera_snap` / `screen_record`
|
||||
|
||||
## Safety
|
||||
|
||||
- Avoid `system.run` (not exposed as a tool).
|
||||
- Respect user consent for camera/screen capture.
|
||||
- Use `status` / `describe` to confirm permissions before invoking media commands.
|
||||
|
||||
## How the model sees tools (pi-mono internals)
|
||||
|
||||
Tools are exposed to the model in **two parallel channels**:
|
||||
|
||||
1) **System prompt text**: a human-readable list + guidelines.
|
||||
2) **Provider tool schema**: the actual function/tool declarations sent to the model API.
|
||||
|
||||
In pi-mono:
|
||||
- System prompt builder: [`packages/coding-agent/src/core/system-prompt.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/system-prompt.ts)
  - Builds the `Available tools:` list from `toolDescriptions`.
  - Appends skills and project context.
- Tool schemas passed to providers:
  - OpenAI: [`packages/ai/src/providers/openai-responses.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/openai-responses.ts) (`convertTools`)
  - Anthropic: [`packages/ai/src/providers/anthropic.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/anthropic.ts) (`convertTools`)
  - Gemini: [`packages/ai/src/providers/google-shared.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/google-shared.ts) (`convertTools`)
- Tool execution loop:
  - Agent loop: [`packages/ai/src/agent/agent-loop.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/agent/agent-loop.ts)
  - Validates tool arguments and executes tools, then appends `toolResult` messages.
|
||||
|
||||
In Clawdbot:
|
||||
- System prompt append: [`src/agents/system-prompt.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/system-prompt.ts)
|
||||
- Tool list injected via `createClawdbotCodingTools()` in [`src/agents/pi-tools.ts`](https://github.com/clawdbot/clawdbot/blob/main/src/agents/pi-tools.ts)
|
||||
58
docs/tools/skills-config.md
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
summary: "Skills config schema and examples"
|
||||
read_when:
|
||||
- Adding or modifying skills config
|
||||
- Adjusting bundled allowlist or install behavior
|
||||
---
|
||||
# Skills Config
|
||||
|
||||
All skills-related configuration lives under `skills` in `~/.clawdbot/clawdbot.json`.
|
||||
|
||||
```json5
{
  skills: {
    allowBundled: ["brave-search", "gemini"],
    load: {
      extraDirs: [
        "~/Projects/agent-scripts/skills",
        "~/Projects/oss/some-skill-pack/skills"
      ]
    },
    install: {
      preferBrew: true,
      nodeManager: "npm" // npm | pnpm | yarn | bun
    },
    entries: {
      "nano-banana-pro": {
        enabled: true,
        apiKey: "GEMINI_KEY_HERE",
        env: {
          GEMINI_API_KEY: "GEMINI_KEY_HERE"
        }
      },
      peekaboo: { enabled: true },
      sag: { enabled: false }
    }
  }
}
```
|
||||
|
||||
## Fields
|
||||
|
||||
- `allowBundled`: optional allowlist for **bundled** skills only. When set, only
|
||||
bundled skills in the list are eligible (managed/workspace skills unaffected).
|
||||
- `load.extraDirs`: additional skill directories to scan (lowest precedence).
|
||||
- `install.preferBrew`: prefer brew installers when available (default: true).
|
||||
- `install.nodeManager`: node installer preference (`npm` | `pnpm` | `yarn` | `bun`, default: npm).
|
||||
- `entries.<skillKey>`: per-skill overrides.
|
||||
|
||||
Per-skill fields:
|
||||
- `enabled`: set `false` to disable a skill even if it’s bundled/installed.
|
||||
- `env`: environment variables injected for the agent run (only if not already set).
|
||||
- `apiKey`: optional convenience for skills that declare a primary env var.
|
||||
|
||||
## Notes
|
||||
|
||||
- Keys under `entries` map to the skill name by default. If a skill defines
|
||||
`metadata.clawdbot.skillKey`, use that key instead.
|
||||
- Changes to skills are picked up on the next new session.
|
||||
151
docs/tools/skills.md
Normal file
@@ -0,0 +1,151 @@
|
||||
---
|
||||
summary: "Skills: managed vs workspace, gating rules, and config/env wiring"
|
||||
read_when:
|
||||
- Adding or modifying skills
|
||||
- Changing skill gating or load rules
|
||||
---
|
||||
# Skills (Clawdbot)
|
||||
|
||||
Clawdbot uses **[AgentSkills](https://agentskills.io)-compatible** skill folders to teach the agent how to use tools. Each skill is a directory containing a `SKILL.md` with YAML frontmatter and instructions. Clawdbot loads **bundled skills** plus optional local overrides, and filters them at load time based on environment, config, and binary presence.
|
||||
|
||||
## Locations and precedence
|
||||
|
||||
Skills are loaded from **three** places:
|
||||
|
||||
1) **Bundled skills**: shipped with the install (npm package or Clawdbot.app)
|
||||
2) **Managed/local skills**: `~/.clawdbot/skills`
|
||||
3) **Workspace skills**: `<workspace>/skills`
|
||||
|
||||
If a skill name conflicts, precedence is:
|
||||
|
||||
`<workspace>/skills` (highest) → `~/.clawdbot/skills` → bundled skills (lowest)
|
||||
|
||||
Additionally, you can configure extra skill folders (lowest precedence) via
|
||||
`skills.load.extraDirs` in `~/.clawdbot/clawdbot.json`.
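
The precedence rules above amount to a "later source wins" merge keyed by skill name. The sketch below illustrates that idea only: `resolveSkills` and `SkillRef` are hypothetical names, and it assumes "lowest precedence" places `extraDirs` below bundled skills.

```typescript
// Hypothetical merge of discovered skills by precedence (lowest first).
type SkillSource = "extra" | "bundled" | "managed" | "workspace";

interface SkillRef {
  name: string;
  source: SkillSource;
  dir: string;
}

// Later entries overwrite earlier ones on name conflicts.
const PRECEDENCE: SkillSource[] = ["extra", "bundled", "managed", "workspace"];

function resolveSkills(found: SkillRef[]): SkillRef[] {
  const byName = new Map<string, SkillRef>();
  for (const source of PRECEDENCE) {
    for (const skill of found) {
      if (skill.source === source) byName.set(skill.name, skill);
    }
  }
  return Array.from(byName.values());
}
```

With this ordering, a `gemini` skill in `<workspace>/skills` would shadow the bundled `gemini` without touching the bundled copy.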
|
||||
|
||||
## Format (AgentSkills + Pi-compatible)
|
||||
|
||||
`SKILL.md` must include at least:
|
||||
|
||||
```markdown
---
name: nano-banana-pro
description: Generate or edit images via Gemini 3 Pro Image
---
```
|
||||
|
||||
Notes:
|
||||
- We follow the AgentSkills spec for layout/intent.
|
||||
- The parser used by the embedded agent supports **single-line** frontmatter keys only.
|
||||
- `metadata` should be a **single-line JSON object**.
|
||||
- Use `{baseDir}` in instructions to reference the skill folder path.
|
||||
- Optional frontmatter keys:
|
||||
- `homepage` — URL surfaced as “Website” in the macOS Skills UI (also supported via `metadata.clawdbot.homepage`).
|
||||
|
||||
## Gating (load-time filters)
|
||||
|
||||
Clawdbot **filters skills at load time** using `metadata` (single-line JSON):
|
||||
|
||||
```markdown
---
name: nano-banana-pro
description: Generate or edit images via Gemini 3 Pro Image
metadata: {"clawdbot":{"requires":{"bins":["uv"],"env":["GEMINI_API_KEY"],"config":["browser.enabled"]},"primaryEnv":"GEMINI_API_KEY"}}
---
```
|
||||
|
||||
Fields under `metadata.clawdbot`:
|
||||
- `always: true` — always include the skill (skip other gates).
|
||||
- `emoji` — optional emoji used by the macOS Skills UI.
|
||||
- `homepage` — optional URL shown as “Website” in the macOS Skills UI.
|
||||
- `os` — optional list of platforms (`darwin`, `linux`, `win32`). If set, the skill is only eligible on those OSes.
|
||||
- `requires.bins` — list; each must exist on `PATH`.
|
||||
- `requires.anyBins` — list; at least one must exist on `PATH`.
|
||||
- `requires.env` — list; env var must exist **or** be provided in config.
|
||||
- `requires.config` — list of `clawdbot.json` paths that must be truthy.
|
||||
- `primaryEnv` — env var name associated with `skills.entries.<name>.apiKey`.
|
||||
- `install` — optional array of installer specs used by the macOS Skills UI (brew/node/go/uv).
|
||||
|
||||
Installer example:
|
||||
|
||||
```markdown
---
name: gemini
description: Use Gemini CLI for coding assistance and Google search lookups.
metadata: {"clawdbot":{"emoji":"♊️","requires":{"bins":["gemini"]},"install":[{"id":"brew","kind":"brew","formula":"gemini-cli","bins":["gemini"],"label":"Install Gemini CLI (brew)"}]}}
---
```
|
||||
|
||||
Notes:
|
||||
- If multiple installers are listed, the gateway picks a **single** preferred option (brew when available, otherwise node).
|
||||
- Node installs honor `skills.install.nodeManager` in `clawdbot.json` (default: npm; options: npm/pnpm/yarn/bun).
|
||||
- Go installs: if `go` is missing and `brew` is available, the gateway installs Go via Homebrew first and sets `GOBIN` to Homebrew’s `bin` when possible.
|
||||
|
||||
If no `metadata.clawdbot` is present, the skill is always eligible (unless
|
||||
disabled in config or blocked by `skills.allowBundled` for bundled skills).
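
As a rough illustration of how the `metadata.clawdbot` gates combine, consider the sketch below. It is not the loader's actual code: `isEligible` and `GateContext` are hypothetical, the lookups are injected so the logic stays testable, an empty `anyBins` list is assumed to mean "no requirement", and the `enabled: false` and `allowBundled` checks are deliberately left out.

```typescript
// Hypothetical evaluation of the documented load-time gates.
interface Requires {
  bins?: string[];
  anyBins?: string[];
  env?: string[];
  config?: string[];
}

interface ClawdbotMeta {
  always?: boolean;
  os?: string[];
  requires?: Requires;
}

interface GateContext {
  platform: string; // e.g. "darwin", "linux", "win32"
  hasBin: (name: string) => boolean; // binary found on PATH
  hasEnv: (name: string) => boolean; // env var set, or provided in config
  configTruthy: (path: string) => boolean; // clawdbot.json path is truthy
}

function isEligible(meta: ClawdbotMeta | undefined, ctx: GateContext): boolean {
  if (!meta || meta.always) return true; // no metadata, or `always`, skips the gates
  if (meta.os && !meta.os.includes(ctx.platform)) return false;
  const req = meta.requires ?? {};
  if (!(req.bins ?? []).every(ctx.hasBin)) return false; // all bins required
  if (req.anyBins && req.anyBins.length > 0 && !req.anyBins.some(ctx.hasBin)) return false;
  if (!(req.env ?? []).every(ctx.hasEnv)) return false;
  return (req.config ?? []).every(ctx.configTruthy);
}
```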
|
||||
|
||||
## Config overrides (`~/.clawdbot/clawdbot.json`)
|
||||
|
||||
Bundled/managed skills can be toggled and supplied with env values:
|
||||
|
||||
```json5
{
  skills: {
    entries: {
      "nano-banana-pro": {
        enabled: true,
        apiKey: "GEMINI_KEY_HERE",
        env: {
          GEMINI_API_KEY: "GEMINI_KEY_HERE"
        }
      },
      peekaboo: { enabled: true },
      sag: { enabled: false }
    }
  }
}
```
|
||||
|
||||
Note: if the skill name contains hyphens, quote the key (JSON5 allows quoted keys).
|
||||
|
||||
Config keys match the **skill name** by default. If a skill defines
|
||||
`metadata.clawdbot.skillKey`, use that key under `skills.entries`.
|
||||
|
||||
Rules:
|
||||
- `enabled: false` disables the skill even if it’s bundled/installed.
|
||||
- `env`: injected **only if** the variable isn’t already set in the process.
|
||||
- `apiKey`: convenience for skills that declare `metadata.clawdbot.primaryEnv`.
|
||||
- `allowBundled`: optional allowlist for **bundled** skills only. If set, only
|
||||
bundled skills in the list are eligible (managed/workspace skills unaffected).
|
||||
|
||||
## Environment injection (per agent run)
|
||||
|
||||
When an agent run starts, Clawdbot:
|
||||
1) Reads skill metadata.
|
||||
2) Applies any `skills.entries.<key>.env` or `skills.entries.<key>.apiKey` to
|
||||
`process.env`.
|
||||
3) Builds the system prompt with **eligible** skills.
|
||||
4) Restores the original environment after the run ends.
|
||||
|
||||
This is **scoped to the agent run**, not a global shell environment.
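
The inject-then-restore pattern described above can be sketched as follows. This is a simplified, synchronous illustration: `withSkillEnv` is a hypothetical name, a plain object stands in for `process.env`, and the real agent run is presumably asynchronous.

```typescript
// Hypothetical scoped env injection: set only missing variables, restore afterwards.
type Env = Record<string, string | undefined>;

function withSkillEnv<T>(env: Env, overrides: Record<string, string>, run: () => T): T {
  const injected: string[] = [];
  for (const [key, value] of Object.entries(overrides)) {
    if (env[key] === undefined) {
      // Inject only when the variable is not already set.
      env[key] = value;
      injected.push(key);
    }
  }
  try {
    return run();
  } finally {
    // Remove only what was injected, leaving pre-existing values untouched.
    for (const key of injected) delete env[key];
  }
}
```

Tracking the injected keys, rather than snapshotting the whole environment, keeps the restore step from clobbering variables the run itself did not add.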
|
||||
|
||||
## Session snapshot (performance)
|
||||
|
||||
Clawdbot snapshots the eligible skills **when a session starts** and reuses that list for subsequent turns in the same session. Changes to skills or config take effect on the next new session.
|
||||
|
||||
## Managed skills lifecycle
|
||||
|
||||
Clawdbot ships a baseline set of skills as **bundled skills** as part of the
|
||||
install (npm package or Clawdbot.app). `~/.clawdbot/skills` exists for local
|
||||
overrides (for example, pinning/patching a skill without changing the bundled
|
||||
copy). Workspace skills are user-owned and override both on name conflicts.
|
||||
|
||||
## Config reference
|
||||
|
||||
See [`docs/skills-config.md`](/skills-config) for the full configuration schema.
|
||||
|
||||
## Looking for more skills?
|
||||
|
||||
Browse [ClawdHub](/clawdhub).
|
||||
|
||||
---
|
||||
55
docs/tools/slash-commands.md
Normal file
@@ -0,0 +1,55 @@
|
||||
---
|
||||
summary: "Slash commands: text vs native, config, and supported commands"
|
||||
read_when:
|
||||
- Using or configuring chat commands
|
||||
- Debugging command routing or permissions
|
||||
---
|
||||
# Slash commands
|
||||
|
||||
Commands are handled by the Gateway. Send them as a **standalone** message that starts with `/`.
|
||||
Inline text like `hello /status` is ignored.
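
A minimal sketch of the "standalone message" rule, assuming surrounding whitespace is trimmed first and command names are letters only. `parseSlashCommand` is a hypothetical name and this is not the Gateway's parser; colon forms such as `/think:medium` are intentionally not handled here.

```typescript
// Hypothetical parser: a command must be the whole message and start with "/".
interface ParsedCommand {
  name: string;
  args: string;
}

function parseSlashCommand(message: string): ParsedCommand | undefined {
  const text = message.trim();
  const match = /^\/([a-z]+)(?:\s+([\s\S]*))?$/i.exec(text);
  if (!match) return undefined; // inline "/status" after other text does not match
  return { name: match[1].toLowerCase(), args: (match[2] ?? "").trim() };
}
```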
|
||||
|
||||
## Config
|
||||
|
||||
```json5
{
  commands: {
    native: false,
    text: true,
    useAccessGroups: true
  }
}
```
|
||||
|
||||
- `commands.text` (default `true`) enables parsing `/...` in chat messages.
  - On surfaces without native commands (WhatsApp/WebChat/Signal/iMessage), text commands still work even if you set this to `false`.
- `commands.native` (default `false`) registers native commands on Discord/Slack/Telegram.
  - `false` clears previously registered commands on Discord/Telegram at startup.
  - Slack commands are managed in the Slack app and are not removed automatically.
- `commands.useAccessGroups` (default `true`) enforces allowlists/policies for commands.
|
||||
|
||||
## Command list
|
||||
|
||||
Text + native (when enabled):
|
||||
- `/help`
|
||||
- `/status`
|
||||
- `/stop`
|
||||
- `/restart`
|
||||
- `/activation mention|always` (groups only)
|
||||
- `/send on|off|inherit` (owner-only)
|
||||
- `/reset` or `/new`
|
||||
- `/think <level>` (aliases: `/thinking`, `/t`)
|
||||
- `/verbose on|off` (alias: `/v`)
|
||||
- `/elevated on|off` (alias: `/elev`)
|
||||
- `/model <name>`
|
||||
- `/queue <mode>` (plus options like `debounce:2s cap:25 drop:summarize`)
|
||||
|
||||
Text-only:
|
||||
- `/compact [instructions]`
|
||||
|
||||
## Surface notes
|
||||
|
||||
- **Text commands** run in the normal chat session (DMs share `main`, groups have their own session).
|
||||
- **Native commands** use isolated sessions: `discord:slash:<userId>`, `slack:slash:<userId>`, `telegram:slash:<userId>`.
|
||||
- **`/stop`** targets the active chat session so it can abort the current run.
|
||||
- **Slack:** `slack.slashCommand` is still supported for a single `/clawd`-style command. If you enable `commands.native`, you must create one Slack slash command per built-in command (same names as `/help`).
|
||||
73
docs/tools/subagents.md
Normal file
@@ -0,0 +1,73 @@
|
||||
---
|
||||
summary: "Sub-agents: spawning isolated agent runs that announce results back to the requester chat"
|
||||
read_when:
|
||||
- You want background/parallel work via the agent
|
||||
- You are changing sessions_spawn or sub-agent tool policy
|
||||
---
|
||||
|
||||
# Sub-agents
|
||||
|
||||
Sub-agents are background agent runs spawned from an existing agent run. They run in their own session (`subagent:<uuid>`) and, when finished, **announce** their result back to the requester chat provider.
|
||||
|
||||
Primary goals:
|
||||
- Parallelize “research / long task / slow tool” work without blocking the main run.
|
||||
- Keep sub-agents isolated by default (session separation + optional sandboxing).
|
||||
- Keep the tool surface hard to misuse: sub-agents do **not** get session tools by default.
|
||||
- Avoid nested fan-out: sub-agents cannot spawn sub-agents.
|
||||
|
||||
## Tool
|
||||
|
||||
Use `sessions_spawn`:
|
||||
- Starts a sub-agent run (`deliver: false`, global lane: `subagent`)
|
||||
- Then runs an announce step and posts the announce reply to the requester chat provider
|
||||
|
||||
Tool params:
|
||||
- `task` (required)
|
||||
- `label?` (optional)
|
||||
- `model?` (optional; overrides the sub-agent model; invalid values error)
|
||||
- `timeoutSeconds?` (default `0`; `0` = fire-and-forget)
|
||||
- `cleanup?` (`delete|keep`, default `delete`)
|
||||
|
||||
## Announce
|
||||
|
||||
Sub-agents report back via an announce step:
|
||||
- The announce step runs inside the sub-agent session (not the requester session).
|
||||
- If the sub-agent replies exactly `ANNOUNCE_SKIP`, nothing is posted.
|
||||
- Otherwise the announce reply is posted to the requester chat provider via the gateway `send` method.
|
||||
|
||||
## Tool Policy (sub-agent tools)
|
||||
|
||||
By default, sub-agents get **all tools except session tools**:
|
||||
- `sessions_list`
|
||||
- `sessions_history`
|
||||
- `sessions_send`
|
||||
- `sessions_spawn`
|
||||
|
||||
Override via config:
|
||||
|
||||
```json5
{
  agent: {
    subagents: {
      maxConcurrent: 1,
      tools: {
        // deny wins
        deny: ["gateway", "cron"],
        // if allow is set, it becomes allow-only (deny still wins)
        // allow: ["read", "bash", "process"]
      }
    }
  }
}
```
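
The allow/deny semantics in the config comments can be sketched as a simple filter. This is an illustration only: `subagentTools` and `ToolPolicy` are hypothetical, and the sketch assumes session tools stay excluded even when listed in `allow`, which the docs do not state either way.

```typescript
// Hypothetical filter: deny wins, and `allow` (when set) makes the list allow-only.
interface ToolPolicy {
  allow?: string[];
  deny?: string[];
}

const SESSION_TOOLS = ["sessions_list", "sessions_history", "sessions_send", "sessions_spawn"];

function subagentTools(available: string[], policy: ToolPolicy = {}): string[] {
  // Assumption: session tools are always treated as denied for sub-agents.
  const deny = new Set([...SESSION_TOOLS, ...(policy.deny ?? [])]);
  return available.filter((tool) => {
    if (deny.has(tool)) return false; // deny wins over allow
    return policy.allow ? policy.allow.includes(tool) : true;
  });
}
```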
|
||||
|
||||
## Concurrency
|
||||
|
||||
Sub-agents use a dedicated in-process queue lane:
|
||||
- Lane name: `subagent`
|
||||
- Concurrency: `agent.subagents.maxConcurrent` (default `1`)
|
||||
|
||||
## Limitations
|
||||
|
||||
- Sub-agent announce is **best-effort**. If the gateway restarts, pending “announce back” work is lost.
|
||||
- Sub-agents still share the same gateway process resources; treat `maxConcurrent` as a safety valve.
|
||||
46
docs/tools/thinking.md
Normal file
@@ -0,0 +1,46 @@
|
||||
---
|
||||
summary: "Directive syntax for /think + /verbose and how they affect model reasoning"
|
||||
read_when:
|
||||
- Adjusting thinking or verbose directive parsing or defaults
|
||||
---
|
||||
# Thinking Levels (/think directives)
|
||||
|
||||
## What it does
|
||||
- Inline directive in any inbound body: `/t <level>`, `/think:<level>`, or `/thinking <level>`.
|
||||
- Levels (aliases): `off | minimal | low | medium | high`
  - minimal → “think”
  - low → “think hard”
  - medium → “think harder”
  - high → “ultrathink” (max budget)
  - `highest`, `max` map to `high`.
|
||||
|
||||
## Resolution order
|
||||
1. Inline directive on the message (applies only to that message).
|
||||
2. Session override (set by sending a directive-only message).
|
||||
3. Global default (`agent.thinkingDefault` in config).
|
||||
4. Fallback: low for reasoning-capable models; off otherwise.
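
The resolution order above is a first-match chain. A minimal sketch, assuming the inline, session, and global values have already been parsed; `normalizeLevel`, `resolveThinkLevel`, and `ThinkInputs` are hypothetical names, not Clawdbot's API:

```typescript
// Hypothetical resolution of the effective thinking level.
type ThinkLevel = "off" | "minimal" | "low" | "medium" | "high";

const LEVELS: ThinkLevel[] = ["off", "minimal", "low", "medium", "high"];

// Maps the documented aliases (`highest`, `max`) to `high`; unknown values are rejected.
function normalizeLevel(raw: string): ThinkLevel | undefined {
  const value = raw.trim().toLowerCase();
  if (value === "highest" || value === "max") return "high";
  return LEVELS.find((level) => level === value);
}

interface ThinkInputs {
  inline?: ThinkLevel; // directive on this message
  session?: ThinkLevel; // session override
  globalDefault?: ThinkLevel; // agent.thinkingDefault
  reasoningCapable: boolean;
}

function resolveThinkLevel(i: ThinkInputs): ThinkLevel {
  // First defined value wins; otherwise fall back by model capability.
  return i.inline ?? i.session ?? i.globalDefault ?? (i.reasoningCapable ? "low" : "off");
}
```

Note that an explicit `off` still counts as a defined value, so an inline `/think:off` overrides a session default of `high` for that message.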
|
||||
|
||||
## Setting a session default
|
||||
- Send a message that is **only** the directive (whitespace allowed), e.g. `/think:medium` or `/t high`.
|
||||
- The level persists for the current session (per-sender by default) and is cleared by `/think:off` or a session idle reset.
- A confirmation reply is sent (`Thinking level set to high.` / `Thinking disabled.`). If the level is invalid (e.g. `/thinking big`), the command is rejected with a hint and the session state is left unchanged.
|
||||
|
||||
## Application by agent
|
||||
- **Embedded Pi**: the resolved level is passed to the in-process Pi agent runtime.
|
||||
|
||||
## Verbose directives (/verbose or /v)
|
||||
- Levels: `on|full` or `off` (default).
|
||||
- A directive-only message toggles session verbose and replies `Verbose logging enabled.` / `Verbose logging disabled.`; invalid levels return a hint without changing state.
|
||||
- Inline directive affects only that message; session/global defaults apply otherwise.
|
||||
- When verbose is on, agents that emit structured tool results (Pi, other JSON agents) send each tool result back as its own metadata-only message, prefixed with `<emoji> <tool-name>: <arg>` when available (path/command); the tool output itself is not forwarded. These tool summaries are sent as soon as each tool finishes (separate bubbles), not as streaming deltas. If you toggle `/verbose on|off` while a run is in-flight, subsequent tool bubbles honor the new setting.
|
||||
|
||||
## Related
|
||||
- Elevated mode docs live in [`docs/elevated.md`](/elevated).
|
||||
|
||||
## Heartbeats
|
||||
- Heartbeat probe body is the configured heartbeat prompt (default: `Read HEARTBEAT.md if exists. Consider outstanding tasks. Checkup sometimes on your human during (user local) day time.`). Inline directives in a heartbeat message apply as usual (but avoid changing session defaults from heartbeats).
|
||||
|
||||
## Web chat UI
|
||||
- The web chat thinking selector mirrors the session's stored level from the inbound session store/config when the page loads.
|
||||
- Picking another level applies only to the next message (`thinkingOnce`); after sending, the selector snaps back to the stored session level.
|
||||
- To change the session default, send a `/think:<level>` directive (as before); the selector will reflect it after the next reload.
|
||||