feat: add web tools

2026-01-15 04:07:29 +00:00
parent 31d3aef8d6
commit f275cc180b
18 changed files with 736 additions and 165 deletions
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -889,6 +889,7 @@
          "tools",
          "plugin",
          "tools/exec",
+          "tools/web",
          "tools/apply-patch",
          "tools/elevated",
          "tools/browser",
--- a/docs/gateway/configuration-examples.md
+++ b/docs/gateway/configuration-examples.md
@@ -384,7 +384,7 @@ Save to `~/.clawdbot/clawdbot.json` and you can DM the bot from that number.
  },

  skills: {
-    allowBundled: ["brave-search", "gemini"],
+    allowBundled: ["gemini", "peekaboo"],
    load: {
      extraDirs: ["~/Projects/agent-scripts/skills"]
    },
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -1605,6 +1605,18 @@ of `every`, keep `HEARTBEAT.md` tiny, and/or choose a cheaper `model`.
 Note: `applyPatch` is only under `tools.exec` (no `tools.bash` alias).
 Legacy: `tools.bash` is still accepted as an alias.

+`tools.web` configures web search + fetch tools:
+- `tools.web.search.enabled` (default: true when an API key is present)
+- `tools.web.search.apiKey` (or `BRAVE_API_KEY` env var)
+- `tools.web.search.maxResults` (1–10, default 5)
+- `tools.web.search.timeoutSeconds` (default 30)
+- `tools.web.search.cacheTtlMinutes` (default 15)
+- `tools.web.fetch.enabled` (default false; sandboxed sessions enable it automatically unless it is explicitly set to false)
+- `tools.web.fetch.maxChars` (default 50000)
+- `tools.web.fetch.timeoutSeconds` (default 30)
+- `tools.web.fetch.cacheTtlMinutes` (default 15)
+- `tools.web.fetch.userAgent` (optional override)
+
 `agents.defaults.subagents` configures sub-agent defaults:
 - `model`: default model for spawned sub-agents (string or `{ primary, fallbacks }`). If omitted, sub-agents inherit the caller’s model unless overridden per agent or per call.
 - `maxConcurrent`: max concurrent sub-agent runs (default 1)
@@ -1685,6 +1697,7 @@ Tool groups (shorthands) work in **global** and **per-agent** tool policies:
 - `group:fs`: `read`, `write`, `edit`, `apply_patch`
 - `group:sessions`: `sessions_list`, `sessions_history`, `sessions_send`, `sessions_spawn`, `session_status`
 - `group:memory`: `memory_search`, `memory_get`
+- `group:web`: `web_search`, `web_fetch`
 - `group:ui`: `browser`, `canvas`
 - `group:automation`: `cron`, `gateway`
 - `group:messaging`: `message`
@@ -2210,7 +2223,7 @@ Example:
 ```json5
 {
  skills: {
-    allowBundled: ["brave-search", "gemini"],
+    allowBundled: ["gemini", "peekaboo"],
    load: {
      extraDirs: [
        "~/Projects/agent-scripts/skills",
--- a/docs/tools/index.md
+++ b/docs/tools/index.md
@@ -131,6 +131,7 @@ Available groups:
 - `group:fs`: `read`, `write`, `edit`, `apply_patch`
 - `group:sessions`: `sessions_list`, `sessions_history`, `sessions_send`, `sessions_spawn`, `session_status`
 - `group:memory`: `memory_search`, `memory_get`
+- `group:web`: `web_search`, `web_fetch`
 - `group:ui`: `browser`, `canvas`
 - `group:automation`: `cron`, `gateway`
 - `group:messaging`: `message`
@@ -188,6 +189,33 @@ Notes:
 - `log` supports line-based `offset`/`limit` (omit `offset` to grab the last N lines).
 - `process` is scoped per agent; sessions from other agents are not visible.

+### `web_search`
+Search the web using Brave Search API.
+
+Core parameters:
+- `query` (required)
+- `count` (1–10; default from `tools.web.search.maxResults`)
+
+Notes:
+- Requires `BRAVE_API_KEY` or `tools.web.search.apiKey`.
+- Enable via `tools.web.search.enabled`.
+- Responses are cached (default 15 min).
+- See [Web tools](/tools/web) for setup.
+
+### `web_fetch`
+Fetch and extract readable content from a URL (HTML → markdown/text).
+
+Core parameters:
+- `url` (required)
+- `extractMode` (`markdown` | `text`)
+- `maxChars` (truncate long pages)
+
+Notes:
+- Enable via `tools.web.fetch.enabled`.
+- Responses are cached (default 15 min).
+- For JS-heavy sites, prefer the browser tool.
+- See [Web tools](/tools/web) for setup.
+
 ### `browser`
 Control the dedicated clawd browser.

--- a/docs/tools/skills-config.md
+++ b/docs/tools/skills-config.md
@@ -11,7 +11,7 @@ All skills-related configuration lives under `skills` in `~/.clawdbot/clawdbot.j
 ```json5
 {
  skills: {
-    allowBundled: ["brave-search", "gemini"],
+    allowBundled: ["gemini", "peekaboo"],
    load: {
      extraDirs: [
        "~/Projects/agent-scripts/skills",
--- a/docs/tools/web.md
+++ b/docs/tools/web.md
@@ -0,0 +1,103 @@
+---
+summary: "Web search + fetch tools (Brave Search API)"
+read_when:
+  - You want to enable web_search or web_fetch
+  - You need Brave Search API key setup
+---
+
+# Web tools
+
+Clawdbot ships two lightweight web tools:
+
+- `web_search` — Brave Search API queries (fast, structured results).
+- `web_fetch` — HTTP fetch + readable extraction (HTML → markdown/text).
+
+These are **not** browser automation. For JS-heavy sites or logins, use the
+[Browser tool](/tools/browser).
+
+## How it works
+
+- `web_search` calls Brave’s Search API and returns structured results
+  (title, URL, snippet). No browser is involved.
+- Results are cached by query for 15 minutes by default (configurable via `tools.web.search.cacheTtlMinutes`).
+- `web_fetch` does a plain HTTP GET and extracts readable content
+  (HTML → markdown/text). It does **not** execute JavaScript.
+- In sandboxed sessions, `web_fetch` is enabled automatically (unless explicitly disabled).
+
+## web_search
+
+Search the web with Brave’s API.
+
+### Requirements
+
+- `tools.web.search.enabled: true`
+- Brave API key via `BRAVE_API_KEY` **or** `tools.web.search.apiKey`
+
+### Config
+
+```json5
+{
+  tools: {
+    web: {
+      search: {
+        enabled: true,
+        apiKey: "BRAVE_API_KEY_HERE", // optional if BRAVE_API_KEY is set
+        maxResults: 5,
+        timeoutSeconds: 30,
+        cacheTtlMinutes: 15
+      }
+    }
+  }
+}
+```
+
+### Tool parameters
+
+- `query` (required)
+- `count` (1–10; default from config)
+
+## web_fetch
+
+Fetch a URL and extract readable content.
+
+### Requirements
+
+- `tools.web.fetch.enabled: true`
+
+### Config
+
+```json5
+{
+  tools: {
+    web: {
+      fetch: {
+        enabled: true,
+        maxChars: 50000,
+        timeoutSeconds: 30,
+        cacheTtlMinutes: 15,
+        userAgent: "clawdbot/2026.1.14"
+      }
+    }
+  }
+}
+```
+
+### Tool parameters
+
+- `url` (required, http/https only)
+- `extractMode` (`markdown` | `text`)
+- `maxChars` (truncate long pages)
+
+Notes:
+- `web_fetch` is best-effort extraction; some sites will need the browser tool.
+- Responses are cached (default 15 minutes) to reduce repeated fetches.
+- If you use tool profiles/allowlists, add `web_search`/`web_fetch` or `group:web`.
+
+## Getting a Brave API key
+
+1) Create a Brave Search API account at https://brave.com/search/api/
+2) Generate an API key in the dashboard.
+3) Set `BRAVE_API_KEY` in your environment or paste it into `tools.web.search.apiKey`.
+
+Brave provides a free tier plus paid plans; check the Brave API portal for the
+current limits and pricing.
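
Review note: the `count` bounds (1–10, default from `tools.web.search.maxResults`) and the per-query cache (`cacheTtlMinutes`, default 15) described in `docs/tools/web.md` above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `clampCount`, `TtlCache`, and the injectable clock are hypothetical names, and the real implementation may key, round, or expire entries differently.

```typescript
// Hypothetical sketch: none of these names are taken from the commit.

/** Clamp a requested result count to the documented 1–10 range. */
function clampCount(requested: number | undefined, configDefault = 5): number {
  // Fall back to the configured default when no usable number is given.
  const value =
    requested !== undefined && Number.isFinite(requested)
      ? requested
      : configDefault;
  return Math.min(10, Math.max(1, Math.floor(value)));
}

/** Minimal TTL cache keyed by string, with an injectable clock for tests. */
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop the entry and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Example: a 15-minute cache, matching the documented default TTL.
const searchCache = new TtlCache<string[]>(15 * 60_000);
searchCache.set("clawdbot web tools", ["https://example.com/result"]);
```

Injecting the clock keeps expiry testable without waiting for real time to pass; a production cache would likely also bound its size, which this sketch omits.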