feat(web): add Perplexity Sonar as alternative search provider
@@ -1,15 +1,16 @@
 ---
-summary: "Web search + fetch tools (Brave Search API)"
+summary: "Web search + fetch tools (Brave Search API, Perplexity via OpenRouter)"
 read_when:
   - You want to enable web_search or web_fetch
   - You need Brave Search API key setup
+  - You want to use Perplexity Sonar for web search
 ---
 
 # Web tools
 
 Clawdbot ships two lightweight web tools:
 
-- `web_search` — Brave Search API queries (fast, structured results).
+- `web_search` — Search the web via Brave Search API (default) or Perplexity Sonar (via OpenRouter).
 - `web_fetch` — HTTP fetch + readable extraction (HTML → markdown/text).
 
 These are **not** browser automation. For JS-heavy sites or logins, use the
@@ -17,13 +18,35 @@ These are **not** browser automation. For JS-heavy sites or logins, use the
 
 ## How it works
 
-- `web_search` calls Brave’s Search API and returns structured results
-  (title, URL, snippet). No browser is involved.
+- `web_search` calls your configured provider and returns results.
+  - **Brave** (default): returns structured results (title, URL, snippet).
+  - **Perplexity**: returns AI-synthesized answers with citations from real-time web search.
 - Results are cached by query for 15 minutes (configurable).
 - `web_fetch` does a plain HTTP GET and extracts readable content
   (HTML → markdown/text). It does **not** execute JavaScript.
 - `web_fetch` is enabled by default (unless explicitly disabled).
 
+## Choosing a search provider
+
+| Provider | Pros | Cons | API Key |
+|----------|------|------|---------|
+| **Brave** (default) | Fast, structured results, free tier | Traditional search results | `BRAVE_API_KEY` |
+| **Perplexity** | AI-synthesized answers, citations, real-time | Requires OpenRouter credits | `OPENROUTER_API_KEY` or `PERPLEXITY_API_KEY` |
+
+Set the provider in config:
+
+```json5
+{
+  tools: {
+    web: {
+      search: {
+        provider: "brave" // or "perplexity"
+      }
+    }
+  }
+}
+```
+
 ## Getting a Brave API key
 
 1) Create a Brave Search API account at https://brave.com/search/api/
@@ -42,14 +65,62 @@ current limits and pricing.
 environment. For a daemon install, put it in `~/.clawdbot/.env` (or your
 service environment). See [Env vars](/start/faq#how-does-clawdbot-load-environment-variables).
 
+## Using Perplexity (via OpenRouter)
+
+Perplexity Sonar models have built-in web search capabilities and return AI-synthesized
+answers with citations. You can use them via OpenRouter (no credit card required - supports
+crypto/prepaid).
+
+### Getting an OpenRouter API key
+
+1) Create an account at https://openrouter.ai/
+2) Add credits (supports crypto, prepaid, or credit card)
+3) Generate an API key in your account settings
+
+### Setting up Perplexity search
+
+```json5
+{
+  tools: {
+    web: {
+      search: {
+        enabled: true,
+        provider: "perplexity",
+        perplexity: {
+          // API key (optional if OPENROUTER_API_KEY or PERPLEXITY_API_KEY is set)
+          apiKey: "sk-or-v1-...",
+          // Base URL (defaults to OpenRouter)
+          baseUrl: "https://openrouter.ai/api/v1",
+          // Model (defaults to perplexity/sonar-pro)
+          model: "perplexity/sonar-pro"
+        }
+      }
+    }
+  }
+}
+```
+
+**Environment alternative:** set `OPENROUTER_API_KEY` or `PERPLEXITY_API_KEY` in the Gateway
+environment. For a daemon install, put it in `~/.clawdbot/.env`.
+
+### Available Perplexity models
+
+| Model | Description | Best for |
+|-------|-------------|----------|
+| `perplexity/sonar` | Fast Q&A with web search | Quick lookups |
+| `perplexity/sonar-pro` (default) | Multi-step reasoning with web search | Complex questions |
+| `perplexity/sonar-reasoning-pro` | Chain-of-thought analysis | Deep research |
+
 ## web_search
 
-Search the web with Brave’s API.
+Search the web using your configured provider.
 
 ### Requirements
 
 - `tools.web.search.enabled` must not be `false` (default: enabled)
-- Brave API key (recommended: `clawdbot configure --section web`, or set `BRAVE_API_KEY`)
+- API key for your chosen provider:
+  - **Brave**: `BRAVE_API_KEY` or `tools.web.search.apiKey`
+  - **Perplexity**: `OPENROUTER_API_KEY`, `PERPLEXITY_API_KEY`, or `tools.web.search.perplexity.apiKey`
 
 ### Config
 
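The provider and API-key rules described in the documentation changes can be sketched as two small pure functions. This is only an illustration of the documented precedence (config value first, then `PERPLEXITY_API_KEY`, then `OPENROUTER_API_KEY`, with Brave as the fallback provider). The names `pickProvider` and `pickPerplexityKey` and the explicit `env` parameter are hypothetical; they are used here so the logic can run without a real process environment.

```typescript
type Provider = "brave" | "perplexity";

// Hypothetical helper: normalizes the configured provider name and
// falls back to "brave" for anything that is not "perplexity".
function pickProvider(raw: unknown): Provider {
  const value = typeof raw === "string" ? raw.trim().toLowerCase() : "";
  return value === "perplexity" ? "perplexity" : "brave";
}

// Hypothetical helper: returns the first non-blank key in the order
// config value, PERPLEXITY_API_KEY, OPENROUTER_API_KEY.
function pickPerplexityKey(
  configKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  const fromConfig = (configKey ?? "").trim();
  const fromPerplexity = (env.PERPLEXITY_API_KEY ?? "").trim();
  const fromOpenRouter = (env.OPENROUTER_API_KEY ?? "").trim();
  return fromConfig || fromPerplexity || fromOpenRouter || undefined;
}
```

Passing the environment in as a plain object keeps the sketch easy to exercise in isolation; the commit itself reads `process.env` directly.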
@@ -5,7 +5,7 @@ import { stringEnum } from "../schema/typebox.js";
 import type { AnyAgentTool } from "./common.js";
 import { jsonResult, readNumberParam, readStringParam } from "./common.js";
 
-const SEARCH_PROVIDERS = ["brave"] as const;
+const SEARCH_PROVIDERS = ["brave", "perplexity"] as const;
 const EXTRACT_MODES = ["markdown", "text"] as const;
 
 const DEFAULT_SEARCH_COUNT = 5;
@@ -20,6 +20,8 @@ const DEFAULT_FETCH_USER_AGENT =
   "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36";
 
 const BRAVE_SEARCH_ENDPOINT = "https://api.search.brave.com/res/v1/web/search";
+const DEFAULT_PERPLEXITY_BASE_URL = "https://openrouter.ai/api/v1";
+const DEFAULT_PERPLEXITY_MODEL = "perplexity/sonar-pro";
 
 type WebSearchConfig = NonNullable<ClawdbotConfig["tools"]>["web"] extends infer Web
   ? Web extends { search?: infer Search }
@@ -196,7 +198,15 @@ function resolveFirecrawlMaxAgeMsOrDefault(firecrawl?: FirecrawlFetchConfig): nu
   return DEFAULT_FIRECRAWL_MAX_AGE_MS;
 }
 
-function missingSearchKeyPayload() {
+function missingSearchKeyPayload(provider: (typeof SEARCH_PROVIDERS)[number]) {
+  if (provider === "perplexity") {
+    return {
+      error: "missing_perplexity_api_key",
+      message:
+        "web_search (perplexity) needs an API key. Set PERPLEXITY_API_KEY or OPENROUTER_API_KEY in the Gateway environment, or configure tools.web.search.perplexity.apiKey.",
+      docs: "https://docs.clawd.bot/tools/web",
+    };
+  }
   return {
     error: "missing_brave_api_key",
     message:
@@ -210,10 +220,50 @@ function resolveSearchProvider(search?: WebSearchConfig): (typeof SEARCH_PROVIDE
     search && "provider" in search && typeof search.provider === "string"
       ? search.provider.trim().toLowerCase()
       : "";
+  if (raw === "perplexity") return "perplexity";
   if (raw === "brave") return "brave";
   return "brave";
 }
 
+type PerplexityConfig = {
+  apiKey?: string;
+  baseUrl?: string;
+  model?: string;
+};
+
+function resolvePerplexityConfig(search?: WebSearchConfig): PerplexityConfig {
+  if (!search || typeof search !== "object") return {};
+  const perplexity = "perplexity" in search ? search.perplexity : undefined;
+  if (!perplexity || typeof perplexity !== "object") return {};
+  return perplexity as PerplexityConfig;
+}
+
+function resolvePerplexityApiKey(perplexity?: PerplexityConfig): string | undefined {
+  const fromConfig =
+    perplexity && "apiKey" in perplexity && typeof perplexity.apiKey === "string"
+      ? perplexity.apiKey.trim()
+      : "";
+  const fromEnvPerplexity = (process.env.PERPLEXITY_API_KEY ?? "").trim();
+  const fromEnvOpenRouter = (process.env.OPENROUTER_API_KEY ?? "").trim();
+  return fromConfig || fromEnvPerplexity || fromEnvOpenRouter || undefined;
+}
+
+function resolvePerplexityBaseUrl(perplexity?: PerplexityConfig): string {
+  const fromConfig =
+    perplexity && "baseUrl" in perplexity && typeof perplexity.baseUrl === "string"
+      ? perplexity.baseUrl.trim()
+      : "";
+  return fromConfig || DEFAULT_PERPLEXITY_BASE_URL;
+}
+
+function resolvePerplexityModel(perplexity?: PerplexityConfig): string {
+  const fromConfig =
+    perplexity && "model" in perplexity && typeof perplexity.model === "string"
+      ? perplexity.model.trim()
+      : "";
+  return fromConfig || DEFAULT_PERPLEXITY_MODEL;
+}
+
 function resolveTimeoutSeconds(value: unknown, fallback: number): number {
   const parsed = typeof value === "number" && Number.isFinite(value) ? value : fallback;
   return Math.max(1, Math.floor(parsed));
@@ -486,6 +536,56 @@ export async function fetchFirecrawlContent(params: {
   };
 }
 
+type PerplexitySearchResponse = {
+  choices?: Array<{
+    message?: {
+      content?: string;
+    };
+  }>;
+  citations?: string[];
+};
+
+async function runPerplexitySearch(params: {
+  query: string;
+  apiKey: string;
+  baseUrl: string;
+  model: string;
+  timeoutSeconds: number;
+}): Promise<{ content: string; citations: string[] }> {
+  const endpoint = `${params.baseUrl.replace(/\/$/, "")}/chat/completions`;
+
+  const res = await fetch(endpoint, {
+    method: "POST",
+    headers: {
+      "Content-Type": "application/json",
+      Authorization: `Bearer ${params.apiKey}`,
+      "HTTP-Referer": "https://clawdbot.com",
+      "X-Title": "Clawdbot Web Search",
+    },
+    body: JSON.stringify({
+      model: params.model,
+      messages: [
+        {
+          role: "user",
+          content: params.query,
+        },
+      ],
+    }),
+    signal: withTimeout(undefined, params.timeoutSeconds * 1000),
+  });
+
+  if (!res.ok) {
+    const detail = await readResponseText(res);
+    throw new Error(`Perplexity API error (${res.status}): ${detail || res.statusText}`);
+  }
+
+  const data = (await res.json()) as PerplexitySearchResponse;
+  const content = data.choices?.[0]?.message?.content ?? "No response";
+  const citations = data.citations ?? [];
+
+  return { content, citations };
+}
+
 async function runWebSearch(params: {
   query: string;
   count: number;
@@ -496,6 +596,8 @@ async function runWebSearch(params: {
   country?: string;
   search_lang?: string;
   ui_lang?: string;
+  perplexityBaseUrl?: string;
+  perplexityModel?: string;
 }): Promise<Record<string, unknown>> {
   const cacheKey = normalizeCacheKey(
     `${params.provider}:${params.query}:${params.count}:${params.country || "default"}:${params.search_lang || "default"}:${params.ui_lang || "default"}`,
@@ -504,6 +606,28 @@ async function runWebSearch(params: {
   if (cached) return { ...cached.value, cached: true };
 
   const start = Date.now();
+
+  if (params.provider === "perplexity") {
+    const { content, citations } = await runPerplexitySearch({
+      query: params.query,
+      apiKey: params.apiKey,
+      baseUrl: params.perplexityBaseUrl ?? DEFAULT_PERPLEXITY_BASE_URL,
+      model: params.perplexityModel ?? DEFAULT_PERPLEXITY_MODEL,
+      timeoutSeconds: params.timeoutSeconds,
+    });
+
+    const payload = {
+      query: params.query,
+      provider: params.provider,
+      model: params.perplexityModel ?? DEFAULT_PERPLEXITY_MODEL,
+      tookMs: Date.now() - start,
+      content,
+      citations,
+    };
+    writeCache(SEARCH_CACHE, cacheKey, payload, params.cacheTtlMs);
+    return payload;
+  }
+
   if (params.provider !== "brave") {
     throw new Error("Unsupported web search provider.");
   }
@@ -772,16 +896,30 @@ export function createWebSearchTool(options?: {
 }): AnyAgentTool | null {
   const search = resolveSearchConfig(options?.config);
   if (!resolveSearchEnabled({ search, sandboxed: options?.sandboxed })) return null;
+
+  const provider = resolveSearchProvider(search);
+  const perplexityConfig = resolvePerplexityConfig(search);
+
+  // Determine description based on provider
+  const description =
+    provider === "perplexity"
+      ? "Search the web using Perplexity Sonar (via OpenRouter). Returns AI-synthesized answers with citations from real-time web search."
+      : "Search the web using Brave Search API. Supports region-specific and localized search via country and language parameters. Returns titles, URLs, and snippets for fast research.";
+
   return {
     label: "Web Search",
     name: "web_search",
-    description:
-      "Search the web using Brave Search API. Supports region-specific and localized search via country and language parameters. Returns titles, URLs, and snippets for fast research.",
+    description,
     parameters: WebSearchSchema,
     execute: async (_toolCallId, args) => {
-      const apiKey = resolveSearchApiKey(search);
+      // Resolve API key based on provider
+      const apiKey =
+        provider === "perplexity"
+          ? resolvePerplexityApiKey(perplexityConfig)
+          : resolveSearchApiKey(search);
+
       if (!apiKey) {
-        return jsonResult(missingSearchKeyPayload());
+        return jsonResult(missingSearchKeyPayload(provider));
       }
       const params = args as Record<string, unknown>;
       const query = readStringParam(params, "query", { required: true });
@@ -796,10 +934,12 @@ export function createWebSearchTool(options?: {
         apiKey,
         timeoutSeconds: resolveTimeoutSeconds(search?.timeoutSeconds, DEFAULT_TIMEOUT_SECONDS),
         cacheTtlMs: resolveCacheTtlMs(search?.cacheTtlMinutes, DEFAULT_CACHE_TTL_MINUTES),
-        provider: resolveSearchProvider(search),
+        provider,
         country,
         search_lang,
         ui_lang,
+        perplexityBaseUrl: resolvePerplexityBaseUrl(perplexityConfig),
+        perplexityModel: resolvePerplexityModel(perplexityConfig),
       });
       return jsonResult(result);
     },
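The request that `runPerplexitySearch` sends, and the way it reads the reply, can be tried without a network call by separating the two steps. The sketch below is an illustration only: `buildChatRequest` and `readAnswer` are hypothetical names, the optional `HTTP-Referer` and `X-Title` headers from the commit are omitted for brevity, and the response shape is the one declared in the diff rather than a guaranteed API contract.

```typescript
// Response shape as declared in the diff (PerplexitySearchResponse).
type ChatResponse = {
  choices?: Array<{ message?: { content?: string } }>;
  citations?: string[];
};

// Hypothetical helper: builds the endpoint and fetch options for a
// chat-completions call, stripping one trailing slash from the base URL.
function buildChatRequest(baseUrl: string, apiKey: string, model: string, query: string) {
  const endpoint = `${baseUrl.replace(/\/$/, "")}/chat/completions`;
  return {
    endpoint,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content: query }] }),
    },
  };
}

// Hypothetical helper: extracts the answer text and citation URLs,
// using the same fallbacks as the commit ("No response" and an empty list).
function readAnswer(data: ChatResponse): { content: string; citations: string[] } {
  return {
    content: data.choices?.[0]?.message?.content ?? "No response",
    citations: data.citations ?? [],
  };
}
```

A caller would pass `endpoint` and `init` to `fetch`, then hand the parsed JSON to `readAnswer`. Keeping both steps pure makes the trailing-slash handling and the fallbacks easy to check in isolation.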
@@ -222,8 +222,8 @@ export type ToolsConfig = {
     search?: {
       /** Enable web search tool (default: true when API key is present). */
       enabled?: boolean;
-      /** Search provider (currently "brave"). */
-      provider?: "brave";
+      /** Search provider ("brave" or "perplexity"). */
+      provider?: "brave" | "perplexity";
       /** Brave Search API key (optional; defaults to BRAVE_API_KEY env var). */
       apiKey?: string;
       /** Default search results count (1-10). */
@@ -232,6 +232,15 @@ export type ToolsConfig = {
       timeoutSeconds?: number;
       /** Cache TTL in minutes for search results. */
       cacheTtlMinutes?: number;
+      /** Perplexity-specific configuration (used when provider="perplexity"). */
+      perplexity?: {
+        /** API key for Perplexity or OpenRouter (defaults to PERPLEXITY_API_KEY or OPENROUTER_API_KEY env var). */
+        apiKey?: string;
+        /** Base URL for API requests (defaults to OpenRouter: https://openrouter.ai/api/v1). */
+        baseUrl?: string;
+        /** Model to use (defaults to "perplexity/sonar-pro"). */
+        model?: string;
+      };
     };
     fetch?: {
       /** Enable web fetch tool (default: true). */
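To see how the optional `perplexity` block behaves when fields are omitted, the following sketch applies the defaults named in the type comments. It uses a local copy of the option shape instead of importing `ToolsConfig`, and `withPerplexityDefaults` is a hypothetical helper written for this illustration, not a function from the commit.

```typescript
// Local copy of the option shape added in this commit; not imported from the project.
type PerplexityOptions = {
  apiKey?: string;
  baseUrl?: string;
  model?: string;
};

// Default values taken from the constants introduced in the diff.
const DEFAULT_BASE_URL = "https://openrouter.ai/api/v1";
const DEFAULT_MODEL = "perplexity/sonar-pro";

// Hypothetical helper: fills in baseUrl and model when they are missing or blank.
function withPerplexityDefaults(options: PerplexityOptions = {}): { baseUrl: string; model: string } {
  const baseUrl = (options.baseUrl ?? "").trim() || DEFAULT_BASE_URL;
  const model = (options.model ?? "").trim() || DEFAULT_MODEL;
  return { baseUrl, model };
}
```

With this shape, a config that sets only `provider: "perplexity"` resolves to the OpenRouter base URL and the `perplexity/sonar-pro` model, which matches the defaults documented in the type comments.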