Description

Quota-aware multi-provider web search for OpenClaw. Supports 12 search providers with automatic failover, task-level deep search (@dual/@deep), real quota ch...

README (SKILL.md)

Free Search Aggregator

Name: Free Search Aggregator
Author: vulcanusalex

Reliable, provider-diverse web search for OpenClaw with high uptime + low operator overhead.

Why use this skill

12 search providers, 5 of which are listed as needing no API key (duckduckgo, bing_html, mojeek, wikipedia, searxng)
Automatic failover: if one provider fails, the next is tried instantly
Quota-aware: tracks daily usage, warns at 80%, skips exhausted providers
Task search mode for multi-angle research queries
Built-in storage lifecycle (cache / index / report), no workspace clutter
Self-healing: health-based smart routing automatically promotes reliable providers
Quality optimization: relevance scoring, fuzzy dedup, domain diversity, re-ranking
Auto-discovery: probes candidate search engines and SearXNG instances for new sources
Self-diagnostic: doctor and setup commands for zero-friction onboarding

Provider Overview

Provider	Key Required	Free Quota	Index Source	Notes
`brave`	BRAVE_API_KEY	2000/day	Brave independent	High quality, privacy-friendly
`exa`	EXA_API_KEY	~33/day (1k/mo)	Neural + web	Semantic search, unique finds
`tavily`	TAVILY_API_KEY	1000/day	Web (AI-optimized)	Designed for AI agents
`duckduckgo`	None	~500/day	Bing + own	No key, privacy-focused
`bing_html`	None	~300/day	Microsoft Bing RSS	No key, stable XML feed
`mojeek`	None (or MOJEEK_API_KEY)	200/day	Mojeek independent	Non-Google/Bing index
`serper`	SERPER_API_KEY	2500/day	Google	High quota free tier
`searchapi`	SEARCHAPI_API_KEY	100/mo	Google / Bing	Multi-engine
`google_cse`	GOOGLE_API_KEY + GOOGLE_CX	100/day	Google	Official Google API
`baidu`	BAIDU_API_KEY	200/day	Baidu	Best for Chinese content
`wikipedia`	None	1000/day	Wikipedia	Factual/encyclopedic queries
`searxng`	None	unlimited (self-hosted)	Meta (all engines)	Requires own instance

Total daily quota (all keys configured): about 7,800 requests/day, summing the Free Quota column above and excluding self-hosted SearXNG

Credential model (important)

No mandatory API key — DuckDuckGo + Bing RSS + Mojeek + Wikipedia work out of the box.
API-key providers fail gracefully if key is missing (AuthError → skip, no quota consumed, no latency):
- BRAVE_API_KEY
- EXA_API_KEY
- TAVILY_API_KEY
- SERPER_API_KEY
- SEARCHAPI_API_KEY
- GOOGLE_API_KEY + GOOGLE_CX
- BAIDU_API_KEY
- MOJEEK_API_KEY (optional — without it uses HTML scraping)
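
The skip-on-missing-key behavior can be pictured with a small sketch. The `REQUIRED_KEYS` mapping and the `usable_providers` helper below are hypothetical names invented for illustration; only the environment variable names come from the list above, and the skill's real lookup logic may differ.

```python
import os
from typing import Dict, List, Mapping

# Hypothetical mapping: provider name -> environment variables it needs.
# Variable names are taken from the list above; the structure is an assumption.
REQUIRED_KEYS: Dict[str, List[str]] = {
    "brave": ["BRAVE_API_KEY"],
    "exa": ["EXA_API_KEY"],
    "tavily": ["TAVILY_API_KEY"],
    "duckduckgo": [],
    "bing_html": [],
    "mojeek": [],  # MOJEEK_API_KEY is optional, so nothing is required here
    "serper": ["SERPER_API_KEY"],
    "searchapi": ["SEARCHAPI_API_KEY"],
    "google_cse": ["GOOGLE_API_KEY", "GOOGLE_CX"],
    "baidu": ["BAIDU_API_KEY"],
    "wikipedia": [],
}


def usable_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return providers whose required variables are all set and non-empty."""
    return [
        name
        for name, keys in REQUIRED_KEYS.items()
        if all(env.get(key) for key in keys)
    ]


# With no keys configured, only the keyless providers remain.
print(usable_providers({}))
```

A provider that needs two variables, such as `google_cse`, is skipped unless both are present.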

Core capabilities

1. Search failover

Default provider order:

brave → exa → tavily → duckduckgo → bing_html → mojeek → serper → searchapi → google_cse → baidu → wikipedia

First successful non-empty result returns immediately.

2. Task-level multi-query search

Expands one goal into multiple targeted queries
Aggregates + deduplicates results
Prefix presets:
- default: workers=1
- @dual ... → workers=2
- @deep ... → workers=3 + deeper query coverage
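
One way to read the prefix presets is as a small parsing step in front of the task search. The sketch below assumes the prefix is the first whitespace-separated token; `PRESETS` and `parse_task_prefix` are hypothetical names, and only the worker counts come from the list above.

```python
from typing import Dict, Tuple

# Worker counts follow the presets listed above; the parsing rule is an assumption.
PRESETS: Dict[str, int] = {"@dual": 2, "@deep": 3}


def parse_task_prefix(task: str) -> Tuple[str, int]:
    """Return (task text without prefix, worker count)."""
    head, _, rest = task.strip().partition(" ")
    if head in PRESETS and rest.strip():
        return rest.strip(), PRESETS[head]
    # No recognized prefix: fall back to the conservative default of one worker.
    return task.strip(), 1


print(parse_task_prefix("@dual Compare Claude vs GPT-4 for code generation"))
```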

3. Quota intelligence

Per-provider daily tracking
Real quota retrieval where supported (Tavily, SearchAPI, Brave via probe)
Auto concurrency reduction at 80% quota saturation
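
The 80% threshold can be sketched as a simple state check per provider. Everything below is illustrative: `QuotaState`, `quota_status` and `effective_workers` are hypothetical names, and dropping to a single worker is only one possible reading of "concurrency reduction".

```python
from dataclasses import dataclass

WARN_RATIO = 0.8  # threshold taken from the text above


@dataclass
class QuotaState:
    """Hypothetical per-provider counter for the current day."""
    used: int
    daily_limit: int

    @property
    def ratio(self) -> float:
        # Treat a zero limit as fully used to avoid dividing by zero.
        return self.used / self.daily_limit if self.daily_limit else 1.0


def quota_status(state: QuotaState) -> str:
    """Classify a provider as 'ok', 'warn' (>= 80% used) or 'exhausted'."""
    if state.used >= state.daily_limit:
        return "exhausted"
    if state.ratio >= WARN_RATIO:
        return "warn"
    return "ok"


def effective_workers(requested: int, state: QuotaState) -> int:
    """Assumed policy: fall back to one worker once the warn threshold is reached."""
    return requested if quota_status(state) == "ok" else 1


print(quota_status(QuotaState(used=1600, daily_limit=2000)))
```

A router built on this would skip any provider reporting `exhausted` and keep using those in the `warn` state at reduced concurrency.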

4. Provider health monitoring

Tracks success rate, latency, and error types per provider over time
Computes health scores (success 50%, latency 30%, freshness 20%)
Smart ordering: auto-promotes healthy providers, demotes degraded ones
View dashboard: python3 -m free_search health
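
The 50/30/20 weighting can be written out as a weighted sum. Only the weights come from the text above. How latency and freshness are normalized to a 0..1 range is not stated, so the linear scaling and the `max_latency_s` and `stale_after_h` parameters below are assumptions, as are the function names.

```python
from typing import Dict, List


def health_score(
    success_rate: float,
    avg_latency_s: float,
    hours_since_success: float,
    max_latency_s: float = 10.0,  # assumed: latency at or above this scores 0
    stale_after_h: float = 24.0,  # assumed: no success for this long scores 0
) -> float:
    """Weighted health score in [0, 1]: success 50%, latency 30%, freshness 20%."""
    latency_score = max(0.0, 1.0 - avg_latency_s / max_latency_s)
    freshness_score = max(0.0, 1.0 - hours_since_success / stale_after_h)
    return 0.5 * success_rate + 0.3 * latency_score + 0.2 * freshness_score


def rank_providers(stats: Dict[str, Dict[str, float]]) -> List[str]:
    """Order provider names from healthiest to least healthy."""
    return sorted(stats, key=lambda name: health_score(**stats[name]), reverse=True)


stats = {
    "brave": {"success_rate": 0.6, "avg_latency_s": 4.0, "hours_since_success": 12.0},
    "duckduckgo": {"success_rate": 0.95, "avg_latency_s": 1.0, "hours_since_success": 0.5},
}
print(rank_providers(stats))
```

Under these assumptions a fast, recently successful provider outranks a slower one with a lower success rate, which is the promotion and demotion behavior described above.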

5. Result quality optimization

Relevance scoring (query-title-snippet token overlap)
Enhanced dedup: URL + title similarity (Jaccard threshold)
Domain diversity: limits same-domain results (default max 3)
Automatic filtering of low-quality results (short titles, missing URLs)
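
A compact sketch of the dedup and domain-diversity steps follows. The source does not give the Jaccard threshold, so the `0.8` used here is an assumed value; the `title` and `url` field names and both function names are also assumptions for illustration.

```python
import re
from typing import Dict, List, Set
from urllib.parse import urlparse


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    ta: Set[str] = set(re.findall(r"\w+", a.lower()))
    tb: Set[str] = set(re.findall(r"\w+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def dedup_and_diversify(
    results: List[Dict[str, str]],
    title_threshold: float = 0.8,  # assumed value, not stated in the source
    max_per_domain: int = 3,       # default named in the text above
) -> List[Dict[str, str]]:
    """Drop repeated URLs, near-identical titles, and over-represented domains."""
    kept: List[Dict[str, str]] = []
    seen_urls: Set[str] = set()
    per_domain: Dict[str, int] = {}
    for item in results:
        url = item["url"].rstrip("/")
        domain = urlparse(url).netloc.lower()
        if url in seen_urls:
            continue
        if any(jaccard(item["title"], k["title"]) >= title_threshold for k in kept):
            continue
        if per_domain.get(domain, 0) >= max_per_domain:
            continue
        seen_urls.add(url)
        per_domain[domain] = per_domain.get(domain, 0) + 1
        kept.append(item)
    return kept


sample = [
    {"title": "Python asyncio guide", "url": "https://a.example/one"},
    {"title": "Python asyncio guide", "url": "https://b.example/copy"},
    {"title": "Rust ownership explained", "url": "https://a.example/one/"},
    {"title": "Go channels tutorial", "url": "https://a.example/two"},
    {"title": "Haskell monads intro", "url": "https://a.example/three"},
    {"title": "OCaml modules overview", "url": "https://a.example/four"},
]
print([r["url"] for r in dedup_and_diversify(sample)])
```

In the sample, the second entry is dropped for its duplicate title, the third for its duplicate URL, and the last because `a.example` has already contributed three results.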

6. Source auto-discovery

Probes all configured providers for availability
Scans candidate search engines (Marginalia, Wiby, public SearXNG instances)
Validates response format, latency, and result quality
Generates recommendations for new sources to integrate
Run: python3 -m free_search discover

7. Managed persistence

memory/search-cache/YYYY-MM-DD/*.json
memory/search-index/search-index.jsonl
memory/search-reports/YYYY-MM-DD/*.md
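
The dated layout makes retention easy to reason about. The sketch below builds a day's cache directory and lists directories older than a retention window, in the spirit of the `gc --cache-days` option shown under Quick commands. The helper names and the exact cutoff rule (strictly older than N days) are assumptions, not the skill's documented behavior.

```python
import tempfile
from datetime import date, timedelta
from pathlib import Path
from typing import List


def cache_dir(root: Path, day: date) -> Path:
    """Directory for one day's cached results: <root>/search-cache/YYYY-MM-DD."""
    return root / "search-cache" / day.isoformat()


def expired_cache_dirs(root: Path, today: date, cache_days: int = 14) -> List[Path]:
    """List dated cache directories strictly older than the retention window."""
    cutoff = today - timedelta(days=cache_days)
    expired: List[Path] = []
    for child in sorted((root / "search-cache").iterdir()):
        try:
            day = date.fromisoformat(child.name)
        except ValueError:
            continue  # ignore anything not named YYYY-MM-DD
        if child.is_dir() and day < cutoff:
            expired.append(child)
    return expired


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "memory"
    for name in ("2026-01-01", "2026-01-20"):
        (root / "search-cache" / name).mkdir(parents=True)
    stale = [p.name for p in expired_cache_dirs(root, today=date(2026, 1, 21))]
    print(stale)
```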

Quick commands

# Normal search
scripts/search "latest AI agent frameworks 2026" --max-results 5

# Task search (multi-query, parallel)
scripts/search task "@dual Compare Claude vs GPT-4 for code generation" --max-results 5

# Deep research mode
scripts/search task "@deep autonomous vehicle safety 2026" --max-results 8 --max-queries 10

# Quota status
scripts/status

# Real quota from provider APIs
scripts/remaining --real

# Cleanup cache
python3 -m free_search gc --cache-days 14

# Provider health dashboard
python3 -m free_search health

# Discover new search sources
python3 -m free_search discover

# System diagnostics
python3 -m free_search doctor

# Setup status & recommendations
python3 -m free_search setup

Provider setup guides

Bing RSS (`bing_html`) — No key needed

Uses Bing's built-in RSS endpoint (format=rss) — bypasses bot detection. Works out of the box.

Mojeek — No key needed (API key optional)

Out-of-the-box HTML scraping. For higher quotas/stability:

Register at https://www.mojeek.com/services/search/api/
Set MOJEEK_API_KEY → automatically switches to JSON API mode

Wikipedia — No key needed

Multilingual support — change lang in providers.yaml:

wikipedia:
  lang: it   # en | zh | it | de | fr | ja ...

Exa.ai — API key required

Register at https://exa.ai/
Set EXA_API_KEY
Free tier: 1000 searches/month (~33/day)

Google Custom Search — API key + CX required

Get API key: https://developers.google.com/custom-search/v1/introduction
Create search engine: https://programmablesearchengine.google.com/
Set GOOGLE_API_KEY and GOOGLE_CX
Free tier: 100 queries/day

Baidu Qianfan — API key required

Register at https://cloud.baidu.com/
Set BAIDU_API_KEY
Best for Chinese-language content

SearXNG — Self-hosted instance required

Public instances rate-limit server-to-server requests. Use your own:

docker run -d -p 8080:8080 searxng/searxng

Then in providers.yaml:

searxng:
  endpoint: http://localhost:8080
  enabled: true

Post-install self-check

# 1) Confirm provider load
scripts/status --compact

# 2) Smoke test (uses duckduckgo/bing/mojeek out of the box)
scripts/search "openclaw" --max-results 3 --compact

# 3) Verify storage paths
ls -la /home/openclaw/.openclaw/workspace/memory/search-cache/ | tail -n 5

# 4) Check real quota (optional)
scripts/remaining --real --compact

Output contract (stable)

Search: query, provider, results[], meta.attempted, meta.quota
Task search: task, queries[], grouped_results[], merged_results[], meta
Quota: date, providers[], totals; with --real: real_quota.providers[]
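
A consumer can guard against contract drift with a presence check on the Search fields listed above. This is only a sketch: `missing_search_fields` is a hypothetical helper, the sample payload is made up, and field types are deliberately not checked because the source does not specify them.

```python
from typing import Any, Dict, List


def missing_search_fields(payload: Dict[str, Any]) -> List[str]:
    """Return the contract fields absent from a Search response payload."""
    missing = [k for k in ("query", "provider", "results", "meta") if k not in payload]
    meta = payload.get("meta")
    if isinstance(meta, dict):
        missing += [f"meta.{k}" for k in ("attempted", "quota") if k not in meta]
    return missing


# Made-up payload that lacks meta.quota.
sample_payload = {
    "query": "openclaw",
    "provider": "duckduckgo",
    "results": [],
    "meta": {"attempted": ["brave", "duckduckgo"]},
}
print(missing_search_fields(sample_payload))
```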

Operator notes

Default mode: workers=1 — conservative for cost control
Use @dual / @deep only for research tasks
SearXNG and YaCy are enabled: false by default (self-hosted only)
MOJEEK_API_KEY is optional — provider gracefully falls back to HTML scraping
Provider health data stored in memory/provider-health/health.jsonl
Discovery results stored in memory/provider-discovery/discovery.jsonl
Run python3 -m free_search doctor after setup to verify everything works
Run python3 -m free_search discover periodically to find new search sources

Usage Guidance

This skill appears to be what it says: a multi-provider web search aggregator that will perform outbound HTTP requests to configured providers and public candidate search instances and store results locally under memory/. Before installing:

1) Review config/providers.yaml and disable or remove providers you don't trust (public SearXNG instances are probed by default in discovery).
2) Be mindful that queries (which may contain sensitive text) are sent to external services; do not send secrets or PII.
3) If you supply API keys, only provide keys for services you trust; keys are optional for several providers.
4) Ensure the runtime has the Python dependencies (requests, beautifulsoup4, PyYAML), since no automatic installer is specified.
5) If you are concerned about autonomous invocation or network access, restrict the skill's invocation or run it in a network-restricted sandbox; the skill can run discovery/health probes automatically when invoked and will persist data under the workspace memory/.
6) Periodically inspect and configure retention for memory/ (a gc command is available).

Capability Analysis

Type: OpenClaw Skill
Name: free-search-aggregator
Version: 1.3.0

The 'free-search-aggregator' bundle is a legitimate and well-engineered search tool for OpenClaw. It provides a robust multi-provider search interface with automatic failover, health monitoring (health.py), and quota management (router.py). The code includes several security-conscious features, such as response size limits to prevent memory exhaustion (providers.py) and path validation to ensure storage remains within the workspace (storage.py). No evidence of data exfiltration, malicious execution, or prompt injection was found; the tool functions exactly as described in its documentation (SKILL.md and README.md).

Capability Assessment

✓ Purpose & Capability

Name/description, README, SKILL.md, config/providers.yaml and the Python provider implementations all describe and implement a quota‑aware multi‑provider web search with failover, discovery, health tracking and local persistence. Environment variables mentioned (BRAVE_API_KEY, EXA_API_KEY, etc.) are provider API keys appropriate to the stated providers.

ℹ Instruction Scope

Runtime instructions and the CLI (doctor, discover, search, remaining, health, gc, setup) stay within search/diagnostic scope. The discovery feature actively probes many third‑party endpoints (public SearXNG instances and several candidate search engines) and will issue network requests with the user query; results are persisted under memory/. This network probing and persistence are expected but worth noting for privacy and outbound network policy considerations.

ℹ Install Mechanism

There is no install spec (lowest install risk). The skill bundle does include Python source and a requirements.txt (requests, BeautifulSoup, PyYAML). The package assumes a Python runtime with these libraries available; dependencies are not automatically installed by the skill manifest, so the operator must ensure required packages are present. No external arbitrary downloads or URL-based installers were found.

✓ Credentials

No mandatory credentials are declared. The code conditionally reads many provider API keys from environment variables (and treats missing keys as 'skip' behavior). The requested env vars are proportional to a multi‑provider search aggregator and are limited to provider API keys; no unrelated secrets or cloud credentials are requested.

✓ Persistence & Privilege

The skill writes logs, health records, quota state and search artifacts under a local memory/ workspace (e.g., memory/search-cache, memory/search-index, provider‑health). It does not request always:true, does not modify other skills, and does not require system‑wide privileges. Local persistence and quota state are consistent with the described caching and health features.

Version History

v1.3.0

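Added health monitoring and smart routing (automatic reordering by success rate and latency); added quality optimization (relevance scoring, fuzzy dedup, domain diversity); added source auto-discovery (probing candidate search engines and SearXNG); added the doctor/setup self-diagnostic commands and test coverage; documentation updated to match.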

v1.2.1

Sync English README with 12-provider update, setup guidance, and smoke-test checklist.

v1.2.0

Expand to 12 providers; add Exa/Bing RSS/Mojeek/Wikipedia/Google CSE/Baidu/SearXNG; improve quota introspection and docs refresh.

v0.2.3

Documentation: added a post-install self-check checklist (configuration, smoke search, storage paths, quota check).

v0.2.2

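Security fix: the credential model now matches the configuration (key-based providers disabled by default); added FREE_SEARCH_MEMORY_DIR path protection and an explicit opt-in switch to override it.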

v0.2.1

Improved the ClawHub store copy: restructured the value proposition, capability tiers, quick start, and operations notes.

v0.2.0

Upgraded task search, enhanced quota queries, unified result storage and cleanup, and updated the README and images.

Metadata

Slug free-search-aggregator

Version 1.3.0

License —

All-time Installs 5

Active Installs 5

Total Versions 7

Frequently Asked Questions

What is Free Search Aggregator?

Quota-aware multi-provider web search for OpenClaw. Supports 12 search providers with automatic failover, task-level deep search (@dual/@deep), real quota ch... It is an AI Agent Skill for Claude Code / OpenClaw, with 601 downloads so far.

How do I install Free Search Aggregator?

Run "/install free-search-aggregator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Free Search Aggregator free?

Yes, Free Search Aggregator is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Free Search Aggregator support?

Free Search Aggregator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Free Search Aggregator?

It is built and maintained by Tianqu (@vulcanusalex); the current version is v1.3.0.
