Crawl By Desearch

Author: okradze · v1.0.1

cross-platform ✓ Security check passed

Total downloads: 877

Install in OpenClaw

/install desearch-crawl

Description

Crawl/scrape and extract content from any webpage URL. Returns the page content as clean text or raw HTML. Use this when you need to read the full contents o...

Usage instructions (SKILL.md)

Crawl Webpage By Desearch

Extract content from any webpage URL. Returns clean text or raw HTML.

Quick Start

Get an API key from https://console.desearch.ai
Set environment variable: export DESEARCH_API_KEY='your-key-here'
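A minimal sketch of how a caller might confirm the key is present before invoking the script. The helper name `require_api_key` is hypothetical and not part of the skill; only the `DESEARCH_API_KEY` variable name and the console URL come from the steps above.

```python
import os


def require_api_key(env=None):
    """Return the Desearch API key, or raise if it is not set.

    Hypothetical helper for illustration; the skill's own script may
    handle a missing key differently.
    """
    env = os.environ if env is None else env
    key = env.get("DESEARCH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DESEARCH_API_KEY is not set; get a key from "
            "https://console.desearch.ai and export it first."
        )
    return key
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.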

Usage

# Crawl a webpage (returns clean text by default)
scripts/desearch.py crawl "https://en.wikipedia.org/wiki/Artificial_intelligence"

# Get raw HTML
scripts/desearch.py crawl "https://example.com" --crawl-format html

Options

Option	Description
`--crawl-format`	Output content format: `text` (default) or `html`

Examples

Read a documentation page

scripts/desearch.py crawl "https://docs.python.org/3/tutorial/index.html"

Get raw HTML for analysis

scripts/desearch.py crawl "https://example.com/page" --crawl-format html
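The commands above could be wrapped from Python roughly as follows. This is a sketch under assumptions: the function names `build_crawl_command` and `crawl` are hypothetical, the script is assumed to live at `scripts/desearch.py` relative to the current working directory, and it is assumed to write the page content to standard output and exit non-zero on failure.

```python
import subprocess
import sys

ALLOWED_FORMATS = ("text", "html")


def build_crawl_command(url, crawl_format="text", script="scripts/desearch.py"):
    """Build the argument list for the documented `crawl` subcommand."""
    if crawl_format not in ALLOWED_FORMATS:
        raise ValueError(f"crawl_format must be one of {ALLOWED_FORMATS}")
    cmd = [sys.executable, script, "crawl", url]
    # `text` is the documented default, so the flag is only added for `html`.
    if crawl_format != "text":
        cmd += ["--crawl-format", crawl_format]
    return cmd


def crawl(url, crawl_format="text"):
    """Run the CLI and return its standard output.

    Needs the script on disk and DESEARCH_API_KEY in the environment.
    check=True raises CalledProcessError only if the script exits non-zero,
    which is an assumption about the script's behavior.
    """
    result = subprocess.run(
        build_crawl_command(url, crawl_format),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the URL as a separate list element avoids shell quoting issues with characters such as `&` or `?`.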

Response

Example (`format=text`, the default; truncated)

Artificial intelligence (AI) is the capability of computational systems to perform tasks that typically require human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making...

Example (`format=html`, truncated)

<!DOCTYPE html>
<html>
  <head><title>Artificial intelligence - Wikipedia</title></head>
  <body>
    <p>Artificial intelligence (AI) is the capability of computational systems...</p>
  </body>
</html>

Notes

Response is plain text or raw HTML — not JSON.
Default format is text. Use --crawl-format html only when you need to inspect page structure.
Prefer text format to avoid bloating the agent context with markup.

Errors

Status 401, Unauthorized (e.g., missing/invalid API key)

{
  "detail": "Invalid or missing API key"
}

Status 402, Payment Required (e.g., balance depleted)

{
  "detail": "Insufficient balance, please add funds to your account to continue using the service."
}
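Because successful responses are plain text or HTML while the error bodies shown above are JSON with a `detail` field, a caller could tell them apart with a check along these lines. The helper name `extract_error_detail` is hypothetical, and the sketch assumes the error body reaches the caller unchanged; it does not look at the HTTP status code.

```python
import json


def extract_error_detail(body):
    """Return the `detail` message if `body` looks like an error payload.

    Returns None for ordinary page content (plain text or HTML).
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all, so treat it as normal crawl output.
        return None
    if isinstance(payload, dict) and isinstance(payload.get("detail"), str):
        return payload["detail"]
    return None
```

A crawled page that itself happens to be a JSON document with a `detail` key would be misread as an error, so checking the status code is more reliable where it is available.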

Resources

Security recommendations

This skill is a thin client for the Desearch API and appears coherent. Before installing, confirm you trust desearch.ai and that you are comfortable sending target URLs (and their contents) to that external service. Treat the DESEARCH_API_KEY like any API secret: use least-privilege keys if supported, rotate keys periodically, and avoid using a key that has broader account permissions than necessary. Note the SKILL.md says the response is plain text/HTML but the script may return JSON objects from the API and pretty-print them — this is informational only. If you need offline/local crawling or want guarantees about sensitive content, do not send private pages to a third-party API.

Functional analysis

Type: OpenClaw Skill
Name: desearch-crawl
Version: 1.0.1

The skill bundle is benign. The `SKILL.md` provides clear, non-malicious instructions for using a web crawling service, including setting an API key. The `scripts/desearch.py` script acts as a client for the `api.desearch.ai` service, securely retrieving the `DESEARCH_API_KEY` from environment variables and directing all network traffic to the legitimate Desearch API endpoint. There is no evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the agent.

Capability assessment

✓ Purpose & Capability

Name/description (crawl/scrape pages) match the included CLI script which calls https://api.desearch.ai/web/crawl. The only required secret is DESEARCH_API_KEY, which is appropriate for a hosted crawl API.

ℹ Instruction Scope

SKILL.md instructs the agent to call the Desearch API and set DESEARCH_API_KEY; the included script does exactly that. Minor inconsistency: SKILL.md states responses are plain text or raw HTML (not JSON), but the script will pretty-print any JSON object returned by the API. This is a benign mismatch in how results are presented.
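The mismatch described above can be pictured with a small sketch. This is not the script's actual code: `render_response` is a hypothetical name, and the two-space indent is an assumption. It only illustrates the described behavior of pretty-printing JSON objects while passing other content through.

```python
import json


def render_response(body):
    """Pretty-print a JSON object; return any other body unchanged."""
    try:
        parsed = json.loads(body)
    except ValueError:
        # Plain text or HTML: pass through as-is.
        return body
    if isinstance(parsed, dict):
        return json.dumps(parsed, indent=2, ensure_ascii=False)
    return body
```

With logic like this, a caller sees indented JSON for error payloads and untouched text or HTML for crawled pages.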

✓ Install Mechanism

There is no install step; the skill is instruction-only with a small included Python script that uses only the standard library (urllib). No downloads or archive extraction are performed.

✓ Credentials

Only one environment variable (DESEARCH_API_KEY) is required and is directly used to authorize requests to the stated API. No unrelated secrets, config paths, or excessive permissions are requested.

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide settings. Default autonomous invocation is allowed (platform default) but not combined with other concerning privileges.

How to use

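Make sure OpenClaw is installed (locally or via Docker)
In the chat box, enter the install command: /install desearch-crawl
After installation, invoke the Skill by name or trigger it with /desearch-crawl
Provide the required inputs according to the Skill's parameter description to get structured output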

Version history

v1.0.1

- Expanded documentation with new "Quick Start", "Response", and "Errors" sections.
- Added example outputs for both text and HTML formats.
- Clarified that the response is plain text or raw HTML, not JSON.
- Included error response examples for common status codes.
- Added links to the API reference and Desearch Console for further resources.

v1.0.0

desearch-crawl 1.0.0 initial release
- Introduces the ability to crawl or scrape any webpage URL.
- Returns webpage content as either clean text (default) or raw HTML.
- Simple command-line interface and usage documented.
- Requires a DESEARCH_API_KEY environment variable for authentication.

Metadata

Slug desearch-crawl

Version 1.0.1

License: not specified

Total installs 5

Current installs 4

Number of versions 2

FAQ