
Tavily Crawl

Name: Tavily Crawl
Author: matthew77

By Liang · GitHub · v1.0.0

cross-platform ⚠ suspicious

550 total downloads

Install in OpenClaw

/install liang-tavily-crawl

Description

Crawl any website and save pages as local markdown files. Ideal for downloading documentation, knowledge bases, or web content for offline access or analysis.

Usage (SKILL.md)

Tavily Crawl

Crawl websites to extract content from multiple pages. Ideal for documentation, knowledge bases, and site-wide content extraction.

Authentication

Get your API key at https://tavily.com and add to your OpenClaw config:

{
  "skills": {
    "entries": {
      "tavily-crawl": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}

Or set in environment variable:

export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"

Quick Start

Using the Script

node {baseDir}/scripts/crawl.mjs "https://docs.example.com"
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --output ./docs
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 --limit 50

Examples

# Basic crawl
node {baseDir}/scripts/crawl.mjs "https://docs.example.com"

# Deeper crawl with limits
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --limit 50

# Save to files
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --output ./docs

# Focused crawl with path filters
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 \
  --select "/docs/.*" --exclude "/blog/.*"

# With semantic instructions
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" \
  --instructions "Find API documentation" --chunks 3

Options

Option	Description	Default
`--depth <n>`	Crawl depth (1-5)	1
`--breadth <n>`	Links per page	20
`--limit <n>`	Total pages cap	50
`--output <dir>`	Save pages to directory	-
`--instructions <text>`	Natural language guidance	-
`--chunks <n>`	Chunks per page (1-5, requires `--instructions`)	-
`--depth-mode <mode>`	Extract depth: `basic` or `advanced`	`basic`
`--select <pattern>`	Regex pattern to include	-
`--exclude <pattern>`	Regex pattern to exclude	-
`--timeout <sec>`	Max wait time (10-150 seconds)	150
`--json`	Output raw JSON	false
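
The ranges in the table can be checked before a crawl is started. The following is a minimal sketch of such a check, not the validation that scripts/crawl.mjs actually performs; the function name `validateCrawlOptions` and the option keys (`depth`, `chunks`, `timeout`, `depthMode`) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: checks crawl options against the ranges listed in the
// Options table. The real scripts/crawl.mjs may validate differently, or not at all.
function validateCrawlOptions(opts) {
  const errors = [];
  const isIntIn = (value, min, max) =>
    Number.isInteger(value) && value >= min && value <= max;

  if (opts.depth !== undefined && !isIntIn(opts.depth, 1, 5)) {
    errors.push("--depth must be an integer from 1 to 5");
  }
  if (opts.chunks !== undefined) {
    if (!isIntIn(opts.chunks, 1, 5)) {
      errors.push("--chunks must be an integer from 1 to 5");
    }
    // The table states that --chunks requires --instructions.
    if (!opts.instructions) {
      errors.push("--chunks requires --instructions");
    }
  }
  if (opts.timeout !== undefined) {
    const ok =
      typeof opts.timeout === "number" && opts.timeout >= 10 && opts.timeout <= 150;
    if (!ok) errors.push("--timeout must be between 10 and 150 seconds");
  }
  if (opts.depthMode !== undefined && !["basic", "advanced"].includes(opts.depthMode)) {
    errors.push("--depth-mode must be basic or advanced");
  }
  return errors;
}

console.log(validateCrawlOptions({ depth: 2, timeout: 150 }));
console.log(validateCrawlOptions({ depth: 9, chunks: 3 }));
```

The first call returns an empty list. The second reports two problems: the depth is out of range, and `--chunks` was given without `--instructions`.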

Depth vs Performance

Depth	Typical Pages	Time
1	10-50	Seconds
2	50-500	Minutes
3	500-5000	Many minutes

Start with --depth 1 and increase only if needed.
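
To see why depth grows costs so quickly, here is a rough worst-case estimate. It assumes a simplified model that the source does not confirm: every page yields up to `--breadth` new links per level, and `--limit` caps the total. The function name `estimateMaxPages` is hypothetical.

```javascript
// Rough upper bound under an assumed model: each level multiplies the number
// of candidate pages by `breadth`, and `limit` caps the overall total.
// This is an illustration, not how Tavily schedules a crawl.
function estimateMaxPages(depth, breadth = 20, limit = 50) {
  let total = 0;
  let pagesAtLevel = 1;
  for (let level = 1; level <= depth; level++) {
    pagesAtLevel *= breadth;
    total += pagesAtLevel;
    if (total >= limit) return limit; // the cap is reached, stop counting
  }
  return total;
}

for (const depth of [1, 2, 3]) {
  const uncapped = estimateMaxPages(depth, 20, Infinity);
  const capped = estimateMaxPages(depth); // defaults: breadth 20, limit 50
  console.log(`depth ${depth}: up to ${uncapped} pages uncapped, ${capped} with the default limit`);
}
```

Under this model the uncapped bound is 20, 420, and 8420 pages for depths 1 to 3, while the default `--limit 50` holds the total at 50 from depth 2 onward.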

Crawl for Context vs Data Collection

For agentic use (feeding results into context): Always use --instructions + --chunks. This returns only relevant chunks instead of full pages, preventing context window explosion.

For data collection (saving to files): Omit --chunks to get full page content.
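
The two modes differ only in which flags are passed. A wrapper could encode that choice as sketched below. The function `buildCrawlArgs` and the mode names `"context"` and `"collect"` are hypothetical; only flags from the Options table are used, and the conservative defaults are an assumption.

```javascript
// Hypothetical wrapper that assembles arguments for scripts/crawl.mjs.
// Only flags from the Options table appear; the mode names are invented here.
function buildCrawlArgs({ url, mode, instructions, chunks = 3, output, depth = 1, limit = 20 }) {
  const args = [url, "--depth", String(depth), "--limit", String(limit)];
  if (mode === "context") {
    // Agentic use: return only relevant chunks to keep the context small.
    if (!instructions) throw new Error("context mode needs instructions");
    args.push("--instructions", instructions, "--chunks", String(chunks));
  } else if (mode === "collect") {
    // Data collection: omit --chunks so full page content is returned.
    if (output) args.push("--output", output);
  } else {
    throw new Error(`unknown mode: ${mode}`);
  }
  return args;
}

console.log(buildCrawlArgs({
  url: "https://docs.example.com",
  mode: "context",
  instructions: "Find API documentation",
}).join(" "));

console.log(buildCrawlArgs({
  url: "https://docs.example.com",
  mode: "collect",
  output: "./docs",
}).join(" "));
```

The resulting array can be appended to `node {baseDir}/scripts/crawl.mjs` when spawning the script.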

Tips

Always use --chunks for agentic workflows - prevents context explosion when feeding results to LLMs
Omit --chunks only for data collection - when saving full pages to files
Start conservative (--depth 1, --limit 20) and scale up
Use path patterns to focus on relevant sections
Always set a --limit to prevent runaway crawls
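
To preview what a pair of `--select` and `--exclude` patterns would keep, the patterns can be tried locally against sample URLs first. This is only an approximation: the real filtering happens inside the Tavily service and may anchor or normalize patterns differently. The function `matchesFilters` is hypothetical, and giving exclusions precedence over selections is an assumption.

```javascript
// Illustrative local filter. Assumes patterns are tested against the URL path
// and that an exclude match wins over a select match.
function matchesFilters(url, select = [], exclude = []) {
  const path = new URL(url).pathname;
  const hit = (patterns) => patterns.some((p) => new RegExp(p).test(path));
  if (hit(exclude)) return false;
  return select.length === 0 || hit(select);
}

const sampleUrls = [
  "https://example.com/docs/intro",
  "https://example.com/blog/post-1",
  "https://example.com/about",
];

for (const url of sampleUrls) {
  console.log(url, matchesFilters(url, ["/docs/.*"], ["/blog/.*"]));
}
```

With the patterns from the Examples section, only the `/docs/intro` URL passes: the blog post is excluded, and `/about` matches no select pattern.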

Security recommendations

This skill is internally coherent, but note that running it sends the target URL and any natural-language instructions to Tavily's API (https://api.tavily.com/crawl) along with your TAVILY_API_KEY. Only use it with sites you are allowed to crawl and avoid sending private/internal URLs or secrets. Ensure the API key has appropriate scope/rotation policy. When saving output, explicitly set --output to a safe directory and set conservative --limit/--depth values to avoid large or unintended crawls. If you need higher assurance, review the included scripts/crawl.mjs yourself and confirm the Tavily domain and API behavior match your expectations before installing.

Functional analysis

Type: OpenClaw Skill
Name: liang-tavily-crawl
Version: 1.0.0

The skill's primary function is web crawling and saving content locally, which aligns with its description. However, the `scripts/crawl.mjs` script contains a path traversal vulnerability. The `--output` argument, which specifies the directory for saving files, is passed directly to `fs.mkdirSync` and `path.join` without sanitization. Anyone who controls that argument can supply a path such as `../../` and write files to arbitrary locations on the host filesystem, which could in turn lead to remote code execution (RCE). There is no evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints or backdoor installation.
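
One way to narrow the path traversal issue described above is to validate the `--output` value before any directory is created. The sketch below is an assumed mitigation, not code from scripts/crawl.mjs; the function name `isSafeOutputDir` is hypothetical. It takes a deliberately conservative stance: absolute paths and any `..` segment are rejected.

```javascript
// Conservative check for a user-supplied output directory.
// Assumed mitigation for illustration; it does not resolve symlinks.
function isSafeOutputDir(outputDir) {
  if (typeof outputDir !== "string" || outputDir.length === 0) return false;
  // Reject POSIX absolute paths, UNC paths, and Windows drive prefixes.
  const isAbsolute = /^([\\/]|[A-Za-z]:)/.test(outputDir);
  // Reject any ".." segment, using either slash style as a separator.
  const segments = outputDir.split(/[\\/]+/);
  return !isAbsolute && !segments.includes("..");
}

for (const candidate of ["./docs", "docs/api", "../../etc", "/tmp/out", "docs/../../x"]) {
  console.log(candidate, isSafeOutputDir(candidate));
}
```

Only `./docs` and `docs/api` pass. Rejecting absolute paths is a design choice that trades flexibility for safety; a stricter variant would resolve the path with `path.resolve` and confirm that it stays inside an allowed base directory. File names derived from crawled URLs would need a similar check.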

Capability assessment

✓ Purpose & Capability

Name/description match required items: it requires node and a TAVILY_API_KEY, and the script POSTs crawl requests to https://api.tavily.com/crawl. There are no unrelated credentials, binaries, or surprising config paths.

✓ Instruction Scope

SKILL.md and the script are narrowly scoped to crawling: they instruct calling the included Node script which sends the target URL and options to the Tavily API and then prints or writes returned content. The instructions do not request other system files or additional environment variables.

✓ Install Mechanism

No install spec; the skill is instruction-plus-script and relies on the node binary already present. The included script is plain, readable JS — there are no external downloads or archives executed at install time.

✓ Credentials

Only one credential is required (TAVILY_API_KEY), and it directly aligns with the script's Authorization header. No other secrets or unrelated env vars are requested.

✓ Persistence & Privilege

`always` is false and the skill does not attempt to modify other skills or system-wide settings. It writes files only to a user-specified output directory (when provided).

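How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install liang-tavily-crawl
Once installed, invoke the skill by name or trigger it with /liang-tavily-crawl
Provide the required inputs as described in the skill's parameter documentation to get structured output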

Version history

v1.0.0

Initial release of Tavily Crawl skill.
- Crawl any website and save pages as local markdown files for offline use or analysis.
- Supports options for crawl depth, page limits, output directory, and path filters.
- Allows guidance with natural language instructions and semantic chunking.
- Flexible configurations for targeted or broad website extraction.
- Requires a Tavily API key for authentication.

Metadata

Slug liang-tavily-crawl

Version 1.0.0

License —

Total installs 1

Current installs 1

Version count 1

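FAQ

What is Tavily Crawl?

Tavily Crawl crawls any website and saves pages as local markdown files. It is ideal for downloading documentation, knowledge bases, or web content for offline access or analysis. It is an AI agent skill plugin for Claude Code / OpenClaw, with 550 total downloads to date.

How do I install Tavily Crawl?

Run `/install liang-tavily-crawl` in the OpenClaw or Claude Code chat box to install it in one step. The skill still needs a Tavily API key, configured as described under Authentication.

Is Tavily Crawl free?

Yes. Tavily Crawl is free and open source, and can be freely downloaded, installed, and used. It does require a Tavily API key, obtained separately at https://tavily.com.

Which platforms does Tavily Crawl support?

Tavily Crawl is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Tavily Crawl?

It is developed and maintained by Liang (@matthew77). The current version is v1.0.0.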
