/install liang-tavily-crawl
Tavily Crawl
Crawl websites to extract content from multiple pages. Ideal for documentation, knowledge bases, and site-wide content extraction.
Authentication
Get your API key at https://tavily.com and add to your OpenClaw config:
{
"skills": {
"entries": {
"tavily-crawl": {
"enabled": true,
"apiKey": "tvly-YOUR_API_KEY_HERE"
}
}
}
}
Or set in environment variable:
export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
Quick Start
Using the Script
node {baseDir}/scripts/crawl.mjs "https://docs.example.com"
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --output ./docs
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 --limit 50
Examples
# Basic crawl
node {baseDir}/scripts/crawl.mjs "https://docs.example.com"
# Deeper crawl with limits
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --limit 50
# Save to files
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --output ./docs
# Focused crawl with path filters
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 \
--select "/docs/.*" --exclude "/blog/.*"
# With semantic instructions
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" \
--instructions "Find API documentation" --chunks 3
Options
| Option | Description | Default |
|---|---|---|
--depth \x3Cn> |
Crawl depth (1-5) | 1 |
--breadth \x3Cn> |
Links per page | 20 |
--limit \x3Cn> |
Total pages cap | 50 |
--output \x3Cdir> |
Save pages to directory | - |
--instructions \x3Ctext> |
Natural language guidance | - |
--chunks \x3Cn> |
Chunks per page (1-5, requires instructions) | - |
--depth-mode \x3Cmode> |
Extract depth: basic or advanced |
basic |
--select \x3Cpattern> |
Regex pattern to include | - |
--exclude \x3Cpattern> |
Regex pattern to exclude | - |
--timeout \x3Csec> |
Max wait time (10-150 seconds) | 150 |
--json |
Output raw JSON | false |
Depth vs Performance
| Depth | Typical Pages | Time |
|---|---|---|
| 1 | 10-50 | Seconds |
| 2 | 50-500 | Minutes |
| 3 | 500-5000 | Many minutes |
Start with --depth 1 and increase only if needed.
Crawl for Context vs Data Collection
For agentic use (feeding results into context): Always use --instructions + --chunks. This returns only relevant chunks instead of full pages, preventing context window explosion.
For data collection (saving to files): Omit --chunks to get full page content.
Tips
- Always use
--chunksfor agentic workflows - prevents context explosion when feeding results to LLMs - Omit
--chunksonly for data collection - when saving full pages to files - Start conservative (
--depth 1,--limit 20) and scale up - Use path patterns to focus on relevant sections
- Always set a
--limitto prevent runaway crawls
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install liang-tavily-crawl - 安装完成后,直接呼叫该 Skill 的名称或使用
/liang-tavily-crawl触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
Tavily Crawl 是什么?
Crawl any website and save pages as local markdown files. Ideal for downloading documentation, knowledge bases, or web content for offline access or analysis. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 550 次。
如何安装 Tavily Crawl?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install liang-tavily-crawl」即可一键安装,无需额外配置。
Tavily Crawl 是免费的吗?
是的,Tavily Crawl 完全免费(开源免费),可自由下载、安装和使用。
Tavily Crawl 支持哪些平台?
Tavily Crawl 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 Tavily Crawl?
由 Liang(@matthew77)开发并维护,当前版本 v1.0.0。