Total downloads: 317
Favorites: 0
Current installs: 2
Versions: 1
Install in OpenClaw
/install web-crawler
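Description
A web crawler tool that supports static and dynamic page crawling, media downloads, and anti-crawler evasion. Activation triggers: the user mentions 爬虫 (crawler), 爬取 (crawl), crawler, scraper, 抓取网页 (scrape a web page), or 下载媒体 (download media).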
Security recommendations
Do not enable or run this skill until the author/source is verified and missing pieces are resolved. Ask for: (1) source code (the referenced ./src/index.js and package.json), (2) an install spec or explicit list of runtime dependencies (Node version, Puppeteer, Chrome) and how Puppeteer/Chrome will be provided, (3) explanation and justification for the hardcoded proxies (why those IPs, who controls them), and (4) where outputs are stored, retention policy, and limits on media downloads. If you must test it, run it in an isolated VM or sandbox with restricted network access, remove/replace hardcoded proxies, and avoid granting broad autonomous invocation or access to sensitive internal networks. Because the package is instruction-only and inconsistent, proceed cautiously.
Analysis
Type: OpenClaw Skill
Name: web-crawler
Version: 1.0.0
The skill bundle describes a web crawler with Puppeteer and proxy rotation capabilities. It is classified as suspicious due to the inclusion of hardcoded internal IP addresses (192.168.10.222) for proxy configurations in SKILL.md, which could be used for internal network pivoting. Furthermore, the actual implementation logic in src/index.js and configuration in config/default.json are missing from the provided files, preventing verification of the crawler's behavior.
Capability assessment
Purpose & Capability
The described capability (static/dynamic crawling, media download, anti-bot) matches the SKILL.md content. However the skill requires a local Node module ('./src/index.js'), Puppeteer and a system Chrome binary, and expects config files under the workspace — none of these artifacts or required binaries are declared in the registry metadata. That mismatch suggests incomplete packaging or sloppy metadata.
Instruction Scope
The SKILL.md instructs the agent to cd into /home/node/.openclaw/workspace/web-crawler, require local code, read config/default.json, use proxy lists (including hardcoded 192.168.x.x addresses), and write scraped HTML/media/screenshots into outputs/. Those are file-system and network operations that go beyond a simple instruction: they create persistent output directories and rely on local binaries and proxies. The skill does not include safeguards or explain consent/permissions for writing or large data downloads.
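One safeguard the paragraph above finds missing can be sketched as follows: confine every write to a single output directory and cap the total bytes written. This is a minimal illustration only, not the skill's code (which is not available). The class name `OutputBudget`, the byte limit, and the file names are all hypothetical.

```python
from pathlib import Path


class OutputBudget:
    """Track bytes written under one directory and refuse to exceed a cap.

    Hypothetical helper for illustration; not part of the web-crawler skill.
    """

    def __init__(self, root: Path, max_bytes: int) -> None:
        self.root = root.resolve()
        self.max_bytes = max_bytes
        self.used = 0

    def resolve_target(self, relative_name: str) -> Path:
        """Return the absolute target path, rejecting anything outside root."""
        target = (self.root / relative_name).resolve()
        if not target.is_relative_to(self.root):  # Python 3.9+
            raise ValueError(f"refusing to write outside {self.root}: {relative_name}")
        return target

    def write(self, relative_name: str, data: bytes) -> Path:
        """Write data only if it fits in the remaining budget."""
        if self.used + len(data) > self.max_bytes:
            raise RuntimeError("output budget exceeded")
        target = self.resolve_target(relative_name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        self.used += len(data)
        return target
```

A crawler built this way would route every HTML, screenshot, and media write through `write`, so a path such as `../escape.txt` or an oversized download fails loudly instead of landing on disk.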
Install Mechanism
There is no install spec (instruction-only), which is low-risk in general. But because the instructions expect Node/Puppeteer/Chrome and local source files, the absence of an install step is an inconsistency: a consumer would need to manually install dependencies and supply the missing code and browser, increasing the chance of misconfiguration or supply-chain risk.
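Because no install step declares the runtime dependencies, a consumer might run a preflight check along these lines before testing the skill. This is a sketch under stated assumptions: the executable names in `REQUIRED` are guesses at common binary names, not values taken from the skill, and the `which` parameter exists only so the check can be exercised without touching the real system.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Hypothetical candidate executable names; adjust to the actual environment.
REQUIRED: Dict[str, List[str]] = {
    "node": ["node"],
    "chrome": ["google-chrome", "chromium", "chromium-browser", "chrome"],
}


def find_missing(
    required: Dict[str, List[str]],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the labels for which none of the candidate executables is on PATH."""
    missing = []
    for label, candidates in required.items():
        if not any(which(name) for name in candidates):
            missing.append(label)
    return missing


if __name__ == "__main__":
    absent = find_missing(REQUIRED)
    print("missing:", ", ".join(absent) if absent else "none")
```

The check only confirms that a binary exists on PATH; it says nothing about versions or about whether Puppeteer can actually launch the browser it finds.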
Credentials
No environment variables or credentials are declared, yet the skill expects proxy configuration (antiBot.proxyList) and access to system browser executables and the workspace filesystem. Hardcoded proxies pointing at private IPs are suspicious (they may route traffic through an internal host). The skill will download media and write structured data locally, which could be used for large-scale scraping or exfiltration if misused.
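The concern about private-address proxies can be checked mechanically before the skill is run. The sketch below flags proxy entries whose host is a literal private, loopback, or link-local IP address. The entry format (`host:port` or `scheme://host:port`) and the port numbers are assumptions, since the skill's actual `antiBot.proxyList` format is not available for inspection. Hostnames are not resolved, so a DNS name pointing at an internal host would not be caught.

```python
import ipaddress
from typing import Iterable, List
from urllib.parse import urlsplit


def proxy_host(entry: str) -> str:
    """Extract the host from 'host:port' or 'scheme://host:port'."""
    if "://" not in entry:
        entry = "//" + entry  # lets urlsplit treat the value as a network location
    return urlsplit(entry).hostname or ""


def is_internal(entry: str) -> bool:
    """True if the proxy host is a literal private, loopback, or link-local IP."""
    try:
        ip = ipaddress.ip_address(proxy_host(entry))
    except ValueError:
        return False  # a hostname rather than a literal IP; not resolved here
    return ip.is_private or ip.is_loopback or ip.is_link_local


def flag_internal(entries: Iterable[str]) -> List[str]:
    """Return the entries that point at internal addresses."""
    return [entry for entry in entries if is_internal(entry)]
```

For example, `flag_internal(["192.168.10.222:8080", "http://8.8.8.8:8080"])` returns only the first entry (port 8080 is a made-up value for illustration).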
Persistence & Privilege
The skill does not request always:true and does not declare changes to other skills or system-wide settings. It will create output files under its workspace, which is normal for a crawler, but that is not a platform-level persistence privilege.
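How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install web-crawler
- After installation, invoke the Skill by name or trigger it with /web-crawler
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.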
Version history
v1.0.0
Web Crawler Skill v1.0.0
- Initial release.
- Supports static and dynamic webpage crawling, including JS-rendered pages.
- Automatic media downloading (images, video, audio) to outputs directory.
- Anti-crawling measures: user-agent rotation, request delay, proxy rotation.
- Easy configuration of crawl depth, page limits, media download, and proxies.
- Organized output: HTML, text, screenshots, media files, and structured data.
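The crawl-depth and page-limit settings listed above are not documented further, so how the skill applies them is unknown. As a generic illustration of how such limits usually interact, the sketch below runs a breadth-first traversal over an in-memory link graph, which stands in for real page fetching. The function name `crawl_order` and its parameters are hypothetical, and `max_pages` is assumed to be at least 1.

```python
from collections import deque
from typing import Dict, List


def crawl_order(
    start: str,
    links: Dict[str, List[str]],
    max_depth: int,
    max_pages: int,
) -> List[str]:
    """Return pages in breadth-first visit order, honoring both limits.

    `links` maps a page to the pages it links to; a real crawler would
    fetch and parse each page instead of looking it up in a dict.
    """
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(page, []):
            if nxt in seen:
                continue
            if len(visited) >= max_pages:
                return visited  # page budget exhausted
            seen.add(nxt)
            visited.append(nxt)
            queue.append((nxt, depth + 1))
    return visited
```

With breadth-first order, a tight page limit keeps the pages closest to the start URL, which is usually what a user expects from a depth-limited crawl.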
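FAQ
What is Web Crawler?
A web crawler tool that supports static and dynamic page crawling, media downloads, and anti-crawler evasion. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 317 times to date.
How do I install Web Crawler?
Run the command "/install web-crawler" in the OpenClaw or Claude Code chat box to install it in one step, with no additional configuration.
Is Web Crawler free?
Yes. Web Crawler is completely free and is released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does Web Crawler support?
Web Crawler is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Web Crawler?
It is developed and maintained by 噢福阔斯KANG (@jinkang19940922); the current version is v1.0.0.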