guifav

Web Scraper

By Guilherme Favaron · GitHub · v0.1.1 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 8181
Favorites: 3
Current installs: 125
Versions: 2
Install in OpenClaw
/install web-scraper
Description
Web scraping and content comprehension agent — multi-strategy extraction with cascade fallback, news detection, boilerplate removal, structured metadata, and...
Safe-use recommendations
Install only if you will use it on sites and content you are authorized to scrape. Review generated scripts before running them, avoid the soft-paywall reveal workflow, and do not use optional OpenRouter entity extraction on confidential or access-controlled content unless that external data flow is acceptable.
Functional analysis
Type: OpenClaw Skill
Name: web-scraper
Version: 0.1.1
The 'web-scraper' skill is a well-structured and documented tool for multi-stage web content extraction and analysis. It follows industry best practices such as rate limiting, robots.txt compliance, and a cascade approach to resource usage, while explicitly instructing the agent to avoid sensitive files like .env and to handle API keys securely via environment variables (SKILL.md, claw.json).
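As a rough illustration of the robots.txt compliance and rate limiting mentioned above, the sketch below uses only the Python standard library. It is not taken from the skill's own files: the names build_robots and RateLimiter, the user agent "example-bot", and the two-second interval are all assumptions made for this example.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RateLimiter:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


if __name__ == "__main__":
    # Assumed sample rules; a real script would fetch the site's own robots.txt.
    robots = build_robots("User-agent: *\nDisallow: /private/\n")
    limiter = RateLimiter(min_interval=2.0)
    for url in ("https://example.com/news/a", "https://example.com/private/x"):
        if not robots.can_fetch("example-bot", url):
            print("skipping (disallowed by robots.txt):", url)
            continue
        limiter.wait()
        print("allowed to fetch:", url)
```

A generated scraping script would call can_fetch before each request and wait() before each download; the actual interval and user agent the skill recommends are not stated on this page.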
Capability assessment
Purpose & Capability
Network scraping, local output files, Playwright rendering, and optional LLM entity extraction fit the stated web-scraper purpose, but the soft-paywall workflow goes beyond normal public-content extraction by directing DOM manipulation to expose hidden subscriber-gated material.
Instruction Scope
The skill is generally scoped with planning, rate limiting, robots.txt guidance, and credential-file avoidance, but its paywall instructions are under-scoped because they permit revealing hidden paywalled text without a clear authorization requirement.
Install Mechanism
The package contains markdown instructions, a changelog, and claw.json metadata only; no executable installer, hidden script, or automatic startup mechanism was found.
Credentials
Filesystem and network permissions, Python/pip/npx requirements, and generated scraping scripts are proportionate for this purpose; optional Stage 5 sends cleaned article text to OpenRouter when used.
Persistence & Privilege
The skill creates scripts, YAML configs, JSON outputs, and checkpoints, and it says not to read or modify .env or credential files; no background persistence or privilege escalation is evident.
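How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install web-scraper
  3. Once installation finishes, invoke the Skill by name or trigger it with /web-scraper.
  4. Provide the inputs listed in the Skill's parameter description to get structured output.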
Version history
v0.1.1
web-scraper 0.1.1
- Added a CHANGELOG.md file for better tracking of updates.
- Clarified credential handling: the skill itself never makes direct API calls, and only generated scripts reference `OPENROUTER_API_KEY` if LLM extraction is used.
- Updated environment survey instructions to check for presence of `OPENROUTER_API_KEY` (for generated code), not the value.
- Improved documentation for credential scope and clarified planning protocol steps.
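Checking for the presence of `OPENROUTER_API_KEY` without reading out its value could look roughly like the sketch below. The function name has_openrouter_key is hypothetical, and treating an empty string as "not present" is an assumption of this example, not something the changelog specifies.

```python
import os


def has_openrouter_key(env=None) -> bool:
    """Report whether OPENROUTER_API_KEY is set, without exposing its value.

    `env` defaults to the process environment; passing a dict makes the
    check easy to test. An empty value counts as absent (an assumption).
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENROUTER_API_KEY"))


if __name__ == "__main__":
    # Only the boolean result is printed; the key itself is never logged.
    print("OPENROUTER_API_KEY present:", has_openrouter_key())
```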
v0.1.0
web-scraper 0.1.0: Initial release
- Introduces a web scraping and content comprehension agent with multi-strategy extraction and cascade fallback.
- Implements a mandatory planning protocol for all scraping requests, emphasizing environment and target analysis before action.
- Features a 5-stage pipeline: News/Article detection, multi-strategy extraction, cleaning/normalization, structured metadata extraction, and optional entity extraction using LLMs.
- Ensures safe credential handling by referencing `OPENROUTER_API_KEY` only in template code.
- Outputs structured JSON files with detailed content and quality metadata.
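The "multi-strategy extraction and cascade fallback" idea can be sketched generically as trying extractors in order and accepting the first usable result. The strategy names, the min_chars threshold, and the function extract_with_cascade below are all hypothetical; this page does not say which strategies the skill actually uses or how it judges quality.

```python
from typing import Callable, Optional, Sequence, Tuple

# A strategy is a (name, function) pair; the function returns text or None.
Strategy = Tuple[str, Callable[[str], Optional[str]]]


def extract_with_cascade(
    html: str, strategies: Sequence[Strategy], min_chars: int = 200
) -> Optional[Tuple[str, str]]:
    """Try each strategy in order and return (name, text) for the first one
    whose output has at least min_chars characters; None if all fail."""
    for name, strategy in strategies:
        try:
            text = strategy(html)
        except Exception:
            # A failing strategy falls through to the next one in the cascade.
            continue
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None


if __name__ == "__main__":
    def cheap(html: str) -> Optional[str]:
        return None  # pretend the cheap extractor found nothing

    def expensive(html: str) -> Optional[str]:
        return "article body " * 20  # pretend a heavier extractor succeeded

    result = extract_with_cascade("<html></html>", [("cheap", cheap), ("expensive", expensive)])
    print(result[0] if result else "no strategy succeeded")
```

Ordering the strategies from cheapest to most expensive keeps heavier tools, such as browser rendering, as a last resort.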
Metadata
Slug web-scraper
Version 0.1.1
License MIT-0
Total installs 287
Current installs 125
Versions 2
FAQ

What is Web Scraper?

Web scraping and content comprehension agent: multi-strategy extraction with cascade fallback, news detection, boilerplate removal, structured metadata, and... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 8181 total downloads to date.

How do I install Web Scraper?

Run the command "/install web-scraper" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Web Scraper free?

Yes. Web Scraper is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Web Scraper support?

Web Scraper is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Web Scraper?

It is developed and maintained by Guilherme Favaron (@guifav); the current version is v0.1.1.
