
Smart Web Scraper

Name: Smart Web Scraper
Author: mariusfit

By mariusfit · GitHub ↗ · v1.0.0

cross-platform ✓ Security check passed

Total downloads: 2354

Current installs: 14

Versions: 1

Install in OpenClaw

/install smart-web-scraper

Description

Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we...
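As a rough illustration of what table auto-detection with JSON/CSV output can look like, here is a minimal sketch. It is not the skill's code: the listing says the skill uses BeautifulSoup, whereas this sketch swaps in the standard-library `html.parser` to stay dependency-free. The names `TableParser`, `table_to_records`, `records_to_csv`, and `SAMPLE_HTML` are hypothetical, and nested tables are not handled.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <table> in a document, row by row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def table_to_records(rows):
    """Treat the first row as headers and return one dict per data row."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def records_to_csv(records):
    """Serialize a list of dicts to CSV text (empty string if no records)."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical sample page fragment, used instead of a live fetch.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
records = table_to_records(parser.tables[0])
print(json.dumps(records, indent=2))
print(records_to_csv(records))
```

The same list of records feeds both output formats, so the choice of JSON or CSV only affects the final serialization step.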

Safe usage recommendations

This skill appears coherent and implements a normal static-HTML scraper. Before installing or running:
(1) Review the full scripts/scraper.py file yourself (the provided view was truncated here) to confirm there are no hidden network callbacks or unexpected behavior.
(2) Run it in a sandbox or limited environment first to ensure it only fetches the target sites and does not contact unknown endpoints.
(3) Be mindful of legal constraints, terms of service, and robots.txt: the tool can override robots rules with --ignore-robots.
(4) Note that runtime dependency installation (e.g., via `uv run --with` or pip) will fetch code from PyPI; only install packages you trust.
(5) Do not supply unrelated credentials (none are required).
If you need higher assurance, ask the publisher for a source repository or sign-off and verify the remaining (truncated) portion of the script.
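To make the robots.txt point concrete, the sketch below shows one way a scraper could consult robots rules and how an override like `--ignore-robots` would bypass them. It uses the standard-library `urllib.robotparser`; how the skill actually implements this is not visible in the listing, and `build_robots_checker`, `is_allowed`, the `example-scraper` user agent, and the sample rules are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, url, user_agent="example-scraper", ignore_robots=False):
    """Return True if the URL may be fetched under the parsed rules.

    `ignore_robots` mimics an override flag: when set, rules are skipped.
    """
    if ignore_robots:
        return True
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, used instead of a live fetch.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

rules = build_robots_checker(SAMPLE_ROBOTS)
print(is_allowed(rules, "https://example.com/public/page.html"))
print(is_allowed(rules, "https://example.com/private/data.html"))
print(is_allowed(rules, "https://example.com/private/data.html", ignore_robots=True))
```

With the override set, the disallowed path is reported as fetchable, which is why the recommendation above treats the flag as something to use deliberately.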

Functional analysis

Type: OpenClaw Skill
Name: smart-web-scraper
Version: 1.0.0

The OpenClaw AgentSkills bundle 'smart-web-scraper' is a web scraping tool that uses common Python libraries (`urllib` from the standard library, plus the third-party `BeautifulSoup` and `lxml`) to extract data from user-specified URLs. The `SKILL.md` and `README.md` clearly describe its functionality, and the `scripts/scraper.py` code implements this as described, including features like CSS selectors, table extraction, link analysis, and multi-page crawling. While the script performs network requests to arbitrary URLs and writes output to user-specified file paths, these are core functionalities of a web scraper and do not show evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints, persistence mechanisms, or arbitrary code execution beyond the skill's stated purpose. The instructions in the markdown files are straightforward and do not contain any prompt injection attempts.
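Link analysis and multi-page crawling both depend on resolving each `href` against the page URL and deciding whether it stays on the same host. The following is a minimal sketch of that step under stated assumptions: it uses the standard-library `html.parser` and `urllib.parse` rather than BeautifulSoup, and `LinkParser`, `classify_links`, and the sample markup are hypothetical names, not taken from `scripts/scraper.py`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(html, base_url):
    """Resolve links against base_url and split them by host.

    Returns (internal, external) lists of absolute http(s) URLs.
    """
    parser = LinkParser()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    internal, external = [], []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        if parts.netloc == base_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical page fragment and base URL.
SAMPLE_PAGE = (
    '<a href="/docs">Docs</a> '
    '<a href="page2.html">Next</a> '
    '<a href="https://other.example.org/x">Other</a> '
    '<a href="mailto:me@example.com">Mail</a>'
)

internal, external = classify_links(SAMPLE_PAGE, "https://example.com/blog/index.html")
print("internal:", internal)
print("external:", external)
```

A crawler built on this would typically queue only the internal list, which keeps a multi-page crawl from wandering onto unrelated sites.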

Capability assessment

✓ Purpose & Capability

Name, description, README, SKILL.md examples, and the included Python script all align: they implement HTML scraping, table detection, link/structure extraction, and crawling. There are no unrelated environment variables, binaries, or config paths requested.

ℹ Instruction Scope

SKILL.md instructs running the included script (e.g. `uv run ... python scripts/scraper.py`) and documents options like respecting robots.txt, delay, and --ignore-robots. The instructions do not ask for unrelated system reads or credentials. Note: examples use `uv run --with` to auto-install dependencies at runtime — this will pull packages (beautifulsoup4, lxml) from package sources when executed.
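Before relying on runtime installation, it can help to see which dependencies would actually be pulled. The sketch below checks whether the import names for the two packages mentioned above are already available locally. It assumes the usual mapping of the `beautifulsoup4` distribution to the `bs4` module; `REQUIRED` and `missing_packages` are hypothetical names, and nothing here installs or downloads anything.

```python
from importlib.util import find_spec

# Import name -> distribution name on the package index.
REQUIRED = {"bs4": "beautifulsoup4", "lxml": "lxml"}


def missing_packages(required=REQUIRED):
    """Return distribution names whose import module cannot be found locally."""
    return [pkg for module, pkg in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Would be fetched at runtime:", ", ".join(missing))
else:
    print("All scraper dependencies are already installed.")
```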

✓ Install Mechanism

No install spec is present (instruction-only install), and the script relies on common Python libraries. No downloads from unknown URLs or archive extraction are present in the provided code. The only install-like behavior implied is runtime package install via the example `uv run --with`, which is expected for Python dependencies.

✓ Credentials

The skill requires no environment variables, credentials, or config paths. The script performs network requests to target URLs (expected for a scraper) and does not reference other system secrets in the visible code.

✓ Persistence & Privilege

The skill is not always-enabled and uses normal model invocation defaults. It does not request permanent presence or modify other skills; its operations are local (fetching remote pages and printing or writing outputs).

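How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install smart-web-scraper
Once installed, invoke the Skill by name or trigger it with /smart-web-scraper
Provide the required inputs described in the Skill's parameter documentation to get structured output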

Version history

v1.0.0

Initial release of smart-web-scraper:
- Extract structured data from any web page using CSS selectors or auto-detection.
- Supports multiple output formats: text, JSON, CSV, and Markdown.
- Includes commands for extracting tables, links, structured page data, and multi-page crawling.
- Respects robots.txt and includes configurable rate limiting.
- CLI documentation and usage examples provided for quick start.
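Configurable rate limiting usually amounts to enforcing a minimum delay between consecutive requests. Here is a minimal sketch of that idea; the skill's actual mechanism and option names are not shown in this listing, so `RateLimiter`, `delay_seconds`, and the injectable `clock` and `sleep` parameters are assumptions made for illustration.

```python
import time


class RateLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self._clock = clock   # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None     # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to honor the delay; return seconds slept."""
        pause = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            pause = max(0.0, self.delay - elapsed)
            if pause > 0:
                self._sleep(pause)
        self._last = self._clock()
        return pause


# Example: space two (hypothetical) requests at least 0.2 s apart.
limiter = RateLimiter(delay_seconds=0.2)
for url in ["https://example.com/a", "https://example.com/b"]:
    waited = limiter.wait()
    print(f"{url}: waited {waited:.2f}s before fetching")
```

The first call never sleeps; later calls sleep only for the part of the delay that has not already elapsed, so slow pages do not add extra waiting on top of the configured interval.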

Metadata

Slug: smart-web-scraper

Version: 1.0.0

License: none listed

Total installs: 14

Current installs: 14

Versions: 1

FAQ

What is Smart Web Scraper?

Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 2354 total downloads to date.

How do I install Smart Web Scraper?

Run the command "/install smart-web-scraper" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Smart Web Scraper free?

Yes. Smart Web Scraper is completely free (free and open source) to download, install, and use.

Which platforms does Smart Web Scraper support?

Smart Web Scraper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Smart Web Scraper?

It is developed and maintained by mariusfit (@mariusfit); the current version is v1.0.0.