Smart Web Scraper

by mariusfit · GitHub ↗ · v1.0.0
cross-platform ✓ Security Clean
Downloads 2354
Stars 0
Active Installs 14
Versions 1
Install in OpenClaw
/install smart-web-scraper
Description
Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we...
Usage Guidance
This skill appears coherent and implements a normal static-HTML scraper. Before installing or running:
  1. Review the full scripts/scraper.py file yourself (the provided view was truncated here) to confirm there are no hidden network callbacks or unexpected behavior.
  2. Run it in a sandbox or limited environment first to ensure it only fetches the target sites and does not contact unknown endpoints.
  3. Be mindful of legal and terms-of-service constraints and of robots.txt; the tool can override robots rules with --ignore-robots.
  4. Note that runtime dependency installation (e.g., via `uv run --with` or pip) will fetch code from PyPI; only install packages you trust.
  5. Do not supply unrelated credentials (none are required).
If you need higher assurance, ask the publisher for a source repository or sign-off and verify the remaining (truncated) portion of the script.
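As a first pass at the review step above, it can help to list what a script imports before reading it line by line. The following is a minimal sketch using Python's standard `ast` module. The `WATCHLIST` set and the function names are illustrative assumptions, not part of the skill, and the sample string stands in for the real contents of scripts/scraper.py.

```python
import ast

# Illustrative, non-exhaustive list of modules that can open network
# connections or spawn processes. This is an assumption for the sketch.
WATCHLIST = {"urllib", "http", "socket", "requests", "subprocess", "ftplib", "smtplib"}


def list_imports(source: str) -> set[str]:
    """Return the top-level module names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


def flag_imports(source: str) -> set[str]:
    """Return the imported modules that appear on the watchlist."""
    return list_imports(source) & WATCHLIST


# Stand-in source text; in practice, read the actual script file instead.
sample = "import urllib.request\nfrom bs4 import BeautifulSoup\nimport json\n"
print(sorted(flag_imports(sample)))
```

A flagged import is not evidence of a problem (a scraper is expected to import `urllib`); it only shows where to look first. Dynamic imports and obfuscated code would not be caught by this kind of check, so it does not replace reading the script or running it in a sandbox.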
Capability Analysis
Type: OpenClaw Skill
Name: smart-web-scraper
Version: 1.0.0
The OpenClaw AgentSkills bundle 'smart-web-scraper' is a web scraping tool that uses standard Python libraries (`urllib`, `BeautifulSoup`, `lxml`) to extract data from user-specified URLs. The `SKILL.md` and `README.md` clearly describe its functionality, and the `scripts/scraper.py` code implements this as described, including features like CSS selectors, table extraction, link analysis, and multi-page crawling. While the script performs network requests to arbitrary URLs and writes output to user-specified file paths, these are core functionalities of a web scraper and do not show evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints, persistence mechanisms, or arbitrary code execution beyond the skill's stated purpose. The instructions in the markdown files are straightforward and do not contain any prompt injection attempts.
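To make the table-extraction feature concrete, here is a minimal sketch of how rows can be pulled out of static HTML. It is not the skill's code: the skill reportedly uses `BeautifulSoup` and `lxml`, whereas this example uses only the standard library's `html.parser` so that it runs without extra packages. The `TableExtractor` class is a hypothetical name, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished and in-progress tables
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


html = (
    "<table><tr><th>Name</th><th>Price</th></tr>"
    "<tr><td>Widget</td><td>9.99</td></tr></table>"
)
extractor = TableExtractor()
extractor.feed(html)
print(extractor.tables)
```

Because this works on the HTML as delivered, content rendered later by JavaScript would not appear, which matches the "static-HTML scraper" characterization above.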
Capability Assessment
Purpose & Capability
Name, description, README, SKILL.md examples, and the included Python script all align: they implement HTML scraping, table detection, link/structure extraction, and crawling. There are no unrelated environment variables, binaries, or config paths requested.
Instruction Scope
SKILL.md instructs running the included script (e.g. `uv run ... python scripts/scraper.py`) and documents options like respecting robots.txt, delay, and --ignore-robots. The instructions do not ask for unrelated system reads or credentials. Note: examples use `uv run --with` to auto-install dependencies at runtime — this will pull packages (beautifulsoup4, lxml) from package sources when executed.
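The robots.txt and delay options mentioned above can be illustrated with the standard library's `urllib.robotparser`. This is a sketch under stated assumptions, not the skill's implementation: the function names, the `ignore_robots` parameter (mirroring the documented `--ignore-robots` flag), and the example robots.txt content are all hypothetical, and no network request is made.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, url, user_agent="*", ignore_robots=False):
    """Return True if the URL may be fetched, or if robots rules are overridden."""
    return ignore_robots or parser.can_fetch(user_agent, url)


def crawl_plan(urls, parser, delay=1.0, ignore_robots=False, sleep=time.sleep):
    """Return the URLs that would be fetched, pausing `delay` seconds between them."""
    fetched = []
    for url in urls:
        if not allowed(parser, url, ignore_robots=ignore_robots):
            continue
        if fetched:
            sleep(delay)  # rate limiting between consecutive requests
        fetched.append(url)  # a real scraper would perform the request here
    return fetched


# Hypothetical robots.txt content for illustration.
robots = build_robots("User-agent: *\nDisallow: /private/\n")
urls = ["https://example.com/public", "https://example.com/private/data"]
print(crawl_plan(urls, robots, delay=0))
print(crawl_plan(urls, robots, delay=0, ignore_robots=True))
```

With the rules above, only the public URL is planned by default, while the override includes both. That difference is why the `--ignore-robots` option deserves care before use.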
Install Mechanism
No install spec is present (instruction-only install), and the script relies on common Python libraries. No downloads from unknown URLs or archive extraction are present in the provided code. The only install-like behavior implied is runtime package install via the example `uv run --with`, which is expected for Python dependencies.
Credentials
The skill requires no environment variables, credentials, or config paths. The script performs network requests to target URLs (expected for a scraper) and does not reference other system secrets in the visible code.
Persistence & Privilege
The skill is not always-enabled and uses normal model invocation defaults. It does not request permanent presence or modify other skills; its operations are local (fetching remote pages and printing or writing outputs).
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install smart-web-scraper
  3. After installation, invoke the skill by name or use /smart-web-scraper
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of smart-web-scraper:
  - Extract structured data from any web page using CSS selectors or auto-detection.
  - Supports multiple output formats: text, JSON, CSV, and Markdown.
  - Includes commands for extracting tables, links, structured page data, and multi-page crawling.
  - Respects robots.txt and includes configurable rate limiting.
  - CLI documentation and usage examples provided for quick start.
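The output formats listed in the release notes can be illustrated with a short sketch that renders the same extracted rows as JSON, CSV, and Markdown. The function names and the sample rows are assumptions made for this example. The skill's actual output layout is not shown in the listing and may differ.

```python
import csv
import io
import json


def to_json(rows):
    """Render rows as a JSON array of objects keyed by the header row."""
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body], indent=2)


def to_csv(rows):
    """Render rows as CSV text, header row first."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_markdown(rows):
    """Render rows as a Markdown table with a separator under the header."""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


# Hypothetical extracted table: first row is the header.
rows = [["Name", "Price"], ["Widget", "9.99"]]
print(to_json(rows))
print(to_csv(rows))
print(to_markdown(rows))
```

Keeping extraction separate from formatting, as in this sketch, means one set of rows can feed any of the output formats.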
Metadata
Slug smart-web-scraper
Version 1.0.0
License
All-time Installs 14
Active Installs 14
Total Versions 1
Frequently Asked Questions

What is Smart Web Scraper?

Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we... It is an AI Agent Skill for Claude Code / OpenClaw, with 2354 downloads so far.

How do I install Smart Web Scraper?

Run "/install smart-web-scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Smart Web Scraper free?

Yes, Smart Web Scraper is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Smart Web Scraper support?

Smart Web Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Smart Web Scraper?

It is built and maintained by mariusfit (@mariusfit); the current version is v1.0.0.
