
Smart Web Scraper

Name: Smart Web Scraper
Author: mariusfit

by mariusfit · v1.0.0

cross-platform ✓ Security Clean

Downloads: 2354

Install in OpenClaw

/install smart-web-scraper

Description

Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a we...

Usage Guidance

This skill appears coherent and implements a normal static-HTML scraper. Before installing or running: (1) Review the full scripts/scraper.py file yourself (the provided view was truncated here) to confirm there are no hidden network callbacks or unexpected behavior; (2) Run it in a sandbox or limited environment first to ensure it only fetches the target sites and does not contact unknown endpoints; (3) Be mindful of legal/terms-of-service and robots.txt — the tool can override robots rules with --ignore-robots; (4) Note that runtime dependency installation (e.g., via `uv run --with` or pip) will fetch code from PyPI — only install packages you trust; (5) Do not supply unrelated credentials (none are required). If you need higher assurance, ask the publisher for a source repository or sign-off and verify the remaining (truncated) portion of the script.
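As a companion to point (3), the snippet below is a minimal sketch of checking robots.txt rules yourself before pointing any scraper at a site. It uses only Python's standard `urllib.robotparser` and is not taken from `scripts/scraper.py`; the helper name `is_allowed`, the user-agent string, and the sample rules are hypothetical and for illustration only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/public/page", "/private/data"):
        url = "https://example.com" + path
        print(url, is_allowed(SAMPLE_ROBOTS, "my-scraper", url))
```

In real use you would fetch the site's own robots.txt (for example with `RobotFileParser.set_url` and `read`) rather than pass a literal string.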

Capability Analysis

Type: OpenClaw Skill

Name: smart-web-scraper

Version: 1.0.0

The OpenClaw AgentSkills bundle 'smart-web-scraper' is a web scraping tool that uses Python's standard `urllib` module together with the third-party `BeautifulSoup` and `lxml` libraries to extract data from user-specified URLs. The `SKILL.md` and `README.md` clearly describe its functionality, and the `scripts/scraper.py` code implements this as described, including features like CSS selectors, table extraction, link analysis, and multi-page crawling. While the script performs network requests to arbitrary URLs and writes output to user-specified file paths, these are core functions of a web scraper and show no evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints, persistence mechanisms, or arbitrary code execution beyond the skill's stated purpose. The instructions in the markdown files are straightforward and do not contain any prompt injection attempts.

Capability Assessment

✓ Purpose & Capability

Name, description, README, SKILL.md examples, and the included Python script all align: they implement HTML scraping, table detection, link/structure extraction, and crawling. There are no unrelated environment variables, binaries, or config paths requested.
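To give a rough sense of what table detection on static HTML involves, here is a minimal sketch using only the standard library's `html.parser`. It is an assumption-laden illustration, not the skill's implementation (which the review says relies on `BeautifulSoup` and `lxml`); the class `TableExtractor`, the function `extract_tables`, and the sample markup are all hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html: str):
    """Return all tables found in html as nested lists of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Hypothetical sample markup, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
</table>
"""

if __name__ == "__main__":
    print(extract_tables(SAMPLE_HTML))
```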

ℹ Instruction Scope

SKILL.md instructs running the included script (e.g. `uv run ... python scripts/scraper.py`) and documents options like respecting robots.txt, delay, and --ignore-robots. The instructions do not ask for unrelated system reads or credentials. Note: examples use `uv run --with` to auto-install dependencies at runtime — this will pull packages (beautifulsoup4, lxml) from package sources when executed.

✓ Install Mechanism

No install spec is present (instruction-only install), and the script relies on common Python libraries. No downloads from unknown URLs or archive extraction are present in the provided code. The only install-like behavior implied is runtime package install via the example `uv run --with`, which is expected for Python dependencies.

✓ Credentials

The skill requires no environment variables, credentials, or config paths. The script performs network requests to target URLs (expected for a scraper) and does not reference other system secrets in the visible code.

✓ Persistence & Privilege

The skill is not always-enabled and uses normal model invocation defaults. It does not request permanent presence or modify other skills; its operations are local (fetching remote pages and printing or writing outputs).

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install smart-web-scraper
3. After installation, invoke the skill by name or use /smart-web-scraper
4. Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.0

Initial release of smart-web-scraper:
- Extract structured data from any web page using CSS selectors or auto-detection.
- Supports multiple output formats: text, JSON, CSV, and Markdown.
- Includes commands for extracting tables, links, structured page data, and multi-page crawling.
- Respects robots.txt and includes configurable rate limiting.
- CLI documentation and usage examples provided for quick start.
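For readers unfamiliar with how the same extracted rows can be emitted as JSON or CSV, the following is a small sketch using only Python's standard `json` and `csv` modules. It is not the skill's code; the function names `to_json` and `to_csv` and the sample rows are hypothetical, and it assumes the first row holds the column headers.

```python
import csv
import io
import json


def to_json(rows):
    """Serialize rows as a JSON array of objects keyed by the header row."""
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body], indent=2)


def to_csv(rows):
    """Serialize rows as CSV text, header row first."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical extracted rows, for illustration only.
SAMPLE_ROWS = [["Name", "Price"], ["Widget", "9.99"]]

if __name__ == "__main__":
    print(to_json(SAMPLE_ROWS))
    print(to_csv(SAMPLE_ROWS))
```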

Metadata

Slug: smart-web-scraper

Version: 1.0.0

License: not specified

All-time Installs: 14

Active Installs: 14

Total Versions: 1

Frequently Asked Questions

What is Smart Web Scraper?

Smart Web Scraper is an AI Agent Skill for Claude Code / OpenClaw that extracts structured data from web pages. It supports CSS selectors, auto-detection of tables and lists, and JSON/CSV output formats, and has 2354 downloads so far.

How do I install Smart Web Scraper?

Run "/install smart-web-scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Smart Web Scraper free?

Yes, Smart Web Scraper is free to download, install, and use. The listing describes it as open-source, though no license is specified in the metadata.

Which platforms does Smart Web Scraper support?

Smart Web Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Smart Web Scraper?

It is built and maintained by mariusfit (@mariusfit); the current version is v1.0.0.
