← Back to Skills Marketplace
agistack

Scraper

by AGIstack · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
738
Downloads
0
Stars
9
Active Installs
1
Versions
Install in OpenClaw
/install scraper
Description
Structured extraction and cleanup for public, user-authorized web pages. Use when the user wants to collect, clean, summarize, or transform content from acce...
Usage Guidance
This skill appears to do what it says: fetch public pages, extract text, and save results locally. Before installing or enabling it for autonomous use, consider: (1) the scripts will fetch any URL you or the agent give them — add URL validation or an allowlist if you need to block internal/IP ranges (SSRF risk); (2) there is no enforcement of 'public/user-authorized' rules — rely on agent policies or operator oversight to prevent misuse (paywall/login bypass, private endpoints); (3) outputs are stored at ~/.openclaw/workspace/memory/scraper — check and clean that directory if sensitive data might be saved. If you only plan manual, user-initiated runs and trust the callers, the skill is coherent and appropriate.
Capability Analysis
Type: OpenClaw Skill Name: scraper Version: 1.0.0 The scraper skill is a standard utility for fetching and cleaning public web content, with all operations restricted to local storage (~/.openclaw/workspace/). The Python scripts (fetch_page.py, extract_text.py, save_output.py) use the standard urllib library and basic regex for HTML stripping, and the SKILL.md instructions include explicit safety boundaries against bypassing access controls or collecting credentials.
Capability Assessment
Purpose & Capability
Name/description match the included scripts: fetching pages, extracting text, saving outputs locally. No unrelated credentials, binaries, or installs are requested.
Instruction Scope
SKILL.md and scripts restrict work to public/user-authorized pages and local-only storage. However, there is no runtime enforcement of those rules: the scripts will fetch any URL provided (including internal IPs/localhost), and there is no robots/paywall/captcha checking, rate limiting, or URL validation. That is expected for a small helper but is a security consideration rather than an incoherence.
Install Mechanism
No install spec and no remote downloads; the skill is instruction-only with bundled Python scripts, which minimizes install risk.
Credentials
The skill requires no environment variables or credentials and only writes under ~/.openclaw/workspace/memory/scraper, consistent with the declared purpose.
Persistence & Privilege
The skill is not always-enabled and can be invoked by the user. It does create persistent local state (jobs.json and output files) under the user's home — this is coherent but users should be aware of stored files and cleanup policy.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install scraper
  3. After installation, invoke the skill by name or use /scraper
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
[email protected]: Local-first extraction for public, user-authorized pages. Added page fetch, text extraction, local output saving, and job tracking.
Metadata
Slug scraper
Version 1.0.0
License MIT-0
All-time Installs 9
Active Installs 9
Total Versions 1
Frequently Asked Questions

What is Scraper?

Structured extraction and cleanup for public, user-authorized web pages. Use when the user wants to collect, clean, summarize, or transform content from acce... It is an AI Agent Skill for Claude Code / OpenClaw, with 738 downloads so far.

How do I install Scraper?

Run "/install scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Scraper free?

Yes, Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Scraper support?

Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Scraper?

It is built and maintained by AGIstack (@agistack); the current version is v1.0.0.

💬 Comments