
WebScraper

by LesliePie · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 242

Install in OpenClaw

/install webscraper

Description

Extract readable content from web pages. Use when: user wants to read article content, fetch documentation, grab product info, or get text from URLs. NOT for...

Usage Guidance

This skill appears to do what it says (fetch and extract page content), but the runtime instructions include unsafe command patterns and operational advice you should not run verbatim. Before installing or using:

(1) Avoid the node -e $(curl ...) pattern. Instead, download HTML to a file (curl -s URL > page.html) and run a safe parser that reads the file, or fetch directly from Node using libraries (axios/fetch) to avoid shell injection.

(2) Prefer installing libraries per-project rather than npm -g to reduce global risk.

(3) Do not use proxies or UA tricks to bypass site protections unless you have explicit permission; this may violate terms of service and law.

(4) Be cautious piping remote content into executables (curl | program) because that can execute untrusted data.

If the maintainer can sanitize the examples (safe Node fetch, avoid shell interpolation of remote data, clarify proxy guidance), the skill would be coherent and much safer.
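One way to follow point (1) is sketched below: fetch the page from inside Node so the HTML only ever exists as a string value, never as shell or JavaScript source. This is a minimal sketch, not the skill's own code. It assumes Node 18 or later (built-in fetch); the helper names htmlToText and fetchPageText are hypothetical, and the regex-based tag stripping is a deliberately naive stand-in for a real parser such as cheerio installed per project.

```javascript
// Hypothetical helper: naive HTML-to-text conversion, for illustration only.
// It does not decode entities or cope with malformed markup.
function htmlToText(html) {
  return html
    // Drop <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Replace every remaining tag with a space.
    .replace(/<[^>]+>/g, " ")
    // Collapse runs of whitespace.
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical helper: fetch a page and return its text.
// The response body is passed to htmlToText as an argument, so nothing
// from the remote page is ever interpolated into a command or a script.
async function fetchPageText(url) {
  const parsed = new URL(url); // throws on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  // Abort the request if the server has not answered within 10 seconds.
  const response = await fetch(parsed, { signal: AbortSignal.timeout(10_000) });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${parsed.href}`);
  }
  return htmlToText(await response.text());
}
```

For example, fetchPageText("https://example.com/").then(console.log) would print the extracted text; the URL is a placeholder.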

Capability Analysis

Type: OpenClaw Skill
Name: webscraper
Version: 1.0.0

The webscraper skill bundle provides standard instructions and command-line examples (using curl and Node.js) for an AI agent to extract content from web pages. The code and markdown files (SKILL.md, package.json) align strictly with the stated purpose of web scraping and include best practices such as rate limiting and error handling without any signs of malicious intent or data exfiltration.
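The rate limiting and error handling mentioned above can be sketched as follows. This is an illustration under assumptions, not the skill's actual instructions: fetchSequentially is a hypothetical helper, the one-second default delay is an arbitrary choice, and the caller supplies fetchOne, any async function that takes a URL and returns its content.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: fetch URLs one at a time with a pause between requests.
// A failure on one URL is recorded and does not stop the remaining requests.
async function fetchSequentially(urls, fetchOne, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    if (i > 0) {
      await sleep(delayMs); // wait before every request except the first
    }
    try {
      results.push({ url: urls[i], ok: true, value: await fetchOne(urls[i]) });
    } catch (error) {
      results.push({ url: urls[i], ok: false, error: String(error) });
    }
  }
  return results;
}
```

Requests run strictly in sequence, so the target site never sees more than one request in flight from this helper.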

Capability Assessment

✓ Purpose & Capability

Name, description, and declared binaries (curl, node) are appropriate for a web content extraction skill. No unrelated credentials, config paths, or unexpected system access are requested.

⚠ Instruction Scope

Most instructions stay within scraping/extraction scope, but several recommendations are risky or inconsistent: (1) the provided Node one-liner embeds $(curl ...) into a node -e string, which causes shell substitution of untrusted HTML into executable JS and can enable command injection or execution if the fetched content contains quotes/backticks; (2) suggestions to 'use proxy' and to set UA to avoid bot detection encourage evasion of anti-bot measures (conflicts with the 'Respect robots.txt' admonition); (3) use of curl | readability and piping remote content into local commands is suggested without caution about executing untrusted data. These are sloppy/insecure operational patterns that could lead to accidental code execution or misuse.
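The hazard in point (1) can be reproduced without a shell. The sketch below is an assumption about the general shape of the problem, not the skill's actual one-liner: runInterpolated pastes the fetched text into program source before it is parsed, as $(curl ...) inside a node -e string does, while runAsData keeps the program fixed and receives the text as a value. Both function names and the payload are hypothetical, and the payload only sets a harmless flag.

```javascript
// Mimics shell substitution: the fetched text becomes part of the source code.
function runInterpolated(fetchedHtml) {
  const source = `const html = '${fetchedHtml}'; return html.length;`;
  return new Function(source)();
}

// Fixed program: the fetched text is only ever a string value.
function runAsData(fetchedHtml) {
  return fetchedHtml.length;
}

// A page body containing a single quote is enough to break out of the string literal.
const hostilePage = "'; globalThis.injectedFlag = true; '";

globalThis.injectedFlag = false;
runInterpolated(hostilePage);
console.log(globalThis.injectedFlag); // true: text from the page ran as code

globalThis.injectedFlag = false;
runAsData(hostilePage);
console.log(globalThis.injectedFlag); // false: the page stayed data
```

In a real Node process, code injected this way runs with the full privileges of that process, which is why the fixed-program shape is the safer default.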

ℹ Install Mechanism

The skill is instruction-only (no install spec) which is low-risk, but the doc recommends installing global tools (npm -g cheerio, readability-cli, html2text). Global npm installs and suggested third-party CLIs are normal for this task but increase attack surface and require user discretion; no install URLs or obscure downloads are present in the package metadata.

✓ Credentials

The skill does not request environment variables, credentials, or config paths. Recommendations (e.g., using proxies) might imply credential usage in practice, but nothing is declared or required by the skill itself.

✓ Persistence & Privilege

Flags are default (always:false, user-invocable:true, autonomous invocation allowed). The skill does not request permanent system presence nor modify other skills or system configs.

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install webscraper
3. After installation, invoke the skill by name or use /webscraper.
4. Provide required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.0

Initial release: web content extraction

Metadata

Slug: webscraper
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1

Frequently Asked Questions

What is WebScraper?

WebScraper extracts readable content from web pages. Use when: user wants to read article content, fetch documentation, grab product info, or get text from URLs. NOT for... It is an AI agent skill for Claude Code / OpenClaw, with 242 downloads so far.

How do I install WebScraper?

Run "/install webscraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is WebScraper free?

Yes, WebScraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does WebScraper support?

WebScraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created WebScraper?

It is built and maintained by LesliePie (@lesliepie); the current version is v1.0.0.
