
Web Crawl

Name: Web Crawl
Author: nowhitestar

by 不白 · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 235

Install in OpenClaw

/install web-crawl

Description

Advanced web crawling and content extraction tool with multiple extraction modes

Usage Guidance

This skill appears to do what it says (crawl and extract web content), but take the following precautions before installing it or allowing it to run autonomously:

- Review the full web_crawl.py file in your environment (the provided manifest shows the file content truncated), because hidden code or unexpected behavior could be present in the omitted portion.
- Be aware that crawling arbitrary URLs can access internal services (metadata endpoints, intranet, admin consoles) if the agent has network access. Avoid allowing this skill to scan sensitive hosts, or provide a restricted allowlist of target domains.
- Examples include an exec-style command that runs Python from the skill workspace. Do not run shell execs or workspace-local scripts without explicit review and user consent.
- Ensure required Python dependencies (requests, beautifulsoup4) are installed from trusted sources.

If you want higher assurance, ask the skill author for a full unobfuscated source, or run it in a sandboxed environment and limit its outbound network access and allowed target domains.
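The "restricted allowlist of target domains" mentioned above could be enforced with a small check before any URL is handed to the crawler. The following is a minimal sketch, not part of the skill: the function name `is_allowed` and the domains in `ALLOWED_DOMAINS` are hypothetical placeholders.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains you actually intend to crawl.
ALLOWED_DOMAINS = frozenset({"example.com", "docs.python.org"})


def is_allowed(url: str, allowed: Iterable[str] = ALLOWED_DOMAINS) -> bool:
    """Return True only for http(s) URLs whose host is an allowed domain
    or a subdomain of one."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Normalize the host: lowercase, no trailing dot.
    host = (parts.hostname or "").lower().rstrip(".")
    # Match the exact domain or a dot-separated subdomain, so that
    # "example.com.evil.test" and "notexample.com" are rejected.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on a leading dot rather than a plain suffix is the important design choice here: a bare `endswith("example.com")` would also accept look-alike hosts.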

Capability Analysis

Type: OpenClaw Skill
Name: web-crawl
Version: 1.0.0

The 'web-crawl' skill is a legitimate and well-structured tool designed for advanced web content extraction and research. The core logic in `web_crawl.py` uses standard libraries (requests, BeautifulSoup) to convert HTML to various formats like Markdown and JSON, including support for CSS selectors and parallel crawling. The `research.py` script provides templates and orchestration logic to help an AI agent perform multi-step research. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found; the code and instructions are entirely consistent with the stated purpose of web crawling and information synthesis.
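The source of `web_crawl.py` is not reproduced on this page, so the following only illustrates the general idea of turning HTML into structured output. It is a sketch of a "links"-style extraction that uses the standard-library `html.parser` as a stand-in for BeautifulSoup so that it runs without extra dependencies. The names `LinkExtractor` and `extract_links` and the output shape are assumptions, not the skill's API.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute link targets and their anchor text from HTML."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []        # list of {"href": ..., "text": ...}
        self._current = None   # link currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._current = {"href": urljoin(self.base_url, href), "text": ""}

    def handle_data(self, data):
        # Accumulate text only while inside an <a> element.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


def extract_links(html: str, base_url: str) -> list:
    """Return every link in `html` as a dict with absolute href and text."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

The resulting list of dictionaries can be serialized with `json.dumps` if JSON output is wanted.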

Capability Assessment

✓ Purpose & Capability

The name/description (web crawling and extraction) match the included code (web_crawl.py, research.py), README, and examples. No unrelated credentials, binaries, or config paths are requested.

ℹ Instruction Scope

SKILL.md and examples instruct the agent to run searches, crawl URLs, and synthesize results — this is expected. However, the skill enables fetching arbitrary URLs, which can reach internal or otherwise sensitive endpoints if the agent has network access (SSRF-like risk). EXAMPLES.md also shows an exec-style local python invocation that, if executed, runs code from the skill workspace; the agent should not run arbitrary shell execs without user consent.
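One way to reduce the SSRF-like risk described above is to reject URLs that point directly at non-public IP addresses before fetching them. This is a minimal sketch using only the standard library; the function name `is_internal_ip_literal` is hypothetical, and the check covers IP literals only.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_ip_literal(url: str) -> bool:
    """Return True if the URL's host is an IP literal in a non-public range
    (private, loopback, link-local, or reserved)."""
    host = urlsplit(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Host is a name, not an IP literal. It may still resolve to an
        # internal address, so resolve it and re-check before fetching.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

Because hostnames such as `localhost` or attacker-controlled DNS names are not covered, a fuller guard would also resolve the name and apply the same test to each resolved address.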

✓ Install Mechanism

No install spec is provided (instruction-only installation). The package contains Python source files and documents pip dependencies (requests, beautifulsoup4). No remote downloads or unusual install steps were found.
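Since there is no install spec, the documented pip dependencies have to be present already. A quick way to confirm that before running the skill is sketched below. The helper `missing_packages` is hypothetical; the mapping reflects that the `beautifulsoup4` package is imported as `bs4`.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements.
REQUIRED_MODULES = {"requests": "requests", "beautifulsoup4": "bs4"}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All documented dependencies are importable.")
```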

✓ Credentials

The skill requests no environment variables, credentials, or config paths. Its network access is inherent to crawling and is proportional to its stated purpose.

✓ Persistence & Privilege

No elevated persistence or 'always' flag is requested. The skill is user-invocable and allows autonomous invocation by default (normal for skills) but does not appear to modify other skills or system-wide settings.

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install web-crawl
3. After installation, invoke the skill by name or use /web-crawl
4. Provide the required inputs per the skill's parameter spec to get structured output.

Version History

v1.0.0

- Initial release of web-crawl skill providing advanced web content extraction.
- Supports multiple extraction modes: text, markdown, links, structured, and full.
- Includes three main tools: web_crawl (single URL), parallel_crawl (multiple URLs), and research_topic (multi-step research).
- Activated by keywords related to web research, crawling, and analysis in English and Chinese.
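To illustrate what crawling multiple URLs in parallel can look like, here is a minimal sketch built on `concurrent.futures`. It is not the skill's `parallel_crawl` implementation: the names `fetch` and `parallel_crawl_sketch`, the result shape, and the use of `urllib.request` in place of requests are all assumptions made so the example runs with the standard library alone.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL with the standard library (stand-in for requests.get)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parallel_crawl_sketch(urls, fetcher=fetch, max_workers=4):
    """Fetch several URLs concurrently and map each URL to its result.

    A failure on one URL is recorded in its entry instead of aborting
    the whole batch.
    """
    urls = list(urls)

    def safe(url):
        try:
            return {"ok": True, "content": fetcher(url)}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with urls.
        results = list(pool.map(safe, urls))
    return dict(zip(urls, results))
```

Passing the fetcher as a parameter keeps the concurrency logic testable without network access, and gives a natural place to plug in URL checks before any request is made.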

Metadata

Slug: web-crawl
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1

Frequently Asked Questions

What is Web Crawl?

Advanced web crawling and content extraction tool with multiple extraction modes. It is an AI Agent Skill for Claude Code / OpenClaw, with 235 downloads so far.

How do I install Web Crawl?

Run "/install web-crawl" in the OpenClaw or Claude Code chat to install it in one step. The skill's documented Python dependencies (requests, beautifulsoup4) must also be available in your environment.

Is Web Crawl free?

Yes, Web Crawl is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Web Crawl support?

Web Crawl is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Web Crawl?

It is built and maintained by 不白 (@nowhitestar); the current version is v1.0.0.
