Total downloads: 1080
Favorites: 0
Current installs: 7
Versions: 1
Install in OpenClaw
/install openclaw-scrapling
Description
Advanced web scraping with anti-bot bypass, JavaScript support, and adaptive selectors. Use when scraping websites with Cloudflare protection, dynamic conten...
Security recommendations
This package is internally coherent for an advanced web scraper: it will run Python code, download browser binaries, and can access arbitrary URLs (including internal network addresses). Before installing:
- Only install if you trust the source (GitHub repo) and you accept that the skill will run code and download ~500MB of browser binaries.
- Expect session cookies and selector caches to be written under the skill directory (~/.openclaw/skills/scrapling/sessions and selector_cache.json). Remove those files if they may contain sensitive tokens.
- Do not pass secrets or site credentials to the tool unless you trust it and the host environment; CLI args (username/password) are stored only if you save sessions.
- If you are concerned about exfiltration or internal network access, run the skill in a restricted environment (network policies, sandbox, or VM) and inspect scrape.py and the installed scrapling package source before use.
- If you need to ensure minimal privilege, avoid enabling stealth/dynamic modes that start a browser or save sessions, and prefer one-off basic HTTP fetches with explicit safe target URLs.
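The cleanup step recommended above (removing saved sessions and the selector cache) could be scripted roughly as follows. This is a minimal sketch, not part of the skill: the function name `purge_scraper_state` is hypothetical, and the default directory is the path quoted in the advisory, which may differ on your installation.

```python
import shutil
from pathlib import Path

# Location quoted in the advisory above; treat it as an assumption and
# adjust it if the skill is installed elsewhere.
DEFAULT_SKILL_DIR = Path.home() / ".openclaw" / "skills" / "scrapling"


def purge_scraper_state(skill_dir: Path = DEFAULT_SKILL_DIR) -> list[str]:
    """Delete sessions/ and selector_cache.json under skill_dir, if present.

    Returns the names of the entries that were actually removed.
    (Hypothetical helper, not shipped with the skill.)
    """
    removed = []

    sessions = skill_dir / "sessions"
    if sessions.is_dir():
        shutil.rmtree(sessions)  # session files may hold cookies or tokens
        removed.append("sessions")

    cache = skill_dir / "selector_cache.json"
    if cache.is_file():
        cache.unlink()
        removed.append("selector_cache.json")

    return removed
```

Calling `purge_scraper_state()` with no argument targets the default directory. Browser binaries downloaded by `scrapling install` live in separate cache directories and are not touched by this sketch.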
Functional analysis
Type: OpenClaw Skill
Name: openclaw-scrapling
Version: 1.0.0
The skill is a powerful web scraping tool with legitimate functionality, but it exposes several high-risk capabilities that could be exploited by a compromised AI agent through prompt injection. Specifically, the `scrape.py` script allows arbitrary file writes via the `--output` argument, enables extensive control over network requests (URL, proxy, headers) which could lead to SSRF, and the `SKILL.md` documentation explicitly details how to run custom Python scripts, creating a potential RCE vector if an agent can be prompted to write and execute arbitrary code. While no direct malicious intent or prompt injection is found in the provided files, these capabilities present significant vulnerabilities.
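One way to limit the arbitrary-file-write risk described above is to validate any path before it is handed to `--output`. The sketch below is an assumption about how a wrapper around the skill might do this; `resolve_output_path` and the idea of a single allowed directory are illustrative and are not part of `scrape.py`.

```python
from pathlib import Path


def resolve_output_path(requested: str, allowed_dir: str) -> Path:
    """Return the resolved output path if it stays inside allowed_dir.

    Raises ValueError for paths that escape the directory, for example via
    '..' segments or an absolute path. (Hypothetical helper; requires
    Python 3.9+ for Path.is_relative_to.)
    """
    base = Path(allowed_dir).resolve()
    # Joining with an absolute 'requested' discards 'base', so absolute
    # paths outside the allowed directory are caught by the check below.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise ValueError(f"output path escapes {base}: {requested!r}")
    return target
```

A caller would pass the returned path to the scraper only when no exception is raised. Because the path is resolved first, symlinks that already exist inside the allowed directory and point outside it are also rejected.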
Capability assessment
Purpose & Capability
Name/description match the code and docs: scrape.py, examples, requirements.txt and skill.json all implement a scraper with stealth/dynamic/adaptive features. Declared required binaries (python3, pip) and Python package dependency (scrapling) are appropriate for the described functionality. Minor version differences in metadata (>=0.3.0 vs >=0.4.0) are not a red flag by themselves.
Instruction Scope
SKILL.md and scrape.py instruct the agent to run local scraping commands and to store sessions/selectors in the skill directory. The instructions allow scraping arbitrary URLs (external or internal), performing logins (username/password passed as CLI args), saving session files, screenshots, and writing outputs. They do not instruct reading unrelated system files or environment variables, but they do permit sending credentials via CLI args and persisting session tokens/cookies to disk.
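Since the instructions permit scraping internal as well as external URLs, a caller may want to screen targets before passing them to the skill. The following is a rough sketch using only the Python standard library; `is_blocked_target` is a hypothetical name, and the check covers only literal IP addresses and `localhost`. It does not resolve DNS names, so a hostname that resolves to an internal address would pass.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_target(url: str) -> bool:
    """Return True if url should not be scraped under a 'public web only' policy.

    Blocks non-HTTP(S) schemes, localhost, and literal IP addresses that are
    private, loopback, link-local, reserved, or unspecified.
    (Hypothetical helper; performs no DNS resolution.)
    """
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return True  # malformed URL, e.g. a broken IPv6 literal

    if parts.scheme not in ("http", "https") or not host:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True

    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: a stricter guard would resolve it and check each address.
        return False

    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_unspecified
    )
```

For example, `is_blocked_target("http://169.254.169.254/latest/meta-data/")` returns `True`, while `is_blocked_target("https://example.com/page")` returns `False`. Network-level controls (sandbox, VM, egress policy) remain the stronger defense, because redirects and DNS rebinding can bypass a pre-flight check like this.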
Install Mechanism
There is no built-in install spec in the registry entry, but the repo includes requirements.txt and documented install steps that call 'pip install -r requirements.txt' and 'scrapling install' which will download browser binaries (~500MB). This is expected for a browser-driven scraper; no obscure external download URLs or shorteners are used in the package itself. Browser downloads will occur at runtime when the helper command is run.
Credentials
The skill declares no required environment variables or credentials, which fits the described purpose. However, the tool accepts credentials via CLI arguments (username/password) and will persist session state (session files and selector_cache.json) in the skill directory. Those behaviors are reasonable for a scraper but mean the skill can store sensitive tokens/credentials if provided.
Persistence & Privilege
always:false (no forced installation). The skill writes files into its own directory (sessions/, selector_cache.json) and downloads browsers into standard caches during 'scrapling install'. This is normal for this class of tool but means data and cookies will persist on disk under the skill and browser cache directories unless cleaned.
How to use
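- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install openclaw-scrapling
- Once installation completes, invoke the Skill by name or trigger it with /openclaw-scrapling.
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.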
Version history
v1.0.0
Initial release of Scrapling skill for advanced, resilient web scraping.
- Supports scraping sites with anti-bot protection, JavaScript-rendered content, and frequent UI changes.
- Features stealth mode to bypass Cloudflare, bot detection, and use browser fingerprint spoofing.
- Handles dynamic content via Playwright-based automation and adaptive selectors for robust scraping across redesigns.
- Includes session management for login-required sites and support for proxies, rate limiting, and custom headers.
- Offers multiple extraction modes (text, markdown, attributes, multi-field) and output formats (JSON, CSV, TXT, MD, HTML).
- Provides both CLI commands and a Python API for flexibility.
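FAQ
What is OpenClaw Scrapling?
Advanced web scraping with anti-bot bypass, JavaScript support, and adaptive selectors. Use when scraping websites with Cloudflare protection, dynamic conten... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1080 cumulative downloads to date.
How do I install OpenClaw Scrapling?
Run the command "/install openclaw-scrapling" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is OpenClaw Scrapling free?
Yes. OpenClaw Scrapling is completely free (free and open source) and can be downloaded, installed, and used freely.
Which platforms does OpenClaw Scrapling support?
OpenClaw Scrapling is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops OpenClaw Scrapling?
It is developed and maintained by cryptos3c (@cryptos3c); the current version is v1.0.0.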