Zoomin Docs Portal Scraper Tool
Author: Justin Paul · v1.0.2
Total downloads: 758
Favorites: 0
Current installs: 0
Versions: 3
Install in OpenClaw
/install zoomin-scraper-recklessop
Description
Scrape documentation content from Zoomin Software portals using Playwright browser automation to handle dynamic content loading. Use when standard web fetchi...
Safe usage recommendations
This skill appears to be a straightforward Playwright-based scraper; it does not request credentials and shows no sign of hidden network exfiltration. However, before installing or running it:
- Note the mismatch: the SKILL.md claims 'Zoomin' but the code contains many Zerto-specific assumptions (default output names, URL sanitization and content cleanup). If you expect a generic Zoomin scraper, test on a small set of URLs first.
- You must run `pip install playwright` and `playwright install chromium` yourself; these commands download browser binaries. Prefer doing this in an isolated virtualenv or sandbox.
- The wrapper example in SKILL.md uses named parameters but the script expects positional args; call run_scraper.sh with: ./run_scraper.sh <urls_file> <output_dir> <venv_path>.
- The scraper will visit arbitrary URLs you supply and write text files to disk. Only provide URLs you are permitted to scrape (observe robots.txt / terms of service) and run the script in a directory where writing files is acceptable.
- If you want to be extra cautious, review/modify the code to remove or adapt Zerto-specific patterns, and run the scraper on a controlled test list before bulk use.
Given the repackaging inconsistencies and guidance mismatches, treat this skill as safe-but-suspicious until you've validated it in your environment.
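The advice above about testing on a small set of URLs and calling the wrapper with positional arguments could be sketched as follows. This is a minimal sketch, not part of the skill: `write_sample` and `build_command` are hypothetical helper names, and the one-URL-per-line input format is an assumption.

```python
from pathlib import Path


def write_sample(urls_file: str, sample_file: str, limit: int = 5) -> list[str]:
    """Copy the first `limit` non-empty lines of the URL list into a smaller test file.

    Assumes one URL per line; the skill's real input format is not confirmed here.
    """
    lines = Path(urls_file).read_text(encoding="utf-8").splitlines()
    urls = [ln.strip() for ln in lines if ln.strip()]
    sample = urls[:limit]
    Path(sample_file).write_text("\n".join(sample) + "\n", encoding="utf-8")
    return sample


def build_command(urls_file: str, output_dir: str, venv_path: str) -> list[str]:
    """Build the wrapper call in the positional order <urls_file> <output_dir> <venv_path>."""
    return ["./run_scraper.sh", urls_file, output_dir, venv_path]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the skill's script directory; passing a list rather than a single shell string keeps the arguments from being re-parsed by a shell.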
Functional analysis
Type: OpenClaw Skill
Name: zoomin-scraper-recklessop
Version: 1.0.2
The skill bundle is classified as suspicious due to several critical vulnerabilities that could lead to arbitrary code execution, local file read/write, and data exfiltration. Specifically, `scripts/run_scraper.sh` is vulnerable to shell injection by directly sourcing a user-provided `VENV_PATH` without validation. Additionally, `scripts/scrape_zoomin.py` and `scripts/analyze_docs_batch.py` accept file paths and output directories directly from command-line arguments without sanitization, enabling an attacker to read arbitrary local files (e.g., via `urls_file_path`) or write scraped content to arbitrary locations (e.g., via `output_dir`). While the skill's stated purpose is legitimate web scraping, these vulnerabilities present significant attack surfaces.
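One way to narrow the attack surface described above is to validate the paths before handing them to the scripts. The sketch below is an assumption about how a caller might do that, not code from the skill: `resolve_venv_activate` and `resolve_inside` are hypothetical names, and the `bin/activate` layout assumes a POSIX virtualenv.

```python
from pathlib import Path


def resolve_venv_activate(venv_path: str) -> Path:
    """Return the resolved activate script, or raise if the venv layout looks wrong."""
    venv = Path(venv_path).expanduser().resolve()
    activate = venv / "bin" / "activate"  # POSIX venv layout (assumption)
    if not activate.is_file():
        raise ValueError(f"no activate script under {venv}")
    return activate


def resolve_inside(base_dir: str, candidate: str) -> Path:
    """Resolve `candidate` against `base_dir` and require that it stays under it."""
    base = Path(base_dir).resolve()
    target = (base / candidate).resolve()
    try:
        target.relative_to(base)  # raises ValueError if target is outside base
    except ValueError:
        raise ValueError(f"{candidate!r} escapes {base}") from None
    return target
```

These checks only confirm that the paths point where the caller expects. They do not establish that the activate script itself is trustworthy, so sourcing a virtualenv you did not create remains a risk.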
Capability assessment
Purpose & Capability
The skill's stated purpose is scraping Zoomin-powered docs using Playwright, which matches the inclusion of Playwright-based scraper code. However, the code is heavily tailored to Zerto (default filenames and directories reference zerto_hyperv, sanitization removes 'help_zerto_com' prefixes, and regex cleans up 'From the Zerto User Interface'), indicating the package was repurposed from a Zerto-specific scraper. The example CLI in SKILL.md (named parameters) does not match run_scraper.sh (positional args). These are coherence issues (likely sloppy repackaging) but do not by themselves indicate extra malicious capability.
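To illustrate what adapting the Zerto-specific patterns might look like, here is a minimal sketch in which the prefix to strip and the phrases to remove are parameters rather than hard-coded values. It is not the skill's actual code: `url_to_filename` and `clean_text` are hypothetical names, and the naming scheme is an assumption.

```python
import re
from urllib.parse import urlparse


def url_to_filename(url: str, strip_prefix: str = "") -> str:
    """Turn a URL into a filesystem-safe .txt name, optionally dropping a leading prefix."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}"
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")
    if strip_prefix and name.startswith(strip_prefix):
        name = name[len(strip_prefix):].lstrip("_")
    return (name or "index") + ".txt"


def clean_text(text: str, noise_patterns: list[str]) -> str:
    """Remove caller-supplied boilerplate patterns, then collapse runs of blank lines."""
    for pattern in noise_patterns:
        text = re.sub(pattern, "", text)
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

For a Zerto portal, a caller would pass `strip_prefix="help_zerto_com"` and a pattern such as `r"From the Zerto User Interface"`; for another Zoomin portal, the same functions work with that portal's own values.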
Instruction Scope
SKILL.md instructs the user to manually install Playwright and to run the provided wrapper script which activates a virtualenv and runs the scraper. The runtime instructions only perform web navigation of user-supplied URLs, extract page content, and write text files to the specified output directory. The scripts do not attempt to read unrelated local files, access extra environment variables, or transmit scraped content to external endpoints other than the pages being scraped. Note: the script will visit arbitrary URLs provided by the user — only supply trusted/allowed targets and be mindful of legal/robots constraints.
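The runtime behavior described above (navigate to each URL, extract the page content, write a text file) could look roughly like this with Playwright's synchronous Python API. It is a sketch under stated assumptions, not the contents of `scripts/scrape_zoomin.py`: `output_path` and `scrape_all` are hypothetical names, and the `body` selector and `page_NNN.txt` naming are illustrative choices.

```python
from pathlib import Path


def output_path(output_dir: str, index: int) -> Path:
    """Illustrative naming scheme: page_001.txt, page_002.txt, ..."""
    return Path(output_dir) / f"page_{index:03d}.txt"


def scrape_all(urls: list[str], output_dir: str, timeout_ms: int = 30_000) -> list[Path]:
    """Render each URL in headless Chromium and save its visible text."""
    # Imported lazily so output_path() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    written = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        for index, url in enumerate(urls, start=1):
            # "networkidle" waits until network activity has gone quiet,
            # a rough proxy for dynamically loaded content being present.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            text = page.inner_text("body")
            target = output_path(output_dir, index)
            target.write_text(text, encoding="utf-8")
            written.append(target)
        browser.close()
    return written
```

Nothing in this loop sends the scraped text anywhere other than the local output directory, which matches the scope described above.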
Install Mechanism
There is no automated install spec; SKILL.md asks the user to run `pip install playwright` and `playwright install chromium`. That manual step downloads Playwright and browser binaries from upstream, which is expected for Playwright but does involve fetching executables over the network. The skill itself does not perform automatic remote installs or fetch arbitrary remote code.
Credentials
The skill requires no environment variables or credentials and the scripts do not access secrets. The only runtime requirement is a Python virtual environment path to activate. No disproportionate credential requests were found.
Persistence & Privilege
The skill is not always-enabled and does not alter other skills or global agent settings. It writes only to the output directory you provide and prints results to stdout; it does not persist credentials or attempt to install itself persistently.
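How to use
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install zoomin-scraper-recklessop
- Once installation finishes, call the Skill by name or trigger it with /zoomin-scraper-recklessop
- Provide the inputs required by the Skill's parameter description to get structured output.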
Version history
v1.0.2
Updated SKILL.md to generalize all descriptions and examples, removing product-specific names and user-specific paths.
v1.0.1
Updated SKILL.md to generalize descriptions and examples, removing specific product names and user paths as requested.
v1.0.0
Initial version, modified to exclude version dropdown from scraped text.
FAQ
What is Zoomin Docs Portal Scraper Tool?
Scrape documentation content from Zoomin Software portals using Playwright browser automation to handle dynamic content loading. Use when standard web fetchi... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 758 total downloads to date.
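How do I install Zoomin Docs Portal Scraper Tool?
Run the command `/install zoomin-scraper-recklessop` in the OpenClaw or Claude Code chat box to install the skill in one step. Playwright and the Chromium browser binaries are not installed automatically: as the safe usage recommendations note, you must run `pip install playwright` and `playwright install chromium` yourself.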
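Is Zoomin Docs Portal Scraper Tool free?
Yes. Zoomin Docs Portal Scraper Tool is completely free (free and open source) and can be downloaded, installed, and used freely.
Which platforms does Zoomin Docs Portal Scraper Tool support?
Zoomin Docs Portal Scraper Tool is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Zoomin Docs Portal Scraper Tool?
It is developed and maintained by Justin Paul (@recklessop); the current version is v1.0.2.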