opsun

Deep Scraper

Author: opsun · GitHub · v1.0.1
cross-platform ✓ Security check passed
Total downloads: 11680
Favorites: 10
Current installs: 76
Versions: 2
Install in OpenClaw
/install deep-scraper
Description
Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output.
Safe usage recommendations
Install only if you are comfortable running a Dockerized browser scraper. Review or supply the missing Dockerfile before building, pin dependencies if reproducibility matters, and use it only on public or explicitly authorized pages where scraping is allowed by law and site policy.
Functional analysis
Type: OpenClaw Skill
Name: deep-scraper
Version: 1.0.1
The skill bundle is classified as benign. It implements a web scraper using Docker, Crawlee, and Playwright, which aligns with its stated purpose of 'deep web scraping' and 'penetrating protections' on complex websites. It uses Docker and runs Playwright with `--no-sandbox` (a common requirement for Playwright in Docker that reduces in-container security), but these capabilities are plausibly needed for its function.
The code in `assets/main_handler.js` and `assets/youtube_handler.js` focuses on scraping public web content, outputs results to stdout, and clears cookies to prevent session leakage. It shows no signs of data exfiltration, malicious execution, persistence, prompt injection against the agent, or obfuscation. The `SKILL.md` explicitly forbids scraping password-protected or non-public personal information, indicating a consideration for privacy.
Capability assessment
Purpose & Capability
The stated purpose is deep scraping of dynamic sites, and the code matches that by running Crawlee/Playwright against a user-supplied URL and printing extracted page, transcript, or description text to stdout. The bypass-oriented phrasing is caution-worthy but disclosed.
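Because results are printed to stdout, a caller has to separate the result from any log lines the container also prints. The sketch below shows one way to do that in Python. It assumes the skill emits a single JSON object on its own line; the field names in the sample (`type`, `content`) are hypothetical and are not taken from the skill's documented schema.

```python
import json


def parse_scraper_stdout(stdout_text: str) -> dict:
    """Return the last stdout line that parses as a JSON object.

    Assumption: the scraper prints one JSON object per run, possibly
    preceded by plain-text log lines.
    """
    for line in reversed(stdout_text.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip log lines and blank lines
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # looked like JSON but was not valid
        if isinstance(parsed, dict):
            return parsed
    raise ValueError("no JSON object found in scraper output")


# Hypothetical output: field names are illustrative only.
SAMPLE_STDOUT = 'INFO starting crawl\n{"type": "transcript", "content": "hello world"}\n'

if __name__ == "__main__":
    result = parse_scraper_stdout(SAMPLE_STDOUT)
    print(sorted(result))
```

Scanning from the last line backwards keeps the parser tolerant of startup logging without depending on any particular log format.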
Instruction Scope
The skill forbids scraping password-protected or non-public personal information and labels output types in the documented CLI path. It would be safer with clearer authorization, terms-of-service, rate-limit, and anti-bypass warnings.
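One check that can be automated before pointing the scraper at a URL is the site's robots.txt. The sketch below uses Python's standard `urllib.robotparser` on robots.txt text that has already been fetched. The sample rules, user agent string, and URLs are hypothetical. A robots.txt check is only one signal and does not replace reading the site's terms of service or obtaining authorization.

```python
from urllib.robotparser import RobotFileParser


def check_robots(robots_txt: str, user_agent: str, url: str):
    """Return (allowed, crawl_delay) for a URL under the given robots.txt text.

    crawl_delay is None when the file does not specify one.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    delay = parser.crawl_delay(user_agent)
    return allowed, delay


# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

if __name__ == "__main__":
    ua = "deep-scraper-example"  # hypothetical user agent string
    print(check_robots(SAMPLE_ROBOTS, ua, "https://example.com/watch?v=abc"))
    print(check_robots(SAMPLE_ROBOTS, ua, "https://example.com/private/data"))
```

When a crawl delay is reported, waiting at least that many seconds between requests is a reasonable lower bound for rate limiting.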
Install Mechanism
Installation requires Docker and a locally built image, but the reviewed artifact does not include a Dockerfile despite the package changelog saying one was included. Dependencies are version-ranged without a lockfile, so users should review and pin the build environment.
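To see which dependencies are version-ranged before building, the manifest can be scanned for specifiers that are not exact versions. The sketch below assumes a standard npm `package.json`. The package versions in the sample are hypothetical and do not reflect the skill's actual manifest.

```python
import json
import re

# Matches an exact semver such as 1.40.0 or 1.0.0-beta.1; anything else
# (^, ~, >=, *, x-ranges, tags, URLs) is treated as unpinned.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def find_unpinned(package_json_text: str) -> dict:
    """Return {package: specifier} for every dependency that is not an exact version."""
    manifest = json.loads(package_json_text)
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                unpinned[name] = spec
    return unpinned


# Hypothetical manifest: version numbers are illustrative only.
SAMPLE_MANIFEST = json.dumps(
    {"dependencies": {"crawlee": "^3.0.0", "playwright": "1.40.0"}}
)

if __name__ == "__main__":
    print(find_unpinned(SAMPLE_MANIFEST))
```

Exact versions in the manifest only pin direct dependencies; a committed lockfile is still needed to fix the transitive ones.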
Credentials
A headless browser in Docker is proportionate for dynamic scraping. Chromium is launched with no-sandbox flags, which is common for Dockerized Playwright but weakens in-container isolation.
Persistence & Privilege
No credential-store access, background service, host-wide filesystem mount, persistence mechanism, destructive action, or outbound sink beyond visiting the requested target and writing results to stdout was found. The handlers clear browser cookies before scraping.
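How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install deep-scraper
  3. Once installed, call the Skill by name or trigger it with /deep-scraper
  4. Provide the inputs described in the Skill's parameter documentation to receive structured output
Version history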
v1.0.1
Included Dockerfile in the package
v1.0.0
Initial release
Metadata
Slug deep-scraper
Version 1.0.1
License
Total installs 440
Current installs 76
Number of versions 2
FAQ

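What is Deep Scraper?

Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 11680 total downloads to date.

How do I install Deep Scraper?

Run the command "/install deep-scraper" in the OpenClaw or Claude Code chat box. The install command itself needs no extra configuration, but running the skill requires Docker and a locally built image.

Is Deep Scraper free?

Yes. Deep Scraper is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Deep Scraper support?

Deep Scraper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Deep Scraper?

It is developed and maintained by opsun (@opsun). The current version is v1.0.1.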
