Deep Scraper

by opsun · GitHub ↗ · v1.0.1
cross-platform ✓ Security Clean
Downloads 11680
Stars 10
Active Installs 76
Versions 2
Install in OpenClaw
/install deep-scraper
Description
Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output.
Usage Guidance
Install only if you are comfortable running a Dockerized browser scraper. Review or supply the missing Dockerfile before building, pin dependencies if reproducibility matters, and use it only on public or explicitly authorized pages where scraping is allowed by law and site policy.
Capability Analysis
Type: OpenClaw Skill
Name: deep-scraper
Version: 1.0.1

The skill bundle is classified as benign. It implements a web scraper using Docker, Crawlee, and Playwright, which aligns with its stated purpose of 'deep web scraping' and 'penetrating protections' on complex websites. While it utilizes Docker and runs Playwright with `--no-sandbox` (a common requirement for Playwright in Docker that reduces in-container security), these capabilities are plausibly needed for its function.

The code in `assets/main_handler.js` and `assets/youtube_handler.js` focuses on scraping public web content, outputs results to stdout, clears cookies to prevent session leakage, and does not exhibit any signs of data exfiltration, malicious execution, persistence, prompt injection against the agent, or obfuscation. The `SKILL.md` explicitly forbids scraping password-protected or non-public personal information, indicating a consideration for privacy.
Capability Assessment
Purpose & Capability
The stated purpose is deep scraping of dynamic sites, and the code matches that by running Crawlee/Playwright against a user-supplied URL and printing extracted page, transcript, or description text to stdout. The bypass-oriented phrasing is caution-worthy but disclosed.
Instruction Scope
The skill forbids scraping password-protected or non-public personal information and labels output types in the documented CLI path. It would be safer with clearer authorization, terms-of-service, rate-limit, and anti-bypass warnings.
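One way to address the authorization and rate-limit gaps noted above is a pre-flight check that runs before any URL reaches the scraper. The sketch below is illustrative only: `createTargetGuard`, `allowedHosts`, and `minIntervalMs` are hypothetical names, and nothing like this was found in the reviewed skill.

```javascript
// Hypothetical pre-flight guard; not part of the reviewed skill.
// It approves a URL only if its host is on an explicit allowlist and the
// previous approved request to that host is at least minIntervalMs old.
function createTargetGuard({ allowedHosts, minIntervalMs }) {
  const lastRequestAt = new Map(); // hostname -> timestamp (ms) of last approved request

  return function check(rawUrl, now = Date.now()) {
    let url;
    try {
      url = new URL(rawUrl);
    } catch {
      return { ok: false, reason: 'invalid-url' };
    }
    if (url.protocol !== 'https:' && url.protocol !== 'http:') {
      return { ok: false, reason: 'unsupported-protocol' };
    }
    if (!allowedHosts.includes(url.hostname)) {
      return { ok: false, reason: 'host-not-authorized' };
    }
    const last = lastRequestAt.get(url.hostname);
    if (last !== undefined && now - last < minIntervalMs) {
      return { ok: false, reason: 'rate-limited' };
    }
    lastRequestAt.set(url.hostname, now);
    return { ok: true };
  };
}
```

A caller would build the guard once, for example with `allowedHosts: ['www.youtube.com']`, and skip any URL for which `check(url).ok` is false. The guard only covers hosts and pacing; whether scraping a given page is permitted by law and site policy still needs a human decision.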
Install Mechanism
Installation requires Docker and a locally built image, but the reviewed artifact does not include a Dockerfile despite the package changelog saying one was included. Dependencies are version-ranged without a lockfile, so users should review and pin the build environment.
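Before building the image, it can help to list which dependencies are declared as ranges and not as exact versions. The following is a minimal sketch, assuming the parsed contents of the skill's `package.json` are passed in as a plain object; `findUnpinnedDependencies` is a hypothetical helper, not something shipped with the skill.

```javascript
// Exact versions look like 1.2.3 or 1.2.3-beta.1. Anything else
// (^, ~, >=, *, x-ranges, dist-tags, URLs) is treated as unpinned.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Returns one entry per dependency whose declared version is not exact.
// `pkg` is the parsed package.json object.
function findUnpinnedDependencies(pkg) {
  const sections = ['dependencies', 'devDependencies', 'optionalDependencies'];
  const unpinned = [];
  for (const section of sections) {
    for (const [name, range] of Object.entries(pkg[section] || {})) {
      if (!EXACT_VERSION.test(range)) {
        unpinned.push({ section, name, range });
      }
    }
  }
  return unpinned;
}
```

An empty result means every declared version is exact. That alone does not pin transitive dependencies, so committing a lockfile is still the more complete fix.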
Credentials
A headless browser in Docker is proportionate for dynamic scraping. Chromium is launched with no-sandbox flags, which is common for Dockerized Playwright but weakens in-container isolation.
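To make the sandbox trade-off explicit, the launch flags can be gated behind an opt-in setting so they are not hard-coded. This is a sketch under assumptions: `buildLaunchOptions` and `disableSandbox` are hypothetical names, the returned object is shaped for Playwright's `chromium.launch()`, and `--disable-setuid-sandbox` is a companion Chromium flag that the review does not confirm the skill uses.

```javascript
// Hypothetical helper: builds an options object suitable for Playwright's
// chromium.launch(). The sandbox stays enabled unless the caller opts out.
function buildLaunchOptions({ disableSandbox = false } = {}) {
  const args = [];
  if (disableSandbox) {
    // Often needed when Chromium runs as root inside a container,
    // at the cost of weaker in-container isolation.
    args.push('--no-sandbox', '--disable-setuid-sandbox');
  }
  return { headless: true, args };
}
```

With this shape, a container that runs Chromium as a non-root user could call `buildLaunchOptions()` and keep the sandbox, while a root-only container would pass `{ disableSandbox: true }` deliberately.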
Persistence & Privilege
No credential-store access, background service, host-wide filesystem mount, persistence mechanism, destructive action, or outbound sink beyond visiting the requested target and writing results to stdout was found. The handlers clear browser cookies before scraping.
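The cookie-clearing behavior described above can be expressed as a small hook. The sketch below is not the skill's actual handler code: `clearCookiesHook` is a hypothetical name, and it assumes the hook receives an object carrying a Playwright `page`, as Crawlee's `preNavigationHooks` do for a `PlaywrightCrawler`.

```javascript
// Hypothetical hook: wipes all cookies from the page's browser context
// so no session state carries over into the next navigation.
async function clearCookiesHook({ page }) {
  await page.context().clearCookies();
}
```

Registered as a pre-navigation hook, this would run before each request, which matches the review's observation that cookies are cleared before scraping.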
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install deep-scraper
  3. After installation, invoke the skill by name or use /deep-scraper
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Included Dockerfile in the package
v1.0.0
Initial release
Metadata
Slug deep-scraper
Version 1.0.1
License
All-time Installs 440
Active Installs 76
Total Versions 2
Frequently Asked Questions

What is Deep Scraper?

Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output. It is an AI Agent Skill for Claude Code / OpenClaw, with 11680 downloads so far.

How do I install Deep Scraper?

Run "/install deep-scraper" in the OpenClaw or Claude Code chat. Running the skill also requires Docker and a locally built image.

Is Deep Scraper free?

Yes. Deep Scraper is free and open-source; you can download, install, and use it at no cost.

Which platforms does Deep Scraper support?

Deep Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Deep Scraper?

It is built and maintained by opsun (@opsun); the current version is v1.0.1.
