gu-yunyu

Coco Playwright Stealth 1.0.0

by Gu-yunyu · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 42 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install coco-playwright-stealth-1-0-0
Description
Playwright-based web scraping OpenClaw Skill with anti-bot protection. Successfully tested on complex sites like Discuss.com.hk.
README (SKILL.md)

Playwright Scraper Skill

A Playwright-based web scraping OpenClaw Skill with anti-bot protection. Choose the best approach based on the target website's anti-bot level.


🎯 Use Case Matrix

| Target Website | Anti-Bot Level | Recommended Method | Script |
| --- | --- | --- | --- |
| Regular Sites | Low | web_fetch tool | N/A (built-in) |
| Dynamic Sites | Medium | Playwright Simple | scripts/playwright-simple.js |
| Cloudflare Protected | High | Playwright Stealth | scripts/playwright-stealth.js |
| YouTube | Special | deep-scraper | Install separately |
| Reddit | Special | reddit-scraper | Install separately |
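
The matrix above can be read as a simple lookup. The following is a minimal sketch, assuming a hypothetical recommendMethod helper and illustrative category keys ("regular", "dynamic", and so on) that are not part of the skill itself:

```javascript
// Hypothetical lookup table mirroring the use case matrix.
// The category keys are illustrative labels, not names used by the skill.
const USE_CASES = new Map([
  ['regular', { antiBot: 'Low', method: 'web_fetch tool', script: null }],
  ['dynamic', { antiBot: 'Medium', method: 'Playwright Simple', script: 'scripts/playwright-simple.js' }],
  ['cloudflare', { antiBot: 'High', method: 'Playwright Stealth', script: 'scripts/playwright-stealth.js' }],
  ['youtube', { antiBot: 'Special', method: 'deep-scraper', script: null }],
  ['reddit', { antiBot: 'Special', method: 'reddit-scraper', script: null }],
]);

// Returns the matrix row for a category, or throws for unknown categories.
function recommendMethod(category) {
  const entry = USE_CASES.get(category);
  if (!entry) {
    throw new Error(`Unknown site category: ${category}`);
  }
  return entry;
}
```

A script value of null means there is no script in this skill to run: the method is either built in or installed separately.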

📦 Installation

cd playwright-scraper-skill
npm install
npx playwright install chromium

🚀 Quick Start

1️⃣ Simple Sites (No Anti-Bot)

Use OpenClaw's built-in web_fetch tool:

# Invoke directly in OpenClaw
Hey, fetch me the content from https://example.com

2️⃣ Dynamic Sites (Requires JavaScript)

Use Playwright Simple:

node scripts/playwright-simple.js "https://example.com"

Example output:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "content": "...",
  "elapsedSeconds": "3.45"
}
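
For orientation, here is a minimal sketch of how a script could produce output of this shape. It is not the actual source of scripts/playwright-simple.js: the names buildResult and scrapeSimple and the 3000 ms default wait are assumptions, and only the field names come from the example output above.

```javascript
// Shapes the result like the example output. elapsedSeconds is a string
// with two decimal places, matching the "3.45" form shown above.
function buildResult(url, title, content, startMs, endMs) {
  return {
    url,
    title,
    content,
    elapsedSeconds: ((endMs - startMs) / 1000).toFixed(2),
  };
}

// Hypothetical scraper: loads the page, waits for client-side rendering,
// then reads the title and the visible body text.
async function scrapeSimple(url, waitMs = 3000) {
  // Required lazily so buildResult works even without Playwright installed.
  const { chromium } = require('playwright');
  const startMs = Date.now();
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    await page.waitForTimeout(waitMs);
    const title = await page.title();
    const content = await page.innerText('body');
    return buildResult(url, title, content, startMs, Date.now());
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

Calling scrapeSimple("https://example.com") and printing the result with JSON.stringify would yield an object with the four fields shown in the example.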

3️⃣ Anti-Bot Protected Sites (Cloudflare etc.)

Use Playwright Stealth:

node scripts/playwright-stealth.js "https://m.discuss.com.hk/#hot"

Features:

  • Hide automation markers (navigator.webdriver = false)
  • Realistic User-Agent (iPhone, Android)
  • Random delays to mimic human behavior
  • Screenshot and HTML saving support

4️⃣ YouTube Video Transcripts

Use deep-scraper (install separately):

# Install deep-scraper skill
npx clawhub install deep-scraper

# Use it
cd skills/deep-scraper
node assets/youtube_handler.js "https://www.youtube.com/watch?v=VIDEO_ID"

📖 Script Descriptions

scripts/playwright-simple.js

  • Use Case: Regular dynamic websites
  • Speed: Fast (3-5 seconds)
  • Anti-Bot: None
  • Output: JSON (title, content, URL)

scripts/playwright-stealth.js

  • Use Case: Sites with Cloudflare or anti-bot protection
  • Speed: Medium (5-20 seconds)
  • Anti-Bot: Medium-High (hides automation, realistic UA)
  • Output: JSON + Screenshot + HTML file
  • Verified: 100% success on Discuss.com.hk in the 2026-02-07 test (see Memory & Experience)

🎓 Best Practices

1. Try web_fetch First

If the site doesn't have dynamic loading, use OpenClaw's web_fetch tool—it's fastest.

2. Need JavaScript? Use Playwright Simple

If you need to wait for JavaScript rendering, use playwright-simple.js.

3. Getting Blocked? Use Stealth

If you encounter 403 or Cloudflare challenges, use playwright-stealth.js.

4. Special Sites Need Specialized Skills

  • YouTube → deep-scraper
  • Reddit → reddit-scraper
  • Twitter → bird skill

🔧 Customization

All scripts support environment variables:

# Set screenshot path
SCREENSHOT_PATH=/path/to/screenshot.png node scripts/playwright-stealth.js URL

# Set wait time (milliseconds)
WAIT_TIME=10000 node scripts/playwright-simple.js URL

# Enable headful mode (show browser)
HEADLESS=false node scripts/playwright-stealth.js URL

# Save HTML
SAVE_HTML=true node scripts/playwright-stealth.js URL

# Custom User-Agent
USER_AGENT="Mozilla/5.0 ..." node scripts/playwright-stealth.js URL
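
A minimal sketch of how these variables could be read inside a Node script. The function name readConfig and every default value below are assumptions for illustration; the skill's actual defaults are not stated in this document.

```javascript
// Reads the documented environment variables into a config object.
// All default values are hypothetical.
function readConfig(env = process.env) {
  const waitTime = Number.parseInt(env.WAIT_TIME ?? '', 10);
  return {
    screenshotPath: env.SCREENSHOT_PATH || './screenshot.png',
    // Fall back to the default when WAIT_TIME is missing or not a number.
    waitTime: Number.isNaN(waitTime) ? 5000 : waitTime,
    // Headless unless HEADLESS is explicitly "false".
    headless: env.HEADLESS !== 'false',
    // HTML is saved only when SAVE_HTML is explicitly "true".
    saveHtml: env.SAVE_HTML === 'true',
    // undefined means "do not override the browser's User-Agent".
    userAgent: env.USER_AGENT || undefined,
  };
}
```

Taking the environment as a parameter keeps the function easy to test: readConfig({ HEADLESS: "false" }) returns a config with headless set to false without touching process.env.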

📊 Performance Comparison

| Method | Speed | Anti-Bot | Success Rate (Discuss.com.hk) |
| --- | --- | --- | --- |
| web_fetch | ⚡ Fastest | ❌ None | 0% |
| Playwright Simple | 🚀 Fast | ⚠️ Low | 20% |
| Playwright Stealth | ⏱️ Medium | ✅ Medium | 100% |
| Puppeteer Stealth | ⏱️ Medium | ✅ Medium-High | ~80% |
| Crawlee (deep-scraper) | 🐢 Slow | ❌ Detected | 0% |
| Chaser (Rust) | ⏱️ Medium | ❌ Detected | 0% |

🛡️ Anti-Bot Techniques Summary

Lessons learned from our testing:

✅ Effective Anti-Detection Measures

  1. Hide navigator.webdriver — Essential
  2. Realistic User-Agent — Use real devices (iPhone, Android)
  3. Mimic Human Behavior — Random delays, scrolling
  4. Avoid Framework Signatures — Crawlee, Selenium are easily detected
  5. Use addInitScript (Playwright) — Inject before page load

❌ Ineffective Anti-Detection Measures

  1. Only changing User-Agent — Not enough
  2. Using high-level frameworks (Crawlee) — More easily detected
  3. Docker isolation — Doesn't help with Cloudflare
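
Items 1 and 5 of the effective list combine naturally: the navigator.webdriver override is registered through Playwright's addInitScript, which evaluates the function in each page before the page's own scripts run. The following is a minimal sketch, assuming hypothetical helper names maskWebdriver and applyStealth; it is not the code shipped in playwright-stealth.js.

```javascript
// Runs inside the browser page, not in Node: Playwright serializes this
// function and evaluates it before any page script executes.
function maskWebdriver() {
  Object.defineProperty(navigator, 'webdriver', { get: () => false });
}

// Hypothetical helper: registers the override on a BrowserContext so that
// it applies to every page opened from that context.
function applyStealth(context) {
  return context.addInitScript(maskWebdriver);
}
```

In use, applyStealth would be awaited on a context returned by browser.newContext() before the first call to context.newPage(), so that the override is already in place when the first page loads.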

🔍 Troubleshooting

Issue: 403 Forbidden

Solution: Use playwright-stealth.js

Issue: Cloudflare Challenge Page

Solution:

  1. Increase the wait time to 10-15 seconds (for example, WAIT_TIME=15000)
  2. Try headful mode (HEADLESS=false); it sometimes has a higher success rate
  3. Consider using proxy IPs

Issue: Blank Page

Solution:

  1. Increase waitForTimeout
  2. Use waitUntil: 'networkidle' or 'domcontentloaded'
  3. Check if login is required
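
The first two suggestions can be combined. The following is a minimal sketch, assuming a hypothetical gotoWithFallback helper: it first waits for 'networkidle', and if that times out (some pages never become idle) it retries with 'domcontentloaded' followed by a fixed extra wait. The timeout values are illustrative.

```javascript
// Navigates with 'networkidle' first and falls back to 'domcontentloaded'
// plus a fixed wait when the idle state is never reached.
// Returns the waitUntil mode that succeeded.
async function gotoWithFallback(page, url, { idleTimeoutMs = 15000, extraWaitMs = 5000 } = {}) {
  try {
    await page.goto(url, { waitUntil: 'networkidle', timeout: idleTimeoutMs });
    return 'networkidle';
  } catch (err) {
    // Only fall back on navigation timeouts; rethrow anything else.
    if (err.name !== 'TimeoutError') throw err;
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    await page.waitForTimeout(extraWaitMs);
    return 'domcontentloaded';
  }
}
```

If the page is still blank after the fallback, the third item applies: the content may only be available after login.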

📝 Memory & Experience

2026-02-07 Discuss.com.hk Test Conclusions

  • ✅ Pure Playwright + Stealth succeeded (5s, 200 OK)
  • ❌ Crawlee (deep-scraper) failed (403)
  • ❌ Chaser (Rust) failed (Cloudflare)
  • ❌ Puppeteer standard failed (403)

Best Solution: Pure Playwright + anti-bot techniques (framework-independent)


🚧 Future Improvements

  • Add proxy IP rotation
  • Implement cookie management (maintain login state)
  • Add CAPTCHA handling (2captcha / Anti-Captcha)
  • Batch scraping (parallel URLs)
  • Integration with OpenClaw's browser tool


Usage Guidance
Install only if you intend to run a scraping tool and have permission to access the target sites. Use deterministic installs, run it in an isolated environment, avoid logged-in or sensitive pages, review any screenshots or saved HTML before sharing, and be aware that stealth scraping or proxy use may violate site rules or laws.
Capability Assessment
Purpose & Capability
The code and documentation coherently match the stated purpose: user-directed Playwright web scraping, including stealth behavior for anti-bot-protected sites. I found no artifact-backed evidence of exfiltration, destructive actions, credential theft, or hidden background execution.
Instruction Scope
The documentation explicitly recommends hiding automation markers, mimicking human behavior, handling Cloudflare blocks, considering proxies, delaying to avoid IP blocking, and future CAPTCHA handling, but does not clearly limit use to authorized or policy-compliant targets.
Install Mechanism
Installation uses npm and Playwright browser installation. package.json allows a broad Playwright range, but the included lockfile resolves Playwright to 1.58.2 with integrity hashes, so the scanner's exact 1.40.0 dependency claim is not supported by the lockfile.
Credentials
The stealth and SMZDM scripts launch Chromium with --no-sandbox while visiting arbitrary user-supplied URLs, reducing browser exploit containment for untrusted web content. This is under-disclosed in the docs.
Persistence & Privilege
The stealth script saves a screenshot by default and can save full HTML when SAVE_HTML=true; examples write captures under /tmp. This persistence is disclosed as functionality, but the artifacts do not warn that captures may contain sensitive, copyrighted, or proprietary page content.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install coco-playwright-stealth-1-0-0
  3. After installation, invoke the skill by name or use /coco-playwright-stealth-1-0-0
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Playwright Scraper Skill v1.2.0: Major documentation and best practices update

  • Expanded and detailed SKILL.md covering usage, setup, customization, and troubleshooting.
  • Added a clear use case matrix and best practice recommendations for regular, dynamic, and anti-bot-protected sites.
  • Described individual scripts and features, including anti-bot strategies.
  • Provided performance comparisons and lessons learned from real-world tests.
  • Outlined future improvement plans and included reference links for further reading.
Metadata
Slug coco-playwright-stealth-1-0-0
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Coco Playwright Stealth 1.0.0?

Playwright-based web scraping OpenClaw Skill with anti-bot protection. Successfully tested on complex sites like Discuss.com.hk. It is an AI Agent Skill for Claude Code / OpenClaw, with 42 downloads so far.

How do I install Coco Playwright Stealth 1.0.0?

Run "/install coco-playwright-stealth-1-0-0" in the OpenClaw or Claude Code chat to install it. The README's Installation section additionally lists npm install and npx playwright install chromium as setup steps for the scripts.

Is Coco Playwright Stealth 1.0.0 free?

Yes, Coco Playwright Stealth 1.0.0 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Coco Playwright Stealth 1.0.0 support?

Coco Playwright Stealth 1.0.0 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Coco Playwright Stealth 1.0.0?

It is built and maintained by Gu-yunyu (@gu-yunyu); the current version is v1.0.0.
