39 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install stitch-playwright-scraper
Description
Scrapes pages using Playwright with the Stealth plugin to get past anti-scraping mechanisms.
README (SKILL.md)
Playwright Stealth Scraper 🕷️
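Use this skill when the web_fetch tool cannot retrieve the target page's content (403 responses, JS-rendered pages, anti-scraping blocks).
Prerequisites
Playwright and Chromium must be installed locally.
# Install inside the skill directory
cd ~/.openclaw/workspace/skills/playwright-scraper
npm install playwright playwright-extra puppeteer-extra-plugin-stealth
npx playwright install chromium
Note: this downloads a Chromium binary of roughly 300 MB.
When to use this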
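| Scenario | Tool |
|---|---|
| Ordinary web pages, API responses | web_fetch (built-in, lightweight) |
| Blocked by anti-scraping protection such as Cloudflare | playwright-scraper |
| Page needs JS rendering to show its content | playwright-scraper |
| Scraping that requires login | playwright-scraper + manual cookie handling |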
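One row of the table, "page needs JS rendering", can be checked mechanically before reaching for a full browser. The sketch below is a rough heuristic and not part of the skill: `looksUnrendered` is a hypothetical helper, and the 200-character threshold is an arbitrary assumption you would tune per site.

```javascript
// Hypothetical helper (not part of the skill): guesses whether static HTML
// is an empty shell that still needs client-side rendering.
// The default threshold of 200 characters is an arbitrary assumption.
function looksUnrendered(html, minTextLength = 200) {
  const text = html
    .replace(/<script\b[\s\S]*?<\/script>/gi, ' ') // drop inline scripts
    .replace(/<style\b[\s\S]*?<\/style>/gi, ' ')   // drop inline styles
    .replace(/<[^>]+>/g, ' ')                      // drop remaining tags
    .replace(/\s+/g, ' ')                          // collapse whitespace
    .trim();
  return text.length < minTextLength;
}

// Example: a typical single-page-app shell has almost no visible text.
const shell = '<html><body><div id="root"></div><script>render()</script></body></html>';
console.log(looksUnrendered(shell));
```

If the helper returns `false`, the lightweight web_fetch result is probably usable as-is.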
Usage
Option 1: write a standalone script (recommended)
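// Requires: npm install playwright playwright-extra puppeteer-extra-plugin-stealth
const { chromium } = require('playwright-extra');
const stealth = require('puppeteer-extra-plugin-stealth')();

// Register the stealth plugin before launching the browser
chromium.use(stealth);

(async () => {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Replace TARGET_URL with the page you want to fetch
    await page.goto('TARGET_URL', { waitUntil: 'networkidle' });
    const content = await page.content();
    // Extract the content you need
    console.log(content);
  } finally {
    // Always close the browser, even if navigation fails
    await browser.close();
  }
})();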
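Option 2: use OpenClaw's browser tool directly
# Start the browser
browser action=start profile=openclaw
# Open the page
browser action=open url=TARGET_URL
# Take a snapshot
browser action=snapshot targetId=XXX
The built-in browser tool is usually sufficient; playwright-scraper is mainly for special anti-scraping cases.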
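Notes
- Do not scrape at high frequency; respect robots.txt
- If the target site has a login wall, do not fill in credentials automatically; check with your boss first
- Save scraped data to the appropriate workspace directory, not /tmp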
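Two of the notes above (spacing out requests and keeping output inside the workspace) can be sketched as small helpers. Both `createThrottle` and `outputPathFor` are hypothetical names, and the `scraped/` subdirectory layout is an assumption; the README only says to use the appropriate workspace directory.

```javascript
const path = require('node:path');

// Hypothetical helper: returns an async function that resolves only after
// at least `minIntervalMs` has passed since the previous caller's turn.
function createThrottle(minIntervalMs) {
  let nextAllowed = 0;
  return async function waitTurn() {
    const now = Date.now();
    const wait = Math.max(0, nextAllowed - now);
    // Reserve the next slot before sleeping so concurrent callers queue up.
    nextAllowed = Math.max(now, nextAllowed) + minIntervalMs;
    if (wait > 0) {
      await new Promise((resolve) => setTimeout(resolve, wait));
    }
    return wait;
  };
}

// Hypothetical helper: maps a URL to a file path under the workspace.
// The "scraped" subdirectory is an assumed layout, not defined by the skill.
function outputPathFor(workspaceDir, targetUrl) {
  const { hostname, pathname } = new URL(targetUrl);
  const slug = (hostname + pathname)
    .replace(/[^a-zA-Z0-9.-]+/g, '_') // keep only filename-safe characters
    .replace(/^_+|_+$/g, '');         // trim leading/trailing underscores
  return path.join(workspaceDir, 'scraped', `${slug}.html`);
}
```

A caller would `await waitTurn()` before each `page.goto(...)` and write the result to `outputPathFor(workspaceDir, url)` instead of /tmp. Checking robots.txt is left out of this sketch.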
Usage Guidance
Review before installing. Use only in an isolated browser/profile, do not attach it to your normal logged-in Chrome session, and fix or remove the execSync-based tool entry point. Remove the bundled Goofish search scripts if you only need a general scraper, and avoid using the skill to bypass access controls, scrape against site rules, or search for cheating/fraud services.
Capability Assessment
Purpose & Capability
The stated purpose is Playwright plus stealth scraping for pages blocked by normal fetch tools, which fits some dependencies and examples. However, bundled scripts are Goofish-specific, hard-code marketplace searches including academic/writing-service terms, and read content from a live browser session, which is broader and less clearly aligned than the general scraper description.
Instruction Scope
The user-facing instructions mention ethical scraping and manual cookie handling, but they do not disclose the fixed Chrome DevTools endpoint, existing-tab enumeration, live-session reuse, or the unsafe MCP command path. There is no clear consent, domain allowlist, or isolation requirement for the higher-risk scripts.
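A domain allowlist of the kind this assessment finds missing could look roughly like the sketch below. It is illustrative only: `ALLOWED_HOSTS`, its entries, and `isAllowedTarget` are hypothetical and do not exist in the skill.

```javascript
// Hypothetical allowlist; replace with hosts you are permitted to scrape.
const ALLOWED_HOSTS = ['example.com', 'docs.example.org'];

// Returns true only for http(s) URLs whose host is an allowlisted domain
// or a subdomain of one. Anything unparsable is rejected.
function isAllowedTarget(targetUrl, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(targetUrl);
  } catch {
    return false;
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    return false;
  }
  const host = url.hostname.toLowerCase();
  return allowedHosts.some(
    (allowed) => host === allowed || host.endsWith(`.${allowed}`)
  );
}
```

Matching on a leading dot (`.example.com`) rather than a bare suffix keeps look-alike hosts such as `example.com.evil.test` out.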
Install Mechanism
Installation uses ordinary npm dependencies and Chromium setup, and no install-time persistence or install scripts were found. The runtime entry point is still problematic because it builds a shell command from the user-supplied URL and references a missing scripts/playwright-stealth.js file.
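One common way to avoid building a shell command from a user-supplied URL is to validate the URL first and pass it as a separate argument to `execFile`, which does not spawn a shell by default. The sketch below assumes that approach; `parseTargetUrl` and `runScraper` are hypothetical names, and the script path is left as a parameter because the referenced file is reported missing.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: accepts only absolute http(s) URLs and returns the
// normalized form. Throws on anything else.
function parseTargetUrl(input) {
  const url = new URL(input); // throws TypeError if the input is not a URL
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  return url.href;
}

// Hypothetical entry point: runs a scraper script with the URL passed as an
// argument vector entry, so no shell ever interprets the URL text.
function runScraper(scriptPath, input, callback) {
  const href = parseTargetUrl(input);
  execFile(
    process.execPath,          // the current Node.js binary
    [scriptPath, href],        // arguments are passed as an array, not a string
    { timeout: 60_000 },       // assumed limit; adjust as needed
    callback
  );
}
```

This is a sketch of the general mitigation, not the skill's actual entry point.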
Credentials
Stealth browser automation is expected for this kind of scraper, but attaching to a hard-coded existing Chrome DevTools session can inherit cookies, authenticated tabs, and unrelated browsing state. That level of access is not proportionate without explicit user control and isolation.
Persistence & Privilege
No background service, self-starting persistence, destructive file operation, or credential exfiltration was found. The concern is privilege: live browser-session access and shell command execution can affect or expose much more than a single requested scrape.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install stitch-playwright-scraper
- After installation, invoke the skill by name or use /stitch-playwright-scraper
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.0.0
playwright-scraper v2.0.0
- Major update: Adds Playwright + Stealth support to bypass anti-bot mechanisms and handle JS-rendered pages.
- Guides users for setup, including required dependencies and Chromium installation.
- Clarifies when to use the skill versus standard fetch tools.
- Provides example usage scripts and OpenClaw integration steps.
- Emphasizes ethical scraping practices and storage recommendations.
Frequently Asked Questions
What is Playwright Scraper?
Playwright Scraper scrapes pages using Playwright with the Stealth plugin to get past anti-scraping mechanisms. It is an AI Agent Skill for Claude Code / OpenClaw, with 39 downloads so far.
How do I install Playwright Scraper?
Run "/install stitch-playwright-scraper" in the OpenClaw or Claude Code chat to install it. The README additionally requires installing the npm dependencies and Chromium locally before first use.
Is Playwright Scraper free?
Yes, Playwright Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Playwright Scraper support?
Playwright Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Playwright Scraper?
It is built and maintained by sdT328606 (@sdt328606); the current version is v2.0.0.