Total downloads: 39 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install stitch-playwright-scraper
Description
Scrapes pages with Playwright and the Stealth plugin to get past anti-bot measures.
Usage instructions (SKILL.md)
Playwright Stealth Scraper 🕷️
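Use this skill when the web_fetch tool cannot retrieve the target page's content (403 responses, JS-rendered pages, anti-bot blocking).
Prerequisites
Playwright and Chromium must be installed locally.
# Install inside the skill directory
cd ~/.openclaw/workspace/skills/playwright-scraper
# playwright-extra is needed because the example script requires it
npm install playwright playwright-extra puppeteer-extra-plugin-stealth
npx playwright install chromium
Note: this downloads roughly 300 MB of Chromium binaries.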
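When to use it
| Scenario | What to use |
|---|---|
| Ordinary web pages, API responses | web_fetch (built in, lightweight) |
| Blocked by Cloudflare or similar anti-bot protection | playwright-scraper |
| Page needs JS rendering before content appears | playwright-scraper |
| Content is behind a login | playwright-scraper + manual cookie handling |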
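The table above can be read as a fallback rule: try the lightweight fetch first and switch to the browser only when the response looks blocked or empty. A minimal sketch of that rule follows. The helper name `needsBrowserFallback`, the status codes treated as "blocked", and the 200-character threshold are all illustrative assumptions, not part of the skill.

```javascript
// Hypothetical helper: decide whether a plain HTTP fetch result is good enough,
// or whether the page should be retried with playwright-scraper.
// The status codes and the text-length threshold are illustrative assumptions.
const BLOCKED_STATUSES = new Set([403, 429, 503]);

function visibleTextLength(html) {
  // Rough estimate only: drop script/style bodies and tags, then collapse whitespace.
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim().length;
}

function needsBrowserFallback(status, html, minTextLength = 200) {
  if (BLOCKED_STATUSES.has(status)) return true;        // likely an anti-bot block
  return visibleTextLength(html || '') < minTextLength; // likely a JS-rendered shell
}
```

A page that returns 200 but contains little more than an empty root element and a script bundle is flagged for the browser, while a page with ordinary text content is not.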
Usage
Option 1: Write a standalone script (recommended)
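// Requires: playwright, playwright-extra, puppeteer-extra-plugin-stealth
const { chromium } = require('playwright-extra');
const stealth = require('puppeteer-extra-plugin-stealth')();

// Register the stealth plugin before launching the browser
chromium.use(stealth);

(async () => {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Replace TARGET_URL with the page to scrape
    await page.goto('TARGET_URL', { waitUntil: 'networkidle' });
    const content = await page.content();
    // Extract the content you need
    console.log(content);
  } finally {
    // Always close the browser, even if navigation fails
    await browser.close();
  }
})();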
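The script above prints the full HTML. For the "extract the content you need" step, one possible approach is to pull out a few structured fields with Playwright's `page.title()` and `page.$$eval()`. The helper name `extractSummary`, the choice of fields, and the 20-link cap are assumptions made for illustration.

```javascript
// Sketch: pull a few structured fields out of a loaded Playwright page instead
// of dumping the full HTML. `extractSummary` is a hypothetical helper name.
async function extractSummary(page) {
  const title = await page.title();
  // $$eval runs the callback inside the page against all matching elements.
  const links = await page.$$eval('a[href]', (anchors) =>
    anchors.map((a) => ({ text: a.textContent.trim(), href: a.href }))
  );
  // Cap the returned links at 20 (an arbitrary, illustrative limit).
  return { title, linkCount: links.length, links: links.slice(0, 20) };
}
```

In the standalone script it would be called after `page.goto(...)`, for example `console.log(JSON.stringify(await extractSummary(page), null, 2));`.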
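Option 2: Use OpenClaw's browser tool directly
# Start the browser
browser action=start profile=openclaw
# Open the page
browser action=open url=TARGET_URL
# Take a snapshot
browser action=snapshot targetId=XXX
The built-in browser tool is usually sufficient; playwright-scraper is mainly for sites with unusual anti-bot protection.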
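Notes
- Do not scrape at high frequency, and respect robots.txt.
- If the target site has a login wall, do not fill in credentials automatically; check with the owner first.
- Save scraped data to the appropriate directory in the workspace, not in /tmp.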
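Two of the notes above lend themselves to a small sketch: spacing requests out, and writing results under a workspace directory. The helper names, the file-naming scheme, and the 2000 ms default delay are assumptions for illustration; the skill itself does not prescribe them. Checking robots.txt is not covered here.

```javascript
const fs = require('fs');
const path = require('path');

// Build a file name from the host and a timestamp, e.g.
// "example.com_2024-01-01T00-00-00-000Z.html".
function outputPathFor(url, outputDir, now = new Date()) {
  const host = new URL(url).hostname.replace(/[^a-z0-9.-]/gi, '_');
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return path.join(outputDir, `${host}_${stamp}.html`);
}

// Write scraped HTML under the directory the caller passes in
// (intended to be a workspace path rather than /tmp).
function saveScrape(url, html, outputDir) {
  fs.mkdirSync(outputDir, { recursive: true });
  const file = outputPathFor(url, outputDir);
  fs.writeFileSync(file, html, 'utf8');
  return file;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run async tasks one at a time with a fixed pause between them.
// The 2000 ms default is an illustrative value only.
async function runPolitely(tasks, delayMs = 2000) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(delayMs);
    results.push(await tasks[i]());
  }
  return results;
}
```

Each task would typically be a function that scrapes one URL and calls `saveScrape`, so that requests never run in parallel against the same site.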
Security recommendations
Review before installing. Use only in an isolated browser/profile, do not attach it to your normal logged-in Chrome session, and fix or remove the execSync-based tool entry point. Remove the bundled Goofish search scripts if you only need a general scraper, and avoid using the skill to bypass access controls, scrape against site rules, or search for cheating/fraud services.
Capability assessment
Purpose & Capability
The stated purpose is Playwright plus stealth scraping for pages blocked by normal fetch tools, which fits some dependencies and examples. However, bundled scripts are Goofish-specific, hard-code marketplace searches including academic/writing-service terms, and read content from a live browser session, which is broader and less clearly aligned than the general scraper description.
Instruction Scope
The user-facing instructions mention ethical scraping and manual cookie handling, but they do not disclose the fixed Chrome DevTools endpoint, existing-tab enumeration, live-session reuse, or the unsafe MCP command path. There is no clear consent, domain allowlist, or isolation requirement for the higher-risk scripts.
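The missing domain allowlist mentioned above could look roughly like the following. This is a sketch under assumptions: the function name `isAllowedUrl` and the rule that subdomains of an approved host are accepted are illustrative choices, not behavior of the skill.

```javascript
// Hypothetical allowlist check: only URLs whose host is explicitly approved
// (or is a subdomain of an approved host) may be scraped.
function isAllowedUrl(rawUrl, allowedHosts) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
  const host = url.hostname.toLowerCase();
  return allowedHosts.some(
    (allowed) => host === allowed || host.endsWith(`.${allowed}`)
  );
}
```

Matching on `.${allowed}` rather than a bare suffix keeps a look-alike host such as `evilexample.com` from passing a check for `example.com`.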
Install Mechanism
Installation uses ordinary npm dependencies and Chromium setup, and no install-time persistence or install scripts were found. The runtime entry point is still problematic because it builds a shell command from the user-supplied URL and references a missing scripts/playwright-stealth.js file.
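One common way to address the shell-command concern above is to validate the URL and pass it as a separate argument through Node's `child_process.execFile`, which does not start a shell. The sketch below is not the skill's actual entry point: `validateTargetUrl` and `runScraper` are hypothetical names, `scriptPath` is a placeholder for whatever script the entry point is meant to run, and the 60-second timeout is an arbitrary choice.

```javascript
const { execFile } = require('child_process');

// Accept only absolute http(s) URLs; anything else is rejected before a process starts.
function validateTargetUrl(rawUrl) {
  const url = new URL(rawUrl); // throws on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  return url.href;
}

// Run a scraper script with the URL as its own argv entry.
// execFile does not spawn a shell, so characters such as ; | $ ` are passed
// through literally instead of being interpreted.
// `scriptPath` is a placeholder supplied by the caller.
function runScraper(scriptPath, rawUrl) {
  const target = validateTargetUrl(rawUrl);
  return new Promise((resolve, reject) => {
    execFile(process.execPath, [scriptPath, target], { timeout: 60000 }, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}
```

With this shape, a URL containing shell metacharacters reaches the child script as plain text in `process.argv[2]`.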
Credentials
Stealth browser automation is expected for this kind of scraper, but attaching to a hard-coded existing Chrome DevTools session can inherit cookies, authenticated tabs, and unrelated browsing state. That level of access is not proportionate without explicit user control and isolation.
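As an illustration of the isolation this paragraph asks for, the sketch below launches Chromium against a throwaway profile directory using Playwright's `chromium.launchPersistentContext`, rather than attaching to a running browser over DevTools. The helper names and the `scraper-profile-` prefix are assumptions. A plain `chromium.launch()` also starts from a clean state; the persistent variant is used here only because it makes the profile directory explicit.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a fresh, empty profile directory so the scraper starts with no
// cookies, saved logins, or tabs from the user's everyday browser.
function createIsolatedProfileDir(baseDir = os.tmpdir()) {
  return fs.mkdtempSync(path.join(baseDir, 'scraper-profile-'));
}

// Launch Chromium against that throwaway profile.
// Requires the `playwright` package to be installed.
async function launchIsolated() {
  const { chromium } = require('playwright');
  const profileDir = createIsolatedProfileDir();
  const context = await chromium.launchPersistentContext(profileDir, { headless: true });
  return { context, profileDir };
}

// Close the context and delete the throwaway profile afterwards.
async function disposeIsolated({ context, profileDir }) {
  await context.close();
  fs.rmSync(profileDir, { recursive: true, force: true });
}
```

A caller would use `const session = await launchIsolated();`, open pages with `session.context.newPage()`, and finish with `await disposeIsolated(session);`.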
Persistence & Privilege
No background service, self-starting persistence, destructive file operation, or credential exfiltration was found. The concern is privilege: live browser-session access and shell command execution can affect or expose much more than a single requested scrape.
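How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install stitch-playwright-scraper
- Once installation finishes, invoke the Skill by name or trigger it with /stitch-playwright-scraper
- Provide the inputs described in the Skill's parameter documentation to get structured output.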
Version history
v2.0.0
playwright-scraper v2.0.0
- Major update: Adds Playwright + Stealth support to bypass anti-bot mechanisms and handle JS-rendered pages.
- Guides users for setup, including required dependencies and Chromium installation.
- Clarifies when to use the skill versus standard fetch tools.
- Provides example usage scripts and OpenClaw integration steps.
- Emphasizes ethical scraping practices and storage recommendations.
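FAQ
What is Playwright Scraper?
It scrapes pages with Playwright and the Stealth plugin to get past anti-bot measures. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 39 downloads to date.
How do I install Playwright Scraper?
Run "/install stitch-playwright-scraper" in the OpenClaw or Claude Code chat box to install it in one step. The Playwright and Chromium prerequisites listed in SKILL.md still have to be installed separately.
Is Playwright Scraper free?
Yes. Playwright Scraper is free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Playwright Scraper support?
Playwright Scraper is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Playwright Scraper?
It is developed and maintained by sdT328606 (@sdt328606); the current version is v2.0.0.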