Playwright Scraper

Name: Playwright Scraper
Author: sdt328606

by sdT328606 · v2.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install stitch-playwright-scraper

Description

Uses Playwright with the Stealth plugin to bypass anti-scraping mechanisms and fetch pages.

README (SKILL.md)

Playwright Stealth Scraper 🕷️

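Use this skill when the web_fetch tool cannot retrieve the target page's content (403 responses, JS-rendered pages, anti-scraping blocks).

Prerequisites

Playwright and Chromium must be installed locally.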

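# Install inside the skill directory
cd ~/.openclaw/workspace/skills/playwright-scraper
# playwright-extra is needed because the example script requires it
npm install playwright playwright-extra puppeteer-extra-plugin-stealth
npx playwright install chromium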

Note: this downloads roughly 300 MB of Chromium binaries.

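When to use this

Scenario	What to use
Ordinary web pages, API responses	`web_fetch` (built-in, lightweight)
Blocked by Cloudflare or similar anti-scraping protection	playwright-scraper
Page needs JS rendering before its content appears	playwright-scraper
Scraping that requires a login	playwright-scraper + manual cookie handling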
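
The table above can be read as a simple fallback rule: try the lightweight fetch first and switch to the browser only when the response looks blocked or empty. The following is a minimal sketch of that rule, not part of the skill. The function names, the status codes other than 403, and the 200-character threshold are all assumptions chosen for illustration.

```javascript
// Hypothetical helper: decide whether a plain HTTP fetch result is usable
// or whether the page should be retried with playwright-scraper.
const BLOCKED_STATUSES = new Set([403, 429, 503]); // 403 is from the README; 429/503 are assumed
const MIN_TEXT_LENGTH = 200; // assumed threshold for "the page has real content"

function visibleTextLength(html) {
  // Rough estimate: drop scripts and styles, then tags, then collapse whitespace.
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim().length;
}

function chooseTool(status, html) {
  if (BLOCKED_STATUSES.has(status)) return 'playwright-scraper';
  // An almost empty body usually means the content is rendered client-side.
  if (visibleTextLength(html) < MIN_TEXT_LENGTH) return 'playwright-scraper';
  return 'web_fetch';
}
```

A caller would pass the status code and body from the first fetch attempt, for example `chooseTool(response.status, await response.text())`, and launch the browser only when the result is `'playwright-scraper'`.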

Usage

Option 1: Write a standalone script (recommended)

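// Requires: npm install playwright playwright-extra puppeteer-extra-plugin-stealth
const { chromium } = require('playwright-extra');
const stealth = require('puppeteer-extra-plugin-stealth')();
chromium.use(stealth); // register the stealth evasions before launching

(async () => {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // 'networkidle' waits for network activity to settle so JS-rendered content can load
    await page.goto('TARGET_URL', { waitUntil: 'networkidle' });
    const content = await page.content(); // full HTML after rendering
    // Extract whatever you need from `content` here
    console.log(content);
  } finally {
    await browser.close(); // always release the browser, even if navigation fails
  }
})().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});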

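Option 2: Use OpenClaw's built-in browser tool directly

# Start the browser
browser action=start profile=openclaw
# Open the page
browser action=open url=TARGET_URL
# Take a snapshot
browser action=snapshot targetId=XXX

The built-in browser tool is usually sufficient; playwright-scraper is mainly for special anti-scraping cases.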

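Notes

Do not scrape at high frequency, and respect robots.txt.
If the target site has a login wall, do not fill in credentials automatically; ask the user (your boss) first.
Save scraped data to the appropriate directory in the workspace; do not leave it in /tmp.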
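
The first and last of these notes can be sketched in code. The block below is an illustrative assumption, not something shipped with the skill. `isDisallowed` is a deliberately simplified robots.txt check that only honours `User-agent: *` groups and prefix-style `Disallow` rules, `sleep` spaces out requests, and `outputPath` builds a file location under a hypothetical `scraped/` folder in the workspace instead of /tmp.

```javascript
const path = require('node:path');

// Simplified robots.txt check (assumption: no wildcard or Allow handling).
function isDisallowed(robotsTxt, urlPath) {
  let inStarGroup = false;
  let lastWasAgent = false;
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split('#')[0].trim(); // strip comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines belong to the same group.
      inStarGroup = lastWasAgent ? inStarGroup || value === '*' : value === '*';
      lastWasAgent = true;
    } else {
      lastWasAgent = false;
      if (field === 'disallow' && inStarGroup && value !== '' && urlPath.startsWith(value)) {
        return true;
      }
    }
  }
  return false;
}

// Pause between requests to avoid high-frequency scraping.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical layout: <workspace>/scraped/<hostname>/<timestamp>.html
function outputPath(workspaceDir, url, now = new Date()) {
  const host = new URL(url).hostname;
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return path.join(workspaceDir, 'scraped', host, `${stamp}.html`);
}
```

In a scraping loop, one would fetch `/robots.txt` once per host, skip any path for which `isDisallowed` returns true, and call `await sleep(2000)` (or a longer delay) between page loads. A production setup would likely use a complete robots.txt parser rather than this simplified one.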

Usage Guidance

Review before installing. Use only in an isolated browser/profile, do not attach it to your normal logged-in Chrome session, and fix or remove the execSync-based tool entry point. Remove the bundled Goofish search scripts if you only need a general scraper, and avoid using the skill to bypass access controls, scrape against site rules, or search for cheating/fraud services.

Capability Assessment

⚠ Purpose & Capability

The stated purpose is Playwright plus stealth scraping for pages blocked by normal fetch tools, which fits some dependencies and examples. However, bundled scripts are Goofish-specific, hard-code marketplace searches including academic/writing-service terms, and read content from a live browser session, which is broader and less clearly aligned than the general scraper description.

⚠ Instruction Scope

The user-facing instructions mention ethical scraping and manual cookie handling, but they do not disclose the fixed Chrome DevTools endpoint, existing-tab enumeration, live-session reuse, or the unsafe MCP command path. There is no clear consent, domain allowlist, or isolation requirement for the higher-risk scripts.
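
One way to address the missing domain allowlist is a check that runs before any navigation. The following is a minimal sketch under stated assumptions: the skill ships nothing like it, and `ALLOWED_HOSTS` and `isAllowedTarget` are hypothetical names with placeholder hostnames.

```javascript
// Hypothetical allowlist; replace with the hosts the user has explicitly approved.
const ALLOWED_HOSTS = ['example.com', 'docs.example.org'];

function isAllowedTarget(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable URL
  }
  // Only plain web URLs; rejects file:, javascript:, chrome:, and similar schemes.
  if (url.protocol !== 'https:' && url.protocol !== 'http:') return false;
  const host = url.hostname.toLowerCase();
  // Accept the listed host itself or any of its subdomains.
  return allowedHosts.some((allowed) => host === allowed || host.endsWith(`.${allowed}`));
}
```

Matching on the parsed hostname rather than on the raw string avoids look-alike targets such as `evilexample.com` passing a naive substring test.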

ℹ Install Mechanism

Installation uses ordinary npm dependencies and Chromium setup, and no install-time persistence or install scripts were found. The runtime entry point is still problematic because it builds a shell command from the user-supplied URL and references a missing scripts/playwright-stealth.js file.
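
The shell-injection concern can be illustrated with a safer invocation pattern. This sketch does not reproduce the skill's actual entry point, which is not shown here. It assumes a hypothetical `runScraper` wrapper and shows the general technique: validate the URL, then pass it as a separate argument via `execFileSync`, which does not route arguments through a shell.

```javascript
const { execFileSync } = require('node:child_process');

// Parse and validate the user-supplied URL before it goes anywhere near a process.
function normalizeTargetUrl(rawUrl) {
  const url = new URL(rawUrl); // throws a TypeError on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  return url.href;
}

// Hypothetical wrapper: scriptPath is whatever scraper script the caller trusts.
function runScraper(scriptPath, rawUrl) {
  const target = normalizeTargetUrl(rawUrl);
  // Arguments are passed as an array, so characters like ; | $() are never
  // interpreted by a shell, unlike a string handed to execSync.
  return execFileSync(process.execPath, [scriptPath, target], {
    encoding: 'utf8',
    timeout: 60000, // assumed upper bound for one scrape, in milliseconds
  });
}
```

With this shape, an input such as `https://example.com/; rm -rf ~` reaches the child process as a single (percent-encoded) argument rather than as two shell commands.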

⚠ Credentials

Stealth browser automation is expected for this kind of scraper, but attaching to a hard-coded existing Chrome DevTools session can inherit cookies, authenticated tabs, and unrelated browsing state. That level of access is not proportionate without explicit user control and isolation.
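
For contrast, an isolated scrape can be sketched as launching a fresh browser and a fresh context instead of attaching to an existing Chrome DevTools session. This is an illustration of the isolation idea, not the skill's code. `scrapeIsolated` is a hypothetical name, and the sketch uses plain `playwright` without the stealth plugin to keep it short.

```javascript
// Hypothetical isolated scrape: nothing is shared with the user's normal browser.
async function scrapeIsolated(url) {
  // Loaded lazily so this file can be imported without Playwright installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch({ headless: true });
  try {
    // A new context starts with no cookies, storage, or open tabs.
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    return await page.content();
  } finally {
    await browser.close(); // discards the context and everything in it
  }
}
```

Because the context is created and destroyed inside one call, cookies and authenticated tabs from other sessions are never visible to the scrape.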

⚠ Persistence & Privilege

No background service, self-starting persistence, destructive file operation, or credential exfiltration was found. The concern is privilege: live browser-session access and shell command execution can affect or expose much more than a single requested scrape.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install stitch-playwright-scraper
After installation, invoke the skill by name or use /stitch-playwright-scraper
Provide required inputs per the skill's parameter spec and get structured output

Version History

v2.0.0

playwright-scraper v2.0.0
Major update: adds Playwright + Stealth support to bypass anti-bot mechanisms and handle JS-rendered pages.
Guides users through setup, including required dependencies and Chromium installation.
Clarifies when to use the skill versus standard fetch tools.
Provides example usage scripts and OpenClaw integration steps.
Emphasizes ethical scraping practices and storage recommendations.

Metadata

Slug stitch-playwright-scraper

Version 2.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Playwright Scraper?

Playwright Scraper uses Playwright with the Stealth plugin to bypass anti-scraping mechanisms and fetch pages. It is an AI Agent Skill for Claude Code / OpenClaw, with 39 downloads so far.

How do I install Playwright Scraper?

Run "/install stitch-playwright-scraper" in the OpenClaw or Claude Code chat to install the skill. Per the README, Playwright and Chromium must also be installed locally before the scraper can run.

Is Playwright Scraper free?

Yes, Playwright Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Playwright Scraper support?

Playwright Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Playwright Scraper?

It is built and maintained by sdT328606 (@sdt328606); the current version is v2.0.0.
