
Persistent Browser Scraper

Name: Persistent Browser Scraper
Author: ckncg

by ckncg · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

219 Downloads

Install in OpenClaw

/install persistent-browser

Description

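Uses a Playwright persistent context (main-identity) to scrape websites that require a logged-in session (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter). Triggers automatically when the user asks for an external web search or names one of these sites.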

Usage Guidance

This skill instructs the agent to open and modify a hard-coded local browser profile (including deleting a lock file) in order to scrape logged-in content. That profile can contain cookies, sessions, and other sensitive data. Before installing: (1) do not install unless you fully trust the skill author and understand why it needs your browser profile; (2) prefer a design that uses explicit API tokens or a dedicated, sandboxed browser profile rather than your main identity; (3) require the skill to declare its dependencies (Playwright, a browser) and the config path it will use, and change the path to a profile you control; (4) avoid running untrusted code with --no-sandbox or in headful mode; (5) if you test it, run it inside an isolated VM or container with a throwaway profile and monitor file access; (6) consider disabling autonomous invocation so the skill runs only with explicit user approval.
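As an illustration of points (2), (4), and (5), the following is a minimal sketch of a launch that uses a throwaway profile instead of a main identity. The helper names `throwaway_profile` and `fetch_text` are hypothetical and do not come from the skill; the sketch assumes Playwright for Python and a Chromium build are installed.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def throwaway_profile(prefix: str = "scraper-profile-"):
    """Yield a fresh, empty profile directory and delete it afterwards."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def fetch_text(url: str) -> str:
    """Load a page in a throwaway persistent context and return its body text."""
    # Imported lazily so throwaway_profile() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with throwaway_profile() as profile_dir, sync_playwright() as p:
        # Headless with the default sandbox: no --no-sandbox, no extra flags.
        context = p.chromium.launch_persistent_context(str(profile_dir), headless=True)
        try:
            page = context.new_page()
            page.goto(url)
            return page.inner_text("body")
        finally:
            context.close()
```

A throwaway profile carries no sessions, so it only reaches public pages. For logged-in content, a dedicated profile created solely for the skill could be substituted for the temporary directory.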

Capability Analysis

Type: OpenClaw Skill
Name: persistent-browser
Version: 1.0.0

The skill is designed to access a persistent browser profile located at a hardcoded path (/home/kncao/.openclaw/browser-profiles/main-identity), which contains sensitive session data and login credentials for platforms like GitHub, X/Twitter, and YouTube. It explicitly disables security features (--no-sandbox) and requires non-headless mode to bypass bot detection. While the stated purpose is scraping, the access to a persistent 'main-identity' profile poses a high risk of session hijacking or unauthorized access to private user data if the agent is misdirected. (SKILL.md)

Capability Assessment

⚠ Purpose & Capability

The described purpose (scraping sites that require login via a persistent Playwright context) is coherent with using launch_persistent_context. However, the SKILL.md hardcodes a specific user_data_dir (/home/kncao/.openclaw/browser-profiles/main-identity) and instructs manipulating its files. The skill metadata declares no required config paths or credentials, so the instructions demand access beyond what was declared and beyond a typical scraper's minimal needs.
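One way to avoid a hardcoded user_data_dir is to make the profile path an explicit, declared setting. The sketch below is an assumption about how that could look, not something the skill provides: the environment variable name `PERSISTENT_BROWSER_PROFILE_DIR` and the function `resolve_profile_dir` are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical variable name; the skill declares no such setting.
PROFILE_ENV_VAR = "PERSISTENT_BROWSER_PROFILE_DIR"


def resolve_profile_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the profile directory chosen by the user, or fail loudly."""
    env = os.environ if env is None else env
    raw = env.get(PROFILE_ENV_VAR, "").strip()
    if not raw:
        # Refuse to fall back to any built-in default path.
        raise RuntimeError(f"{PROFILE_ENV_VAR} is not set; refusing to guess a profile path")
    path = Path(raw).expanduser()
    if not path.is_absolute():
        raise RuntimeError(f"{PROFILE_ENV_VAR} must be an absolute path, got {raw!r}")
    return path
```

The resolved path would then be passed to launch_persistent_context in place of the hardcoded directory, so the user decides which profile the skill may touch.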

⚠ Instruction Scope

The runtime instructions explicitly tell the agent to read/write a local browser profile and to rm -f the SingletonLock before each run. That is file-system access and modification of another profile on disk (potentially containing cookies, sessions, credentials). The instructions also require running Playwright headful with args (including --no-sandbox and anti-detection flags), which expands runtime privileges and evasion tactics. These actions go beyond simple page fetching and implicate sensitive local data and destructive operations.
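Deleting the lock unconditionally can hide the fact that another browser is still using the profile. A more conservative sketch checks for the lock and refuses to proceed. The file name SingletonLock comes from the assessment above; the function names are hypothetical, and the assumption that the lock may be a symlink reflects how Chromium typically creates it on Linux and macOS.

```python
import os
from pathlib import Path


def profile_in_use(profile_dir: Path) -> bool:
    """Report whether a SingletonLock entry is present in the profile."""
    # lexists() is used because the lock may be a dangling symlink,
    # which a plain exists() check would report as missing.
    return os.path.lexists(profile_dir / "SingletonLock")


def ensure_profile_free(profile_dir: Path) -> None:
    """Raise instead of deleting the lock when the profile looks busy."""
    if profile_in_use(profile_dir):
        raise RuntimeError(
            f"{profile_dir} has a SingletonLock; close the browser using it "
            "or pick another profile instead of removing the lock"
        )
```

A lock left behind by a crashed browser still triggers the error here. That is deliberate in this sketch: the decision to remove it stays with the user.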

ℹ Install Mechanism

This is instruction-only (no install spec), which limits installation-time risk. However, the SKILL.md assumes Playwright (and a browser) are available but does not declare required binaries or packages. The missing dependency declarations are an inconsistency: the skill will fail or behave unpredictably unless Playwright and appropriate browsers are present.
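A preflight check can surface the undeclared dependency before anything else runs. This is a minimal sketch, assuming the skill relies on the `playwright` Python package; the function name `missing_modules` is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["playwright"])
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("Python packages present; browser binaries are not checked here.")
```

This only confirms that the package is importable. Browser binaries are installed separately (for example with `playwright install chromium`) and would need their own check.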

⚠ Credentials

No environment variables or credentials are requested, yet the skill asks to use a persistent browser profile that likely contains cookies, tokens, and session state. Access to that profile is disproportionate and privacy-sensitive. The skill gives no guidance for using a dedicated/sandboxed profile or requesting explicit user consent for accessing such data.
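An alternative to reusing a logged-in profile is to request an explicit, scoped credential. The sketch below builds (but does not send) an authenticated request to the GitHub REST API. The environment variable name `GITHUB_TOKEN` and the function `build_github_request` are assumptions for illustration; the skill itself declares nothing of the kind.

```python
import os
import urllib.request
from typing import Mapping, Optional

# Assumed variable name; the skill does not declare any credential.
TOKEN_ENV_VAR = "GITHUB_TOKEN"


def build_github_request(
    path: str, env: Optional[Mapping[str, str]] = None
) -> urllib.request.Request:
    """Build a GitHub API request authenticated with a user-supplied token."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "")
    if not token:
        raise RuntimeError(f"{TOKEN_ENV_VAR} is not set; no credential was provided")
    return urllib.request.Request(
        f"https://api.github.com{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```

A token can be limited in scope and revoked on its own, whereas a browser profile exposes every session stored in it.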

⚠ Persistence & Privilege

The skill is not marked always:true, but disable-model-invocation is false (normal), meaning the agent could autonomously invoke this skill when triggered by web-search intents. Autonomous invocation combined with the ability to read/modify a local logged-in browser profile increases the blast radius and privacy risk if invoked without explicit user confirmation.
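The effect of requiring explicit confirmation can be sketched as a small gate around the sensitive action. This is not OpenClaw's mechanism; `run_with_approval` and its `ask` callback are hypothetical names used only to show the idea.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_approval(
    description: str,
    action: Callable[[], T],
    ask: Callable[[str], bool],
) -> Optional[T]:
    """Run `action` only if `ask` returns True for the given description."""
    if not ask(f"Allow: {description}?"):
        # Declined: the action is never invoked.
        return None
    return action()
```

In an interactive setting `ask` could prompt the user; in tests it can be a plain function that returns a fixed answer.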

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install persistent-browser
After installation, invoke the skill by name or use /persistent-browser
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: enables persistent browser context scraping for login-required websites.
- Uses Playwright with a persistent user profile for sites needing authentication (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter).
- Automatically triggers when accessing specified sites or when "external web search" is requested.
- Ensures headless mode is disabled for proper page rendering, especially on X/Twitter.
- Cleans up browser profile lock files before scraping to avoid errors.
- Customizes waiting times per site for reliable rendering of JavaScript-heavy pages and SPAs.
- Extracts plain text content, avoiding screenshots.

Metadata

Slug: persistent-browser
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1

Frequently Asked Questions

What is Persistent Browser Scraper?

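Persistent Browser Scraper uses a Playwright persistent context (main-identity) to scrape websites that require a logged-in session (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter). It triggers automatically when the user asks for an external web search or names one of these sites. It is an AI agent skill for Claude Code / OpenClaw, with 219 downloads so far.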

How do I install Persistent Browser Scraper?

Run "/install persistent-browser" in the OpenClaw or Claude Code chat to install it in one step. The skill declares no dependencies, but its instructions assume that Playwright and a browser are already available.

Is Persistent Browser Scraper free?

Yes, Persistent Browser Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Persistent Browser Scraper support?

Persistent Browser Scraper is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Persistent Browser Scraper?

It is built and maintained by ckncg (@ckncg); the current version is v1.0.0.
