
Persistent Browser Scraper

by ckncg · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads 219
Stars 0
Active Installs 0
Versions 1
Install in OpenClaw
/install persistent-browser
Description
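Uses a Playwright persistent context (main-identity) to scrape websites that require a logged-in session (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter). Triggers automatically when the user asks for an external web search or names one of these sites.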
Usage Guidance
This skill instructs the agent to open and modify a hard-coded local browser profile (including deleting a lock file) to scrape logged-in content. That profile can contain cookies, sessions, and other sensitive data. Before installing:
  1. Do not install unless you fully trust the skill author and understand why it needs your browser profile.
  2. Prefer a design that uses explicit API tokens or a dedicated, sandboxed browser profile rather than your main identity.
  3. Require the skill to declare its dependencies (Playwright, browser) and the config path it will use, and change the path to a profile you control.
  4. Avoid running untrusted code with --no-sandbox and headful flags.
  5. If you test it, run inside an isolated VM/container with a throwaway profile and monitor file access.
  6. Consider disabling autonomous invocation so the skill runs only with explicit user approval.
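The second and fifth recommendations above can be sketched as follows. This is a minimal example, not the skill's own code: it assumes Playwright for Python is installed, and the names `throwaway_profile` and `fetch_text` are hypothetical. The Playwright import is deferred so the profile helper works without it.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def throwaway_profile(prefix: str = "scraper-profile-"):
    """Yield a fresh, empty profile directory and delete it afterwards."""
    profile_dir = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield profile_dir
    finally:
        # Cookies and session state written during the run are discarded here.
        shutil.rmtree(profile_dir, ignore_errors=True)


def fetch_text(url: str) -> str:
    """Return the visible text of `url`, using a throwaway profile."""
    # Imported lazily: Playwright is an assumed dependency of this sketch.
    from playwright.sync_api import sync_playwright

    with throwaway_profile() as profile_dir, sync_playwright() as p:
        context = p.chromium.launch_persistent_context(
            str(profile_dir), headless=True
        )
        try:
            page = context.new_page()
            page.goto(url)
            return page.inner_text("body")
        finally:
            context.close()
```

A throwaway profile starts with no logged-in sessions, so this form only reaches public pages. That limitation is the point: any login has to be added deliberately to a profile you control rather than inherited from a main identity.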
Capability Analysis
Type: OpenClaw Skill
Name: persistent-browser
Version: 1.0.0
The skill is designed to access a persistent browser profile located at a hardcoded path (/home/kncao/.openclaw/browser-profiles/main-identity), which contains sensitive session data and login credentials for platforms like GitHub, X/Twitter, and YouTube. It explicitly disables security features (--no-sandbox) and requires non-headless mode to bypass bot detection. While the stated purpose is scraping, the access to a persistent 'main-identity' profile poses a high risk of session hijacking or unauthorized access to private user data if the agent is misdirected. (SKILL.md)
Capability Assessment
Purpose & Capability
The described purpose (scraping sites that require login via a persistent Playwright context) is coherent with using launch_persistent_context. However, the SKILL.md hardcodes a specific user_data_dir (/home/kncao/.openclaw/browser-profiles/main-identity) and instructs manipulating its files. The skill metadata declares no required config paths or credentials, so the instructions demand access beyond what was declared and beyond a typical scraper's minimal needs.
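One way to avoid a hardcoded user_data_dir is to make the profile path an explicit, declared setting. The sketch below assumes an environment variable named `PERSISTENT_BROWSER_PROFILE`; that name and the function `resolve_profile_dir` are hypothetical and do not come from the skill.

```python
import os
from pathlib import Path

# Hypothetical variable name; the skill itself declares no configuration.
PROFILE_ENV_VAR = "PERSISTENT_BROWSER_PROFILE"


def resolve_profile_dir(env=None) -> Path:
    """Return the configured profile directory, or fail if none is set."""
    env = os.environ if env is None else env
    raw = env.get(PROFILE_ENV_VAR)
    if not raw:
        # Refuse to fall back to any default, such as a main-identity profile.
        raise RuntimeError(
            f"Set {PROFILE_ENV_VAR} to the browser profile directory to use."
        )
    return Path(raw).expanduser().resolve()
```

The returned path could then be passed to launch_persistent_context as its user_data_dir. Failing when the variable is unset keeps the choice of profile with the user instead of the skill author.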
Instruction Scope
The runtime instructions explicitly tell the agent to read/write a local browser profile and to rm -f the SingletonLock before each run. That is file-system access and modification of another profile on disk (potentially containing cookies, sessions, credentials). The instructions also require running Playwright headful with args (including --no-sandbox and anti-detection flags), which expands runtime privileges and evasion tactics. These actions go beyond simple page fetching and implicate sensitive local data and destructive operations.
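A less destructive alternative to removing SingletonLock before every run is to treat the lock as a signal that the profile may be in use and stop. The sketch below is an assumption about how that could look, not the skill's behavior; the function names are hypothetical. It uses `os.path.lexists` because the lock may be a symbolic link whose target is not a real file, in which case a plain existence check would miss it.

```python
import os
from pathlib import Path

LOCK_NAME = "SingletonLock"


def profile_is_locked(profile_dir) -> bool:
    """Return True if a lock entry is present in the profile directory."""
    # lexists() also reports dangling symlinks, which exists() would not.
    return os.path.lexists(Path(profile_dir) / LOCK_NAME)


def ensure_profile_free(profile_dir) -> None:
    """Raise instead of deleting the lock when the profile looks busy."""
    if profile_is_locked(profile_dir):
        raise RuntimeError(
            f"{profile_dir} appears to be in use by another browser; "
            "close it or choose a different profile."
        )
```

A stale lock left by a crashed browser would still need manual cleanup, but that decision then stays with the user rather than being made silently on each run.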
Install Mechanism
This is instruction-only (no install spec), which limits installation-time risk. However, the SKILL.md assumes Playwright (and a browser) are available but does not declare required binaries or packages. The missing dependency declarations are an incoherence: the skill will fail or behave unpredictably unless Playwright and appropriate browsers are present.
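An undeclared dependency can at least be checked before any scraping starts, so the failure is explicit. The following is a minimal sketch using only the standard library; the function name `missing_modules` is hypothetical, and it checks for importable Python packages only, not for an installed browser.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["playwright"])
    if missing:
        raise SystemExit(f"Missing required packages: {', '.join(missing)}")
```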
Credentials
No environment variables or credentials are requested, yet the skill asks to use a persistent browser profile that likely contains cookies, tokens, and session state. Access to that profile is disproportionate and privacy-sensitive. The skill gives no guidance for using a dedicated/sandboxed profile or requesting explicit user consent for accessing such data.
Persistence & Privilege
The skill is not marked always:true, but disable-model-invocation is false (normal), meaning the agent could autonomously invoke this skill when triggered by web-search intents. Autonomous invocation combined with the ability to read/modify a local logged-in browser profile increases the blast radius and privacy risk if invoked without explicit user confirmation.
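Where autonomous invocation cannot be disabled, an explicit confirmation step before touching the profile narrows the risk described above. This is a sketch under the assumption that the agent can prompt the user; `confirm_profile_access` and its `ask` parameter are hypothetical names.

```python
def confirm_profile_access(profile_dir: str, ask=input) -> bool:
    """Ask the user before the profile is read or modified.

    `ask` is injectable so the prompt can be replaced in tests or by
    whatever confirmation mechanism the host agent provides.
    """
    answer = ask(
        f"Allow this run to read and modify the browser profile at "
        f"{profile_dir}? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```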
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install persistent-browser
  3. After installation, invoke the skill by name or use /persistent-browser
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: Enables persistent browser context scraping for login-required websites.
- Uses Playwright with a persistent user profile for sites needing authentication (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter).
- Automatically triggers when accessing specified sites or when "external web search" is requested.
- Ensures headless mode is disabled for proper page rendering, especially on X/Twitter.
- Cleans up browser profile lock files before scraping to avoid errors.
- Customizes waiting times per site for reliable JavaScript/SPA rendering.
- Extracts plain text content, avoiding screenshots.
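The per-site waiting times mentioned in the release notes could be expressed as a lookup keyed by hostname. The sketch below is illustrative only: the listing does not state the actual delays, so every value in `SITE_WAIT_MS` and the default are made-up placeholders, and `wait_ms_for` is a hypothetical name.

```python
from urllib.parse import urlparse

# Placeholder values in milliseconds; the skill's real delays are not published.
DEFAULT_WAIT_MS = 2000
SITE_WAIT_MS = {
    "x.com": 8000,
    "twitter.com": 8000,
    "youtube.com": 5000,
    "reddit.com": 4000,
}


def wait_ms_for(url: str) -> int:
    """Return the render wait for `url`, matching the site or its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    for site, wait_ms in SITE_WAIT_MS.items():
        if host == site or host.endswith("." + site):
            return wait_ms
    return DEFAULT_WAIT_MS
```

With Playwright, the result could be passed to `page.wait_for_timeout(...)` after navigation and before reading the page text, although waiting for a specific element is usually more reliable than a fixed delay.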
Metadata
Slug persistent-browser
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Persistent Browser Scraper?

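It uses a Playwright persistent context (main-identity) to scrape websites that require a logged-in session (YouTube, GitHub, HuggingFace, Reddit, Kaggle, X/Twitter), and triggers automatically when the user asks for an external web search or names one of these sites. It is an AI Agent Skill for Claude Code / OpenClaw, with 219 downloads so far.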

How do I install Persistent Browser Scraper?

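Run "/install persistent-browser" in the OpenClaw or Claude Code chat. The skill is instruction-only and has no install step of its own, but it assumes Playwright and a browser are already available and does not declare them as dependencies.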

Is Persistent Browser Scraper free?

Yes, Persistent Browser Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Persistent Browser Scraper support?

Persistent Browser Scraper is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Persistent Browser Scraper?

It is built and maintained by ckncg (@ckncg); the current version is v1.0.0.
