browser-automation-skills

by huaerye23 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious
Downloads: 269 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install browser-automation-skills
Description
Browser automation skills for AI models — navigate, screenshot, interact, scrape, debug, test, and record browser sessions. Controls local Google Chrome via...
README (SKILL.md)

Browser Automation Skills

A set of Skills that teach AI models how to control the local Google Chrome browser. It works with any model, either through the built-in browser_subagent or through the bundled Playwright CLI script.

Included Skills

| Skill | Description | Invocation |
|---|---|---|
| navigate | Open URLs, read page content | Auto + /navigate |
| screenshot | Capture visuals | Auto + /screenshot |
| interact | Click, type, fill forms | Auto + /interact |
| scrape | Extract structured data | Auto + /scrape |
| debug | Inspect network, console | Auto + /debug |
| test | Automated QA | /test only |
| record | Record sessions | /record only |
| browser-context | API reference | Model only |

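Prerequisites

  • Google Chrome installed locally
  • For Playwright CLI: pip install playwright
  • Start Chrome with remote debugging enabled: chrome.exe --remote-debugging-port=9222 (chrome.exe is the Windows binary name; on other platforms use the local Chrome executable)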
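
Before running the Playwright CLI, it can help to confirm that Chrome is actually listening on the debugging port. The following is a minimal sketch, not part of the skill: it assumes the default endpoint http://localhost:9222 and queries Chrome's `/json/version` HTTP endpoint. The helper names (`version_url`, `describe`, `check_cdp`) are hypothetical.

```python
import json
import urllib.request
from urllib.error import URLError


def version_url(endpoint: str) -> str:
    """Build the URL of Chrome's version-info endpoint for a CDP base URL."""
    return endpoint.rstrip("/") + "/json/version"


def describe(payload: bytes) -> str:
    """Summarise the JSON body returned by /json/version."""
    info = json.loads(payload)
    browser = info.get("Browser", "unknown browser")
    ws_url = info.get("webSocketDebuggerUrl", "no websocket URL")
    return f"{browser} at {ws_url}"


def check_cdp(endpoint: str = "http://localhost:9222", timeout: float = 2.0) -> str:
    """Return a short status string; connection failures are reported, not raised."""
    try:
        with urllib.request.urlopen(version_url(endpoint), timeout=timeout) as resp:
            return "reachable: " + describe(resp.read())
    except (URLError, OSError, ValueError) as exc:
        return f"not reachable: {exc}"
```

Calling `print(check_cdp())` reports either the browser version and WebSocket URL or the reason the endpoint could not be reached.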

Quick Start

python scripts/browser.py status
python scripts/browser.py navigate https://example.com
python scripts/browser.py screenshot
python scripts/browser.py lock      # Visual overlay, block user input
python scripts/browser.py unlock    # Remove overlay, restore input

See docs/api-reference.md for full API documentation.
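
For readers who want to see what attaching to Chrome over CDP looks like in Playwright, here is a hedged sketch. It is not taken from scripts/browser.py, whose internals are not shown on this page. It uses Playwright's `connect_over_cdp` with the default endpoint mentioned above; `page_summary` and `summarize_open_pages` are hypothetical names.

```python
def page_summary(page) -> dict:
    """Collect the URL and title of a Playwright-style page object."""
    return {"url": page.url, "title": page.title()}


def summarize_open_pages(endpoint: str = "http://localhost:9222") -> list:
    """Attach to a running Chrome over CDP and summarise every open tab."""
    # Imported lazily so page_summary() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Attaches to the existing browser instead of launching a new one.
        browser = p.chromium.connect_over_cdp(endpoint)
        return [
            page_summary(page)
            for context in browser.contexts
            for page in context.pages
        ]
```

Because this attaches to the existing browser, every tab in the user's session is visible to the script, including tabs with logged-in accounts.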

Usage Guidance
This skill is functionally coherent for browser automation, but it contains high-impact actions you should be aware of before installing or enabling automatic use. Specific points to consider:
  • Review the Python script yourself before running it; it connects to a Chrome DevTools Protocol (CDP) endpoint and drives your existing Chrome instance.
  • The script reads the DOM and lists network requests and console logs, which can disclose sensitive data (tokens, cookies, secrets embedded in pages). Avoid running it on pages with private data unless you trust the skill and runtime.
  • The skill supports login flows and typing into pages. Do not let it handle credentials unless you explicitly provide them in a controlled, ephemeral way you trust.
  • The script honors a BROWSER_CDP_ENDPOINT env var (not declared in the manifest). Make sure it is set to a local-only endpoint (e.g., http://localhost:9222) and not exposed to the network; exposing CDP to a network address can let other systems control your browser.
  • The 'lock' command injects a full-screen overlay that blocks user input; this is intrusive and could be abused. Favor manual invocation (do not enable broad autonomous invocation) and test in a safe environment first.
  • If you decide to use it, run the script in an isolated account/VM, limit auto-invocation privileges, and only grant terminal access to trusted models. If unsure, do not install or run the Playwright adapter until you can audit the code and runtime in a sandbox.
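
The advice about keeping BROWSER_CDP_ENDPOINT local-only can be enforced with a small guard before any script is run. This is a sketch under stated assumptions: the variable name and its default (http://localhost:9222) come from the assessment on this page, while `is_local_endpoint` and `resolve_endpoint` are hypothetical helpers that are not part of the skill.

```python
import ipaddress
import os
from urllib.parse import urlparse

DEFAULT_ENDPOINT = "http://localhost:9222"  # default reported for the script


def is_local_endpoint(endpoint: str) -> bool:
    """True if the endpoint's host is 'localhost' or a loopback IP address."""
    host = urlparse(endpoint).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False


def resolve_endpoint(env=None) -> str:
    """Read BROWSER_CDP_ENDPOINT (or the default) and reject non-local hosts."""
    env = os.environ if env is None else env
    endpoint = env.get("BROWSER_CDP_ENDPOINT", DEFAULT_ENDPOINT)
    if not is_local_endpoint(endpoint):
        raise ValueError(f"refusing non-local CDP endpoint: {endpoint}")
    return endpoint
```

This checks only where the script would connect; it does not verify which network address Chrome itself is listening on.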
Capability Analysis
Type: OpenClaw Skill
Name: browser-automation-skills
Version: 1.0.0

The bundle provides comprehensive browser automation capabilities by connecting to a local Chrome instance via the Chrome DevTools Protocol (CDP) using Playwright. This grants an AI agent full control over the user's active browser session, including access to logged-in accounts and sensitive data, which is an inherently high-risk capability. A notable feature in `scripts/browser.py` is the `lock` command, which injects a full-screen JavaScript overlay to block user input while the agent is active. While the code appears to be a legitimate implementation of its stated purpose and includes a user-accessible 'Stop' button in the overlay, the combination of browser session access and input suppression qualifies as a meaningful high-risk behavior.
Capability Assessment
Purpose & Capability
Name/description match the delivered artifacts: SKILL.md, README, API docs and scripts/browser.py implement navigation, screenshot, interact, scrape, debug, test and record via CDP/Playwright. The Playwright CLI adapter is a coherent implementation of the claimed capability.
Instruction Scope
Runtime instructions and the included script direct the agent to read full DOM, capture screenshots, list network requests, capture console logs, click/type (including login flows), and lock user input with a full-screen overlay. These behaviors go beyond passive observation: they permit active interaction with and inspection of arbitrary websites and can capture sensitive in-page data. The code also references environment variables and skill directory paths (e.g., BROWSER_CDP_ENDPOINT, CLAUDE_SKILL_DIR) that are not declared in the skill metadata.
Install Mechanism
No install spec in registry (instruction-only skill plus one Python script). The script requires the user to run 'pip install playwright' themselves; nothing is fetched automatically during installation. This lowers supply-chain risk but means runtime dependency installation and execution are manual and must be audited by the user.
Credentials
The manifest lists no required env vars, but the code uses BROWSER_CDP_ENDPOINT (defaulting to http://localhost:9222) and documentation references CLAUDE_SKILL_DIR. Allowing the CDP endpoint to be set by env var means the skill can be pointed at an arbitrary CDP host (including a remote host) without that being declared. Additionally, network/console inspection via CDP can expose sensitive tokens or page-injected secrets. The skill does not request credentials itself, but instructions (and the 'interact' skill) explicitly describe performing login flows, which could cause credentials to be entered or captured.
Persistence & Privilege
The skill sets always:false (good) and allows autonomous invocation (normal). Combined with its ability to lock user input, persist CLI session state, inspect network/console data, and perform multi-step interactions, however, autonomous execution increases the potential for undesired or surprising actions. The 'lock' feature (injecting an overlay to block user input) is especially intrusive and increases the blast radius if the skill is invoked without careful controls.
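
One way to limit the risk of a lingering overlay is to make sure `unlock` always follows `lock`, even when an automated step fails. The sketch below wraps the `lock` and `unlock` commands from the Quick Start in a context manager. `browser_cmd` and `locked_browser` are hypothetical names, and the script path assumes the working directory is the skill's root.

```python
import subprocess
import sys
from contextlib import contextmanager


def browser_cmd(*args: str) -> list:
    """Build the argv for one scripts/browser.py call (path from the Quick Start)."""
    return [sys.executable, "scripts/browser.py", *args]


@contextmanager
def locked_browser(run=subprocess.run):
    """Lock user input for the duration of the block and always unlock afterwards."""
    run(browser_cmd("lock"), check=True)
    try:
        yield
    finally:
        # Runs even if the automated steps raise, so the overlay is not left behind.
        run(browser_cmd("unlock"), check=False)
```

Steps placed inside `with locked_browser():` run while the overlay is active. The `run` parameter exists only so the wrapper can be exercised without a real browser.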
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install browser-automation-skills
  3. After installation, invoke the skill by name or use /browser-automation-skills
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of browser-automation-skills:
  • Introduces a skill pack for browser automation with Google Chrome.
  • Enables navigation, screenshots, interaction, scraping, debugging, automated testing, and session recording.
  • Provides both a built-in subagent and a Playwright CLI script for flexibility.
  • Features a visual overlay to lock/unlock browser input during automation.
  • Includes multilingual documentation (English and Chinese).
Metadata
Slug: browser-automation-skills
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is browser-automation-skills?

browser-automation-skills is an AI Agent Skill for Claude Code / OpenClaw that lets AI models navigate, screenshot, interact with, scrape, debug, test, and record browser sessions in a local Google Chrome instance. It has 269 downloads so far.

How do I install browser-automation-skills?

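Run "/install browser-automation-skills" in the OpenClaw or Claude Code chat to install it in one step. Nothing else is fetched during installation; to use the Playwright CLI script you must also run pip install playwright and start Chrome with --remote-debugging-port=9222.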

Is browser-automation-skills free?

Yes, browser-automation-skills is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does browser-automation-skills support?

browser-automation-skills is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available, provided Google Chrome is installed locally.

Who created browser-automation-skills?

It is built and maintained by huaerye23 (@huaerye23); the current version is v1.0.0.
