
browser-automation-skills

Name: browser-automation-skills
Author: huaerye23

By huaerye23 · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

269

Total downloads

Current installs

Versions

Install in OpenClaw

/install browser-automation-skills

Description

Browser automation skills for AI models — navigate, screenshot, interact, scrape, debug, test, and record browser sessions. Controls local Google Chrome via...

Usage (SKILL.md)

Browser Automation Skills

A set of Skills that teach AI models how to control the local Google Chrome browser. It works with any model, either through the built-in browser_subagent or through the bundled Playwright CLI script.

Included Skills

Skill	Description	Invocation
`navigate`	Open URLs, read page content	Auto + `/navigate`
`screenshot`	Capture screenshots	Auto + `/screenshot`
`interact`	Click, type, fill forms	Auto + `/interact`
`scrape`	Extract structured data	Auto + `/scrape`
`debug`	Inspect network, console	Auto + `/debug`
`test`	Automated QA	`/test` only
`record`	Record sessions	`/record` only
`browser-context`	API reference	Model only

Prerequisites

Google Chrome installed locally
For Playwright CLI: pip install playwright
Start Chrome: chrome.exe --remote-debugging-port=9222
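
Before running the CLI it can help to confirm that Chrome is actually listening on the debugging port. The sketch below is not part of the skill; it assumes the port 9222 from the command above and relies on the `/json/version` discovery document that Chrome serves when remote debugging is enabled. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urljoin


def version_url(endpoint: str) -> str:
    """Build the URL of Chrome's /json/version discovery document."""
    return urljoin(endpoint.rstrip("/") + "/", "json/version")


def parse_version(payload: bytes) -> dict:
    """Extract the browser name and WebSocket URL from a /json/version body."""
    info = json.loads(payload)
    return {
        "browser": info.get("Browser"),
        "websocket": info.get("webSocketDebuggerUrl"),
    }


def probe(endpoint: str = "http://localhost:9222", timeout: float = 2.0):
    """Return version info if Chrome answers on the endpoint, else None."""
    try:
        with urllib.request.urlopen(version_url(endpoint), timeout=timeout) as resp:
            return parse_version(resp.read())
    except (OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
```

Calling `probe()` returns `None` when nothing is listening, which usually means Chrome was started without `--remote-debugging-port`.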

Quick Start

python scripts/browser.py status
python scripts/browser.py navigate https://example.com
python scripts/browser.py screenshot
python scripts/browser.py lock      # Visual overlay, block user input
python scripts/browser.py unlock    # Remove overlay, restore input

See docs/api-reference.md for full API documentation.
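
Because `lock` blocks user input until `unlock` runs, a caller may want to guarantee the unlock even when a step in between fails. The following is a minimal sketch of that idea, assuming the script lives at `scripts/browser.py` and accepts the subcommands listed above; `run_browser` and `locked` are hypothetical helpers, not part of the skill.

```python
import subprocess
import sys
from contextlib import contextmanager

SCRIPT = "scripts/browser.py"  # assumed path, taken from the Quick Start commands


def run_browser(*args, runner=subprocess.run):
    """Invoke the CLI with the given subcommand and arguments."""
    return runner([sys.executable, SCRIPT, *args], check=True)


@contextmanager
def locked(runner=subprocess.run):
    """Hold the input lock for the duration of the block, then always release it."""
    run_browser("lock", runner=runner)
    try:
        yield
    finally:
        # Runs even if the body raises, so the overlay is not left in place.
        run_browser("unlock", runner=runner)
```

A typical use would be `with locked(): run_browser("navigate", "https://example.com")`. The `runner` parameter exists only so the wrapper can be exercised without launching a browser.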

Security recommendations

This skill is functionally coherent for browser automation, but it contains high-impact actions you should be aware of before installing or enabling automatic use. Specific points to consider:

- Review the Python script yourself before running it; it connects to a Chrome DevTools Protocol (CDP) endpoint and drives your existing Chrome instance.
- The script reads the DOM and lists network requests and console logs, which can disclose sensitive data (tokens, cookies, secrets embedded in pages). Avoid running it on pages with private data unless you trust the skill and runtime.
- The skill supports login flows and typing into pages. Do not let it handle credentials unless you explicitly provide them in a controlled, ephemeral way you trust.
- The script honors a BROWSER_CDP_ENDPOINT env var (not declared in the manifest). Make sure it is set to a local-only endpoint (e.g., http://localhost:9222) and not exposed to the network; exposing CDP to a network address can let other systems control your browser.
- The 'lock' command injects a full-screen overlay that blocks user input; this is intrusive and could be abused. Favor manual invocation (do not enable broad autonomous invocation) and test in a safe environment first.
- If you decide to use it, run the script in an isolated account/VM, limit auto-invocation privileges, and only grant terminal access to trusted models.

If unsure, do not install or run the Playwright adapter until you can audit the code and runtime in a sandbox.
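
One way to reduce the exposure described above is to mask credential-bearing headers before any captured network data is logged or handed to a model. This is only an illustrative sketch: the header list and the `redact_headers` helper are assumptions, and nothing here reflects how the skill itself handles captured requests.

```python
# Header names that commonly carry credentials (an assumed, non-exhaustive list).
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "proxy-authorization"}


def redact_headers(headers: dict) -> dict:
    """Return a copy of the headers with credential-bearing values masked."""
    return {
        name: "[redacted]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Header matching is case-insensitive, and the input dictionary is left unmodified. Secrets embedded in URLs or page content would need separate handling.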

Functional analysis

Type: OpenClaw Skill
Name: browser-automation-skills
Version: 1.0.0

The bundle provides comprehensive browser automation capabilities by connecting to a local Chrome instance via the Chrome DevTools Protocol (CDP) using Playwright. This grants an AI agent full control over the user's active browser session, including access to logged-in accounts and sensitive data, which is an inherently high-risk capability. A notable feature in `scripts/browser.py` is the `lock` command, which injects a full-screen JavaScript overlay to block user input while the agent is active. While the code appears to be a legitimate implementation of its stated purpose and includes a user-accessible 'Stop' button in the overlay, the combination of browser session access and input suppression qualifies as a meaningful high-risk behavior.

Capability assessment

✓ Purpose & Capability

Name/description match the delivered artifacts: SKILL.md, README, API docs and scripts/browser.py implement navigation, screenshot, interact, scrape, debug, test and record via CDP/Playwright. The Playwright CLI adapter is a coherent implementation of the claimed capability.

⚠ Instruction Scope

Runtime instructions and the included script direct the agent to read full DOM, capture screenshots, list network requests, capture console logs, click/type (including login flows), and lock user input with a full-screen overlay. These behaviors go beyond passive observation: they permit active interaction with and inspection of arbitrary websites and can capture sensitive in-page data. The code also references environment variables and skill directory paths (e.g., BROWSER_CDP_ENDPOINT, CLAUDE_SKILL_DIR) that are not declared in the skill metadata.

ℹ Install Mechanism

No install spec in registry (instruction-only skill plus one Python script). The script requires the user to run 'pip install playwright' themselves; nothing is fetched automatically during installation. This lowers supply-chain risk but means runtime dependency installation and execution are manual and must be audited by the user.

⚠ Credentials

The manifest lists no required env vars, but the code uses BROWSER_CDP_ENDPOINT (defaulting to http://localhost:9222) and documentation references CLAUDE_SKILL_DIR. Allowing the CDP endpoint to be set by env var means the skill can be pointed at an arbitrary CDP host (including a remote host) without that being declared. Additionally, network/console inspection via CDP can expose sensitive tokens or page-injected secrets. The skill does not request credentials itself, but instructions (and the 'interact' skill) explicitly describe performing login flows, which could cause credentials to be entered or captured.
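
The concern about an undeclared, overridable endpoint could be mitigated by checking the resolved value before connecting. The sketch below assumes the variable name and default stated above; `is_loopback` and `resolve_endpoint` are hypothetical helpers and are not taken from `scripts/browser.py`.

```python
import ipaddress
import os
from urllib.parse import urlparse

DEFAULT_ENDPOINT = "http://localhost:9222"  # default reported in the assessment above


def is_loopback(endpoint: str) -> bool:
    """Return True only if the endpoint's host is localhost or a loopback IP."""
    host = urlparse(endpoint).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Some other DNS name; treat it as non-local rather than resolving it.
        return False


def resolve_endpoint(env=os.environ) -> str:
    """Read BROWSER_CDP_ENDPOINT, falling back to the default, and reject remote hosts."""
    endpoint = env.get("BROWSER_CDP_ENDPOINT", DEFAULT_ENDPOINT)
    if not is_loopback(endpoint):
        raise ValueError(f"refusing non-local CDP endpoint: {endpoint}")
    return endpoint
```

Rejecting unknown hostnames outright is a deliberately conservative choice: it avoids a DNS lookup and fails closed when the value is ambiguous.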

⚠ Persistence & Privilege

always:false (good) and autonomous invocation is allowed (normal), but combined with the skill's ability to lock user input, persist CLI session state, inspect network/console data, and perform multi-step interactions, autonomous execution increases potential for undesired or surprising actions. The 'lock' feature (injecting an overlay to block user input) is especially intrusive and increases the blast radius if the skill is invoked without careful controls.

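How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install browser-automation-skills
Once installed, call the Skill by name or trigger it with /browser-automation-skills
Provide the required inputs according to the Skill's parameter descriptions to get structured output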

Version history

v1.0.0

Initial release of browser-automation-skills:

- Introduces a skill pack for browser automation with Google Chrome.
- Enables navigation, screenshots, interaction, scraping, debugging, automated testing, and session recording.
- Provides both built-in subagent and Playwright CLI script for flexibility.
- Features visual overlay to lock/unlock browser input during automation.
- Includes multilingual documentation (English and Chinese).

Metadata

Slug browser-automation-skills

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

FAQ