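Description

Automates a real browser through Chrome's debug mode (CDP): navigation, clicking, form filling, data extraction, screenshots, and multi-step flows. Use it when the user says things like "operate this web page for me", "open the browser", "click this for me", "fill this in for me", "scrape this for me", "take a screenshot for me", or "auto-browser".

Usage instructions (SKILL.md)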

auto-browser

Name: auto-browser
Author: scottliu007

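A remote control for your browser. Navigate, click, fill in forms, scrape data, take screenshots: one sentence is all it takes.

Usage

/auto-browser open GitHub and check my notifications
/auto-browser go to Amazon, search for mechanical keyboard, and screenshot the first three results
/auto-browser log in to the admin console and export the order list
/auto-browser click "Next" on this page, then fill in the address form

Just describe what you want in natural language; no code is required.

The playwright-cdp toolset is required

All browser operations use the user-playwright-cdp tools, which connect to the real Chrome and keep its login state.

✅ Use: user-playwright-cdp's browser_navigate, browser_snapshot, browser_click, and so on
❌ Forbidden: the same-named cursor-ide-browser tools (sandboxed browser, no login state)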

Core workflow

0. Make sure Chrome Debug is online

curl -s http://127.0.0.1:9222/json/version

✅ Got a response → continue
❌ No response → start it directly, without asking the user (macOS path shown):

nohup /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome_debug_profile" \
  > /tmp/chrome_debug.log 2>&1 &

Wait 2 seconds, then run curl again to confirm.
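
The check, start, and re-check sequence in step 0 can be sketched as a small shell script. This is a minimal sketch, not part of the skill itself: the function names `cdp_alive`, `start_chrome_debug`, and `ensure_chrome_debug` and the `CDP_URL` and `CHROME_BIN` variables are hypothetical, and the 2-second curl timeout is an added assumption. The Chrome path, flags, and log file are the ones shown above.

```shell
#!/bin/sh
# Sketch of step 0: probe the CDP endpoint, start Chrome if needed, re-probe.
# CDP_URL and CHROME_BIN are hypothetical overrides; defaults match the text.
CDP_URL="${CDP_URL:-http://127.0.0.1:9222/json/version}"
CHROME_BIN="${CHROME_BIN:-/Applications/Google Chrome.app/Contents/MacOS/Google Chrome}"

# Returns 0 when the debug endpoint answers within 2 seconds (assumed timeout).
cdp_alive() {
  curl -s --max-time 2 "$CDP_URL" >/dev/null 2>&1
}

# Starts Chrome in the background with the same flags as the command above.
start_chrome_debug() {
  nohup "$CHROME_BIN" \
    --remote-debugging-port=9222 \
    --user-data-dir="/tmp/chrome_debug_profile" \
    > /tmp/chrome_debug.log 2>&1 &
}

# Probe once; if there is no response, start Chrome, wait 2 seconds, re-probe.
# The exit status reflects the final probe.
ensure_chrome_debug() {
  cdp_alive && return 0
  start_chrome_debug
  sleep 2
  cdp_alive
}
```

Calling `ensure_chrome_debug` and checking its exit status tells the caller whether it is safe to move on to step 1. Nothing runs until the function is called, so the script can be sourced and reviewed first.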

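1. Perceive: snapshot first

Run browser_snapshot before every operation to get the page's accessibility tree. This is your "eyes".

- Use the ref values from the snapshot to locate elements
- When unsure about the page state, take a snapshot before deciding the next step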

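2. Act: choose the right operation

Choose operations based on the user's intent; they can be combined freely:

| Intent | Tool | Notes |
| --- | --- | --- |
| Open a web page | `browser_navigate` | Infer the URL and navigate directly |
| Go back | `browser_navigate_back` | |
| Click a button/link | `browser_click` | Use the ref from the snapshot |
| Fill an input field | `browser_type` (appends) / `browser_fill_form` (clears, then fills) | |
| Choose from a dropdown | `browser_select_option` | |
| Upload a file | `browser_file_upload` | Requires an absolute path |
| Press a key | `browser_press_key` | Enter, Escape, Tab, and so on |
| Hover | `browser_hover` | Expands menus and tooltips |
| Drag and drop | `browser_drag` | startRef → endRef |
| Handle a dialog | `browser_handle_dialog` | alert/confirm/prompt |
| Wait for loading | `browser_wait_for` | Wait for a time, or for text to appear/disappear |
| Run JS | `browser_evaluate` | Fallback when the page exposes no UI for the task |
| Manage tabs | `browser_tabs` | list/new/close/select |
| Resize the window | `browser_resize` | Test responsive layouts |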

3. Confirm: screenshot feedback

When the operation is done, call browser_take_screenshot and show the screenshot to the user to confirm the result.

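Operating principles

URL inference

Infer the URL directly from the user's description and navigate; do not ask back:

- "Open Google" → https://www.google.com
- "Go to GitHub" → https://github.com
- "Take a look at V2EX" → https://www.v2ex.com
- Vague description → use common sense, then take a screenshot after navigating to confirm

Ask only when the URL cannot be inferred at all.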

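Multi-step operations

Break complex tasks into steps, each following: snapshot → act → wait → snapshot to confirm.

Example: "log in to the admin console and export the orders"
1. navigate to the login page → snapshot
2. If already logged in, skip; otherwise fill in the account and password → click Log in
3. wait_for the page to load → snapshot
4. click through to the orders page → snapshot
5. click the export button → screenshot to confirm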

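Wait strategy

After a page change (navigation, click, submit), wait in short intervals and confirm with a snapshot:

browser_wait_for time=2 → snapshot → check whether the page is ready
→ Not ready? browser_wait_for time=2 again → snapshot

Do not wait too long in one go. Use 2-3 seconds per round and retry at most 3 times.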
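
The bounded wait-and-check loop above can be illustrated outside the agent as a generic shell helper. This is a sketch under assumptions: `poll_until` is a hypothetical name, it reads "at most 3" as three rounds in total, and the readiness check is whatever command the caller passes in (in the skill itself, that role is played by inspecting a fresh snapshot).

```shell
#!/bin/sh
# poll_until MAX_TRIES INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to MAX_TRIES times, sleeping INTERVAL_SECONDS between
# attempts. Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
poll_until() {
  max_tries=$1
  interval=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$attempt" -lt "$max_tries" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example (page_is_ready is a hypothetical check supplied by the caller):
#   poll_until 3 2 page_is_ready || echo "page still not ready, giving up"
```

Short rounds with a hard cap keep the total wait bounded (here, at most two 2-second sleeps) while still catching pages that become ready early.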

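Data extraction

When data needs to be scraped from a page:

1. Call browser_snapshot to get the page structure
2. Extract the required information from the accessibility tree
3. If that is not enough, use browser_evaluate to run JS and extract it
4. If the result is too long, write it to a file and reply with a summary plus the path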
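
The last step (write long results to a file, reply with a summary and a path) might look like the following when done from a shell. It is only a sketch: `save_result`, the temp-file name pattern, and the choice of a 5-line preview as the "summary" are all assumptions, not something the skill prescribes.

```shell
#!/bin/sh
# save_result
# Reads extracted text on stdin, writes it to a new temp file, then prints
# the file path, the line count, and the first 5 lines as a short preview.
save_result() {
  out_file=$(mktemp "${TMPDIR:-/tmp}/auto_browser_result.XXXXXX") || return 1
  cat > "$out_file"
  line_count=$(wc -l < "$out_file" | tr -d ' ')
  echo "path: $out_file"
  echo "lines: $line_count"
  head -n 5 "$out_file"
}

# Example: some_extraction_command | save_result
# (some_extraction_command is a placeholder for whatever produced the data.)
```

Keeping the full output on disk and returning only the path and a preview keeps the reply short without losing data.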

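Form filling

Form-filling scenarios follow the auto-fill rules:

- Match fields semantically; exact matches are not required
- List any fields you are unsure about and ask the user
- Confirm before filling password-type fields
- For multiple fields, prefer batch filling with browser_fill_form
- Take a screenshot after filling to confirm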

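Safety boundaries

| Operation | Agent does it directly | Requires user confirmation |
| --- | --- | --- |
| Navigating, browsing, taking screenshots | ✅ | |
| Clicking ordinary buttons/links | ✅ | |
| Filling in non-sensitive fields | ✅ | |
| Reading/extracting page data | ✅ | |
| Filling in passwords | | ✅ |
| Clicking "Submit", "Pay", or "Delete" | | ✅ (unless the user explicitly says "submit it for me") |
| Sending messages/emails | | ✅ |
| Closing tabs | | ✅ (unless the user asks for it) |

Principle: read-only operations are done freely; writes and irreversible operations are confirmed first.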
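
The table above amounts to a simple lookup from action to "confirm first or not". A minimal sketch of that lookup follows; the function name and the action labels are hypothetical stand-ins for the table rows, the two "unless the user asks" exceptions are not modeled, and defaulting unknown actions to "confirm" is an added conservative assumption.

```shell
#!/bin/sh
# needs_confirmation ACTION
# Prints "yes" if the action must be confirmed by the user first, "no" if the
# agent may do it directly. Action names are hypothetical labels for the rows
# of the safety table.
needs_confirmation() {
  case "$1" in
    navigate|browse|screenshot|click|fill_field|extract_data)
      echo no ;;
    fill_password|submit|pay|delete|send_message|close_tab)
      echo yes ;;
    *)
      # Unknown actions are treated like writes (assumed default).
      echo yes ;;
  esac
}

# Example: [ "$(needs_confirmation pay)" = "yes" ] && echo "ask the user first"
```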

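Error handling

| Problem | Handling |
| --- | --- |
| Element not found | Take a new snapshot; the ref may have changed |
| Page loads slowly | Wait with `browser_wait_for`; do not retry blindly |
| Dialog blocks the page | Handle it with `browser_handle_dialog` |
| Login required | Tell the user, then continue after the user logs in manually |
| CAPTCHA / human verification | Take a screenshot, inform the user, and wait for the user to handle it |
| JS error | Check the error log with `browser_console_messages` |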

Environment setup (first time)

If ~/.cursor/mcp.json has no playwright-cdp entry, add:

"playwright-cdp": {
  "command": "npx",
  "args": ["-y", "@playwright/mcp@latest", "--cdp-endpoint", "http://127.0.0.1:9222"]
}

After adding it, prompt the user to reload MCP (Cursor Settings → MCP → Reload).
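
Before editing the file, it may help to check whether the entry is already there. The sketch below only detects the key and never modifies the config; `has_playwright_cdp` is a hypothetical helper, and it uses a plain text match rather than parsing JSON, so it can be fooled by the same string appearing elsewhere in the file.

```shell
#!/bin/sh
# has_playwright_cdp [CONFIG_FILE]
# Returns 0 if the file exists and mentions a "playwright-cdp" key.
# This is a text match, not a JSON parse. Defaults to ~/.cursor/mcp.json.
has_playwright_cdp() {
  config_file="${1:-$HOME/.cursor/mcp.json}"
  [ -f "$config_file" ] && grep -q '"playwright-cdp"' "$config_file"
}

# Example:
#   has_playwright_cdp || echo "playwright-cdp entry missing; add it, then reload MCP"
```

Keeping the check read-only leaves the actual edit as a deliberate, reviewable step.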

Security recommendations

This skill can control a real Chrome instance, modify your Cursor MCP config, and trigger network installs via npx. Before installing or using it, consider:

1. The SKILL.md will try to start Chrome automatically, using a hardcoded macOS binary path. Decide whether you want the agent to run system commands without asking.
2. CDP access (localhost:9222) can expose cookies, tokens, and any page data. Only use this skill when you trust the agent and have no sensitive pages open.
3. The skill instructs adding an npx @playwright/mcp command to ~/.cursor/mcp.json, which will cause transient downloads from npm when invoked. Review that package, and prefer manual installation if you want to vet the code.
4. The metadata does not declare required tools (curl, nohup, npx, Chrome), so expect to provide or permit these at runtime.

If you proceed, require explicit confirmations for sensitive actions (password fills, submits, payments) and review any config edits before they are applied.

Analysis

Type: OpenClaw Skill
Name: auto-browser
Version: 1.0.0

The 'auto-browser' skill is classified as suspicious because its instructions (SKILL.md) direct the agent to execute a shell command to launch Chrome with remote debugging enabled (CDP) without seeking user confirmation ("start it directly, without asking the user"). Furthermore, it instructs the agent to modify the sensitive IDE configuration file '~/.cursor/mcp.json' to add a new MCP tool. While these actions are functionally related to browser automation, the explicit instruction to bypass user consent for process execution and the modification of system-level configuration files represent high-risk behaviors.

Capability assessment

ℹ Purpose & Capability

The skill's name/description (real-browser automation via CDP) matches the runtime instructions: navigation, clicks, form-filling, snapshots, JS evaluation, screenshots. However the metadata claims no required binaries or env vars while SKILL.md expects curl, nohup (shell), a local Chrome installation, and npx/@playwright/mcp — a mismatch between declared requirements and actual operational needs.

⚠ Instruction Scope

Instructions tell the agent to check http://127.0.0.1:9222 and, if no response, automatically start Chrome using a hardcoded macOS path and nohup (without asking the user). It also instructs modifying ~/.cursor/mcp.json to add a command that will run npx @playwright/mcp. The skill allows executing arbitrary page JS (browser_evaluate) and reading/extracting page data and writing results to disk. These are within a browser-automation use case but the automatic process-start and config-edit behaviors expand scope and pose privacy/permission risks.

⚠ Install Mechanism

There is no formal install spec, but the skill instructs to configure MCP to call npx @playwright/mcp (which will fetch code from npm at runtime). That means the agent may cause network installs and execute third-party code via npx. The skill also assumes standard shell tools (curl, nohup) and direct execution of the Chrome binary; these implicit dependencies are not declared, increasing risk.

⚠ Credentials

The skill declares no required credentials, yet connecting to a local CDP endpoint (127.0.0.1:9222) gives powerful access to the user's browser session (cookies, localStorage, active logins). The SKILL.md does partially mitigate by requiring user confirmation for sensitive write/irreversible actions (password fills, submits, payments), but the ability to automatically start and control a browser without explicit per-action consent is high privilege and should be treated cautiously.

ℹ Persistence & Privilege

The skill directs adding a 'playwright-cdp' entry to ~/.cursor/mcp.json (a persistent change to MCP configuration). Writing its own agent config is expected for this type of tool, but the instruction recommends doing so programmatically and then asking the user to reload — users should be aware this modifies a local config file.

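Version history

v1.0.0

auto-browser 1.0.0

- First release; supports automating real browser operations through the Chrome debugging protocol
- Completes multi-step web operations such as navigation, clicking, form filling, data scraping, and screenshots from a single natural-language description
- All operations are based on the Playwright-CDP tools and keep the real browser's login state; the sandboxed environment is forbidden
- Automatically detects and starts Chrome Debug mode, with no user intervention
- Takes a page snapshot before every step for robust element location and result confirmation
- Built-in multi-step flow orchestration, safety-boundary controls, and handling for common errors
- Provides detailed Chinese-language usage instructions and first-time environment setup guidance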

Metadata

Slug auto-browser

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions released 1

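FAQ

What is auto-browser?

It automates a real browser through Chrome's debug mode (CDP): navigation, clicking, form filling, data extraction, screenshots, and multi-step flows. Use it when the user says things like "operate this web page for me", "open the browser", "click this for me", "fill this in for me", "scrape this for me", "take a screenshot for me", or "auto-browser". It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 315 downloads to date.

How do I install auto-browser?

Run the command "/install auto-browser" in an OpenClaw or Claude Code chat to install it in one step; no additional configuration is needed.

Is auto-browser free?

Yes. auto-browser is completely free and uses the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does auto-browser support?

auto-browser is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops auto-browser?

It is developed and maintained by fanzhuo (@scottliu007); the current version is v1.0.0.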
