← 返回 Skills 市场
buttercannfly

broswer use skill

作者 ropzislaw · GitHub ↗ · v1.0.0 · MIT-0
macoslinuxwindows ⚠ suspicious
108
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install browser-cli-skills
功能描述
Control Chrome browsers from the terminal via the AIPex extension. Use this skill when the agent needs to manage browser tabs, search page elements, click bu...
使用说明 (SKILL.md)

browser-cli — Terminal Browser Control

browser-cli is a command-line tool that controls Chrome browsers through the AIPex extension's WebSocket daemon. It translates shell commands into browser actions — managing tabs, clicking elements, filling forms, capturing screenshots, and more.

Architecture:

browser-cli ──WebSocket──▶ aipex-daemon ──WebSocket──▶ AIPex Chrome Extension ──▶ Browser APIs

The daemon auto-spawns on first use and self-terminates when idle. No manual setup beyond initial extension connection.


When to Use This Skill

Use this skill when the user wants to:

  • Control a Chrome browser from the terminal without an MCP client
  • Open, close, switch, or organize browser tabs via CLI
  • Search for page elements and interact with them (click, fill, hover)
  • Capture screenshots of browser tabs
  • Automate browser workflows in shell scripts or CI pipelines
  • Download page content as markdown or images
  • Request human input during automated browser tasks
  • Manage AIPex skills from the command line

Trigger phrases: "browser-cli", "control browser from terminal", "browser automation CLI", "click element from shell", "terminal browser control", "command line browser", "shell browser automation"


Prerequisites

  • Node.js >= 18 installed
  • AIPex Chrome extension installed and connected to the daemon
  • browser-cli installed globally: npm install -g browser-cli

First-time Setup

After installing, connect the AIPex extension to the daemon:

  1. Open Chrome → AIPex extension icon → Options
  2. Set WebSocket URL to ws://localhost:9223/extension
  3. Click Connect
  4. Verify: browser-cli status

Command Groups

tab — Manage browser tabs

browser-cli tab list                         # List all open tabs
browser-cli tab current                      # Get the active tab
browser-cli tab new https://example.com      # Open a new tab
browser-cli tab switch 42                    # Switch to tab by ID
browser-cli tab close 42                     # Close a tab
browser-cli tab info 42                      # Get tab details
browser-cli tab organize                     # AI-powered tab grouping
browser-cli tab ungroup                      # Remove all tab groups

page — Inspect and interact with page content

browser-cli page search "button*" --tab 123              # Search elements by glob pattern
browser-cli page search "{input,textarea}*" --tab 123    # Search multiple element types
browser-cli page screenshot                               # Screenshot active tab
browser-cli page screenshot-tab 123 --send-to-llm true   # Screenshot with LLM analysis
browser-cli page metadata --tab 123                       # Get page metadata
browser-cli page scroll-to "#main-content"                # Scroll to element
browser-cli page highlight "button.submit"                # Highlight element
browser-cli page highlight-text "p" "important"           # Highlight text in content

interact — Click, fill, hover, and type

browser-cli interact click btn-42 --tab 123                         # Click by UID
browser-cli interact fill input-5 "hello world" --tab 123           # Fill input by UID
browser-cli interact hover menu-3 --tab 123                         # Hover by UID
browser-cli interact form --tab 123 --elements '[{"uid":"in-1","value":"foo"}]'  # Batch fill
browser-cli interact editor editor-1 --tab 123                     # Get editor content
browser-cli interact upload --tab 123 --file-path /path/to/file    # Upload file
browser-cli interact computer --action left_click --coordinate "[500,300]"  # Pixel-level click

download — Save content locally

browser-cli download markdown --text "# Notes" --filename notes    # Save as markdown
browser-cli download image --data "data:image/png;base64,..."      # Save image
browser-cli download chat-images --messages '[...]' --folder imgs   # Batch save images

intervention — Request human input

browser-cli intervention list                                       # List intervention types
browser-cli intervention info voice-input                           # Get type details
browser-cli intervention request voice-input --reason "Need input"  # Request intervention
browser-cli intervention cancel                                     # Cancel active request

skill — Manage AIPex skills

browser-cli skill list                                    # List all skills
browser-cli skill load my-skill                           # Load skill content
browser-cli skill info my-skill                           # Skill details
browser-cli skill run my-skill scripts/init.js            # Execute skill script
browser-cli skill ref my-skill references/guide.md        # Read skill reference
browser-cli skill asset my-skill assets/icon.png          # Get skill asset

Standalone commands

browser-cli status    # Check daemon + extension connection
browser-cli update    # Self-update to latest version

Workflow: Search, Interact, Verify

The recommended pattern for browser automation:

# 1. Discover tabs
browser-cli tab list

# 2. Search for elements (fast, no screenshot needed)
browser-cli page search "{button,input,link}*" --tab 123

# 3. Interact using UIDs from search results
browser-cli interact click btn-submit --tab 123
# or
browser-cli interact fill input-email "[email protected]" --tab 123

# 4. Verify visually (only when needed)
browser-cli page screenshot

Login form example

browser-cli page search "{input,textbox}*" --tab 123
browser-cli interact fill input-email "[email protected]" --tab 123
browser-cli interact fill input-pass "secret" --tab 123
browser-cli page search "*[Ll]ogin*" --tab 123
browser-cli interact click btn-login --tab 123

Shell script example

#!/bin/bash
browser-cli tab new https://example.com
sleep 2
TAB_ID=$(browser-cli tab current | jq '.data.id')
browser-cli page search "link*" --tab "$TAB_ID"
browser-cli page screenshot

Global Options

Option Default Description
--port \x3Cn> 9223 Daemon WebSocket port
--host \x3Ch> 127.0.0.1 Daemon host address

Environment Variables

Variable Default Description
BROWSER_CLI_WS_URL ws://127.0.0.1:9223/cli Override daemon WebSocket URL
BROWSER_CLI_CONNECT_TIMEOUT 60000 Connection timeout (ms)

Troubleshooting

Symptom Fix
Daemon not running Run any command to auto-spawn, or check with browser-cli status
Extension is not connected Open AIPex Options → WebSocket URL ws://localhost:9223/extension → Connect
Port 9223 in use Use --port 9224 and update extension URL
Timeout after 60s Verify extension is connected. Increase with BROWSER_CLI_CONNECT_TIMEOUT=120000
0 search results Try different patterns. Fall back to page screenshot --send-to-llm true + interact computer
Wrong results Verify tab ID with browser-cli tab list
安全使用建议
This skill appears to be what it claims (a Node-based CLI controlling Chrome via an AIPex extension), but its documented commands let an agent read local files, upload them to pages, capture and send screenshots, and execute skill scripts. Before installing: (1) review the browser-cli npm package and AIPex extension source on GitHub to verify authorship and code, (2) avoid granting autonomous agent invocation for this skill or require user confirmation for any file-access or script-run actions, (3) test in an isolated environment (VM/container) first, (4) audit any 'skill run' scripts you intend to allow, and (5) monitor network and filesystem activity while using it. If you don't trust the upstream code or cannot limit file/script access, do not install.
功能分析
Type: OpenClaw Skill Name: browser-cli-skills Version: 1.0.0 The browser-cli skill provides extensive browser automation capabilities, including high-risk features such as executing arbitrary scripts (browser-cli skill run), self-updating the tool (browser-cli update), and pixel-level interaction. While these capabilities are plausibly required for the stated purpose of terminal-based browser control, they represent a significant attack surface and potential for misuse. No evidence of intentional malice, data exfiltration, or prompt injection was found in SKILL.md or _meta.json.
能力评估
Purpose & Capability
Name/description, required binaries (node, npm), and the CLI-centric commands are coherent: a Node CLI controlling Chrome via an extension reasonably needs Node and npm and the listed tab/page/interact features.
Instruction Scope
SKILL.md instructs use of commands that can read local paths (upload --file-path), save files, capture screenshots, and run skill scripts ('skill run ... scripts/init.js'). Those capabilities let an agent access arbitrary local files and execute code on the host if invoked — scope is broader than simple browser control and could be used to exfiltrate data or run local scripts.
Install Mechanism
There is no install spec in the package (instruction-only). The document instructs a standard 'npm install -g browser-cli' which is expected for a Node CLI; no download-from-arbitrary-URL or archive extraction is specified by the skill itself.
Credentials
The skill does not request credentials or environment variables (proportionate), but the documented commands allow reading and uploading arbitrary filesystem paths and invoking local skill scripts — the lack of declared secrets is good, but the CLI-level file access is an implicit privilege that the SKILL.md does not constrain.
Persistence & Privilege
always is false (good). The skill is allowed to be invoked autonomously by default (platform normal). Combined with the broad file/script operations in the instructions, autonomous invocation increases blast radius — consider requiring user confirmation before file access or script execution.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install browser-cli-skills
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /browser-cli-skills 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Initial release of browser-cli: Terminal-based Chrome browser automation via the AIPex extension. - Control Chrome tabs, interact with page elements, fill forms, and capture screenshots entirely from the command line. - Integrate browser automation with shell scripts, CI pipelines, and manual workflows — no MCP client required. - Request human interventions during automation, download page content, and organize tabs with AI. - CLI includes tab, page, interact, download, intervention, skill management, and standalone utility commands. - Requires Node.js ≥18, AIPex Chrome extension, and a simple WebSocket setup.
元数据
Slug browser-cli-skills
版本 1.0.0
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 1
常见问题

broswer use skill 是什么?

Control Chrome browsers from the terminal via the AIPex extension. Use this skill when the agent needs to manage browser tabs, search page elements, click bu... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 108 次。

如何安装 broswer use skill?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install browser-cli-skills」即可一键安装,无需额外配置。

broswer use skill 是免费的吗?

是的,broswer use skill 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

broswer use skill 支持哪些平台?

broswer use skill 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(macos, linux, windows)。

谁开发了 broswer use skill?

由 ropzislaw(@buttercannfly)开发并维护,当前版本 v1.0.0。

💬 留言讨论