功能描述

Control Chrome browsers from the terminal via the AIPex extension. Use this skill when the agent needs to manage browser tabs, search page elements, click bu...

使用说明 (SKILL.md)

browser-cli — Terminal Browser Control

Name: broswer use skill
Author: buttercannfly

browser-cli is a command-line tool that controls Chrome browsers through the AIPex extension's WebSocket daemon. It translates shell commands into browser actions — managing tabs, clicking elements, filling forms, capturing screenshots, and more.

Architecture:

browser-cli ──WebSocket──▶ aipex-daemon ──WebSocket──▶ AIPex Chrome Extension ──▶ Browser APIs

The daemon auto-spawns on first use and self-terminates when idle. No manual setup beyond initial extension connection.

When to Use This Skill

Use this skill when the user wants to:

Control a Chrome browser from the terminal without an MCP client
Open, close, switch, or organize browser tabs via CLI
Search for page elements and interact with them (click, fill, hover)
Capture screenshots of browser tabs
Automate browser workflows in shell scripts or CI pipelines
Download page content as markdown or images
Request human input during automated browser tasks
Manage AIPex skills from the command line

Trigger phrases: "browser-cli", "control browser from terminal", "browser automation CLI", "click element from shell", "terminal browser control", "command line browser", "shell browser automation"

Prerequisites

Node.js >= 18 installed
AIPex Chrome extension installed and connected to the daemon
browser-cli installed globally: npm install -g browser-cli

First-time Setup

After installing, connect the AIPex extension to the daemon:

Open Chrome → AIPex extension icon → Options
Set WebSocket URL to ws://localhost:9223/extension
Click Connect
Verify: browser-cli status

Command Groups

tab — Manage browser tabs

browser-cli tab list                         # List all open tabs
browser-cli tab current                      # Get the active tab
browser-cli tab new https://example.com      # Open a new tab
browser-cli tab switch 42                    # Switch to tab by ID
browser-cli tab close 42                     # Close a tab
browser-cli tab info 42                      # Get tab details
browser-cli tab organize                     # AI-powered tab grouping
browser-cli tab ungroup                      # Remove all tab groups

page — Inspect and interact with page content

browser-cli page search "button*" --tab 123              # Search elements by glob pattern
browser-cli page search "{input,textarea}*" --tab 123    # Search multiple element types
browser-cli page screenshot                               # Screenshot active tab
browser-cli page screenshot-tab 123 --send-to-llm true   # Screenshot with LLM analysis
browser-cli page metadata --tab 123                       # Get page metadata
browser-cli page scroll-to "#main-content"                # Scroll to element
browser-cli page highlight "button.submit"                # Highlight element
browser-cli page highlight-text "p" "important"           # Highlight text in content

interact — Click, fill, hover, and type

browser-cli interact click btn-42 --tab 123                         # Click by UID
browser-cli interact fill input-5 "hello world" --tab 123           # Fill input by UID
browser-cli interact hover menu-3 --tab 123                         # Hover by UID
browser-cli interact form --tab 123 --elements '[{"uid":"in-1","value":"foo"}]'  # Batch fill
browser-cli interact editor editor-1 --tab 123                     # Get editor content
browser-cli interact upload --tab 123 --file-path /path/to/file    # Upload file
browser-cli interact computer --action left_click --coordinate "[500,300]"  # Pixel-level click

download — Save content locally

browser-cli download markdown --text "# Notes" --filename notes    # Save as markdown
browser-cli download image --data "data:image/png;base64,..."      # Save image
browser-cli download chat-images --messages '[...]' --folder imgs   # Batch save images

intervention — Request human input

browser-cli intervention list                                       # List intervention types
browser-cli intervention info voice-input                           # Get type details
browser-cli intervention request voice-input --reason "Need input"  # Request intervention
browser-cli intervention cancel                                     # Cancel active request

skill — Manage AIPex skills

browser-cli skill list                                    # List all skills
browser-cli skill load my-skill                           # Load skill content
browser-cli skill info my-skill                           # Skill details
browser-cli skill run my-skill scripts/init.js            # Execute skill script
browser-cli skill ref my-skill references/guide.md        # Read skill reference
browser-cli skill asset my-skill assets/icon.png          # Get skill asset

Standalone commands

browser-cli status    # Check daemon + extension connection
browser-cli update    # Self-update to latest version

Workflow: Search, Interact, Verify

The recommended pattern for browser automation:

# 1. Discover tabs
browser-cli tab list

# 2. Search for elements (fast, no screenshot needed)
browser-cli page search "{button,input,link}*" --tab 123

# 3. Interact using UIDs from search results
browser-cli interact click btn-submit --tab 123
# or
browser-cli interact fill input-email "[email protected]" --tab 123

# 4. Verify visually (only when needed)
browser-cli page screenshot

Login form example

browser-cli page search "{input,textbox}*" --tab 123
browser-cli interact fill input-email "[email protected]" --tab 123
browser-cli interact fill input-pass "secret" --tab 123
browser-cli page search "*[Ll]ogin*" --tab 123
browser-cli interact click btn-login --tab 123

Shell script example

#!/bin/bash
browser-cli tab new https://example.com
sleep 2
TAB_ID=$(browser-cli tab current | jq '.data.id')
browser-cli page search "link*" --tab "$TAB_ID"
browser-cli page screenshot

Global Options

Option	Default	Description
`--port \x3Cn>`	`9223`	Daemon WebSocket port
`--host \x3Ch>`	`127.0.0.1`	Daemon host address

Environment Variables

Variable	Default	Description
`BROWSER_CLI_WS_URL`	`ws://127.0.0.1:9223/cli`	Override daemon WebSocket URL
`BROWSER_CLI_CONNECT_TIMEOUT`	`60000`	Connection timeout (ms)

Troubleshooting

Symptom	Fix
`Daemon not running`	Run any command to auto-spawn, or check with `browser-cli status`
`Extension is not connected`	Open AIPex Options → WebSocket URL `ws://localhost:9223/extension` → Connect
Port 9223 in use	Use `--port 9224` and update extension URL
Timeout after 60s	Verify extension is connected. Increase with `BROWSER_CLI_CONNECT_TIMEOUT=120000`
0 search results	Try different patterns. Fall back to `page screenshot --send-to-llm true` + `interact computer`
Wrong results	Verify tab ID with `browser-cli tab list`

安全使用建议

This skill appears to be what it claims (a Node-based CLI controlling Chrome via an AIPex extension), but its documented commands let an agent read local files, upload them to pages, capture and send screenshots, and execute skill scripts. Before installing: (1) review the browser-cli npm package and AIPex extension source on GitHub to verify authorship and code, (2) avoid granting autonomous agent invocation for this skill or require user confirmation for any file-access or script-run actions, (3) test in an isolated environment (VM/container) first, (4) audit any 'skill run' scripts you intend to allow, and (5) monitor network and filesystem activity while using it. If you don't trust the upstream code or cannot limit file/script access, do not install.

功能分析

Type: OpenClaw Skill Name: browser-cli-skills Version: 1.0.0 The browser-cli skill provides extensive browser automation capabilities, including high-risk features such as executing arbitrary scripts (browser-cli skill run), self-updating the tool (browser-cli update), and pixel-level interaction. While these capabilities are plausibly required for the stated purpose of terminal-based browser control, they represent a significant attack surface and potential for misuse. No evidence of intentional malice, data exfiltration, or prompt injection was found in SKILL.md or _meta.json.

能力评估

✓ Purpose & Capability

Name/description, required binaries (node, npm), and the CLI-centric commands are coherent: a Node CLI controlling Chrome via an extension reasonably needs Node and npm and the listed tab/page/interact features.

⚠ Instruction Scope

SKILL.md instructs use of commands that can read local paths (upload --file-path), save files, capture screenshots, and run skill scripts ('skill run ... scripts/init.js'). Those capabilities let an agent access arbitrary local files and execute code on the host if invoked — scope is broader than simple browser control and could be used to exfiltrate data or run local scripts.

✓ Install Mechanism

There is no install spec in the package (instruction-only). The document instructs a standard 'npm install -g browser-cli' which is expected for a Node CLI; no download-from-arbitrary-URL or archive extraction is specified by the skill itself.

ℹ Credentials

The skill does not request credentials or environment variables (proportionate), but the documented commands allow reading and uploading arbitrary filesystem paths and invoking local skill scripts — the lack of declared secrets is good, but the CLI-level file access is an implicit privilege that the SKILL.md does not constrain.

ℹ Persistence & Privilege

always is false (good). The skill is allowed to be invoked autonomously by default (platform normal). Combined with the broad file/script operations in the instructions, autonomous invocation increases blast radius — consider requiring user confirmation before file access or script execution.

版本历史

v1.0.0

Initial release of browser-cli: Terminal-based Chrome browser automation via the AIPex extension. - Control Chrome tabs, interact with page elements, fill forms, and capture screenshots entirely from the command line. - Integrate browser automation with shell scripts, CI pipelines, and manual workflows — no MCP client required. - Request human interventions during automation, download page content, and organize tabs with AI. - CLI includes tab, page, interact, download, intervention, skill management, and standalone utility commands. - Requires Node.js ≥18, AIPex Chrome extension, and a simple WebSocket setup.

元数据

Slug browser-cli-skills

版本 1.0.0

许可证 MIT-0

累计安装 0

当前安装数 0

历史版本数 1

常见问题

broswer use skill 是什么？

Control Chrome browsers from the terminal via the AIPex extension. Use this skill when the agent needs to manage browser tabs, search page elements, click bu... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件，目前累计下载 108 次。

如何安装 broswer use skill？

在 OpenClaw 或 Claude Code 对话框中运行命令「/install browser-cli-skills」即可一键安装，无需额外配置。

broswer use skill 是免费的吗？

是的，broswer use skill 完全免费，采用 MIT-0 许可证，可自由下载、安装和使用。

broswer use skill 支持哪些平台？

broswer use skill 跨平台运行，可在任意部署了 OpenClaw / Claude Code 的环境中使用（macos, linux, windows）。

谁开发了 broswer use skill？

由 ropzislaw（@buttercannfly）开发并维护，当前版本 v1.0.0。

broswer use skill