Description

Automate browser tasks from the command line with browser-use, including navigation, interaction, screenshots, tabs, cookies, JavaScript, cloud browsing, and...

Usage Instructions (SKILL.md)

browser-use CLI Skill

Name: browser-cli
Author: zdstudios

What This Skill Covers

Use this skill any time you need to automate a browser from the command line using browser-use. This includes navigating pages, clicking/typing/filling forms, taking screenshots, running JavaScript, managing tabs, handling cookies, driving cloud browsers, and exposing local servers via tunnels.

Installation

Prerequisites

Platform	Requirements
macOS	Python 3.11+
Linux	Python 3.11+
Windows	Git for Windows + Python 3.11+

One-line Install (Recommended)

# macOS / Linux
curl -fsSL https://browser-use.com/cli/install.sh | bash

# Windows (PowerShell)
& "C:\Program Files\Git\bin\bash.exe" -c 'curl -fsSL https://browser-use.com/cli/install.sh | bash'

Manual Install

uv pip install browser-use
browser-use install   # downloads Chromium
browser-use doctor    # validates setup

Post-Install Health Check

browser-use doctor   # prints diagnostics + config
browser-use setup    # optional interactive wizard

Core Mental Model

The workflow is always:

1. Open a page → browser-use open <url>
2. Inspect the page → browser-use state (returns numbered element indices)
3. Interact using those indices → browser-use click 3, browser-use input 1 "text"
4. Repeat: the daemon keeps the browser alive between commands (~50ms latency)

A background daemon process starts automatically on the first command and stays alive until you run browser-use close.
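When the loop above is driven from a script rather than typed by hand, it can help to assemble each invocation as an argument list. The following is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, not part of browser-use, and it emits only the subcommands and global flags listed in this document.

```python
import shlex


def build_command(*args, session=None, headed=False, as_json=False):
    """Assemble the argv list for one browser-use invocation.

    build_command is a hypothetical helper; only the flags and
    subcommands it emits come from this document.
    """
    argv = ["browser-use"]
    if headed:
        argv.append("--headed")
    if session is not None:
        argv += ["--session", session]
    if as_json:
        argv.append("--json")
    # Global flags go before the subcommand, as in the examples here.
    argv += [str(a) for a in args]
    return argv


# The open -> state -> interact loop as a list of invocations.
workflow = [
    build_command("open", "https://example.com"),
    build_command("state"),
    build_command("click", 3),
    build_command("input", 1, "text"),
]

for argv in workflow:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

To actually run a step, each list could be passed to `subprocess.run(argv, check=True)`; the sketch only prints the commands so it has no side effects.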

Browser Modes

# Default: headless Chromium (invisible)
browser-use open https://example.com

# Visible window
browser-use --headed open https://example.com

# Use your real Chrome (preserves logins, cookies, extensions)
browser-use connect

# Use a specific Chrome profile
browser-use --profile "Default" open https://gmail.com

# Zero-config cloud browser (requires API key)
browser-use cloud connect

# Connect to existing browser via CDP
browser-use --cdp-url http://localhost:9222 open https://example.com
browser-use --cdp-url ws://localhost:9222/devtools/browser/... state

After connect or cloud connect, all subsequent commands automatically target that browser — no extra flags needed.
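For the CDP case, a Chrome instance started with `--remote-debugging-port=9222` reports its browser-level WebSocket endpoint in the `webSocketDebuggerUrl` field of the JSON served at `/json/version`. The sketch below shows one way to pull that field out and turn it into a `--cdp-url` invocation. The response body and the `EXAMPLE-ID` suffix are made-up placeholders, and `websocket_url_from_version` is a hypothetical helper.

```python
import json


def websocket_url_from_version(payload: str) -> str:
    """Return the webSocketDebuggerUrl field of a /json/version response."""
    info = json.loads(payload)
    try:
        return info["webSocketDebuggerUrl"]
    except KeyError:
        raise ValueError("response has no webSocketDebuggerUrl field") from None


# Hypothetical response body. A real one would be fetched from
# http://localhost:9222/json/version on a Chrome started with
# --remote-debugging-port=9222.
sample = json.dumps({
    "Browser": "Chrome/0.0.0.0",
    "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/EXAMPLE-ID",
})

ws_url = websocket_url_from_version(sample)
print(["browser-use", "--cdp-url", ws_url, "state"])
```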

All Commands

Navigation

browser-use open <url>               # Navigate to URL
browser-use back                     # Go back in history
browser-use scroll down              # Scroll down
browser-use scroll up                # Scroll up
browser-use scroll down --amount 1000  # Scroll by pixel amount

Inspection

browser-use state                    # Get URL, title, and numbered clickable elements
browser-use screenshot output.png    # Take screenshot to file
browser-use screenshot               # Screenshot as base64 (stdout)
browser-use screenshot --full page.png  # Full-page screenshot
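When the screenshot is written to stdout as base64, a script has to decode it before saving or inspecting it. The sketch below assumes the captured stdout contains only base64 text and that the image is a PNG. Neither is confirmed by this document, so treat both as assumptions. `decode_screenshot` is a hypothetical helper.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(b64_text: str) -> bytes:
    """Decode base64 screenshot text and check that it looks like a PNG."""
    data = base64.b64decode(b64_text.strip(), validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with a PNG signature")
    return data


# Stand-in for captured stdout: just the 8-byte PNG signature, encoded.
fake_stdout = base64.b64encode(PNG_SIGNATURE).decode("ascii") + "\n"
image_bytes = decode_screenshot(fake_stdout)
print(len(image_bytes))  # prints 8
```

With real output, the returned bytes could be written to disk with `pathlib.Path("shot.png").write_bytes(image_bytes)`.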

Interaction

browser-use click <index>            # Click element by index (from state)
browser-use click <x> <y>            # Click at pixel coordinates
browser-use type "text"              # Type into currently focused element
browser-use input <index> "text"     # Click element then type (most common for forms)
browser-use keys "Enter"             # Send keyboard key
browser-use keys "Control+a"         # Send key combination
browser-use select <index> "value"   # Select dropdown option
browser-use upload <index> /path/to/file  # Upload file to file input
browser-use hover <index>            # Hover over element
browser-use dblclick <index>         # Double-click element
browser-use rightclick <index>       # Right-click element

Tabs

browser-use tab list                 # List all open tabs
browser-use tab new                  # Open blank tab
browser-use tab new https://url.com  # Open tab with URL
browser-use tab switch <index>       # Switch to tab by index
browser-use tab close                # Close current tab
browser-use tab close <index>        # Close specific tab

Cookies

browser-use cookies get                        # Get all cookies
browser-use cookies get --url https://site.com # Get cookies for URL
browser-use cookies set name value             # Set a cookie
browser-use cookies set name val --domain .example.com --secure
browser-use cookies set name val --same-site Strict   # Strict | Lax | None
browser-use cookies set name val --expires 1735689600 # Unix timestamp
browser-use cookies clear                      # Clear all cookies
browser-use cookies clear --url https://site.com
browser-use cookies export cookies.json        # Export to JSON
browser-use cookies import cookies.json        # Import from JSON
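The `--expires` value is a Unix timestamp in seconds, which is easy to get wrong by hand. As a small sketch, the helper below (hypothetical, not part of browser-use) converts a timezone-aware datetime to that form. It also shows that the 1735689600 used above corresponds to midnight UTC on 1 January 2025.

```python
from datetime import datetime, timedelta, timezone


def expires_timestamp(when: datetime) -> int:
    """Convert a timezone-aware datetime to whole Unix seconds."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    return int(when.timestamp())


# The timestamp from the example above: midnight UTC on 1 January 2025.
print(expires_timestamp(datetime(2025, 1, 1, tzinfo=timezone.utc)))  # 1735689600

# A cookie that should last 30 days from now.
in_30_days = datetime.now(timezone.utc) + timedelta(days=30)
print(["browser-use", "cookies", "set", "name", "val",
       "--expires", str(expires_timestamp(in_30_days))])
```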

Waiting

browser-use wait selector ".btn"               # Wait for element to be visible
browser-use wait selector ".loading" --state hidden  # Wait for element to disappear
browser-use wait text "Success"                # Wait for text to appear on page
browser-use wait selector "h1" --timeout 5000 # Custom timeout in ms

Get (Information Retrieval)

browser-use get title                  # Get page title
browser-use get html                   # Get full page HTML
browser-use get html --selector "main" # Get HTML of specific element
browser-use get text <index>           # Get text content of element
browser-use get value <index>          # Get value of input/textarea
browser-use get attributes <index>     # Get all element attributes
browser-use get bbox <index>           # Get bounding box (x, y, width, height)

JavaScript

browser-use eval "document.title"
browser-use eval "Array.from(document.querySelectorAll('a')).map(a => a.href)"
browser-use eval "window.scrollTo(0, document.body.scrollHeight)"

Python (Persistent Session)

browser-use python "x = 42"              # Set a variable
browser-use python "print(x)"            # Access variable (prints: 42)
browser-use python "print(browser.url)"  # Access browser object
browser-use python --vars                # Show all defined variables
browser-use python --reset               # Clear namespace
browser-use python --file script.py      # Run a Python file

Session Management

Each --session gets its own daemon, socket, and browser instance.

# Default session (implicit)
browser-use open https://example.com
browser-use state

# Named sessions
browser-use --session work open https://example.com
browser-use --session personal open https://gmail.com
browser-use --session work state    # targets work browser

# List all active sessions
browser-use sessions

# Close a specific session
browser-use --session work close

# Close all sessions
browser-use close --all

# Via environment variable
BROWSER_USE_SESSION=work browser-use state
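From a script, the environment-variable form can be easier than threading `--session` through every call. The sketch below builds a per-session environment for `subprocess`. `session_env` is a hypothetical helper, and how the variable interacts with an explicit `--session` flag is not covered here.

```python
import os


def session_env(name, base_env=None):
    """Return a copy of the environment with BROWSER_USE_SESSION set."""
    env = dict(os.environ if base_env is None else base_env)
    env["BROWSER_USE_SESSION"] = name
    return env


env = session_env("work", base_env={"PATH": "/usr/bin"})
print(env["BROWSER_USE_SESSION"])  # prints work

# To target the "work" session from a script:
# subprocess.run(["browser-use", "state"], env=session_env("work"), check=True)
```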

Cloud API

Auth

browser-use cloud login sk-abc123...    # Save API key
browser-use cloud logout                # Remove API key
# Or: export BROWSER_USE_API_KEY=sk-abc123...

Cloud Browser

browser-use cloud connect              # Provision cloud browser and connect
browser-use state                      # Works normally after connect
browser-use close                      # Disconnect AND stop cloud browser
browser-use cloud close                # Also stops cloud browser

REST Passthrough

browser-use cloud v2 GET /browsers
browser-use cloud v2 POST /tasks '{"task":"Search for AI news","url":"https://google.com"}'
browser-use cloud v2 poll <task-id>    # Poll until task completes
browser-use cloud v3 POST /path '{"key":"value"}'
browser-use cloud v2 --help            # Show all v2 API endpoints
browser-use cloud v3 --help            # Show all v3 API endpoints
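Hand-written JSON bodies break easily once the task text contains quotes. One possible approach is to serialize the body with `json.dumps` and pass it as a single argument. `post_task_argv` below is a hypothetical helper, and the `task` and `url` fields are simply the ones that appear in the POST example above.

```python
import json
import shlex


def post_task_argv(task: str, url: str) -> list:
    """Build the argv for a v2 POST /tasks call with a JSON-encoded body."""
    body = json.dumps({"task": task, "url": url})
    return ["browser-use", "cloud", "v2", "POST", "/tasks", body]


argv = post_task_argv('Search for "AI news"', "https://google.com")

# shlex.join shows how the body must be quoted if typed into a shell.
print(shlex.join(argv))
```

Passing the list to `subprocess.run` avoids shell quoting altogether, since the body travels as one argument.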

Tunnels (Expose Local Server to Cloud Browser)

browser-use tunnel 3000                # Expose localhost:3000 → public HTTPS URL
browser-use tunnel list                # List active tunnels
browser-use tunnel stop 3000           # Stop tunnel for port
browser-use tunnel stop --all          # Stop all tunnels

# Typical flow for testing local app with a cloud browser:
npm run dev &
browser-use tunnel 3000                # → https://abc.trycloudflare.com
browser-use cloud connect
browser-use open https://abc.trycloudflare.com

Profile Management (Sync Chrome Cookies to Cloud)

browser-use profile                    # Interactive sync wizard
browser-use profile list               # List detected browsers + profiles
browser-use profile sync --all         # Sync all profiles to cloud
browser-use profile sync --browser "Google Chrome" --profile "Default"
browser-use profile auth --apikey <key>
browser-use profile inspect --browser "Google Chrome" --profile "Default"
browser-use profile update             # Update the profile-use binary

Global Options

Flag	Description
`--headed`	Show browser window
`--profile [NAME]`	Use real Chrome profile (bare flag = "Default")
`--connect`	Auto-discover running Chrome via CDP
`--cdp-url <url>`	Connect to existing browser via CDP URL
`--session NAME`	Target named session (default: "default")
`--json`	Output as JSON
`--mcp`	Run as MCP server via stdin/stdout

Configuration

browser-use config list                              # Show all config
browser-use config set cloud_connect_proxy jp        # Set a value
browser-use config get cloud_connect_proxy           # Get a value
browser-use config unset cloud_connect_timeout       # Remove a value

Config file: ~/.browser-use/config.json

Template Generation

browser-use init                          # Interactive template picker
browser-use init --list                   # List all templates
browser-use init --template basic         # Generate specific template
browser-use init --output my_script.py    # Specify output filename
browser-use init --force                  # Overwrite existing files

File Layout

~/.browser-use/
├── config.json          # API key + settings
├── bin/
│   └── profile-use      # Managed Go binary (auto-downloaded)
├── tunnels/
│   ├── {port}.json      # Tunnel metadata
│   └── {port}.log       # Tunnel logs
├── default.state.json   # Daemon lifecycle state
├── default.sock         # Daemon socket (ephemeral)
├── default.pid          # Daemon PID (ephemeral)
└── cli.log              # Daemon log

Override the base dir with BROWSER_USE_HOME.
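For diagnostics it can be handy to compute where a session's files should live. The sketch below resolves the base directory with the `BROWSER_USE_HOME` override. It assumes that named sessions follow the same `<session>.sock` / `<session>.pid` / `<session>.state.json` pattern shown for `default`, which this document does not confirm. Both function names are hypothetical.

```python
import os
from pathlib import Path


def browser_use_home(env=None) -> Path:
    """Resolve the base directory, honoring BROWSER_USE_HOME if set."""
    env = os.environ if env is None else env
    override = env.get("BROWSER_USE_HOME")
    return Path(override) if override else Path.home() / ".browser-use"


def session_files(session="default", env=None) -> dict:
    """Guess the per-session file paths (the naming pattern is an assumption)."""
    home = browser_use_home(env)
    return {
        "state": home / f"{session}.state.json",
        "socket": home / f"{session}.sock",
        "pid": home / f"{session}.pid",
    }


for kind, path in session_files(env={"BROWSER_USE_HOME": "/tmp/bu"}).items():
    print(kind, path)
```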

Common Recipes

Fill and Submit a Form

browser-use open https://example.com/contact
browser-use state
# Output: [0] input "Name", [1] input "Email", [2] button "Submit"
browser-use input 0 "John Doe"
browser-use input 1 "john@example.com"
browser-use click 2
browser-use wait text "Thank you"

Scrape Data with JavaScript

browser-use open https://news.ycombinator.com
browser-use eval "Array.from(document.querySelectorAll('.titleline a')).slice(0,5).map(a => a.textContent)"

Multi-step Python Automation

browser-use open https://example.com
browser-use python "
for i in range(5):
    browser.scroll('down')
    browser.wait(0.5)
browser.screenshot('scrolled.png')
"

Login with Saved Chrome Profile

browser-use --profile "Default" open https://gmail.com
browser-use state
# Gmail inbox loads already logged in

Run Two Browsers in Parallel

browser-use --session a open https://site-a.com
browser-use --session b open https://site-b.com
browser-use --session a state
browser-use --session b state
browser-use close --all

Test Local App via Cloud Browser

npm run dev &
browser-use tunnel 3000
# Copy the printed public URL, e.g. https://xyz.trycloudflare.com
browser-use cloud connect
browser-use open https://xyz.trycloudflare.com
browser-use screenshot check.png

Windows Troubleshooting

ARM64 Windows — install x64 Python for emulation:

winget install Python.Python.3.11 --architecture x64

Multiple Python versions — pin the version:

$env:PY_PYTHON=3.11

PATH not updated — restart terminal, or run via Git Bash:

& "C:\Program Files\Git\bin\bash.exe" -c 'browser-use --help'

Daemon won't start — kill stale processes:

wmic process where "name='python.exe' and commandline like '%browser%use%'" get processid
taskkill /PID <pid> /F

Stale venv — nuke and reinstall:

wmic process where "name='python.exe' and commandline like '%browser%use%'" call terminate
Remove-Item -Recurse -Force "$env:USERPROFILE\.browser-use-env"
# Re-run the installer

Quick Reference Card

Goal	Command
Start + navigate	`browser-use open <url>`
See elements	`browser-use state`
Click element	`browser-use click <index>`
Fill input	`browser-use input <index> "text"`
Press key	`browser-use keys "Enter"`
Screenshot	`browser-use screenshot out.png`
Run JS	`browser-use eval "js here"`
Wait for element	`browser-use wait selector ".cls"`
Get page HTML	`browser-use get html`
Close browser	`browser-use close`
Use real Chrome	`browser-use connect`
Cloud browser	`browser-use cloud connect`
Named session	`browser-use --session NAME <cmd>`
Show config	`browser-use doctor`

Security Recommendations

Before installing or running this skill:
1) Treat the `curl | bash` install line as high risk: inspect the install.sh script on the server (https://browser-use.com/cli/install.sh) before executing it.
2) Prefer installing from a vetted package repository (verify the PyPI package 'browser-use' and its publisher) or run in an isolated VM/container.
3) Be aware the tool can access your real Chrome profile (cookies, logins, extensions) and can export/import cookies and expose local servers via tunnels. Do not connect sensitive accounts or data unless you fully trust the publisher and code.
4) The SKILL.md mentions a cloud API key but the skill metadata does not declare it; treat any prompt for cloud credentials as sensitive and confirm where/how they are stored/transmitted.
5) If you proceed, restrict network access, run the daemon under a limited user account, review the installed files and service behavior, and consider using a disposable browser profile or dedicated machine to reduce risk.
6) If you cannot verify the publisher/domain or review the installer, avoid installing it.

Capability Tags

requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

Name and description match the SKILL.md functionality: CLI-based browser automation (navigation, interaction, screenshots, JS, tabs, cookies, cloud browsers). However, several capabilities (using a user's real Chrome profile, exporting cookies, exposing local servers via tunnels, and cloud browsing requiring an API key) expand scope beyond a minimal 'automation helper' and are not represented in required env/config metadata, creating an alignment gap.

⚠ Instruction Scope

The SKILL.md explicitly instructs running remote install commands (curl | bash), downloading Chromium, starting a background daemon that stays alive, accessing real Chrome profiles (which preserves logins/cookies/extensions), exporting/importing cookies, uploading files, and exposing local servers via tunnels — all of which involve reading/transmitting sensitive local data or opening inbound/outbound network surfaces. The doc also lists 'cloud connect' that requires an API key but the skill does not declare or require that credential. Those instructions go beyond simple CLI guidance and give the agent (or user-followed automation) broad access to local secrets and network endpoints.

⚠ Install Mechanism

Although the skill has no formal install spec, the guidance tells users/agents to run a remote installer via `curl -fsSL https://browser-use.com/cli/install.sh | bash` (high-risk pattern: fetching and executing arbitrary remote script). It also suggests `uv pip install browser-use` and that the tool will download Chromium. Unverified domain and remote-exec pattern increase supply-chain risk; the instructions directly encourage writing and executing external code and binaries on the host.

⚠ Credentials

Registry metadata shows no required env vars or primary credential, yet the SKILL.md references features that need credentials or sensitive access: 'cloud connect' (requires an API key), connecting to an existing browser profile (access to cookies, saved logins, extensions), and CDP URLs for remote debugging. The lack of declared env vars/config paths is inconsistent with these capabilities and undercuts transparency about what secrets or files the tool will need access to.

⚠ Persistence & Privilege

The instructions state a background daemon starts automatically on first command and remains alive until explicitly closed — creating a persistent process with ongoing access to browser state and the ability to accept further commands. While the skill is not force-included (always:false), starting a persistent local service combined with the ability to use real browser profiles, tunnels, and cloud connections increases the potential blast radius if the installer or daemon is malicious or compromised.

Version History

v1.0.0

Initial public release of the browser-use CLI Skill.
- Automate browser tasks from the command line: navigation, form filling, screenshots, JavaScript execution, tab and cookie management.
- Supports running locally (headless/headed), connecting to real Chrome profiles, or driving cloud browsers.
- Organized commands for navigation, interaction, inspection, session management, cloud API, and tunneling local servers.
- Quick installation and health checks with simple commands.
- Interactive workflows with persistent daemon processes and session controls.

Metadata

Slug browser-cli

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of released versions 1

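FAQ

What is browser-cli?

Automate browser tasks from the command line with browser-use, including navigation, interaction, screenshots, tabs, cookies, JavaScript, cloud browsing, and... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 35 times to date.

How do I install browser-cli?

Run the command "/install browser-cli" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is browser-cli free?

Yes. browser-cli is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does browser-cli support?

browser-cli is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops browser-cli?

It is developed and maintained by ZDStudios (@zdstudios); the current version is v1.0.0.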
