Description

Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from w...

Usage (SKILL.md)

Browser Automation

Name: Browserbase
Author: pkiv

Automate browser interactions using the browse CLI with Claude.

Setup check

Before running any browser commands, verify the CLI is available:

which browse || npm install -g @browserbasehq/browse-cli
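The same check can be wrapped in a small function so a script can call it once before any browser work. This is a minimal sketch: `ensure_browse_cli` is a hypothetical helper name, not part of the CLI, and it uses `command -v` in place of `which` because `command -v` is a POSIX shell builtin.

```shell
# Sketch: install the browse CLI only when it is not already available.
# ensure_browse_cli is a hypothetical helper name, not something the CLI provides.
ensure_browse_cli() {
  if command -v browse >/dev/null 2>&1; then
    echo "browse CLI found"
    return 0
  fi
  echo "browse CLI not found; installing @browserbasehq/browse-cli" >&2
  npm install -g @browserbasehq/browse-cli
}
```

Call `ensure_browse_cli` at the top of a script; it only reaches the `npm install` step when `browse` cannot be resolved.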

Environment Selection (Local vs Remote)

The CLI automatically selects between local and remote browser environments based on available configuration:

Local mode (default)

Uses local Chrome — no API keys needed
Best for: development, simple pages, trusted sites with no bot protection

Remote mode (Browserbase)

Activated when BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID are set
Provides: anti-bot stealth, automatic CAPTCHA solving, residential proxies, session persistence
Use remote mode when: the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
Get credentials at https://browserbase.com/settings
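The selection rule described above can be sketched as a pure check on the two environment variables. `expected_browse_mode` is a hypothetical helper that only mirrors the documented rule; it does not query the CLI, so `browse env` remains the authority on which mode is actually in effect.

```shell
# Sketch: report which mode the documented rule implies for the current environment.
# expected_browse_mode is hypothetical and does not call the browse CLI.
expected_browse_mode() {
  if [ -n "${BROWSERBASE_API_KEY:-}" ] && [ -n "${BROWSERBASE_PROJECT_ID:-}" ]; then
    echo "remote"   # both credentials present
  else
    echo "local"    # default when either credential is missing
  fi
}
```

Note that both variables must be set; an API key without a project ID still maps to local mode under this rule.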

When to choose which

Simple browsing (docs, wikis, public APIs): local mode is fine
Protected sites (login walls, CAPTCHAs, anti-scraping): use remote mode
If local mode fails with bot detection or access denied: switch to remote mode

Commands

All commands work identically in both modes. The daemon auto-starts on first command.

Navigation

browse open <url>                        # Go to URL (aliases: goto)
browse reload                            # Reload current page
browse back                              # Go back in history
browse forward                           # Go forward in history

Page state (prefer snapshot over screenshot)

browse snapshot                          # Get accessibility tree with element refs (fast, structured)
browse screenshot [path]                 # Take visual screenshot (slow, uses vision tokens)
browse get url                           # Get current URL
browse get title                         # Get page title
browse get text <selector>               # Get text content (use "body" for all text)
browse get html <selector>               # Get HTML content of element
browse get value <selector>              # Get form field value

Use browse snapshot as your default for understanding page state — it returns the accessibility tree with element refs you can use to interact. Only use browse screenshot when you need visual context (layout, images, debugging).

Interaction

browse click <ref>                       # Click element by ref from snapshot (e.g., @0-5)
browse type <text>                       # Type text into focused element
browse fill <selector> <value>           # Fill input and press Enter
browse select <selector> <values...>     # Select dropdown option(s)
browse press <key>                       # Press key (Enter, Tab, Escape, Cmd+A, etc.)
browse drag <fromX> <fromY> <toX> <toY>  # Drag from one point to another
browse scroll <x> <y> <deltaX> <deltaY>  # Scroll at coordinates
browse highlight <selector>              # Highlight element on page
browse is visible <selector>             # Check if element is visible
browse is checked <selector>             # Check if element is checked
browse wait <type> [arg]                 # Wait for: load, selector, timeout

Session management

browse stop                              # Stop the browser daemon
browse status                            # Check daemon status (includes env)
browse env                               # Show current environment (local or remote)
browse env local                         # Switch to local Chrome
browse env remote                        # Switch to Browserbase (requires API keys)
browse pages                             # List all open tabs
browse tab_switch <index>                # Switch to tab by index
browse tab_close [index]                 # Close tab

Typical workflow

1. browse open <url>: navigate to the page
2. browse snapshot: read the accessibility tree to understand page structure and get element refs
3. browse click <ref> / browse type <text> / browse fill <selector> <value>: interact using refs from snapshot
4. browse snapshot: confirm the action worked
5. Repeat steps 3-4 as needed
6. browse stop: close the browser when done
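The workflow above splits naturally into an "inspect" half and an "act" half, because refs are only known after a snapshot has been read. The sketch below assumes two hypothetical wrapper functions, `browse_inspect` and `browse_act`; they only chain the documented commands and add no behavior of their own.

```shell
# Sketch: two thin wrappers around the documented commands.
# browse_inspect and browse_act are hypothetical names for illustration.

# Open a URL, then print the accessibility tree so refs can be read.
browse_inspect() {
  browse open "$1" && browse snapshot
}

# Click a ref taken from the last snapshot, then snapshot again to confirm.
browse_act() {
  browse click "$1" && browse snapshot
}
```

A session would then look like `browse_inspect https://example.com`, picking a ref from the output, `browse_act @0-5`, and finally `browse stop`.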

Quick Example

browse open https://example.com
browse snapshot                          # see page structure + element refs
browse click @0-5                        # click element with ref 0-5
browse get title
browse stop

Mode Comparison

Feature	Local	Browserbase
Speed	Faster	Slightly slower
Setup	Chrome required	API key required
Stealth mode	No	Yes (custom Chromium, anti-bot fingerprinting)
CAPTCHA solving	No	Yes (automatic reCAPTCHA/hCaptcha)
Residential proxies	No	Yes (201 countries, geo-targeting)
Session persistence	No	Yes (cookies/auth persist across sessions)
Best for	Development/simple pages	Protected sites, bot detection, production scraping

Best Practices

Always browse open first before interacting
Use browse snapshot to check page state — it's fast and gives you element refs
Only screenshot when visual context is needed (layout checks, images, debugging)
Use refs from snapshot to click/interact — e.g., browse click @0-5
browse stop when done to clean up the browser session
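To make the final cleanup step hard to forget, a script can route its browser work through a wrapper that always runs browse stop afterwards. This is a minimal sketch; `with_browse_cleanup` is a hypothetical name, and the wrapper simply preserves the exit code of whatever command it ran.

```shell
# Sketch: run a command, then stop the browser daemon even if the command failed.
# with_browse_cleanup is a hypothetical wrapper name.
with_browse_cleanup() {
  rc=0
  "$@" || rc=$?      # remember the command's exit code without aborting
  browse stop        # clean up the browser session in every case
  return "$rc"
}
```

For example, `with_browse_cleanup browse get title` prints the title and then stops the daemon, returning the exit code of the `get title` call.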

Troubleshooting

"No active page": Run browse stop, then check browse status. If it still says running, kill the zombie daemon with pkill -f "browse.*daemon", then retry browse open
Chrome not found: Install Chrome or use browse env remote
Action fails: Run browse snapshot to see available elements and their refs
Browserbase fails: Verify API key and project ID are set
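The "No active page" recovery steps can be sketched as one function. The exact text printed by browse status is not documented here, so the check below rests on an assumption: the output contains the word "running" while the daemon is alive and "not running" once it has stopped. `reset_browse_daemon` is a hypothetical helper name.

```shell
# Sketch of the "No active page" recovery steps.
# Assumption: `browse status` prints "running" while the daemon is alive
# and "not running" (or nothing matching) once it has stopped.
reset_browse_daemon() {
  browse stop
  if browse status 2>/dev/null | grep -i "running" | grep -qvi "not running"; then
    # The daemon survived a normal stop; kill the leftover process.
    pkill -f "browse.*daemon"
  fi
}
```

After `reset_browse_daemon` returns, retry browse open as described above. If the real status output differs from the assumed wording, adjust the `grep` patterns accordingly.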

Switching to Remote Mode

Switch to remote when you detect: CAPTCHAs (reCAPTCHA, hCaptcha, Turnstile), bot detection pages ("Checking your browser..."), HTTP 403/429, empty pages on sites that should have content, or the user asks for it.

Don't switch for simple sites (docs, wikis, public APIs, localhost).

browse env remote            # switch to Browserbase
browse env local             # switch back to local Chrome

The switch is sticky until you run browse stop or switch again.
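A rough text-based heuristic for the detection step might look like the sketch below. `looks_bot_blocked` is a hypothetical helper, and the marker strings are illustrative guesses at what a challenge page contains; real pages vary, and HTTP status codes such as 403/429 are not visible through page text alone.

```shell
# Sketch: guess from page text whether local mode hit bot protection.
# looks_bot_blocked is hypothetical; the marker strings are illustrative only.
looks_bot_blocked() {
  text="$1"
  if [ -z "$text" ]; then
    return 0   # empty page where content was expected
  fi
  case "$text" in
    *"Checking your browser"*|*reCAPTCHA*|*hCaptcha*|*Turnstile*|*"Access denied"*)
      return 0 ;;
  esac
  return 1
}

# Possible usage (not executed here):
#   if looks_bot_blocked "$(browse get text body)"; then browse env remote; fi
```

The function returns success when a switch looks warranted, so it can sit directly in an `if` condition in front of browse env remote.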

For detailed examples, see EXAMPLES.md. For API reference, see REFERENCE.md.

Security recommendations

Install this only if you intentionally want CLI-driven browser automation. Use local mode by default, require explicit approval before remote/stealth/proxy/CAPTCHA use, avoid protected-site scraping unless authorized, do not enter sensitive credentials in remote sessions unless you trust the provider, clear network captures, and stop the browser daemon after each task.

Capability analysis

Type: OpenClaw Skill
Name: browse
Version: 2.0.2

The skill bundle provides a legitimate interface for browser automation using the '@browserbasehq/browse-cli'. It includes comprehensive documentation (SKILL.md, REFERENCE.md) for navigating, interacting with, and extracting data from web pages via local Chrome or the Browserbase remote service. While it possesses high-privilege capabilities like JavaScript evaluation and network capture, these are standard for browser automation and are presented transparently without evidence of malicious intent or prompt-injection attacks.

Capability assessment

⚠ Purpose & Capability

Browser automation is clearly stated and mostly coherent, but the documented capability expands to CAPTCHA solving, anti-bot stealth, residential proxies, and protected-site scraping.

⚠ Instruction Scope

The workflow is user-directed, but the instructions encourage switching to remote mode to bypass bot defenses and expose broad browser-control commands including form filling, clicking, JavaScript evaluation, and network capture.

ℹ Install Mechanism

The skill relies on an external npm CLI package installed globally; this is expected for the purpose but should be treated as a supply-chain dependency.

⚠ Credentials

Local mode is proportionate for ordinary browsing, but remote Browserbase mode uses cloud sessions with proxies, CAPTCHA solving, and session persistence, which is sensitive for logins or protected sites.

⚠ Persistence & Privilege

The CLI daemon persists across commands until stopped, and remote mode documents cookies/auth persistence across sessions without clear retention or clearing boundaries in the provided artifacts.

Version history

v2.0.2

Remove undeclared openclaw platform references from all skill files

v2.0.1

Remove undeclared openclaw platform references to fix ClawHub security scan

v2.0.0

Use new Stagehand CLI for improved performance & remote browsing capabilities

v1.0.1

Fixed display name

v1.0.0

Initial release: Complete guide for creating and deploying browser automation functions with the stagehand CLI and Browserbase.
- Step-by-step workflow from site exploration to deployment
- Critical bug warning and fix for generated package.json
- Example automation functions and usage patterns included
- Covers local testing, production deployment, and API integration
- Common scenarios: scraping, authentication, and multi-page workflows

Metadata

Slug browse

Version 2.0.2

License not specified

Total installs 54

Current installs 52

Number of versions 5

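FAQ

What is Browserbase?

Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from w... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 9420 cumulative downloads to date.

How do I install Browserbase?

Run the command "/install browse" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Browserbase free?

Yes. Browserbase is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Browserbase support?

Browserbase is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Browserbase?

It is developed and maintained by pkiv (@pkiv); the current version is v2.0.2.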
