
browser-act

Name: browser-act
Author: browseract-ai

by browser-act · GitHub · v1.0.12 · MIT-0

cross-platform ⚠ suspicious

335 downloads · 179 stars

Install in OpenClaw

/install browser-act

Description

Browser automation CLI for AI agents with anti-detection stealth browsing, captcha solving, and parallel multi-browser support. Use when the user needs to in...

README (SKILL.md)

Browser Automation with browser-act CLI

browser-act is a CLI for browser automation with stealth and captcha solving capabilities. It supports two browser types (Stealth and Real Chrome) and provides commands for navigation, page interaction, data extraction, tab/session management, and more.

All commands output human-readable text by default. Use --format json for structured JSON output, ideal for AI agent integration and scripting.

Installation

Source: browser-act-cli on PyPI · Homepage

# Upgrade if installed, otherwise install fresh
uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12

The CLI is an open-source package published to PyPI by BrowserAct. Run the install command at the start of every session to ensure the latest version.

Global options available on every command:

| Option | Default | Description |
| --- | --- | --- |
| `--session <name>` | `default` | Session name (isolates browser state) |
| `--format <text\|json>` | `text` | Output format |
| `--no-auto-dialog` | off | Disable automatic JavaScript dialog handling (alerts, confirms, prompts) |
| `--version` | | Show version |
| `-h, --help` | | Show help |

Browser Selection

browser-act supports two browser types. Choose based on the task:

| Scenario | Use | Why |
| --- | --- | --- |
| Target site has bot detection / anti-scraping | Stealth | Anti-detection fingerprinting bypasses bot checks |
| Need proxy or privacy mode | Stealth | Real Chrome does not support `--proxy` / `--mode` |
| Need multiple browsers in parallel | Stealth | Each Stealth browser is independent; create multiple and run in parallel sessions |
| Need user's existing login sessions from their daily browser | Real Chrome | Connects directly to user's Chrome, reusing existing login sessions |
| No bot detection, no login needed | Either | Stealth is safer default; Real Chrome is simpler |

Stealth Browser

Local browsers with anti-detection fingerprinting. Ideal for sites with bot detection.

# Create
browser-act browser create "my-browser"
browser-act browser create "my-browser" --proxy socks5://host:port --mode private

# Update
browser-act browser update <browser_id> --name "new-name"
browser-act browser update <browser_id> --proxy http://proxy:8080 --mode private

# List / Delete / Clear profile
browser-act browser list                                    # List all stealth browsers
browser-act browser list --page 2 --page-size 10            # Paginated listing
browser-act browser delete <browser_id>                     # ⚠ Destructive: always confirm with user before deleting
browser-act browser clear-profile <browser_id>

| Option | Description |
| --- | --- |
| `--desc` | Browser description |
| `--proxy <url>` | Proxy with scheme (`http`, `https`, `socks4`, `socks5`), e.g. `socks5://host:port` |
| `--mode <normal\|private>` | `normal` (default): persists cache, cookies, login across launches. `private`: fresh environment every launch, no saved state |

Stealth browsers in normal mode (default) persist cookies, cache, and login sessions across launches — you can log in once and reuse the session, similar to a regular browser profile. Use --mode private when the task should not persist any state.

Data storage: Profile data is stored at platform-specific paths. macOS: ~/Library/Application Support/browseract/, Windows: %APPDATA%\browseract, Linux: ${XDG_DATA_HOME:-~/.local/share}/browseract. To clean up persistent data, delete the browser with browser-act browser delete <browser_id>, or use browser-act browser clear-profile <browser_id> to reset its profile.
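As a small illustration of the paths listed above, the sketch below resolves the profile directory for the macOS and Linux cases. The function name `ba_data_dir` is hypothetical and not part of the CLI; the Windows path is left out because `%APPDATA%` is a cmd/PowerShell variable rather than a POSIX shell one.

```shell
# Hypothetical helper: print the profile data directory for an OS name
# as reported by `uname -s`. Only the documented macOS and Linux paths are covered.
ba_data_dir() {
    os=${1:-$(uname -s)}
    case "$os" in
        Darwin) printf '%s\n' "$HOME/Library/Application Support/browseract" ;;
        Linux)  printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/browseract" ;;
        *)      echo "unsupported OS for this sketch: $os" >&2; return 1 ;;
    esac
}
```

Calling `ba_data_dir` with no argument uses the current platform; passing `Darwin` or `Linux` explicitly is handy when checking what a cleanup script would touch on another machine.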

Real Chrome

Two modes: auto-connect to your running Chrome (default), or use a BrowserAct-managed kernel.

browser-act browser real open https://example.com                  # Auto-connect to running Chrome 
browser-act browser real open https://example.com --ba-kernel      # Use BrowserAct-provided browser kernel

Both browser types support --headed to show the browser UI (default: headless). Use for debugging:

browser-act browser open <browser_id> https://example.com --headed
browser-act browser real open https://example.com --ba-kernel --headed

Core Workflow

Every browser automation follows this loop: Open → Inspect → Interact → Verify

Open: browser-act browser open <browser_id> <url> (Stealth) or browser-act browser real open <url> (Real Chrome)
Inspect: browser-act state — returns interactive elements with index numbers
Interact: use indices from state (browser-act click 5, browser-act input 3 "text")
Verify: browser-act state or browser-act screenshot — confirm result

browser-act browser open <browser_id> https://example.com
browser-act state
# Output: [3] input "Search", [5] button "Go"

browser-act input 3 "browser automation"
browser-act click 5
browser-act wait stable
browser-act state    # Always re-inspect after page changes

Important: After any action that changes the page (click, navigation, form submit), run wait stable then state to get fresh element indices. Old indices become invalid after page changes.
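The rule above can be folded into one helper so that the refresh is never forgotten. This is a minimal sketch: `ba_act` is a hypothetical wrapper name, and it only strings together commands already shown in this document.

```shell
# Hypothetical helper: run one page-changing command, then refresh the element list.
# Usage: ba_act click 5    or    ba_act input 3 "browser automation"
ba_act() {
    browser-act "$@" || return 1         # the action itself (click, input, navigate, ...)
    browser-act wait stable || return 1  # let the page settle
    browser-act state                    # print fresh indices for the next step
}
```

Because the helper ends with `state`, its output is always the current element list, so the next index can be read from it rather than from an earlier inspection.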

Command Chaining

Commands can be chained with && in a single shell invocation. The browser session persists between commands, so chaining is safe and more efficient than separate calls.

# Open + wait + inspect in one call
browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state

# Chain multiple interactions
browser-act input 3 "browser automation" && browser-act click 5

# Navigate and capture
browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot

When to chain: Use && when you don't need to read intermediate output before proceeding (e.g., fill multiple fields, then click). Run commands separately when you need to parse the output first (e.g., state to discover indices, then interact using those indices).
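The "run separately" case can be sketched as a parse step between two calls. The helper below assumes the text output looks like the illustrative sample in Core Workflow (`[3] input "Search", [5] button "Go"`); the real layout may differ, and `--format json` is likely the sturdier choice for scripts. `ba_find_index` is a hypothetical name.

```shell
# Hypothetical helper: read `browser-act state` text on stdin and print the index
# of the first element whose quoted label equals $1.
# Assumes entries shaped like: [5] button "Go" (comma- or newline-separated).
ba_find_index() {
    BA_LABEL="$1" awk '
        BEGIN { wanted = "\"" ENVIRON["BA_LABEL"] "\"" }
        {
            n = split($0, parts, ",")
            for (i = 1; i <= n; i++) {
                if (index(parts[i], wanted) > 0 && match(parts[i], /\[[0-9]+\]/)) {
                    # strip the surrounding brackets from the matched "[N]"
                    print substr(parts[i], RSTART + 1, RLENGTH - 2)
                    exit
                }
            }
        }'
}
```

A typical use would be `idx=$(browser-act state | ba_find_index "Go")` followed by `browser-act click "$idx"`: two separate invocations, because the second depends on the output of the first. Labels that themselves contain commas are not handled by this sketch.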

Command Reference

Navigation

browser-act navigate <url>      # Navigate to URL
browser-act back                # Go back
browser-act forward             # Go forward
browser-act reload              # Reload page

Page State & Interaction

# Inspect
browser-act state                         # Interactive elements with index numbers
browser-act screenshot                    # Screenshot (auto path)
browser-act screenshot ./page.png         # Screenshot to specific path

# Interact (use index from state)
browser-act click <index>                 # Click element
browser-act hover <index>                 # Hover over element
browser-act input <index> "text"          # Click element, then type text
browser-act keys "Enter"                  # Send keyboard keys
browser-act scroll down                   # Scroll down (default 500px)
browser-act scroll up --amount 1000       # Scroll up 1000px

Data Extraction

browser-act get title                     # Page title
browser-act get html                      # Full page HTML
browser-act get text <index>              # Text content of element
browser-act get value <index>             # Value of input/textarea
browser-act get markdown                  # Page as markdown
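A common use of the extraction commands is saving a page for later processing. The sketch below combines `navigate`, `wait stable`, and `get markdown`; the function name `ba_save_markdown` is hypothetical, and it assumes a browser is already open in the default session.

```shell
# Hypothetical helper: load a URL in the already-open browser and save it as markdown.
# $1: URL to load, $2: output file path
ba_save_markdown() {
    browser-act navigate "$1" || return 1
    browser-act wait stable || return 1   # make sure content has finished loading
    browser-act get markdown > "$2"
}
```

For example, `ba_save_markdown https://example.com ./example.md` leaves the page text in `./example.md`.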

JavaScript Evaluation

browser-act eval "document.title"         # Execute JavaScript

Tab Management

browser-act tab list                      # List open tabs
browser-act tab switch <tab_id>           # Switch to tab
browser-act tab close                     # Close current tab
browser-act tab close <tab_id>            # Close specific tab

Wait

browser-act wait stable                   # Wait for page stable (doc ready + network idle, default 30s)
browser-act wait stable --timeout 60000   # Custom timeout (ms)

Network Inspection

browser-act network requests                          # List all captured requests 
browser-act network requests --filter api.example.com # Filter by URL substring
browser-act network requests --type xhr,fetch         # Resource type: xhr,fetch,document,script,stylesheet,image,font,media,websocket,ping,preflight,other
browser-act network requests --method POST            # HTTP method: GET, POST, PUT, DELETE, etc.
browser-act network requests --status 2xx             # Filter by http status code (200, 2xx, 400-499)
browser-act network request <request_id>              # View full detail: headers, post data, response headers & body
browser-act network clear                             # Clear tracked requests
browser-act network har start                         # Start HAR recording
browser-act network har stop                          # Stop and save to default path (~/.browseract/har/)
browser-act network har stop ./trace.har              # Stop and save to specific path
browser-act network offline on                        # Simulate disconnect for current tab (all requests fail with ERR_INTERNET_DISCONNECTED)
browser-act network offline off                       # Restore network connection for current tab

Use network request <request_id> to get full detail for a single request. The detail view includes request headers, post data (for POST/PUT), response headers, and response body. Binary responses show a [base64, N chars] placeholder instead of raw content.
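To capture the traffic of a single page load, the HAR commands can be wrapped around a navigation. This is a sketch under the assumption that a browser is already open in the default session; `ba_record_har` is a hypothetical helper name.

```shell
# Hypothetical helper: record a HAR trace for one navigation.
# $1: URL to load, $2: path for the .har file
ba_record_har() {
    browser-act network har start || return 1
    browser-act navigate "$1" && browser-act wait stable
    status=$?
    browser-act network har stop "$2"   # always stop recording, even if navigation failed
    return "$status"
}
```

Stopping the recording unconditionally avoids leaving a capture running when the page fails to load; the function still returns the navigation's exit status.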

Dialog Management

Handle JavaScript dialogs (alert, confirm, prompt). By default, browser-act auto-accepts dialogs. Use --no-auto-dialog to disable this and handle them manually.

browser-act dialog status                 # Check if a dialog is currently open
browser-act dialog accept                 # Accept (OK) the current dialog
browser-act dialog accept "some text"     # Accept with text input (for prompt dialogs)
browser-act dialog dismiss                # Dismiss (Cancel) the current dialog

Manual dialog flow: Pass --no-auto-dialog when opening the browser, then use dialog status to detect dialogs and dialog accept / dialog dismiss to handle them.
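The manual flow might look like the sketch below. It assumes the browser was opened with `--no-auto-dialog` placed before the subcommand (mirroring how `--session` is used elsewhere in this document), and that `click` returns while the dialog is still open; neither detail is confirmed here. `ba_click_and_dismiss` is a hypothetical name.

```shell
# Assumes the browser was opened with dialogs left unhandled, for example:
#   browser-act --no-auto-dialog browser open <browser_id> https://example.com
# Hypothetical helper: click an element that triggers a dialog, then cancel the dialog.
# $1: element index taken from a fresh `browser-act state`
ba_click_and_dismiss() {
    browser-act click "$1" || return 1
    browser-act dialog status     # report whether a dialog is currently open
    browser-act dialog dismiss    # answer it with Cancel
}
```

Swapping `dialog dismiss` for `dialog accept` (optionally with text for prompts) gives the confirming variant.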

Captcha Solving

When a page presents a captcha challenge, use solve-captcha to solve it automatically. This is a built-in capability of the browser-act service — run the command directly without hesitation.

browser-act solve-captcha                 # Solve captcha on current page

Parallel Automation

Use separate sessions to run multiple browsers in parallel. Each --session <name> creates an isolated browser context, so commands sent to different sessions can execute concurrently without conflicts.

# Create stealth browsers for each task
browser-act browser create "site-a" --desc "Scraper for site A"
browser-act browser create "site-b" --desc "Scraper for site B"

# Open each in its own session (run in parallel)
browser-act --session site-a browser open <browser_id_a> https://site-a.com
browser-act --session site-b browser open <browser_id_b> https://site-b.com

# Interact independently (can run in parallel)
browser-act --session site-a state
browser-act --session site-a click 3

browser-act --session site-b state
browser-act --session site-b click 5

# Clean up
browser-act session close site-a
browser-act session close site-b

Always close sessions when done to free resources.
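In a shell script, "run in parallel" usually means background jobs plus `wait`. The sketch below inspects several sessions concurrently and then closes each one; `ba_parallel_state` is a hypothetical helper, and every named session is assumed to already have a browser open.

```shell
# Hypothetical helper: capture `state` from several sessions at once, then close them.
# $1: output directory, remaining args: session names
ba_parallel_state() {
    out_dir=$1; shift
    mkdir -p "$out_dir" || return 1
    for name in "$@"; do
        # each session gets its own background job and its own output file
        browser-act --session "$name" state > "$out_dir/$name.txt" &
    done
    wait    # block until every background command has finished
    for name in "$@"; do
        browser-act session close "$name"
    done
}
```

For example, `ba_parallel_state ./states site-a site-b` writes `./states/site-a.txt` and `./states/site-b.txt`, then frees both sessions.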

Session Management

Sessions isolate browser state. Each session runs its own background server.

# Use a named session
browser-act --session scraper navigate https://example.com
browser-act --session scraper state

# List active sessions
browser-act session list

# Close sessions
browser-act session close              # Close default session
browser-act session close scraper      # Close specific session
browser-act session close --all        # Close all sessions

The server auto-shuts down after a period of inactivity.

Site Notes

Operational experience accumulated during browser automation is stored per domain in references/site-notes/.

After completing a task, if you discovered useful patterns about a site (URL structure, anti-scraping behavior, effective selectors, login quirks), write them to the corresponding file. Only write verified facts, not guesses.

File format:

---
domain: example.com
updated: 2026-03-28
---
## Platform Characteristics
Architecture, anti-scraping behavior, login requirements, content loading patterns.

## Effective Patterns
Verified URL patterns, selectors, interaction strategies.

## Known Pitfalls
What fails and why.

Before operating on a target site, check if a note file exists and read it for prior knowledge. Notes are dated — treat them as hints that may have changed, not guarantees.
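The read-before, write-after routine can be sketched with two small helpers. The names `ba_read_note` and `ba_init_note` are hypothetical; the path and file layout follow the format shown above, and the paths are relative to the skill's directory.

```shell
# Hypothetical helper: print the note for a domain if one exists (non-zero exit if not).
ba_read_note() {
    note="references/site-notes/$1.md"
    [ -f "$note" ] && cat "$note"
}

# Hypothetical helper: create an empty note skeleton for a domain, dated today,
# unless a note already exists.
ba_init_note() {
    domain=$1
    note="references/site-notes/$domain.md"
    [ -f "$note" ] && return 0
    mkdir -p references/site-notes || return 1
    cat > "$note" <<EOF
---
domain: $domain
updated: $(date +%Y-%m-%d)
---
## Platform Characteristics

## Effective Patterns

## Known Pitfalls
EOF
}
```

The skeleton only sets up the headings; the sections themselves should be filled in with verified observations after the task is done.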

System Commands

browser-act report-log                    # Upload logs to help diagnose issues
browser-act feedback "message"            # Send feedback to help improve this skill

If you encounter issues or have suggestions for improving browser-act, use feedback to let us know. This directly helps us improve the tool and this skill.

Troubleshooting

browser-act: command not found — Run uv tool install browser-act-cli --python 3.12
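A guard at the top of a script can apply this fix automatically. The sketch below checks for the CLI and installs it with the command above only when it is missing; `ba_ensure_installed` is a hypothetical name, and the check for `uv` is an added assumption about what else might be absent.

```shell
# Hypothetical helper: make sure the browser-act command is available.
ba_ensure_installed() {
    if command -v browser-act >/dev/null 2>&1; then
        return 0                       # already on PATH, nothing to do
    fi
    if ! command -v uv >/dev/null 2>&1; then
        echo "uv is not on PATH; install uv first" >&2
        return 1
    fi
    uv tool install browser-act-cli --python 3.12
}
```

If the command is still not found after a successful install, the directory where uv places tool executables may not be on `PATH` in the current shell.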

References

| Path | Description |
| --- | --- |
| `references/SECURITY.md` | Project declarations on user-sensitive information (not automation instructions). |
| `references/site-notes/{domain}.md` | Per-site operational experience. Read before operating on a known site. |

Usage Guidance

This skill behaves like a full-featured browser automation tool that installs a third-party CLI and can (a) connect to your running Chrome session and (b) call BrowserAct cloud services for captcha/stealth management. Before installing: (1) verify the PyPI package and its source code (inspect the project's repository and recent release history), (2) confirm the trustworthiness of the BrowserAct service and its privacy policy, (3) avoid using Real Chrome/CDP mode unless you accept that the CLI can control your active browser session, (4) prefer ephemeral/private mode when possible to avoid persistent profiles, and (5) review the local config.json after first run to see what credentials are stored. If you cannot validate the upstream package or do not want page URLs shared with a third party, do not install this skill.

Capability Analysis

Type: OpenClaw Skill
Name: browser-act
Version: 1.0.12

The browser-act skill provides high-privilege browser automation capabilities, including the ability to connect to a user's active Chrome session and solve captchas via a cloud service (browseract.com). While these features are aligned with its stated purpose, the ability to leverage existing login sessions and bypass bot detection presents significant security risks if the agent is misdirected. Additionally, the instructions mandate frequent installation of the 'browser-act-cli' package from PyPI and include commands to upload logs and feedback, which could be used for data exfiltration or introduce supply chain vulnerabilities.

Capability Assessment

ℹ Purpose & Capability

The name/description (stealth browsing, captcha solving, parallel browsers, connect-to-Chrome) matches what the SKILL.md instructs: installing a CLI, creating stealth profiles, using proxies, and optionally connecting to a local Chrome via CDP. Requiring network access and filesystem profiles is coherent with these capabilities.

⚠ Instruction Scope

The runtime instructions direct the agent to install and invoke a third-party CLI and to perform actions that may access sensitive data: connecting to the user's running Chrome (CDP), persisting browser profiles (cookies, cache) locally, and sending metadata for captcha solving/stealth management. Although the SKILL.md asserts that cookies/page HTML/screenshots/credentials are never uploaded, the instructions explicitly transmit page URLs and captcha element coordinates to a cloud service — a potentially sensitive leak depending on context.

ℹ Install Mechanism

The skill is instruction-only (no bundled code) and tells the agent to install browser-act-cli from PyPI via 'uv tool install'. Installing a PyPI package at runtime is typical for CLI-based skills but still carries supply-chain risk: the skill delegates execution to a third-party package whose code is not included for review here. The registry metadata did not include a formal install spec, so the install step relies entirely on the SKILL.md text.

⚠ Credentials

No environment variables are requested, which aligns with the manifest, but the skill stores credentials and settings in a local config.json and uses a cloud API for captcha/stealth management. Transmitting page URLs and proxy hosts to BrowserAct's cloud is justified for captcha solving but is sensitive (reveals targets). The SKILL.md claims certain things are 'never uploaded' — that promise is hard to verify and depends on the CLI implementation and its security practices.

⚠ Persistence & Privilege

The skill does not ask for 'always: true', but it requests persistent local storage of browser profiles and the ability to attach to a running Chrome via CDP. CDP access can control and read the user's active browser session; even if the tool claims not to exfiltrate cookies or page HTML, that level of access is a high privilege and should be granted only with explicit user understanding and caution.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install browser-act
After installation, invoke the skill by name or use /browser-act
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.12

- Clarified and expanded metadata on runtime requirements, permission usage, cloud/local data storage, and privacy guarantees.
- Added explicit details on config file paths and browser data storage locations.
- Listed required permissions for network access, filesystem operations, and CDP connection (for Real Chrome control).
- Improved documentation of what user/session/browser data may be transmitted to the cloud, and what is strictly local.
- No changes to CLI usage or functionality. Documentation/metadata improvements only.

v1.0.11

- Added new metadata fields for credentials, data privacy, and platform-specific data storage paths.
- Clarified how credentials are stored locally (in config.json, not environment variables).
- Specified that all user data, cookies, sessions, and profile information are only stored locally and never uploaded.
- Provided details about the limited cloud communication: only minimal metadata is transmitted for captcha/stealth management, never browsing content.
- Updated documentation to show precise storage locations for macOS, Windows, and Linux.

v1.0.10

- Added SECURITY.md file for dedicated security and privacy documentation.
- Updated SKILL.md with a new "credentials" metadata field describing internal credential management.
- Added "user-confirmation-required" metadata to indicate user verification on first install.
- Removed in-depth security and privacy details from SKILL.md body; these are now referenced in SECURITY.md.
- No changes to core functionality or commands.

v1.0.9

No changes in functionality or documentation.
- No file changes detected in this version.
- No updates to features, usage, or security information.

v1.0.8

Added security and privacy section.
- Adds a new section detailing security, privacy, and data handling.
- Clearly explains which data is local, what is transmitted to cloud services, and persistence options.
- Removes internal metadata, credentials, and sensitive-capabilities fields from SKILL.md.
- No functional or CLI interface changes.

v1.0.7

v1.0.6

- Updated authentication management: the CLI now handles authentication flow internally; users no longer need to manage environment variables for credentials.
- Improved privacy documentation: clarified that only browsing profile data and session cookies are stored locally, and are never uploaded.
- Removed details about transmission of API tokens and local-access description for Real Chrome mode.
- No code changes or feature additions; documentation and metadata improvements only.

v1.0.5

v1.0.4

- Added detailed credentials and data-privacy metadata, clarifying local storage of cookies, sessions, and browsing data.
- Stated that only the API token is transmitted to BrowserAct cloud services; no user browsing data is sent externally.
- Updated explanation of Real Chrome mode's access to existing Chrome sessions and cookies.
- No functional changes to commands or workflow.

v1.0.3

- Added explicit credential management section: CLI handles authentication internally with no environment variables required.
- Listed key actions that require user confirmation, such as browser deletion, first-time install, and using Real Chrome auto-connect.
- Minor clarifications in documentation on Real Chrome connection and confirmation steps.
- No code changes; SKILL.md metadata and documentation updates only.

v1.0.2

- Added a "homepage" field to metadata with the official project website.
- Updated sensitive capabilities to clarify captcha solving and stealth browser management use BrowserAct cloud services and require network access.
- Expanded documentation on cleaning up persistent data (browser deletion or profile reset).
- Minor improvements for clarity and consistency in examples and option descriptions.

v1.0.1

Version 1.0.1 introduces metadata, Real Chrome kernel support, improved browser listing, and admin/data-path details.
- Added `install`, `source`, `data-paths`, and `sensitive-capabilities` fields to metadata for easier integration and transparency.
- Now documents and supports a BrowserAct-managed kernel option for Real Chrome sessions.
- Enhanced browser listing with pagination options (`--page`, `--page-size`).
- Clarified and documented platform-specific data storage paths for browser profiles.
- New global option `--no-auto-dialog` to control JavaScript dialog handling.
- Updated descriptions and command references for clarity and safety.

v1.0.0

- Initial release of browser-act CLI for automated browser interactions.
- Supports anti-detection stealth browsing, captcha solving, and parallel multi-browser sessions.
- Provides commands for navigating pages, filling forms, clicking buttons, scraping data, and handling captchas.
- Allows connection to existing Chrome sessions or creation of isolated stealth browsers with proxy support.
- Flexible authentication: use interactive registration or direct API key set.
- Designed for robust automation flows: open → inspect → interact → verify; supports command chaining and session persistence.

Metadata

Slug browser-act

Version 1.0.12

License MIT-0

All-time Installs 54

Active Installs 54

Total Versions 13

Frequently Asked Questions

What is browser-act?

Browser automation CLI for AI agents with anti-detection stealth browsing, captcha solving, and parallel multi-browser support. Use when the user needs to in... It is an AI Agent Skill for Claude Code / OpenClaw, with 335 downloads so far.

How do I install browser-act?

Run "/install browser-act" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is browser-act free?

Yes, browser-act is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does browser-act support?

browser-act is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created browser-act?

It is built and maintained by browser-act (@browseract-ai); the current version is v1.0.12.
