Description

AI-powered browser automation for complex multi-step web workflows. Uses Browser-Use framework when OpenClaw's built-in browser tool can't handle login flows...

README (SKILL.md)

Browser-Use — AI Browser Automation

Name: Browser Use Pro
Author: abczsl520

Security & Privacy

No credential logging: Passwords are handled via Browser-Use's sensitive_data parameter — the LLM never sees real credentials, only placeholder tokens.
User-initiated Chrome connection: CDP mode (connecting to real Chrome) is opt-in and requires the user to manually launch Chrome with debug flag. The skill never silently connects to running browsers.
All packages are open-source: Dependencies are browser-use (38k+ ⭐ on GitHub), playwright (by Microsoft), and langchain-openai — all widely audited open-source tools.
Local execution only: Scripts run locally on the user's machine. No data is sent to any server except the configured LLM API for step-by-step reasoning.
Domain restriction available: Use allowed_domains parameter to restrict which websites the agent can visit.
No telemetry: This skill does not collect, store, or transmit any usage data.
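
To make the domain restriction above concrete, the sketch below shows the kind of check an allowlist implies. It uses only the standard library; `is_allowed` is a hypothetical helper written for illustration, and Browser-Use's own matching rules for allowed_domains (wildcards, schemes) may differ.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one.

    Hypothetical helper for illustration; not Browser-Use's implementation.
    """
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower().lstrip(".")
        # Exact match, or a match on a subdomain boundary (rejects "evilreddit.com").
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Matching on the dot boundary rather than a plain suffix is the design point: a naive `endswith("reddit.com")` would also accept look-alike hosts.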

When to Use Browser-Use vs Built-in Tool

Scenario	Built-in tool	Browser-Use
Screenshot / click one button	✅ Free & fast	❌ Overkill
5+ step workflow (login→navigate→fill→submit)	❌ Breaks easily	✅
Anti-bot sites (real Chrome needed)	❌	✅
Batch repetitive operations	❌	✅

Cost: Browser-Use calls an external LLM on every step, which costs money and is slower than the built-in tool. Use the built-in tool for simple actions.
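
The table above can be read as a small decision rule. The following sketch encodes it; `choose_tool` and its argument names are hypothetical, and the threshold of 5 steps simply follows the "5+ step workflow" row.

```python
def choose_tool(steps: int, needs_real_chrome: bool = False, batch: bool = False) -> str:
    """Pick a tool following the comparison table (illustrative helper only)."""
    # Any of the three Browser-Use rows in the table triggers the heavier tool.
    if needs_real_chrome or batch or steps >= 5:
        return "browser-use"
    # Single screenshots or clicks stay on the free, fast built-in tool.
    return "built-in"
```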

Execution Flow

1. Check Environment

test -d ~/browser-use-env && echo "Installed" || echo "Need install"
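
If the check needs to run from Python rather than the shell, a minimal equivalent might look like this. `env_status` is a hypothetical name; the default location mirrors the ~/browser-use-env path used above.

```python
from pathlib import Path
from typing import Optional


def env_status(env_dir: Optional[Path] = None) -> str:
    """Mirror the shell check: report whether the virtualenv directory exists."""
    if env_dir is None:
        env_dir = Path.home() / "browser-use-env"
    return "Installed" if env_dir.is_dir() else "Need install"
```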

2. First-Time Setup (once only)

python3 -m venv ~/browser-use-env
source ~/browser-use-env/bin/activate
pip install browser-use playwright langchain-openai
playwright install chromium

3. Choose Mode

Mode A — Built-in Chromium: For simple automation or when detection doesn't matter. Runs immediately.
Mode B — Real Chrome CDP: For anti-bot sites or when user's login session is needed. Requires user action.

Mode B setup — prompt user:

Please quit Chrome completely (Mac: Cmd+Q), then tell me "done"

After user confirms:

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 &

Verify: curl -s http://127.0.0.1:9222/json/version
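
The same verification can be done from Python before a script tries to connect. This is a sketch using only the standard library; `cdp_version` is a hypothetical helper, and it assumes the default address and port shown above.

```python
import http.client
import json
import urllib.request
from typing import Optional


def cdp_version(base_url: str = "http://127.0.0.1:9222", timeout: float = 2.0) -> Optional[dict]:
    """Return Chrome's /json/version payload, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(f"{base_url}/json/version", timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # OSError: refused connection or timeout; ValueError: bad URL or bad JSON.
        return None


if __name__ == "__main__":
    info = cdp_version()
    print(info.get("Browser") if info else "Chrome is not listening on port 9222")
```

Returning None instead of raising lets the caller fall back to prompting the user to relaunch Chrome.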

4. Write Script and Run

Write script to user's workspace, then:

source ~/browser-use-env/bin/activate
python3 script_path.py
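
When the script is launched from another Python process, calling the virtualenv's interpreter directly avoids the need to `source` the activate script. The sketch below assumes the standard venv layout; `venv_python` and `run_script` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a virtualenv (POSIX and Windows layouts)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def run_script(script: Path, interpreter: Path, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run the script and capture its output so it can be reported to the user."""
    return subprocess.run(
        [str(interpreter), str(script)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

A call such as `run_script(Path("script_path.py"), venv_python(Path.home() / "browser-use-env"))` returns an object whose `returncode`, `stdout`, and `stderr` can feed the report in the next step.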

5. Report Results

Return results to user. On failure, follow the troubleshooting tree below.

Script Template

import asyncio
from browser_use import Agent, ChatOpenAI, Browser

async def main():
    # LLM — any OpenAI-compatible API
    llm = ChatOpenAI(
        model="gpt-4o-mini",
        api_key="<YOUR_API_KEY>",  # From env var or user config
        base_url="https://api.openai.com/v1",
    )

    # Mode A: Built-in Chromium
    browser = Browser(headless=False, user_data_dir="~/.browser-use/task-profile")
    # Mode B: Real Chrome (user must launch with --remote-debugging-port=9222)
    # browser = Browser(cdp_url="http://127.0.0.1:9222")

    agent = Agent(
        task="Detailed step-by-step task description (see guide below)",
        llm=llm, browser=browser,
        use_vision=True, max_steps=25,
    )
    result = await agent.run()
    print(result)

asyncio.run(main())
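
The template's comment says the key should come from an env var or user config. One way to do that is sketched below; the variable name OPENAI_API_KEY is an assumption (the skill does not declare one), and `load_api_key` is a hypothetical helper.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the LLM API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the script")
    return key
```

In the template, `api_key=load_api_key()` would then replace the hard-coded string, so the key never lands in a file in the workspace.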

Task Writing Guide

✅ Good: Specific steps

task = """
1. Open https://www.reddit.com/login
2. Enter username: x_user
3. Enter password: x_pass
4. Click login button
5. If CAPTCHA appears, wait 30s for user to complete
6. Navigate to https://www.reddit.com/r/xxx/submit
7. Enter title: xxx
8. Enter body: xxx
9. Click submit
"""

❌ Bad: Vague

task = "Post something on Reddit"

Tips

Keyboard fallback: Add "If button can't be clicked, use Tab+Enter"
Error recovery: Add "If page fails to load, refresh and retry"
Sensitive data: Use placeholders + sensitive_data parameter
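
The numbered-steps style and the fallback tips can be assembled programmatically, which helps when tasks are generated in batches. This is only a sketch; `build_task` is a hypothetical helper and the "Note:" prefix is an arbitrary formatting choice.

```python
def build_task(steps: list[str], fallbacks: tuple[str, ...] = ()) -> str:
    """Number the steps and append fallback instructions as trailing lines."""
    lines = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines += [f"Note: {rule}" for rule in fallbacks]
    return "\n".join(lines)


task = build_task(
    [
        "Open https://www.reddit.com/login",
        "Enter username: x_user",
        "Enter password: x_pass",
        "Click login button",
    ],
    fallbacks=(
        "If button can't be clicked, use Tab+Enter",
        "If page fails to load, refresh and retry",
    ),
)
```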

Credential Security

agent = Agent(
    task="Login with x_user and x_pass",
    sensitive_data={"x_user": "user@example.com", "x_pass": "S3cret!"},  # placeholder values
    use_vision=False,  # Disable screenshots when handling passwords
    llm=llm, browser=browser,
)
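
Before handing a task to the agent, it can be worth checking that the task text contains only the placeholder names and never the real values. The sketch below is a standalone pre-flight check, not part of Browser-Use; `check_task_uses_placeholders` is a hypothetical helper.

```python
def check_task_uses_placeholders(task: str, sensitive_data: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the task text looks safe."""
    problems = []
    for placeholder, secret in sensitive_data.items():
        # A real value in the task text would be sent to the LLM verbatim.
        if secret and secret in task:
            problems.append(f"real value for {placeholder!r} appears in the task text")
        # An unreferenced placeholder usually means a typo in the task.
        if placeholder not in task:
            problems.append(f"placeholder {placeholder!r} is never referenced")
    return problems
```

This only inspects the prompt you wrote; it does not verify what the library sends to the LLM at run time.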

Key Parameters

Parameter	Purpose	Recommended
`use_vision`	AI sees screenshots	True normally, False with passwords
`max_steps`	Max actions	20-30
`max_failures`	Max retries	3 (default)
`flash_mode`	Skip reasoning	True for simple tasks
`extend_system_message`	Custom instructions	Add specific guidance
`allowed_domains`	Restrict URLs	Use for security
`fallback_llm`	Backup LLM	When primary is unstable
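
The recommendations in the table can be collected into one place so every script applies them consistently. This sketch only builds a dictionary of the parameter names listed above; `agent_options` is a hypothetical helper, and the default of 25 steps is taken from the script template.

```python
def agent_options(handles_passwords: bool, simple_task: bool = False, max_steps: int = 25) -> dict:
    """Build keyword arguments following the Recommended column of the table."""
    return {
        "use_vision": not handles_passwords,  # no screenshots while passwords are on screen
        "max_steps": max_steps,               # table recommends 20-30
        "max_failures": 3,                    # listed as the default
        "flash_mode": simple_task,            # skip reasoning only for simple tasks
    }
```

The result could be unpacked into the constructor from the template, for example `Agent(task=task, llm=llm, browser=browser, **agent_options(handles_passwords=True))`.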

Troubleshooting

Detected as automation?
  └→ Switch to Mode B (real Chrome)

CAPTCHA / human verification?
  └→ Prompt user to complete manually, add wait time in task

LLM timeout?
  └→ Set fallback_llm or use faster model

Action succeeded but no effect (e.g. post not published)?
  └→ 1. Check if platform anti-spam blocked it (common with new accounts)
     2. Add explicit confirmation steps to task

Website UI changed, can't find elements?
  └→ Browser-Use auto-adapts, but add fallback paths in task
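
For the LLM timeout branch, the general idea of "try the primary, fall back on timeout" can be sketched with plain asyncio. This is a generic pattern under stated assumptions, not how Browser-Use implements fallback_llm; `with_fallback` is a hypothetical helper.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_fallback(
    primary: Callable[[], Awaitable[T]],
    fallback: Callable[[], Awaitable[T]],
    timeout: float,
) -> T:
    """Await primary(); if it exceeds the timeout, await fallback() instead."""
    try:
        return await asyncio.wait_for(primary(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the slow primary call before raising.
        return await fallback()
```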

LLM Compatibility

LLM	Works	Notes
GPT-4o / 4o-mini	✅	Best choice, recommended
Claude	✅	Works well
Gemini	❌	Structured output incompatible

Usage Guidance

This skill appears to do what it claims (complex browser automation), but it has some concerning and inconsistent points you should address before using it:

- Expect to provide an LLM API key (the SKILL.md uses api_key) even though the registry says no env vars are required. Decide where and how you will store that key; prefer a dedicated env var or local config over embedding it in code.
- Review the 'browser-use' package on PyPI/GitHub before pip installing. The SKILL.md claims large popularity; confirm the project identity and inspect its code or maintainers.
- Run automation in an isolated environment: create a dedicated virtualenv, and use a separate browser profile (user_data_dir) or a disposable VM/container to avoid exposing your main browser cookies/sessions.
- For Mode B (CDP/remote debugging), be aware that a Chrome instance launched with remote debugging can expose session data to whatever connects to that port. Only enable this when you fully trust the code you run, and do so on an isolated profile or machine.
- Limit data sent to the remote LLM: avoid enabling screenshots/use_vision when dealing with passwords or sensitive pages, and use the documented 'sensitive_data' placeholders, but validate that the library actually enforces that behavior.
- If you need stronger guarantees, consider running the tool offline or with a self-hosted LLM, or perform the sensitive login step manually and then let the agent continue with non-sensitive actions.
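
One way to follow the advice about a separate browser profile is to create a throwaway directory per run and delete it afterwards. The sketch below uses only the standard library; `disposable_profile` is a hypothetical helper, and it assumes the resulting path is passed as user_data_dir as in the script template.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def disposable_profile(prefix: str = "browser-use-profile-") -> Iterator[Path]:
    """Create an empty profile directory and delete it when the block exits."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Remove cookies and session data left behind by the run.
        shutil.rmtree(path, ignore_errors=True)
```

Inside a `with disposable_profile() as profile:` block, `str(profile)` would take the place of "~/.browser-use/task-profile". The trade-off is that logins do not persist between runs.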

Capability Analysis

Type: OpenClaw Skill
Name: browser-use-pro
Version: 1.2.0

The skill provides a legitimate framework for AI-driven browser automation using the `browser-use` and `playwright` libraries. It includes detailed setup instructions, security best practices for handling credentials (using `sensitive_data` and disabling vision), and requires explicit user consent to enable remote debugging for session-based automation. No malicious patterns, such as data exfiltration or unauthorized persistence, were detected in the code or instructions.

Capability Assessment

ℹ Purpose & Capability

Name/description (browser automation for complex flows) aligns with the instructions and listed Python packages (browser-use, playwright, langchain-openai). However the registry metadata states 'Required env vars: none' while SKILL.md clearly requires an LLM API key (api_key in the example) and runtime configuration — this mismatch is unexplained.

⚠ Instruction Scope

The SKILL.md instructs creating a virtualenv, pip-installing packages, running Playwright, writing/running Python scripts that drive real browsers, and optionally launching Chrome with --remote-debugging-port. Those steps can access local browser profiles, cookies, and pages; the skill also recommends sending screenshots and page content to an external LLM. Although the doc describes 'sensitive_data' placeholders, the instructions still place significant discretion and access in the agent (screenshots, HTML, and browser session data sent to an LLM). They are also inconsistent about who launches Chrome: the text says the user must launch it manually, yet the execution flow shows a command that launches it after the user confirms.

ℹ Install Mechanism

This is instruction-only (no install spec). The suggested install flow uses pip and Playwright (standard package sources). That is lower-risk than arbitrary downloads, but the user should vet the PyPI packages (and the claimed 'browser-use' project) before pip installing into a host environment.

⚠ Credentials

Metadata declares no required env vars, but SKILL.md demonstrates and implies an LLM API key (api_key) and use of user_data_dir for browser profiles. Requiring an API key for an LLM is expected for this functionality, but its absence from the declared requirements is an inconsistency. Also, connecting to a browser debug port and using an existing profile gives the agent access to cookies, sessions, and stored secrets, a high-scope capability that users must explicitly accept.

ℹ Persistence & Privilege

The skill does not request 'always: true'. It suggests creating a per-user virtualenv (~/browser-use-env) and a profile dir (~/.browser-use/task-profile), which is standard for a local tool but creates persistent artifacts in the user's home. As documented, it does not modify other skills or system-wide settings.

Version History

v1.2.0

Added Security & Privacy section: credential safety, user-initiated CDP, open-source deps, domain restriction, no telemetry

v1.1.0

Added AI Dev Quality Suite cross-references, expanded Related links

v1.0.0

Initial release: AI-powered browser automation with real Chrome CDP, sensitive data handling, task prompting guide, and failure recovery.

Metadata

Slug browser-use-pro

Version 1.2.0

License —

All-time Installs 4

Active Installs 4

Total Versions 3

Frequently Asked Questions

What is Browser Use Pro?

AI-powered browser automation for complex multi-step web workflows. Uses Browser-Use framework when OpenClaw's built-in browser tool can't handle login flows... It is an AI Agent Skill for Claude Code / OpenClaw, with 700 downloads so far.

How do I install Browser Use Pro?

Run "/install browser-use-pro" in the OpenClaw or Claude Code chat to install the skill in one step. First use still requires the one-time Python environment setup and an LLM API key, as described in the README.

Is Browser Use Pro free?

Yes, Browser Use Pro is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Browser Use Pro support?

Browser Use Pro is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Browser Use Pro?

It is built and maintained by abczsl520 (@abczsl520); the current version is v1.2.0.
