
Browser Use Local

Author: fengjiajie · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 1549

Install in OpenClaw

/install browser-use-local

Description

Automate browser actions locally via browser-use CLI/Python: open pages, click/type, screenshot, extract HTML/links, debug sessions, and capture login QR codes.

Usage instructions (SKILL.md)

browser-use (local) playbook

Default constraints in this environment

Prefer browser-use (CLI/Python) over the OpenClaw browser tool here; the OpenClaw browser may fail if no supported system browser is present.
Use persistent sessions for multi-step flows: --session <name>.

Quick CLI workflow (non-agent)

Open

browser-use --session demo open https://example.com

Inspect (sometimes state returns 0 elements on heavy/JS sites)

browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
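
Where jq is not installed, the same summary can be produced in Python. This is a minimal sketch that assumes only the payload shape implied by the jq filter above (a top-level `data` object with `url`, `title`, and `elements`); the helper name `summarize_state` is hypothetical and not part of browser-use.

```python
import json


def summarize_state(raw: str) -> dict:
    """Summarize the JSON printed by `browser-use --json state`.

    Assumed shape: {"data": {"url": ..., "title": ..., "elements": [...]}}.
    Missing keys are tolerated so an empty state does not raise.
    """
    data = json.loads(raw).get("data") or {}
    elements = data.get("elements") or []
    return {
        "url": data.get("url"),
        "title": data.get("title"),
        "elements": len(elements),
    }


# Illustrative payload only; real output may carry more fields.
sample = '{"data": {"url": "https://example.com", "title": "Example Domain", "elements": []}}'
print(summarize_state(sample))
```

An element count of 0 here is the cue to fall back to the screenshot and HTML steps.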

Screenshot (always works; best debugging primitive)

browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png

HTML for link discovery (works even when state is empty)

browser-use --session demo --json get html > /tmp/page_html.json
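python3 - <<'PY'
import json, re

# Load the HTML string from the JSON payload written above.
html = json.load(open('/tmp/page_html.json')).get('data', {}).get('html', '')
# Collect absolute URLs, stopping at whitespace, quotes, or angle brackets.
urls = set(re.findall(r"https?://[^\s\"'<>]+", html))
# Print up to 200 URLs that look login- or QR-related.
keywords = ['demo', 'login', 'console', 'qr', 'qrcode']
for u in sorted(u for u in urls if any(k in u for k in keywords))[:200]: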
    print(u)
PY

Lightweight DOM queries via JS (useful when state is empty)

browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"

Agent workflow with OpenAI-compatible LLM (Moonshot/Kimi)

Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.

Minimal working Kimi example

Create .env (or export env vars) with:

OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.moonshot.cn/v1

Then run the bundled script:

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py
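
Before launching the agent, it can help to confirm that both variables are actually set. The following is a small preflight sketch, not part of the bundled script; `missing_env` and `REQUIRED_VARS` are hypothetical names. It inspects a mapping (the process environment by default) and does not read `.env` itself, so values kept in a `.env` file would need to be loaded first, for example with python-dotenv's `load_dotenv()`.

```python
import os

# The two variables named in the .env example above.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_BASE_URL")


def missing_env(env=None) -> list:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]


# Example with an explicit mapping: the API key is absent.
missing = missing_env({"OPENAI_BASE_URL": "https://api.moonshot.cn/v1"})
print(missing)
```

Calling `missing_env()` with no argument checks the real environment; an empty list means the agent script has what it needs.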

Kimi/Moonshot quirks observed in practice (fixes):

temperature must be 1 for kimi-k2.5.
frequency_penalty must be 0 for kimi-k2.5.
Moonshot can reject strict JSON Schema used for structured output. Enable:
- remove_defaults_from_schema=True
- remove_min_items_from_schema=True

If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two flags are the first thing to set.
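
To make the effect of those two flags concrete, here is a rough sketch of the kind of schema cleanup they imply. It is an illustration under assumptions, not browser-use's implementation: it assumes the rejected keywords are JSON Schema's `default` and `minItems`, and `strip_keywords` is a hypothetical helper.

```python
# Keys whose children are name -> schema maps, so a property that happens
# to be called "default" must not be mistaken for the keyword.
NAME_MAPS = {"properties", "$defs", "definitions"}


def strip_keywords(schema, keywords=("default", "minItems"), _in_name_map=False):
    """Return a copy of a JSON Schema-like structure without the given keywords."""
    if isinstance(schema, dict):
        cleaned = {}
        for key, value in schema.items():
            if not _in_name_map and key in keywords:
                continue  # drop the unsupported keyword and its value
            child_is_name_map = key in NAME_MAPS and not _in_name_map
            cleaned[key] = strip_keywords(value, keywords, child_is_name_map)
        return cleaned
    if isinstance(schema, list):
        return [strip_keywords(item, keywords) for item in schema]
    return schema


schema = {
    "type": "object",
    "properties": {
        "default": {"type": "string", "default": "x"},
        "tags": {"type": "array", "minItems": 1, "items": {"type": "string"}},
    },
}
cleaned = strip_keywords(schema)
print(cleaned)
```

The sketch returns a new structure and leaves the input untouched, which makes it easy to compare the schema before and after when debugging a 400 response.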

QR code extraction (login/demo pages)

Preferred order

Screenshot the page and crop candidate regions (fast, robust).
If HTML contains data:image/png;base64,..., extract and decode it.

Crop candidates

Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
  --in /home/node/.openclaw/workspace/login.png \
  --outdir /home/node/.openclaw/workspace/qr_crops
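
The internals of crop_candidates.py are not shown here, so the following is only a hypothetical illustration of the general idea: tile the screenshot with square boxes of a few sizes and let a QR decoder or a human pick the crop that contains the code. The function name, the grid layout, and the size percentages are all assumptions.

```python
def candidate_boxes(width, height, percents=(30, 50), grid=3):
    """Return (left, top, right, bottom) square boxes spread over the image.

    Each side length is a percentage of the shorter image dimension, and
    boxes are placed on a grid x grid lattice from corner to corner.
    grid must be at least 2.
    """
    boxes = []
    short = min(width, height)
    for pct in percents:
        side = short * pct // 100
        for row in range(grid):
            for col in range(grid):
                left = (width - side) * col // (grid - 1)
                top = (height - side) * row // (grid - 1)
                boxes.append((left, top, left + side, top + side))
    return boxes


boxes = candidate_boxes(1280, 720)
print(len(boxes), boxes[4])
```

Each tuple uses the (left, top, right, bottom) order that Pillow's `Image.crop` accepts, so a box can be passed directly as `Image.open(path).crop(box)`.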

Extract base64-embedded images from HTML

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
  --in /tmp/page_html.json \
  --outdir /home/node/.openclaw/workspace/data_imgs
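
For reference, the core of this technique can be sketched in a few lines. This is not the bundled extract_data_images.py, whose code is not reproduced here; `extract_data_images` and `DATA_URI_RE` are hypothetical names, and the list of image types is an assumption.

```python
import base64
import re

# Matches inline images such as data:image/png;base64,iVBORw0...
DATA_URI_RE = re.compile(r"data:image/(png|jpeg|gif|webp);base64,([A-Za-z0-9+/=]+)")


def extract_data_images(html: str) -> list:
    """Return (extension, raw_bytes) for each base64 image data URI in html."""
    images = []
    for match in DATA_URI_RE.finditer(html):
        ext, payload = match.groups()
        try:
            images.append((ext, base64.b64decode(payload, validate=True)))
        except ValueError:
            continue  # skip truncated or malformed payloads
    return images


# Build a tiny example: the 8-byte PNG signature embedded as a data URI.
png_signature = b"\x89PNG\r\n\x1a\n"
html = '<img src="data:image/png;base64,' + base64.b64encode(png_signature).decode() + '">'
found = extract_data_images(html)
print([(ext, len(data)) for ext, data in found])
```

With the file produced by the command above, the HTML string would come from the `data.html` field of /tmp/page_html.json, and each decoded payload could then be written to the output directory.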

Troubleshooting

state shows elements: 0: use get html + regex discovery, plus screenshots; use eval to query DOM.
Page readiness timeout warnings: usually harmless; rely on screenshot + HTML.
CLI flags order: global flags go before the subcommand:
- ✅ browser-use --browser chromium --json open https://...
- ❌ browser-use open https://... --browser chromium
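
When driving the CLI from Python, the ordering rule is easy to enforce by building the argument list in one place. A minimal sketch, using only the flags that appear in this playbook (`--browser`, `--session`, `--json`); `build_command` is a hypothetical helper.

```python
def build_command(subcommand, *args, session=None, json_output=False, browser=None):
    """Build a browser-use argv with global flags placed before the subcommand."""
    cmd = ["browser-use"]
    if browser:
        cmd += ["--browser", browser]
    if session:
        cmd += ["--session", session]
    if json_output:
        cmd.append("--json")
    cmd.append(subcommand)  # global flags end here
    cmd.extend(args)        # subcommand arguments follow
    return cmd


cmd = build_command("open", "https://example.com", browser="chromium", json_output=True)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)` without going through a shell.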

Security recommendations

This skill appears to implement local browser automation helpers and small Python utilities; the code is short and readable. Before installing or running it, consider:
- Sensitivity: Running the bundled agent (run_agent_kimi.py) requires OPENAI_API_KEY and OPENAI_BASE_URL and will contact the specified LLM provider. Page HTML and screenshots could be sent to that remote endpoint, so do not run the agent on pages containing secrets (password fields, private dashboards) unless you trust the provider and its data-handling policy.
- Metadata mismatch: The registry lists no required env vars or dependencies, but the SKILL.md and code do require LLM credentials and Python packages (Pillow, python-dotenv, browser_use, etc.). Expect to create/activate a venv and install dependencies manually.
- Minimal audit: The included scripts (image crop, base64 extraction, small agent runner) are small and understandable. If you plan to use it, run the non-agent CLI workflows first (they don't require an LLM key) and inspect/execute the Python scripts in an isolated environment.
- Hardening suggestions: Only provide OPENAI_API_KEY/OPENAI_BASE_URL to this skill if you trust the endpoint; consider using an account with limited privileges, or run the agent in an isolated VM/container. Ask the publisher to update registry metadata to declare required env vars and dependencies so you can make an informed decision.

Functional analysis

Type: OpenClaw Skill
Name: browser-use-local
Version: 1.0.0

The skill bundle is designed for browser automation, including interacting with web pages, taking screenshots, extracting HTML, and identifying QR codes. All provided scripts (`crop_candidates.py`, `extract_data_images.py`, `run_agent_kimi.py`) and the `SKILL.md` instructions are clearly aligned with this stated purpose. While browser automation and LLM integration involve inherently powerful capabilities and access to API keys (e.g., `OPENAI_API_KEY` in `scripts/run_agent_kimi.py`), there is no evidence of intentional harmful behavior such as data exfiltration, unauthorized execution, persistence, or malicious prompt injection against the agent. File operations are confined to the workspace or temporary directories, and the instructions are functional rather than deceptive.

Capability assessment

ℹ Purpose & Capability

The SKILL.md and included scripts are coherent with a 'browser-use local' helper: CLI examples, screenshot/HTML extraction, QR crop helper, and a small agent runner. The skill lacks a one-line description in the registry metadata, but the code and docs align with the claimed functionality.

⚠ Instruction Scope

The runtime instructions and run_agent_kimi.py require an OPENAI_API_KEY and OPENAI_BASE_URL and will call an external LLM (Moonshot/Kimi). That implies page HTML and/or screenshots may be transmitted to that remote endpoint during agent runs — a privacy/data-exfiltration risk for sensitive pages (logins, consoles, QR codes). The SKILL.md does not explicitly warn the user that page content will be sent to the LLM provider.

ℹ Install Mechanism

There is no install spec (instruction-only), which is low risk in that nothing is fetched automatically. However the Python scripts import packages (Pillow, python-dotenv, browser_use, etc.) and the README expects a virtualenv path to exist. Required dependencies are not declared in registry metadata, so a user may need to install additional packages manually.

⚠ Credentials

The registry metadata lists no required environment variables, but the SKILL.md and run_agent_kimi.py explicitly require OPENAI_API_KEY and OPENAI_BASE_URL (and optionally OPENAI_MODEL, etc.). Requesting an LLM API key/base URL is proportionate to running the bundled agent, but the omission from metadata is an inconsistency and a practical risk: secrets must be provided and will be used to contact an external service.

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide settings. It uses the platform default (agent invocation allowed), which is expected for agent-capable skills.

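How to use

Make sure OpenClaw is installed (locally or via Docker).
In the chat box, enter the install command: /install browser-use-local
Once installation finishes, invoke the Skill by name or trigger it with /browser-use-local
Provide the inputs described in the Skill's parameter notes to get structured output.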

Version history

v1.0.0

- Initial release of the browser-use-local skill for browser automation in OpenClaw containers/hosts.
- Provides CLI and Python instructions for opening pages, clicking/typing, screenshots, HTML/link extraction, and QR code retrieval.
- Documents persistent session usage and troubleshooting for JS-heavy sites (state empty, page readiness).
- Details the workflow for running Agents with OpenAI-compatible LLMs (Moonshot/Kimi), including known parameter quirks and fixes.
- Includes example scripts for QR code extraction from screenshots and HTML-embedded images.
- Clarifies CLI flag order and recommends when to use browser-use over the OpenClaw browser tool.

Metadata

Slug: browser-use-local

Version: 1.0.0

License: not specified

Total installs: 7

Current installs: 6

Number of versions: 1

FAQ