Description

Use this skill when you need to control or perform actions in the user's Chrome tab.

Usage instructions (SKILL.md)

Browser Bridge CLI

Name: Browser Agent Bridge CLI
Author: nmadeleidev

When to use

Use this skill when you need to control a real Chrome tab. Typical situations:

browser automation with live user browser context
page observation (interactive elements and DOM snapshots)
remote tab actions (navigate, click, type, press_key, scroll)
troubleshooting connection state between agent and browser

Project:

https://github.com/NmadeleiDev/browser_agent_bridge

What this gives you

This workflow has three connected parts:

Browser extension in Chrome receives tab commands.
Bridge server routes messages between browser and operator.
Operator CLI sends commands and reads results.

CLI commands used:

browser-bridge-server to run the server
browser-bridge to run operator actions

Prerequisites

Python 3.10+
Chrome browser
Terminal access
Ability to load an unpacked Chrome extension

Agent responsibility before startup

Before starting the server, generate strong tokens. Do not use weak defaults.

Example token generation:

python3 - <<'PY'
import secrets
print("BRIDGE_SHARED_TOKEN=" + secrets.token_urlsafe(32))
print("BRIDGE_OPERATOR_TOKEN=" + secrets.token_urlsafe(32))
PY

Use the generated values when starting the server. Share only the client token (BRIDGE_SHARED_TOKEN) with the user for extension setup. Keep the operator token (BRIDGE_OPERATOR_TOKEN) for the agent's own CLI usage.

Install the CLI

python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install browser-agent-bridge

Upgrade later:

pipx upgrade browser-agent-bridge

Start the bridge server

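Use static auth for a straightforward local setup. The token values shown here are placeholders; substitute the tokens generated in the previous step: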

export BRIDGE_AUTH_MODE=static
export BRIDGE_SHARED_TOKEN='change-me-strong-token'
export BRIDGE_OPERATOR_TOKEN='Str0ng!Operator#42'
browser-bridge-server >/tmp/browser-bridge-server.log 2>&1 &
echo $! >/tmp/browser-bridge-server.pid

Start browser-bridge-server in the background. Do not leave it attached to the current shell, because the agent needs that shell for follow-up CLI commands, status checks, and diagnostics. If startup needs verification, inspect the log file or process state after backgrounding it.
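The same background startup can be driven from Python when the agent prefers not to shell out by hand. The following is a minimal sketch, not part of the project: the helper names `build_server_env` and `start_server` are hypothetical, the environment variables and file paths are the ones used above, and `start_new_session=True` assumes a POSIX system.

```python
import os
import secrets
import subprocess


def build_server_env(base_env=None):
    """Return a copy of the environment with static auth and fresh tokens."""
    env = dict(os.environ if base_env is None else base_env)
    env["BRIDGE_AUTH_MODE"] = "static"
    env["BRIDGE_SHARED_TOKEN"] = secrets.token_urlsafe(32)
    env["BRIDGE_OPERATOR_TOKEN"] = secrets.token_urlsafe(32)
    return env


def start_server(env,
                 log_path="/tmp/browser-bridge-server.log",
                 pid_path="/tmp/browser-bridge-server.pid"):
    """Launch browser-bridge-server detached from this shell and record its PID."""
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            ["browser-bridge-server"],
            env=env,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: keep the server out of our session
        )
    with open(pid_path, "w") as fh:
        fh.write(str(proc.pid))
    return proc
```

Calling `start_server(build_server_env())` returns immediately; `proc.poll()` stays `None` while the server is still running, and the log file holds any startup errors.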

Default endpoints:

Extension client WS: ws://127.0.0.1:8765/ws/client
Operator CLI WS: ws://127.0.0.1:8765/ws/operator
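Before asking the user to connect the extension, it can help to confirm that something is listening on the default port. This is a rough sketch using only the standard library; `is_listening` is a hypothetical helper, and a successful TCP connect says nothing about WebSocket routing or token validity.

```python
import socket


def is_listening(host="127.0.0.1", port=8765, timeout_s=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If this returns False shortly after startup, inspect /tmp/browser-bridge-server.log before retrying.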

Connect the Chrome extension (tell your human to do this)

Open chrome://extensions.
Enable Developer mode.
Click Load unpacked.
Select the extension provided by this project from https://github.com/NmadeleiDev/browser_agent_bridge (extension/ folder).
Open the Browser Bridge extension popup.
Fill fields:

Bridge Server WS URL: ws://127.0.0.1:8765/ws/client
Instance ID: local-instance
Client ID: chrome-main
Auth Token / JWT: value of BRIDGE_SHARED_TOKEN generated by the agent

Click Save, then Connect.
Confirm popup status is connected to the server started by the agent.

Operator CLI usage

All examples use:

instance_id=local-instance
client_id=chrome-main
operator token Str0ng!Operator#42
operator websocket ws://127.0.0.1:8765/ws/operator

You can pass the operator token either with --token or by exporting BRIDGE_OPERATOR_TOKEN. The examples below use --token explicitly for clarity.
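For agents that assemble these invocations programmatically, the two token options might be sketched as follows. `operator_argv` is a hypothetical helper, not part of the CLI; it uses only the flags shown in this document and omits --token when the caller relies on the exported BRIDGE_OPERATOR_TOKEN.

```python
import shlex

OPERATOR_WS = "ws://127.0.0.1:8765/ws/operator"


def operator_argv(subcommand, *args, token=None, server_ws_url=OPERATOR_WS):
    """Build an argv list for browser-bridge.

    When token is None, no --token flag is added and the CLI is expected to
    read BRIDGE_OPERATOR_TOKEN from the environment instead.
    """
    argv = ["browser-bridge", "--server-ws-url", server_ws_url]
    if token is not None:
        argv += ["--token", token]
    argv.append(subcommand)
    argv.extend(args)
    return argv


def as_shell(argv):
    """Render an argv list as a safely quoted shell command line."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run` avoids shell quoting issues with characters such as `!` and `#` in tokens; `as_shell` is only for logging or display.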

List connected browser clients:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' list-clients

Check whether the specific client is connected:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  connect-status --instance-id local-instance --client-id chrome-main

Check whether tab command channel is ready:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  ping-tab --instance-id local-instance --client-id chrome-main

Observe interactive nodes on current page:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  observe --instance-id local-instance --client-id chrome-main --max-nodes 150

Get page HTML snapshot:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type get_html --payload '{"max_chars":40000}'

Navigate with adaptive load wait:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type navigate --payload '{"url":"https://example.com","wait_for_load":true,"wait_for_load_ms":7000}'

Click without load wait:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type click --payload '{"selector":"a[href]","wait_for_load":false}'

Type into an element:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type type --payload '{"selector":"input[name=q]","text":"browser bridge"}'
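The send-command invocations above pass the payload as a hand-written JSON string, which is easy to get wrong once selectors contain quotes. A small sketch of building the same argument list with `json.dumps` is shown here; `send_command_argv` is a hypothetical wrapper, and the payload keys are the ones that appear in the examples.

```python
import json

OPERATOR_WS = "ws://127.0.0.1:8765/ws/operator"


def send_command_argv(command_type, payload, token,
                      instance_id="local-instance", client_id="chrome-main",
                      server_ws_url=OPERATOR_WS):
    """Build the argv for a send-command call with a JSON-encoded payload."""
    return [
        "browser-bridge", "--server-ws-url", server_ws_url, "--token", token,
        "send-command", "--instance-id", instance_id, "--client-id", client_id,
        "--type", command_type,
        # Compact separators reproduce the formatting used in the examples.
        "--payload", json.dumps(payload, separators=(",", ":")),
    ]
```

For example, `send_command_argv("type", {"selector": "input[name=q]", "text": "browser bridge"}, token)` yields a list that can be handed directly to `subprocess.run` without a shell.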

Press a special key:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type press_key --payload '{"key":"Enter","selector":"input[name=q]"}'

press_key supports:

keys: Enter, Tab, Escape, Backspace, Delete, ArrowUp, ArrowDown, ArrowLeft, ArrowRight, Home, End, PageUp, PageDown, Space
aliases: return, esc, del, up, down, left, right, spacebar
modifiers: alt_key, ctrl_key, meta_key, shift_key
target selection via selector, ref, click_ref, or locator
default target: current document.activeElement when no selector/ref is provided
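A press_key payload builder could look like the sketch below. Two things are assumptions rather than documented behavior: the alias-to-key pairing is inferred from the order of the lists above, and modifiers are assumed to be boolean payload fields. Since the bridge is described as accepting aliases itself, the normalization here is only a convenience. `press_key_payload` is a hypothetical name.

```python
# Assumed pairing of aliases to canonical key names (inferred, not documented).
KEY_ALIASES = {
    "return": "Enter", "esc": "Escape", "del": "Delete",
    "up": "ArrowUp", "down": "ArrowDown", "left": "ArrowLeft",
    "right": "ArrowRight", "spacebar": "Space",
}
MODIFIERS = ("alt_key", "ctrl_key", "meta_key", "shift_key")


def press_key_payload(key, selector=None, **modifiers):
    """Build a press_key payload dict; unknown modifier names are rejected."""
    unknown = set(modifiers) - set(MODIFIERS)
    if unknown:
        raise ValueError(f"unsupported modifiers: {sorted(unknown)}")
    payload = {"key": KEY_ALIASES.get(key.lower(), key)}
    if selector is not None:
        # Without a selector the bridge targets document.activeElement.
        payload["selector"] = selector
    # Assumption: modifiers are sent as boolean fields, only when enabled.
    payload.update({name: True for name, on in modifiers.items() if on})
    return payload
```

The resulting dict can be serialized with `json.dumps` and passed as the --payload value of a send-command call with --type press_key.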

Recommended execution flow for agents

Ensure server process is running.
Ensure extension popup is connected with matching instance_id, client_id, and token.
Run list-clients.
Run connect-status.
Run ping-tab.
Run observe before action commands.
Run send-command actions (navigate, click, type, press_key, scroll, get_html).
Re-run observe to confirm page state after actions.
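The read-only checks in this flow (list-clients, connect-status, ping-tab, observe) can be chained so that the agent stops at the first failure. The sketch below assumes the CLI signals failure with a non-zero exit code, which this document does not confirm; `preflight_steps` and `run_preflight` are hypothetical helpers, and the runner is injectable so the sequence can be exercised without a live server.

```python
import subprocess

OPERATOR_WS = "ws://127.0.0.1:8765/ws/operator"
TARGET = ["--instance-id", "local-instance", "--client-id", "chrome-main"]


def preflight_steps(token):
    """Return (name, argv) pairs for the read-only checks, in recommended order."""
    base = ["browser-bridge", "--server-ws-url", OPERATOR_WS, "--token", token]
    return [
        ("list-clients", base + ["list-clients"]),
        ("connect-status", base + ["connect-status"] + TARGET),
        ("ping-tab", base + ["ping-tab"] + TARGET),
        ("observe", base + ["observe"] + TARGET + ["--max-nodes", "150"]),
    ]


def run_preflight(token, runner=subprocess.run):
    """Run each check in order; return the first failing step name, or None."""
    for name, argv in preflight_steps(token):
        result = runner(argv, capture_output=True, text=True)
        # Assumption: a non-zero exit code means the check failed.
        if result.returncode != 0:
            return name
    return None
```

A return value of None means all four checks passed and action commands can follow; any other value names the step to investigate in the Troubleshooting section.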

Troubleshooting

Target client not connected
- Verify popup shows connected.
- Verify instance_id and client_id exactly match CLI flags.
- Reconnect extension and retry.
Operator auth failed or auth errors
- Verify --token matches BRIDGE_OPERATOR_TOKEN.
Command timed out
- Increase --timeout-s.
- For action commands, disable or reduce load wait in payload.
- Confirm active tab is a normal webpage (not restricted pages like chrome://*).
Receiving end does not exist
- Retry once; extension can reinject content script when needed.
Slow responses on action commands
- Use wait_for_load=false for immediate response.
- Or set smaller wait_for_load_ms.
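The "retry once" advice for Receiving end does not exist can be automated with a thin wrapper. This sketch assumes the message appears verbatim in the CLI's stdout or stderr, which is not confirmed here; `run_with_single_retry` is a hypothetical helper.

```python
import subprocess
import time

RETRYABLE = "Receiving end does not exist"


def run_with_single_retry(argv, runner=subprocess.run, delay_s=1.0):
    """Run a CLI command; retry exactly once if the retryable message appears."""
    result = runner(argv, capture_output=True, text=True)
    output = (result.stdout or "") + (result.stderr or "")
    if RETRYABLE in output:
        # Give the extension a moment to reinject its content script.
        time.sleep(delay_s)
        result = runner(argv, capture_output=True, text=True)
    return result
```

Limiting the retry to a single attempt matches the guidance above and avoids looping on a tab that is genuinely restricted.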

Security notes

Treat tokens as secrets.
For non-local deployments, use TLS (wss://) and strong secrets.
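One way to enforce the TLS rule before connecting is to reject plain ws:// URLs unless they point at the local machine. This is an illustrative check using the standard library only; `is_acceptable_ws_url` is a hypothetical helper and is not something the bridge provides.

```python
import ipaddress
from urllib.parse import urlparse


def is_acceptable_ws_url(url):
    """Allow wss:// anywhere, and ws:// only for localhost or loopback addresses."""
    parts = urlparse(url)
    if parts.scheme == "wss":
        return True
    if parts.scheme != "ws":
        return False
    host = parts.hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is a non-local hostname over plain ws://.
        return False
```

The default endpoints in this document (ws://127.0.0.1:8765/...) pass the check; a remote host over ws:// does not.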

Done criteria

list-clients returns expected client.
connect-status is connected.
ping-tab reports ready.
observe returns page data.
send-command actions return valid results.

Security recommendations

Before installing or running:
1) Verify the PyPI package and GitHub extension repository (review code, recent commits, maintainer reputation).
2) Treat BRIDGE_OPERATOR_TOKEN as a high-value secret: generate fresh, strong tokens and do not reuse them.
3) Only load the extension in a disposable browser profile (not your primary profile with sensitive logins).
4) Prefer running the bridge on an isolated machine or VM and inspect the package contents before running.
5) Do not grant the agent unattended/autonomous permission to start the server or control your browser without explicit, per-action approval.
6) If you lack the ability to audit the code, consider declining installation or running it in a tightly sandboxed environment.

Functional analysis

Type: OpenClaw Skill
Name: browser-agent-bridge-cli
Version: 1.0.4

The skill facilitates full control over a user's Chrome browser by installing a local bridge server and requiring the manual installation of an unpacked extension from a GitHub repository (NmadeleiDev/browser_agent_bridge). While the SKILL.md instructions are transparent and include security practices like token generation, the capability to observe DOM elements, capture HTML, and simulate user input (click, type, navigate) represents a high-risk surface for session hijacking or sensitive data exposure. The use of unpacked extensions is a common method to bypass browser security controls.

Capability assessment

✓ Purpose & Capability

The name/description (control a Chrome tab) aligns with the SKILL.md: it documents a bridge server, a Chrome extension, and an operator CLI for navigation, clicking, DOM snapshots, etc.

⚠ Instruction Scope

The instructions tell the agent to generate and use BRIDGE_SHARED_TOKEN and BRIDGE_OPERATOR_TOKEN, run a background server, install a CLI package, and ask a human to load an unpacked Chrome extension from a GitHub repo — all of which are necessary for the described capability but expand scope beyond a pure 'instruction-only' skill. The SKILL.md will cause the agent (or user) to fetch and run third‑party code and to capture and transmit page DOM and UI events, which can reveal sensitive page content.

⚠ Install Mechanism

Although the registry lists no install spec, the SKILL.md instructs installing 'browser-agent-bridge' via pipx and loading an extension from a GitHub repo. That pulls unvetted code from external sources (PyPI and GitHub) and will execute it locally — this is a legitimate way to install the tool but increases risk if the packages/repo are untrusted.

⚠ Credentials

The registry metadata lists no required environment variables, yet the runtime instructions require BRIDGE_SHARED_TOKEN and BRIDGE_OPERATOR_TOKEN (and optional BRIDGE_AUTH_MODE). That mismatch is important: the skill needs secret tokens to operate but does not declare them in metadata for review. These tokens grant the operator full control of connected browser clients, so they are high-value secrets and should be explicitly declared and protected.

ℹ Persistence & Privilege

The skill is not marked 'always:true' and does not request system-wide config changes, but it enables remote control of a local browser. Because model invocation is allowed (default), an agent using this skill could autonomously start the bridge server and send commands if given the necessary tokens — consider requiring explicit user confirmation before performing actions that control the user's browser.

Version history

v1.0.4

- Clarified that the operator token can be passed via the --token flag or the BRIDGE_OPERATOR_TOKEN environment variable.
- Updated CLI usage examples to explicitly demonstrate passing --token for clarity.
- No functional or code changes; documentation only.

v1.0.3

- Added documentation and examples for the new press_key action, allowing special key presses in the browser.
- Expanded supported remote tab actions to include press_key, with support for key modifiers and multiple selector/ref targeting methods.
- Updated recommended execution flow and CLI usage to demonstrate press_key and its configuration options.
- Clarified the list of supported keys, aliases, and modifiers for press_key in the operator CLI usage section.

v1.0.2

- Description and title simplified for brevity and clarity.
- Introductory and instructional text streamlined to be more concise.
- Some detailed agent and extension setup examples removed or shortened.
- Core operator CLI usage, troubleshooting, and best practices remain unchanged.
- No code or interface changes; documentation cleanup only.

v1.0.1

- The browser-bridge-server startup section now instructs launching the server in the background, redirecting logs to a file, and saving the process ID.
- Added guidance explaining why the server should not be left attached to the current shell, emphasizing agent needs for subsequent CLI commands and diagnostics.
- Clarified server process verification steps, including checking logs or process state.

v1.0.0

Initial release of browser-bridge-cli skill
- Provides an agent guide for end-to-end setup and use of the Browser Bridge CLI for real-time Chrome browser control via extension and Python server.
- Details installation, secure token setup, and connection workflow for both operator and extension.
- Includes step-by-step usage examples for observing the page, navigating, clicking, typing, and retrieving HTML.
- Offers troubleshooting tips and security best practices.
- Defines successful operation with a clear done criteria checklist.

Metadata

Slug browser-agent-bridge-cli

Version 1.0.4

License MIT-0

Total installs 0

Current installs 0

Number of versions 5

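FAQ

What is Browser Agent Bridge CLI?

Use this skill when you need to control or perform actions in the user's Chrome tab. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 431 total downloads to date.

How do I install Browser Agent Bridge CLI?

Run the command "/install browser-agent-bridge-cli" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Browser Agent Bridge CLI free?

Yes. Browser Agent Bridge CLI is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Browser Agent Bridge CLI support?

Browser Agent Bridge CLI is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Browser Agent Bridge CLI?

It is developed and maintained by Gregory Potemkin (@nmadeleidev); the current version is v1.0.4.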
