106 downloads · 0 stars · 0 active installs · 1 version
Install in OpenClaw
/install ai-browser-ws
Description
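Controls a real browser over WebSocket for full automation: navigation, clicking, typing, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time control over WebSocket, headless and headed modes, and automatic reconnection.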
README (SKILL.md)
AI Browser Skill 🌐
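Controls a real browser over WebSocket for automation: navigation, clicking, typing, screenshots, and DOM retrieval.
Features
- ✅ A real browser engine (Chromium)
- ✅ Real-time control over WebSocket
- ✅ Headless and headed modes
- ✅ Simple tab management
- ✅ Automatic reconnection
Getting started
# 1. Install dependencies
npm install
# 2. Start the service
npm start
# The service runs at ws://localhost:18790
WebSocket protocol
Connecting
Connect to ws://localhost:18790
Message format
Send JSON:
{
  "id": "request ID (optional)",
  "action": "action name",
  "params": { ... }
}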
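The message shape above can be sketched as a small Python helper. This is a minimal illustration, not part of the skill: the name `build_request` is hypothetical, and generating an id when none is given is a choice of this sketch (the protocol marks the id as optional).

```python
import json
import uuid


def build_request(action, params=None, request_id=None):
    """Serialize one request in the {id, action, params} shape described above."""
    message = {
        # The id is optional in the protocol; this sketch always sends one so
        # that a response can later be matched to its request.
        "id": request_id or uuid.uuid4().hex,
        "action": action,
        "params": params or {},
    }
    return json.dumps(message)
```

The returned string is what a client would pass to its WebSocket `send` call, for example `build_request("navigate", {"url": "https://example.com"})`.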
Supported actions
| Action | Parameters | Description |
|---|---|---|
| navigate | { url: "https://..." } | Navigate to the given URL |
| snapshot | {} | Get a simplified DOM structure of the current page |
| screenshot | { fullPage: false } | Take a screenshot (returned as base64) |
| click | { selector: "button" } | Click an element |
| type | { selector: "input", text: "hello", delay: 50 } | Type text |
| evaluate | { script: "document.title" } | Run a JavaScript snippet |
| status | {} | Get the browser status |
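A client may want to catch malformed requests before sending them. The sketch below encodes the action list as a lookup table and checks a request against it. The action and parameter names come from the table; the split into required and optional parameters is an assumption of this sketch, since the README does not say which parameters the server insists on, and `check_request` is a hypothetical helper.

```python
# Parameter names are copied from the action list; which ones are "required"
# is an assumption of this sketch, not something the README states.
ACTIONS = {
    "navigate": {"required": {"url"}, "optional": set()},
    "snapshot": {"required": set(), "optional": set()},
    "screenshot": {"required": set(), "optional": {"fullPage"}},
    "click": {"required": {"selector"}, "optional": set()},
    "type": {"required": {"selector", "text"}, "optional": {"delay"}},
    "evaluate": {"required": {"script"}, "optional": set()},
    "status": {"required": set(), "optional": set()},
}


def check_request(action, params):
    """Return a list of problems found in a request; an empty list means none."""
    if action not in ACTIONS:
        return [f"unknown action: {action}"]
    spec = ACTIONS[action]
    keys = set(params)
    problems = [f"missing parameter: {name}" for name in sorted(spec["required"] - keys)]
    allowed = spec["required"] | spec["optional"]
    problems += [f"unexpected parameter: {name}" for name in sorted(keys - allowed)]
    return problems
```

Returning a list of problems rather than raising lets a caller report everything wrong with a request at once.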
Response format
{
  "id": "request ID",
  "success": true,
  "result": { ... }
}
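On the client side, a response in this shape might be decoded as follows. This is a sketch under stated assumptions: `parse_response` and `ActionFailed` are hypothetical names, and because the README does not document what a failed response looks like, the sketch attaches the whole response to the exception instead of guessing at an error field.

```python
import json


class ActionFailed(Exception):
    """Raised when a response reports success == false."""


def parse_response(raw, expected_id=None):
    """Decode one response string and return its "result" payload."""
    response = json.loads(raw)
    if expected_id is not None and response.get("id") != expected_id:
        raise ValueError(
            f"response id {response.get('id')!r} does not match {expected_id!r}"
        )
    if not response.get("success"):
        # No failure shape is documented, so keep the full response for the caller.
        raise ActionFailed(response)
    return response.get("result")
```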
Usage example (Python)
import base64
import json
import websocket  # provided by the websocket-client package

ws = websocket.create_connection("ws://localhost:18790")
# Navigate to a page
ws.send(json.dumps({"action": "navigate", "params": {"url": "https://fanqie.baidu.com"}}))
print(ws.recv())
# Take a screenshot and decode the base64 payload to a PNG file
ws.send(json.dumps({"action": "screenshot", "params": {}}))
resp = json.loads(ws.recv())
with open("screen.png", "wb") as f:
    f.write(base64.b64decode(resp["result"]["image"]))
ws.close()
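The send-then-receive pattern in the example above can be wrapped in one helper that also matches responses to requests by id. This is a sketch, not the skill's own client: `call` is a hypothetical name, it assumes the server echoes the request id as the response format suggests, and `max_messages` is an arbitrary bound so the loop cannot wait forever.

```python
import json
import uuid


def call(ws, action, params=None, max_messages=10):
    """Send one action over an open connection and return the decoded response.

    `ws` is any object with send(str) and recv() -> str, such as the
    connection returned by websocket.create_connection.
    """
    request_id = uuid.uuid4().hex
    ws.send(json.dumps({"id": request_id, "action": action, "params": params or {}}))
    # Assumption: the server echoes the request id. Messages carrying a
    # different id are skipped so a stray reply is not mistaken for this one.
    for _ in range(max_messages):
        response = json.loads(ws.recv())
        if response.get("id") == request_id:
            return response
    raise RuntimeError(f"no response with id {request_id} in {max_messages} messages")
```

With the connection from the example, `call(ws, "status")` would return the decoded status response.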
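Notes
- The first launch automatically downloads Chromium (about 100 MB)
- The default port is 18790; it can be changed with the AI_BROWSER_PORT environment variable
- Setting headless mode to false shows the browser window (useful for debugging)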
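A client can follow the same port setting as the server. The sketch below builds the connection URL from AI_BROWSER_PORT and falls back to the default 18790. It assumes the client sees the same environment variable as the server process, and `server_url` is a hypothetical helper.

```python
import os

DEFAULT_PORT = 18790  # default port stated in the notes above


def server_url(environ=None):
    """Build the WebSocket URL, honoring AI_BROWSER_PORT when it is set."""
    environ = os.environ if environ is None else environ
    port = int(environ.get("AI_BROWSER_PORT", DEFAULT_PORT))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"ws://localhost:{port}"
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.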
Use cases
- Automated web testing
- Data scraping
- Screenshot capture
- Automatic form filling
- Website monitoring
Usage Guidance
This skill does implement a local WebSocket-controlled browser, but it currently runs an unauthenticated server that can execute arbitrary JS in pages, capture screenshots, and read form inputs, all of which can leak sensitive data (cookies, tokens, private pages). Also review quick-control.js: it automatically connects to a local Chrome debug port and navigates to a specific site, which is unexpected for a general utility. Before installing or running:
1) audit the code and remove or inspect quick-control.js if you don't want site-specific automation;
2) run the service in an isolated environment (VM/container) until you're comfortable;
3) bind the WebSocket server to 127.0.0.1 only (or require a secret token) and do not expose it to networks;
4) avoid forwarding or exposing the Chromium remote-debugging port (9222) to untrusted networks;
5) if you must use it in production, add authentication/authorization and TLS for clients, and set headless:true for unattended runs.
If you cannot audit/mitigate these issues, treat the skill as risky and avoid running it on machines containing sensitive data.
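Point 3 above concerns the server, but a client can apply a matching guard by refusing to connect to anything other than a loopback address. The following is a client-side sketch only: it does not change how the server binds, and `is_loopback_url` is a hypothetical helper.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url):
    """Return True only when the URL's host is "localhost" or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name: treat it as non-local instead of resolving it.
        return False
```

A client could check `is_loopback_url("ws://localhost:18790")` before opening the connection and abort when it returns False.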
Capability Analysis
Type: OpenClaw Skill
Name: ai-browser-ws
Version: 1.0.0
The skill provides a WebSocket server (`server.js`) that grants full remote control over a Puppeteer-managed browser, including an `evaluate` action that allows execution of arbitrary JavaScript. It includes a specific script (`quick-control.js`) hardcoded to target the 'Fanqie Writer' backend (`fanqie.baidu.com/writer`), which monitors login status and page content. While these features align with the stated goal of browser automation, the combination of unauthenticated remote browser access and targeted automation of a specific authenticated platform's backend presents a high risk for unauthorized actions or session hijacking.
Capability Assessment
Purpose & Capability
The package and SKILL.md match the stated purpose: a Puppeteer-based WebSocket browser controller that downloads Chromium. However, quick-control.js contains hardcoded behavior that connects to a local debug port and automatically navigates to a specific site (https://fanqie.baidu.com/writer). That file is unexpected in a general-purpose browser-control skill and could be used to perform site-specific actions without explicit user intent.
Instruction Scope
SKILL.md instructs running an npm service providing ws://localhost:18790 that accepts JSON actions (navigate, screenshot, evaluate, etc.). The server implements an 'evaluate' action that runs arbitrary JS in page context and returns DOM/inputs/screenshots. There is no authentication or authorization in the code, and the server listens on a port with no access control — meaning any client that can reach the port can read page content, screenshots, and execute scripts (high risk for credential or data leakage).
Install Mechanism
There is no custom download URL; dependencies are standard npm packages (puppeteer, ws). Puppeteer will download a Chromium binary during install/start (noted in SKILL.md). This is expected for functionality and does not use arbitrary external URLs or archive extraction beyond Puppeteer's normal behavior.
Credentials
The skill declares no required env vars aside from optional AI_BROWSER_PORT to change the listening port. That is proportionate. However, the server launches Chrome with a remote-debugging port (9222) and quick-control.js connects to that port — exposing another local interface that could be abused if reachable. No credentials are requested, but the code can capture sensitive page content without needing explicit secrets.
Persistence & Privilege
always:false (good), but the skill is invocable by the model and exposes powerful capabilities (DOM extraction, screenshots, arbitrary JS) via an unauthenticated socket. Autonomous agent invocation plus an unauthenticated control channel increases the blast radius: an agent or any local process could access and exfiltrate sensitive info. The skill itself does not persist beyond running the node process.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install ai-browser-ws
- After installation, invoke the skill by name or use /ai-browser-ws
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
AI Browser 1.0.0 – Initial release
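- Real-time control of a Chromium browser over WebSocket, covering navigation, clicking, typing, screenshots, and DOM retrieval
- Switching between headless and headed modes, plus automatic reconnection
- Simple multi-tab management
- Automatic Chromium download on first launch
- Complete message format and API documentation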
Frequently Asked Questions
What is AI浏览器WebSocket控制?
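It controls a real browser over WebSocket for full automation: navigation, clicking, typing, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time control over WebSocket, headless and headed modes, and automatic reconnection. It is an AI Agent Skill for Claude Code / OpenClaw, with 106 downloads so far.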
How do I install AI浏览器WebSocket控制?
Run "/install ai-browser-ws" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is AI浏览器WebSocket控制 free?
Yes, AI浏览器WebSocket控制 is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does AI浏览器WebSocket控制 support?
AI浏览器WebSocket控制 is cross-platform and runs anywhere OpenClaw / Claude Code is available (linux, darwin, win32).
Who created AI浏览器WebSocket控制?
It is built and maintained by linbo405 (@linbo405); the current version is v1.0.0.