linbo405

AI Browser WebSocket Control

Author: linbo405 · GitHub · v1.0.0 · MIT-0
Platforms: linux, darwin, win32 · ⚠ suspicious
Total downloads: 106
Favorites: 0
Current installs: 0
Versions: 1
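Install in OpenClaw
/install ai-browser-ws
Description
Control a real browser over WebSocket for full automation: navigation, clicks, text input, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time WebSocket control, headless and headed modes, and automatic reconnection.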
Usage (SKILL.md)

AI Browser Skill 🌐

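Control a real browser over WebSocket to automate navigation, clicks, text input, screenshots, and DOM retrieval.

Features

  • ✅ A real browser engine (Chromium)
  • ✅ Real-time control over WebSocket
  • ✅ Headless and headed modes
  • ✅ Simple tab management
  • ✅ Automatic reconnection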

Getting started

# 1. Install dependencies
npm install

# 2. Start the service
npm start

# The service listens on ws://localhost:18790

WebSocket protocol

Connecting

Connect to ws://localhost:18790

Message format

Send JSON:

{
  "id": "request ID (optional)",
  "action": "action name",
  "params": { ... }
}
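
The request shape above can be sketched as a small helper. This is a minimal illustration, not part of the skill: the function name build_request is hypothetical, and generating a UUID when no id is supplied is a choice of this sketch (the protocol only says the id is optional).

```python
import json
import uuid
from typing import Any, Dict, Optional


def build_request(action: str,
                  params: Optional[Dict[str, Any]] = None,
                  request_id: Optional[str] = None) -> str:
    """Serialize one request in the {id, action, params} shape.

    Hypothetical helper: the skill itself ships no Python client.
    """
    message = {
        # The id is optional in the protocol; this sketch always sends one
        # so that a response can be matched to its request.
        "id": request_id or uuid.uuid4().hex,
        "action": action,
        "params": params or {},
    }
    return json.dumps(message)


print(build_request("navigate", {"url": "https://example.com"}, request_id="req-1"))
```

Sending an id on every request costs little and makes it possible to correlate responses if several requests are in flight.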

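Supported actions

| Action | Parameters | Description |
| --- | --- | --- |
| navigate | { url: "https://..." } | Navigate to the given URL |
| snapshot | {} | Get a simplified DOM structure of the current page |
| screenshot | { fullPage: false } | Take a screenshot (returned as base64) |
| click | { selector: "button" } | Click an element |
| type | { selector: "input", text: "hello", delay: 50 } | Type text into an element |
| evaluate | { script: "document.title" } | Execute a JavaScript snippet |
| status | {} | Get the browser status |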
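
As an illustration of chaining the actions listed above, the following sketch types into a field and then clicks a button. It assumes the server sends exactly one response per request, in order, which the documentation does not state explicitly. The function name fill_and_submit and the send/recv callables are hypothetical; with the websocket-client package they could be ws.send and ws.recv.

```python
import json
from typing import Callable, List


def fill_and_submit(send: Callable[[str], object],
                    recv: Callable[[], str],
                    input_selector: str,
                    text: str,
                    button_selector: str) -> List[dict]:
    """Run a `type` action followed by a `click` action.

    Assumption: one response arrives per request, in request order.
    """
    steps = [
        {"action": "type",
         "params": {"selector": input_selector, "text": text, "delay": 50}},
        {"action": "click",
         "params": {"selector": button_selector}},
    ]
    responses = []
    for step in steps:
        send(json.dumps(step))               # one JSON message per action
        responses.append(json.loads(recv()))  # wait for its reply before continuing
    return responses
```

Passing the transport in as two callables keeps the sequence logic independent of any particular WebSocket library and easy to exercise without a running browser.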

Response format

{
  "id": "request ID",
  "success": true,
  "result": { ... }
}
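
A client will usually want to check the success flag before using result. The sketch below shows one way to do that. The names parse_response and ActionFailed are hypothetical, and the "error" key read on failure is an assumption, since the documentation only shows the success case.

```python
import json
from typing import Any, Optional


class ActionFailed(Exception):
    """Raised when a response reports success == false (hypothetical name)."""


def parse_response(raw: str, expected_id: Optional[str] = None) -> Any:
    """Decode one response and return its `result` payload."""
    response = json.loads(raw)
    # Only check the id when the caller actually sent one.
    if expected_id is not None and response.get("id") != expected_id:
        raise ValueError(
            f"response id {response.get('id')!r} does not match {expected_id!r}")
    if not response.get("success"):
        # Assumption: failures may carry an "error" field; this is not
        # documented, so fall back to a generic message.
        raise ActionFailed(response.get("error", "action failed"))
    return response.get("result")
```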

Usage example (Python)

# The `websocket` module is provided by the websocket-client package
import base64
import json

import websocket

ws = websocket.create_connection("ws://localhost:18790")

# Navigate to a page
ws.send(json.dumps({"action": "navigate", "params": {"url": "https://fanqie.baidu.com"}}))
print(ws.recv())

# Take a screenshot and save the base64-encoded image to disk
ws.send(json.dumps({"action": "screenshot", "params": {}}))
resp = json.loads(ws.recv())
with open("screen.png", "wb") as f:
    f.write(base64.b64decode(resp["result"]["image"]))

ws.close()

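Notes

  • Chromium (about 100 MB) is downloaded automatically on first launch
  • The default port is 18790; change it with the AI_BROWSER_PORT environment variable
  • With headless mode set to false, the browser window is visible (useful for debugging)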
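
If the port is changed through AI_BROWSER_PORT, clients need to use the same value. The sketch below derives the client URL from that variable. Note that the documentation only says the server reads AI_BROWSER_PORT; reusing it on the client side, and the helper name server_url, are assumptions of this example.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 18790  # default port stated in the notes above


def server_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the WebSocket URL from AI_BROWSER_PORT, falling back to 18790."""
    env = os.environ if env is None else env
    port = int(env.get("AI_BROWSER_PORT", DEFAULT_PORT))
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"ws://localhost:{port}"


print(server_url())
```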

Use cases

  • Automated web testing
  • Data scraping
  • Screenshot capture
  • Automatic form filling
  • Website monitoring
Security recommendations
This skill does implement a local WebSocket-controlled browser, but it currently runs an unauthenticated server that can execute arbitrary JS in pages, capture screenshots, and read form inputs, all of which can leak sensitive data (cookies, tokens, private pages). Also review quick-control.js: it automatically connects to a local Chrome debug port and navigates to a specific site, which is unexpected for a general utility. Before installing or running:
  1. Audit the code, and inspect or remove quick-control.js if you don't want site-specific automation.
  2. Run the service in an isolated environment (VM/container) until you're comfortable.
  3. Bind the WebSocket server to 127.0.0.1 only (or require a secret token) and do not expose it to networks.
  4. Avoid forwarding or exposing the Chromium remote-debugging port (9222) to untrusted networks.
  5. If you must use it in production, add authentication/authorization and TLS for clients, and set headless:true for unattended runs.
If you cannot audit or mitigate these issues, treat the skill as risky and avoid running it on machines containing sensitive data.
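
As a client-side complement to keeping the server on the loopback interface, a caller can refuse to connect to anything that is not a loopback address. The sketch below is illustrative only: the helper name is_loopback_ws_url is hypothetical, it does not change how the server binds, and it treats the literal hostname "localhost" as loopback without resolving it.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_ws_url(url: str) -> bool:
    """Return True only for ws:// or wss:// URLs that point at loopback."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ("ws", "wss") or not host:
        return False
    if host == "localhost":
        # Assumption: "localhost" maps to loopback on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is rejected rather than resolved.
        return False


print(is_loopback_ws_url("ws://localhost:18790"))
```

Rejecting unresolved hostnames is deliberately conservative: it avoids a DNS lookup and errs toward not connecting.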
Functional analysis
Type: OpenClaw Skill
Name: ai-browser-ws
Version: 1.0.0
The skill provides a WebSocket server (`server.js`) that grants full remote control over a Puppeteer-managed browser, including an `evaluate` action that allows execution of arbitrary JavaScript. It includes a specific script (`quick-control.js`) hardcoded to target the 'Fanqie Writer' backend (`fanqie.baidu.com/writer`), which monitors login status and page content. While these features align with the stated goal of browser automation, the combination of unauthenticated remote browser access and targeted automation of a specific authenticated platform's backend presents a high risk for unauthorized actions or session hijacking.
Capability assessment
Purpose & Capability
The package and SKILL.md match the stated purpose: a Puppeteer-based WebSocket browser controller that downloads Chromium. However, quick-control.js contains hardcoded behavior that connects to a local debug port and automatically navigates to a specific site (https://fanqie.baidu.com/writer). That file is unexpected in a general-purpose browser-control skill and could be used to perform site-specific actions without explicit user intent.
Instruction Scope
SKILL.md instructs running an npm service providing ws://localhost:18790 that accepts JSON actions (navigate, screenshot, evaluate, etc.). The server implements an 'evaluate' action that runs arbitrary JS in page context and returns DOM/inputs/screenshots. There is no authentication or authorization in the code, and the server listens on a port with no access control — meaning any client that can reach the port can read page content, screenshots, and execute scripts (high risk for credential or data leakage).
Install Mechanism
There is no custom download URL; dependencies are standard npm packages (puppeteer, ws). Puppeteer will download a Chromium binary during install/start (noted in SKILL.md). This is expected for functionality and does not use arbitrary external URLs or archive extraction beyond Puppeteer's normal behavior.
Credentials
The skill declares no required env vars aside from optional AI_BROWSER_PORT to change the listening port. That is proportionate. However, the server launches Chrome with a remote-debugging port (9222) and quick-control.js connects to that port — exposing another local interface that could be abused if reachable. No credentials are requested, but the code can capture sensitive page content without needing explicit secrets.
Persistence & Privilege
always:false (good), but the skill is invocable by the model and exposes powerful capabilities (DOM extraction, screenshots, arbitrary JS) via an unauthenticated socket. Autonomous agent invocation plus an unauthenticated control channel increases the blast radius: an agent or any local process could access and exfiltrate sensitive info. The skill itself does not persist beyond running the node process.
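How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install ai-browser-ws
  3. Once installed, invoke the skill by name or trigger it with /ai-browser-ws
  4. Provide the inputs described in the skill's parameter documentation to get structured output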
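Version history
v1.0.0
AI Browser 1.0.0: Initial release
  • Real-time control of a Chromium browser over WebSocket, covering navigation, clicks, text input, screenshots, and DOM retrieval
  • Switching between headless and headed modes, plus automatic reconnection
  • Simple multi-tab management
  • Automatic Chromium download on first launch
  • Documented message format and interface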
Metadata
Slug: ai-browser-ws
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions released: 1
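FAQ

What is AI Browser WebSocket Control?

It controls a real browser over WebSocket for full automation: navigation, clicks, text input, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time WebSocket control, headless and headed modes, and automatic reconnection. It is an AI agent skill plugin for Claude Code / OpenClaw, with 106 total downloads to date.

How do I install AI Browser WebSocket Control?

Run the command /install ai-browser-ws in the OpenClaw or Claude Code chat box for one-step installation; no additional configuration is required.

Is AI Browser WebSocket Control free?

Yes. AI Browser WebSocket Control is completely free under the MIT-0 license and can be downloaded, installed, and used freely.

Which platforms does AI Browser WebSocket Control support?

AI Browser WebSocket Control runs cross-platform in any environment where OpenClaw / Claude Code is deployed (linux, darwin, win32).

Who develops AI Browser WebSocket Control?

It is developed and maintained by linbo405 (@linbo405); the current version is v1.0.0.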
