linbo405

AI Browser

Author: linbo405 · GitHub · v1.0.0 · MIT-0
Platforms: linux, darwin, win32 · ⚠ suspicious
Total downloads: 160
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install ai-browser
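Description
Controls a real browser over WebSocket for full automation: navigation, clicking, text input, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time WebSocket control, headless and headed modes, and automatic reconnection.
Usage (SKILL.md)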

AI Browser Skill 🌐

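Controls a real browser over WebSocket for automation: navigation, clicking, text input, screenshots, and DOM retrieval.

Features

  • ✅ A real browser engine (Chromium)
  • ✅ Real-time control over WebSocket
  • ✅ Headless and headed modes
  • ✅ Simple tab management
  • ✅ Automatic reconnection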

Getting started

# 1. Install dependencies
npm install

# 2. Start the service
npm start

# The service listens on ws://localhost:18790

WebSocket protocol

Connection

Connect to ws://localhost:18790

Message format

Send JSON:

{
  "id": "request ID (optional)",
  "action": "action name",
  "params": { ... }
}
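
A minimal sketch of building a request in the shape above. The helper name build_request and the choice of a random hex string as the default id are assumptions for illustration; the protocol only says the id is optional.

```python
import json
import uuid


def build_request(action, params=None, request_id=None):
    """Serialize one request as {"id", "action", "params"}.

    A random id is generated when none is given so that the reply
    can be matched to the request (an assumption, since id is optional).
    """
    message = {
        "id": request_id or uuid.uuid4().hex,
        "action": action,
        "params": params or {},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(build_request("navigate", {"url": "https://example.com"}, request_id="req-1"))
```

Sending an explicit id on every request makes it easier to pair replies with requests when several actions are in flight.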

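Supported actions

| Action | Params | Description |
| --- | --- | --- |
| navigate | { url: "https://..." } | Navigate to the given URL |
| snapshot | {} | Get a simplified DOM structure of the current page |
| screenshot | { fullPage: false } | Take a screenshot (returned as base64) |
| click | { selector: "button" } | Click an element |
| type | { selector: "input", text: "hello", delay: 50 } | Type text into an element |
| evaluate | { script: "document.title" } | Execute a JavaScript snippet |
| status | {} | Get the browser status |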
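
A client can check a request against the action list before sending it. The sketch below copies the action and parameter names from the list above; which parameters are mandatory is an assumption (fullPage and delay are treated as optional), and validate_request is a hypothetical helper, not part of the skill.

```python
# Parameter names come from the action list; the required/optional
# split is an assumption made for this sketch.
REQUIRED_PARAMS = {
    "navigate": {"url"},
    "snapshot": set(),
    "screenshot": set(),           # fullPage treated as optional
    "click": {"selector"},
    "type": {"selector", "text"},  # delay treated as optional
    "evaluate": {"script"},
    "status": set(),
}


def validate_request(action, params):
    """Raise ValueError for an unknown action or a missing parameter."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    missing = REQUIRED_PARAMS[action] - set(params)
    if missing:
        raise ValueError(f"{action} is missing params: {sorted(missing)}")
```

Rejecting malformed requests on the client side avoids a round trip and keeps typos in action names from reaching the browser.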

Response format

{
  "id": "request ID",
  "success": true,
  "result": { ... }
}
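
A small sketch of handling a reply in this shape. Only id, success, and result appear in the documented format; the "error" field read on failure is an assumption, as are the names parse_response and ActionFailed.

```python
import json


class ActionFailed(Exception):
    """Raised when a reply has success set to false (or missing)."""


def parse_response(raw, expected_id=None):
    """Decode one reply and return its result payload."""
    response = json.loads(raw)
    if expected_id is not None and response.get("id") != expected_id:
        raise ValueError(f"reply id {response.get('id')!r} does not match {expected_id!r}")
    if not response.get("success"):
        # "error" is a hypothetical field; the documented format does not
        # show what a failed reply contains.
        raise ActionFailed(response.get("error", "action failed"))
    return response.get("result")
```

Checking the id guards against pairing a reply with the wrong request when messages are pipelined.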

Usage example (Python)

import base64
import json

import websocket  # provided by the websocket-client package

ws = websocket.create_connection("ws://localhost:18790")

# Navigate to a page and print the raw reply
ws.send(json.dumps({"action": "navigate", "params": {"url": "https://fanqie.baidu.com"}}))
print(ws.recv())

# Take a screenshot and write the decoded base64 payload to disk
ws.send(json.dumps({"action": "screenshot", "params": {}}))
resp = json.loads(ws.recv())
with open("screen.png", "wb") as f:
    f.write(base64.b64decode(resp["result"]["image"]))

ws.close()
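
Before writing a screenshot to disk it can help to confirm that the payload decodes to an image. The sketch below assumes the reply carries the data under result.image, as in the example above, and that the image is a PNG (suggested by the screen.png filename but not stated by the skill); decode_screenshot is a hypothetical helper.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(response):
    """Return the decoded image bytes, or raise if they are not a PNG."""
    data = base64.b64decode(response["result"]["image"])
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded screenshot is not a PNG")
    return data
```

If the server turns out to return another format such as JPEG, the signature check would need to be relaxed accordingly.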

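Notes

  • The first start automatically downloads Chromium (about 100 MB)
  • The default port is 18790; change it with the AI_BROWSER_PORT environment variable
  • With headless mode set to false, the browser window is visible (handy for debugging)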
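
A client can follow the same port setting as the service. This sketch assumes the client runs with the same AI_BROWSER_PORT value in its environment; the helper name server_url is hypothetical.

```python
import os

DEFAULT_PORT = 18790  # default port stated in the notes


def server_url(environ=None):
    """Build the WebSocket URL, honoring AI_BROWSER_PORT when it is set."""
    env = os.environ if environ is None else environ
    port = int(env.get("AI_BROWSER_PORT", DEFAULT_PORT))
    return f"ws://localhost:{port}"
```

Passing a plain dict as environ makes the function easy to exercise without touching the real environment.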

Use cases

  • Automated web testing
  • Data scraping
  • Screenshot capture
  • Automatic form filling
  • Website monitoring
Security recommendations
This skill implements a full Puppeteer-controlled browser over a WebSocket API, which is powerful but risky if run without protections. Before installing or running it:
  1. Review quick-control.js: it connects to a local Chrome and targets a specific site (fanqie.baidu.com). Remove or audit it if you don't trust that usage.
  2. Run the service in an isolated environment (VM, container) to avoid exposing local browser cookies or credentials.
  3. Restrict access: bind the WebSocket server to 127.0.0.1 explicitly or use firewall rules, and ensure port 18790 (and the remote-debugging port 9222) are not reachable from untrusted networks.
  4. Add authentication/authorization to the WebSocket API, or put it behind a reverse proxy that enforces it.
  5. If you only need to connect to an existing Chrome, prefer puppeteer-core, but be aware that connecting to a remote-debugging port can expose the whole browser.
  6. Verify the package origin and consider pinning dependency versions; expect Puppeteer to download a Chromium binary during install.
If you cannot take these precautions, treat this skill as unsafe on hosts with sensitive sessions or open network interfaces.
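
One way to picture the authentication recommendation is a shared-secret check performed before any action is accepted. The sketch below is written in Python purely for illustration; it is not the skill's code (the server is a Node.js program), and the bearer-token header scheme and the helper names are assumptions.

```python
import hmac


def bearer_token(headers):
    """Extract the token from an 'Authorization: Bearer <token>' header."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    return token.strip() if scheme.lower() == "bearer" else None


def is_authorized(presented_token, expected_token):
    """Compare tokens in constant time; empty or missing tokens never match."""
    if not presented_token or not expected_token:
        return False
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

A constant-time comparison avoids leaking how many leading characters of a guessed token were correct. A reverse proxy that enforces the same check would achieve a similar effect without changing the server.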
Functional analysis
Type: OpenClaw Skill · Name: ai-browser · Version: 1.0.0
The skill provides a WebSocket-based remote control interface for a Chromium browser using Puppeteer. It includes high-risk capabilities such as arbitrary JavaScript execution (the `evaluate` action) and automated interaction (typing, clicking) without any authentication or authorization. The server (`server.js`) also disables the Chromium sandbox (`--no-sandbox`) and opens a remote debugging port (9222), creating a significant attack surface. While these features align with the stated goal of browser automation, the lack of security controls and the inclusion of a script (`quick-control.js`) specifically targeting a Chinese writing platform (fanqie.baidu.com) warrant a suspicious classification.
Capability assessment
Purpose & Capability
Overall the code and instructions match the stated purpose: a Puppeteer-based WebSocket service to control Chromium (navigation, click, screenshot, evaluate). However quick-control.js specifically connects to a local Chrome and targets a specific site (https://fanqie.baidu.com/writer) and contains logic about login/publishing — that is not aligned with a generic 'AI Browser' skill and looks like a leftover utility for a particular workflow.
Instruction Scope
SKILL.md instructs npm install and npm start and to connect to ws://localhost:18790, which matches server behavior in basic form. But the runtime code accepts arbitrary JSON actions including 'evaluate' (executes arbitrary JS in page context) and returns DOM/screenshot data — which is expected for this feature set but is a powerful capability. The server exposes a WebSocket server without any authentication or origin checks; SKILL.md emphasizes localhost but the Node WebSocket server binds the port with no explicit host and will listen on all interfaces by default, so it may be reachable beyond loopback. quick-control.js also connects to a remote-debugging port (9222) and navigates a specific site — instructions do not call out these risks or recommend isolation.
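
The binding concern can be expressed as a simple check on the configured host. This is a sketch only: is_loopback_only is a hypothetical helper, a missing host is treated as "all interfaces" to mirror the behavior described above, and the name localhost is assumed to resolve to a loopback address.

```python
import ipaddress


def is_loopback_only(bind_host):
    """Return True only when the bind host restricts listening to loopback."""
    if not bind_host:
        return False  # no explicit host: assume the server listens on all interfaces
    if bind_host == "localhost":
        return True   # assumption: localhost resolves to a loopback address
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        return False  # other hostnames are not resolved here; treat as unsafe
```

For example, 127.0.0.1 and ::1 pass the check, while 0.0.0.0, a LAN address, or an unset host do not.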
Install Mechanism
No install spec in registry; SKILL.md instructs npm install and package.json depends on puppeteer and ws. Installing puppeteer will download a Chromium binary (~100MB) at install-time/runtime — this is expected but significant. The install path uses npm (a well-known registry) so download-origin risk is moderate and expected for this skill.
Credentials
The skill requests no secret env vars (only optionally AI_BROWSER_PORT). Still, it launches a browser and exposes a debugging port (9222) and a WebSocket control port; these allow access to pages and any authenticated sessions loaded in the browser, which is a privilege that can expose sensitive local data (cookies, logged-in sites). quick-control.js specifically targets a third-party site (fanqie.baidu.com), which implies this bundle was prepared for a specific account/workflow and could interact with user sessions on that site — that raises proportionality questions given the generic description.
Persistence & Privilege
The skill is not always-enabled and does not request special platform privileges. It does not modify other skills or system configs. Autonomous invocation is allowed by default (normal for skills) but is not combined here with an 'always' flag or other elevation.
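How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install ai-browser
  3. Once installed, invoke the Skill by name or trigger it with /ai-browser
  4. Provide the inputs described in the Skill's parameter documentation to get structured output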
Version history
v1.0.0
AI Browser v1.0.0: Initial release
  • Provides real browser automation via WebSocket, supporting navigation, clicks, typing, screenshots, and DOM retrieval.
  • Built on a real Chromium engine.
  • Supports both headless and headed modes.
  • Features live control, simple tab management, and automatic reconnection.
  • Easy setup with a default port and optional configuration.
Metadata
Slug: ai-browser
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
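FAQ

What is AI Browser?

It controls a real browser over WebSocket for full automation: navigation, clicking, text input, screenshots, and DOM retrieval. Features: a real browser engine (Chromium), real-time WebSocket control, headless and headed modes, and automatic reconnection. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 160 times so far.

How do I install AI Browser?

Run the command "/install ai-browser" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is AI Browser free?

Yes. AI Browser is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does AI Browser support?

AI Browser is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (linux, darwin, win32).

Who develops AI Browser?

It is developed and maintained by linbo405 (@linbo405); the current version is v1.0.0.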
