jami-lin

chrome_skill

Author: Jami-Lin · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 31 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install chromeskill
Description
Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically — navigate, click, type, screenshot, extract content, and more. Uses Pu...
Usage (SKILL.md)

Chrome AI Action — Browser Automation Skill

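A browser automation skill for AI agents. Through the Chrome AI Action (CAA) bridge service, it controls Chrome programmatically in Puppeteer (CDP) mode and supports 60+ actions, including navigation, clicking, typing, screenshots, content extraction, network interception, cookie management, and PDF export.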


When to Use

Scenario | Invoke
User asks to browse a web page, search, fill forms, or extract data | Yes
User needs screenshots of a web page | Yes
User wants to automate browser interactions | Yes
User asks about writing or debugging code (no browser involved) | No

⚠️ CRITICAL: Chinese URL Encoding

IMPORTANT: When constructing URLs with Chinese characters for the navigate action, the agent MUST encode the query string values using encodeURIComponent. The bridge automatically encodes non-ASCII characters in the URL path, but query string values must be pre-encoded by the caller.

For example, wd=妻子的浪漫旅行 must be written as wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C

Correct

{"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C"}}

Wrong

{"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=妻子的浪漫旅行"}}

How to encode in Node.js

const encoded = encodeURIComponent('妻子的浪漫旅行');
// Result: %E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C
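
The encoding rule above can be wrapped in a small helper so that every query value is encoded before it reaches navigate. This is a minimal sketch: buildUrl is a hypothetical name and is not part of the skill or the bridge.

```javascript
// Hypothetical helper (not part of the skill): builds a URL whose query
// keys and values are percent-encoded with encodeURIComponent.
function buildUrl(base, query) {
  const pairs = Object.entries(query).map(
    ([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`
  );
  return pairs.length > 0 ? `${base}?${pairs.join('&')}` : base;
}

// Build the navigate request from the document's Baidu example.
const url = buildUrl('https://www.baidu.com/s', { wd: '妻子的浪漫旅行' });
const navigateRequest = { action: 'navigate', params: { url } };
console.log(JSON.stringify(navigateRequest));
```

Encoding each value separately, rather than the whole URL, keeps the ? and & separators intact.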

Prerequisites

Requirement | Check | Auto-resolve
Chrome / Chromium installed | Detected automatically (common install paths) | No (user must install)
Chrome running with CDP | Detected on startup | Yes (auto-launched)
Node.js 18+ | node --version | No

Startup Protocol

When loaded for the first time, the agent MUST run the startup script. The script runs the bridge as a background child process, so the agent does NOT need to manage the process separately.

node <skill_dir>/scripts/startup.js

What it does

  1. Check if bridge is already running: GET /health on port 9876 → skip if OK
  2. Ensure npm package installed: npm list -g chrome-ai-action → installs via npm install -g chrome-ai-action if missing
  3. Start the bridge: chrome-ai-action --port 9876, waits for health check
  4. Auto-launch Chrome: If Chrome not running with CDP, the bridge starts it automatically (cross-platform)
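
The four steps above can be read as a small decision procedure. The sketch below only plans which commands would run; it executes nothing. planStartup and its input flags are hypothetical names, and the actual logic in scripts/startup.js may differ.

```javascript
// Hypothetical planner mirroring the documented startup steps.
// It returns the shell commands that would be needed, in order.
function planStartup({ bridgeHealthy, packageInstalled, port = 9876 }) {
  // Step 1: if GET /health already succeeds, skip everything.
  if (bridgeHealthy) return [];

  const commands = [];
  // Step 2: install the npm package only when it is missing.
  if (!packageInstalled) commands.push('npm install -g chrome-ai-action');
  // Step 3: start the bridge on the chosen port.
  commands.push(`chrome-ai-action --port ${port}`);
  // Step 4 (launching Chrome with CDP) is done by the bridge itself,
  // so no extra command is planned here.
  return commands;
}

console.log(planStartup({ bridgeHealthy: false, packageInstalled: false }));
```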

Environment Variables

Variable | Default | Description
CAA_BRIDGE_PORT | 9876 | Bridge HTTP server port
CAA_STARTUP_TIMEOUT | 30000 | Max wait for bridge ready (ms)
CHROME_PATH | auto-detect | Custom Chrome executable path
CHROME_USER_DATA_DIR | platform-dependent | Chrome profile directory
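
As an illustration of how these variables and their defaults fit together, the following sketch resolves them from an environment object. loadConfig and the returned field names are assumptions for this example; how the startup script actually reads the variables is not shown in the source.

```javascript
// Hypothetical config loader: applies the documented defaults when a
// variable is unset or not a valid integer.
function loadConfig(env = process.env) {
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value ?? '', 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    port: toInt(env.CAA_BRIDGE_PORT, 9876),
    startupTimeoutMs: toInt(env.CAA_STARTUP_TIMEOUT, 30000),
    chromePath: env.CHROME_PATH || null,           // null means auto-detect
    userDataDir: env.CHROME_USER_DATA_DIR || null, // null means platform default
  };
}

console.log(loadConfig({ CAA_BRIDGE_PORT: '9000' }));
```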

API Protocol

Endpoint: http://127.0.0.1:9876/

Endpoints

Method | Path | Description
GET | /health | Health check; returns bridge and CDP status
GET | /schema | Full action schema (64+ actions)
POST | / | Execute action(s)

Request Format

{"type": "action", "action": "<ACTION>", "params": {...}, "requestId": "optional-id"}

Batch Request

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com"}},
  {"action": "getTitle"}
]}
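
The two request shapes above can be produced with small builder functions and posted to the documented endpoint. This is a sketch under assumptions: actionRequest, batchRequest, and send are hypothetical names, and the Content-Type header is a guess at what the bridge expects. It relies on the global fetch available in Node.js 18+.

```javascript
const BRIDGE_URL = 'http://127.0.0.1:9876/';

// Single-action request; requestId is included only when provided.
function actionRequest(action, params = {}, requestId) {
  const request = { type: 'action', action, params };
  if (requestId !== undefined) request.requestId = requestId;
  return request;
}

// Batch request wrapping a list of { action, params } entries.
function batchRequest(actions) {
  return { type: 'batch', actions };
}

// Assumption: the bridge accepts a JSON body on POST /.
// Defined for illustration only; it is not called here.
async function send(request) {
  const response = await fetch(BRIDGE_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(request),
  });
  return response.json();
}

const batch = batchRequest([
  { action: 'navigate', params: { url: 'https://example.com' } },
  { action: 'getTitle' },
]);
console.log(JSON.stringify(batch));
```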

Response Format

{"success": true, "data": {...}, "requestId": "req-1", "timestamp": 1712345678901}

Error Response

{"success": false, "error": {"code": "ACTION_ERROR", "message": "..."}, "requestId": "req-1", "timestamp": 1712345678901}
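
A caller will typically want to turn the success/error envelope above into either a return value or an exception. The sketch below shows one way to do that. BridgeError and unwrap are hypothetical names, and the fallback values for a missing error object are assumptions.

```javascript
// Hypothetical error type carrying the bridge's error code.
class BridgeError extends Error {
  constructor(code, message, requestId) {
    super(`${code}: ${message}`);
    this.name = 'BridgeError';
    this.code = code;
    this.requestId = requestId;
  }
}

// Returns response.data on success, throws BridgeError otherwise.
function unwrap(response) {
  if (response.success) return response.data;
  const { code, message } = response.error ?? {};
  throw new BridgeError(code ?? 'UNKNOWN', message ?? 'no message', response.requestId);
}

const ok = unwrap({ success: true, data: { title: 'Example' }, requestId: 'req-1', timestamp: 1712345678901 });
console.log(ok.title);
```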

Available Actions (64+)

Navigation

navigate, goBack, goForward, reload, getUrl, getTitle

Page Content

getText, getHtml, getLinks, getImages, getHeadings, getMetaTags, getFormFields, getFocusableElements

Element Interaction

click, type, pressKey, scroll, scrollIntoView, findElement, focus, hover, select

Data Extraction

getValue, getAttribute, getAttributeAll, getBoundingBox, getCookies, getPerformanceMetrics, getSelectedValue, getSelectOptions

JavaScript

evaluate, injectScript, injectCSS

Screenshot & Export

screenshot (PNG/JPEG), getPdf (A4/Letter)

Tab Management

listTabs, newTab, closeTab, switchTab, getCurrentTab

Waiting

waitForElement, waitForTimeout, waitForNavigation

Cookie Management

setCookie, deleteCookie

Network Interception

blockUrls, unblockUrls, mockResponse, getNetworkRequests, clearNetworkRequests

Storage

getLocalStorage, setLocalStorage, removeLocalStorage, clearLocalStorage

File Operations

uploadFile, setInputFiles, downloadFile

Viewport

getViewport, setViewport

Console

getConsoleLogs, clearConsoleLogs

Accessibility

getAccessibilityTree

Utility

ping, connect, disconnect, getBrowserInfo, highlight, dispatchEvent


Typical Workflow

  1. Navigate: navigate → go to target URL (encode Chinese in query params)
  2. Wait: waitForElement → wait for key content
  3. Read: getText / getHtml / getLinks → understand page
  4. Interact: click / type / pressKey → perform actions
  5. Extract: getText / screenshot / evaluate → get results
  6. Confirm: screenshot → visually verify

Example: Search Baidu with Chinese

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C"}},
  {"action": "waitForTimeout", "params": {"ms": 2000}},
  {"action": "getText"}
]}

Example: Full Login Flow

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com/login"}},
  {"action": "waitForElement", "params": {"selector": "input[name=username]", "timeout": 10000}},
  {"action": "type", "params": {"selector": "input[name=username]", "value": "myuser"}},
  {"action": "type", "params": {"selector": "input[name=password]", "value": "mypassword"}},
  {"action": "click", "params": {"selector": "button[type=submit]"}},
  {"action": "waitForTimeout", "params": {"ms": 3000}},
  {"action": "getCurrentTab"}
]}
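
The login batch above hardcodes its credentials. A caller might prefer to build the same batch from parameters, as in the sketch below. loginBatch is a hypothetical helper, and the selectors are simply those from the example and will differ per site.

```javascript
// Hypothetical builder for the login batch shown above.
// Credentials are passed in rather than written into the request.
function loginBatch({ url, username, password }) {
  return {
    type: 'batch',
    actions: [
      { action: 'navigate', params: { url } },
      { action: 'waitForElement', params: { selector: 'input[name=username]', timeout: 10000 } },
      { action: 'type', params: { selector: 'input[name=username]', value: username } },
      { action: 'type', params: { selector: 'input[name=password]', value: password } },
      { action: 'click', params: { selector: 'button[type=submit]' } },
      { action: 'waitForTimeout', params: { ms: 3000 } },
      { action: 'getCurrentTab' },
    ],
  };
}

const login = loginBatch({ url: 'https://example.com/login', username: 'myuser', password: 'mypassword' });
console.log(login.actions.map((step) => step.action).join(' -> '));
```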

Error Handling

Error Code | Meaning | Resolution
CDP_NOT_CONNECTED | Chrome not running with debug port | Bridge auto-launches Chrome, retries every 3s
ACTION_ERROR | Action execution failed | Check params, use getFocusableElements to find elements first
INVALID_REQUEST | Malformed request | Check request format
PARSE_ERROR | JSON parse failure | Send valid JSON
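
The table above can be encoded as a lookup that tells a caller whether retrying is worthwhile. This is a sketch only: resolutionFor is a hypothetical name, and waiting 3000 ms on the client before retrying CDP_NOT_CONNECTED is an assumption based on the bridge's stated 3-second retry interval.

```javascript
// Hypothetical mapping from documented error codes to a client-side reaction.
const RESOLUTIONS = {
  CDP_NOT_CONNECTED: { retry: true, delayMs: 3000, hint: 'Bridge auto-launches Chrome; wait and retry' },
  ACTION_ERROR: { retry: false, delayMs: 0, hint: 'Check params; use getFocusableElements to find elements first' },
  INVALID_REQUEST: { retry: false, delayMs: 0, hint: 'Check request format' },
  PARSE_ERROR: { retry: false, delayMs: 0, hint: 'Send valid JSON' },
};

// Unknown codes are treated as non-retryable.
function resolutionFor(code) {
  return RESOLUTIONS[code] ?? { retry: false, delayMs: 0, hint: 'Unknown error code' };
}

console.log(resolutionFor('CDP_NOT_CONNECTED'));
```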

Discovery Tips

When you don't know what elements are on a page:

  1. getFocusableElements → all interactive elements (with positions)
  2. getFormFields → all form inputs with metadata
  3. getLinks → all links on page
  4. getHeadings → understand page structure
  5. getText → all visible text
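
The five discovery actions above could be sent together as one batch. The sketch below builds that request in the listed order. discoveryBatch is a hypothetical name, and whether the bridge returns batch results in the same order is not stated in the source.

```javascript
// Hypothetical builder: one batch containing the discovery actions
// in the order listed above.
function discoveryBatch() {
  const actions = ['getFocusableElements', 'getFormFields', 'getLinks', 'getHeadings', 'getText'];
  return { type: 'batch', actions: actions.map((action) => ({ action })) };
}

console.log(JSON.stringify(discoveryBatch()));
```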

References

  • references/bridge-api.md — Complete API reference with all 64+ actions
  • references/setup-guide.md — Detailed setup and troubleshooting
  • scripts/startup.js — Startup automation script
Security Recommendations
Install only if you are comfortable giving the agent broad Chrome control. Prefer a dedicated Chrome profile with no sensitive logins, manually verify and pin the npm package, run without admin privileges, and stop the bridge after use.
Capability Analysis
Type: OpenClaw Skill
Name: chromeskill
Version: 1.0.0
The skill provides extensive browser automation capabilities via a bridge service, including cookie extraction, local storage access, arbitrary JavaScript execution (`evaluate`), and file upload/download. The `scripts/startup.js` file automatically installs a global npm package (`chrome-ai-action`) and launches a background process on port 9876. While these features are aligned with the stated goal of browser automation, the broad permissions and the automated global installation of external code represent a significant security risk and potential for data exfiltration or unauthorized system access if the agent is misused or the external package is compromised.
Capability Tags
crypto
Capability Assessment
Purpose & Capability
The browser automation purpose is coherent with navigation, clicking, screenshots, extraction, and CDP control, but the exposed capabilities are high-impact because they include cookies, storage, JavaScript injection, network interception, and file upload/download.
Instruction Scope
The artifacts describe broad raw actions such as evaluate/injectScript, batch requests, cookie/storage mutation, and file operations without clear per-domain scoping or user-confirmation requirements for sensitive actions.
Install Mechanism
There is no registry install spec, yet first use runs a startup script that globally installs and executes the external npm package `chrome-ai-action` without a pinned version or reviewed package contents.
Credentials
Registry requirements declare no binaries, env vars, or credentials, while the docs require Node.js/Chrome/npm and create localhost bridge/CDP access to a Chrome profile; this under-declares the real environment authority.
Persistence & Privilege
The skill starts a background bridge and can auto-launch Chrome with remote debugging, but the artifacts do not provide clear shutdown, isolation, or cleanup guidance.
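How to Use
  1. Make sure OpenClaw is installed (local or Docker deployment)
  2. Enter the install command in the chat box: /install chromeskill
  3. Once installed, call the skill by name or trigger it with /chromeskill
  4. Provide the required inputs described in the skill's parameter documentation to get structured output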
Version History
v1.0.0
chrome-ai-action-skill 1.0.0 initial release:
  • Enables full browser automation via the Chrome AI Action (CAA) bridge using Puppeteer (CDP) mode.
  • Supports 60+ browser actions: navigation, clicking, typing, screenshots, data extraction, network interception, cookie and storage management, PDF export, and more.
  • Automatically installs the required npm package and launches the bridge; Chrome is auto-started if not running.
  • Startup, API usage, error handling, and discovery tips are documented in English and Chinese.
  • Special guidance for correct URL encoding with Chinese characters in navigation actions.
Metadata
Slug: chromeskill
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
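FAQ

What is chrome_skill?

Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically: navigate, click, type, screenshot, extract content, and more. Uses Pu... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 31 total downloads so far.

How do I install chrome_skill?

Run the command "/install chromeskill" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is chrome_skill free?

Yes. chrome_skill is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does chrome_skill support?

chrome_skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed chrome_skill?

It is developed and maintained by Jami-Lin (@jami-lin). The current version is v1.0.0.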
