UI-Agent
Author: Nima Ansari · v1.0.0 · MIT-0
Total downloads: 100
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install ui-agent
Description
Universal UI automation for browsers and desktops. Chrome DevTools Protocol + native APIs. 15/15 verified tests.
Safe usage recommendations
This skill contains real code for browser (CDP) and desktop automation and will try to launch Chrome, Xvfb/GUI tools, and run system utilities (xdotool, pgrep, scrot/gnome-screenshot, etc.). Before installing or enabling it:
1) Only install from a trusted source: the manifest’s source/homepage are absent and the repo links in docs are placeholders.
2) Expect to need system packages (Chrome/Chromium, display server/Xvfb or Wayland tools, xdotool/ydotool, screenshot tools); validate and control those dependencies in an isolated environment (VM/container).
3) Review the src/ files (chrome_session_vbox_fixed.py, desktop_helpers.py, cdp_typer.py) to ensure you’re comfortable with the skill’s ability to run shell commands, kill/relaunch processes, and read/write files (it writes cookies and screenshots to /tmp).
4) Don’t grant this skill access to machines with sensitive active sessions unless you trust it; its cookie read/restore behavior can persist web sessions.
5) Run the skill initially in a sandboxed VM with minimal privileges and inspect test outputs before letting it run autonomously.
Analysis
Type: OpenClaw Skill
Name: ui-agent
Version: 1.0.0
The bundle provides a comprehensive suite for browser and desktop automation using the Chrome DevTools Protocol (CDP) and X11 utilities like xdotool and wmctrl. It includes high-risk capabilities such as executing shell commands via subprocess and os.system (src/cdp_typer.py, src/desktop_helpers.py), bypassing browser security features (e.g., --no-sandbox and suppress_origin=True), and programmatically extracting and restoring browser cookies for session persistence (tests/test_sp1_official.py). While these features are aligned with the stated goal of a universal UI automation framework, the broad system access and potential for abuse—such as session hijacking or unauthorized remote control—meet the threshold for a suspicious classification despite the lack of clear evidence of malicious intent.
Capability assessment
Purpose & Capability
The name/description (UI automation via CDP + native APIs) matches the code and tests: the repo contains a CDP wrapper, VirtualBox-safe Chrome launcher, AT-SPI2/X11 helpers, and verification utilities. However, registry metadata claims no required binaries or environment setup while the code and docs clearly expect Chrome/Chromium, Xvfb (or a display), xdotool/ydotool, screenshot tools, etc. That mismatch (metadata says 'none' while the implementation requires many system-level tools) is unexpected and should be resolved before trusting the skill.
Instruction Scope
The SKILL.md and docs explicitly instruct the agent to launch and reuse Chrome, send CDP commands, take screenshots, run shell commands, kill/relaunch Chrome, write/read files under /tmp, and call subprocess tools (pgrep, xdotool, xclip). Those actions are within the stated purpose (UI automation), but they are powerful: the skill can execute arbitrary shell commands via its shell() helper and manipulate other processes. The SKILL.md documents this capability (e.g., agent.shell, desktop_helpers.shell), so it's not hidden — but it does grant broad system access which users should treat as high-impact.
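To make "send CDP commands" concrete, here is a minimal sketch of the JSON frames a CDP client writes to Chrome's debugging WebSocket. `Page.captureScreenshot` and `Runtime.evaluate` are standard CDP methods; the helper name `build_cdp_command` is hypothetical and is not taken from the ui-agent source, whose wrapper may look quite different.

```python
import itertools
import json

# Hypothetical helper for illustration; not part of the ui-agent source.
# CDP requires each command to carry an integer id so replies can be matched.
_ids = itertools.count(1)


def build_cdp_command(method, params=None):
    """Serialize one CDP command as the JSON text sent over the WebSocket."""
    message = {"id": next(_ids), "method": method, "params": params or {}}
    return json.dumps(message)


# Ask the page for a PNG screenshot (returned base64-encoded by Chrome).
screenshot_cmd = build_cdp_command("Page.captureScreenshot", {"format": "png"})

# Evaluate a JavaScript expression in the page and return its value.
eval_cmd = build_cdp_command(
    "Runtime.evaluate",
    {"expression": "document.title", "returnByValue": True},
)

print(screenshot_cmd)
print(eval_cmd)
```

The sketch only builds the messages; it opens no connection. Because any JavaScript expression can be sent this way, a skill with CDP access can read whatever the page itself can read.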
Install Mechanism
There is no install spec (instruction-only), which reduces supply-chain risk, but the repo and docs require system-level packages (Chrome, Xvfb, xdotool, scrot/gnome-screenshot, ydotool, etc.). The skill does not declare these required system binaries in the registry metadata. If you install/run the skill as-is it may fail or behave unpredictably on systems that lack those tools; conversely, an environment with those tools gives the skill extensive ability to control the host. No remote download or archive installs were observed in the manifest.
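Since the registry metadata does not declare the system binaries, one way to check a host before running the skill is a small preflight script. The sketch below is an assumption-laden illustration: the tool names come from the text above, the grouping (any one binary per group is enough) is this example's own choice, and actual binary names vary by distribution.

```python
import shutil

# Tool names taken from the listing text; binary names differ between
# distributions, so treat this table as an assumption to adjust.
REQUIRED_GROUPS = {
    "browser": ["google-chrome", "chromium", "chromium-browser"],
    "display": ["Xvfb"],
    "input": ["xdotool", "ydotool"],
    "screenshot": ["scrot", "gnome-screenshot"],
}


def find_missing(groups, which=shutil.which):
    """Return the group names for which no candidate binary is on PATH."""
    return [
        name
        for name, candidates in groups.items()
        if not any(which(candidate) for candidate in candidates)
    ]


missing = find_missing(REQUIRED_GROUPS)
if missing:
    print("Missing tool groups:", ", ".join(missing))
else:
    print("All tool groups have at least one binary on PATH.")
```

The `which` parameter exists so the check can be exercised without touching the real PATH. A host with a real display does not need Xvfb, so a "display" entry in the output is not necessarily a problem.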
Credentials
The skill does not request any environment variables, credentials, or secret tokens in its metadata (primaryEnv: none), which is appropriate. It does, however, manipulate browser cookies (saves/restores document.cookie), write files under /tmp, and communicate with external sites (e.g., httpbin.org) as part of tests. Those behaviors are coherent for a UI automation tool but mean the skill can capture and restore session material (cookies) — a legitimate feature for automation but also something to be mindful of if you have sensitive sessions on the host.
Persistence & Privilege
The skill is not always-enabled (always:false) and does not request elevated platform privileges in the manifest. Autonomy (disable-model-invocation:false) is the platform default. The skill's code can launch and kill browsers and run shell commands, but it does not appear to modify other skills or system-wide OpenClaw configuration. Still, combining autonomous invocation with the ability to execute shell commands and manage processes increases the potential blast radius if the skill is given unsupervised authority.
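How to use
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install ui-agent
- Once installation completes, invoke the skill by name or trigger it with /ui-agent
- Provide the inputs described in the skill's parameter documentation to get structured output.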
Version history
v1.0.0
UIAgent v1.0.0 – Initial public release
- Universal UI automation framework for browsers (CDP) and desktops (OS-native APIs)
- 100% test coverage: 15/15 real-world verified tests passing
- Supports automation for web workflows, dynamic UIs, and desktop applications
- Evidence-based verification: screenshot hashing, DOM checks, file checks
- Reliable cross-browser session management and persistence, including headless Chrome
- Includes robust API: JavaScript execution, mouse/keyboard simulation, screenshots, and more
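The "screenshot hashing" item in the release notes can be pictured as comparing digests of two captures to confirm that an action changed the screen. The following is a minimal sketch under stated assumptions: the listing does not say which hash the skill uses, so SHA-256 is this example's choice, and the function names are hypothetical.

```python
import hashlib


def screenshot_digest(image_bytes):
    """Return a hex digest of raw screenshot bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(image_bytes).hexdigest()


def screen_changed(before, after):
    """True if the two captures differ in any byte."""
    return screenshot_digest(before) != screenshot_digest(after)


# Stand-in byte strings; real use would read the PNG files the tool wrote.
frame_before = b"\x89PNG fake frame 1"
frame_after = b"\x89PNG fake frame 2"

print(screen_changed(frame_before, frame_after))   # True
print(screen_changed(frame_before, frame_before))  # False
```

A byte-exact hash flags any difference, including a blinking cursor or a clock tick, so it only shows that something changed, not that the intended change happened. That is presumably why the release notes pair it with DOM and file checks.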
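FAQ
What is UI-Agent?
Universal UI automation for browsers and desktops. Chrome DevTools Protocol + native APIs. 15/15 verified tests. It is an AI agent skill plugin for Claude Code / OpenClaw, with 100 total downloads to date.
How do I install UI-Agent?
Run the command "/install ui-agent" in the OpenClaw or Claude Code chat box to install it in one step. The install command itself needs no extra configuration, but the skill's code expects system tools such as Chrome/Chromium, Xvfb (or a display), and xdotool/ydotool to be present on the host.
Is UI-Agent free?
Yes. UI-Agent is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does UI-Agent support?
The listing describes UI-Agent as cross-platform and usable in any environment where OpenClaw / Claude Code is deployed. Note that the desktop helpers described above rely on X11 utilities such as xdotool and wmctrl.
Who developed UI-Agent?
UI-Agent is developed and maintained by Nima Ansari (@nimaansari); the current version is v1.0.0.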