GUI Agent
/install gui-claw
STEP 0: Activate Platform (MANDATORY FIRST STEP)
Before any GUI operation, run:
python3 {baseDir}/scripts/activate.py
This detects your OS, sets up the correct action commands, and outputs platform context.
After running, {baseDir}/actions/_actions.yaml contains your platform's commands.
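If activation is driven from Python rather than typed by hand, the step above might be wrapped as in the following minimal sketch. The helper names (`activation_command`, `actions_file`, `activate`) are hypothetical and not part of the skill; only the two paths come from the text above. The sketch uses the current interpreter (`sys.executable`) in place of `python3`, which is an assumption.

```python
import subprocess
import sys
from pathlib import Path


def activation_command(base_dir):
    """Build the argv list for scripts/activate.py under base_dir."""
    # Assumption: the current interpreter stands in for `python3`.
    return [sys.executable, str(Path(base_dir) / "scripts" / "activate.py")]


def actions_file(base_dir):
    """Path where the platform commands are expected after activation."""
    return Path(base_dir) / "actions" / "_actions.yaml"


def activate(base_dir):
    """Run the activation script, then confirm _actions.yaml exists.

    Returns the script's stdout (the platform context it prints).
    Raises CalledProcessError if the script exits with a non-zero status.
    """
    result = subprocess.run(
        activation_command(base_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    if not actions_file(base_dir).is_file():
        raise FileNotFoundError(f"missing {actions_file(base_dir)}")
    return result.stdout
```

Failing loudly when `_actions.yaml` is absent keeps later GUI operations from running against an unactivated platform.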
Workflow
OBSERVE → LEARN → ACT → VERIFY → SAVE
- OBSERVE: Take screenshot → run OCR + detector → understand current state → read {baseDir}/skills/gui-observe/SKILL.md
- LEARN: First time with an app? Save components to memory → read {baseDir}/skills/gui-learn/SKILL.md → learn_from_screenshot() auto-outputs app tips if available
- ACT: Pick target → execute using _actions.yaml commands → verify → read {baseDir}/skills/gui-act/SKILL.md → read {baseDir}/actions/_actions.yaml for available commands
- VERIFY: Screenshot again → confirm action succeeded
- SAVE: Record state transitions to memory → read {baseDir}/skills/gui-memory/SKILL.md for memory structure
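The five phases can be sketched as one control loop. This is an illustration only: `run_gui_step` and its callable parameters are hypothetical stand-ins for the sub-skills, and saving only after a successful verification is an assumption rather than something the skill documents.

```python
from typing import Any, Callable


def run_gui_step(
    observe: Callable[[], Any],
    learn: Callable[[Any], None],
    act: Callable[[Any], None],
    verify: Callable[[Any, Any], bool],
    save: Callable[[Any, Any], None],
    app_known: bool,
) -> bool:
    """Run one OBSERVE → LEARN → ACT → VERIFY → SAVE cycle."""
    before = observe()          # OBSERVE: screenshot + OCR + detector
    if not app_known:
        learn(before)           # LEARN: only the first time with an app
    act(before)                 # ACT: justified by the observed state
    after = observe()           # VERIFY: screenshot again
    succeeded = verify(before, after)
    if succeeded:
        save(before, after)     # SAVE: record the state transition
    return succeeded
```

Passing the phases in as callables keeps the ordering explicit while leaving the real implementations to the sub-skill documents.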
Core Rules
- Coordinates from detection only — OCR or GPA-GUI-Detector, NEVER from guessing
- Look before you act — every action must be justified by what you observed
- image tool = understanding only — use it to decide WHAT to click, get WHERE from OCR/detector
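One way to enforce the first rule in code is to derive every click point from a detection result and refuse to act when no unambiguous match exists. The `Detection` record and `click_point` helper below are hypothetical; the actual output format of the OCR step and GPA-GUI-Detector is described in the sub-skill documents and may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector/OCR result: a label and its bounding box."""
    label: str
    x: int       # left edge, pixels
    y: int       # top edge, pixels
    width: int
    height: int


def click_point(detections: Iterable[Detection], label: str) -> Tuple[int, int]:
    """Return the center of the single detection matching label.

    Raises LookupError instead of guessing when there is no match
    or more than one match.
    """
    matches = [d for d in detections if d.label == label]
    if len(matches) != 1:
        raise LookupError(f"expected 1 match for {label!r}, found {len(matches)}")
    d = matches[0]
    return (d.x + d.width // 2, d.y + d.height // 2)
```

For example, a detection labeled "Save" at x=100, y=40 with size 80×20 yields the click point (140, 50).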
Sub-Skills Reference
| Sub-Skill | When to read |
|---|---|
| skills/gui-observe/SKILL.md | Before screenshots or detection |
| skills/gui-learn/SKILL.md | Before learning a new app |
| skills/gui-act/SKILL.md | Before any click/type action |
| skills/gui-memory/SKILL.md | For memory structure details |
| skills/gui-workflow/SKILL.md | For multi-step navigation |
| skills/gui-setup/SKILL.md | For first-time machine setup |
| skills/gui-report/SKILL.md | For task performance reporting |
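
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install gui-claw
- Once installation finishes, invoke the Skill by name or trigger it with /gui-claw.
- Provide the inputs required by the Skill's parameter description to get structured output.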
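What is GUI Agent?
GUI automation via visual detection. Clicking, typing, reading content, navigating menus, filling forms, all through a screenshot → detect → act workflow. Sup... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 189 downloads to date.
How do I install GUI Agent?
Run the command /install gui-claw in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is GUI Agent free?
Yes. GUI Agent is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does GUI Agent support?
GUI Agent is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops GUI Agent?
It is developed and maintained by AlfredJamesLi (@alfredjamesli); the current version is v1.0.1.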