/install claw-vision
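Capability
Local image path → structured text understanding. Calls Gemini 3.1 Pro Preview (NUWA Flux) through vision-tool.py.
Trigger scenarios
- The user sends a screenshot, photo, or image file
- Keywords: 截图 (screenshot), 图片里有什么 (what is in the image), 识别 (recognize), screenshot, describe image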
Invocation
python3 ~/Documents/OpenClaw/workspace/scripts/vision-tool.py <absolute image path> "<prompt>"
Parameters
| Parameter | Required | Default |
|---|---|---|
| Image path | ✅ | — |
| Prompt | ✅ | "图片里有什么?" ("What is in the image?") |
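The invocation and parameters above can be sketched as a small Python wrapper. This is a minimal sketch, not part of the skill itself: `build_command` and `run_vision_tool` are hypothetical helper names, and the sketch assumes the tool writes its result to standard output and that the default prompt applies when the caller supplies none.

```python
import os
import subprocess

# Script location and default prompt as given in this document.
SCRIPT_PATH = "~/Documents/OpenClaw/workspace/scripts/vision-tool.py"
DEFAULT_PROMPT = "图片里有什么?"


def build_command(image_path, prompt=DEFAULT_PROMPT):
    """Return the argv list for one vision-tool.py call (hypothetical helper)."""
    # The tool expects an absolute path, so expand "~" and resolve relative paths.
    absolute_image = os.path.abspath(os.path.expanduser(image_path))
    script = os.path.expanduser(SCRIPT_PATH)
    return ["python3", script, absolute_image, prompt]


def run_vision_tool(image_path, prompt=DEFAULT_PROMPT):
    """Run the tool and return its output text.

    Assumption: the result is printed to stdout and a non-zero exit
    code signals failure (check=True then raises CalledProcessError).
    """
    result = subprocess.run(
        build_command(image_path, prompt),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the prompt or the path contains spaces.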
Supported formats
PNG / JPG / JPEG / GIF / WEBP (local files only; URLs are not supported)
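A caller could check these constraints before invoking the tool. The following is a minimal sketch under stated assumptions: `check_image_path` is a hypothetical helper, the format is judged from the file extension only, and anything containing `://` is treated as a URL. How vision-tool.py itself validates input is not stated here.

```python
import os

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def check_image_path(path):
    """Return None if the path looks usable, otherwise a short reason string."""
    # Only local files are supported, so reject anything that looks like a URL.
    if "://" in path:
        return "URLs are not supported; download the image first"
    # Compare the extension case-insensitively (".PNG" is accepted).
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return "unsupported format: " + (extension or "(no extension)")
    if not os.path.isfile(os.path.expanduser(path)):
        return "file not found"
    return None
```

Returning a reason string instead of raising lets an agent relay the problem to the user directly.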
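Output format
[summary] Overview of the image content
[fields] Key fields extracted (when the image contains text or tables)
[ui_elements] List of interface elements (for UI screenshots)
[confidence] Confidence level: high / medium / low (高/中/低)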
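The tagged sections above lend themselves to simple line-based parsing. This is a minimal sketch that assumes each section starts on its own line with its bracketed tag and that untagged lines continue the previous section; the exact layout the tool emits is not specified here, and `parse_output` is a hypothetical helper.

```python
import re

# The four section tags named in the output format.
TAG_LINE = re.compile(r"^\[(summary|fields|ui_elements|confidence)\]\s*(.*)$")


def parse_output(text):
    """Split tagged tool output into a dict mapping tag name to section text."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = TAG_LINE.match(line)
        if match:
            # A tag line opens a new section; text after the tag is its first line.
            current = match.group(1)
            sections[current] = [match.group(2)] if match.group(2) else []
        elif current is not None:
            # Untagged lines belong to the most recent section.
            sections[current].append(line)
    return {tag: "\n".join(lines) for tag, lines in sections.items()}
```

Sections that do not apply to an image (for example `[ui_elements]` for a photo) are simply absent from the result, so callers should use `dict.get` rather than indexing.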
Dependencies
- vision-tool.py: ~/Documents/OpenClaw/workspace/scripts/vision-tool.py
- API: NUWA Flux gemini-3.1-pro-preview
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install claw-vision
- After installation, invoke the skill by name or use /claw-vision
- Provide required inputs per the skill's parameter spec and get structured output
What is Claw Vision?
Analyze local images including screenshots, receipts, and documents to extract structured text, UI elements, and provide content summaries with confidence le... It is an AI Agent Skill for Claude Code / OpenClaw, with 561 downloads so far.
How do I install Claw Vision?
Run "/install claw-vision" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Claw Vision free?
Yes, Claw Vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Claw Vision support?
Claw Vision is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Claw Vision?
It is built and maintained by Puma1981 (@puma1981); the current version is v1.0.0.