
Claw Vision

Name: Claw Vision
Author: puma1981

Author: Puma1981 · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 561

Current installs: 1

Version count: 1

Install in OpenClaw

/install claw-vision

Description

Analyze local images including screenshots, receipts, and documents to extract structured text, UI elements, and provide content summaries with confidence le...

Usage (SKILL.md)

Capability

Local image path → structured text understanding. Calls Gemini 3.1 Pro Preview (NUWA Flux) through vision-tool.py.

Triggers

The user sends a screenshot, photo, or image file
Keywords: 截图 (screenshot), 图片里有什么 (what is in the image), 识别 (recognize), screenshot, describe image
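
The keyword trigger can be pictured as a plain substring match. The following is a minimal sketch, assuming case-insensitive matching; `should_trigger` and its `has_image_attachment` flag are hypothetical names and are not part of the skill or of OpenClaw.

```python
# Keywords copied from the trigger list above.
TRIGGER_KEYWORDS = ("截图", "图片里有什么", "识别", "screenshot", "describe image")


def should_trigger(message: str, has_image_attachment: bool = False) -> bool:
    """Return True when the message carries an image or contains a trigger keyword.

    Hypothetical helper: how the agent actually matches triggers is not documented.
    """
    if has_image_attachment:
        return True
    lowered = message.lower()
    return any(keyword in lowered for keyword in TRIGGER_KEYWORDS)
```

Substring matching is deliberately loose; a short keyword such as 识别 will also fire on unrelated requests, so a real matcher would probably want more context.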

Invocation

python3 ~/Documents/OpenClaw/workspace/scripts/vision-tool.py <absolute-image-path> "<prompt>"

Parameters

Parameter	Required	Default
Image path	✅	—
Prompt	✅	"图片里有什么？" ("What is in the image?")
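
The two parameters map onto an argument list fairly directly. The sketch below is one possible caller-side helper, not code from the skill: `build_argv` is a hypothetical name, and it treats the listed default prompt as a fallback even though the table marks the prompt as required.

```python
from pathlib import Path

# Script location and default prompt as stated in SKILL.md.
SCRIPT = Path("~/Documents/OpenClaw/workspace/scripts/vision-tool.py")
DEFAULT_PROMPT = "图片里有什么？"


def build_argv(image_path: str, prompt: str = DEFAULT_PROMPT) -> list[str]:
    """Build the argument list for vision-tool.py (hypothetical helper)."""
    path = Path(image_path).expanduser()
    if not path.is_absolute():
        # SKILL.md asks for an absolute image path.
        raise ValueError(f"image path must be absolute: {image_path}")
    return ["python3", str(SCRIPT.expanduser()), str(path), prompt]
```

Returning a list rather than a formatted string keeps the prompt as a single argument when the list is later handed to a process launcher.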

Supported formats

PNG / JPG / JPEG / GIF / WEBP (local files only; URLs are not supported)
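
A caller could reject unsupported inputs before invoking the script. This is a minimal sketch under the assumption that the format is judged by file extension alone; `is_supported_image` is a hypothetical helper, and it does not check that the file exists or that its contents are a valid image.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def is_supported_image(path_str: str) -> bool:
    """True for a local path with a supported extension; URLs are rejected."""
    if "://" in path_str:
        # The skill only accepts local files, so anything URL-like is refused.
        return False
    return Path(path_str).suffix.lower() in SUPPORTED_SUFFIXES
```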

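Output format

[summary]     Overview of the image content
[fields]      Key fields extracted (when the image contains text or tables)
[ui_elements] List of UI elements (for UI screenshots)
[confidence]  Confidence: high / medium / low (高/中/低)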
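
If the tool really emits these bracketed tags at the start of a line, the result can be split into sections with a small parser. The exact output layout is an assumption here, since vision-tool.py is not included in the bundle; `parse_output` is a hypothetical helper.

```python
import re

# Matches a line that opens one of the four documented sections.
TAG_LINE = re.compile(r"^\[(summary|fields|ui_elements|confidence)\]\s*(.*)$")


def parse_output(text: str) -> dict[str, str]:
    """Group tagged output into a dict keyed by section name."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if match:
            current = match.group(1)
            first = match.group(2).strip()
            sections[current] = [first] if first else []
        elif current is not None and line.strip():
            # Untagged, non-empty lines continue the most recent section.
            sections[current].append(line.strip())
    return {tag: "\n".join(lines) for tag, lines in sections.items()}
```

Sections that the tool omits (for example `[ui_elements]` on a receipt photo) are simply absent from the returned dict.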

Dependencies

vision-tool.py: ~/Documents/OpenClaw/workspace/scripts/vision-tool.py
API: NUWA Flux gemini-3.1-pro-preview

Security recommendations

Do not run or give this skill access until you verify the helper script. Ask the author to provide the vision-tool.py source or bundle it with the skill so it can be reviewed. Confirm how NUWA/Gemini credentials are stored and ensure they are not hardcoded in an opaque script. If you must test, inspect the script manually or run it inside a restricted sandbox (container) to prevent unintended file access or network exfiltration. Prefer skills that declare required env vars and include or link to verifiable code or an install step from a trusted release URL.
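
One way to make a manual review stick is to record a hash of the script you inspected and refuse to run anything that differs. This is a sketch of that idea only; the function names are hypothetical, and the skill itself ships no such check.

```python
import hashlib
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def script_is_unchanged(script_path: Path, reviewed_digest: str) -> bool:
    """Compare the script on disk with the digest recorded at review time."""
    return sha256_hex(script_path.read_bytes()) == reviewed_digest
```

A hash pin only shows that the file has not changed since review; it says nothing about whether the reviewed code was safe in the first place.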

Functional analysis

Type: OpenClaw Skill
Name: claw-vision
Version: 1.0.0

The skill bundle defines an execution pattern in SKILL.md that is vulnerable to shell injection: it instructs the agent to pass user-controlled strings directly into a shell command (python3 ... "<提示语>", where 提示语 is the prompt placeholder). It also references a fictional model version (Gemini 3.1 Pro) and relies on an external script (vision-tool.py) located in the user's home directory rather than including it in the bundle. While no clear malicious intent or exfiltration logic is present, the insecure command construction and external dependencies are high-risk indicators.
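
The injection risk comes from building a shell string around the prompt. Passing an argument list to `subprocess.run` without `shell=True` avoids it, because no shell ever parses the prompt. The sketch below uses a stand-in child process that echoes its first argument, so it runs without vision-tool.py; `run_with_argv` is a hypothetical name.

```python
import subprocess
import sys


def run_with_argv(prompt: str) -> str:
    """Pass a prompt to a child process as one literal argument."""
    # Stand-in child that prints its first argument; it replaces
    # vision-tool.py here only so the sketch can run anywhere.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", prompt]
    # No shell=True: quotes, semicolons, and $() in the prompt are never
    # interpreted by a shell.
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")
```

Calling `run_with_argv('"; echo INJECTED; "')` returns the hostile string unchanged, whereas interpolating the same text into a shell command line would let the `echo` run.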

Capability assessment

ℹ Purpose & Capability

The declared purpose (analyze local images) matches the instruction to run a local vision tool. However, the skill depends on a hardcoded user-local script path (~/Documents/OpenClaw/workspace/scripts/vision-tool.py) and references the NUWA/Gemini API without declaring how authentication should be provided. Requiring an arbitrary local script at that path is unusual and not justified in the SKILL.md.

⚠ Instruction Scope

SKILL.md instructs the agent to run a user-local Python script with an arbitrary image path and prompt. Because the script is not included, its behavior is unknown — it could read any files under the user's home, access network endpoints, or perform other actions. The instructions do not place limits on what the script may do or where credentials come from.

✓ Install Mechanism

There is no install spec and no code files in the skill bundle (instruction-only), which minimizes risk from untrusted downloads. However, the lack of an included script means the agent will rely on an external file that cannot be analyzed.

⚠ Credentials

The SKILL.md references NUWA Flux / gemini-3.1-pro-preview but declares no required environment variables or primary credential. It's unclear how the vision-tool.py authenticates to the external API (missing API key/env guidance). This mismatch is a red flag: either credentials are expected to exist elsewhere on the system or the script will prompt/handle them — both are security-relevant behaviors that should be declared.
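
Declaring the credential as an environment variable would address this finding. The sketch below shows the general pattern only: `NUWA_API_KEY` is a made-up variable name, since the skill declares none, and how vision-tool.py actually authenticates is unknown.

```python
import os
from collections.abc import Mapping

# Hypothetical variable name; SKILL.md does not declare any credential.
API_KEY_ENV = "NUWA_API_KEY"


def load_api_key(environ: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment and fail loudly if it is missing."""
    key = environ.get(API_KEY_ENV, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV} before running the vision tool")
    return key
```

Failing fast on a missing key keeps the credential out of the script source and makes the requirement visible to anyone reviewing the skill.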

ℹ Persistence & Privilege

The skill does not request persistent or always-on privileges (always:false). Nevertheless, it instructs execution of a local script which can run arbitrary code when invoked; that is a runtime privilege but not a declared persistent capability. No modifications to other skills or system-wide settings are specified.

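How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install claw-vision
Once installed, invoke the skill by name or trigger it with /claw-vision
Provide the inputs listed in the skill's parameters to get structured output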

Version history

v1.0.0

claw-vision 1.0.0
- Initial release.
- Analyze images using Gemini 3.1 Pro Preview (NUWA Flux) via vision-tool.py.
- Supports local PNG, JPG, JPEG, GIF, and WEBP files.
- Outputs structured results: summary, key fields, UI elements, and confidence level.
- Triggered when users send image files or relevant keywords/commands.

Metadata

Slug claw-vision

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Version count 1

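FAQ

What is Claw Vision?

Analyze local images including screenshots, receipts, and documents to extract structured text, UI elements, and provide content summaries with confidence le... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 561 total downloads to date.

How do I install Claw Vision?

Run the command "/install claw-vision" in the OpenClaw or Claude Code chat box to install it in one step. The bundle itself has no install steps, but the skill expects vision-tool.py to already exist at the path given in SKILL.md and does not document how API credentials are supplied.

Is Claw Vision free?

Yes. Claw Vision is free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does Claw Vision support?

Claw Vision is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Claw Vision?

It is developed and maintained by Puma1981 (@puma1981); the current version is v1.0.0.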