andyzwp

explain image

Author: andyzwp · GitHub · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 1585
Favorites: 1
Current installs: 1
Versions: 2
Install in OpenClaw
/install image-read
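Description
Uses ZhipuAI's free GLM-4V-Flash multimodal API to understand image content. Use this skill when the user needs to understand an image, have it described, or identify the objects in it.
Usage instructions (SKILL.md)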

Image Understanding Skill

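This skill interprets image content using ZhipuAI's free GLM-4V-Flash multimodal API.

When to use

Use this skill when the user needs to understand the content of an image, for example:

  • "What is in this picture?"
  • "Describe this image"
  • "What does this cell image show?"
  • "Analyze the content of this image"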

Prerequisites

The user needs to:

  1. Register an account at https://bigmodel.cn/
  2. Obtain an API key at https://bigmodel.cn/console/apikeys
  3. Provide the API key through the environment variable ZHIPU_API_KEY
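
A minimal sketch of how a caller might read the key from the environment and fail early with a clear message. The helper name load_api_key and its parameters are illustrative assumptions, not part of the skill's script.

```python
import os


def load_api_key(env=None, name="ZHIPU_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `load_api_key` is a hypothetical helper, not taken from the skill.
    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; create a key at "
            "https://bigmodel.cn/console/apikeys and export it first."
        )
    return key
```

Failing before any network call keeps a missing key from surfacing later as an opaque authentication error.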

Usage

Option 1: Use the bundled script

The skill provides the script scripts/analyze_image.py, which can be called directly:

python scripts/analyze_image.py <image path> "<question>"

Parameters:

  • <image path>: path to the image file (JPG recommended)
  • <question>: the question to ask, e.g. "What is in this image?"
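
The same command can also be launched from Python. The sketch below assumes the script prints its answer to standard output and exits with a non-zero code on failure; neither is confirmed here, and build_command and run_analysis are hypothetical names.

```python
import subprocess
import sys


def build_command(image_path, question, script="scripts/analyze_image.py"):
    """Build the argument list; a list avoids shell-quoting problems in the question."""
    return [sys.executable, script, image_path, question]


def run_analysis(image_path, question):
    """Run the script and return whatever it printed to stdout (assumed behavior)."""
    result = subprocess.run(
        build_command(image_path, question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Passing the arguments as a list means a question containing spaces or quotes reaches the script as a single argument.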

Option 2: Call the API manually

If the script is not available, call the ZhipuAI API directly from Python:

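import os

from zhipuai import ZhipuAI

# Read the key from the environment instead of hard-coding it.
client = ZhipuAI(api_key=os.environ["ZHIPU_API_KEY"])

response = client.chat.completions.create(
    model="glm-4v-flash",  # the free GLM-4V-Flash model this skill targets
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image? Please describe it in detail."},
                {"type": "image_url", "image_url": {"url": "image URL or Base64 string"}}
            ]
        }
    ]
)

# The description is in the first choice's message.
print(response.choices[0].message.content)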
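
The "url" field takes either an image URL or Base64 data. Below is a minimal sketch of preparing that value for a local file or a remote URL, assuming the API accepts a bare Base64 string in the field. The helpers to_image_url and build_messages are hypothetical names, not taken from the skill's script.

```python
import base64
from pathlib import Path


def to_image_url(source):
    """Return the value for the "url" field.

    Remote URLs pass through unchanged; anything else is treated as a
    local file path and returned as a Base64 string (assumed to be accepted).
    """
    if source.startswith(("http://", "https://")):
        return source
    return base64.b64encode(Path(source).read_bytes()).decode("ascii")


def build_messages(source, question):
    """Build the messages list in the same shape as the manual API call."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": to_image_url(source)}},
            ],
        }
    ]
```

The returned list can be passed as the messages argument of client.chat.completions.create. Note that a local file handled this way is uploaded to the third-party service.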

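Output format

Returns a detailed description of the image, including:

  • The main objects or people in the image
  • The scene or background
  • Visual features such as color and layout
  • Text, if any
  • Possible meanings or inferences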

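Notes

  • GLM-4V-Flash is completely free, but calls are rate-limited
  • Accepts an image URL or Base64-encoded image data
  • Works best with images of 1024x1024 pixels or smaller
  • JPG is recommended; PNG may have compatibility issues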
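
Because calls are rate-limited, a caller may want to retry with exponential backoff. The sketch below is generic: call_with_backoff is a hypothetical helper, and the exception type the SDK raises on a rate-limit response is not stated here, so the caller supplies it through retry_on.

```python
import time


def call_with_backoff(fn, retry_on=(Exception,), attempts=4, base_delay=1.0,
                      sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with doubling delays.

    Delays are base_delay, 2*base_delay, 4*base_delay, ...; the last
    failure is re-raised. `sleep` is injectable so tests need not wait.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A call would wrap the request in a zero-argument function, for example call_with_backoff(lambda: client.chat.completions.create(...), retry_on=(SomeRateLimitError,)), where SomeRateLimitError stands in for whatever exception the SDK actually raises.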
Security recommendations
This skill legitimately sends images to the ZhipuAI/GLM-4V API and needs your ZHIPU_API_KEY. Before installing or running it: (1) be aware that any local image you pass will be uploaded to a third-party service (bigmodel.cn); avoid sending sensitive images. (2) The registry metadata failed to declare the required ZHIPU_API_KEY — treat that as a transparency/quality issue and confirm you are comfortable providing that secret. (3) If you must install the zhipuai package, inspect the package source (PyPI) and version; prefer installing in a virtualenv. (4) If you want higher assurance, ask the publisher to update the skill metadata to declare ZHIPU_API_KEY and provide a reproducible install spec, or request the skill be reviewed/published from a known source. If you trust bigmodel.cn and are comfortable uploading images and using your API key, the skill's behavior is coherent with its description; otherwise do not install.
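
One way to act on point (3) is to check which version of the package, if any, is installed in the active environment before running the skill. This sketch uses only the standard library; installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="zhipuai"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Running this inside a fresh virtualenv shows exactly what pip installed there, which can then be compared against the release listed on PyPI.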
Functional analysis
Type: OpenClaw Skill
Name: image-read
Version: 1.0.1
The skill bundle provides a legitimate implementation for image analysis using the ZhipuAI GLM-4V-Flash API. The primary script, `scripts/analyze_image.py`, correctly handles image encoding and API communication without any signs of data exfiltration, unauthorized network calls, or malicious execution. The instructions in `SKILL.md` are consistent with the code's functionality and do not contain any prompt-injection attempts or suspicious commands.
Capability assessment
Purpose & Capability
The skill's stated purpose (image understanding via 智谱AI/GLM-4V) matches the code and instructions: it uses the zhipuai SDK and the GLM-4V model. However the registry metadata declares no required environment variables or credentials while both SKILL.md and scripts clearly require a ZHIPU_API_KEY — this metadata mismatch is an incoherence.
Instruction Scope
SKILL.md and the script stay within the stated purpose: they read either a local image or an image URL, encode local files to base64, and call the GLM-4V API. The instructions prompt for an API key if not set and tell users to register at bigmodel.cn. Note: local image files are read and uploaded to an external third‑party service (bigmodel.cn), which is expected for this functionality but has privacy implications.
Install Mechanism
There is no install spec (instruction-only), and the Python script requires the third-party package 'zhipuai' (the script prints a pip install suggestion). This is a normal setup but the package installation is not automated nor pinned; you should verify the package source (PyPI) and review it before installing.
Credentials
The script and SKILL.md require a single credential ZHIPU_API_KEY (appropriate for the service). However the skill's declared registry requirements list no required environment variables or primary credential — that omission is inconsistent and reduces transparency about what secrets will be used. No unrelated credentials are requested.
Persistence & Privilege
The skill does not request persistent presence (always:false), does not modify other skills or system configs, and does not write persistent credentials. It only reads the environment or prompts the user at runtime.
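How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install image-read
  3. After installation, invoke the Skill by name or trigger it with /image-read
  4. Provide the inputs described in the Skill's parameter documentation to get structured output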
Version history
v1.0.1
No changes detected in this version.
  • No file changes were made.
  • Functionality and documentation remain the same as the previous release.
v1.0.0
Image-understanding v1.0.0
  • Initial public release of the skill for image understanding.
  • Added support for analyzing and describing image content using the ZhipuAI GLM-4V-Flash API.
  • Includes user-friendly usage instructions and requirements in the new documentation.
  • Provided a ready-to-use Python script for direct command line usage.
  • Updated and clarified documentation, replacing the previous version and standardizing filenames for consistency.
Metadata
Slug: image-read
Version: 1.0.1
License: MIT-0
Total installs: 1
Current installs: 1
Version count: 2
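FAQ

What is explain image?

It uses ZhipuAI's free GLM-4V-Flash multimodal API to understand image content. Use this skill when the user needs to understand an image, have it described, or identify the objects in it. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1585 total downloads to date.

How do I install explain image?

Run the command "/install image-read" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the skill requires the ZHIPU_API_KEY environment variable at run time.

Is explain image free?

Yes. explain image is free to download, install, and use under the MIT-0 license. The GLM-4V-Flash API it calls is also free, subject to rate limits.

Which platforms does explain image support?

explain image is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed explain image?

It is developed and maintained by andyzwp (@andyzwp); the current version is v1.0.1.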
