
explain image

by andyzwp · GitHub ↗ · v1.0.1 · MIT-0
cross-platform · ⚠ suspicious
1585 Downloads · 1 Star · 1 Active Install · 2 Versions
Install in OpenClaw
/install image-read
Description
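Uses Zhipu AI's free GLM-4V-Flash multimodal API to understand image content. Use this skill when the user needs to understand an image, describe it, or identify the objects in it.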
README (SKILL.md)

Image Understanding Skill

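This skill interprets image content using Zhipu AI's free GLM-4V-Flash multimodal API.

When to use

Use this skill when the user needs to understand what an image contains, for example:

  • "What is in this picture?"
  • "Describe this image."
  • "What does this cell image show?"
  • "Analyze the content of this image."

Prerequisites

The user needs to:

  1. Register an account at https://bigmodel.cn/
  2. Obtain an API key at https://bigmodel.cn/console/apikeys
  3. Provide the API key through the ZHIPU_API_KEY environment variable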
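
A minimal sketch of how a script might read the key from the environment and fail with a clear message when it is missing. The helper name `load_api_key` is hypothetical and is not taken from the bundled script.

```python
import os


def load_api_key(env=os.environ, name="ZHIPU_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `load_api_key` is an illustrative helper, not part of the skill.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; create a key at "
            "https://bigmodel.cn/console/apikeys and export it first"
        )
    return key
```

Passing the mapping in as `env` keeps the function easy to check without touching the real environment.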

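Usage

Option 1: Use the bundled script

The skill ships a scripts/analyze_image.py script that can be called directly:

python scripts/analyze_image.py <image_path> "<question>"

Arguments:

  • <image_path>: path to the image file (JPG recommended)
  • <question>: the question to ask, e.g. "What is in this image?"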
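
The bundled script is not reproduced on this page, so the following is only a sketch of how a command line with those two positional arguments could be parsed. The function name `parse_args` and the attribute names are assumptions; the real script may handle its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments shown in the usage line.

    Hypothetical sketch; not copied from scripts/analyze_image.py.
    """
    parser = argparse.ArgumentParser(
        description="Ask a question about an image."
    )
    parser.add_argument("image_path", help="path to the image file (JPG recommended)")
    parser.add_argument("question", help="the question to ask about the image")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"image: {args.image_path}")
    print(f"question: {args.question}")
```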

Option 2: Call the API manually

If the script is unavailable, call the Zhipu API directly from Python:

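import os

from zhipuai import ZhipuAI

# Read the key from the environment instead of hard-coding it.
client = ZhipuAI(api_key=os.environ["ZHIPU_API_KEY"])

response = client.chat.completions.create(
    model="glm-4v-flash",  # the free model this skill is built around
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image? Please describe it in detail."},
                {"type": "image_url", "image_url": {"url": "<image URL or base64 string>"}}
            ]
        }
    ]
)

# The description is the content of the first choice's message.
print(response.choices[0].message.content)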
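
The `url` field in the call above takes either an image URL or base64 data. A minimal sketch of how a local path or URL might be turned into that value follows. The helper names `to_image_url` and `build_messages` are hypothetical, and the sketch assumes the API accepts a raw base64 string; some compatible endpoints expect a `data:` URI instead, so check the provider's documentation.

```python
import base64
from pathlib import Path


def to_image_url(source):
    """Return a value for the image_url field (illustrative helper).

    Remote URLs pass through unchanged; local files are base64-encoded.
    Assumption: the API accepts the raw base64 string without a data: prefix.
    """
    if source.startswith(("http://", "https://")):
        return source
    return base64.b64encode(Path(source).read_bytes()).decode("ascii")


def build_messages(source, question):
    """Build the messages list in the shape used by the example call."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": to_image_url(source)}},
            ],
        }
    ]
```

The result of `build_messages(...)` can be passed as the `messages` argument of `client.chat.completions.create`. Note that a local file encoded this way is uploaded to the remote service in full.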

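Output format

Returns a detailed description of the image, including:

  • Main objects or people in the image
  • Scene and background
  • Visual features such as color and layout
  • Text, if any
  • Possible meaning or inferences

Notes

  • GLM-4V-Flash is completely free but rate-limited
  • Accepts an image URL or Base64-encoded image data
  • Works best with images up to 1024x1024
  • JPG is recommended; PNG may have compatibility issues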
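
Because the free tier is rate-limited, a caller may want to retry a failed request after a pause. The sketch below shows generic exponential backoff around any callable. The name `call_with_retry` is hypothetical, and it catches every exception for simplicity because the exact error the SDK raises on a rate limit is not documented on this page.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on failure.

    Illustrative helper. Catching Exception is a simplification: a real
    caller should retry only on rate-limit errors, not on bad credentials.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

A request would be wrapped as `call_with_retry(lambda: client.chat.completions.create(...))`.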
Usage Guidance
This skill legitimately sends images to the ZhipuAI/GLM-4V API and needs your ZHIPU_API_KEY. Before installing or running it: (1) be aware that any local image you pass will be uploaded to a third-party service (bigmodel.cn); avoid sending sensitive images. (2) The registry metadata failed to declare the required ZHIPU_API_KEY — treat that as a transparency/quality issue and confirm you are comfortable providing that secret. (3) If you must install the zhipuai package, inspect the package source (PyPI) and version; prefer installing in a virtualenv. (4) If you want higher assurance, ask the publisher to update the skill metadata to declare ZHIPU_API_KEY and provide a reproducible install spec, or request the skill be reviewed/published from a known source. If you trust bigmodel.cn and are comfortable uploading images and using your API key, the skill's behavior is coherent with its description; otherwise do not install.
Capability Analysis
Type: OpenClaw Skill · Name: image-read · Version: 1.0.1
The skill bundle provides a legitimate implementation for image analysis using the ZhipuAI GLM-4V-Flash API. The primary script, `scripts/analyze_image.py`, correctly handles image encoding and API communication without any signs of data exfiltration, unauthorized network calls, or malicious execution. The instructions in `SKILL.md` are consistent with the code's functionality and do not contain any prompt-injection attempts or suspicious commands.
Capability Assessment
Purpose & Capability
The skill's stated purpose (image understanding via 智谱AI/GLM-4V) matches the code and instructions: it uses the zhipuai SDK and the GLM-4V model. However the registry metadata declares no required environment variables or credentials while both SKILL.md and scripts clearly require a ZHIPU_API_KEY — this metadata mismatch is an incoherence.
Instruction Scope
SKILL.md and the script stay within the stated purpose: they read either a local image or an image URL, encode local files to base64, and call the GLM-4V API. The instructions prompt for an API key if not set and tell users to register at bigmodel.cn. Note: local image files are read and uploaded to an external third‑party service (bigmodel.cn), which is expected for this functionality but has privacy implications.
Install Mechanism
There is no install spec (instruction-only), and the Python script requires the third-party package 'zhipuai' (the script prints a pip install suggestion). This is a normal setup but the package installation is not automated nor pinned; you should verify the package source (PyPI) and review it before installing.
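
A minimal sketch of the kind of dependency check described here: test whether a package is importable and print an install hint if it is not. The helper names are hypothetical, and the bundled script may implement its check differently.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require(name, pip_name=None):
    """Print a pip hint and return False when the module is missing."""
    if has_module(name):
        return True
    print(f"Missing dependency: run `pip install {pip_name or name}` "
          "(ideally inside a virtualenv, after reviewing the package).")
    return False
```

Calling `require("zhipuai")` before importing the SDK lets a script exit with a readable message rather than a traceback.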
Credentials
The script and SKILL.md require a single credential ZHIPU_API_KEY (appropriate for the service). However the skill's declared registry requirements list no required environment variables or primary credential — that omission is inconsistent and reduces transparency about what secrets will be used. No unrelated credentials are requested.
Persistence & Privilege
The skill does not request persistent presence (always:false), does not modify other skills or system configs, and does not write persistent credentials. It only reads the environment or prompts the user at runtime.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install image-read
  3. After installation, invoke the skill by name or use /image-read
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
No changes detected in this version.
  • No file changes were made.
  • Functionality and documentation remain the same as the previous release.
v1.0.0
Image-understanding v1.0.0
  • Initial public release of the skill for image understanding.
  • Added support for analyzing and describing image content using the ZhipuAI GLM-4V-Flash API.
  • Includes user-friendly usage instructions and requirements in the new documentation.
  • Provided a ready-to-use Python script for direct command line usage.
  • Updated and clarified documentation, replacing the previous version and standardizing filenames for consistency.
Metadata
Slug image-read
Version 1.0.1
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 2
Frequently Asked Questions

What is explain image?

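explain image uses Zhipu AI's free GLM-4V-Flash multimodal API to understand image content. Use it when you need to understand an image, describe it, or identify the objects in it. It is an AI Agent Skill for Claude Code / OpenClaw, with 1585 downloads so far.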

How do I install explain image?

Run "/install image-read" in the OpenClaw or Claude Code chat to install it. The skill also needs a ZHIPU_API_KEY environment variable and the zhipuai Python package, neither of which is set up automatically.

Is explain image free?

Yes, explain image is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does explain image support?

explain image is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created explain image?

It is built and maintained by andyzwp (@andyzwp); the current version is v1.0.1.
