
image-understanding

Name: image-understanding
Author: isabellazhangym

by IsabellaZhangYM · GitHub ↗ · v0.0.4

cross-platform ✓ Security Clean

599 Downloads

Install in OpenClaw

/install image-understanding

Description

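Integration plugin for Zhipu's GLM-4.6V multimodal vision model. Supports 128K long context, document parsing, video understanding, and native tool calling. Includes industrial-grade security audit guidance.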

README (SKILL.md)

---
name: glm-4.6v-connector
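description: "Professional integration plugin for Zhipu's GLM-4.6V multimodal vision model. Supports image understanding, 128K document parsing, and automated tool calling."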
version: "1.0.0"
homepage: "https://github.com/zai-org/GLM-V"
repository: "https://github.com/zai-org/GLM-V.git"
authors: ["IsabellaZhangYM"]
license: "MIT"

# 🛠️ Key fix: add the metadata declarations required by the Registry
requirements:
  environment_variables:
    - ZHIPUAI_API_KEY
  dependencies:
    python:
      - "zhipuai>=2.1.0"
  install_command: "pip install zhipuai"

credentials:
  ZHIPUAI_API_KEY:
    description: "API key for the Zhipu AI open platform (bigmodel.cn)"
    required: true
    source: "environment_variable"
---

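# 👁️ GLM-4.6V Vision Model Integration Guide

This Skill gives developers secure, efficient access to Zhipu's multimodal models. It is suited to automated document processing, UI replication, and intelligent visual understanding.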

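## 🛡️ Security and Compliance Guidelines

1. **Credential security**: This plugin requires credentials to be injected through the `ZHIPUAI_API_KEY` environment variable. Never hard-code keys in source code.
2. **Privacy protection**: Before uploading corporate financial reports, identity documents, or sensitive screenshots, mask the sensitive regions or desensitize the data.
3. **Call auditing**: We recommend enabling logging when the `client` is initialized so that tool call (Function Call) behavior can be traced.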
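
The credential and auditing points above can be sketched with the standard library alone. This is a minimal illustration, not part of the zhipuai SDK: the helper names `load_api_key` and `audit_tool_call` and the logger name `glm_audit` are hypothetical.

```python
import logging
import os

# Hypothetical logger name; attach handlers or levels as your deployment requires.
audit_log = logging.getLogger("glm_audit")


def load_api_key(env=None):
    """Return ZHIPUAI_API_KEY from the environment, failing fast if it is missing."""
    env = os.environ if env is None else env
    key = env.get("ZHIPUAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ZHIPUAI_API_KEY is not set; export it before running.")
    return key


def audit_tool_call(name, arguments):
    """Record one tool (function) call requested by the model."""
    audit_log.info("tool_call name=%s arguments=%s", name, arguments)
```

Failing fast on a missing key avoids constructing a client with `api_key=None`, and routing tool calls through one logging helper keeps the audit trail in a single place. Be careful not to log the key itself.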

---

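## ⚡ Quick Start

### 1. Environment setup
Make sure Python 3.8+ and the official SDK are installed:
```bash
pip install zhipuai
```

### 2. Basic call example

```python
import base64
import os

from zhipuai import ZhipuAI

# Read the key from the environment so it never appears in source code
client = ZhipuAI(api_key=os.environ.get("ZHIPUAI_API_KEY"))


def analyze_vision(image_path):
    # Encode the local image as base64 (the data URL below assumes a JPEG file)
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    response = client.chat.completions.create(
        model="glm-4.6v",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the key information from the image and output it as JSON"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content
```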
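
Because the example prompt asks for JSON, the returned string usually needs parsing before use. The sketch below is one hedged way to do that, assuming the model may wrap its answer in a Markdown code fence; `parse_json_reply` is a hypothetical helper, not an SDK function.

```python
import json
import re

# The fence marker is built programmatically so this snippet contains no literal fence.
_TICKS = "`" * 3
_FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)\s*" + _TICKS, re.DOTALL)


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding Markdown code fence."""
    match = _FENCE.search(text)
    payload = match.group(1) if match else text
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    return json.loads(payload.strip())
```

Callers should be prepared to catch `ValueError`, since nothing guarantees that the model's reply is valid JSON.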

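## 🏗️ Core Features and Scenarios

| Scenario | Recommended model | Key capabilities |
| --- | --- | --- |
| High-accuracy OCR | `glm-4.6v` | Complex layouts, handwriting, formula parsing |
| Very long documents / PPT | `glm-4.6v` | 128K context; in-depth summaries of 200-page files |
| Cost-sensitive tasks | `glm-4.6v-flash` | Basic image recognition, completely free |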
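
One way to apply the table in code is a small lookup from scenario to model name. This is only a sketch: the scenario keys (`ocr`, `long_document`, `low_cost`) and the `pick_model` helper are hypothetical labels, while the model names are those listed in the table.

```python
# Scenario keys are hypothetical; model names are taken from the table above.
MODEL_BY_SCENARIO = {
    "ocr": "glm-4.6v",
    "long_document": "glm-4.6v",
    "low_cost": "glm-4.6v-flash",
}


def pick_model(scenario, default="glm-4.6v"):
    """Return the recommended model for a scenario, falling back to a default."""
    return MODEL_BY_SCENARIO.get(scenario, default)
```

The result can be passed as the `model` argument of `client.chat.completions.create`.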

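## 🔗 Developer Resources

- Official documentation: bigmodel.cn/dev/api
- MCP protocol integration: documentation entry point
- Security report: This Skill has passed a preliminary static scan; running it in a sandbox environment is recommended.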

Usage Guidance

This skill appears coherent for integrating ZhipuAI's GLM-4.6V model, but take normal precautions: only provide a ZHIPUAI_API_KEY you trust and avoid pasting it into chats; run initial tests in a sandbox; redact or mask sensitive parts of images before sending; verify the 'zhipuai' Python package on PyPI (watch for typosquatting), pin a specific version (e.g., zhipuai==2.1.x), and review its release/source repository. If you cannot verify the SDK or you handle highly sensitive images, consider using an alternative workflow or isolated environment.

Capability Analysis

Type: OpenClaw Skill · Name: image-understanding · Version: 0.0.4

The skill bundle is benign. It provides an integration for the Zhipu GLM-4.6V multimodal vision model. The `skill.md` correctly specifies dependencies and installation via `pip install zhipuai`. It securely handles API keys by requiring them from environment variables (`ZHIPUAI_API_KEY`) and explicitly advises against hardcoding. The example code demonstrates standard API interaction without any signs of data exfiltration, malicious execution, persistence, or prompt injection against the agent. All external links point to the legitimate `bigmodel.cn` domain.

Capability Assessment

✓ Purpose & Capability

Name/description (GLM-4.6V multimodal image understanding) align with the declared requirements: an API key for ZhipuAI and the zhipuai Python SDK. Requiring ZHIPUAI_API_KEY and the zhipuai package is proportionate for this integration.

✓ Instruction Scope

SKILL.md contains usage examples that only send image data and text to the GLM model and recommends credential handling and data redaction. It does not instruct reading unrelated system files, other env vars, or transmitting data to unexpected endpoints; external endpoints referenced are the documented bigmodel.cn resources.

ℹ Install Mechanism

The registry bundle has no formal install spec, but SKILL.md recommends 'pip install zhipuai' and lists zhipuai>=2.1.0. Installing the SDK via pip is expected for this skill but carries normal supply-chain risk (package install scripts). Verify the package source and pin versions before installing.
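
As a hedged illustration of the version advice, the sketch below checks an installed package version against the declared minimum using only the standard library. The helper names are hypothetical, and `parse_version` handles only plain numeric versions such as `2.1.5`; pre-release or post-release suffixes would need a proper version parser.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.1.5' into (2, 1, 5); only plain numeric release versions are supported."""
    return tuple(int(part) for part in text.split("."))


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(installed, minimum="2.1.0"):
    """Check an installed version string against the skill's declared minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

For example, calling `installed_version("zhipuai")` and passing a non-`None` result to `meets_minimum` confirms that the environment satisfies `zhipuai>=2.1.0` before the skill runs.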

✓ Credentials

Only one environment credential is required (ZHIPUAI_API_KEY), which is appropriate for a hosted-model integration. No unrelated credentials or config paths are requested.

✓ Persistence & Privilege

Skill is instruction-only, always:false, and does not request persistent system-wide privileges or modifications to other skills; autonomous invocation is allowed (platform default) but not elevated by the skill.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install image-understanding
After installation, invoke the skill by name or use /image-understanding
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.0.4

- Added essential metadata including environment variables, dependencies, and install instructions for Registry compatibility.
- Updated and clarified the plugin description and key usage scenarios.
- Improved security guidance, emphasizing environment variable requirements for API keys.
- Provided updated quick-start instructions and sample code.
- Enhanced documentation layout with clearer sections on security, features, and developer resources.

v0.0.3

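- Corrected and completed the metadata required by the Registry, including environment variables, dependencies, and the install command.
- Updated the plugin description to emphasize professional integration and automation scenarios.
- Refined the security and privacy compliance process, making environment-variable credentials an explicit requirement.
- Provided clearer environment setup and official SDK installation instructions.
- Changed the basic model call example to the multimodal input format, highlighting the JSON output use case.
- Adjusted the feature scenario table for readability, noting each model's free-tier policy and recommended scope.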

v0.0.2

Improved compliance.

v0.0.1

ImageUnderstanding_GLM-4.6V

Metadata

Slug image-understanding

Version 0.0.4

License —

All-time Installs 8

Active Installs 8

Total Versions 4

Frequently Asked Questions

What is image-understanding?

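An integration plugin for Zhipu's GLM-4.6V multimodal vision model. It supports 128K long context, document parsing, video understanding, and native tool calling, and includes industrial-grade security audit guidance. It is an AI Agent Skill for Claude Code / OpenClaw, with 599 downloads so far.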

How do I install image-understanding?

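Run "/install image-understanding" in the OpenClaw or Claude Code chat to install it in one step. To call the model, the skill also needs the `zhipuai` Python SDK and a `ZHIPUAI_API_KEY` environment variable, as declared in SKILL.md.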

Is image-understanding free?

Yes, image-understanding is completely free (open-source). You can download, install and use it at no cost.

Which platforms does image-understanding support?

image-understanding is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created image-understanding?

It is built and maintained by IsabellaZhangYM (@isabellazhangym); the current version is v0.0.4.
