image-understanding

Name: image-understanding
Author: isabellazhangym

Author IsabellaZhangYM · GitHub · v0.0.4

cross-platform ✓ Security check passed

Total downloads: 599

Install in OpenClaw

/install image-understanding

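Description

An integration plugin for Zhipu's GLM-4.6V multimodal vision model. It supports 128K long context, document parsing, video understanding, and native tool calling, and includes industrial-grade security audit guidance.

Usage instructions (SKILL.md)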

---
name: glm-4.6v-connector
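description: "Professional integration plugin for Zhipu's GLM-4.6V multimodal vision model. Supports image understanding, 128K document parsing, and automated tool calling."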
version: "1.0.0"
homepage: "https://github.com/zai-org/GLM-V"
repository: "https://github.com/zai-org/GLM-V.git"
authors: ["IsabellaZhangYM"]
license: "MIT"

# 🛠️ Key fix: add the metadata declarations required by the Registry
requirements:
  environment_variables:
    - ZHIPUAI_API_KEY
  dependencies:
    python:
      - "zhipuai>=2.1.0"
  install_command: "pip install zhipuai"

credentials:
  ZHIPUAI_API_KEY:
    description: "API key for the Zhipu AI open platform (bigmodel.cn)"
    required: true
    source: "environment_variable"
---

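# 👁️ GLM-4.6V Vision Model Integration Guide

This Skill gives developers secure, efficient access to Zhipu's multimodal models. It is suited to automated document processing, UI replication, and intelligent visual understanding.

## 🛡️ Security and Compliance Guidelines

1. **Credential security**: This plugin requires credentials to be injected through the `ZHIPUAI_API_KEY` environment variable. Never hardcode a key in source code.
2. **Privacy protection**: Before uploading corporate financial reports, identity documents, or sensitive screenshots, mask or redact the sensitive regions.
3. **Call auditing**: Enable logging when initializing the `client` so that tool-call (function call) behavior can be traced.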
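
The credential and auditing guidelines above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not part of the SDK: the helper names `load_api_key`, `mask_key`, and `audit_tool_call`, and the logger name `glm_vision_audit`, are hypothetical and chosen for this example.

```python
import logging
import os

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("glm_vision_audit")


def load_api_key(env=None):
    """Return ZHIPUAI_API_KEY from the environment, failing fast if it is unset."""
    env = os.environ if env is None else env
    key = env.get("ZHIPUAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ZHIPUAI_API_KEY is not set; export it before running.")
    return key


def mask_key(key):
    """Hide all but the last 4 characters so the full key never reaches a log."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]


def audit_tool_call(name, arguments):
    """Log the requested tool name and argument names only, never the values."""
    logger.info("tool_call name=%s arg_keys=%s", name, sorted(arguments))
```

Calling `logging.basicConfig(level=logging.INFO)` once at startup would make the audit lines visible. Logging argument names rather than values is a deliberate choice here, since values may contain the sensitive data that guideline 2 asks you to redact.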

---

## ⚡ Quick Start

### 1. Environment setup
Make sure Python 3.8+ and the official SDK are installed in your environment:
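```bash
pip install zhipuai
```

### 2. Basic invocation example

```python
import base64
import os

from zhipuai import ZhipuAI

# Read the key from an environment variable so it never appears in source code
client = ZhipuAI(api_key=os.environ.get("ZHIPUAI_API_KEY"))

def analyze_vision(image_path):
    # Encode the local image as base64 for the data URL (assumes a JPEG file)
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    response = client.chat.completions.create(
        model="glm-4.6v",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the key information in the image and output it as JSON"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}}
            ]
        }]
    )
    return response.choices[0].message.content
```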
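
Because the example prompt asks for JSON, the returned string usually needs to be parsed before use. The sketch below assumes the model may wrap its answer in a Markdown code fence, which is a common behavior but not one this document guarantees. `parse_json_reply` is a hypothetical helper name, and the code uses only the standard library.

```python
import json
import re

# Build the three-backtick marker programmatically to keep this listing simple.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding Markdown code fence."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text.strip()
    # json.loads raises json.JSONDecodeError if the reply is not valid JSON.
    return json.loads(payload)
```

A caller might write `data = parse_json_reply(analyze_vision("invoice.jpg"))` and catch `json.JSONDecodeError` to retry or fall back to the raw text.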

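## 🏗️ Core Features and Scenarios

| Scenario | Recommended model | Key capabilities |
| --- | --- | --- |
| High-accuracy OCR | `glm-4.6v` | Complex layouts, handwriting, formula parsing |
| Very long documents / PPT | `glm-4.6v` | 128K context; supports in-depth summaries of 200-page files |
| Cost-sensitive tasks | `glm-4.6v-flash` | Basic image recognition, completely free |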
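
One way to apply the scenario table in code is a small lookup from scenario to model name. This is only a sketch: the scenario keys (`ocr`, `long_document`, `cost_sensitive`) and the `pick_model` helper are hypothetical labels for this example, while the model names are the ones listed in the table.

```python
# Scenario keys are hypothetical labels; model names come from the scenario table.
MODEL_BY_SCENARIO = {
    "ocr": "glm-4.6v",
    "long_document": "glm-4.6v",
    "cost_sensitive": "glm-4.6v-flash",
}


def pick_model(scenario, default="glm-4.6v"):
    """Return the recommended model for a scenario, falling back to the default."""
    return MODEL_BY_SCENARIO.get(scenario, default)
```

The result can be passed as the `model` argument of `client.chat.completions.create`.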

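## 🔗 Developer Resources

- Official documentation: bigmodel.cn/dev/api
- MCP protocol integration: documentation entry point
- Security report: This Skill has passed a preliminary static scan. Running it in a sandbox environment is recommended.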

Security recommendations

This skill appears coherent for integrating ZhipuAI's GLM-4.6V model, but take normal precautions: only provide a ZHIPUAI_API_KEY you trust and avoid pasting it into chats; run initial tests in a sandbox; redact or mask sensitive parts of images before sending; verify the 'zhipuai' Python package on PyPI (watch for typosquatting), pin a specific version (e.g., zhipuai==2.1.x), and review its release/source repository. If you cannot verify the SDK or you handle highly sensitive images, consider using an alternative workflow or isolated environment.
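
The version-pinning advice above can be checked at runtime with a sketch like the following. It uses only the standard library (`importlib.metadata`, available since Python 3.8). The helper names `installed_version` and `matches_pin` are hypothetical, and the `2.1` prefix in the usage note simply mirrors the `zhipuai==2.1.x` example in the advice rather than a verified release.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pinned_prefix):
    """True if version equals the prefix or extends it at a dot boundary."""
    return version == pinned_prefix or version.startswith(pinned_prefix + ".")
```

For example, a startup check could call `installed_version("zhipuai")` and refuse to continue unless the result is not `None` and `matches_pin(result, "2.1")` holds. This confirms only the version number, so verifying the package source on PyPI remains a separate manual step.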

Functional analysis

Type: OpenClaw Skill
Name: image-understanding
Version: 0.0.4

The skill bundle is benign. It provides an integration for the Zhipu GLM-4.6V multimodal vision model. The `skill.md` correctly specifies dependencies and installation via `pip install zhipuai`. It securely handles API keys by requiring them from environment variables (`ZHIPUAI_API_KEY`) and explicitly advises against hardcoding. The example code demonstrates standard API interaction without any signs of data exfiltration, malicious execution, persistence, or prompt injection against the agent. All external links point to the legitimate `bigmodel.cn` domain.

Capability assessment

✓ Purpose & Capability

Name/description (GLM-4.6V multimodal image understanding) align with the declared requirements: an API key for ZhipuAI and the zhipuai Python SDK. Requiring ZHIPUAI_API_KEY and the zhipuai package is proportionate for this integration.

✓ Instruction Scope

SKILL.md contains usage examples that only send image data and text to the GLM model and recommends credential handling and data redaction. It does not instruct reading unrelated system files, other env vars, or transmitting data to unexpected endpoints; external endpoints referenced are the documented bigmodel.cn resources.

ℹ Install Mechanism

The registry bundle has no formal install spec, but SKILL.md recommends 'pip install zhipuai' and lists zhipuai>=2.1.0. Installing the SDK via pip is expected for this skill but carries normal supply-chain risk (package install scripts). Verify the package source and pin versions before installing.

✓ Credentials

Only one environment credential is required (ZHIPUAI_API_KEY), which is appropriate for a hosted-model integration. No unrelated credentials or config paths are requested.

✓ Persistence & Privilege

Skill is instruction-only, always:false, and does not request persistent system-wide privileges or modifications to other skills; autonomous invocation is allowed (platform default) but not elevated by the skill.

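How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install image-understanding
Once installation finishes, call the Skill by name or trigger it with /image-understanding
Provide the required inputs according to the Skill's parameter description to get structured output

Version history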

v0.0.4

- Added essential metadata including environment variables, dependencies, and install instructions for Registry compatibility.
- Updated and clarified the plugin description and key usage scenarios.
- Improved security guidance, emphasizing environment variable requirements for API keys.
- Provided updated quick-start instructions and sample code.
- Enhanced documentation layout with clearer sections on security, features, and developer resources.

v0.0.3

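- Corrected and supplemented the metadata required by the Registry, including environment variables, dependencies, and install command notes.
- Updated the plugin description to strengthen its positioning for professional integration and automation scenarios.
- Refined the security and privacy compliance process, making the environment-variable credential requirement explicit.
- Provided clearer guidance on environment setup and installing the official SDK.
- Changed the basic model call example to a multimodal input format, highlighting the JSON output use case.
- Adjusted the feature and scenario table for readability, noting each model's free-tier policy and recommended scope.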

v0.0.2

Improved compliance

v0.0.1

ImageUnderstanding_GLM-4.6V

Metadata

Slug image-understanding

Version 0.0.4

License —

Total installs 8

Current installs 8

Versions 4

FAQ