tridefender

GLM Multimodal Analyzer

Author: TriDefender · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 576
Favorites: 0
Current installs: 3
Versions: 1
Install in OpenClaw
/install multimodal
Description
Multimodal content understanding (images, videos, documents) using the GLM-4.6V model.
Security recommendations
This skill will read local files (images, videos, PDFs), base64-encode them, and send their contents to https://open.bigmodel.cn using a ZHIPU_API_KEY. Before installing:
  1. Confirm the skill metadata is corrected to declare ZHIPU_API_KEY.
  2. Verify you trust the remote endpoint and the publisher; the homepage and source are unknown.
  3. Do not feed sensitive or private files (passwords, keys, proprietary docs) to the skill.
  4. Consider using an ephemeral or scoped API key and audit API usage.
  5. If you need higher assurance, request that the publisher provide provenance (source repo, signatures) or review the code yourself; the relevant behavior is visible in scripts/analyze.py.
If you accept these privacy risks and trust the endpoint, the functionality is coherent; if not, do not install or run with sensitive inputs.
Functional analysis
Type: OpenClaw Skill
Name: multimodal
Version: 1.0.0
The skill bundle contains a command injection vulnerability in the toolHandlers section of agent.json, where user-provided parameters (input and prompt) are wrapped in single quotes and passed directly to a shell command. This allows an attacker to escape the quotes and execute arbitrary commands on the host system. While the Python script scripts/analyze.py appears to be a legitimate tool for interacting with the Zhipu AI API (open.bigmodel.cn), the insecure handling of shell execution makes the bundle high-risk.
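The quoting flaw described above can be illustrated with a minimal sketch. The contents of agent.json are not reproduced on this page, so the command template, the positional-argument layout, and the helper names build_unsafe_command and build_safe_argv are all hypothetical. The sketch only tokenizes strings the way a POSIX shell would; it does not execute anything.

```python
import shlex

SCRIPT = "scripts/analyze.py"  # script path named in the analysis above


def build_unsafe_command(input_path: str, prompt: str) -> str:
    """Hypothetical reconstruction of the flawed pattern: values are wrapped
    in single quotes and spliced into a shell string without escaping."""
    return f"python {SCRIPT} '{input_path}' '{prompt}'"


def build_safe_argv(input_path: str, prompt: str) -> list:
    """Safer alternative: an argument list intended for subprocess.run
    without shell=True, so user input always stays a single argument."""
    return ["python", SCRIPT, input_path, prompt]


# A prompt that closes the opening quote and appends a second command.
malicious = "x'; touch /tmp/pwned; echo '"

# shlex.split tokenizes like a POSIX shell but runs nothing.
unsafe_tokens = shlex.split(build_unsafe_command("photo.png", malicious))
safe_argv = build_safe_argv("photo.png", malicious)

# If a shell string is unavoidable, shlex.quote escapes embedded quotes.
quoted = shlex.quote(malicious)

print(unsafe_tokens)  # the payload is broken apart; "touch" is its own token
print(safe_argv)      # the payload remains one element of the list
```

In the unsafe form the attacker-controlled text no longer stays inside one argument, which is what lets a real shell treat part of it as a separate command. Passing an argument list, or escaping with shlex.quote when a shell string cannot be avoided, keeps the input as data.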
Capability assessment
Purpose & Capability
The skill's purpose (multimodal analysis via GLM-4.6V) matches the code and agent configuration. However the registry metadata lists no required env vars while SKILL.md and scripts/analyze.py require ZHIPU_API_KEY — an inconsistency in declared requirements. Minor model naming/context inconsistencies (SKILL.md: GLM-4.6V 128K, agent.json/model: 'zai/glm-4.6v-flash', script MODEL='glm-4.6v', MAX_TOKENS=4096) are also present.
Instruction Scope
SKILL.md and analyze.py allow local file paths and will base64-encode entire local files and include them in requests to https://open.bigmodel.cn/api/paas/v4/chat/completions. That behavior is coherent with a multimodal uploader, but it means arbitrary local files (including sensitive documents) may be exfiltrated to the remote API without additional safeguards or filtering.
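One way to add the kind of safeguard the assessment finds missing is to check a file against a policy before it is encoded. The following is a sketch under stated assumptions, not the behavior of scripts/analyze.py: the function name encode_if_allowed, the suffix allowlist, the size limit, and the idea of an allowed root directory are all hypothetical.

```python
import base64
from pathlib import Path

# Hypothetical policy values; none of these come from the skill itself.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".mp4", ".pdf"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def encode_if_allowed(path: str, allowed_root: str) -> str:
    """Return the file's base64 text, or raise ValueError if the file
    falls outside the upload policy."""
    root = Path(allowed_root).resolve()
    target = Path(path).resolve()

    # Refuse anything that is not located under the allowed directory.
    if root not in target.parents:
        raise ValueError(f"{target} is outside {root}")
    # Refuse file types the analyzer has no reason to upload.
    if target.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {target.suffix!r}")
    # Refuse unexpectedly large files.
    if target.stat().st_size > MAX_BYTES:
        raise ValueError("file too large to upload")

    return base64.b64encode(target.read_bytes()).decode("ascii")
```

Resolving both paths before comparing them means that a relative path such as ../../.ssh/id_rsa is rejected by the directory check rather than silently encoded.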
Install Mechanism
This is an instruction-only skill with no install spec (lowest install risk). README mentions requests will be auto-installed but there is no formal install step; the script exits if requests is missing. No external downloads or packaged installers are used.
Credentials
The runtime requires a single secret ZHIPU_API_KEY (used as a Bearer token) which is proportionate to calling a third-party API. The problem is that the registry metadata did not declare this requirement — the skill should have listed ZHIPU_API_KEY as required.env. Requiring an API key for the claimed purpose is expected, but the omission in metadata and the ability to send arbitrary local files increases risk.
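As a rough illustration of the credential handling described above, the sketch below reads ZHIPU_API_KEY from the environment and builds a Bearer authorization header, failing early when the key is absent. The function name build_auth_headers and the error message are hypothetical; the actual code in scripts/analyze.py may differ.

```python
import os


def build_auth_headers(env=os.environ) -> dict:
    """Build the Authorization header from ZHIPU_API_KEY.

    Raises RuntimeError when the variable is missing or blank, so the
    undeclared requirement surfaces before any request is made.
    """
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ZHIPU_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```

Accepting the environment mapping as a parameter keeps the function easy to exercise with a throwaway or scoped key, for example build_auth_headers({"ZHIPU_API_KEY": "test-key"}).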
Persistence & Privilege
The skill does not request always:true, does not declare system config paths, and does not modify other skills. It is user-invocable and can be invoked autonomously per platform default (not flagged here).
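How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install multimodal
  3. Once installed, call the Skill by name or trigger it with /multimodal
  4. Provide the required inputs according to the Skill's parameter description to receive structured output.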
Version history
v1.0.0
Multimodal Analyzer 1.0.0
- Initial release with GLM-4.6V-powered multimodal content understanding.
- Supports image OCR, scene and object analysis, video summarization & keyframe extraction, and document (PDF/table) parsing.
- Includes a deep thinking mode for advanced reasoning.
- Command-line interface for content analysis via script.
- Currently processes one modality at a time; requires publicly accessible URLs for videos.
Metadata
Slug multimodal
Version 1.0.0
License
Total installs 3
Current installs 3
Version count 1
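FAQ

What is GLM Multimodal Analyzer?

It uses the GLM-4.6V model for multimodal content understanding (images, videos, documents). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 576 total downloads to date.

How do I install GLM Multimodal Analyzer?

Run the command "/install multimodal" in the OpenClaw or Claude Code chat box for a one-step install. The listing states that no extra configuration is needed, but the capability assessment above notes that the script requires a ZHIPU_API_KEY environment variable at runtime.

Is GLM Multimodal Analyzer free?

Yes. The listing describes GLM Multimodal Analyzer as completely free (free and open source) to download, install, and use.

Which platforms does GLM Multimodal Analyzer support?

GLM Multimodal Analyzer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed GLM Multimodal Analyzer?

It is developed and maintained by TriDefender (@tridefender). The current version is v1.0.0.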
