GLM Multimodal Analyzer
Author: TriDefender · GitHub · v1.0.0
Total downloads: 576
Favorites: 0
Current installs: 3
Versions: 1
Install in OpenClaw
/install multimodal
Description
Multimodal content understanding (images, videos, documents) using the GLM-4.6V model.
Security guidance
This skill will read local files (images, videos, PDFs), base64-encode them, and send their contents to https://open.bigmodel.cn using a ZHIPU_API_KEY. Before installing: (1) Confirm the skill metadata is corrected to declare ZHIPU_API_KEY; (2) Verify you trust the remote endpoint and the publisher — the Homepage and source are unknown; (3) Do not feed sensitive or private files (passwords, keys, proprietary docs) to the skill; (4) Consider using an ephemeral or scoped API key and audit API usage; (5) If you need higher assurance, request the publisher provide provenance (source repo, signatures) or review the code yourself — the relevant behavior is visible in scripts/analyze.py. If you accept these privacy risks and trust the endpoint, the functionality is coherent; if not, do not install or run with sensitive inputs.
Functional analysis
Type: OpenClaw Skill
Name: multimodal
Version: 1.0.0
The skill bundle contains a command injection vulnerability in agent.json within the toolHandlers section, where user-provided parameters (input and prompt) are wrapped in single quotes and passed directly to a shell command. This allows an attacker to escape the quotes and execute arbitrary commands on the host system. While the Python script scripts/analyze.py appears to be a legitimate tool for interacting with the Zhipu AI API (open.bigmodel.cn), the insecure handling of shell execution makes the bundle high-risk.
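The quoting problem described here can be illustrated with a minimal sketch. The helper names and the `--input` / `--prompt` flags are hypothetical and are not taken from agent.json; the sketch only shows why wrapping a value in single quotes does not contain it, and how an argument list or `shlex.quote` keeps the value as one argument.

```python
import shlex

SCRIPT = "scripts/analyze.py"  # path named in the analysis; the flags below are hypothetical


def build_shell_command_unsafe(user_input: str, prompt: str) -> str:
    """Naive single-quote wrapping, the pattern the analysis warns about."""
    return f"python {SCRIPT} --input '{user_input}' --prompt '{prompt}'"


def build_argv_safe(user_input: str, prompt: str) -> list:
    """Argument list for subprocess.run(argv) with shell=False: no shell parsing happens."""
    return ["python", SCRIPT, "--input", user_input, "--prompt", prompt]


def build_shell_command_quoted(user_input: str, prompt: str) -> str:
    """If a shell string is unavoidable, quote each value with shlex.quote."""
    return (
        f"python {SCRIPT} --input {shlex.quote(user_input)} "
        f"--prompt {shlex.quote(prompt)}"
    )


# A value that closes the opening quote and appends its own command.
malicious = "x.png'; touch /tmp/pwned; echo '"

# Tokenizing the unsafe string shows `touch` has escaped the quotes as a separate word.
unsafe_tokens = shlex.split(build_shell_command_unsafe(malicious, "describe"))
# Tokenizing the quoted string shows the whole payload is still a single argument.
quoted_tokens = shlex.split(build_shell_command_quoted(malicious, "describe"))
```

Nothing is executed in this sketch; the strings are only built and tokenized. Passing the list from `build_argv_safe` to `subprocess.run` without `shell=True` is generally the simpler option, since no quoting rules are involved at all.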
Capability assessment
Purpose & Capability
The skill's purpose (multimodal analysis via GLM-4.6V) matches the code and agent configuration. However the registry metadata lists no required env vars while SKILL.md and scripts/analyze.py require ZHIPU_API_KEY — an inconsistency in declared requirements. Minor model naming/context inconsistencies (SKILL.md: GLM-4.6V 128K, agent.json/model: 'zai/glm-4.6v-flash', script MODEL='glm-4.6v', MAX_TOKENS=4096) are also present.
Instruction Scope
SKILL.md and analyze.py allow local file paths and will base64-encode entire local files and include them in requests to https://open.bigmodel.cn/api/paas/v4/chat/completions. That behavior is coherent with a multimodal uploader, but it means arbitrary local files (including sensitive documents) may be exfiltrated to the remote API without additional safeguards or filtering.
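One way to add the kind of safeguard this paragraph finds missing is to check a path before encoding it. The sketch below is an assumption about what such a filter could look like, not a description of scripts/analyze.py: `encode_if_allowed`, `ALLOWED_SUFFIXES`, `MAX_BYTES`, and the idea of an approved root directory are all hypothetical.

```python
import base64
from pathlib import Path

# Hypothetical policy values, not taken from scripts/analyze.py.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".mp4", ".pdf"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def encode_if_allowed(path: str, allowed_root: str) -> str:
    """Return the file's base64 text only if it passes basic checks."""
    root = Path(allowed_root).resolve()
    target = Path(path).resolve()
    # Reject paths that resolve outside the directory the user approved.
    if root not in target.parents:
        raise PermissionError(f"{target} is outside {root}")
    # Reject file types the skill has no reason to upload (e.g. key files).
    if target.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {target.suffix!r}")
    if target.stat().st_size > MAX_BYTES:
        raise ValueError("file too large to upload")
    return base64.b64encode(target.read_bytes()).decode("ascii")
```

A check like this narrows what can leave the machine, but it does not judge whether an allowed image or PDF is itself sensitive; that decision stays with the user.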
Install Mechanism
This is an instruction-only skill with no install spec (lowest install risk). README mentions requests will be auto-installed but there is no formal install step; the script exits if requests is missing. No external downloads or packaged installers are used.
Credentials
The runtime requires a single secret ZHIPU_API_KEY (used as a Bearer token) which is proportionate to calling a third-party API. The problem is that the registry metadata did not declare this requirement — the skill should have listed ZHIPU_API_KEY as required.env. Requiring an API key for the claimed purpose is expected, but the omission in metadata and the ability to send arbitrary local files increases risk.
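As a rough illustration of the credential handling described here, the sketch below reads ZHIPU_API_KEY and turns it into a Bearer header, failing early when the variable is unset. The function name `build_auth_headers` and the `Content-Type` header are assumptions made for the example; the actual code in scripts/analyze.py may differ.

```python
import os


def build_auth_headers(env=None) -> dict:
    """Build request headers from ZHIPU_API_KEY, failing early if it is unset."""
    # Accept an explicit mapping so the function can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    api_key = env.get("ZHIPU_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ZHIPU_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed for a JSON request body
    }
```

Failing before any file is read or encoded means a missing key is reported up front, which is also the point at which a declared `required.env` entry would normally catch the problem.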
Persistence & Privilege
The skill does not request always:true, does not declare system config paths, and does not modify other skills. It is user-invocable and can be invoked autonomously per platform default (not flagged here).
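How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install multimodal
- Once installation finishes, invoke the skill by name or trigger it with /multimodal
- Provide the inputs listed in the skill's parameter description to get structured output.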
Version history
v1.0.0
Multimodal Analyzer 1.0.0
- Initial release with GLM-4.6V-powered multimodal content understanding.
- Supports image OCR, scene and object analysis, video summarization & keyframe extraction, and document (PDF/table) parsing.
- Includes a deep thinking mode for advanced reasoning.
- Command-line interface for content analysis via script.
- Currently processes one modality at a time; requires publicly accessible URLs for videos.
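FAQ
What is GLM Multimodal Analyzer?
It performs multimodal content understanding (images, videos, documents) using the GLM-4.6V model. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 576 total downloads so far.
How do I install GLM Multimodal Analyzer?
Run the command "/install multimodal" in the OpenClaw or Claude Code chat box to install it in one step. No separate install step is needed, but as the assessment on this page notes, the script requires a ZHIPU_API_KEY at runtime.
Is GLM Multimodal Analyzer free?
Yes. GLM Multimodal Analyzer is listed as free and open source, and can be downloaded, installed, and used freely.
Which platforms does GLM Multimodal Analyzer support?
GLM Multimodal Analyzer is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed GLM Multimodal Analyzer?
It is developed and maintained by TriDefender (@tridefender); the current version is v1.0.0.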