
Minimax Image Understanding

Name: Minimax Image Understanding
Author: aidescend

By aidescend · GitHub ↗ · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 844
Current installs: 10
Versions: 1

Install in OpenClaw

/install minimax-image-understanding

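Description

Uses multimodal large models to understand image content and describe its business meaning. Supports several models: (1) MiniMax VLM, (2) OpenAI GPT-4V, (3) Claude Vision. Intended for screenshots, charts, photos of documents, and similar images, for which it produces precise text descriptions.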

Security recommendations

This skill appears to do what it says (send a local image to a selected multimodal model and return a description), but before installing or using it you should:
- Confirm dependencies: ensure the runtime has the 'curl' binary (used by the MiniMax path) and the Python 'requests' package (used for OpenAI/Anthropic). The skill's metadata incorrectly states "no required binaries".
- Consider privacy: the script base64-encodes and sends the entire image to remote APIs. Do not use it on images containing sensitive or private data unless you trust the target service and understand its retention policy.
- Verify provider endpoints and keys: validate MINIMAX_API_HOST if you set it (default is https://api.minimaxi.com) and never hard-code API keys; supply them via environment variables as instructed.
- Review model choice and costs: using OpenAI/Anthropic may incur usage charges, and the providers have different input formats and limits, so test with non-sensitive images first.
If you want stronger assurance, request an updated skill package that explicitly documents runtime dependencies (curl, requests) and includes checks that fail with clear messages when dependencies are missing.
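One way to follow the advice on validating MINIMAX_API_HOST is sketched below. This is not taken from the skill's script: the function name `resolve_minimax_host` and the HTTPS-only rule are assumptions made for illustration, while the default host is the one quoted in the advice above.

```python
import os
from urllib.parse import urlparse

# Default host as quoted in the security advice above.
DEFAULT_MINIMAX_HOST = "https://api.minimaxi.com"


def resolve_minimax_host(env=None):
    """Return the MiniMax host to use, rejecting anything that is not an HTTPS URL.

    Hypothetical helper: the real script may treat MINIMAX_API_HOST differently.
    """
    env = os.environ if env is None else env
    host = env.get("MINIMAX_API_HOST", "").strip() or DEFAULT_MINIMAX_HOST
    parsed = urlparse(host)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"MINIMAX_API_HOST must be an https:// URL, got {host!r}")
    # Strip a trailing slash so request paths can be appended consistently.
    return host.rstrip("/")
```

Calling `resolve_minimax_host()` before any request means a mistyped or plain-HTTP host is rejected before the image and API key are sent anywhere.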

Functional analysis

Type: OpenClaw Skill
Name: minimax-image-understanding
Version: 1.0.0

The skill bundle provides a utility for image understanding using MiniMax, OpenAI, or Anthropic APIs. The script `scripts/understand_image.py` correctly handles API keys via environment variables and transmits image data to the respective service providers as described. No evidence of malicious intent, data exfiltration to unauthorized endpoints, or command injection vulnerabilities was found.

Capability assessment

✓ Purpose & Capability

Name/description (image understanding via MiniMax/OpenAI/Anthropic) align with the included script and SKILL.md: the code reads a local image, base64-encodes it, and sends it to the selected model provider for analysis. Required environment variables listed in SKILL.md correspond to the providers used.
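The read-and-encode step described above can be sketched as follows. This is an illustrative reconstruction, not the code in `scripts/understand_image.py`; the names `encode_image` and `to_data_url` are hypothetical, and whether a given provider expects raw base64 or a data URL should be checked against that provider's documentation.

```python
import base64
import mimetypes
from pathlib import Path


def encode_image(path):
    """Read a local image and return (mime_type, base64_text)."""
    file = Path(path)
    # Guess the MIME type from the file extension only; content is not inspected.
    mime_type, _ = mimetypes.guess_type(file.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{file.name} does not look like an image file")
    encoded = base64.b64encode(file.read_bytes()).decode("ascii")
    return mime_type, encoded


def to_data_url(mime_type, encoded):
    """Wrap the base64 text in a data URL for APIs that accept inline images."""
    return f"data:{mime_type};base64,{encoded}"
```

Because the whole file is encoded and sent, base64 adds roughly one third to the payload size, which matters for providers that cap request size.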

ℹ Instruction Scope

Runtime instructions and the script are scoped to reading a local image file and sending it to a model provider; they do not access unrelated system files or secrets. However, the skill transmits the entire image (base64-encoded) to remote APIs, so image confidentiality and provider trust are security considerations the user should evaluate.

⚠ Install Mechanism

No install spec is provided, but the script relies on external tools and libraries: it calls the 'curl' binary for the MiniMax path and imports the Python 'requests' module for OpenAI/Anthropic. The registry metadata claims 'required binaries: none', which contradicts the actual script requirements. This omission can cause runtime failures and indicates incomplete packaging and documentation.
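A pre-flight check of the kind this finding calls for might look like the sketch below. It assumes only the two dependencies named above (`curl` and `requests`); the function names are hypothetical and the skill does not currently ship such a check.

```python
import importlib.util
import shutil


def missing_dependencies(binaries=("curl",), modules=("requests",)):
    """Return a readable entry for each binary or Python module that is missing."""
    missing = []
    for binary in binaries:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(binary) is None:
            missing.append(f"binary '{binary}' not found on PATH")
    for module in modules:
        # find_spec returns None for a top-level module that cannot be imported.
        if importlib.util.find_spec(module) is None:
            missing.append(f"Python module '{module}' is not installed")
    return missing


def ensure_dependencies(**kwargs):
    """Stop with a clear message up front instead of failing mid-request."""
    missing = missing_dependencies(**kwargs)
    if missing:
        raise SystemExit("Missing dependencies:\n  - " + "\n  - ".join(missing))
```

Calling `ensure_dependencies()` at script start-up would turn an obscure mid-run failure into a single actionable message.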

✓ Credentials

The env vars mentioned (MINIMAX_API_KEY, MINIMAX_API_HOST, OPENAI_API_KEY, ANTHROPIC_API_KEY) match the services the skill integrates with and are proportionate to its purpose. No unrelated credentials or additional config paths are requested.
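To illustrate how these variables could drive model selection, here is a minimal sketch. The variable names come from the listing; the selection rule (use the requested provider, otherwise the first one with a key set, with MiniMax checked first) is an assumption and may differ from what the script actually does.

```python
import os

# Provider name -> environment variable holding its API key (names from the listing).
PROVIDER_KEYS = {
    "minimax": "MINIMAX_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def available_providers(env=None):
    """List providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


def pick_provider(requested=None, env=None):
    """Return the requested provider if configured, else the first configured one."""
    available = available_providers(env)
    if requested is not None:
        if requested not in PROVIDER_KEYS:
            raise ValueError(f"unknown provider: {requested!r}")
        if requested not in available:
            raise RuntimeError(f"{PROVIDER_KEYS[requested]} is not set")
        return requested
    if not available:
        raise RuntimeError("no provider API key is set in the environment")
    return available[0]
```

Keeping the mapping in one table makes it easy to audit that the skill reads only the credentials it claims to need.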

✓ Persistence & Privilege

The skill does not request permanent presence (always:false) and does not modify other skills or system-wide settings. It runs on demand and does not persist credentials or change agent configuration.

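How to use

Make sure OpenClaw is installed (locally or via Docker).
In the dialog, enter the install command: /install minimax-image-understanding
Once installed, invoke the Skill by name or trigger it with /minimax-image-understanding
Provide the required inputs according to the Skill's parameter description to get structured output.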

Version history

v1.0.0

minimax-image-understanding v1.0.0
- Initial release supporting multimodal image understanding using large models.
- Compatible with MiniMax VLM (default, recommended for Chinese), OpenAI GPT-4V, and Claude Vision (Anthropic).
- Simple CLI tool for generating business-centric descriptions of images, charts, and document photos.
- Environment-variable-based configuration for easy model selection.
- Output focuses on key content and business logic, omitting positional element listings.

Metadata

Slug minimax-image-understanding

Version 1.0.0

License —

Total installs 10

Current installs 10

Versions 1

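FAQ

What is Minimax Image Understanding?

It uses multimodal large models to understand image content and describe its business meaning. It supports several models: (1) MiniMax VLM, (2) OpenAI GPT-4V, (3) Claude Vision. It is intended for screenshots, charts, photos of documents, and similar images, for which it produces precise text descriptions. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 844 total downloads to date.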

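How do I install Minimax Image Understanding?

Run the command `/install minimax-image-understanding` in the OpenClaw or Claude Code dialog to install it in one step. Installation itself needs no extra configuration, but running the skill requires the API key for the chosen provider to be supplied via environment variables (MINIMAX_API_KEY, OPENAI_API_KEY, or ANTHROPIC_API_KEY).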

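Is Minimax Image Understanding free?

Yes. The skill itself is completely free (free and open source) and can be downloaded, installed, and used freely. Calls to OpenAI or Anthropic may still incur usage charges from those providers.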

Which platforms does Minimax Image Understanding support?

Minimax Image Understanding is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Minimax Image Understanding?

It is developed and maintained by aidescend (@aidescend); the current version is v1.0.0.