yhlorra

MiniMax Multimodal Toolkit

by zylorra · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads 1344
Stars 2
Active Installs 2
Versions 1
Install in OpenClaw
/install yh-minimax-multimodal-toolkit
Description
Generate and process speech, music, video, and images using MiniMax AI with voice cloning, custom voices, multi-scene video, and FFmpeg-based media tools.
Usage Guidance
This skill contains shell scripts that legitimately call MiniMax APIs and use FFmpeg. Before installing or running it:
- Expect to provide MINIMAX_API_KEY and MINIMAX_API_HOST (SKILL.md requires these) even though the registry metadata does not list them; the metadata omission is a red flag. Ask the publisher to correct metadata if you need vetted declarations.
- The scripts load .env files from the skill project root and from your current working directory and export each key=value they find. To avoid unintentionally exposing unrelated secrets, run the scripts from an isolated directory that does not contain a .env with other credentials, or remove/inspect any .env first.
- check_environment.sh can print partial API key text when the key format is invalid. Avoid running these scripts where stdout/stderr is logged to systems you do not control, and avoid pasting your key into public chat.
- If you proceed, use a scoped/limited MiniMax API key, test in an isolated environment (container or ephemeral VM), and rotate the key afterward.
Review the scripts (they are pure bash) if you have security concerns; they only contact the documented MiniMax hosts (https://api.minimaxi.com or https://api.minimax.io) and use curl/ffmpeg/jq/xxd. Given the mismatches (metadata vs runtime) and the .env-loading behavior, treat this skill cautiously and correct or sandbox before use.
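One way to act on the advice to inspect any .env before running the scripts is to list only the variable names a file defines, without printing their values. The sketch below is illustrative: list_env_keys is a hypothetical helper, not part of the skill, and it assumes simple KEY=value lines (optionally prefixed with "export").

```shell
# Hypothetical helper (not part of the skill): print the variable names
# defined in a .env file, never the values.
list_env_keys() {
  local file="$1"
  # A missing file simply produces no output.
  [ -f "$file" ] || return 0
  # Keep lines shaped like KEY=... (optionally "export KEY=..."),
  # then strip everything except the key name.
  grep -E '^[[:space:]]*(export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*=' "$file" \
    | sed -E 's/^[[:space:]]*(export[[:space:]]+)?([A-Za-z_][A-Za-z0-9_]*)=.*/\2/'
}

# Usage: check the current directory before invoking any skill script.
list_env_keys ./.env
```

If the output contains anything besides MINIMAX_API_KEY and MINIMAX_API_HOST, moving to a scratch directory before running the skill is probably the safer choice.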
Capability Analysis
Type: OpenClaw Skill
Name: yh-minimax-multimodal-toolkit
Version: 1.0.0
The toolkit is a legitimate and well-structured set of Bash scripts for interacting with MiniMax multimodal APIs (TTS, Music, Video, and Image). It utilizes standard system utilities like curl, ffmpeg, jq, and xxd to handle API requests and media processing. The instructions in SKILL.md are focused on task execution and include safety-conscious directions, such as ensuring output is restricted to a specific directory and advising the agent to seek user confirmation when configuring environment variables. No evidence of data exfiltration, malicious persistence, or deceptive prompt injection was found.
Capability Assessment
Purpose & Capability
The skill's name/description (MiniMax multimodal generation) matches the included scripts (TTS, music, image, video, FFmpeg tools). However, the registry metadata lists no required environment variables or primary credential, while the SKILL.md and scripts clearly require MINIMAX_API_KEY and MINIMAX_API_HOST. That metadata omission is an inconsistency the user should be aware of.
Instruction Scope
SKILL.md instructs the agent/user to set MINIMAX_API_KEY and MINIMAX_API_HOST and to run the provided bash scripts. The scripts themselves load a .env from two locations (the skill project root and the agent's current working directory) and export every key=value pair they find that is not already set. Loading and exporting arbitrary keys from the .env in the agent's working directory may pull unrelated secrets into the script environment. In addition, check_environment.sh prints part of the API key when the key format is invalid, which could leak key fragments into logs.
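The log-leak concern above can be avoided by reporting only whether a secret is set and how long it is. The following is a minimal sketch of that idea; describe_secret is a hypothetical function and does not reflect how check_environment.sh is actually written.

```shell
# Hypothetical helper: describe a secret without echoing any of its characters.
describe_secret() {
  local value="$1"
  if [ -z "$value" ]; then
    echo "unset"
  else
    # ${#value} is the length of the string; no key material is printed.
    echo "set (${#value} chars)"
  fi
}

# Usage: safe to write to stdout/stderr even when output is logged.
echo "MINIMAX_API_KEY: $(describe_secret "${MINIMAX_API_KEY:-}")"
```

Reporting the length is usually enough to spot a truncated or empty key during diagnostics.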
Install Mechanism
There is no installer (instruction-only install spec); the tool consists purely of shell scripts that use standard system binaries (curl, ffmpeg, jq, xxd). The required tools are proportionate to media generation/processing tasks; no remote arbitrary-code download/install URLs are present.
Credentials
The runtime requires MINIMAX_API_KEY and MINIMAX_API_HOST (documented in SKILL.md and used in scripts) but the skill registry metadata declares no required env vars or primary credential — a direct mismatch. Additionally, the load_env behavior will import and export any variables found in $(pwd)/.env (agent cwd), which can cause unrelated secrets to be exposed to the script environment. The scripts do transmit MINIMAX_API_KEY to the documented MiniMax endpoints (expected), but exporting other env vars without restriction is disproportionate.
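A more proportionate alternative to exporting everything in a .env is to import only the two variables the skill documents. The sketch below illustrates that approach; load_env_allowlisted is a hypothetical function, not the skill's load_env, and it assumes plain KEY=value lines without surrounding quotes.

```shell
# Hypothetical allowlist loader: import only MINIMAX_API_KEY and
# MINIMAX_API_HOST from a .env file and ignore every other line.
load_env_allowlisted() {
  local file="$1" line
  [ -f "$file" ] || return 0
  # The "|| [ -n "$line" ]" part handles a final line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      MINIMAX_API_KEY=*)
        # ":=" assigns only when the variable is unset or empty,
        # so values already in the environment win.
        : "${MINIMAX_API_KEY:=${line#*=}}"
        export MINIMAX_API_KEY
        ;;
      MINIMAX_API_HOST=*)
        : "${MINIMAX_API_HOST:=${line#*=}}"
        export MINIMAX_API_HOST
        ;;
    esac
  done < "$file"
}

# Usage: load from the current directory's .env, if one exists.
load_env_allowlisted ./.env
```

Because unmatched lines are never evaluated or exported, unrelated credentials in the same file stay out of the script environment.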
Persistence & Privilege
The skill does not request permanent/always-on inclusion and does not modify other skills or system-wide settings. It runs as invoked. No elevated platform privileges are requested in the bundle.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install yh-minimax-multimodal-toolkit
  3. After installation, invoke the skill by name or use /yh-minimax-multimodal-toolkit
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial publish
Metadata
Slug yh-minimax-multimodal-toolkit
Version 1.0.0
License MIT-0
All-time Installs 2
Active Installs 2
Total Versions 1
Frequently Asked Questions

What is MiniMax Multimodal Toolkit?

Generate and process speech, music, video, and images using MiniMax AI with voice cloning, custom voices, multi-scene video, and FFmpeg-based media tools. It is an AI Agent Skill for Claude Code / OpenClaw, with 1344 downloads so far.

How do I install MiniMax Multimodal Toolkit?

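Run "/install yh-minimax-multimodal-toolkit" in the OpenClaw or Claude Code chat to install it in one step. To run the scripts you also need to set MINIMAX_API_KEY and MINIMAX_API_HOST, which SKILL.md requires, and have curl, ffmpeg, jq, and xxd available.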

Is MiniMax Multimodal Toolkit free?

Yes, MiniMax Multimodal Toolkit is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does MiniMax Multimodal Toolkit support?

MiniMax Multimodal Toolkit is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created MiniMax Multimodal Toolkit?

It is built and maintained by zylorra (@yhlorra); the current version is v1.0.0.
