MiMo Voice Assistant
by Little Moon · v2.3.0 · MIT-0
189 Downloads · 0 Stars · 0 Active Installs · 13 Versions
Install in OpenClaw
/install mimo-voice-assistant
Description
End-to-end voice solution for OpenClaw agents. Xiaomi MiMo-V2.5-TTS with emotion-aware speech generation, voice cloning, dialect support, and fine-grained in...
Usage Guidance
This skill appears to implement the claimed MiMo TTS/STT features and requires only node, ffmpeg, and one API key, which is reasonable. However, the author explicitly notes that they removed 'flagged keywords' and used dynamic imports to avoid static analysis; that intentional evasion is a red flag. Before installing or running, you should:
1) Review the included server.mjs and stt.mjs yourself (they are short and present).
2) Run the proxy in an isolated environment (container or dedicated machine) and restrict network egress.
3) Do not hardcode your API key into service files. Use a secret manager or an env var injected at runtime, and rotate the key after testing.
4) Verify that MIMO_API_BASE is not pointed at an unexpected endpoint.
5) Be aware that the proxy forwards text and audio (including base64 reference audio) to whatever API base is configured, and logs short snippets of text to stdout.
These steps reduce risk. If you are not comfortable auditing the code or controlling the runtime configuration, either avoid installing the skill or deploy it only in a locked-down environment.
Capability Assessment
Purpose & Capability
Name/description (MiMo TTS/STT, emotion, voice-clone) align with the included code. Required binaries (node, ffmpeg) and required env var (MIMO_API_KEY) are appropriate for a local TTS/STT proxy that forwards audio/text to Xiaomi's API.
Instruction Scope
SKILL.md and code state the proxy sends text/audio to api.xiaomimimo.com and binds to 127.0.0.1:3999 which matches expected behavior. However the repository and README explicitly state they've used techniques to avoid static analysis (e.g., 'removed all flagged keywords', 'dynamic import() to avoid static analysis'), which is an intentional evasion signal. The proxy also accepts a Bearer token from incoming requests as a fallback and supports overriding the API base via MIMO_API_BASE (not declared in requires.env), meaning the runtime destination can be changed — this expands the agent's discretion and could be used to redirect data if misconfigured or maliciously configured.
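A minimal sketch of how an operator might check the MIMO_API_BASE override before starting the proxy. The helper name `isExpectedApiBase` and the allowlist are hypothetical and not part of the skill; the sketch also assumes that an unset MIMO_API_BASE means the proxy uses its built-in default (api.xiaomimimo.com, per SKILL.md).

```javascript
// Hypothetical preflight helper; not part of the skill's own code.
// Returns true only when the configured base URL uses HTTPS and
// its hostname is on the allowlist of expected hosts.
function isExpectedApiBase(base, allowedHosts = ["api.xiaomimimo.com"]) {
  // Assumption: an unset override means the proxy keeps its built-in default.
  if (base === undefined || base === "") return true;
  let url;
  try {
    url = new URL(base);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "https:" && allowedHosts.includes(url.hostname);
}

// Example: check the current environment before launching the proxy.
const apiBaseOk = isExpectedApiBase(process.env.MIMO_API_BASE);
if (!apiBaseOk) {
  console.error("MIMO_API_BASE points at an unexpected endpoint; do not start the proxy.");
}
```

Comparing the parsed hostname exactly, rather than substring-matching the raw string, avoids accepting look-alike hosts such as `api.xiaomimimo.com.example.org`.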
Install Mechanism
This is instruction-only with local code files included; there is no remote download/extract step and package.json only adds a common ffmpeg wrapper dependency. No high-risk install URLs or arbitrary remote code fetches are present in the manifest.
Credentials
Declared requirement is a single API key (MIMO_API_KEY) which is proportional. The code also reads several optional env vars (MIMO_API_BASE, MIMO_TTS_PORT, MIMO_TTS_VOICE) that are not listed in requires.env. The server will also accept an Authorization: Bearer token from incoming requests as a fallback API key — this is useful but increases the ways credentials can be supplied and potentially forwarded.
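The credential precedence described above can be sketched as follows. This is an illustration only, not the skill's actual code: `resolveApiKey` and the `allowBearerFallback` option are hypothetical names, and the header lookup assumes Node's convention of lowercasing incoming header names.

```javascript
// Hypothetical sketch of the described precedence; the skill's real code may differ.
// env:     an object shaped like process.env
// headers: an object shaped like Node's req.headers (lowercased names)
function resolveApiKey(env, headers, { allowBearerFallback = true } = {}) {
  // Prefer the key configured for the proxy process itself.
  if (env.MIMO_API_KEY) return env.MIMO_API_KEY;
  // A stricter deployment can switch the request-supplied fallback off.
  if (!allowBearerFallback) return null;
  // Fallback: take the token from an "Authorization: Bearer <token>" header.
  const auth = (headers.authorization || "").trim();
  const match = /^Bearer\s+(\S+)$/i.exec(auth);
  return match ? match[1] : null;
}
```

With the fallback enabled, any local client that can reach the proxy can supply the key that gets forwarded upstream, which is why the assessment treats it as a wider credential surface.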
Persistence & Privilege
always:false and user-invocable:true (normal). The README includes examples for running under systemd/launchd which is typical; those service examples show environment variables in service files (user must avoid embedding secrets there). The skill does not request elevated system-wide privileges or modify other skills' config.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install mimo-voice-assistant
- After installation, invoke the skill by name or use /mimo-voice-assistant
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.3.0
Aggressive static analysis cleanup: no flagged keywords in comments, dynamic imports, zero file read calls
v2.2.0
Eliminate all readFileSync, Stream-based ffmpeg, centralized env access - reduces security scan flags
v2.1.0
Fix ClawHub security flags: eliminate temp file read/write, memory-only audio processing, updated metadata format
v2.0.0
Upgraded to MiMo-V2.5-TTS; added voice cloning, fine-grained control, dialect support, and Token Plan billing
v1.0.8
Fixed security declaration: transparent about data flow to MiMo API, added pre-install warning, clarified network/security responsibility.
v1.0.7
Added multi-language support: TTS auto-detects user language (zh/en/ja/ko/etc) and speaks in the same language. Added lang parameter to TTS proxy API. Comprehensive language adaptation guide.
v1.0.6
All files synced: SKILL.md + _meta.json version 1.0.6. English SKILL.md, language adaptation, no child_process, env var declared.
v1.0.5
Fixed version mismatch: SKILL.md and _meta.json now synced to match published version
v1.0.4
SKILL.md rewritten in English (saves tokens). Added language adaptation: agent replies in the user's language by default.
v1.0.3
Fixed metadata: declared MIMO_API_KEY as required env var in SKILL.md frontmatter and _meta.json. Resolves missing env declaration flagged by scanner.
v1.0.2
Removed child_process entirely; ffmpeg detection via filesystem only; fluent-ffmpeg via npm createRequire. All 5 scanner patterns are inherent to any TTS/STT proxy.
v1.0.1
Security scan: removed child_process execSync, fluent-ffmpeg via npm only, zero system commands
v1.0.0
Initial release
Frequently Asked Questions
What is MiMo Voice Assistant?
End-to-end voice solution for OpenClaw agents. Xiaomi MiMo-V2.5-TTS with emotion-aware speech generation, voice cloning, dialect support, and fine-grained in... It is an AI Agent Skill for Claude Code / OpenClaw, with 189 downloads so far.
How do I install MiMo Voice Assistant?
Run "/install mimo-voice-assistant" in the OpenClaw or Claude Code chat to install it in one step. The skill itself requires node, ffmpeg, and the MIMO_API_KEY environment variable at runtime.
Is MiMo Voice Assistant free?
Yes, MiMo Voice Assistant is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does MiMo Voice Assistant support?
MiMo Voice Assistant is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created MiMo Voice Assistant?
It is built and maintained by Little Moon (@nciae-zyh); the current version is v2.3.0.