mlx-whisper
/install jimmy-claw-mlx-whisper
mlx-whisper — Local Voice Transcription for Apple Silicon
Enables automatic transcription of voice notes in OpenClaw using Apple's MLX framework. No API key required. Works fully offline. ~60× faster than standard Whisper on M1/M2/M3/M4.
How it works
- User sends a voice note (Telegram .ogg / WhatsApp .opus)
- OpenClaw downloads the audio file
- Passes it to mlx-whisper-transcribe.sh via {{MediaPath}}
- Transcript is injected as the message body
- Agent replies to the text content
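The flow above can be sketched in Python. This is an illustration only, not the contents of the shipped mlx-whisper-transcribe.sh: the names `run_wrapper` and `transcribe_with_mlx` are hypothetical, and the sketch assumes the wrapper hands the transcript back to OpenClaw as plain text. The `mlx_whisper` import is deferred so the argument handling can be exercised on machines without MLX.

```python
from typing import Callable, List, Optional

# Default model named in the Models section of this document.
DEFAULT_MODEL = "mlx-community/whisper-small-mlx"


def transcribe_with_mlx(path: str, language: Optional[str] = None) -> str:
    """Transcribe one audio file with mlx_whisper (requires Apple Silicon)."""
    import mlx_whisper  # imported lazily; only needed when actually transcribing

    options = {"path_or_hf_repo": DEFAULT_MODEL}
    if language:
        options["language"] = language  # skips language auto-detection
    result = mlx_whisper.transcribe(path, **options)
    return result["text"].strip()


def run_wrapper(
    argv: List[str],
    transcribe: Callable[[str, Optional[str]], str] = transcribe_with_mlx,
) -> str:
    """Hypothetical stand-in for the wrapper script.

    argv[0] is the audio path OpenClaw substitutes for {{MediaPath}};
    argv[1], if present, is an optional language code.
    """
    if not argv:
        raise ValueError("usage: <audio-file> [language]")
    media_path = argv[0]
    language = argv[1] if len(argv) > 1 else None
    return transcribe(media_path, language)
```

On an Apple Silicon Mac with mlx-whisper installed, `print(run_wrapper(["/path/to/audio.ogg", "en"]))` would print the transcript; passing a different `transcribe` callable makes the argument handling testable without a model.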
Setup
Step 1 — Install mlx-whisper
pip3 install mlx-whisper
Verify:
python3 -c "import mlx_whisper; print('OK')"
Step 2 — Install the wrapper script
Find the Python bin path:
python3 -m site --user-base
# e.g. /Users/<you>/Library/Python/3.9
Copy bin/mlx-whisper-transcribe.sh from this skill to <user-base>/bin/mlx-whisper-transcribe.sh, then make it executable:
PYBIN=$(python3 -m site --user-base)/bin
cp {baseDir}/bin/mlx-whisper-transcribe.sh "$PYBIN/mlx-whisper-transcribe.sh"
chmod +x "$PYBIN/mlx-whisper-transcribe.sh"
Test it:
"$PYBIN/mlx-whisper-transcribe.sh" /path/to/audio.ogg
# First run downloads the model (~465MB). Subsequent runs reuse the cached copy and skip the download.
Step 3 — Configure OpenClaw
Add to ~/.openclaw/openclaw.json under tools.media.audio:
{
"tools": {
"media": {
"audio": {
"enabled": true,
"models": [
{
"type": "cli",
"command": "<user-base>/bin/mlx-whisper-transcribe.sh",
"args": ["{{MediaPath}}"],
"timeoutSeconds": 60
}
]
}
}
}
}
Replace <user-base> with the output of python3 -m site --user-base.
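To avoid pasting the user-base path by hand, the fragment can be generated with the path already filled in. The sketch below is one way to do that, assuming the wrapper was copied to the location used in Step 2; `build_audio_config` is a hypothetical helper name, and the optional `language` argument corresponds to the language hint described later in this document. It only prints the fragment; merging it into ~/.openclaw/openclaw.json is left to you.

```python
import json
import site
from pathlib import PurePosixPath
from typing import Optional

WRAPPER_NAME = "mlx-whisper-transcribe.sh"


def build_audio_config(
    user_base: str,
    language: Optional[str] = None,
    timeout_seconds: int = 60,
) -> dict:
    """Build the tools.media.audio fragment with the wrapper path resolved."""
    command = str(PurePosixPath(user_base) / "bin" / WRAPPER_NAME)
    args = ["{{MediaPath}}"]
    if language:
        args.append(language)  # optional second argument: language code
    return {
        "tools": {
            "media": {
                "audio": {
                    "enabled": True,
                    "models": [
                        {
                            "type": "cli",
                            "command": command,
                            "args": args,
                            "timeoutSeconds": timeout_seconds,
                        }
                    ],
                }
            }
        }
    }


# site.getuserbase() returns the same path as `python3 -m site --user-base`.
print(json.dumps(build_audio_config(site.getuserbase()), indent=2))
```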
Step 4 — Restart OpenClaw
openclaw gateway restart
Or restart the OpenClaw app from the menu bar.
Models
The wrapper uses whisper-small-mlx by default (465MB, good balance of speed and accuracy).
To change, edit bin/mlx-whisper-transcribe.sh and update path_or_hf_repo:
| Model | Size | Use case |
|---|---|---|
| mlx-community/whisper-tiny-mlx | 75MB | Fastest, basic accuracy |
| mlx-community/whisper-small-mlx | 465MB | Recommended |
| mlx-community/whisper-medium-mlx | 1.5GB | Higher accuracy |
| mlx-community/whisper-large-v3-mlx | 3GB | Best accuracy |
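If download size is the main constraint, the table can be turned into a small lookup. This is a rough sketch: `largest_model_within` is a hypothetical helper, the sizes are the approximate figures from the table (1.5GB and 3GB rounded to 1500MB and 3000MB), and it assumes that a larger model means better accuracy, as the "Use case" column suggests.

```python
from typing import Optional

# Approximate download sizes in MB, taken from the table above.
MODEL_SIZES_MB = {
    "mlx-community/whisper-tiny-mlx": 75,
    "mlx-community/whisper-small-mlx": 465,
    "mlx-community/whisper-medium-mlx": 1500,
    "mlx-community/whisper-large-v3-mlx": 3000,
}


def largest_model_within(budget_mb: int) -> Optional[str]:
    """Return the largest listed model whose size fits the budget, or None."""
    candidates = [
        (size, name) for name, size in MODEL_SIZES_MB.items() if size <= budget_mb
    ]
    if not candidates:
        return None
    return max(candidates)[1]
```

The returned string is the value to put in path_or_hf_repo inside the wrapper script.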
Language hint (optional)
Pass a language code as the second argument to skip auto-detection (faster):
mlx-whisper-transcribe.sh audio.ogg zh # Chinese
mlx-whisper-transcribe.sh audio.ogg en # English
In openclaw.json, add the language to args:
"args": ["{{MediaPath}}", "zh"]
Performance (M3 MacBook Pro, 8GB)
| Audio length | Transcription time |
|---|---|
| 10 sec | ~1 sec |
| 1 min | ~7 sec |
| 30 min | ~3.5 min |
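These figures work out to roughly 7 seconds of processing per minute of audio (30 min × 7 s = 210 s ≈ 3.5 min), which gives a quick way to pick a timeoutSeconds value. The sketch below assumes that rate holds linearly on your machine and model, which it may not; the helper names and the 2× safety factor are illustrative choices, and the estimate does not include the one-time model download on first run.

```python
import math

# Rough rate derived from the table above (M3 MacBook Pro, 8GB):
# about 7 seconds of processing per minute of audio.
PROCESSING_SECONDS_PER_AUDIO_MINUTE = 7


def estimate_transcription_seconds(audio_seconds: float) -> float:
    """Linear estimate of transcription time for a given audio length."""
    return audio_seconds / 60 * PROCESSING_SECONDS_PER_AUDIO_MINUTE


def suggested_timeout(
    audio_seconds: float, safety_factor: float = 2.0, minimum: int = 60
) -> int:
    """Suggest a timeoutSeconds value with headroom, never below `minimum`."""
    estimate = estimate_transcription_seconds(audio_seconds) * safety_factor
    return max(minimum, math.ceil(estimate))
```

Under these assumptions, the default timeout of 60 seconds comfortably covers short voice notes, while a 30-minute recording would need a considerably larger value.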
Troubleshooting
- mlx_whisper not found: Run pip3 install mlx-whisper again
- Empty transcript: Audio may be silent or music-only (Whisper transcribes speech only)
- Timeout: Increase timeoutSeconds for long audio files
- Wrong language: Add the target language code (e.g. "zh") as the second entry in args
- Model download fails: Check internet connection; models are cached after first run in ~/.cache/huggingface
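Several of the checks above can be run in one pass. The following is a minimal diagnostic sketch, assuming the wrapper lives at the Step 2 location and the model cache is at the default ~/.cache/huggingface (Hugging Face tooling can relocate it, for example through the HF_HOME environment variable). `diagnose` and `print_report` are hypothetical helper names.

```python
import importlib.util
import os
import site
from pathlib import Path
from typing import Dict, Optional


def diagnose(user_base: Optional[str] = None, home: Optional[str] = None) -> Dict[str, bool]:
    """Check the pieces the Setup steps are expected to have put in place."""
    base = Path(user_base) if user_base else Path(site.getuserbase())
    home_dir = Path(home) if home else Path.home()
    wrapper = base / "bin" / "mlx-whisper-transcribe.sh"
    return {
        # True if `import mlx_whisper` would find the package
        "mlx_whisper_importable": importlib.util.find_spec("mlx_whisper") is not None,
        "wrapper_exists": wrapper.is_file(),
        "wrapper_executable": wrapper.is_file() and os.access(wrapper, os.X_OK),
        # Present only after a model has been downloaded at least once
        "model_cache_present": (home_dir / ".cache" / "huggingface").is_dir(),
    }


def print_report(report: Dict[str, bool]) -> None:
    """Print one line per check."""
    for check, ok in report.items():
        print(f"{'ok' if ok else 'MISSING':8}{check}")
```

Calling `print_report(diagnose())` prints one line per check; any line marked MISSING points at the setup step to repeat.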
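- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install jimmy-claw-mlx-whisper
- Once installed, invoke the Skill by name or trigger it with /jimmy-claw-mlx-whisper
- Provide the inputs described in the Skill's parameter notes to get structured output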
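What is mlx-whisper?
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 338 downloads to date.
How do I install mlx-whisper?
Run the command "/install jimmy-claw-mlx-whisper" in the OpenClaw or Claude Code chat box to install the skill in one step. Transcription itself still requires the Setup steps above.
Is mlx-whisper free?
Yes. mlx-whisper is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does mlx-whisper support?
mlx-whisper is listed for darwin (macOS) and targets Apple Silicon Macs; it can be used wherever OpenClaw / Claude Code is deployed on such a machine.
Who develops mlx-whisper?
It is developed and maintained by YinghaoJia (@yinghaojia). The current version is v1.0.7.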