yinghaojia

mlx-whisper

Author: YinghaoJia · GitHub · v1.0.7 · MIT-0
darwin ⚠ suspicious
Total downloads: 338
Favorites: 0
Current installs: 1
Versions: 8
Install in OpenClaw
/install jimmy-claw-mlx-whisper
Description
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Instructions (SKILL.md)

mlx-whisper — Local Voice Transcription for Apple Silicon

Enables automatic transcription of voice notes in OpenClaw using Apple's MLX framework. No API key required. Works offline once the model has been downloaded on the first run. ~60× faster than standard Whisper on M1/M2/M3/M4.

How it works

  1. User sends a voice note (Telegram .ogg / WhatsApp .opus)
  2. OpenClaw downloads the audio file
  3. Passes it to mlx-whisper-transcribe.sh via {{MediaPath}}
  4. Transcript is injected as the message body
  5. Agent replies to the text content
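
Steps 3 and 4 can be sketched in a few lines. This is a hypothetical illustration of how a CLI-type audio model might be invoked, not OpenClaw's actual implementation: the function name `run_cli_transcriber` and the substitution logic are assumptions, and only the `{{MediaPath}}` placeholder and the command/args/timeout fields come from the configuration in Step 3.

```python
import subprocess
import sys


def run_cli_transcriber(command, args, media_path, timeout_seconds=60):
    """Hypothetical sketch: fill in {{MediaPath}}, run the CLI, return stdout as the transcript."""
    resolved = [a.replace("{{MediaPath}}", media_path) for a in args]
    result = subprocess.run(
        [command, *resolved],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # raises subprocess.TimeoutExpired when exceeded
        check=True,               # raises CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in command so the sketch runs without mlx-whisper installed:
    # the Python interpreter just echoes the path it receives.
    echo_args = ["-c", "import sys; print('transcript for', sys.argv[1])", "{{MediaPath}}"]
    print(run_cli_transcriber(sys.executable, echo_args, "/tmp/voice.ogg"))
```

Whatever the command prints to standard output becomes the message body, so a wrapper should print only the transcript text.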

Setup

Step 1 — Install mlx-whisper

pip3 install mlx-whisper

Verify:

python3 -c "import mlx_whisper; print('OK')"

Step 2 — Install the wrapper script

Find the Python bin path:

python3 -m site --user-base
# e.g. /Users/<you>/Library/Python/3.9

Copy bin/mlx-whisper-transcribe.sh from this skill to <user-base>/bin/mlx-whisper-transcribe.sh, then make it executable:

PYBIN=$(python3 -m site --user-base)/bin
cp {baseDir}/bin/mlx-whisper-transcribe.sh "$PYBIN/mlx-whisper-transcribe.sh"
chmod +x "$PYBIN/mlx-whisper-transcribe.sh"

Test it:

"$PYBIN/mlx-whisper-transcribe.sh" /path/to/audio.ogg
# First run downloads the model (~465MB). Later runs reuse the cached copy and skip the download.

Step 3 — Configure OpenClaw

Add to ~/.openclaw/openclaw.json under tools.media.audio:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "<user-base>/bin/mlx-whisper-transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}

Replace <user-base> with the output of python3 -m site --user-base.
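
To avoid a typo when substituting the path by hand, the fragment can also be generated. The sketch below is one possible approach and not part of the skill: `build_audio_config` is a hypothetical helper, and it only prints the fragment, so merging it into an existing ~/.openclaw/openclaw.json is still a manual step.

```python
import json
import site
from pathlib import PurePosixPath


def build_audio_config(user_base, timeout_seconds=60):
    """Return the tools.media.audio fragment with the user-base path filled in."""
    command = str(PurePosixPath(user_base) / "bin" / "mlx-whisper-transcribe.sh")
    return {
        "tools": {
            "media": {
                "audio": {
                    "enabled": True,
                    "models": [
                        {
                            "type": "cli",
                            "command": command,
                            "args": ["{{MediaPath}}"],
                            "timeoutSeconds": timeout_seconds,
                        }
                    ],
                }
            }
        }
    }


if __name__ == "__main__":
    # site.getuserbase() reports the same directory as `python3 -m site --user-base`.
    print(json.dumps(build_audio_config(site.getuserbase()), indent=2))
```

Back up the existing configuration file before pasting the output into it.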

Step 4 — Restart OpenClaw

openclaw gateway restart

Or restart the OpenClaw app from the menu bar.

Models

The wrapper uses whisper-small-mlx by default (465MB, good balance of speed and accuracy). To use a different model, edit bin/mlx-whisper-transcribe.sh and set path_or_hf_repo to one of the following:

| Model | Size | Use case |
|---|---|---|
| mlx-community/whisper-tiny-mlx | 75MB | Fastest, basic accuracy |
| mlx-community/whisper-small-mlx | 465MB | Recommended |
| mlx-community/whisper-medium-mlx | 1.5GB | Higher accuracy |
| mlx-community/whisper-large-v3-mlx | 3GB | Best accuracy |

Language hint (optional)

Pass a language code as the second argument to skip auto-detection (faster):

mlx-whisper-transcribe.sh audio.ogg zh   # Chinese
mlx-whisper-transcribe.sh audio.ogg en   # English

In openclaw.json, add the language to args:

"args": ["{{MediaPath}}", "zh"]

Performance (M3 MacBook Pro, 8GB)

| Audio length | Transcription time |
|---|---|
| 10 sec | ~1 sec |
| 1 min | ~7 sec |
| 30 min | ~3.5 min |
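
These figures work out to roughly 10× real time for the short clip and about 8.6× for the two longer ones. At that rate the default timeoutSeconds of 60 covers about 8.5 minutes of audio with no margin. The sketch below reproduces the arithmetic. The `suggested_timeout` heuristic (slowest measured rate times a 2× safety margin) is an assumption for illustration, not something the skill prescribes.

```python
# (audio seconds, transcription seconds), taken from the performance figures above
MEASUREMENTS = [(10, 1), (60, 7), (1800, 210)]


def speed_factor(audio_s, transcribe_s):
    """Seconds of audio processed per second of wall-clock time."""
    return audio_s / transcribe_s


def suggested_timeout(audio_s, safety=2.0, floor=60):
    """Hypothetical heuristic: time at the slowest measured rate, times a safety margin."""
    slowest = min(speed_factor(a, t) for a, t in MEASUREMENTS)
    return max(floor, round(audio_s / slowest * safety))


if __name__ == "__main__":
    for audio_s, transcribe_s in MEASUREMENTS:
        print(f"{audio_s:>5} s audio: {speed_factor(audio_s, transcribe_s):.1f}x real time")
    print("suggested timeoutSeconds for 30 min of audio:", suggested_timeout(1800))
```

The measurements come from a single machine (M3 MacBook Pro, 8GB), so treat the result as a starting point rather than a guarantee.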

Troubleshooting

  • mlx_whisper not found: Run pip3 install mlx-whisper again
  • Empty transcript: Audio may be silent or music-only (Whisper transcribes speech only)
  • Timeout: Increase timeoutSeconds for long audio files
  • Wrong language: Add the target language code (for example "zh") as the second entry in args, after "{{MediaPath}}"
  • Model download fails: Check internet connection; models are cached after first run in ~/.cache/huggingface
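
When a download appears to fail, it helps to check whether the model is already cached. The sketch below assumes the default Hugging Face hub layout (a `hub/models--<org>--<name>` directory under ~/.cache/huggingface). The helper names are hypothetical, and the location differs if the `HF_HOME` environment variable is set.

```python
from pathlib import Path


def cached_model_dir(repo_id, cache_root=None):
    """Return the directory where the hub cache would store the given repo id."""
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "huggingface"
    return root / "hub" / ("models--" + repo_id.replace("/", "--"))


def is_model_cached(repo_id, cache_root=None):
    """True if a cache directory for the repo exists (does not verify its contents)."""
    return cached_model_dir(repo_id, cache_root).is_dir()


if __name__ == "__main__":
    repo = "mlx-community/whisper-small-mlx"
    print(cached_model_dir(repo))
    print("cached" if is_model_cached(repo) else "not cached yet")
```

A directory that exists can still hold a partial download, so deleting it and retrying is a reasonable next step if transcription keeps failing.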
Security recommendations
Do not copy or run any wrapper script you cannot inspect. The SKILL.md tells you to copy bin/mlx-whisper-transcribe.sh from the skill, but that file is not included in the published package. Ask the publisher to provide the script source or include it in the skill so you can audit it. If you still want to proceed:

  1. Install mlx-whisper in a contained environment (virtualenv or user-only pip install) so install hooks are isolated.
  2. Verify what files pip installed (pip3 show -f mlx-whisper, then inspect the installed scripts).
  3. If you must use a wrapper, write your own small wrapper that calls the mlx_whisper Python API or runs a short vetted command rather than copying an opaque shell script.
  4. Confirm model downloads will fit your disk (~465MB or more for larger models) and that cached models live under ~/.cache/huggingface.
  5. Only grant OpenClaw the configuration changes you understand, and back up ~/.openclaw/openclaw.json before editing.

If the publisher cannot produce the wrapper script source or explain why it was omitted, treat the skill as untrusted.
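
Recommendation 3 can be met with a very small script. The following is a minimal sketch of a self-written wrapper, not the publisher's missing script. It assumes `mlx_whisper.transcribe` accepts `path_or_hf_repo` and an optional `language` keyword and returns a dict with a `"text"` entry. The argument convention (audio path, then optional language code) mirrors the usage described in SKILL.md, and the file name `transcribe.py` is hypothetical.

```python
#!/usr/bin/env python3
"""Self-written stand-in for the wrapper: prints the transcript of one audio file."""
import sys

DEFAULT_MODEL = "mlx-community/whisper-small-mlx"


def parse_args(argv):
    """Accept [audio_path] or [audio_path, language_code]."""
    if not 1 <= len(argv) <= 2:
        raise SystemExit("usage: transcribe.py AUDIO_PATH [LANGUAGE]")
    options = {"path_or_hf_repo": DEFAULT_MODEL}
    if len(argv) == 2:
        options["language"] = argv[1]  # skips auto-detection
    return argv[0], options


def main(argv):
    audio_path, options = parse_args(argv)
    import mlx_whisper  # imported lazily so usage errors surface without loading MLX
    result = mlx_whisper.transcribe(audio_path, **options)
    print(result["text"].strip())


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: transcribe.py AUDIO_PATH [LANGUAGE]", file=sys.stderr)
    else:
        main(sys.argv[1:])
```

Because the whole script fits on one screen, it can be audited before it is referenced from the "command" field in openclaw.json.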
Functional analysis
Type: OpenClaw Skill
Name: jimmy-claw-mlx-whisper
Version: 1.0.7

The skill is a legitimate utility designed to enable local audio transcription on Apple Silicon Macs using the MLX framework. It provides transparent instructions for installing the 'mlx-whisper' Python package and configuring a shell wrapper for OpenClaw. No evidence of data exfiltration, malicious persistence, or harmful prompt injection was found in the documentation or metadata.
Capability assessment
Purpose & Capability
The name, description, and requested binaries (python3, pip3) align with installing a local Python-based transcription tool. However, SKILL.md repeatedly instructs you to copy a wrapper script from this skill (bin/mlx-whisper-transcribe.sh), but the file manifest does not include a bin directory or that script. That inconsistency is unexplained and disproportionate to the stated purpose.
Instruction Scope
Most runtime steps are in-scope (pip3 install mlx-whisper, configure openclaw.json, restart). The instructions ask you to copy a shell wrapper into your user bin and run it; because the wrapper script is not included in the package, you cannot inspect or verify what that script does. Installing and running an unaudited script is a risk. Otherwise the instructions do not request unrelated files, secrets, or external endpoints beyond model downloads into the usual Hugging Face cache.
Install Mechanism
Installation is via pip3 (pip3 install mlx-whisper) which is expected for a Python package and uses public package registries; this is a common but moderately privileged operation because pip packages can run install-time code. There are no downloads from obscure URLs in the instructions.
Credentials
The skill requires only python3/pip3 and asks you to edit OpenClaw's config (~/.openclaw/openclaw.json) and to allow model downloads to the Hugging Face cache (~/.cache/huggingface). It does not request credentials or unrelated environment variables. These requirements are proportionate to the stated transcription purpose.
Persistence & Privilege
always is false and the skill does not request permanent platform-wide privileges. It instructs you to modify your OpenClaw config and restart the app, which is expected for adding a local tool. It does not ask to modify other skills or system-wide settings beyond the user config.
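How to use
  1. Make sure OpenClaw is installed (local or Docker deployment)
  2. Enter the install command in the chat box: /install jimmy-claw-mlx-whisper
  3. After installation, invoke the Skill by name or trigger it with /jimmy-claw-mlx-whisper
  4. Provide the required inputs according to the Skill's parameter description to get structured output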
Version history
v1.0.7
Force new version to fix scanner cache
v1.0.6
Re-submit v1.0.3 as new version
v1.0.5
Force include bin folder via package.json
v1.0.4
Fix wrapper script missing (added .sh extension)
v1.0.3
Fix wrapper script missing (added .sh extension)
v1.0.2
Fix wrapper script missing from package (added .sh extension)
v1.0.1
Fix wrapper script inclusion and offline claim
v1.0.0
Local voice transcription for Apple Silicon via mlx-whisper
Metadata
Slug: jimmy-claw-mlx-whisper
Version: 1.0.7
License: MIT-0
Total installs: 1
Current installs: 1
Versions: 8
FAQ

What is mlx-whisper?

Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 338 total downloads so far.

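How do I install mlx-whisper?

Run the command /install jimmy-claw-mlx-whisper in the OpenClaw or Claude Code chat box to install the skill in one step. Transcription itself still requires the Setup steps in SKILL.md (installing mlx-whisper, the wrapper script, and the openclaw.json entry).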

Is mlx-whisper free?

Yes. mlx-whisper is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does mlx-whisper support?

mlx-whisper runs only on macOS (darwin) on Apple Silicon Macs (M1/M2/M3/M4), in an environment where OpenClaw / Claude Code is deployed.

Who developed mlx-whisper?

It is developed and maintained by YinghaoJia (@yinghaojia). The current version is v1.0.7.
