yinghaojia

mlx-whisper

by YinghaoJia · GitHub ↗ · v1.0.7 · MIT-0
Platform: darwin · ⚠ suspicious
Downloads: 338 · Stars: 0 · Active Installs: 1 · Versions: 8
Install in OpenClaw
/install jimmy-claw-mlx-whisper
Description
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
README (SKILL.md)

mlx-whisper — Local Voice Transcription for Apple Silicon

Enables automatic transcription of voice notes in OpenClaw using Apple's MLX framework. No API key required. Works offline once the first run has downloaded the model. The author reports it as ~60× faster than standard Whisper on M1/M2/M3/M4.

How it works

  1. User sends a voice note (Telegram .ogg / WhatsApp .opus)
  2. OpenClaw downloads the audio file
  3. Passes it to mlx-whisper-transcribe.sh via {{MediaPath}}
  4. Transcript is injected as the message body
  5. Agent replies to the text content
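
Steps 3 and 4 can be pictured with a short Python sketch. It is not OpenClaw's implementation. It assumes a "cli" entry shaped like the one in Step 3 of Setup, that {{MediaPath}} is filled in by plain string substitution, and that the transcript is whatever the command prints to standard output. The helper names build_command and run_transcriber are hypothetical.

```python
import subprocess


def build_command(entry, media_path):
    """Expand a 'cli' model entry into an argv list (assumed substitution rule)."""
    args = [arg.replace("{{MediaPath}}", media_path) for arg in entry.get("args", [])]
    return [entry["command"], *args]


def run_transcriber(entry, media_path):
    """Run the configured command and treat its stdout as the transcript (assumption)."""
    result = subprocess.run(
        build_command(entry, media_path),
        capture_output=True,
        text=True,
        timeout=entry.get("timeoutSeconds", 60),  # raises TimeoutExpired when exceeded
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Passing the arguments as a list rather than a shell string keeps file names with spaces or quotes from being reinterpreted by a shell.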

Setup

Step 1 — Install mlx-whisper

pip3 install mlx-whisper

Verify:

python3 -c "import mlx_whisper; print('OK')"
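
A variant of this check that only looks the module up, without importing it, is sketched below. The function name is_installed is illustrative.

```python
import importlib.util


def is_installed(module_name="mlx_whisper"):
    """Return True if the top-level module can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


print("OK" if is_installed() else "mlx_whisper is missing; run: pip3 install mlx-whisper")
```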

Step 2 — Install the wrapper script

Find the Python bin path:

python3 -m site --user-base
# e.g. /Users/<you>/Library/Python/3.9

Copy bin/mlx-whisper-transcribe.sh from this skill to <user-base>/bin/mlx-whisper-transcribe.sh, then make it executable:

PYBIN=$(python3 -m site --user-base)/bin
cp {baseDir}/bin/mlx-whisper-transcribe.sh "$PYBIN/mlx-whisper-transcribe.sh"
chmod +x "$PYBIN/mlx-whisper-transcribe.sh"

Test it:

"$PYBIN/mlx-whisper-transcribe.sh" /path/to/audio.ogg
# First run downloads the model (~465MB). Subsequent runs reuse the cached copy and skip the download.

Step 3 — Configure OpenClaw

Add to ~/.openclaw/openclaw.json under tools.media.audio:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "<user-base>/bin/mlx-whisper-transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}

Replace <user-base> with the output of python3 -m site --user-base.
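
The edit can also be scripted. The sketch below backs up the file and then merges the entry under tools.media.audio, leaving other keys alone. It assumes openclaw.json is strict JSON (no comments or trailing commas); the function names and the .bak suffix are illustrative choices, not part of the skill.

```python
import json
import shutil
from pathlib import Path


def audio_entry(user_base, language=None, timeout_seconds=60):
    """Build the 'cli' model entry shown in Step 3."""
    args = ["{{MediaPath}}"]
    if language:
        args.append(language)  # optional language code as the second argument
    return {
        "type": "cli",
        "command": str(Path(user_base) / "bin" / "mlx-whisper-transcribe.sh"),
        "args": args,
        "timeoutSeconds": timeout_seconds,
    }


def add_audio_model(config_path, entry):
    """Back up the config, then add the entry under tools.media.audio."""
    config_path = Path(config_path)
    config = {}
    if config_path.exists():
        # Keep a copy next to the original before changing anything.
        shutil.copy2(config_path, config_path.with_name(config_path.name + ".bak"))
        config = json.loads(config_path.read_text())
    audio = config.setdefault("tools", {}).setdefault("media", {}).setdefault("audio", {})
    audio["enabled"] = True
    models = audio.setdefault("models", [])
    if entry not in models:  # avoid duplicates when run twice
        models.append(entry)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

A call such as add_audio_model(Path.home() / ".openclaw" / "openclaw.json", audio_entry(site.getuserbase())) would apply it, where site.getuserbase() from the standard library returns the same path as python3 -m site --user-base. If the file is not valid JSON, json.loads raises before anything is written, so the original stays untouched.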

Step 4 — Restart OpenClaw

openclaw gateway restart

Or restart the OpenClaw app from the menu bar.

Models

The wrapper uses whisper-small-mlx by default (465MB, good balance of speed and accuracy). To change, edit bin/mlx-whisper-transcribe.sh and update path_or_hf_repo:

| Model | Size | Use case |
|---|---|---|
| mlx-community/whisper-tiny-mlx | 75MB | Fastest, basic accuracy |
| mlx-community/whisper-small-mlx | 465MB | Recommended |
| mlx-community/whisper-medium-mlx | 1.5GB | Higher accuracy |
| mlx-community/whisper-large-v3-mlx | 3GB | Best accuracy |
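
Before switching to a larger model, it may help to check that the download fits on disk. The sketch below copies the sizes from the table above, treating 1.5GB and 3GB as 1500 and 3000 MB. The 1000 MB reserve and the function names are arbitrary assumptions.

```python
import shutil
from pathlib import Path

# Approximate download sizes in MB, taken from the model table.
MODEL_SIZES_MB = {
    "mlx-community/whisper-tiny-mlx": 75,
    "mlx-community/whisper-small-mlx": 465,
    "mlx-community/whisper-medium-mlx": 1500,
    "mlx-community/whisper-large-v3-mlx": 3000,
}


def free_megabytes(path):
    """Free space, in MB, on the volume that holds path."""
    return shutil.disk_usage(path).free // 1_000_000


def has_room_for(model, free_mb, reserve_mb=1000):
    """True if the model fits while leaving reserve_mb untouched."""
    return MODEL_SIZES_MB[model] + reserve_mb <= free_mb


free_mb = free_megabytes(Path.home())
for name in MODEL_SIZES_MB:
    print(name, "fits" if has_room_for(name, free_mb) else "does not fit")
```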

Language hint (optional)

Pass a language code as the second argument to skip auto-detection (faster):

mlx-whisper-transcribe.sh audio.ogg zh   # Chinese
mlx-whisper-transcribe.sh audio.ogg en   # English

In openclaw.json, add the language to args:

"args": ["{{MediaPath}}", "zh"]

Performance (M3 MacBook Pro, 8GB)

| Audio length | Transcription time |
|---|---|
| 10 sec | ~1 sec |
| 1 min | ~7 sec |
| 30 min | ~3.5 min |
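
These figures can be turned into a rough timeoutSeconds estimate. The sketch below scales the "1 min in ~7 sec" row linearly, which the 30 min row (1800 × 7 / 60 = 210 sec, or 3.5 min) suggests is reasonable on this machine. The safety factor of 2 and the 60-second floor are arbitrary assumptions, and other hardware or models will differ.

```python
import math

# From the table: about 7 seconds of processing per 60 seconds of audio.
PROCESS_SECONDS = 7
PER_AUDIO_SECONDS = 60


def suggested_timeout(audio_seconds, safety_factor=2.0, floor_seconds=60):
    """Estimate a timeoutSeconds value for audio of the given length."""
    estimate = audio_seconds * PROCESS_SECONDS * safety_factor / PER_AUDIO_SECONDS
    return max(floor_seconds, math.ceil(estimate))


print(suggested_timeout(60))       # short voice note
print(suggested_timeout(30 * 60))  # 30-minute recording
```

With these defaults a 30-minute file yields 420, twice the measured 210 seconds. The timeoutSeconds value of 60 shown in Step 3 is lower than that measured time, so it would need raising for recordings of that length.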

Troubleshooting

  • mlx_whisper not found: Run pip3 install mlx-whisper again
  • Empty transcript: Audio may be silent or music-only (Whisper transcribes speech only)
  • Timeout: Increase timeoutSeconds for long audio files
  • Wrong language: Add the target language code (for example "zh") as the second entry in args
  • Model download fails: Check internet connection; models are cached after first run in ~/.cache/huggingface
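
To tell a failed download from a missing cache entry, the expected cache folder can be checked directly. This sketch assumes the default Hugging Face hub layout (hub/models--<org>--<name> under the cache root) and that no environment variable such as HF_HOME relocates the cache. The function name is illustrative.

```python
from pathlib import Path


def cached_model_dir(repo_id, cache_root=None):
    """Return the folder where the Hugging Face hub cache would store repo_id."""
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "huggingface"
    return root / "hub" / ("models--" + repo_id.replace("/", "--"))


model_dir = cached_model_dir("mlx-community/whisper-small-mlx")
print(model_dir, "cached" if model_dir.is_dir() else "not downloaded yet")
```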
Usage Guidance
Do not copy or run any wrapper script you cannot inspect. The SKILL.md tells you to copy bin/mlx-whisper-transcribe.sh from the skill, but that file is not included in the published package. Ask the publisher to provide the script source or include it in the skill so you can audit it. If you still want to proceed:
  1. Install mlx-whisper in a contained environment (virtualenv or user-only pip install) so install hooks are isolated.
  2. Verify what files pip installed (pip3 show -f mlx-whisper, and inspect installed scripts).
  3. If you must use a wrapper, write your own small wrapper that calls the mlx_whisper Python API or runs a short vetted command rather than copying an opaque shell script.
  4. Confirm model downloads will fit your disk (~465MB or more for larger models) and that cached models live under ~/.cache/huggingface.
  5. Only grant OpenClaw the configuration changes you understand, and back up ~/.openclaw/openclaw.json before editing.
If the publisher cannot produce the wrapper script source or explain why it was omitted, treat the skill as untrusted.
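The suggestion above to write your own wrapper could look like the following Python sketch. It is not the skill's missing script. It assumes OpenClaw reads the transcript from standard output, and it takes the same two positional arguments the README describes (audio file, optional language code). The file name transcribe.py is hypothetical.

```python
import sys

DEFAULT_MODEL = "mlx-community/whisper-small-mlx"


def parse_args(argv):
    """Return (audio_path, language); language is None when omitted."""
    if not 1 <= len(argv) <= 2:
        raise SystemExit("usage: transcribe.py AUDIO_FILE [LANGUAGE_CODE]")
    return argv[0], (argv[1] if len(argv) == 2 else None)


def transcribe(audio_path, language=None, model=DEFAULT_MODEL):
    """Transcribe one file with mlx_whisper and return the text."""
    # Imported here so argument errors are reported without loading MLX.
    import mlx_whisper

    # language=None leaves language detection to the model.
    result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=model, language=language)
    return result["text"].strip()


def main(argv):
    audio_path, language = parse_args(argv)
    print(transcribe(audio_path, language))
    return 0
```

To use it as the configured command, end the file with raise SystemExit(main(sys.argv[1:])), add a python3 shebang line, and make it executable. Because the whole script is a few lines you wrote yourself, there is nothing opaque left to audit beyond the mlx-whisper package.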
Capability Analysis
Type: OpenClaw Skill
Name: jimmy-claw-mlx-whisper
Version: 1.0.7
The skill is a legitimate utility designed to enable local audio transcription on Apple Silicon Macs using the MLX framework. It provides transparent instructions for installing the 'mlx-whisper' Python package and configuring a shell wrapper for OpenClaw. No evidence of data exfiltration, malicious persistence, or harmful prompt injection was found in the documentation or metadata.
Capability Assessment
Purpose & Capability
The name, description, and requested binaries (python3, pip3) align with installing a local Python-based transcription tool. However, SKILL.md repeatedly instructs you to copy a wrapper script from this skill (bin/mlx-whisper-transcribe.sh), but the file manifest does not include a bin directory or that script. That inconsistency is unexplained and disproportionate to the stated purpose.
Instruction Scope
Most runtime steps are in-scope (pip3 install mlx-whisper, configure openclaw.json, restart). The instructions ask you to copy a shell wrapper into your user bin and run it; because the wrapper script is not included in the package, you cannot inspect or verify what that script does. Installing and running an unaudited script is a risk. Otherwise the instructions do not request unrelated files, secrets, or external endpoints beyond model downloads from typical Hugging Face caching.
Install Mechanism
Installation is via pip3 (pip3 install mlx-whisper) which is expected for a Python package and uses public package registries; this is a common but moderately privileged operation because pip packages can run install-time code. There are no downloads from obscure URLs in the instructions.
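To see what the install placed on disk, the package's recorded file list can be read from Python, much as pip3 show -f prints it. This is a minimal sketch using the standard library's importlib.metadata; the function name is illustrative, and it returns an empty list when the distribution is not installed.

```python
from importlib import metadata


def installed_files(distribution="mlx-whisper"):
    """List the files recorded for an installed distribution."""
    try:
        files = metadata.files(distribution)
    except metadata.PackageNotFoundError:
        return []
    # files is None when the distribution has no file record.
    return [str(path) for path in files or []]


for name in installed_files():
    print(name)
```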
Credentials
The skill requires only python3/pip3 and asks you to edit OpenClaw's config (~/.openclaw/openclaw.json) and to allow model downloads to the Hugging Face cache (~/.cache/huggingface). It does not request credentials or unrelated environment variables. These requirements are proportionate to the stated transcription purpose.
Persistence & Privilege
The skill's always setting is false, and it does not request permanent platform-wide privileges. It instructs you to modify your OpenClaw config and restart the app, which is expected for adding a local tool. It does not ask to modify other skills or system-wide settings beyond the user config.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install jimmy-claw-mlx-whisper
  3. After installation, invoke the skill by name or use /jimmy-claw-mlx-whisper
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.7
Force new version to fix scanner cache
v1.0.6
Re-submit v1.0.3 as new version
v1.0.5
Force include bin folder via package.json
v1.0.4
Fix wrapper script missing (added .sh extension)
v1.0.3
Fix wrapper script missing (added .sh extension)
v1.0.2
Fix wrapper script missing from package (added .sh extension)
v1.0.1
Fix wrapper script inclusion and offline claim
v1.0.0
Local voice transcription for Apple Silicon via mlx-whisper
Metadata
Slug: jimmy-claw-mlx-whisper
Version: 1.0.7
License: MIT-0
All-time Installs: 1
Active Installs: 1
Total Versions: 8
Frequently Asked Questions

What is mlx-whisper?

mlx-whisper is an AI Agent Skill for Claude Code / OpenClaw, with 338 downloads so far. Its listing description reads: "Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T..."

How do I install mlx-whisper?

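Run "/install jimmy-claw-mlx-whisper" in the OpenClaw or Claude Code chat to install the skill, then follow the Setup steps in the README: install the mlx-whisper Python package, install the wrapper script, configure openclaw.json, and restart OpenClaw.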

Is mlx-whisper free?

Yes, mlx-whisper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does mlx-whisper support?

mlx-whisper supports macOS (darwin) only and requires an Apple Silicon Mac (M1/M2/M3/M4).

Who created mlx-whisper?

It is built and maintained by YinghaoJia (@yinghaojia); the current version is v1.0.7.
