
MLX STT

Name: MLX STT
Author: guoqiao

By guoqiao · GitHub · v1.0.7

darwin ⚠ suspicious

Total downloads: 3528

Install in OpenClaw

/install mlx-stt

Description

Local Speech-To-Text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).

Usage instructions (SKILL.md)

MLX STT

Local Speech-To-Text/ASR/transcription with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).

Free and accurate. No API key required. No server required.

Requirements

mlx: macOS with Apple Silicon
brew: used to install deps if not available
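
A quick pre-flight check for the platform requirement might look like the sketch below. The function name `is_apple_silicon` is hypothetical and not part of the skill. It assumes `uname -s` reports `Darwin` and `uname -m` reports `arm64` on Apple Silicon; a shell running under Rosetta reports `x86_64` instead and would fail this check.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill: succeed only on macOS + Apple Silicon.
# The optional arguments override the detected OS and architecture (useful for testing).
is_apple_silicon() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]
}

if is_apple_silicon; then
  echo "Apple Silicon macOS detected"
else
  echo "MLX requires macOS on Apple Silicon" >&2
fi
```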

Installation

bash ${baseDir}/install.sh

This script uses brew (and, for mlx_audio, uv) to install these CLI tools if they are not already available:

ffmpeg: converts the audio format when needed
uv: installs Python packages and runs Python scripts
mlx_audio: does the actual transcription
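
Before running the install script, it can help to see which of these tools are already present. The sketch below is one way to do that; `missing_tools` is a hypothetical helper and is not taken from install.sh. It relies only on the shell built-in `command -v`, and it checks just `ffmpeg` and `uv` because the executable names that mlx_audio installs are not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Prints nothing when both tools are already installed.
missing_tools ffmpeg uv
```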

Usage

To transcribe an audio file, run this script:

bash ${baseDir}/mlx-stt.sh <audio_file_path>

The first run may be slow because the model has to be downloaded.
The transcript is printed to stdout.
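
Because the transcript goes to stdout, it can be redirected to a file. The wrapper below is a minimal sketch of that idea. `transcribe_to_file` and the `MLX_STT_SCRIPT` override are hypothetical names introduced for illustration, and `baseDir` is assumed to hold the skill's install directory, as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: save the transcript that mlx-stt.sh prints to stdout.
# MLX_STT_SCRIPT is an assumed override for the script location.
transcribe_to_file() {
  local audio="$1" out="$2"
  local script="${MLX_STT_SCRIPT:-${baseDir:-.}/mlx-stt.sh}"
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi
  # Only stdout (the transcript) is written to the output file.
  bash "$script" "$audio" > "$out"
}
```

For example, `transcribe_to_file meeting.m4a meeting.txt` would write the transcript to `meeting.txt`, where both file names are placeholders.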

Security recommendations

This skill appears to perform local STT as described, but exercise caution before installing:

- always:true is unnecessary for an on-demand STT tool; prefer a skill that is not force-enabled.
- install.sh runs 'uv tool install --force mlx-audio --prerelease=allow', which will fetch and install a third-party prerelease binary from an unspecified source. Ask the author for the exact upstream registry/URL and inspect that package before installing.
- The mlx_audio tool will download models at runtime (network activity). If you have sensitive data or need an auditable supply chain, run this in an isolated VM or disposable machine first.
- Because stdout/stderr are silenced for the tool, initial failures or unexpected network activity may be hidden; consider running the command manually without redirection to inspect behavior.
- If you decide to proceed, manually run the install script in a controlled environment, verify the origin of the 'uv' CLI and the 'mlx-audio' package, and avoid installing on a machine with sensitive secrets.

Additional information that would raise confidence to 'high': explicit upstream URLs or package registry details for 'uv' and 'mlx-audio', a signed release or checksum for the model/binary, and removal of always:true or an explanation of why force-enable is required.

Functional analysis

Type: OpenClaw Skill
Name: mlx-stt
Version: 1.0.7

The skill bundle is designed for local Speech-To-Text on Apple Silicon. It uses `brew` and `uv` to install necessary dependencies like `ffmpeg` and `mlx-audio`, which are standard tools for audio processing and MLX-based operations. The `mlx-stt.sh` script converts audio to a suitable format using `ffmpeg` and then processes it with `mlx_audio.stt.generate`, printing the transcript to stdout. There is no evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the agent. All actions are directly aligned with the stated purpose of providing local STT functionality.

Capability assessment

✓ Purpose & Capability

Name and description (local MLX-based STT on Apple Silicon) align with the provided scripts: ffmpeg + mlx_audio invocation to transcribe audio. Requiring brew on macOS to install ffmpeg/uv is reasonable for this purpose.

ℹ Instruction Scope

Runtime instructions and scripts only convert the provided audio to WAV, invoke mlx_audio.stt.generate, print transcript files, and clean up temporary files. The scripts do download a model at first run and the mlx_audio command's output is redirected to /dev/null (silenced), which hides runtime logs/errors — not clearly malicious but reduces transparency. The skill does not read unrelated files or request extra environment data.
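
When running a command by hand to inspect its behavior, its error output can be kept in a log instead of being discarded. The sketch below shows one way to do that. `run_with_log` is a hypothetical helper and is not part of the skill. Since the redirection to /dev/null is described as happening inside the skill's script, this helper would only reveal output from commands invoked directly, not from the script as shipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, pass its stdout through unchanged,
# and append its stderr to the given log file instead of discarding it.
run_with_log() {
  local log="$1"
  shift
  "$@" 2>>"$log"
}
```

A call such as `run_with_log /tmp/stt-errors.log ffmpeg -version` keeps normal output on the terminal while any errors accumulate in the log, where the log path is a placeholder.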

⚠ Install Mechanism

install.sh uses brew (expected) but relies on the 'uv' CLI to install 'mlx-audio' with --force and --prerelease=allow. 'uv' and the source/registry used for mlx-audio are not documented here; installing a force/prerelease package from an opaque source can deliver arbitrary code. The install does not download from a clearly identified, verifiable release URL (e.g., official GitHub release or known package registry with provenance shown).

✓ Credentials

The skill declares no required environment variables or credentials and its scripts do not attempt to read other env vars or sensitive config paths — the requested environment access appears minimal and proportional.

⚠ Persistence & Privilege

Registry metadata sets always:true (force‑included in every agent run). A narrow, on‑demand STT skill does not reasonably need to be force‑enabled for all agents. Combined with the opaque install of a prerelease binary, this increases the blast radius if the installed tool were malicious or buggy.

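How to use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install mlx-stt
3. Once installation finishes, invoke the skill by name or trigger it with /mlx-stt.
4. Provide the required input according to the skill's parameter description to get structured output.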

Version history

v1.0.7

- Added main script mlx-stt.sh for audio transcription; removed previous Python script mlx-stt.py.
- Updated usage instructions to use the new shell script instead of a Python command.
- Expanded triggers in SKILL.md for easier activation.
- Added version and author information to metadata.
- The skill still performs local speech-to-text on Apple Silicon with MLX and open-source models.

v1.0.6

- Removed the deprecation notice from documentation.
- Updated SKILL.md to indicate the skill is no longer deprecated and can be used normally.

v1.0.5

- Added a deprecation notice: this skill is no longer maintained.
- Recommended migrating to the replacement skill, mlx-audio-server, for improved functionality.

v1.0.4

- Clarified and simplified the skill description and title.
- Added information about initial model download from Hugging Face during first run.
- Improved formatting and fixed typos (e.g., "Transcibe" → "Transcribe").
- Emphasized that no API key or server is required.
- No changes to installation or usage commands.

v1.0.3

- Expanded description to clarify local operation, supported model (glm-asr-nano-2512), and no need for API keys or servers.
- Added relevant tags in metadata for better discovery.
- Improved description and title formatting for consistency.
- Minor clarifications to requirements and feature statements.

v1.0.2

- Minor documentation update: clarified the role of `mlx_audio` (“do the real job”) in the installation instructions.
- No functional or code changes in this release.

v1.0.1

- Added install.sh script for streamlined installation using Homebrew.
- Removed deprecated mlx-stt.sh script.
- Updated documentation: simplified requirements, added installation instructions, and clarified usage.
- Metadata now references Homebrew for dependency management.

v1.0.0

Initial release of mlx-stt.
- Transcribe audio files to text using MLX (Apple Silicon) and GLM-ASR.
- Provides both Python and Bash script options for running transcription.
- Outputs transcription results directly to the terminal.
- Requires Apple Silicon macOS, mlx, ffmpeg, and mlx_audio.generate.stt.

Metadata

Slug: mlx-stt

Version: 1.0.7

License: —

Total installs: 16

Current installs: 16

Versions: 8

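FAQ

What is MLX STT?

Local Speech-To-Text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512). It is an AI agent skill plugin for Claude Code / OpenClaw, with 3528 total downloads to date.

How do I install MLX STT?

Run the command "/install mlx-stt" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is MLX STT free?

Yes. MLX STT is completely free (free and open source) and can be freely downloaded, installed, and used.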

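Which platforms does MLX STT support?

MLX STT runs on darwin (macOS) only, since MLX requires Apple Silicon. It can be used in any such environment where OpenClaw / Claude Code is deployed.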

Who developed MLX STT?

It is developed and maintained by guoqiao (@guoqiao); the current version is v1.0.7.
