Total downloads: 3528
Favorites: 1
Current installs: 16
Versions: 8
Install in OpenClaw
/install mlx-stt
Description
Local speech-to-text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).
Usage instructions (SKILL.md)
MLX STT
Local speech-to-text (ASR, transcription) with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).
Free and accurate. No API key required. No server required.
Requirements
- mlx: macOS with Apple Silicon
- brew: used to install dependencies if not available
Installation
bash ${baseDir}/install.sh
This script uses brew to install these CLI tools if they are not available:
- ffmpeg: converts the audio format when needed
- uv: installs Python packages and runs Python scripts
- mlx_audio: does the actual transcription
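The "install only if not available" check can be sketched as follows. This is a minimal illustration, not the contents of install.sh; the helper name `missing_tools` is hypothetical, and the commented brew loop assumes the Homebrew formula names match the tool names.

```shell
# missing_tools TOOL...
# Print the name of every listed tool that is not found on PATH.
# (Hypothetical helper for illustration; not taken from install.sh.)
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example (commented out so nothing is installed by accident):
# for tool in $(missing_tools ffmpeg uv); do
#   brew install "$tool"
# done
```

Checking with `command -v` first keeps the install step idempotent: tools that are already present are left untouched.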
Usage
To transcribe an audio file, run this script:
bash ${baseDir}/mlx-stt.sh <audio_file_path>
- The first run may be slow because the model has to be downloaded.
- The transcript result will be printed to stdout.
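Because the transcript goes to stdout, it can be redirected into a file. The wrapper below is a small sketch under that assumption; `transcribe_to_file` is a hypothetical helper name, and the script path is passed in as an argument because `${baseDir}` depends on where the skill is installed.

```shell
# transcribe_to_file SCRIPT AUDIO OUT
# Run the transcription script on AUDIO and save its stdout to OUT.
# (Hypothetical wrapper; SCRIPT would be the skill's mlx-stt.sh.)
transcribe_to_file() {
  script=$1
  audio=$2
  out=$3
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi
  # Only stdout is redirected, so any error messages stay visible.
  bash "$script" "$audio" > "$out"
}

# Example (commented out; the paths are placeholders):
# transcribe_to_file "${baseDir}/mlx-stt.sh" meeting.m4a meeting.txt
```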
Security recommendations
This skill appears to perform local STT as described, but exercise caution before installing:
- always:true is unnecessary for an on‑demand STT tool; prefer a skill that is not force‑enabled.
- install.sh runs 'uv tool install --force mlx-audio --prerelease=allow' — that will fetch and install a third‑party prerelease binary from an unspecified source. Ask the author for the exact upstream registry/URL and inspect that package before installing.
- The mlx_audio tool will download models at runtime (network activity). If you have sensitive data or need an auditable supply chain, run this in an isolated VM or disposable machine first.
- Because stdout/stderr are silenced for the tool, initial failures or unexpected network activity may be hidden; consider running the command manually without redirection to inspect behavior.
- If you decide to proceed, manually run the install script in a controlled environment, verify the origin of the 'uv' CLI and the 'mlx-audio' package, and avoid installing on a machine with sensitive secrets.
Additional information that would raise confidence to 'high': explicit upstream URLs or package registry details for 'uv' and 'mlx-audio', a signed release or checksum for the model/binary, and removal of always:true or an explanation why force‑enable is required.
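If the author were to publish a checksum for a model or binary, verifying it could look like the sketch below. The listing does not provide any checksum, so the expected digest is an input the user would have to obtain from a trusted source; the helper names `sha256_of` and `verify_sha256` are hypothetical.

```shell
# sha256_of FILE
# Print the SHA-256 hex digest of FILE, using sha256sum where available
# and falling back to shasum (shipped with macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# verify_sha256 FILE EXPECTED
# Succeed only if FILE's digest is non-empty and equals EXPECTED.
verify_sha256() {
  actual=$(sha256_of "$1")
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Example (commented out; file name and digest are placeholders):
# verify_sha256 downloaded-file "<digest published by the author>" \
#   || echo "checksum mismatch, do not install" >&2
```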
Functional analysis
Type: OpenClaw Skill
Name: mlx-stt
Version: 1.0.7
The skill bundle is designed for local Speech-To-Text on Apple Silicon. It uses `brew` and `uv` to install necessary dependencies like `ffmpeg` and `mlx-audio`, which are standard tools for audio processing and MLX-based operations. The `mlx-stt.sh` script converts audio to a suitable format using `ffmpeg` and then processes it with `mlx_audio.stt.generate`, printing the transcript to stdout. There is no evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the agent. All actions are directly aligned with the stated purpose of providing local STT functionality.
Capability assessment
Purpose & Capability
Name and description (local MLX-based STT on Apple Silicon) align with the provided scripts: ffmpeg + mlx_audio invocation to transcribe audio. Requiring brew on macOS to install ffmpeg/uv is reasonable for this purpose.
Instruction Scope
Runtime instructions and scripts only convert the provided audio to WAV, invoke mlx_audio.stt.generate, print transcript files, and clean up temporary files. The scripts do download a model at first run and the mlx_audio command's output is redirected to /dev/null (silenced), which hides runtime logs/errors — not clearly malicious but reduces transparency. The skill does not read unrelated files or request extra environment data.
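One way to reduce the transparency problem noted above is to capture the tool's output in a log file instead of discarding it. The sketch below illustrates that alternative; it is not how mlx-stt.sh currently works, and `run_logged` is a hypothetical helper name.

```shell
# run_logged LOG CMD [ARGS...]
# Run CMD with stdout and stderr captured in LOG rather than /dev/null.
# On failure, replay the log on stderr and return CMD's exit status.
run_logged() {
  log=$1
  shift
  if "$@" >"$log" 2>&1; then
    return 0
  else
    rc=$?
    echo "command failed with status $rc: $*" >&2
    cat "$log" >&2
    return "$rc"
  fi
}

# Example (commented out; the command stands in for the real tool call):
# run_logged /tmp/mlx-stt.log some_noisy_command --with args
```

A successful run stays quiet, while a failed run surfaces its log, so first-run download errors are no longer hidden.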
Install Mechanism
install.sh uses brew (expected) but relies on the 'uv' CLI to install 'mlx-audio' with --force and --prerelease=allow. 'uv' and the source/registry used for mlx-audio are not documented here; installing a force/prerelease package from an opaque source can deliver arbitrary code. The install does not download from a clearly identified, verifiable release URL (e.g., official GitHub release or known package registry with provenance shown).
Credentials
The skill declares no required environment variables or credentials and its scripts do not attempt to read other env vars or sensitive config paths — the requested environment access appears minimal and proportional.
Persistence & Privilege
Registry metadata sets always:true (force‑included in every agent run). A narrow, on‑demand STT skill does not reasonably need to be force‑enabled for all agents. Combined with the opaque install of a prerelease binary, this increases the blast radius if the installed tool were malicious or buggy.
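How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install mlx-stt
- After installation, invoke the Skill by name or trigger it with /mlx-stt.
- Provide the inputs described in the Skill's parameter documentation to get structured output.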
Version history
v1.0.7
- Added main script mlx-stt.sh for audio transcription; removed previous Python script mlx-stt.py.
- Updated usage instructions to use the new shell script instead of a Python command.
- Expanded triggers in SKILL.md for easier activation.
- Added version and author information to metadata.
- The skill still performs local speech-to-text on Apple Silicon with MLX and open-source models.
v1.0.6
- Removed the deprecation notice from documentation.
- Updated SKILL.md to indicate the skill is no longer deprecated and can be used normally.
v1.0.5
- Added a deprecation notice: this skill is no longer maintained.
- Recommended migrating to the replacement skill, mlx-audio-server, for improved functionality.
v1.0.4
- Clarified and simplified the skill description and title.
- Added information about initial model download from Hugging Face during first run.
- Improved formatting and fixed typos (e.g., "Transcibe" → "Transcribe").
- Emphasized that no API key or server is required.
- No changes to installation or usage commands.
v1.0.3
- Expanded description to clarify local operation, supported model (glm-asr-nano-2512), and no need for API keys or servers.
- Added relevant tags in metadata for better discovery.
- Improved description and title formatting for consistency.
- Minor clarifications to requirements and feature statements.
v1.0.2
- Minor documentation update: clarified the role of `mlx_audio` (“do the real job”) in the installation instructions.
- No functional or code changes in this release.
v1.0.1
- Added install.sh script for streamlined installation using Homebrew.
- Removed deprecated mlx-stt.sh script.
- Updated documentation: simplified requirements, added installation instructions, and clarified usage.
- Metadata now references Homebrew for dependency management.
v1.0.0
Initial release of mlx-stt
- Transcribe audio files to text using MLX (Apple Silicon) and GLM-ASR.
- Provides both Python and Bash script options for running transcription.
- Outputs transcription results directly to the terminal.
- Requires Apple Silicon macOS, mlx, ffmpeg, and mlx_audio.stt.generate.
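FAQ
What is MLX STT?
Local speech-to-text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 3528 total downloads to date.
How do I install MLX STT?
Run the command "/install mlx-stt" in the OpenClaw or Claude Code chat box to install it in one step, with no extra configuration.
Is MLX STT free?
Yes. MLX STT is completely free and open source, and can be downloaded, installed, and used freely.
Which platforms does MLX STT support?
MLX STT runs on macOS (darwin) with Apple Silicon, in any environment where OpenClaw / Claude Code is deployed.
Who developed MLX STT?
It is developed and maintained by guoqiao (@guoqiao). The current version is v1.0.7.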