
Local fun-asr-nano powered by sherpa-onnx

Name: Local fun-asr-nano powered by sherpa-onnx
Author: pengzhendong

Author: 彭震东 · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 118

Install in OpenClaw

/install fun-asr-nano

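Description

Fun-ASR-Nano is a local speech recognition skill built on the sherpa-onnx engine that runs fully offline. It is triggered when the user needs an audio file transcribed.

Usage instructions (SKILL.md)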

Fun-ASR-Nano

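Fun-ASR-Nano is a local speech recognition skill built on the sherpa-onnx engine. It runs fully offline with no network connection required, and supports speech-to-text for multiple languages and dialects.

Chinese coverage includes 7 dialects (Wu, Cantonese, Min, Hakka, Gan, Xiang, Jin) and 26 regional accents (Henan, Shanxi, Hubei, Sichuan, Chongqing, Yunnan, Guizhou, Guangdong, Guangxi, and more than 20 other regions).

English and Japanese cover multiple regional accents.

It also supports lyrics recognition and rap speech recognition.

Features:
- Runs locally, protecting privacy
- Offline recognition, no network required
- Supports multiple audio formats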

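Activation conditions

Trigger scenario	Description
User sends an audio file	`.wav` / `.mp3` / `.m4a` / `.flac` / `.ogg` and similar formats
User asks for a transcription	"转写音频" (transcribe audio), "语音转文字" (speech to text)
Audio file processing	The text content of an audio file needs to be extracted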
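
The first trigger in the table amounts to a check on the file extension. A minimal sketch of such a check follows; the function name `looks_like_audio` and the constant `AUDIO_EXTENSIONS` are hypothetical, and the set holds only the extensions the table names explicitly, so the skill itself may accept more.

```python
from pathlib import Path

# Extensions named in the activation table. The table says "and similar
# formats", so this set is a conservative assumption, not the full list.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}


def looks_like_audio(path: str) -> bool:
    """Return True when the file extension matches a listed audio format."""
    return Path(path).suffix.lower() in AUDIO_EXTENSIONS
```

The comparison is case-insensitive, so `talk.WAV` and `talk.wav` are treated the same way.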

Usage

Install dependencies

pip install sherpa-onnx soundfile modelscope
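
Before running the script, it can help to confirm that the three packages are importable in the active environment. The sketch below assumes the import names are `sherpa_onnx`, `soundfile`, and `modelscope`; the helper `missing_modules` is hypothetical and not part of the skill.

```python
import importlib.util

# Import names assumed to correspond to the three pip packages above.
REQUIRED_MODULES = ("sherpa_onnx", "soundfile", "modelscope")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_modules()` means all three modules were found; it does not verify their versions.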

Transcribe an audio file

python scripts/cli.py audio.wav
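
To transcribe several files, the same command can be invoked once per file. This is a rough sketch only: it assumes the CLI takes a single positional path, as in the command above, and that it prints the transcript to standard output, which the listing does not confirm. The helpers `build_command` and `transcribe_all` are hypothetical.

```python
import subprocess
import sys


def build_command(audio_path, script="scripts/cli.py"):
    """Build the argv list for one transcription run."""
    return [sys.executable, script, str(audio_path)]


def transcribe_all(audio_paths, script="scripts/cli.py"):
    """Run the CLI once per file and collect whatever it prints to stdout."""
    results = {}
    for path in audio_paths:
        completed = subprocess.run(
            build_command(path, script),
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        results[str(path)] = completed.stdout.strip()
    return results
```

Each run starts a fresh Python process, so the model is presumably loaded once per file; for large batches a single long-lived process would likely be faster.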

Version: 1.0.0 · Created: 2026-03-18

Security recommendations

This skill appears to implement local ASR, but it is misleading to call it 'completely offline' because the provided script uses modelscope.snapshot_download to fetch model weights at runtime. Before installing:
(1) expect network access to download model files and Python packages;
(2) confirm you are comfortable downloading and storing third-party model weights (check license and trustworthiness of the 'pengzhendong/Fun-ASR-Nano-int8' model on ModelScope);
(3) consider running the script in an isolated environment (VM/container) if you want to limit exposure;
(4) if you truly need offline operation without any network access, verify you already have the required model folder locally and modify the script to avoid snapshot_download;
(5) be aware that while the script does not transmit audio, downloading models involves network activity and writing files to disk, so review downloaded files and their origins.
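
One way to act on recommendation (4) is to resolve the model folder locally first and reach for the network only when explicitly allowed. The sketch below is not the skill's actual code: `resolve_model_dir`, its parameters, and the idea of a pre-populated `local_dir` are assumptions made for illustration. Only the model ID and the use of `snapshot_download` from `modelscope` come from the review text.

```python
from pathlib import Path

# Model ID quoted in the security note above.
MODEL_ID = "pengzhendong/Fun-ASR-Nano-int8"


def resolve_model_dir(local_dir, allow_download=False, downloader=None):
    """Return a model directory, touching the network only if allowed.

    local_dir is a hypothetical folder where the weights were placed
    beforehand; downloader is any callable mapping a model ID to a path.
    """
    path = Path(local_dir)
    if path.is_dir() and any(path.iterdir()):
        return path  # files already on disk: no network needed
    if not allow_download:
        raise FileNotFoundError(
            f"No model files in {path}; refusing to download in offline mode"
        )
    if downloader is None:
        # Imported lazily so the offline path never loads modelscope.
        from modelscope import snapshot_download as downloader
    return Path(downloader(MODEL_ID))
```

With `allow_download=False` (the default here), a missing model folder produces an error instead of a silent download, which makes the offline guarantee explicit. The check only confirms that the folder is non-empty, not that its contents are the expected model files.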

Functional analysis

Type: OpenClaw Skill
Name: fun-asr-nano
Version: 1.0.0

The skill bundle provides legitimate local speech-to-text functionality using the sherpa-onnx engine and Fun-ASR-Nano models. The script `scripts/cli.py` uses the standard `modelscope` library to download model weights and performs offline inference as described in `SKILL.md`. No indicators of data exfiltration, malicious execution, or harmful prompt injection were found.

Capability assessment

ℹ Purpose & Capability

The declared purpose (local/offline speech recognition using sherpa-onnx) matches the included code and CLI. However, the skill's claim of '完全离线运行' (completely offline) conflicts with the code that downloads model artifacts from ModelScope at runtime (snapshot_download).

⚠ Instruction Scope

SKILL.md omits mention that the script will fetch model files from a remote registry. The runtime instructions tell the user to pip install dependencies but do not explain the need for network access or where models will be fetched from. The CLI reads local audio files only (no direct exfiltration), but it will initiate network activity to download model snapshots.
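
To observe this network activity directly, Python-level socket connections can be made to fail loudly while the code under test runs. The context manager below is an illustrative sketch, similar in spirit to what socket-blocking test plugins do; `no_network` is a hypothetical name. It covers only connections made through `socket.socket.connect` in the current process, so subprocesses and native code that opens sockets without going through Python are not caught.

```python
import socket
from contextlib import contextmanager
from unittest import mock


@contextmanager
def no_network():
    """Make every Python-level socket connect attempt fail loudly."""

    def blocked(self, address):
        raise RuntimeError(f"network access attempted: {address!r}")

    # patch.object restores the original connect method on exit.
    with mock.patch.object(socket.socket, "connect", blocked):
        yield
```

Wrapping a transcription call in `with no_network():` would then turn any download attempt into a `RuntimeError` that names the target address, unless the calling library catches the exception itself.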

ℹ Install Mechanism

There is no formal install spec (instruction-only). The documentation asks users to pip install packages (normal), but the script uses modelscope.snapshot_download which will download model archives from ModelScope and write them to disk. The download source is an external model registry (ModelScope) rather than a bundled local model; this is higher-risk than a fully offline bundle but not itself malicious.
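
Because the model archives are written to disk from an external registry, a file inventory with checksums makes it easier to review what was downloaded and to detect later changes. The sketch below assumes the download location is known and passed in as `model_dir`; the function `inventory` is hypothetical.

```python
import hashlib
from pathlib import Path


def inventory(model_dir):
    """Map each file under model_dir to (size in bytes, SHA-256 hex digest)."""
    root = Path(model_dir)
    report = {}
    for file in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with file.open("rb") as handle:
            # Read in 1 MiB chunks so large weight files do not fill memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        report[file.relative_to(root).as_posix()] = (
            file.stat().st_size,
            digest.hexdigest(),
        )
    return report
```

Saving the result after the first download and comparing it on later runs shows whether any file was added, removed, or modified.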

✓ Credentials

The skill requests no environment variables, credentials, or privileged config paths. There are no apparent demands for unrelated secrets or system credentials.

✓ Persistence & Privilege

Flags show normal invocation (not always:true). The skill does not request permanent platform presence or attempt to modify other skills' configurations. It will download and store model files locally using modelscope, which is expected for model-heavy tools.

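How to use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install fun-asr-nano
3. Once installation finishes, call the Skill by name or trigger it with /fun-asr-nano
4. Provide the inputs described in the Skill's parameter notes to get structured output.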

Version history

v1.0.0

- Initial release of Fun-ASR-Nano (version 1.0.0).
- Provides fully offline, local speech-to-text based on the sherpa-onnx engine.
- Supports recognition for seven major Chinese dialects, 26 regional accents, as well as English and Japanese with multiple accents.
- Handles various audio formats such as .wav, .mp3, .m4a, .flac, and .ogg.
- Includes specialized support for lyrics and rap speech recognition.
- Focused on privacy by running entirely offline without requiring any network connection.

Metadata

Slug: fun-asr-nano

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Versions: 1

FAQ