limboinf

Funasr Transcribe Skill

Author: limbo · GitHub · v1.0.1 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 751
Favorites: 0
Current installs: 0
Versions: 2
Install in OpenClaw
/install funasr-transcribe-skill
Description
Use when the user needs local speech-to-text transcription for audio files, especially Chinese or mixed Chinese-English audio, without relying on cloud trans...
Usage (SKILL.md)

FunASR Transcribe

Local speech-to-text for audio files using FunASR. It is best suited to Chinese and mixed Chinese-English audio, runs on the local machine, and does not require a paid transcription API.

When to Use

  • The user wants to transcribe .wav, .ogg, .mp3, .flac, or .m4a files into text.
  • The user prefers local ASR over cloud speech APIs for privacy, cost, or offline-friendly workflows.
  • The audio is primarily Chinese, dialect-heavy Chinese, or mixed Chinese-English.
  • The user is okay with installing Python dependencies and downloading models on first use.

Do not use this skill when the user explicitly forbids local dependency installation or any network access for dependency/model download.
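
As a rough illustration of the first criterion above, an agent could check the file extension before invoking the skill. This is a minimal sketch: the names `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical and are not part of the skill's scripts.

```python
from pathlib import Path

# Extensions listed under "When to Use". The constant name is hypothetical.
SUPPORTED_EXTENSIONS = {".wav", ".ogg", ".mp3", ".flac", ".m4a"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one the skill lists as supported."""
    # Compare case-insensitively so "CLIP.OGG" is treated like "clip.ogg".
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the file name; it does not verify that the file actually contains decodable audio.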

Quick Start

# Install dependencies and create a virtual environment
bash ~/.openclaw/workspace/skills/funasr-transcribe/scripts/install.sh

# Transcribe an audio file
bash ~/.openclaw/workspace/skills/funasr-transcribe/scripts/transcribe.sh /path/to/audio.ogg
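
The transcribe command above can also be driven from Python. The sketch below assumes the default install location shown in Quick Start; `SKILL_DIR`, `build_transcribe_command`, and `run_transcribe` are hypothetical names, and stdout may contain log lines in addition to the transcript.

```python
import subprocess
from pathlib import Path
from typing import List

# Default skill location taken from the Quick Start commands (an assumption
# if your OpenClaw workspace lives elsewhere).
SKILL_DIR = Path.home() / ".openclaw" / "workspace" / "skills" / "funasr-transcribe"


def build_transcribe_command(audio_path: str) -> List[str]:
    """Build the argv list equivalent to the Quick Start transcribe command."""
    return ["bash", str(SKILL_DIR / "scripts" / "transcribe.sh"), audio_path]


def run_transcribe(audio_path: str) -> str:
    """Run transcribe.sh and return whatever it printed to stdout."""
    result = subprocess.run(
        build_transcribe_command(audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the audio path contains spaces.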

What It Does

  • Creates a Python virtual environment at ~/.openclaw/workspace/funasr_env by default.
  • Installs funasr, torch, torchaudio, modelscope, and related dependencies.
  • Loads FunASR models locally and writes the transcript to a sibling .txt file.
  • Prints the transcript to stdout for direct CLI use.
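
The "sibling .txt file" behavior can be pictured as follows. This is a sketch under one assumption: that the transcript replaces the audio extension (`audio.ogg` becomes `audio.txt`) rather than appending to it; the source does not say which. The function names are hypothetical.

```python
from pathlib import Path


def sibling_transcript_path(audio_path: str) -> Path:
    """Return the .txt path next to the audio file (assumed naming scheme)."""
    return Path(audio_path).with_suffix(".txt")


def save_transcript(audio_path: str, text: str) -> Path:
    """Write the transcript beside the audio file and return the output path."""
    out_path = sibling_transcript_path(audio_path)
    # UTF-8 keeps Chinese text intact regardless of the platform default.
    out_path.write_text(text, encoding="utf-8")
    return out_path
```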

Models

  • ASR: damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
  • VAD: damo/speech_fsmn_vad_zh-cn-16k-common-pytorch
  • Punctuation: damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch
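
For orientation, the three models above can be combined through FunASR's `AutoModel` interface roughly as shown below. This is a sketch, not the skill's actual `transcribe.py`: the helper names and the `MODEL_IDS` dictionary are hypothetical, and it assumes a FunASR 1.x installation whose `AutoModel` accepts `model`, `vad_model`, `punc_model`, and `device`.

```python
# Model identifiers copied from the list above; the dict name is hypothetical.
MODEL_IDS = {
    "model": "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    "vad_model": "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    "punc_model": "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
}


def load_pipeline(device: str = "cpu"):
    """Load ASR + VAD + punctuation models (downloads them on first use)."""
    # Imported lazily so this module can be imported without funasr installed.
    from funasr import AutoModel

    return AutoModel(device=device, **MODEL_IDS)


def transcribe_file(pipeline, audio_path: str) -> str:
    """Run recognition on one file and return the recognized text."""
    results = pipeline.generate(input=audio_path)
    # generate() returns a list of result dicts; the text is under "text".
    return results[0]["text"] if results else ""
```

Typical use would be `transcribe_file(load_pipeline(), "/path/to/audio.ogg")`; the first call to `load_pipeline` triggers the model downloads described under External Endpoints.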

External Endpoints

| Endpoint | Purpose | Data sent |
| --- | --- | --- |
| https://pypi.tuna.tsinghua.edu.cn/simple | Install Python packages during setup | Package names and installer metadata requested by pip |
| ModelScope and/or Hugging Face endpoints used by FunASR dependencies | Download model files on first run | Model identifiers and standard HTTP request metadata |

Security & Privacy

  • Audio files are read from the local machine and processed locally by FunASR.
  • The transcription flow does not intentionally upload audio content to a cloud ASR API.
  • Network access is still required during setup and first-run model download.
  • The generated transcript is written to a local .txt file next to the source audio unless the write step fails.
  • This skill does not require API keys or other secrets by default.

Model Invocation Note

Autonomous invocation is normal for this skill. If a user asks to transcribe local audio, an agent may install dependencies and run the helper scripts unless the user explicitly opts out of dependency installation or network access.

Trust Statement

By using this skill, package and model downloads may be fetched from third-party upstream sources such as the configured PyPI mirror and model hosting providers. Only install and use this skill if you trust those upstream sources.

Troubleshooting

  • python3 not found: install Python 3.7+ and rerun scripts/install.sh.
  • Install fails in the existing environment: rerun scripts/install.sh --force to recreate the virtual environment.
  • First transcription is slow: initial model downloads can take several minutes.
  • GPU is desired: edit scripts/transcribe.py and change device="cpu" to a CUDA device after installing the correct CUDA build.
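
If you do edit the device setting, one option is to choose the device at runtime instead of hard-coding it. The sketch below is an assumption about how such a change could look, not the contents of `scripts/transcribe.py`; `choose_device` and `detect_cuda` are hypothetical helpers.

```python
def choose_device(cuda_available: bool, gpu_index: int = 0) -> str:
    """Return a PyTorch-style device string, falling back to CPU."""
    return f"cuda:{gpu_index}" if cuda_available else "cpu"


def detect_cuda() -> bool:
    """Report whether a CUDA-enabled torch build can see a GPU."""
    try:
        import torch  # only present after install.sh has run
    except ImportError:
        return False
    return torch.cuda.is_available()
```

Calling `choose_device(detect_cuda())` yields `"cuda:0"` only when a CUDA build of torch detects a GPU, so the same script keeps working on CPU-only machines.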
Safe-Use Recommendations
This skill appears to do what it claims: it will create a Python venv, pip-install funasr/torch/modelscope (from the Tsinghua PyPI mirror), and download models on first run. Before installing, consider:
  • You must allow network access for package and model downloads.
  • Large packages and model files can consume disk space and take time.
  • The script uses the Tsinghua PyPI mirror; only proceed if you trust that mirror and the upstream model providers (ModelScope/Hugging Face).
  • The skill does not request credentials or exfiltrate data, but if you prefer, run install.sh and transcribe.sh manually in an isolated environment (container or VM) to review behavior first.
  • If you need offline/no-network guarantees, do not install or run the scripts until models are pre-provisioned locally.
Functional Analysis
Type: OpenClaw Skill
Name: funasr-transcribe-skill
Version: 1.0.1
The skill provides local speech-to-text functionality using the Alibaba FunASR library. The provided scripts (install.sh, transcribe.sh, and transcribe.py) perform standard environment setup and local inference as described, with transparent disclosure of external network access for downloading models and Python packages via a well-known PyPI mirror. No evidence of data exfiltration, persistence, or malicious prompt injection was found.
Capability Assessment
Purpose & Capability
Name/description promise (local ASR for Chinese/mixed audio) matches the included scripts and Python code: they create a venv, install funasr/torch/modelscope, load FunASR models, transcribe a provided audio file, print and write a sibling .txt. No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
SKILL.md and scripts only instruct creating a venv, installing Python deps, loading models, reading the specified audio file, and writing a .txt alongside it. There are no instructions to read arbitrary host files, access unrelated env vars, or send audio/text to third-party endpoints beyond model/package hosting used during install/first-run.
Install Mechanism
Installation is a shell script that uses pip to install packages from a Tsinghua PyPI mirror and leaves model downloads to first-run. This is an expected method for Python-based local inference but does require network access and installs heavyweight packages (torch). No arbitrary binary downloads or obscure URLs are used, but model hosting (ModelScope/Hugging Face) will perform further downloads on first run.
Credentials
The skill requests no credentials or special environment variables; scripts only read HOME to locate ~/.openclaw/workspace/funasr_env. No secrets or unrelated service keys are required.
Persistence & Privilege
The skill is not marked always:true and does not modify other skills or system-wide settings. It writes a virtual environment and cached models into the user's workspace (~/.openclaw/workspace/funasr_env) and writes transcript files next to source audio — this is proportionate for a local transcription skill.
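How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install funasr-transcribe-skill
  3. Once installation finishes, invoke the skill by name or trigger it with /funasr-transcribe-skill
  4. Provide the inputs described in the skill's parameter notes to get structured output.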
Version History
v1.0.1
  • Added English and Chinese README files.
  • Rewrote and expanded SKILL.md with more detailed usage, privacy, security, and troubleshooting information.
  • Clarified which external endpoints are accessed and under what conditions.
  • Updated scripts and documentation for improved guidance and user experience.
v1.0.0
  • Initial release: local Chinese speech recognition tool.
Metadata
Slug: funasr-transcribe-skill
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 2
FAQ

What is Funasr Transcribe Skill?

Use when the user needs local speech-to-text transcription for audio files, especially Chinese or mixed Chinese-English audio, without relying on cloud trans... It is an AI agent skill plugin for Claude Code / OpenClaw, with 751 total downloads to date.

How do I install Funasr Transcribe Skill?

Run the command "/install funasr-transcribe-skill" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Funasr Transcribe Skill free?

Yes. Funasr Transcribe Skill is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Funasr Transcribe Skill support?

Funasr Transcribe Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Funasr Transcribe Skill?

It is developed and maintained by limbo (@limboinf); the current version is v1.0.1.
