
Funasr Transcribe

Name: Funasr Transcribe
Author: zxkane

By zxkane · GitHub ↗ · v1.6.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 294

Install in OpenClaw

/install zxkane-audio-transcriber-funasr

Description

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...

Safe usage advice

This skill appears to implement a real multi-speaker transcription pipeline and is coherent with its name, but it does things you should not run blind: it installs system packages and Python libraries, and it includes a script that patches a file of the installed FunASR package in site-packages (potentially system-wide, which may require sudo). If you want to use it: (1) inspect patch_clustering.py and setup_env.sh line by line before running them; (2) prefer running inside an isolated environment (container or VM) and as a non-root user; (3) avoid enabling the optional LLM cleanup (--model) if transcripts contain sensitive data, unless you trust the chosen provider and have appropriate credentials; (4) note that the skill references CLAUDE_PLUGIN_ROOT, so make sure the platform provides it safely; and (5) if you are uncomfortable with automated patching of site-packages, install the dependencies yourself and run the transcription pipeline manually, or ask the author for a safer install mode. Because a prompt-injection pattern was detected in SKILL.md, treat embedded instructions that affect agent behavior or system prompts with extra caution.
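The advice to run in an isolated environment and as a non-root user can be checked before any setup script is launched. The following is a minimal sketch, not part of the skill: `isolation_report` is a hypothetical helper name, and it only detects a Python virtual environment and the effective user ID. It cannot tell whether it is running inside a container or VM.

```python
import os
import sys


def isolation_report() -> dict:
    """Report whether the interpreter runs in a venv and whether the user is root."""
    # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
    in_venv = sys.prefix != sys.base_prefix
    # os.geteuid does not exist on Windows; report None ("unknown") in that case.
    geteuid = getattr(os, "geteuid", None)
    is_root = (geteuid() == 0) if geteuid else None
    return {"in_venv": in_venv, "is_root": is_root}


if __name__ == "__main__":
    report = isolation_report()
    print(report)
    if not report["in_venv"] or report["is_root"]:
        print("Warning: not in a virtual environment, or running as root.")
```

If the report shows `in_venv` as false or `is_root` as true, it may be worth stopping and setting up a container, VM, or virtual environment first.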

Functional analysis

Type: OpenClaw Skill
Name: zxkane-audio-transcriber-funasr
Version: 1.6.0

The skill provides a sophisticated audio transcription pipeline but employs several high-risk techniques. Specifically, 'scripts/patch_clustering.py' modifies the source code of the installed 'funasr' library within the Python site-packages to optimize performance, which is a high-privilege 'monkey patching' operation. Additionally, 'scripts/setup_env.sh' uses 'sudo' for system package installation, and 'SKILL.md' contains instructions for the AI agent to use 'systemd-run' to launch detached processes that bypass agent-imposed execution timeouts. While these behaviors are documented as optimizations for long-duration audio processing, they constitute risky capabilities that exceed the typical scope of a benign skill.

Capability tags

crypto · requires-sensitive-credentials

Capability assessment

ℹ Purpose & Capability

Name/description match the included code: scripts perform ASR, diarization, post‑processing, optional LLM 'cleanup', hotword biasing, and speaker gender inference — all coherent for a transcription skill. Notable capabilities: speaker gender classification and an explicit clustering patch (modifies FunASR internals) which are beyond a minimal transcribe helper but plausible for long-meeting support.

⚠ Instruction Scope

SKILL.md instructs running the bundled scripts (e.g., setup_env.sh and transcribe_funasr.py) and to set SCRIPTS=${CLAUDE_PLUGIN_ROOT}/... — CLAUDE_PLUGIN_ROOT is referenced but not declared in the skill metadata. The docs explicitly allow sending transcript excerpts to external LLM providers when --model is used (this is opt‑in and documented). The presence of a pre-scan 'system-prompt-override' signal in SKILL.md raises concern about prompt-injection attempts in the instructions content.

⚠ Install Mechanism

No registry install spec, but bundled setup_env.sh will: (1) attempt to install ffmpeg via apt-get or brew (may invoke sudo), (2) create a Python venv and pip install torch, funasr, modelscope, boto3, and (3) run patch_clustering.py which modifies files inside the installed FunASR package in site-packages. Installing packages via pip and system package managers is expected for this task, but the automated modification of an installed third‑party package is a higher‑risk action and should be inspected before running.
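One way to inspect what an automated patch actually changes is to hash the package directory before and after running it. The sketch below assumes nothing about what patch_clustering.py does; `snapshot_hashes` and `diff_snapshots` are hypothetical helper names, and the directory to scan (for example, the folder of the installed funasr package) has to be supplied by the reader.

```python
import hashlib
from pathlib import Path


def snapshot_hashes(root: str) -> dict:
    """Map each file under root (as a relative path) to its SHA-256 digest."""
    base = Path(root)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(before: dict, after: dict) -> dict:
    """List files added, removed, or modified between two snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            name for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    }
```

Taking one snapshot before the patch step and one after it, then calling `diff_snapshots`, gives a list of touched files that can be reviewed or compared against a clean reinstall.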

ℹ Credentials

The skill declares no required env vars and lists optional LLM-related variables (AWS_REGION, ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENAI_BASE_URL) for the opt‑in LLM cleanup — this is proportionate. However the instructions reference CLAUDE_PLUGIN_ROOT (not declared) and the LLM code will, if used, rely on provider credentials (AWS credentials via standard chain or explicit API keys).
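Before enabling the opt-in LLM cleanup, it can help to see which of the optional variables listed above are present in the environment. This is a small sketch with a hypothetical helper name, `llm_env_status`. It reports only whether each variable is set and never prints the values, and it does not cover AWS credentials resolved through the standard credential chain.

```python
import os

# Optional variables named in the skill's review; none of them are required.
OPTIONAL_LLM_VARS = (
    "AWS_REGION",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
)


def llm_env_status(env=None) -> dict:
    """Return {variable name: True/False} without exposing any value."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in OPTIONAL_LLM_VARS}


if __name__ == "__main__":
    for name, is_set in llm_env_status().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```

A variable that is set here is only usable by the skill if `--model` is passed; with no `--model`, the review states that no data is sent to external services.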

⚠ Persistence & Privilege

always:false and the skill does not request permanent inclusion, but the setup script writes to disk (venv, installed packages) and patch_clustering.py edits files in site-packages (system/global Python package). That system‑level modification is a persistence/privilege concern and may require sudo; it changes third‑party library code outside the skill directory.

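How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install zxkane-audio-transcriber-funasr
Once installed, invoke the skill by name or trigger it with /zxkane-audio-transcriber-funasr
Provide the required inputs according to the skill's parameter documentation to get structured output.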

Version history

v1.6.0

- Added speaker gender detection script (`scripts/speaker_gender.py`).
- Improved main transcription and speaker verification scripts.
- Documentation updated to reflect new features and changes.

v1.5.1

- Improved logic in transcribe_funasr.py for handling speaker verification and LLM cleanup stages.
- Enhanced speaker label verification in scripts/test_speaker_verification.py, including better mismatch detection and reporting.
- Updated documentation in SKILL.md to clarify workflow and usage, with small corrections and enhanced guidance.

v1.5.0

- Adds a clear warning and detection logic for incorrect use of podcast aliases in `--speakers`, prompting for the real host/guest name based on supporting materials.
- Updates SKILL.md documentation to explain the new requirement: `--speakers` must use actual real names, not show or podcast aliases.
- Guidance added for including both real names and aliases in `hotwords.txt` for better ASR recognition.
- No changes to skill APIs or setup; this update improves correctness of speaker labeling and overall transcript quality.

v1.4.1

- Added a new `test_speaker_verification.py` script for improved speaker verification testing.
- Updated documentation in SKILL.md to match current features and clarify supported workflows.
- Minor tweaks and improvements to `transcribe_funasr.py` to enhance transcription reliability and maintain consistency with speaker verification utilities.

v1.4.0

- LLM cleanup (Phase 3) is now opt-in and runs only when --model is specified; by default, all transcription is local and no data is sent to external services.
- Documentation clarifies privacy defaults: external LLMs are used only with --model, and Bedrock uses the AWS credential chain.
- Description updated to reflect that transcription triggers only when the user explicitly requests it.
- Example commands and workflow sections updated to reflect the new LLM cleanup activation method.
- No major behavioral or interface changes beyond LLM phase invocation and stricter privacy defaults.
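The opt-in behaviour described for v1.4.0 can be illustrated with a short argument-parsing sketch. This is an assumption about how such gating might be wired, not the code in transcribe_funasr.py: only the `--model` flag comes from the changelog, while the positional `audio` argument and the names `build_parser` and `wants_llm_cleanup` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser where LLM cleanup is off unless --model is given."""
    parser = argparse.ArgumentParser(description="Opt-in LLM cleanup sketch")
    # Hypothetical positional argument, used only to make the sketch complete.
    parser.add_argument("audio", help="path to the audio file")
    # Default None means: stay fully local, send nothing to an external provider.
    parser.add_argument("--model", default=None, help="LLM identifier (opt-in)")
    return parser


def wants_llm_cleanup(argv: list) -> bool:
    """Return True only when the caller explicitly passed --model."""
    args = build_parser().parse_args(argv)
    return args.model is not None
```

With this shape, forgetting the flag fails safe: a call without `--model` never reaches the external cleanup phase.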

v1.3.1

**Adds environment variable docs and improves setup automation.**
- Documents support for Anthropic, OpenAI-compatible, and Bedrock LLM providers via environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS_REGION, etc.)
- Updates SKILL.md to explain LLM provider credentials, API base URL, and privacy/credential isolation
- Adds "AUTO_YES=1" usage to environment setup for non-interactive installs
- No behavioral changes to transcription logic itself

v1.3.0

- Expanded language support: now handles Chinese, English, Japanese, Korean, Cantonese, and 99 languages (via Whisper), with automatic speaker diarization and hotword biasing.
- New, detailed workflow: guides users to provide context like meeting type, participant names, supporting documents, and preference for language and number of speakers to optimize transcription quality.
- Enhanced presets and diarization: per-language model selection with clear caveats on diarization support, especially for `auto` and `whisper` modes.
- LLM optional cleanup: supports post-processing transcripts with Bedrock, Anthropic, or OpenAI-compatible LLMs, with resume and skip options.
- Utility scripts included: speaker verification and reassignment script helps detect and fix swapped or misidentified speakers.
- Audio preprocessing improvements: all inputs auto-converted to 16kHz mono FLAC for reliability, with detailed format recommendations.
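The 16 kHz mono FLAC preprocessing mentioned in the v1.3.0 notes can be reproduced by hand with ffmpeg. The sketch below is one way to do it and is not taken from the skill's scripts; `flac_command` and `convert` are hypothetical names, and the exact options the skill itself passes to ffmpeg are not shown in the source.

```python
import shutil
import subprocess


def flac_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that resamples to 16 kHz mono and encodes FLAC."""
    return [
        "ffmpeg",
        "-n",            # never overwrite an existing output file
        "-i", src,       # input file in any format ffmpeg can decode
        "-ar", "16000",  # output sample rate: 16 kHz
        "-ac", "1",      # output channels: mono
        "-c:a", "flac",  # audio codec: FLAC
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run the conversion, failing early if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(flac_command(src, dst), check=True)
```

Converting the audio yourself makes it possible to check the input file before handing it to any transcription step.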

Metadata

Slug zxkane-audio-transcriber-funasr

Version 1.6.0

License MIT-0

Total installs 0

Current installs 0

Versions 7

FAQ

What is Funasr Transcribe?

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 294 total downloads so far.

How do I install Funasr Transcribe?

Run the command "/install zxkane-audio-transcriber-funasr" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Funasr Transcribe free?

Yes. Funasr Transcribe is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Funasr Transcribe support?

Funasr Transcribe is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Funasr Transcribe?

It is developed and maintained by zxkane (@zxkane); the current version is v1.6.0.