zxkane

Funasr Transcribe

Author: zxkane · GitHub · v1.6.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 294
Favorites: 0
Current installs: 0
Versions: 7
Install in OpenClaw
/install zxkane-audio-transcriber-funasr
Description
This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...
Safe-use recommendations
This skill appears to implement a real multi-speaker transcription pipeline and is coherent with its name, but it does things you should not run blind: it will install system packages and Python libraries, and it includes a script that patches an installed FunASR package file in site-packages (potentially system-wide and may require sudo). If you want to use it: (1) inspect patch_clustering.py and setup_env.sh line-by-line before running, (2) prefer running inside an isolated environment (container or VM) and as a non-root user, (3) avoid enabling the optional LLM cleanup (--model) if transcripts contain sensitive data unless you trust the chosen provider and have appropriate credentials, (4) note the skill references CLAUDE_PLUGIN_ROOT (ensure the platform provides that safely), and (5) if you are uncomfortable with automated patching of site-packages, run the transcribe pipeline manually after installing dependencies yourself or ask the author for a safer install mode. Because a prompt-injection pattern was detected in the SKILL.md, treat embedded instructions that affect agent behavior or system prompts with extra caution.
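Recommendation (2) above can be partly checked before anything is installed. The following is a minimal sketch, not part of the skill: the function names are hypothetical, and it only verifies that the interpreter is inside a virtual environment and is not running as root. It does not detect a container or VM, and a venv does not shield the system from `apt-get` or `brew` calls made by a setup script.

```python
from __future__ import annotations

import os
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def running_as_root() -> bool:
    """True when the effective user is root (POSIX only; False elsewhere)."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def preflight() -> list[str]:
    """Return a list of warnings; an empty list means both checks passed."""
    warnings = []
    if not in_virtualenv():
        warnings.append("not inside a virtual environment")
    if running_as_root():
        warnings.append("running as root")
    return warnings


if __name__ == "__main__":
    for warning in preflight():
        print(f"warning: {warning}")
```

Running this with the same interpreter that will later run the setup script gives a quick signal on whether pip installs and any site-packages patching would land in an isolated environment.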
Functional analysis
Type: OpenClaw Skill
Name: zxkane-audio-transcriber-funasr
Version: 1.6.0
The skill provides a sophisticated audio transcription pipeline but employs several high-risk techniques. Specifically, 'scripts/patch_clustering.py' modifies the source code of the installed 'funasr' library within the Python site-packages to optimize performance, which is a high-privilege 'monkey patching' operation. Additionally, 'scripts/setup_env.sh' uses 'sudo' for system package installation, and 'SKILL.md' contains instructions for the AI agent to use 'systemd-run' to launch detached processes that bypass agent-imposed execution timeouts. While these behaviors are documented as optimizations for long-duration audio processing, they constitute risky capabilities that exceed the typical scope of a benign skill.
Capability tags
crypto · requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description match the included code: scripts perform ASR, diarization, post‑processing, optional LLM 'cleanup', hotword biasing, and speaker gender inference — all coherent for a transcription skill. Notable capabilities: speaker gender classification and an explicit clustering patch (modifies FunASR internals) which are beyond a minimal transcribe helper but plausible for long-meeting support.
Instruction Scope
SKILL.md instructs running the bundled scripts (e.g., setup_env.sh and transcribe_funasr.py) and to set SCRIPTS=${CLAUDE_PLUGIN_ROOT}/... — CLAUDE_PLUGIN_ROOT is referenced but not declared in the skill metadata. The docs explicitly allow sending transcript excerpts to external LLM providers when --model is used (this is opt‑in and documented). The presence of a pre-scan 'system-prompt-override' signal in SKILL.md raises concern about prompt-injection attempts in the instructions content.
Install Mechanism
No registry install spec, but bundled setup_env.sh will: (1) attempt to install ffmpeg via apt-get or brew (may invoke sudo), (2) create a Python venv and pip install torch, funasr, modelscope, boto3, and (3) run patch_clustering.py which modifies files inside the installed FunASR package in site-packages. Installing packages via pip and system package managers is expected for this task, but the automated modification of an installed third‑party package is a higher‑risk action and should be inspected before running.
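One way to see exactly what an in-place patch changes is to hash the package's files before and after running it. The sketch below is an illustration under assumptions, not the skill's own tooling: `package_dir`, `snapshot`, and `changed_files` are hypothetical helper names, and only `.py` files are compared.

```python
from __future__ import annotations

import hashlib
import importlib.util
from pathlib import Path


def package_dir(name: str) -> Path:
    """Locate the on-disk directory of an installed package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{name} is not an installed package")
    return Path(list(spec.submodule_search_locations)[0])


def snapshot(root: Path) -> dict[str, str]:
    """Map each .py file under root (relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*.py")):
        digests[str(path.relative_to(root))] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two snapshots."""
    return sorted(k for k in before.keys() | after.keys() if before.get(k) != after.get(k))


# Intended use (assumes funasr is installed in the active environment):
#   before = snapshot(package_dir("funasr"))
#   ... run the patch script ...
#   print(changed_files(before, snapshot(package_dir("funasr"))))
```

Comparing the reported files against a diff of the patch script makes it easier to confirm that the modification is limited to what the documentation describes.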
Credentials
The skill declares no required env vars and lists optional LLM-related variables (AWS_REGION, ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENAI_BASE_URL) for the opt‑in LLM cleanup — this is proportionate. However the instructions reference CLAUDE_PLUGIN_ROOT (not declared) and the LLM code will, if used, rely on provider credentials (AWS credentials via standard chain or explicit API keys).
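Before enabling the opt-in LLM cleanup, it can help to see which of these optional variables are present in the current shell. This is a minimal sketch using only the variable names listed above; the helper name `llm_env_status` is hypothetical, and it reports presence only, never the values.

```python
from __future__ import annotations

import os

# Optional variables named in the skill's assessment; none are required.
LLM_ENV_VARS = ("AWS_REGION", "ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENAI_BASE_URL")


def llm_env_status(environ=None) -> dict[str, bool]:
    """Report which LLM-related variables are set (non-empty), without exposing values."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in LLM_ENV_VARS}


if __name__ == "__main__":
    for name, is_set in llm_env_status().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```

Note that Bedrock access through the standard AWS credential chain may work without any of these variables being set, so an all "not set" result does not by itself rule out external calls when `--model` is used.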
Persistence & Privilege
always:false and the skill does not request permanent inclusion, but the setup script writes to disk (venv, installed packages) and patch_clustering.py edits files in site-packages (system/global Python package). That system‑level modification is a persistence/privilege concern and may require sudo; it changes third‑party library code outside the skill directory.
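How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install zxkane-audio-transcriber-funasr
  3. Once installation completes, invoke the Skill by name or trigger it with /zxkane-audio-transcriber-funasr
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output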
Version history
v1.6.0
- Added speaker gender detection script (`scripts/speaker_gender.py`).
- Improved main transcription and speaker verification scripts.
- Documentation updated to reflect new features and changes.
v1.5.1
- Improved logic in transcribe_funasr.py for handling speaker verification and LLM cleanup stages.
- Enhanced speaker label verification in scripts/test_speaker_verification.py, including better mismatch detection and reporting.
- Updated documentation in SKILL.md to clarify workflow and usage, with small corrections and enhanced guidance.
v1.5.0
- Adds a clear warning and detection logic for incorrect use of podcast aliases in `--speakers`, prompting for the real host/guest name based on supporting materials.
- Updates SKILL.md documentation to explain the new requirement: `--speakers` must use actual real names, not show or podcast aliases.
- Guidance added for including both real names and aliases in `hotwords.txt` for better ASR recognition.
- No changes to skill APIs or setup; this update improves correctness of speaker labeling and overall transcript quality.
v1.4.1
- Added a new `test_speaker_verification.py` script for improved speaker verification testing.
- Updated documentation in SKILL.md to match current features and clarify supported workflows.
- Minor tweaks and improvements to `transcribe_funasr.py` to enhance transcription reliability and maintain consistency with speaker verification utilities.
v1.4.0
- LLM cleanup (Phase 3) is now opt-in and runs only when --model is specified; by default, all transcription is local and no data is sent to external services.
- Documentation clarifies privacy defaults: external LLMs are used only with --model, and Bedrock uses the AWS credential chain.
- Description updated to reflect that transcription triggers only when the user explicitly requests it.
- Example commands and workflow sections updated to reflect the new LLM cleanup activation method.
- No major behavioral or interface changes beyond LLM phase invocation and stricter privacy defaults.
v1.3.1
**Adds environment variable docs and improves setup automation.**
- Documents support for Anthropic, OpenAI-compatible, and Bedrock LLM providers via environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS_REGION, etc.)
- Updates SKILL.md to explain LLM provider credentials, API base URL, and privacy/credential isolation
- Adds "AUTO_YES=1" usage to environment setup for non-interactive installs
- No behavioral changes to transcription logic itself
v1.3.0
- Expanded language support: now handles Chinese, English, Japanese, Korean, Cantonese, and 99 languages (via Whisper), with automatic speaker diarization and hotword biasing.
- New, detailed workflow: guides users to provide context like meeting type, participant names, supporting documents, and preference for language and number of speakers to optimize transcription quality.
- Enhanced presets and diarization: per-language model selection with clear caveats on diarization support, especially for `auto` and `whisper` modes.
- LLM optional cleanup: supports post-processing transcripts with Bedrock, Anthropic, or OpenAI-compatible LLMs, with resume and skip options.
- Utility scripts included: speaker verification and reassignment script helps detect and fix swapped or misidentified speakers.
- Audio preprocessing improvements: all inputs auto-converted to 16kHz mono FLAC for reliability, with detailed format recommendations.
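The 16kHz mono FLAC preprocessing mentioned in the v1.3.0 notes can be reproduced by hand with ffmpeg. The sketch below only builds the command line; the exact options the skill passes are not shown in this listing, so the flags here are an assumption using standard ffmpeg options (`-ar` sample rate, `-ac` channel count, `-c:a` audio codec, `-n` never overwrite), and `flac_16k_mono_cmd` is a hypothetical helper name.

```python
from __future__ import annotations

from pathlib import Path


def flac_16k_mono_cmd(src, dst=None) -> list[str]:
    """Build an ffmpeg command that converts src to 16 kHz mono FLAC.

    If dst is omitted, the output is written next to src as <stem>.16k.flac.
    """
    src = Path(src)
    dst = Path(dst) if dst else src.with_name(src.stem + ".16k.flac")
    return [
        "ffmpeg",
        "-n",            # refuse to overwrite an existing output file
        "-i", str(src),  # input file in any format ffmpeg can decode
        "-ar", "16000",  # resample to 16 kHz
        "-ac", "1",      # downmix to a single channel
        "-c:a", "flac",  # encode losslessly as FLAC
        str(dst),
    ]


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(cmd, check=True) to execute it.
    print(" ".join(flac_16k_mono_cmd("meeting.mp3")))
```

Returning the argument list instead of a shell string avoids quoting problems with file names that contain spaces.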
Metadata
Slug zxkane-audio-transcriber-funasr
Version 1.6.0
License MIT-0
Total installs 0
Current installs 0
Versions 7
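FAQ

What is Funasr Transcribe?

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 294 total downloads to date.

How do I install Funasr Transcribe?

Run the command "/install zxkane-audio-transcriber-funasr" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Funasr Transcribe free?

Yes. Funasr Transcribe is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Funasr Transcribe support?

Funasr Transcribe is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Funasr Transcribe?

It is developed and maintained by zxkane (@zxkane); the current version is v1.6.0.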
