Downloads: 294
Stars: 0
Active Installs: 0
Versions: 7
Install in OpenClaw
/install zxkane-audio-transcriber-funasr
Description
This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...
Usage Guidance
This skill appears to implement a real multi-speaker transcription pipeline and is coherent with its name, but it does things you should not run blind: it installs system packages and Python libraries, and it includes a script that patches an installed FunASR package file in site-packages (potentially system-wide, which may require sudo). If you want to use it:
- Inspect patch_clustering.py and setup_env.sh line by line before running them.
- Prefer running inside an isolated environment (container or VM) and as a non-root user.
- Avoid enabling the optional LLM cleanup (--model) if transcripts contain sensitive data, unless you trust the chosen provider and have appropriate credentials.
- Note that the skill references CLAUDE_PLUGIN_ROOT; ensure the platform provides it safely.
- If you are uncomfortable with automated patching of site-packages, run the transcription pipeline manually after installing dependencies yourself, or ask the author for a safer install mode.
Because a prompt-injection pattern was detected in SKILL.md, treat embedded instructions that affect agent behavior or system prompts with extra caution.
Capability Analysis
Type: OpenClaw Skill
Name: zxkane-audio-transcriber-funasr
Version: 1.6.0
The skill provides a sophisticated audio transcription pipeline but employs several high-risk techniques. Specifically, 'scripts/patch_clustering.py' modifies the source code of the installed 'funasr' library within the Python site-packages to optimize performance, which is a high-privilege 'monkey patching' operation. Additionally, 'scripts/setup_env.sh' uses 'sudo' for system package installation, and 'SKILL.md' contains instructions for the AI agent to use 'systemd-run' to launch detached processes that bypass agent-imposed execution timeouts. While these behaviors are documented as optimizations for long-duration audio processing, they constitute risky capabilities that exceed the typical scope of a benign skill.
Capability Assessment
Purpose & Capability
Name/description match the included code: scripts perform ASR, diarization, post‑processing, optional LLM 'cleanup', hotword biasing, and speaker gender inference — all coherent for a transcription skill. Notable capabilities: speaker gender classification and an explicit clustering patch (modifies FunASR internals) which are beyond a minimal transcribe helper but plausible for long-meeting support.
Instruction Scope
SKILL.md gives instructions to run the bundled scripts (e.g., setup_env.sh and transcribe_funasr.py) and to set SCRIPTS=${CLAUDE_PLUGIN_ROOT}/... (CLAUDE_PLUGIN_ROOT is referenced but not declared in the skill metadata). The docs explicitly allow sending transcript excerpts to external LLM providers when --model is used (this is opt‑in and documented). A pre-scan 'system-prompt-override' signal in SKILL.md raises concern about prompt-injection attempts in the instruction content.
Install Mechanism
No registry install spec, but bundled setup_env.sh will: (1) attempt to install ffmpeg via apt-get or brew (may invoke sudo), (2) create a Python venv and pip install torch, funasr, modelscope, boto3, and (3) run patch_clustering.py which modifies files inside the installed FunASR package in site-packages. Installing packages via pip and system package managers is expected for this task, but the automated modification of an installed third‑party package is a higher‑risk action and should be inspected before running.
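For readers who would rather install the dependencies themselves than run setup_env.sh, the venv step can be sketched as follows. This is a minimal sketch, not the skill's own setup logic: it assumes Python 3 with the standard `venv` module, the directory name `.venv-funasr` and all function names are hypothetical, and the package list is copied from the assessment above without pinned versions. It deliberately does not install ffmpeg or run patch_clustering.py.

```python
import os
import subprocess
import venv
from pathlib import Path
from typing import List

# Package names as listed in the assessment above; versions are not pinned here.
PACKAGES = ["torch", "funasr", "modelscope", "boto3"]


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir: Path) -> List[str]:
    """Build the pip command so it can be reviewed before it is run."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", *PACKAGES]


def create_isolated_env(env_dir: Path, install: bool = False) -> List[str]:
    """Create the venv; install packages only when explicitly requested."""
    venv.create(env_dir, with_pip=True)
    cmd = pip_install_command(env_dir)
    if install:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical directory name; this only prints the command, it runs nothing.
    print(" ".join(pip_install_command(Path(".venv-funasr"))))
```

Calling `create_isolated_env(Path(".venv-funasr"), install=True)` would perform the install. Because everything lands inside the venv directory, no sudo is needed for the Python packages.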
Credentials
The skill declares no required env vars and lists optional LLM-related variables (AWS_REGION, ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENAI_BASE_URL) for the opt‑in LLM cleanup — this is proportionate. However the instructions reference CLAUDE_PLUGIN_ROOT (not declared) and the LLM code will, if used, rely on provider credentials (AWS credentials via standard chain or explicit API keys).
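Before enabling the opt-in LLM cleanup, it can help to see which of the optional variables are present in the current shell. The following is a small sketch under that assumption; the function name `llm_env_status` is hypothetical, and it reports only whether each variable is set, never its value. It does not cover AWS credentials resolved through the standard credential chain.

```python
import os

# Optional variables named in the skill's documentation for the opt-in LLM cleanup.
OPTIONAL_LLM_VARS = (
    "AWS_REGION",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
)


def llm_env_status(environ=None):
    """Map each optional variable to True/False without exposing its value."""
    env = os.environ if environ is None else environ
    # Empty strings count as "not set".
    return {name: bool(env.get(name)) for name in OPTIONAL_LLM_VARS}


if __name__ == "__main__":
    for name, is_set in llm_env_status().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```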
Persistence & Privilege
always:false and the skill does not request permanent inclusion, but the setup script writes to disk (venv, installed packages) and patch_clustering.py edits files in site-packages (system/global Python package). That system‑level modification is a persistence/privilege concern and may require sudo; it changes third‑party library code outside the skill directory.
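One way to see exactly what a patch script changes in an installed package is to hash the package's files before and after running it. The sketch below illustrates that general technique; it is not part of the skill, the function names are hypothetical, and it assumes the package is importable by name (for example `funasr`) in the interpreter you run it with.

```python
import hashlib
import importlib.util
from pathlib import Path


def snapshot(package_dir):
    """Map each file's relative path to its SHA-256 digest, skipping bytecode caches."""
    root = Path(package_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file() and "__pycache__" not in p.parts
    }


def changed_files(before, after):
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )


def package_dir(name):
    """Locate an installed top-level package's directory, or None if it is absent."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])
```

A typical use would be to call `snapshot(package_dir("funasr"))` once before and once after running the patch script, then pass both results to `changed_files` and review the listed files by hand.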
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install zxkane-audio-transcriber-funasr
- After installation, invoke the skill by name or use /zxkane-audio-transcriber-funasr
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.6.0
- Added speaker gender detection script (`scripts/speaker_gender.py`).
- Improved main transcription and speaker verification scripts.
- Documentation updated to reflect new features and changes.
v1.5.1
- Improved logic in transcribe_funasr.py for handling speaker verification and LLM cleanup stages.
- Enhanced speaker label verification in scripts/test_speaker_verification.py, including better mismatch detection and reporting.
- Updated documentation in SKILL.md to clarify workflow and usage, with small corrections and enhanced guidance.
v1.5.0
- Adds a clear warning and detection logic for incorrect use of podcast aliases in `--speakers`, prompting for the real host/guest name based on supporting materials.
- Updates SKILL.md documentation to explain the new requirement: `--speakers` must use actual real names, not show or podcast aliases.
- Guidance added for including both real names and aliases in `hotwords.txt` for better ASR recognition.
- No changes to skill APIs or setup; this update improves correctness of speaker labeling and overall transcript quality.
v1.4.1
- Added a new `test_speaker_verification.py` script for improved speaker verification testing.
- Updated documentation in SKILL.md to match current features and clarify supported workflows.
- Minor tweaks and improvements to `transcribe_funasr.py` to enhance transcription reliability and maintain consistency with speaker verification utilities.
v1.4.0
- LLM cleanup (Phase 3) is now opt-in and runs only when --model is specified; by default, all transcription is local and no data is sent to external services.
- Documentation clarifies privacy defaults: external LLMs are used only with --model, and Bedrock uses the AWS credential chain.
- Description updated to reflect that transcription triggers only when the user explicitly requests it.
- Example commands and workflow sections updated to reflect the new LLM cleanup activation method.
- No major behavioral or interface changes beyond LLM phase invocation and stricter privacy defaults.
v1.3.1
**Adds environment variable docs and improves setup automation.**
- Documents support for Anthropic, OpenAI-compatible, and Bedrock LLM providers via environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS_REGION, etc.)
- Updates SKILL.md to explain LLM provider credentials, API base URL, and privacy/credential isolation
- Adds "AUTO_YES=1" usage to environment setup for non-interactive installs
- No behavioral changes to transcription logic itself
v1.3.0
- Expanded language support: now handles Chinese, English, Japanese, Korean, Cantonese, and 99 languages (via Whisper), with automatic speaker diarization and hotword biasing.
- New, detailed workflow: guides users to provide context like meeting type, participant names, supporting documents, and preference for language and number of speakers to optimize transcription quality.
- Enhanced presets and diarization: per-language model selection with clear caveats on diarization support, especially for `auto` and `whisper` modes.
- LLM optional cleanup: supports post-processing transcripts with Bedrock, Anthropic, or OpenAI-compatible LLMs, with resume and skip options.
- Utility scripts included: speaker verification and reassignment script helps detect and fix swapped or misidentified speakers.
- Audio preprocessing improvements: all inputs auto-converted to 16kHz mono FLAC for reliability, with detailed format recommendations.
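The 16 kHz mono FLAC conversion mentioned in the last item can be approximated with a single ffmpeg call. This is a minimal sketch, assuming ffmpeg is on PATH; the function names and the `_16k.flac` output naming are hypothetical, and the skill's own preprocessing may use different options.

```python
import subprocess
from pathlib import Path


def flac_convert_command(src, dst):
    """Build an ffmpeg command that resamples the input to 16 kHz mono FLAC."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", str(src),    # input file in any format ffmpeg can decode
        "-vn",             # drop any video stream
        "-ar", "16000",    # sample rate: 16 kHz
        "-ac", "1",        # channels: mono
        "-c:a", "flac",    # encode audio as FLAC
        str(dst),
    ]


def convert_to_flac(src):
    """Run the conversion and return the output path, placed next to the input."""
    src = Path(src)
    dst = src.with_name(src.stem + "_16k.flac")
    subprocess.run(flac_convert_command(src, dst), check=True)
    return dst
```

Building the argument list separately from running it keeps the command easy to inspect or log before any file is written.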
Frequently Asked Questions
What is Funasr Transcribe?
Funasr Transcribe is an AI Agent Skill for Claude Code / OpenClaw, with 294 downloads so far. According to its description, the skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...
How do I install Funasr Transcribe?
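Run "/install zxkane-audio-transcriber-funasr" in the OpenClaw or Claude Code chat to install the skill in one step. Its dependencies (ffmpeg, plus a Python venv with torch, funasr, modelscope, and boto3) are installed separately by the bundled setup_env.sh.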
Is Funasr Transcribe free?
Yes, Funasr Transcribe is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Funasr Transcribe support?
Funasr Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Funasr Transcribe?
It is built and maintained by zxkane (@zxkane); the current version is v1.6.0.