Funasr Transcribe

Name: Funasr Transcribe
Author: zxkane

by zxkane · GitHub ↗ · v1.6.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 294

Install in OpenClaw

/install zxkane-audio-transcriber-funasr

Description

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...

Usage Guidance

This skill appears to implement a real multi-speaker transcription pipeline and is coherent with its name, but it should not be run without review: it installs system packages and Python libraries, and it includes a script that patches a file of the installed FunASR package in site-packages (potentially system-wide, which may require sudo). If you want to use it:
(1) inspect patch_clustering.py and setup_env.sh line by line before running;
(2) prefer running inside an isolated environment (container or VM) and as a non-root user;
(3) avoid enabling the optional LLM cleanup (--model) if transcripts contain sensitive data, unless you trust the chosen provider and have appropriate credentials;
(4) note that the skill references CLAUDE_PLUGIN_ROOT, and make sure the platform provides it safely;
(5) if you are uncomfortable with automated patching of site-packages, install the dependencies yourself and run the transcription pipeline manually, or ask the author for a safer install mode.
Because a prompt-injection pattern was detected in SKILL.md, treat embedded instructions that affect agent behavior or system prompts with extra caution.

Capability Analysis

Type: OpenClaw Skill
Name: zxkane-audio-transcriber-funasr
Version: 1.6.0

The skill provides a sophisticated audio transcription pipeline but employs several high-risk techniques. Specifically, 'scripts/patch_clustering.py' modifies the source code of the installed 'funasr' library within the Python site-packages to optimize performance, which is a high-privilege 'monkey patching' operation. Additionally, 'scripts/setup_env.sh' uses 'sudo' for system package installation, and 'SKILL.md' contains instructions for the AI agent to use 'systemd-run' to launch detached processes that bypass agent-imposed execution timeouts. While these behaviors are documented as optimizations for long-duration audio processing, they constitute risky capabilities that exceed the typical scope of a benign skill.

Capability Tags

crypto · requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

Name/description match the included code: scripts perform ASR, diarization, post‑processing, optional LLM 'cleanup', hotword biasing, and speaker gender inference — all coherent for a transcription skill. Notable capabilities: speaker gender classification and an explicit clustering patch (modifies FunASR internals) which are beyond a minimal transcribe helper but plausible for long-meeting support.

⚠ Instruction Scope

SKILL.md instructs the agent to run the bundled scripts (e.g., setup_env.sh and transcribe_funasr.py) and to set SCRIPTS=${CLAUDE_PLUGIN_ROOT}/... ; CLAUDE_PLUGIN_ROOT is referenced but not declared in the skill metadata. The docs explicitly allow sending transcript excerpts to external LLM providers when --model is used (this is opt‑in and documented). A pre-scan 'system-prompt-override' signal in SKILL.md raises concern about prompt-injection attempts in the instruction content.

⚠ Install Mechanism

No registry install spec, but bundled setup_env.sh will: (1) attempt to install ffmpeg via apt-get or brew (may invoke sudo), (2) create a Python venv and pip install torch, funasr, modelscope, boto3, and (3) run patch_clustering.py which modifies files inside the installed FunASR package in site-packages. Installing packages via pip and system package managers is expected for this task, but the automated modification of an installed third‑party package is a higher‑risk action and should be inspected before running.
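One way to inspect what the patch step actually touches is to hash the package directory before and after running the setup script and compare the results. The following is a minimal sketch, not part of the skill: the helper names `snapshot` and `changed_files` are hypothetical, and it assumes you point `snapshot` at the directory where FunASR is installed.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Taking one snapshot before the patch runs and one after, then passing both to `changed_files`, lists exactly which files to review by hand.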

ℹ Credentials

The skill declares no required env vars and lists optional LLM-related variables (AWS_REGION, ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENAI_BASE_URL) for the opt‑in LLM cleanup; this is proportionate. However, the instructions reference CLAUDE_PLUGIN_ROOT (not declared), and the LLM code will, if used, rely on provider credentials (AWS credentials via the standard chain or explicit API keys).
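Before enabling the optional cleanup, it can help to see which of these variables are present in the current shell without printing their values. This is an illustrative sketch only: `credential_report` is a hypothetical helper, and the variable list is taken from the listing above rather than from the skill's code.

```python
import os

# Optional variables named in the listing; none are required for local transcription.
OPTIONAL_LLM_VARS = (
    "AWS_REGION",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
)


def credential_report(env=None):
    """Return {name: bool} telling whether each variable is set and non-empty."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in OPTIONAL_LLM_VARS}


if __name__ == "__main__":
    # Print only set / not set, never the secret values themselves.
    for name, is_set in credential_report().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```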

⚠ Persistence & Privilege

The skill sets always:false and does not request permanent inclusion, but the setup script writes to disk (venv, installed packages) and patch_clustering.py edits files in site-packages (potentially the system/global Python installation). That modification is a persistence/privilege concern and may require sudo; it changes third‑party library code outside the skill directory.
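Whether a site-packages edit is contained or system-wide depends on the interpreter that runs it. A minimal sketch for checking this with the standard library follows; the function names are hypothetical and nothing here is taken from the skill's scripts.

```python
import sys
import sysconfig


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def site_packages_dir() -> str:
    """Directory where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("virtual environment:", in_virtualenv())
    print("site-packages:", site_packages_dir())
```

If `in_virtualenv()` is false, any patch applied by this interpreter lands in a shared Python installation, which is the case the assessment above warns about.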

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install zxkane-audio-transcriber-funasr
3. After installation, invoke the skill by name or use /zxkane-audio-transcriber-funasr
4. Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.6.0

- Added speaker gender detection script (`scripts/speaker_gender.py`).
- Improved main transcription and speaker verification scripts.
- Documentation updated to reflect new features and changes.

v1.5.1

- Improved logic in transcribe_funasr.py for handling speaker verification and LLM cleanup stages.
- Enhanced speaker label verification in scripts/test_speaker_verification.py, including better mismatch detection and reporting.
- Updated documentation in SKILL.md to clarify workflow and usage, with small corrections and enhanced guidance.

v1.5.0

- Adds a clear warning and detection logic for incorrect use of podcast aliases in `--speakers`, prompting for the real host/guest name based on supporting materials.
- Updates SKILL.md documentation to explain the new requirement: `--speakers` must use actual real names, not show or podcast aliases.
- Guidance added for including both real names and aliases in `hotwords.txt` for better ASR recognition.
- No changes to skill APIs or setup; this update improves correctness of speaker labeling and overall transcript quality.

v1.4.1

- Added a new `test_speaker_verification.py` script for improved speaker verification testing.
- Updated documentation in SKILL.md to match current features and clarify supported workflows.
- Minor tweaks and improvements to `transcribe_funasr.py` to enhance transcription reliability and maintain consistency with speaker verification utilities.

v1.4.0

- LLM cleanup (Phase 3) is now opt-in and runs only when --model is specified; by default, all transcription is local and no data is sent to external services.
- Documentation clarifies privacy defaults: external LLMs are used only with --model, and Bedrock uses the AWS credential chain.
- Description updated to reflect that transcription triggers only when the user explicitly requests it.
- Example commands and workflow sections updated to reflect the new LLM cleanup activation method.
- No major behavioral or interface changes beyond LLM phase invocation and stricter privacy defaults.

v1.3.1

Adds environment variable docs and improves setup automation.
- Documents support for Anthropic, OpenAI-compatible, and Bedrock LLM providers via environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS_REGION, etc.)
- Updates SKILL.md to explain LLM provider credentials, API base URL, and privacy/credential isolation
- Adds "AUTO_YES=1" usage to environment setup for non-interactive installs
- No behavioral changes to transcription logic itself

v1.3.0

- Expanded language support: now handles Chinese, English, Japanese, Korean, Cantonese, and 99 languages (via Whisper), with automatic speaker diarization and hotword biasing.
- New, detailed workflow: guides users to provide context like meeting type, participant names, supporting documents, and preference for language and number of speakers to optimize transcription quality.
- Enhanced presets and diarization: per-language model selection with clear caveats on diarization support, especially for `auto` and `whisper` modes.
- LLM optional cleanup: supports post-processing transcripts with Bedrock, Anthropic, or OpenAI-compatible LLMs, with resume and skip options.
- Utility scripts included: speaker verification and reassignment script helps detect and fix swapped or misidentified speakers.
- Audio preprocessing improvements: all inputs auto-converted to 16kHz mono FLAC for reliability, with detailed format recommendations.

Metadata

Slug: zxkane-audio-transcriber-funasr

Version: 1.6.0

License: MIT-0

All-time Installs: 0

Active Installs: 0

Total Versions: 7

Frequently Asked Questions

What is Funasr Transcribe?

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te... It is an AI Agent Skill for Claude Code / OpenClaw, with 294 downloads so far.

How do I install Funasr Transcribe?

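Run "/install zxkane-audio-transcriber-funasr" in the OpenClaw or Claude Code chat to add the skill in one step. The skill's bundled setup_env.sh still has to run before first use; it installs ffmpeg and the Python dependencies and applies the FunASR patch, so review it first.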

Is Funasr Transcribe free?

Yes, Funasr Transcribe is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Funasr Transcribe support?

Funasr Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Funasr Transcribe?

It is built and maintained by zxkane (@zxkane); the current version is v1.6.0.
