
SenseVoice Transcribe

Name: SenseVoice Transcribe
Author: ylongw

by ylongw · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads 370

Install in OpenClaw

/install sensevoice-transcribe

Description

Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se...

Usage Guidance

This skill appears coherent for local batch transcription. Before running:
(1) review it and run it inside an isolated venv as documented;
(2) confirm the script is pointed at the correct daylog directory (it writes there and may delete files there with --force-dates);
(3) supply only a Discord webhook URL you control; the webhook is optional and lets the script send status messages;
(4) be aware that models (~234MB + VAD) are downloaded from ModelScope on first run and that the script writes progress files wherever PROGRESS_FILE points;
(5) back up any existing transcripts before using --force-dates.
For extra assurance, inspect the script's full output (stdout/logs) during a dry run and search the file for any network calls or subprocess usage not covered in the visible code.
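
The search for network calls or subprocess usage can be partly automated. The following is a minimal sketch, not part of the skill: it parses a Python file with the standard `ast` module and reports imports of modules that commonly perform network or process operations. The `WATCHED` list and the function name `find_watched_usage` are illustrative assumptions, and a static scan like this will miss dynamic imports, so it complements a manual read rather than replacing it.

```python
import ast

# Modules worth a closer look. This list is an illustrative assumption, not exhaustive.
WATCHED = {
    "subprocess", "socket", "urllib", "http", "requests",
    "httpx", "aiohttp", "ftplib", "smtplib",
}


def find_watched_usage(source: str) -> list[tuple[int, str]]:
    """Return (line, description) pairs for watched imports and os.system/os.popen calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in WATCHED:
                    hits.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x"
            if node.module and node.module.split(".")[0] in WATCHED:
                hits.append((node.lineno, f"from {node.module} import ..."))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            func = node.func
            if (
                isinstance(func.value, ast.Name)
                and func.value.id == "os"
                and func.attr in {"system", "popen"}
            ):
                hits.append((node.lineno, f"os.{func.attr}(...)"))
    return sorted(hits)
```

Passing the text of `scripts/batch_transcribe.py` to `find_watched_usage` would list the lines to read first; an empty result does not prove the script makes no network or subprocess calls.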

Capability Analysis

Type: OpenClaw Skill
Name: sensevoice-transcribe
Version: 1.0.0

The skill provides legitimate audio transcription functionality but contains a significant security vulnerability in `scripts/batch_transcribe.py`. Values passed to the `--force-dates` command-line argument are used to construct the paths of directories deleted via `shutil.rmtree`, without any sanitization or validation, which allows path traversal and arbitrary directory deletion. While the script's features (such as Discord webhook notifications and batch processing) align with its stated purpose, the lack of input validation in a file-deletion routine poses a high risk if the OpenClaw agent is manipulated into passing malicious strings.
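
One common way to close this kind of hole is to validate each value before it reaches `shutil.rmtree`. The sketch below is not the script's actual code. It assumes, hypothetically, that each `--force-dates` value is a `YYYY-MM-DD` string naming a directory directly under the transcripts root; the function names, the date format, and the directory layout are all assumptions made for illustration.

```python
import re
import shutil
from pathlib import Path

# Assumed date format; the real script's expected format is not shown in this listing.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def resolve_force_date_dir(transcripts_root: Path, date_arg: str) -> Path:
    """Return the directory for date_arg, or raise ValueError if the value is unsafe."""
    if not DATE_RE.fullmatch(date_arg):
        raise ValueError(f"not a YYYY-MM-DD date: {date_arg!r}")
    root = transcripts_root.resolve()
    target = (root / date_arg).resolve()
    # Defence in depth: after resolving symlinks, the target must be a direct child of root.
    if target.parent != root:
        raise ValueError(f"path escapes transcripts root: {target}")
    return target


def delete_force_dates(transcripts_root: Path, date_args: list[str]) -> list[Path]:
    """Delete the per-date directories for date_args and return the ones removed."""
    # Validate every value first so one bad argument aborts before anything is deleted.
    targets = [resolve_force_date_dir(transcripts_root, d) for d in date_args]
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

With this check, a value such as `../keep` or `/etc` is rejected by the format test, and a date-named symlink pointing outside the root is rejected by the parent comparison.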

Capability Assessment

✓ Purpose & Capability

Name/description (SenseVoice transcribe) match the included instructions and the batch_transcribe.py implementation: model loading, VAD segmentation, timestamp mapping, file collection, and transcript output are all relevant and expected. Required packages (funasr, modelscope, onnxruntime) and model downloads are proportionate to on-device transcription.

ℹ Instruction Scope

SKILL.md and the script operate on a 'daylog' directory (raw/ → transcripts/) and include expected behaviors (dry-run, re-transcribe, indexed outputs). The script can delete transcripts with --force-dates and can POST to a Discord webhook if the user provides one; both are documented CLI options. These behaviors are within the stated purpose but are powerful (deletion, external webhook) and should be used with care.
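
One way to exercise that care before a `--force-dates` run is to snapshot the transcripts directory first. This is a minimal sketch using only the standard library. The `transcripts` subdirectory name follows the raw/ → transcripts/ layout described above, while `backup_transcripts`, its parameters, and the timestamped folder naming are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_transcripts(daylog_dir: Path, backup_root: Path) -> Path:
    """Copy daylog_dir/transcripts into a timestamped folder under backup_root."""
    src = daylog_dir / "transcripts"
    if not src.is_dir():
        raise FileNotFoundError(f"no transcripts directory at {src}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = backup_root / f"transcripts-{stamp}"
    # copytree creates dest (and missing parents) and raises if dest already exists,
    # so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

Keeping `backup_root` outside the daylog directory means the copy stays out of reach of any deletion the script performs there.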

✓ Install Mechanism

No install spec in the registry (instruction-only), and SKILL.md recommends creating a Python venv and pip installing specific packages. This is a standard, low-risk approach for a Python-based transcription tool. Models are auto-downloaded from ModelScope as described.

✓ Credentials

The skill requests no environment variables or credentials. Its only network interactions are an optional POST to a user-supplied Discord webhook URL and model downloads from ModelScope on first run; both are consistent with its purpose and proportionate.

✓ Persistence & Privilege

The skill is not forced-always and uses default autonomous invocation settings. It writes transcripts and an optional progress file and can delete files when --force-dates is used; these are normal for a batch-processing tool and are meant to stay within its working directories, although the unvalidated --force-dates paths noted under Capability Analysis mean that confinement is not enforced.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install sensevoice-transcribe
After installation, invoke the skill by name or use /sensevoice-transcribe
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: SenseVoice-Small + FSMN-VAD batch transcription with VAD-anchored timestamps

Metadata

Slug sensevoice-transcribe

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is SenseVoice Transcribe?

Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se... It is an AI Agent Skill for Claude Code / OpenClaw, with 370 downloads so far.

How do I install SenseVoice Transcribe?

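Run "/install sensevoice-transcribe" in the OpenClaw or Claude Code chat to install it in one step. Before first use, SKILL.md recommends creating a Python venv and installing the required packages with pip; models are downloaded from ModelScope on first run.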

Is SenseVoice Transcribe free?

Yes, SenseVoice Transcribe is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does SenseVoice Transcribe support?

SenseVoice Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created SenseVoice Transcribe?

It is built and maintained by ylongw (@ylongw); the current version is v1.0.0.
