Speech De-Noise, Vocal Enhancement
by speech2srt · v1.3.1 · MIT-0
200 Downloads · 1 Star · 0 Active Installs · 5 Versions
Install in OpenClaw
/install speech-denoise
Description
Speech enhancement / vocal denoising on remote (FREE) L4 GPU. Trigger when user says: denoise, remove noise, clean up audio, 去噪, 降噪, enhance audio. Takes loc...
Usage Guidance
This skill appears to do exactly what it claims: convert and denoise local audio using a ClearerVoice model on a remote Modal GPU. Before using it:
- Run it from a directory that contains only the audio/video files you intend to upload (it scans the directory and uploads every file with a matching extension).
- Authenticate with Modal (modal setup) so the tool can create and use Modal volumes, and keep your Modal token private.
- Try it with non-sensitive sample audio first.
- If you have policy or data-residency concerns, confirm that uploading files to Modal volumes and running inference on a remote GPU is acceptable.
For higher assurance, inspect denoise.py or run its logic locally on a small file to confirm its behavior (it symlinks a checkpoints path inside the container and installs packages in the container only).
Capability Analysis
Type: OpenClaw Skill
Name: speech-denoise
Version: 1.3.1
The speech-denoise skill is a legitimate pipeline for audio enhancement using the Modal cloud platform and the ClearerVoice-Studio model. The code (denoise.py) and instructions (SKILL.md) describe a standard workflow involving file upload to Modal volumes, GPU-accelerated inference, and result retrieval. It uses safe subprocess calls for ffmpeg processing and lacks any indicators of data exfiltration, persistence, or malicious intent.
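"Safe subprocess calls" usually means handing ffmpeg its arguments as a list instead of building a shell string, so that a file name containing spaces or shell metacharacters is passed through as a single argument. The sketch below illustrates that pattern only. The function names build_ffmpeg_cmd and convert, the mono downmix, and the 16 kHz default are illustrative assumptions and are not taken from denoise.py.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src: Path, dst: Path, sample_rate: int = 16000) -> list[str]:
    """Return an ffmpeg argument list that extracts audio from src into dst."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),            # input path is one argument, never parsed by a shell
        "-vn",                     # drop any video stream
        "-ac", "1",                # downmix to mono (assumed, not from the skill)
        "-ar", str(sample_rate),   # resample (assumed rate, not from the skill)
        str(dst),
    ]


def convert(src: Path, dst: Path) -> None:
    """Run ffmpeg without a shell; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
```

Because no shell is involved, a name such as "my file; rm -rf.mp4" reaches ffmpeg as a literal path rather than being interpreted as a command.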
Capability Assessment
Purpose & Capability
Name/description (speech denoise on a remote L4 GPU) matches the code and SKILL.md: it uses Modal, mounts volumes, runs ffmpeg conversions and a ClearerVoice model on GPU. No unrelated credentials or services are requested.
Instruction Scope
Instructions are focused on finding audio/video files, uploading them to Modal volumes, running the denoise job, and downloading results. Caution: the workflow scans the current directory for matching extensions and uploads matched files — running it from a wide or sensitive directory could accidentally upload unintended files.
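Because the workflow uploads whatever matches in the current directory, it can help to preview the matches before invoking the skill. The following is a standalone dry run, not the skill's own scanner: the function name preview_uploads is hypothetical, the extension set is copied from the formats listed in the v1.0.0 release notes, and the scan is assumed to be non-recursive.

```python
from pathlib import Path

# Formats listed in the v1.0.0 release notes; the skill's actual list may differ.
MEDIA_EXTENSIONS = {".m4a", ".mp3", ".mp4", ".wav", ".flac", ".ogg", ".aac", ".mov", ".avi"}


def preview_uploads(directory: str = ".") -> list[Path]:
    """List regular files in directory (non-recursive) with a matching extension."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )


if __name__ == "__main__":
    # Print each candidate with its size so unexpected files stand out.
    for path in preview_uploads():
        print(f"{path.name}\t{path.stat().st_size} bytes")
```

If the list contains anything you did not expect, move the intended files into a dedicated directory and run the skill from there.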
Install Mechanism
This is instruction-plus-code (no external arbitrary URL downloads). The image build (src/images.py) installs ffmpeg (apt) and Python packages (clearvoice, torch, torchaudio) in the container via standard package managers (apt/pip). This is expected for the task; no unusual remote installers or URL-shortened downloads are used.
Credentials
The skill declares no required environment variables or external credentials. It does rely on the user's Modal authentication (modal CLI) to upload to Modal volumes, which is expected. The image sets only harmless container env vars (TQDM_DISABLE, HF_HUB_DISABLE_PROGRESS).
Persistence & Privilege
The skill's always flag is false; it does not request permanent injection or modify other skills. It creates/uses Modal volumes and an image scoped to its app, which is appropriate for a remote processing pipeline.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install speech-denoise
- After installation, invoke the skill by name or use /speech-denoise
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.1
- Version bump from v1.3.0 to v1.3.1.
- No code or documentation changes detected in this release.
v1.3.0
- Version updated to v1.3.0.
- No file or functional changes detected in this release.
- Documentation remains unchanged from the previous version.
v1.0.2
v1.2.0 brings major updates for improved workflow and usability:
- Now uses (FREE) L4 GPU for speech enhancement on remote servers.
- Upgraded to the ClearerVoice-Studio MossFormer2 model for improved denoising.
- Changed volume name from speech2srt-denoise-data to speech2srt-data for all Modal operations.
- Enhanced description, clarifying input/output and workflow details.
- Added user guidance for subtitle tools and attribution note in results report.
v1.0.1
- Version updated to v1.0.1.
- Fix: yaml, name.
v1.0.0
Initial release of speech-denoise skill:
- Provides one-step speech enhancement (vocal denoising) using ClearVoice MossFormer2 on Modal L4 GPU.
- Accepts local audio and video files, returning noise-reduced audio.
- Supports intuitive trigger phrases (e.g., "denoise", "remove noise", "clean up audio", "enhance audio", "去噪", "降噪").
- Handles multiple formats: m4a, mp3, mp4, wav, flac, ogg, aac, mov, avi.
- Guides users through upload, processing, download, and optional format conversion.
- Includes built-in error handling and setup checks for Python, ffmpeg, and Modal CLI.
Frequently Asked Questions
What is Speech De-Noise, Vocal Enhancement?
Speech enhancement / vocal denoising on remote (FREE) L4 GPU. Trigger when user says: denoise, remove noise, clean up audio, 去噪, 降噪, enhance audio. Takes loc... It is an AI Agent Skill for Claude Code / OpenClaw, with 200 downloads so far.
How do I install Speech De-Noise, Vocal Enhancement?
Run "/install speech-denoise" in the OpenClaw or Claude Code chat to install it in one step. Before first use, authenticate with Modal (modal setup), since the skill uploads files to Modal volumes.
Is Speech De-Noise, Vocal Enhancement free?
Yes, Speech De-Noise, Vocal Enhancement is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Speech De-Noise, Vocal Enhancement support?
Speech De-Noise, Vocal Enhancement is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Speech De-Noise, Vocal Enhancement?
It is built and maintained by speech2srt (@speech2srt); the current version is v1.3.1.