Voice Recognition

Name: Voice Recognition
Author: gykdly

by gykdly · v1.0.0

cross-platform ✓ Security Clean

Downloads: 1940

Install in OpenClaw

/install voice-recognition

Description

Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.

Usage Guidance

This skill appears to do what it says: it is a small Python wrapper that invokes the local OpenAI Whisper CLI and writes transcripts locally. Before installing or using it:

1. Install openai-whisper from a trusted source (Homebrew tap) so that the 'whisper' binary on your PATH is legitimate.
2. Be aware that the first run downloads model weights to ~/.cache/whisper, which means a large download and extra disk usage.
3. Update the SKILL.md usage examples to point to the script location on your system instead of the hard-coded /Users/liyi/... path, and create the suggested alias only if you trust the script location.
4. Transcripts are written next to the input audio file, so check permissions and disk location.
5. To reduce risk, run the script in an isolated environment (container or VM) until you have confirmed its behavior.

No signs of credential exfiltration or remote endpoints were found in the included files.

Capability Analysis

Type: OpenClaw Skill
Name: voice-recognition
Version: 1.0.0

The skill bundle is designed for local speech-to-text using the OpenAI Whisper CLI. The `SKILL.md` provides clear usage instructions without any prompt injection attempts or malicious directives for the agent. The Python script `scripts/voice识别_升级版.py` uses `subprocess.run` to execute the `whisper` command, passing user-controlled audio file paths as distinct arguments, which is the recommended secure method to prevent shell injection. Output files are written locally to the same directory as the input audio, aligning with the skill's stated purpose. There is no evidence of data exfiltration, persistence mechanisms, or other malicious behaviors.
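The argument-list pattern described above can be sketched as follows. This is a minimal illustration, not the skill's actual script: the function names `build_whisper_command` and `transcribe` and the default model `base` are assumptions made for the example, while the flags used (`--model`, `--language`, `--task`, `--output_format`, `--output_dir`) are standard options of the openai-whisper CLI.

```python
import subprocess
from pathlib import Path


def build_whisper_command(audio_path, model="base", language=None, translate=False):
    """Build the argument list for a `whisper` CLI call (hypothetical helper)."""
    audio = Path(audio_path)
    cmd = [
        "whisper",
        str(audio),                         # the user-supplied path stays one argument
        "--model", model,
        "--output_format", "txt",
        "--output_dir", str(audio.parent),  # write the transcript beside the audio
    ]
    if language:
        cmd += ["--language", language]
    if translate:
        cmd += ["--task", "translate"]      # Whisper translates into English
    return cmd


def transcribe(audio_path, **options):
    """Run whisper without a shell; raises CalledProcessError on a non-zero exit."""
    cmd = build_whisper_command(audio_path, **options)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Because the audio path is a single list element and no shell is involved, characters such as spaces or `;` in a file name reach `whisper` verbatim instead of being interpreted as shell syntax.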

Capability Assessment

ℹ Purpose & Capability

The name and description (local Whisper-based speech-to-text) match the included Python script and the SKILL.md. The README asks you to install openai-whisper via Homebrew and use Python 3.10+, which is appropriate. Minor oddity: the usage examples in SKILL.md hard-code an absolute path (/Users/liyi/.openclaw/workspace/...) that points to a specific user's workspace. That does not suit a distributed script, and the examples should use relative or generic paths instead.

ℹ Instruction Scope

The runtime instructions simply run the included Python script, which calls the external 'whisper' CLI (no shell=True). The script reads an audio file, writes a .txt transcript beside that file, and can generate a simple local summary. It does not read unrelated system files or environment variables, nor does it post data to remote endpoints. Note: the first run will download model weights to ~/.cache/whisper (network and disk usage).
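To make the file locations above concrete, here is a small sketch of where the transcript and the model weights would be expected. Both helper names are hypothetical. The sketch assumes that the transcript takes the audio file's name with a .txt extension and that downloaded weights are stored as .pt files; neither detail is confirmed by this listing.

```python
from pathlib import Path


def expected_transcript_path(audio_path):
    """Assumed transcript location: same folder and stem as the audio, .txt suffix."""
    return Path(audio_path).with_suffix(".txt")


def cached_model_files(cache_dir=None):
    """List weight files already present, so a first-run download can be anticipated."""
    cache = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "whisper"
    if not cache.is_dir():
        return []  # nothing cached yet: the first run will need network access
    return sorted(p.name for p in cache.glob("*.pt"))  # assumption: weights are .pt files
```

An empty list from `cached_model_files()` suggests that the next transcription will trigger the model download mentioned in the note.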

✓ Install Mechanism

There is no install spec (instruction-only skill). The SKILL.md recommends 'brew install openai-whisper' which is a reasonable, low-risk installation path for the Whisper CLI.

✓ Credentials

The skill requests no environment variables, no credentials, and no config paths. The behavior (invoking a local 'whisper' binary) is proportionate to the stated function. Reminder: because it calls an external binary by name, it depends on the 'whisper' in PATH being the expected implementation.
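One way to act on that reminder is to check which executable the name 'whisper' resolves to before running the skill. The sketch below uses only the standard library; the helper name `resolve_whisper` is an assumption for illustration and is not part of the skill.

```python
import shutil


def resolve_whisper(binary="whisper"):
    """Return the absolute path that `binary` resolves to on PATH, or None if absent."""
    return shutil.which(binary)


if __name__ == "__main__":
    location = resolve_whisper()
    if location is None:
        print("whisper not found on PATH; install openai-whisper first")
    else:
        # Inspect this path manually to confirm it is the expected installation.
        print(f"whisper resolves to: {location}")
```

If the printed path is not where your package manager installs openai-whisper, another program with the same name may be shadowing it on PATH.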

✓ Persistence & Privilege

The skill does not request permanent/always inclusion, does not modify other skills, and contains no code that attempts to change system-wide agent settings. It only suggests an optional shell alias for convenience (user action).

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install voice-recognition
3. After installation, invoke the skill by name or use /voice-recognition
4. Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.0

Initial release of voice-recognition skill.
- Provides local speech-to-text using OpenAI Whisper CLI, no API key required.
- Supports 100+ languages, including Chinese and English.
- Offers translation to English and text summarization features.
- Compatible with various audio formats: MP3, M4A, WAV, OGG, FLAC, WebM.
- Easy command-line usage and quick alias setup instructions included.

Metadata

Slug: voice-recognition
Version: 1.0.0
License: —
All-time Installs: 8
Active Installs: 6
Total Versions: 1

Frequently Asked Questions

What is Voice Recognition?

Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization. It is an AI Agent Skill for Claude Code / OpenClaw, with 1940 downloads so far.

How do I install Voice Recognition?

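Run "/install voice-recognition" in the OpenClaw or Claude Code chat to install the skill in one step. The skill itself needs no further setup, but it relies on the openai-whisper CLI (for example via 'brew install openai-whisper') and Python 3.10+ being available on your machine.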

Is Voice Recognition free?

Yes. Voice Recognition is free to download, install and use, and it needs no API key. Note that the listing does not specify a license.

Which platforms does Voice Recognition support?

Voice Recognition is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Voice Recognition?

It is built and maintained by gykdly (@gykdly); the current version is v1.0.0.
