
Audio Command Executor

Author: sirkovz

by Sirko · v1.0.1 · MIT-0

cross-platform ⚠ suspicious

Downloads 139

Install in OpenClaw

/install audio-command-executor

Description

Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription.

README (SKILL.md)

Body

Trigger

Inbound audio files that land in the directory /home/sirko/.openclaw/media/inbound/ (e.g. .ogg, .mp3)

Input

Input: path to the audio file (e.g. /home/sirko/.openclaw/media/inbound/aufnahme.ogg)

Workflow

Normalize Format

If the input is not a .wav file, convert it to WAV:
/usr/bin/ffmpeg -i {input_file} -ar 16000 -ac 1 -c:a pcm_s16le {input_file}.wav
Note: the target file is named {input_file}.wav, so the command as written appends .wav to the full input name (example: /.../aufnahme.ogg → /.../aufnahme.ogg.wav).
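
The conversion step could be sketched in Python as an argument list rather than a shell string. This is only an illustration, not the skill's own code: the helper names needs_conversion and build_ffmpeg_cmd are hypothetical, while the ffmpeg path and flags are copied from the command above.

```python
from pathlib import Path

# Path taken from the README; adjust it for your own host.
FFMPEG = "/usr/bin/ffmpeg"


def needs_conversion(input_file: str) -> bool:
    """Return True unless the file already has a .wav extension."""
    return Path(input_file).suffix.lower() != ".wav"


def build_ffmpeg_cmd(input_file: str) -> list[str]:
    """Build the ffmpeg call: 16 kHz, mono, 16-bit PCM WAV output."""
    # The README command appends ".wav" to the full input name.
    output_file = f"{input_file}.wav"
    return [
        FFMPEG, "-i", input_file,
        "-ar", "16000",        # sample rate
        "-ac", "1",            # one audio channel
        "-c:a", "pcm_s16le",   # signed 16-bit little-endian PCM
        output_file,
    ]
```

Passing the list to subprocess.run without shell=True keeps the file name as a single argument, whatever characters it contains.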

Transkription

Transcribe the WAV file:
/home/sirko/.openclaw/workspace/whisper.cpp/build/bin/whisper-cli -l DE -np -m /home/sirko/.openclaw/workspace/whisper.cpp/models/ggml-small.bin -f {input_wav_file}
Capture the transcription as text (stdout).
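
A minimal sketch of this step, assuming the transcript is read from the process's stdout as described above. The function names and the injectable runner parameter are hypothetical conveniences for testing; the binary path, model path, and flags are copied verbatim from the README command.

```python
import subprocess

# Paths copied from the README; they are specific to the author's machine.
WHISPER_CLI = "/home/sirko/.openclaw/workspace/whisper.cpp/build/bin/whisper-cli"
MODEL = "/home/sirko/.openclaw/workspace/whisper.cpp/models/ggml-small.bin"


def build_whisper_cmd(wav_file: str) -> list[str]:
    """Build the whisper-cli call with the same flags as the README."""
    return [WHISPER_CLI, "-l", "DE", "-np", "-m", MODEL, "-f", wav_file]


def transcribe(wav_file: str, runner=subprocess.run) -> str:
    """Run whisper-cli and return its stdout with surrounding whitespace removed."""
    result = runner(
        build_whisper_cmd(wav_file),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()
```

The runner argument defaults to subprocess.run; swapping in a stub makes it possible to test the function without the binary or model installed.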

Ausführung

Answer any questions or instructions arising from the transcribed text in German, as if it were normal text entered via chat.

Output

Process the text as if it had arrived as a text DM.
On errors: give a clear error message stating the cause (e.g. file not found, transcript empty, execution failed).
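
The three error causes named above could be mapped to messages roughly as follows. This is a sketch under assumptions: EmptyTranscriptError, require_transcript, and explain_error are hypothetical names, the message wording is illustrative, and subprocess errors are assumed to carry the command as a list.

```python
import subprocess


class EmptyTranscriptError(Exception):
    """Hypothetical error type for a transcript that contains no text."""


def require_transcript(text: str) -> str:
    """Return the trimmed transcript, or raise if nothing was recognised."""
    cleaned = text.strip()
    if not cleaned:
        raise EmptyTranscriptError("transcript is empty")
    return cleaned


def explain_error(exc: Exception) -> str:
    """Turn a known failure into a short message that names the cause."""
    if isinstance(exc, FileNotFoundError):
        return f"File not found: {exc.filename}"
    if isinstance(exc, EmptyTranscriptError):
        return "Transcript is empty"
    if isinstance(exc, subprocess.CalledProcessError):
        # exc.cmd is assumed to be an argument list; its first item is the binary.
        return f"Execution failed: {exc.cmd[0]} exited with code {exc.returncode}"
    return f"Unexpected error: {exc}"
```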

Beispiel-Ablauf

inbound/file.ogg → convert → /tmp/file.wav → whisper → "Was ist die Hauptstadt von Frankreich" ("What is the capital of France") → determine the answer and display it

Notes

Always respond in German.

Tests/Test Scenarios

Test with file.ogg (4 seconds) → check the transcription
Test with a file that is already WAV → direct transcription
Test with a corrupt file → proper error message
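
For the WAV scenario, a 4-second fixture in the same format the conversion step targets (16 kHz, mono, 16-bit PCM) can be generated with the Python standard library alone. This is a sketch: write_test_wav is a hypothetical helper, and the file holds a plain sine tone rather than speech, so it exercises the pipeline but will not produce a meaningful transcript.

```python
import math
import struct
import wave


def write_test_wav(path: str, seconds: int = 4, rate: int = 16000,
                   freq: float = 440.0) -> None:
    """Write a mono 16-bit PCM WAV file containing a sine tone."""
    n_frames = seconds * rate
    samples = [
        int(12000 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n_frames)
    ]
    # "<h" = little-endian signed 16-bit, the sample layout WAV expects.
    frames = struct.pack(f"<{n_frames}h", *samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(frames)
```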

Usage Guidance

Before installing, consider:
(1) The SKILL.md hardcodes user-specific paths (/home/sirko/...) and absolute binary paths (/usr/bin/ffmpeg, a local whisper-cli). Verify and edit these paths to match your environment so the skill won't try to access another user's directories.
(2) Ensure ffmpeg, whisper-cli, and the referenced model file actually exist and are trusted, because the skill will execute them.
(3) Test with non-sensitive audio first, since transcription may expose private content to the agent.
(4) Ask the publisher for corrected metadata: list required binaries and any model files, and remove or parameterize hardcoded paths.
If you cannot verify or adjust these issues, avoid installing, or run it in a restricted/test environment.
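
One way to parameterize the paths and verify the prerequisites before running anything is sketched below. The environment variable names (AUDIO_SKILL_FFMPEG, AUDIO_SKILL_WHISPER_CLI, AUDIO_SKILL_MODEL), the defaults, and both function names are assumptions made for illustration; none of them are defined by the skill or by OpenClaw.

```python
import os
import shutil
from pathlib import Path


def load_config(env=None) -> dict:
    """Read tool locations from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    return {
        "ffmpeg": env.get("AUDIO_SKILL_FFMPEG", "ffmpeg"),
        "whisper_cli": env.get("AUDIO_SKILL_WHISPER_CLI", "whisper-cli"),
        "model": env.get("AUDIO_SKILL_MODEL", "ggml-small.bin"),
    }


def missing_requirements(cfg: dict) -> list[str]:
    """Return the config keys whose binary or model file cannot be found."""
    missing = []
    for key in ("ffmpeg", "whisper_cli"):
        # shutil.which accepts a bare name (searched on PATH) or a full path.
        if shutil.which(cfg[key]) is None:
            missing.append(key)
    if not Path(cfg["model"]).is_file():
        missing.append("model")
    return missing
```

An empty list from missing_requirements(load_config()) would indicate that both binaries and the model file were found on the current host.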

Capability Analysis

Type: OpenClaw Skill
Name: audio-command-executor
Version: 1.0.1

The skill exhibits critical vulnerabilities in SKILL.md, specifically shell command injection risks where the {input_file} variable is passed unsanitized to ffmpeg and whisper-cli. Furthermore, the instructions explicitly direct the agent to treat transcribed audio content as direct commands, creating a high risk for indirect prompt injection. While these appear to be design flaws rather than intentional malware, the combination of shell execution and hardcoded absolute paths (/home/sirko/) poses a significant security risk.
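
The command injection risk can be reduced with two generic measures, sketched here under assumptions: confine the input to the inbound directory, and quote the file name if it must ever be interpolated into a shell string. The function names are hypothetical and this is not a fix shipped with the skill; it does not address the prompt injection concern.

```python
import shlex
from pathlib import Path

# Inbound directory as named in the README.
INBOUND_DIR = Path("/home/sirko/.openclaw/media/inbound")


def is_inside_inbound(input_file: str, inbound_dir: Path = INBOUND_DIR) -> bool:
    """Return True only if the resolved path lies within the inbound directory."""
    resolved = Path(input_file).resolve()  # collapses ".." and symlinks
    return resolved.is_relative_to(inbound_dir.resolve())  # Python 3.9+


def quote_for_shell(input_file: str) -> str:
    """Quote a file name so a POSIX shell treats it as one literal word."""
    return shlex.quote(input_file)
```

For example, quote_for_shell("a.ogg; rm -rf ~") returns the single-quoted string 'a.ogg; rm -rf ~', so the semicolon no longer ends the command. Passing an argument list to subprocess.run without shell=True avoids the shell altogether and is usually the simpler option.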

Capability Assessment

⚠ Purpose & Capability

The skill claims to process inbound audio and transcribe it, which aligns with the instructions. However, the SKILL.md expects /usr/bin/ffmpeg and a local whisper-cli binary at /home/sirko/.openclaw/workspace/whisper.cpp/... and references inbound files under /home/sirko/.openclaw/media/inbound/. The registry metadata declares no required binaries or environment variables, which is inconsistent with the runtime commands the skill instructs the agent to run.

ℹ Instruction Scope

Runtime instructions are narrowly scoped to converting an input audio file and transcribing it, then responding in German. But they reference hardcoded absolute paths tied to a specific user (/home/sirko/...) and call local binaries by absolute path. This makes the skill environment-specific and could cause it to fail or behave unexpectedly on other systems; it also means the agent will read files from that user path and execute local binaries.

✓ Install Mechanism

There is no install spec (instruction-only). The included scripts/package_skill.py is a simple packaging utility with no network calls or hidden behavior. Lack of an install step reduces risk of arbitrary code being pulled during install, but runtime still executes local binaries.

ℹ Credentials

The skill declares no environment variables or credentials, which is reasonable for a local transcription skill. However, the instructions implicitly require filesystem access to /home/sirko/... and presence of a local model file and whisper binary; the absence of declared required binaries (ffmpeg, whisper-cli) is an omission that could mask necessary privileges or assumptions about the host environment.

✓ Persistence & Privilege

always is false and the skill is user-invocable. The skill does not request persistent privileges or attempt to modify other skills or global agent settings in the provided files.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install audio-command-executor
After installation, invoke the skill by name or use /audio-command-executor
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

- Updated ffmpeg command in documentation to reference /usr/bin/ffmpeg for audio conversion.
- Added scripts/package_skill.py to the repository.
- Bumped version from 1.0.2 to 1.0.3.

v1.0.0

audio-command-executor 1.0.2
- Added automatic conversion of non-WAV audio files (e.g., .ogg, .mp3) to WAV format before transcription.
- Improved error handling with clearer messages for issues like missing files or empty transcriptions.
- Now processes transcribed audio commands as if they were normal chat text, always responding in German.
- Updated documentation with a detailed workflow, example usage, and specific test cases.

Metadata

Slug audio-command-executor

Version 1.0.1

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 2

Frequently Asked Questions

What is Audio Command Executor?

Audio Command Executor processes inbound audio files, transcribes them, and responds to the resulting text. It converts non-WAV inputs to WAV before transcription. It is an AI Agent Skill for Claude Code / OpenClaw, with 139 downloads so far.

How do I install Audio Command Executor?

Run "/install audio-command-executor" in the OpenClaw or Claude Code chat to install it in one step. The skill itself still expects ffmpeg, whisper-cli, and a model file on the host, as noted under Usage Guidance.

Is Audio Command Executor free?

Yes, Audio Command Executor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Audio Command Executor support?

Audio Command Executor is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Audio Command Executor?

It is built and maintained by Sirko (@sirkovz); the current version is v1.0.1.
