sirkovz

Audio Command Executor

by Sirko · GitHub ↗ · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
139
Downloads
0
Stars
1
Active Installs
2
Versions
Install in OpenClaw
/install audio-command-executor
Description
Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription.
README (SKILL.md)

Body

Trigger

  • Inbound audio files that arrive in the directory /home/sirko/.openclaw/media/inbound/ (e.g. .ogg, .mp3)

Input

  • Input: path to the audio file (e.g. /home/sirko/.openclaw/media/inbound/aufnahme.ogg)

Workflow

  1. Normalize Format
  • If the input is not a .wav file, convert it to WAV: /usr/bin/ffmpeg -i {input_file} -ar 16000 -ac 1 -c:a pcm_s16le {input_file}.wav Note: the target file is named input_file.wav (example: /.../aufnahme.ogg → /.../aufnahme.wav)
  2. Transcription
  • Transcribe the WAV file: /home/sirko/.openclaw/workspace/whisper.cpp/build/bin/whisper-cli -l DE -np -m /home/sirko/.openclaw/workspace/whisper.cpp/models/ggml-small.bin -f {input_wav_file}
  • Capture the transcription as text (stdout)
  3. Execution
  • Answer any questions or instructions in the transcribed text in German, exactly as if they had been typed into the chat as normal text
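The workflow above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, the binary and model locations are passed in as parameters rather than hardcoded, and the WAV target follows the example in the note (aufnahme.ogg → aufnahme.wav). The ffmpeg and whisper-cli flags are copied from the SKILL.md as written.

```python
import subprocess
from pathlib import PurePosixPath

# Default taken from the SKILL.md; adjust to the local environment.
FFMPEG = "/usr/bin/ffmpeg"


def wav_target(input_file: str) -> str:
    """Return the WAV path beside the input, e.g. aufnahme.ogg -> aufnahme.wav."""
    return str(PurePosixPath(input_file).with_suffix(".wav"))


def ffmpeg_command(input_file: str, ffmpeg: str = FFMPEG) -> list:
    """Build the conversion command: 16 kHz, mono, 16-bit PCM."""
    return [ffmpeg, "-i", input_file, "-ar", "16000", "-ac", "1",
            "-c:a", "pcm_s16le", wav_target(input_file)]


def whisper_command(wav_file: str, whisper_cli: str, model: str,
                    language: str = "DE") -> list:
    """Build the transcription command with the flags quoted in the SKILL.md."""
    return [whisper_cli, "-l", language, "-np", "-m", model, "-f", wav_file]


def transcribe(input_file: str, whisper_cli: str, model: str) -> str:
    """Convert to WAV if needed, then return whisper-cli's stdout as the transcript."""
    wav_file = input_file
    if PurePosixPath(input_file).suffix.lower() != ".wav":
        subprocess.run(ffmpeg_command(input_file), check=True)
        wav_file = wav_target(input_file)
    result = subprocess.run(whisper_command(wav_file, whisper_cli, model),
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

Passing each command as an argument list, without a shell, means a file name containing spaces or shell metacharacters is handed to the binary as a single argument.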

Output

  • Simply process the text as if it had arrived as a text DM
  • On errors: a clear error message stating the cause (e.g. file not found, transcript empty, execution failed)
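One possible way to produce the error messages listed above is sketched below. The exception type and helper names (AudioSkillError, require_file, run_step, require_transcript) are hypothetical and do not come from the skill; the sketch only shows how the three named failure cases could each map to a clear message.

```python
import subprocess
from pathlib import Path


class AudioSkillError(Exception):
    """Hypothetical error type carrying a user-facing cause."""


def require_file(path: str) -> Path:
    """Fail early with a clear message when the input file is missing."""
    candidate = Path(path)
    if not candidate.is_file():
        raise AudioSkillError(f"File not found: {path}")
    return candidate


def run_step(name: str, cmd: list) -> str:
    """Run one external command and translate failures into AudioSkillError."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except FileNotFoundError as exc:
        raise AudioSkillError(f"{name} failed: executable not found: {cmd[0]}") from exc
    except subprocess.CalledProcessError as exc:
        detail = (exc.stderr or "").strip()
        raise AudioSkillError(
            f"{name} failed with exit code {exc.returncode}: {detail}") from exc
    return result.stdout


def require_transcript(text: str) -> str:
    """Reject an empty transcription instead of answering nothing."""
    cleaned = text.strip()
    if not cleaned:
        raise AudioSkillError("Transcript is empty")
    return cleaned
```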

Example Flow

  • inbound/file.ogg → convert → /tmp/file.wav → whisper → "Was ist die Hauptstadt von Frankreich" (What is the capital of France) → determine the answer and display it

Notes

  • Always respond in German

Tests/Test Scenarios

  • Test with file.ogg (4 seconds) → check the transcription
  • Test with a file that is already WAV → direct transcription
  • Test with a corrupt file → proper error message
Usage Guidance
Before installing, consider: (1) The SKILL.md hardcodes user-specific paths (/home/sirko/...) and absolute binary paths (/usr/bin/ffmpeg, a local whisper-cli). Verify and edit these paths to match your environment so the skill won't try to access another user's directories. (2) Ensure ffmpeg, whisper-cli, and the referenced model file actually exist and are trusted — the skill will execute them. (3) Test with non-sensitive audio first, since transcription may expose private content to the agent. (4) Ask the publisher for corrected metadata: list required binaries and any model files, and remove or parameterize hardcoded paths. If you cannot verify or adjust these issues, avoid installing or run it in a restricted/test environment.
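Points (1) and (2) could be checked before the first run with a small preflight script. The sketch below is an assumption about how one might do this, not part of the skill: the environment variable names (ACE_FFMPEG, ACE_WHISPER_CLI, ACE_WHISPER_MODEL) and the function names are invented for illustration.

```python
import os
import shutil
from pathlib import Path


def resolve_settings(env=None) -> dict:
    """Read tool locations from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    return {
        "ffmpeg": env.get("ACE_FFMPEG", "ffmpeg"),
        "whisper_cli": env.get("ACE_WHISPER_CLI", "whisper-cli"),
        "model": env.get("ACE_WHISPER_MODEL", "ggml-small.bin"),
    }


def missing_requirements(settings: dict) -> list:
    """Return one message per binary or model file that cannot be found."""
    problems = []
    for key in ("ffmpeg", "whisper_cli"):
        # shutil.which accepts bare names (PATH lookup) and explicit paths.
        if shutil.which(settings[key]) is None:
            problems.append(f"{key}: executable not found ({settings[key]})")
    if not Path(settings["model"]).is_file():
        problems.append(f"model: file not found ({settings['model']})")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements(resolve_settings()):
        print(problem)
```

An empty result only confirms that the files exist; whether the binaries and the model are trustworthy still has to be verified separately.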
Capability Analysis
Type: OpenClaw Skill
Name: audio-command-executor
Version: 1.0.1
The skill exhibits critical vulnerabilities in SKILL.md, specifically shell command injection risks where the {input_file} variable is passed unsanitized to ffmpeg and whisper-cli. Furthermore, the instructions explicitly direct the agent to treat transcribed audio content as direct commands, creating a high risk for indirect prompt injection. While these appear to be design flaws rather than intentional malware, the combination of shell execution and hardcoded absolute paths (/home/sirko/) poses a significant security risk.
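The injection risk around {input_file} can be reduced by validating the path before it reaches any command. The following is a minimal sketch under stated assumptions: the function name, the inbound directory parameter, and the allowed suffix set are illustrative choices, not something the skill defines. It does not address the prompt injection risk, which concerns how the agent treats the transcript itself.

```python
from pathlib import Path

# Assumed allow-list; the SKILL.md only names .ogg, .mp3 and WAV explicitly.
ALLOWED_SUFFIXES = {".ogg", ".mp3", ".wav"}


def validate_inbound_path(candidate: str, inbound_dir: str) -> Path:
    """Return the resolved path if it lies inside inbound_dir, else raise ValueError."""
    base = Path(inbound_dir).resolve()
    path = Path(candidate).resolve()
    try:
        # relative_to raises ValueError when path is outside base,
        # which also catches "../" traversal after resolution.
        path.relative_to(base)
    except ValueError:
        raise ValueError(f"Path is outside the inbound directory: {candidate}") from None
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {path.name}")
    return path
```

The returned path is absolute, so it cannot be mistaken for a command-line option, and it should still be passed to ffmpeg and whisper-cli as a single list element rather than interpolated into a shell string.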
Capability Assessment
Purpose & Capability
The skill claims to process inbound audio and transcribe it, which aligns with the instructions. However the SKILL.md expects /usr/bin/ffmpeg and a local whisper-cli binary at /home/sirko/.openclaw/workspace/whisper.cpp/... and references inbound files under /home/sirko/.openclaw/media/inbound/. The registry metadata declared no required binaries or environment variables — that is inconsistent with the runtime commands the skill instructs the agent to run.
Instruction Scope
Runtime instructions are narrowly scoped to converting an input audio file and transcribing it, then responding in German. But they reference hardcoded absolute paths tied to a specific user (/home/sirko/...) and call local binaries by absolute path. This makes the skill environment-specific and could cause it to fail or behave unexpectedly on other systems; it also means the agent will read files from that user path and execute local binaries.
Install Mechanism
There is no install spec (instruction-only). The included scripts/package_skill.py is a simple packaging utility with no network calls or hidden behavior. Lack of an install step reduces risk of arbitrary code being pulled during install, but runtime still executes local binaries.
Credentials
The skill declares no environment variables or credentials, which is reasonable for a local transcription skill. However, the instructions implicitly require filesystem access to /home/sirko/... and presence of a local model file and whisper binary; the absence of declared required binaries (ffmpeg, whisper-cli) is an omission that could mask necessary privileges or assumptions about the host environment.
Persistence & Privilege
always is false and the skill is user-invocable. The skill does not request persistent privileges or attempt to modify other skills or global agent settings in the provided files.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install audio-command-executor
  3. After installation, invoke the skill by name or use /audio-command-executor
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
- Updated ffmpeg command in documentation to reference /usr/bin/ffmpeg for audio conversion.
- Added scripts/package_skill.py to the repository.
- Bumped version from 1.0.2 to 1.0.3.
v1.0.0
audio-command-executor 1.0.2
- Added automatic conversion of non-WAV audio files (e.g., .ogg, .mp3) to WAV format before transcription.
- Improved error handling with clearer messages for issues like missing files or empty transcriptions.
- Now processes transcribed audio commands as if they were normal chat text, always responding in German.
- Updated documentation with a detailed workflow, example usage, and specific test cases.
Metadata
Slug audio-command-executor
Version 1.0.1
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 2
Frequently Asked Questions

What is Audio Command Executor?

Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription. It is an AI Agent Skill for Claude Code / OpenClaw, with 139 downloads so far.

How do I install Audio Command Executor?

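Run "/install audio-command-executor" in the OpenClaw or Claude Code chat to install it. The install step itself needs no further setup, but at runtime the skill expects ffmpeg, whisper-cli, and a Whisper model file to be present on the host at the paths given in the SKILL.md.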

Is Audio Command Executor free?

Yes, Audio Command Executor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Audio Command Executor support?

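Audio Command Executor is listed as cross-platform and can be installed anywhere OpenClaw / Claude Code is available. Its SKILL.md hardcodes Linux-style absolute paths (/home/sirko/..., /usr/bin/ffmpeg), so these paths must be adjusted on other systems.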

Who created Audio Command Executor?

It is built and maintained by Sirko (@sirkovz); the current version is v1.0.1.
