mengbad

Deepgram Voice Workflow

by MengBad · GitHub ↗ · v0.1.0 · MIT-0
cross-platform · ⚠ suspicious
Downloads: 288 · Stars: 0 · Active Installs: 2 · Versions: 1
Install in OpenClaw
/install deepgram-voice-workflow
Description
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...
README (SKILL.md)

Deepgram Voice Workflow

Overview

Use this skill for a complete speech workflow:

  1. transcribe audio to text with Deepgram STT
  2. optionally synthesize a spoken reply with Deepgram TTS
  3. return structured outputs that can feed chat or agent pipelines

This skill is the right choice when the task is broader than plain transcription and needs an input-audio to output-audio pipeline.

Quick Start

Transcribe only

{baseDir}/scripts/deepgram-transcribe.sh /path/to/audio.ogg

Generate speech from text

{baseDir}/scripts/deepgram-tts.sh "Hello, I'm Neko."

Run the full pipeline

{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "Got it, this is a voice reply test."

Environment

Set DEEPGRAM_API_KEY before use.

The bundled scripts also fall back to reading it from:

  • /root/.openclaw/.env
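
The lookup order described above can be sketched as follows. This is a minimal illustration, not the bundled scripts' actual code: the function name load_deepgram_key is hypothetical, and the sketch assumes the env file holds a plain DEEPGRAM_API_KEY=value line.

```shell
# Hypothetical helper: resolve DEEPGRAM_API_KEY from the environment first,
# then from an env file (the default path is the one named in this README).
load_deepgram_key() {
  local env_file="${1:-/root/.openclaw/.env}"
  local line
  # An already-exported key always wins.
  if [ -n "${DEEPGRAM_API_KEY:-}" ]; then
    return 0
  fi
  if [ -r "$env_file" ]; then
    # Assumption: a plain KEY=value line; use the last one and strip quotes.
    line="$(grep -E '^DEEPGRAM_API_KEY=' "$env_file" | tail -n 1)"
    line="${line#DEEPGRAM_API_KEY=}"
    line="${line%\"}"
    line="${line#\"}"
    if [ -n "$line" ]; then
      export DEEPGRAM_API_KEY="$line"
      return 0
    fi
  fi
  echo "DEEPGRAM_API_KEY is not set" >&2
  return 1
}
```

Passing a different file path as the first argument lets the same check run against a per-user env file instead of the root-level default.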

Workflow Decision

Use deepgram-transcribe.sh when

  • only text transcription is needed
  • the downstream system will generate its own reply
  • the task is speech-to-text only

Use deepgram-tts.sh when

  • text already exists
  • only an MP3 spoken response is needed
  • the workflow is text-to-speech only

Use neko-voice-pipeline.sh when

  • the task begins with an audio file
  • a transcript is needed
  • an optional spoken reply should be generated in the same flow
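
The decision list above can be condensed into a small dispatcher. This is only a sketch: pick_script and its arguments are hypothetical names, and since the pipeline's reply is optional, using neko-voice-pipeline.sh for audio without a reply would also be valid.

```shell
# Hypothetical dispatcher mirroring the decision list above.
# Usage: pick_script <input-kind> [want-reply]
#   input-kind: "audio" or "text"; want-reply: "yes" or "no" (default "no")
pick_script() {
  local input_kind="$1" want_reply="${2:-no}"
  case "$input_kind:$want_reply" in
    audio:yes) echo "neko-voice-pipeline.sh" ;;   # audio in, transcript plus spoken reply
    audio:no)  echo "deepgram-transcribe.sh" ;;   # speech-to-text only
    text:*)    echo "deepgram-tts.sh" ;;          # text already exists, MP3 out
    *)         echo "unknown input kind: $input_kind" >&2; return 1 ;;
  esac
}
```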

Outputs

STT output

deepgram-transcribe.sh writes:

  • transcript text file
  • raw API JSON file next to it

TTS output

deepgram-tts.sh writes:

  • MP3 output file

Pipeline output

neko-voice-pipeline.sh prints JSON with:

  • out_dir
  • transcript_path
  • transcript
  • reply_audio_path

This makes it easy to wire into scripts or adapters.
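
A minimal sketch of consuming that JSON from a shell script follows. The field names come from the list above; the sample values are made up for illustration, pipeline_field is a hypothetical helper, and jq is assumed to be installed.

```shell
# Made-up sample data in the shape described above (not real pipeline output).
sample_json='{"out_dir":"/tmp/voice-demo","transcript_path":"/tmp/voice-demo/transcript.txt","transcript":"hello","reply_audio_path":"/tmp/voice-demo/reply.mp3"}'

# Hypothetical helper: read pipeline JSON on stdin and print one top-level
# field; prints nothing if the field is missing or null.
pipeline_field() {
  jq -r --arg key "$1" '.[$key] // empty'
}

# Example use with the sample data:
#   printf '%s' "$sample_json" | pipeline_field transcript
#   printf '%s' "$sample_json" | pipeline_field reply_audio_path
```

In real use the JSON would come from the output of neko-voice-pipeline.sh rather than from a hard-coded sample.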

Typical Uses

Prefer this skill for:

  • transcribing Telegram/QQ/OneBot voice messages
  • generating MP3 replies to short voice prompts
  • building bot-side voice input/output automation
  • testing speech pipelines from shell without introducing a full SDK

Notes

  • Defaults are tuned for lightweight practical use, not maximal configurability.
  • deepgram-transcribe.sh defaults to model=nova-2 and language=zh.
  • deepgram-tts.sh defaults to model=aura-2-luna-en; override the model when a different voice is preferred.
  • Inspect the raw JSON transcript response when debugging recognition quality or API errors.

References

Read these files when needed:

  • references/stt-notes.md for transcription details
  • references/tts-notes.md for speech synthesis details
  • references/pipeline-notes.md for end-to-end pipeline behavior
Usage Guidance
This skill appears to do what it says (call Deepgram STT/TTS and write transcripts/MP3s), but the package metadata did not declare the required DEEPGRAM_API_KEY, and the scripts will fail without it. Before installing or running:
  1. Do not put sensitive credentials into a shared root file; prefer setting DEEPGRAM_API_KEY in the invoking user's environment rather than relying on /root/.openclaw/.env.
  2. Verify that the Deepgram API key you provide is scoped appropriately (rotate and limit permissions where possible).
  3. Inspect the three shell scripts yourself (they are short) to confirm you are comfortable with network calls to api.deepgram.com and with files being written to /tmp or your chosen out_dir.
  4. Be cautious because the skill source/homepage is unknown; if you need stronger assurance, ask the publisher for provenance or a homepage before use.
Capability Analysis
Type: OpenClaw Skill · Name: deepgram-voice-workflow · Version: 0.1.0
The scripts `deepgram-transcribe.sh` and `deepgram-tts.sh` contain shell injection vulnerabilities because command-line arguments (such as `--model`, `--language`, and `--content-type`) are expanded directly into a double-quoted string within a `curl` command. An attacker could exploit this via prompt injection to execute arbitrary commands. Furthermore, the scripts hardcode a sensitive credential lookup path at `/root/.openclaw/.env`, which is a high-privilege location. While the logic appears intended for legitimate Deepgram API integration, these significant security flaws and risky file access patterns warrant a suspicious classification.
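
One common mitigation for this class of issue is to validate option values against a strict allowlist before they reach the request URL. The sketch below is not taken from the bundled scripts: is_safe_param and build_listen_url are hypothetical names, and the sketch assumes the STT request goes to Deepgram's /v1/listen endpoint with model and language as query parameters.

```shell
# Hypothetical guard: accept only non-empty values made of letters, digits,
# dot, underscore, and hyphen (enough for names like nova-2 or zh).
is_safe_param() {
  case "$1" in
    ''|*[!A-Za-z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical builder: print the request URL only if both values pass.
build_listen_url() {
  local model="$1" language="$2"
  is_safe_param "$model" || { echo "rejected model value" >&2; return 1; }
  is_safe_param "$language" || { echo "rejected language value" >&2; return 1; }
  printf 'https://api.deepgram.com/v1/listen?model=%s&language=%s\n' "$model" "$language"
}
```

Rejecting anything outside the allowlist keeps quotes, spaces, `$`, `&`, and similar characters out of the URL, at the cost of refusing unusual but legitimate values. A value such as a content type, which contains a slash, would need its own pattern.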
Capability Assessment
Purpose & Capability
Name, description, and included scripts are consistent with an end-to-end Deepgram STT/TTS pipeline; however, the registry metadata declares no required environment variables or primary credential, while the runtime explicitly requires DEEPGRAM_API_KEY. This mismatch could surprise users.
Instruction Scope
Runtime instructions and bundled scripts are narrowly scoped to: read an input audio file, call api.deepgram.com, and write transcript/MP3 outputs. However the scripts also look for /root/.openclaw/.env as a fallback for DEEPGRAM_API_KEY (documented in SKILL.md). Reading a root-level config file is outside the minimal scope and was not declared in the registry metadata.
Install Mechanism
No install spec (instruction-only with bundled shell scripts). No remote downloads, no package installs, and no code obfuscation — this is low-risk from an install mechanism perspective.
Credentials
The scripts require a Deepgram API token (DEEPGRAM_API_KEY) to function, but the skill registry lists no required env/primary credential. The fallback that reads /root/.openclaw/.env targets a privileged file path that is not declared. Requesting a single Deepgram key is proportionate to the stated purpose, but the undeclared root-level config access is problematic.
Persistence & Privilege
The skill does not request persistent/system-wide privileges (always=false). It does not modify other skills or system configs. It creates local output files (under /tmp or user-specified directories) which is expected.
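
When transcripts or audio should not be readable by other local users, a private per-run output directory can be created before invoking the scripts. This is a sketch only: make_out_dir and the deepgram-voice prefix are hypothetical, and how a chosen directory is passed to the scripts is not shown here.

```shell
# Hypothetical helper: create a private per-run output directory under the
# given base (default: $TMPDIR, else /tmp) and print its path.
make_out_dir() {
  local base="${1:-${TMPDIR:-/tmp}}"
  local dir
  # mktemp -d picks an unpredictable name, which avoids collisions.
  dir="$(mktemp -d "$base/deepgram-voice.XXXXXX")" || return 1
  # Make owner-only access explicit.
  chmod 700 "$dir"
  printf '%s\n' "$dir"
}
```
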
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install deepgram-voice-workflow
  3. After installation, invoke the skill by name or use /deepgram-voice-workflow
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial public release
Metadata
Slug deepgram-voice-workflow
Version 0.1.0
License MIT-0
All-time Installs 2
Active Installs 2
Total Versions 1
Frequently Asked Questions

What is Deepgram Voice Workflow?

End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin... It is an AI Agent Skill for Claude Code / OpenClaw, with 288 downloads so far.

How do I install Deepgram Voice Workflow?

Run "/install deepgram-voice-workflow" in the OpenClaw or Claude Code chat to install it in one step. Before first use, set DEEPGRAM_API_KEY, which the bundled scripts require.

Is Deepgram Voice Workflow free?

Yes, the skill itself is free and licensed under MIT-0. You can download, install and use it at no cost. It does require a Deepgram API key, and use of the Deepgram API is subject to Deepgram's own terms.

Which platforms does Deepgram Voice Workflow support?

Deepgram Voice Workflow is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Deepgram Voice Workflow?

It is built and maintained by MengBad (@mengbad); the current version is v0.1.0.
