
Deepgram Voice Workflow

Name: Deepgram Voice Workflow
Author: mengbad

by MengBad · GitHub ↗ · v0.1.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 288

Install in OpenClaw

/install deepgram-voice-workflow

Description

End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...

README (SKILL.md)

Deepgram Voice Workflow

Overview

Use this skill for a complete speech workflow:

transcribe audio to text with Deepgram STT
optionally synthesize a spoken reply with Deepgram TTS
return structured outputs that can feed chat or agent pipelines

This skill is the right choice when the task is broader than plain transcription and needs an input-audio to output-audio pipeline.

Quick Start

Transcribe only

{baseDir}/scripts/deepgram-transcribe.sh /path/to/audio.ogg

Generate speech from text

{baseDir}/scripts/deepgram-tts.sh "Hello, I'm Neko."

Run the full pipeline

{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "Got it, this is a voice reply test."

Environment

Set DEEPGRAM_API_KEY before use.

The bundled scripts also fall back to reading it from:

/root/.openclaw/.env
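
The lookup order described above can be sketched roughly as follows. This is an illustration only, not the bundled scripts' actual code: the function name `resolve_deepgram_key` is hypothetical, and the `KEY=value` layout of the env file is an assumption. The sketch reads just the one line it needs rather than sourcing the whole file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: resolve DEEPGRAM_API_KEY from the environment first,
# then from a KEY=value style env file (file layout is assumed, not confirmed).
resolve_deepgram_key() {
  local env_file="${1:-/root/.openclaw/.env}"
  local line

  # 1) Prefer a key that is already set in the environment.
  if [ -n "${DEEPGRAM_API_KEY:-}" ]; then
    printf '%s\n' "$DEEPGRAM_API_KEY"
    return 0
  fi

  # 2) Fall back to the env file, reading only the matching line.
  if [ -r "$env_file" ]; then
    line="$(grep -m1 '^DEEPGRAM_API_KEY=' "$env_file")" || return 1
    line="${line#DEEPGRAM_API_KEY=}"
    # Strip optional surrounding double quotes.
    line="${line%\"}"
    line="${line#\"}"
    [ -n "$line" ] || return 1
    printf '%s\n' "$line"
    return 0
  fi

  echo "DEEPGRAM_API_KEY is not set and $env_file is not readable" >&2
  return 1
}
```

A caller could then run `DEEPGRAM_API_KEY="$(resolve_deepgram_key)" || exit 1` before invoking any of the scripts, or pass a user-owned file path as the first argument instead of the root-level default.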

Workflow Decision

Use `deepgram-transcribe.sh` when

only text transcription is needed
the downstream system will generate its own reply
the task is speech-to-text only

Use `deepgram-tts.sh` when

text already exists
only an MP3 spoken response is needed
the workflow is text-to-speech only

Use `neko-voice-pipeline.sh` when

the task begins with an audio file
a transcript is needed
an optional spoken reply should be generated in the same flow
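
The decision rules above can be condensed into a small dispatcher. This is a sketch under assumptions: `choose_script` and its two arguments are hypothetical names, and it simplifies the rules by sending audio input to the pipeline only when a spoken reply is wanted, even though the pipeline's reply is optional.

```bash
#!/usr/bin/env bash
# Hypothetical dispatcher: map the kind of input and whether a spoken reply
# is wanted to one of the three bundled scripts.
#   $1 = input kind: "audio" or "text"
#   $2 = spoken reply wanted: "yes" or "no" (only consulted for audio input)
choose_script() {
  local input_kind="$1"
  local want_reply="${2:-no}"

  case "$input_kind" in
    text)
      # Text already exists, so only synthesis is needed.
      echo "deepgram-tts.sh"
      ;;
    audio)
      if [ "$want_reply" = "yes" ]; then
        # Audio in, transcript plus spoken reply out.
        echo "neko-voice-pipeline.sh"
      else
        # Speech-to-text only.
        echo "deepgram-transcribe.sh"
      fi
      ;;
    *)
      echo "unknown input kind: $input_kind" >&2
      return 1
      ;;
  esac
}
```

For example, `choose_script audio yes` selects `neko-voice-pipeline.sh`, while `choose_script text` selects `deepgram-tts.sh`.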

Outputs

STT output

deepgram-transcribe.sh writes:

transcript text file
raw API JSON file next to it

TTS output

deepgram-tts.sh writes:

MP3 output file

Pipeline output

neko-voice-pipeline.sh prints JSON with:

out_dir
transcript_path
transcript
reply_audio_path

This makes it easy to wire into scripts or adapters.
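
One way to consume that JSON from another script is sketched below. It assumes `jq` is installed (the skill does not list it as a dependency), the helper name `pipeline_field` is hypothetical, and the sample values are made up. Only the four field names come from the list above.

```bash
#!/usr/bin/env bash
# Read one top-level field from the pipeline's JSON output.
# Requires jq (an assumption; the skill itself does not declare it).
pipeline_field() {
  local json="$1"
  local field="$2"
  # --arg passes the field name as data, so it is never parsed as jq code.
  # "// empty" prints nothing when the field is missing or null.
  printf '%s' "$json" | jq -r --arg f "$field" '.[$f] // empty'
}

# Sample output with the four documented fields; the values are invented.
sample='{
  "out_dir": "/tmp/voice-demo",
  "transcript_path": "/tmp/voice-demo/transcript.txt",
  "transcript": "hello there",
  "reply_audio_path": "/tmp/voice-demo/reply.mp3"
}'

transcript="$(pipeline_field "$sample" transcript)"
reply_audio="$(pipeline_field "$sample" reply_audio_path)"
echo "transcript: $transcript"
echo "reply audio: $reply_audio"
```

In real use, `sample` would be replaced by the captured output of `{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "..."`.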

Typical Uses

Prefer this skill for:

transcribing Telegram/QQ/OneBot voice messages
generating MP3 replies to short voice prompts
building bot-side voice input/output automation
testing speech pipelines from shell without introducing a full SDK

Notes

Defaults are tuned for lightweight practical use, not maximal configurability.
deepgram-transcribe.sh defaults to model=nova-2 and language=zh.
deepgram-tts.sh defaults to model=aura-2-luna-en; override the model when a different voice is preferred.
Inspect the raw JSON transcript response when debugging recognition quality or API errors.
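
When inspecting the raw JSON, a short summary is often enough. The sketch below assumes `jq` is installed and that the file holds a Deepgram pre-recorded transcription response, where the transcript sits under `results.channels[0].alternatives[0]`. Treating `err_msg` as the error field is also an assumption about the API's error payload, and `summarize_stt_json` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Summarize a raw Deepgram STT response file (requires jq, an assumption).
# Prints the API error message if one is present; otherwise prints the first
# alternative's confidence and transcript.
summarize_stt_json() {
  local json_file="$1"
  jq -r '
    if has("err_msg") then
      "error: \(.err_msg)"
    else
      .results.channels[0].alternatives[0]
      | "confidence: \(.confidence)\ntranscript: \(.transcript)"
    end
  ' "$json_file"
}
```

Calling `summarize_stt_json /path/to/raw-response.json` (a placeholder path) gives a quick check on whether low confidence or an API error explains a poor transcript.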

References

Read these files when needed:

references/stt-notes.md for transcription details
references/tts-notes.md for speech synthesis details
references/pipeline-notes.md for end-to-end pipeline behavior

Usage Guidance

This skill appears to do what it says (call Deepgram STT/TTS and write transcripts/MP3s), but the package metadata did not declare the required DEEPGRAM_API_KEY, and the scripts will fail without it. Before installing or running:

1) Do not put sensitive credentials into a shared root file; prefer setting DEEPGRAM_API_KEY in the invoking user's environment rather than relying on /root/.openclaw/.env.
2) Verify that the Deepgram API key you provide is scoped appropriately (rotate it and limit permissions where possible).
3) Inspect the three shell scripts yourself (they are short) to confirm you are comfortable with network calls to api.deepgram.com and with files being written to /tmp or your chosen out_dir.
4) Be cautious because the skill source/homepage is unknown; if you need stronger assurance, ask the publisher for provenance or a homepage before use.

Capability Analysis

Type: OpenClaw Skill
Name: deepgram-voice-workflow
Version: 0.1.0

The scripts `deepgram-transcribe.sh` and `deepgram-tts.sh` contain shell injection vulnerabilities because command-line arguments (such as `--model`, `--language`, and `--content-type`) are expanded directly into a double-quoted string within a `curl` command. An attacker could exploit this via prompt injection to execute arbitrary commands. Furthermore, the scripts hardcode a sensitive credential lookup path at `/root/.openclaw/.env`, which is a high-privilege location. While the logic appears intended for legitimate Deepgram API integration, these significant security flaws and risky file access patterns warrant a suspicious classification.
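
One common mitigation for this class of problem is to validate option values against a strict allowlist before they are placed into the request. The sketch below is not taken from the bundled scripts, whose actual `curl` invocation is not shown here. The function names are hypothetical, and the URL uses Deepgram's `/v1/listen` endpoint with the defaults named in the Notes section.

```bash
#!/usr/bin/env bash
# Hypothetical hardening sketch: validate user-supplied option values before
# they are interpolated into a request URL.

# Accept only non-empty values made of letters, digits, dot, underscore,
# and hyphen. Anything else (spaces, $, &, ;, quotes) is rejected.
is_safe_param() {
  case "$1" in
    '' | *[!A-Za-z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Build the STT request URL from validated values only.
build_listen_url() {
  local model="$1"
  local language="$2"

  if ! is_safe_param "$model"; then
    echo "invalid model value" >&2
    return 1
  fi
  if ! is_safe_param "$language"; then
    echo "invalid language value" >&2
    return 1
  fi

  printf 'https://api.deepgram.com/v1/listen?model=%s&language=%s\n' \
    "$model" "$language"
}

# Example with the documented defaults; the result should always be passed
# to curl as one quoted word, e.g. curl ... "$url".
url="$(build_listen_url "nova-2" "zh")"
echo "$url"
```

Rejecting unexpected characters up front also stops a crafted value from adding extra query parameters to the URL, which plain quoting alone would not prevent.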

Capability Assessment

⚠ Purpose & Capability

Name, description, and included scripts are consistent with an end-to-end Deepgram STT/TTS pipeline; however, the registry metadata declares no required environment variables or primary credential, while the runtime explicitly requires DEEPGRAM_API_KEY. This mismatch between declared and actual requirements could surprise users.

⚠ Instruction Scope

Runtime instructions and bundled scripts are narrowly scoped to: read an input audio file, call api.deepgram.com, and write transcript/MP3 outputs. However the scripts also look for /root/.openclaw/.env as a fallback for DEEPGRAM_API_KEY (documented in SKILL.md). Reading a root-level config file is outside the minimal scope and was not declared in the registry metadata.

✓ Install Mechanism

No install spec (instruction-only with bundled shell scripts). No remote downloads, no package installs, and no code obfuscation — this is low-risk from an install mechanism perspective.

⚠ Credentials

The scripts require a Deepgram API token (DEEPGRAM_API_KEY) to function, but the skill registry lists no required env/primary credential. The fallback that reads /root/.openclaw/.env is a privileged file path not declared. Requesting a single Deepgram key is proportionate to the stated purpose, but the undeclared root-level config access is problematic.

✓ Persistence & Privilege

The skill does not request persistent/system-wide privileges (always=false). It does not modify other skills or system configs. It creates local output files (under /tmp or user-specified directories) which is expected.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install deepgram-voice-workflow
After installation, invoke the skill by name or use /deepgram-voice-workflow
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.1.0

Initial public release

Metadata

Slug deepgram-voice-workflow

Version 0.1.0

License MIT-0

All-time Installs 2

Active Installs 2

Total Versions 1

Frequently Asked Questions

What is Deepgram Voice Workflow?

End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin... It is an AI Agent Skill for Claude Code / OpenClaw, with 288 downloads so far.

How do I install Deepgram Voice Workflow?

Run "/install deepgram-voice-workflow" in the OpenClaw or Claude Code chat to install it in one step. Installation itself needs no extra setup, but the scripts require DEEPGRAM_API_KEY to be set before they can run.

Is Deepgram Voice Workflow free?

Yes, Deepgram Voice Workflow is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Deepgram Voice Workflow support?

Deepgram Voice Workflow is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Deepgram Voice Workflow?

It is built and maintained by MengBad (@mengbad); the current version is v0.1.0.
