/install deepgram-voice-workflow
Deepgram Voice Workflow
Overview
Use this skill for a complete speech workflow:
- transcribe audio to text with Deepgram STT
- optionally synthesize a spoken reply with Deepgram TTS
- return structured outputs that can feed chat or agent pipelines
This skill is the right choice when the task is broader than plain transcription and needs an input-audio to output-audio pipeline.
Quick Start
Transcribe only
{baseDir}/scripts/deepgram-transcribe.sh /path/to/audio.ogg
Generate speech from text
{baseDir}/scripts/deepgram-tts.sh "你好,我是 Neko。"
Run the full pipeline
{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "收到啦,这是语音回复测试。"
Environment
Set DEEPGRAM_API_KEY before use.
The bundled scripts also fall back to reading it from:
/root/.openclaw/.env
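The lookup order described above can be sketched as a small shell function. This is a minimal illustration, not the bundled scripts' actual code: the function name load_deepgram_key is hypothetical, and it assumes the env file stores the key as a plain, unquoted DEEPGRAM_API_KEY=value line.

```shell
# Hypothetical helper (not part of the skill): resolve DEEPGRAM_API_KEY.
# An already-set environment variable wins; otherwise read the env file.
# Assumes a plain, unquoted DEEPGRAM_API_KEY=value line in that file.
load_deepgram_key() {
  env_file="${1:-/root/.openclaw/.env}"
  if [ -z "${DEEPGRAM_API_KEY:-}" ] && [ -r "$env_file" ]; then
    # Use the last matching line so later entries override earlier ones.
    DEEPGRAM_API_KEY="$(sed -n 's/^DEEPGRAM_API_KEY=//p' "$env_file" | tail -n 1)"
  fi
  if [ -z "${DEEPGRAM_API_KEY:-}" ]; then
    echo "DEEPGRAM_API_KEY is not set" >&2
    return 1
  fi
  export DEEPGRAM_API_KEY
}
```

Calling load_deepgram_key with no argument checks the environment first and then the default file path; passing a path as the first argument points it at a different env file.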
Workflow Decision
Use deepgram-transcribe.sh when
- only text transcription is needed
- the downstream system will generate its own reply
- the task is speech-to-text only
Use deepgram-tts.sh when
- text already exists
- only an MP3 spoken response is needed
- the workflow is text-to-speech only
Use neko-voice-pipeline.sh when
- the task begins with an audio file
- a transcript is needed
- an optional spoken reply should be generated in the same flow
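One way to read the decision rules above is as a lookup on two questions: what kind of input exists, and whether a spoken reply is wanted. The sketch below encodes that reading; choose_script and its argument values are hypothetical names introduced for illustration, not part of the skill.

```shell
# Hypothetical helper: map a task to one of the three bundled scripts.
#   $1 = input kind: "audio" or "text"
#   $2 = spoken reply wanted: "yes" or "no"
choose_script() {
  case "$1:$2" in
    text:*)    echo "deepgram-tts.sh" ;;         # text already exists, only speech is needed
    audio:yes) echo "neko-voice-pipeline.sh" ;;  # transcript plus a spoken reply in one flow
    audio:no)  echo "deepgram-transcribe.sh" ;;  # speech-to-text only
    *)
      echo "unknown combination: $1 $2" >&2
      return 1
      ;;
  esac
}
```

Because the spoken reply is optional in neko-voice-pipeline.sh, the audio:no case could also be served by the pipeline script; the mapping here simply picks the narrowest script for each task.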
Outputs
STT output
deepgram-transcribe.sh writes:
- transcript text file
- raw API JSON file next to it
TTS output
deepgram-tts.sh writes:
- MP3 output file
Pipeline output
neko-voice-pipeline.sh prints JSON with:
- out_dir
- transcript_path
- transcript
- reply_audio_path
This makes it easy to wire into scripts or adapters.
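As a rough sketch of that wiring, a caller can capture the pipeline's JSON and pull out single fields. The json_field helper below is hypothetical; it assumes the fields are top-level strings, and it uses jq when installed, falling back to python3 otherwise.

```shell
# Hypothetical helper: print one top-level field of a JSON object read from stdin.
# Prints nothing if the field is missing or null.
json_field() {
  if command -v jq >/dev/null 2>&1; then
    jq -r --arg k "$1" '.[$k] // empty'
  else
    python3 -c 'import json, sys; v = json.load(sys.stdin).get(sys.argv[1]); print("" if v is None else v)' "$1"
  fi
}

# Example wiring (commented out because it needs the skill installed and a real audio file):
# result="$("{baseDir}/scripts/neko-voice-pipeline.sh" /path/to/audio.ogg --reply "收到啦,这是语音回复测试。")"
# transcript="$(printf '%s' "$result" | json_field transcript)"
# reply_audio="$(printf '%s' "$result" | json_field reply_audio_path)"
```

An empty reply_audio value is then a simple signal that no spoken reply was generated, assuming the script omits or nulls that field in that case.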
Typical Uses
Prefer this skill for:
- transcribing Telegram/QQ/OneBot voice messages
- generating MP3 replies to short voice prompts
- building bot-side voice input/output automation
- testing speech pipelines from shell without introducing a full SDK
Notes
- Defaults are tuned for lightweight practical use, not maximal configurability.
- deepgram-transcribe.sh defaults to model=nova-2 and language=zh.
- deepgram-tts.sh defaults to model=aura-2-luna-en; override the model when a different voice is preferred.
- Inspect the raw JSON transcript response when debugging recognition quality or API errors.
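To make the defaults concrete, the sketch below shows how they could map onto Deepgram's REST endpoints (/v1/listen for STT, /v1/speak for TTS). The helper names stt_url, tts_url, and transcribe_raw are hypothetical, and the bundled scripts may build their requests differently; the audio/ogg content type is an assumption matching the .ogg examples.

```shell
# Hypothetical helpers: build request URLs using the defaults named in the notes.
stt_url() {
  # $1 = model (default nova-2), $2 = language (default zh)
  printf 'https://api.deepgram.com/v1/listen?model=%s&language=%s\n' "${1:-nova-2}" "${2:-zh}"
}

tts_url() {
  # $1 = voice model (default aura-2-luna-en)
  printf 'https://api.deepgram.com/v1/speak?model=%s\n' "${1:-aura-2-luna-en}"
}

# Send an audio file for transcription and print the raw JSON response.
# Usage: transcribe_raw /path/to/audio.ogg [model] [language]
transcribe_raw() {
  curl -sS -X POST "$(stt_url "${2:-}" "${3:-}")" \
    -H "Authorization: Token ${DEEPGRAM_API_KEY}" \
    -H "Content-Type: audio/ogg" \
    --data-binary @"$1"
}
```

Saving the output of transcribe_raw to a file gives the same kind of raw JSON that is worth inspecting when recognition quality looks off or the API returns an error.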
References
Read these files when needed:
- references/stt-notes.md for transcription details
- references/tts-notes.md for speech synthesis details
- references/pipeline-notes.md for end-to-end pipeline behavior
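Installation
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install deepgram-voice-workflow
- Once installation finishes, invoke the Skill by name or trigger it with /deepgram-voice-workflow
- Provide the inputs described in the Skill's parameter notes to get structured output.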
What is Deepgram Voice Workflow?
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 288 downloads to date.
How do I install Deepgram Voice Workflow?
Run the command "/install deepgram-voice-workflow" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is Deepgram Voice Workflow free?
Yes. Deepgram Voice Workflow is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.
Which platforms does Deepgram Voice Workflow support?
Deepgram Voice Workflow is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Deepgram Voice Workflow?
It is developed and maintained by MengBad (@mengbad); the current version is v0.1.0.