
Deepgram Voice Workflow

Author: MengBad · v0.1.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 288

Install in OpenClaw

/install deepgram-voice-workflow

Description

End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...

Usage (SKILL.md)

Deepgram Voice Workflow

Overview

Use this skill for a complete speech workflow:

transcribe audio to text with Deepgram STT
optionally synthesize a spoken reply with Deepgram TTS
return structured outputs that can feed chat or agent pipelines

This skill is the right choice when the task is broader than plain transcription and needs an input-audio to output-audio pipeline.

Quick Start

Transcribe only

{baseDir}/scripts/deepgram-transcribe.sh /path/to/audio.ogg

Generate speech from text

{baseDir}/scripts/deepgram-tts.sh "你好，我是 Neko。"

Run the full pipeline

{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "收到啦，这是语音回复测试。"

Environment

Set DEEPGRAM_API_KEY before use.

The bundled scripts also fall back to reading it from:

/root/.openclaw/.env
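
The lookup order described above (environment variable first, then the fallback file) can be sketched as follows. This is a minimal illustration, not the bundled scripts' actual code: the function name `resolve_deepgram_key` is hypothetical, and it assumes the fallback file holds a dotenv-style `DEEPGRAM_API_KEY=value` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve DEEPGRAM_API_KEY from the environment first,
# then from a dotenv-style fallback file. Not the bundled scripts' actual code.
resolve_deepgram_key() {
  local env_file="${1:-/root/.openclaw/.env}"
  local line value

  # 1) Prefer a key that is already set in the environment.
  if [ -n "${DEEPGRAM_API_KEY:-}" ]; then
    printf '%s\n' "$DEEPGRAM_API_KEY"
    return 0
  fi

  # 2) Fall back to a KEY=value line in the env file, if it is readable.
  if [ -r "$env_file" ]; then
    # Read the last matching line without sourcing the file,
    # so nothing inside the file is ever executed.
    line=$(grep -E '^(export[[:space:]]+)?DEEPGRAM_API_KEY=' "$env_file" | tail -n 1) || true
    value=${line#*=}
    # Strip one pair of surrounding quotes, if present.
    value=${value%\"}; value=${value#\"}
    value=${value%\'}; value=${value#\'}
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi

  echo "DEEPGRAM_API_KEY is not set and was not found in $env_file" >&2
  return 1
}
```

Passing a different path as the first argument, for example a per-user file, avoids relying on the root-owned default location.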

Workflow Decision

Use `deepgram-transcribe.sh` when

only text transcription is needed
the downstream system will generate its own reply
the task is speech-to-text only

Use `deepgram-tts.sh` when

text already exists
only an MP3 spoken response is needed
the workflow is text-to-speech only

Use `neko-voice-pipeline.sh` when

the task begins with an audio file
a transcript is needed
an optional spoken reply should be generated in the same flow
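
The decision rules above can be condensed into a small selector. This is only a sketch: `choose_voice_script` is a hypothetical helper, and it simplifies the rules by sending audio-only tasks to `deepgram-transcribe.sh`, even though `neko-voice-pipeline.sh` also accepts audio without a reply.

```shell
#!/usr/bin/env bash
# Hypothetical selector: prints the name of the bundled script that fits the task.
# Arguments: $1 = path to input audio ("" if none), $2 = reply text ("" if none).
choose_voice_script() {
  local audio="${1:-}" reply="${2:-}"
  if [ -n "$audio" ] && [ -n "$reply" ]; then
    echo "neko-voice-pipeline.sh"   # audio in, transcript plus spoken reply out
  elif [ -n "$audio" ]; then
    echo "deepgram-transcribe.sh"   # speech-to-text only
  elif [ -n "$reply" ]; then
    echo "deepgram-tts.sh"          # text-to-speech only
  else
    echo "nothing to do: pass an audio path, reply text, or both" >&2
    return 1
  fi
}
```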

Outputs

STT output

deepgram-transcribe.sh writes:

transcript text file
raw API JSON file next to it

TTS output

deepgram-tts.sh writes:

MP3 output file

Pipeline output

neko-voice-pipeline.sh prints JSON with:

out_dir
transcript_path
transcript
reply_audio_path

This makes it easy to wire into scripts or adapters.
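
As an illustration of that wiring, the sketch below reads one field from the pipeline's JSON with `jq`. It assumes `jq` is installed; the helper name `pipeline_field` and the sample values are made up, and representing a missing reply as `null` is an assumption rather than documented behavior.

```shell
#!/usr/bin/env bash
# Sketch: pull one field out of the pipeline's JSON output (read from stdin).
# Assumes jq is installed; pipeline_field is a hypothetical helper name.
pipeline_field() {
  local field="$1"
  # --arg passes the field name as data, so it is never parsed as a jq program.
  # "// empty" prints nothing when the field is missing or null.
  jq -r --arg f "$field" '.[$f] // empty'
}

# Made-up sample using the four documented keys; real values will differ.
sample_json='{"out_dir":"/tmp/example","transcript_path":"/tmp/example/transcript.txt","transcript":"hello","reply_audio_path":null}'

# Usage with the sample:
#   printf '%s' "$sample_json" | pipeline_field transcript
# Usage with the real script:
#   {baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg | pipeline_field transcript
```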

Typical Uses

Prefer this skill for:

transcribing Telegram/QQ/OneBot voice messages
generating MP3 replies to short voice prompts
building bot-side voice input/output automation
testing speech pipelines from shell without introducing a full SDK

Notes

Defaults are tuned for lightweight practical use, not maximal configurability.
deepgram-transcribe.sh defaults to model=nova-2 and language=zh.
deepgram-tts.sh defaults to model=aura-2-luna-en; override the model when a different voice is preferred.
Inspect the raw JSON transcript response when debugging recognition quality or API errors.
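
When inspecting the raw response, a short `jq` filter can surface the most useful fields. The sketch below assumes `jq` is installed and that the file follows Deepgram's usual pre-recorded response layout (`results.channels[].alternatives[]`) or its error layout (`err_code`, `err_msg`); the helper name `summarize_stt_json` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarize a raw Deepgram STT response read from stdin.
# Assumes jq is installed and the JSON uses Deepgram's pre-recorded layout
# or its error layout; summarize_stt_json is a hypothetical helper name.
summarize_stt_json() {
  jq -r '
    if .err_code then
      "error: \(.err_code): \(.err_msg)"
    else
      .results.channels[0].alternatives[0]
      | "confidence: \(.confidence)\ntranscript: \(.transcript)"
    end
  '
}

# Usage (the path is a placeholder for the raw JSON file written next to the transcript):
#   summarize_stt_json < /path/to/raw-response.json
```

A low confidence value next to a plausible transcript often points to a language or model mismatch rather than an API failure.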

References

Read these files when needed:

references/stt-notes.md for transcription details
references/tts-notes.md for speech synthesis details
references/pipeline-notes.md for end-to-end pipeline behavior

Security Recommendations

This skill appears to do what it says (call Deepgram STT/TTS and write transcripts/MP3s), but the package metadata does not declare the required DEEPGRAM_API_KEY, and the scripts will fail without it. Before installing or running:

1) Do not put sensitive credentials into a shared root file; prefer setting DEEPGRAM_API_KEY in the invoking user's environment rather than relying on /root/.openclaw/.env.
2) Verify that the Deepgram API key you provide is scoped appropriately (rotate it and limit permissions where possible).
3) Inspect the three shell scripts yourself (they are short) to confirm you are comfortable with network calls to api.deepgram.com and with files being written to /tmp or your chosen out_dir.
4) Be cautious because the skill source/homepage is unknown; if you need stronger assurance, ask the publisher for provenance or a homepage before use.

Functional Analysis

Type: OpenClaw Skill
Name: deepgram-voice-workflow
Version: 0.1.0

The scripts `deepgram-transcribe.sh` and `deepgram-tts.sh` contain shell injection vulnerabilities: command-line arguments (such as `--model`, `--language`, and `--content-type`) are expanded directly into a double-quoted string within a `curl` command. An attacker could exploit this via prompt injection to execute arbitrary commands. The scripts also hardcode a credential lookup path, `/root/.openclaw/.env`, which is a high-privilege location. Although the logic appears intended for legitimate Deepgram API integration, these security flaws and risky file access patterns warrant a suspicious classification.
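
One common mitigation for this class of issue is to validate option values against a strict allowlist before they reach the `curl` command. The sketch below is a hypothetical hardening helper, not code from the bundled scripts. The names `is_safe_param` and `build_listen_url` are made up, and the right fix depends on how the scripts actually assemble their command line. The `nova-2` and `zh` defaults and the `api.deepgram.com` host come from this page; the `/v1/listen` path is Deepgram's pre-recorded transcription endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical hardening helper, not taken from the bundled scripts:
# accept an option value only if every character is in a small allowlist.
is_safe_param() {
  local LC_ALL=C              # keep A-Z / a-z ranges ASCII-only
  local value="${1:-}"
  case "$value" in
    "" | *[!A-Za-z0-9._/-]*) return 1 ;;  # empty, or contains a disallowed character
    *) return 0 ;;
  esac
}

# Build the STT request URL only from validated values.
# Defaults mirror the documented ones (model=nova-2, language=zh).
build_listen_url() {
  local model="${1:-nova-2}" language="${2:-zh}"
  if ! is_safe_param "$model" || ! is_safe_param "$language"; then
    echo "rejected unsafe model or language value" >&2
    return 2
  fi
  printf 'https://api.deepgram.com/v1/listen?model=%s&language=%s\n' "$model" "$language"
}
```

The allowlist includes `/` so that content types such as `audio/ogg` pass, while quotes, spaces, `$`, and parentheses are rejected.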

Capability Assessment

⚠ Purpose & Capability

Name, description, and included scripts are consistent with an end-to-end Deepgram STT/TTS pipeline; however, the registry metadata declares no required environment variables or primary credential, while the runtime explicitly requires DEEPGRAM_API_KEY. This mismatch could surprise users.

⚠ Instruction Scope

Runtime instructions and bundled scripts are narrowly scoped to: read an input audio file, call api.deepgram.com, and write transcript/MP3 outputs. However the scripts also look for /root/.openclaw/.env as a fallback for DEEPGRAM_API_KEY (documented in SKILL.md). Reading a root-level config file is outside the minimal scope and was not declared in the registry metadata.

✓ Install Mechanism

No install spec (instruction-only with bundled shell scripts). No remote downloads, no package installs, and no code obfuscation, so this is low-risk from an install-mechanism perspective.

⚠ Credentials

The scripts require a Deepgram API token (DEEPGRAM_API_KEY) to function, but the skill registry lists no required env/primary credential. The fallback reads /root/.openclaw/.env, a privileged file path that is not declared. Requesting a single Deepgram key is proportionate to the stated purpose, but the undeclared root-level config access is problematic.

✓ Persistence & Privilege

The skill does not request persistent/system-wide privileges (always=false). It does not modify other skills or system configs. It creates local output files (under /tmp or user-specified directories) which is expected.

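How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install deepgram-voice-workflow
Once installed, invoke the Skill by name or trigger it with /deepgram-voice-workflow
Provide the inputs described in the Skill's parameter documentation to get structured output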

Version History

v0.1.0

Initial public release

Metadata

Slug deepgram-voice-workflow

Version 0.1.0

License MIT-0

Total installs 2

Current installs 2

Version count 1

FAQ