Discord Voice Using Deepgram

by adriel1006 · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
Downloads: 1448 · Stars: 5 · Active Installs: 1 · Versions: 1
Install in OpenClaw
/install discord-voice-deepgram
Description
Voice-channel conversations in Discord using Deepgram streaming STT + low-latency TTS
README (SKILL.md)

Deepgram Discord Voice (Clawdbot/OpenClaw Plugin)

This plugin lets you talk to your agent using only your voice, from a Discord voice channel.

Pipeline (low latency):

  • Discord voice audio → Deepgram streaming STT (WebSocket)
  • Transcript → your agent
  • Agent reply → Deepgram TTS (/v1/speak, streamed over HTTP as Ogg/Opus)
  • Audio played back into the voice channel
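
The two Deepgram hops in this pipeline can be sketched as URL builders. This is a minimal sketch, not the plugin's actual code: the helper names (toQuery, buildSttUrl, buildTtsUrl, authHeader) are hypothetical, and audio-format parameters for the STT socket are left out because the README does not say which ones the plugin sends.

```typescript
// Deepgram endpoints: live transcription (WebSocket) and text-to-speech (HTTP).
const DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen";
const DEEPGRAM_SPEAK_URL = "https://api.deepgram.com/v1/speak";

// Serialize the defined parameters into a URL query string.
function toQuery(params: Record<string, string | undefined>): string {
  const parts: string[] = [];
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value !== undefined) {
      parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
    }
  }
  return parts.join("&");
}

// Hypothetical helper: URL for the streaming STT socket.
// sttModel and language correspond to deepgram.sttModel and deepgram.language.
function buildSttUrl(sttModel: string, language?: string): string {
  return `${DEEPGRAM_LISTEN_URL}?${toQuery({ model: sttModel, language })}`;
}

// Hypothetical helper: URL for a TTS request that returns Ogg/Opus audio.
function buildTtsUrl(ttsVoice: string): string {
  const query = toQuery({ model: ttsVoice, encoding: "opus", container: "ogg" });
  return `${DEEPGRAM_SPEAK_URL}?${query}`;
}

// Both endpoints authenticate with an "Authorization: Token <key>" header.
function authHeader(apiKey: string): Record<string, string> {
  return { Authorization: `Token ${apiKey}` };
}
```

With the README's defaults, buildSttUrl("nova-2", "en-US") and buildTtsUrl("aura-2-thalia-en") give the two URLs a client would open, each sent with authHeader(DEEPGRAM_API_KEY).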

Requirements

  • A Discord bot token (DISCORD_TOKEN)
  • A Deepgram API key (DEEPGRAM_API_KEY)
  • Discord bot permissions in your server:
    • Connect
    • Speak
    • Use Voice Activity

Install

Option A: Install from ClawHub

  1. In your OpenClaw/Clawdbot dashboard, open Skills/Plugins.
  2. Add/install deepgram-discord-voice.
  3. Set the required environment variables.

Option B: Manual install

  1. Copy this folder into your extensions/plugins directory.
  2. Run:
     npm install
  3. Restart OpenClaw/Clawdbot.

Configuration

Key settings

  • primaryUser (recommended): Who the bot listens to by default.

    • Best: your Discord user ID (numeric)
    • Also supported: username/display name (e.g., atechy) if unique in-channel
  • allowVoiceSwitch: If true, the primary user can switch who is allowed by voice.

  • wakeWord: Prefix for voice control commands. Default: openclaw.

  • deepgram.sttModel: Default nova-2.

  • deepgram.language: Optional BCP‑47 language tag (e.g., en-US, es, es-EC).

  • ttsVoice: Deepgram Aura voice model (e.g., aura-2-thalia-en).

Example config

{
  "plugins": {
    "entries": {
      "deepgram-discord-voice": {
        "enabled": true,
        "config": {
          "streamingSTT": true,
          "streamingTTS": true,

          "primaryUser": "atechy",
          "allowVoiceSwitch": true,
          "wakeWord": "openclaw",

          "ttsVoice": "aura-2-thalia-en",
          "vadSensitivity": "medium",
          "bargeIn": true,

          "deepgram": {
            "sttModel": "nova-2",
            "language": "en-US"
          }
        }
      }
    }
  }
}
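
For readers who load this config from TypeScript, the settings above might be typed and defaulted roughly as follows. Only the wakeWord default (openclaw) and the deepgram.sttModel default (nova-2) come from the README. The other default values are assumptions copied from the example config, and the names VoiceConfig, VoiceConfigInput, DEFAULTS, and withDefaults are hypothetical.

```typescript
interface DeepgramSettings {
  sttModel: string;
  language?: string; // optional BCP-47 tag, e.g. "en-US"
}

interface VoiceConfig {
  streamingSTT: boolean;
  streamingTTS: boolean;
  primaryUser?: string; // Discord user ID (preferred) or username
  allowVoiceSwitch: boolean;
  wakeWord: string;
  ttsVoice: string;
  vadSensitivity: string; // "medium" in the example; other values are not documented
  bargeIn: boolean;
  deepgram: DeepgramSettings;
}

// Every field is optional on input, including the nested deepgram block.
type VoiceConfigInput = Partial<Omit<VoiceConfig, "deepgram">> & {
  deepgram?: Partial<DeepgramSettings>;
};

// Assumption: apart from wakeWord and sttModel, these values mirror the
// example config and are not documented defaults.
const DEFAULTS: VoiceConfig = {
  streamingSTT: true,
  streamingTTS: true,
  allowVoiceSwitch: true,
  wakeWord: "openclaw",
  ttsVoice: "aura-2-thalia-en",
  vadSensitivity: "medium",
  bargeIn: true,
  deepgram: { sttModel: "nova-2" },
};

// Merge user settings over the defaults, one level deep for deepgram.
function withDefaults(input: VoiceConfigInput): VoiceConfig {
  return {
    ...DEFAULTS,
    ...input,
    deepgram: { ...DEFAULTS.deepgram, ...input.deepgram },
  };
}
```

Merging the nested deepgram block separately means a config that sets only deepgram.language still picks up the default STT model.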

Usage

Join a voice channel

Use the plugin tool or slash command (depends on your OpenClaw setup):

  • Join: action=join with the channelId
  • Leave: action=leave

Talk (voice channel)

Once the bot is connected, just speak.

Safeguard: only listen to you (default)

When primaryUser is set, the plugin will only listen to that user unless you allow someone else.
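
A speaker gate of this kind could look like the sketch below. It is an illustration under stated assumptions, not the plugin's code: Speaker, matchesUser, and SpeakerGate are hypothetical names, an unset primaryUser is assumed to mean everyone is heard, and the primary user is assumed to stay audible after allowing someone else.

```typescript
interface Speaker {
  id: string; // numeric Discord user ID, as a string
  username: string;
  displayName?: string;
}

// Match a speaker against an ID, @mention-style name, username, or display name.
function matchesUser(speaker: Speaker, ref: string): boolean {
  const wanted = ref.trim().replace(/^@/, "").toLowerCase();
  if (wanted === "") return false;
  if (/^\d+$/.test(wanted)) return speaker.id === wanted;
  return (
    speaker.username.toLowerCase() === wanted ||
    (speaker.displayName ?? "").toLowerCase() === wanted
  );
}

class SpeakerGate {
  private readonly primaryUser: string | undefined;
  private extra: string | undefined; // one additionally allowed speaker

  constructor(primaryUser?: string) {
    this.primaryUser = primaryUser;
  }

  // Assumption: with no primaryUser configured, nobody is filtered out.
  isAllowed(speaker: Speaker): boolean {
    if (this.primaryUser === undefined) return true;
    if (matchesUser(speaker, this.primaryUser)) return true;
    return this.extra !== undefined && matchesUser(speaker, this.extra);
  }

  allow(ref: string): void {
    this.extra = ref;
  }

  onlyMe(): void {
    this.extra = undefined;
  }
}
```

Matching on the numeric ID first is why the README recommends it: usernames and display names can collide within a channel, while IDs cannot.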

Let someone else talk (voice commands)

As the primary user, say:

  • openclaw allow <name>
  • openclaw listen to <name>

To lock it back:

  • openclaw only me
  • openclaw reset
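
The four phrases above can be recognized in a transcript with a small parser. The sketch below is one way to do it and makes assumptions the README does not confirm: parseVoiceCommand and VoiceCommand are hypothetical names, "allow" and "listen to" are treated as the same command, and punctuation added by the STT engine is stripped before matching.

```typescript
type VoiceCommand = { kind: "allow"; name: string } | { kind: "only_me" };

// Returns the parsed command, or null when the transcript is ordinary speech.
function parseVoiceCommand(
  transcript: string,
  wakeWord: string = "openclaw"
): VoiceCommand | null {
  // Normalize case, drop common punctuation, and collapse whitespace.
  const text = transcript
    .toLowerCase()
    .replace(/[.,!?]/g, "")
    .replace(/\s+/g, " ")
    .trim();

  const prefix = wakeWord.toLowerCase() + " ";
  if (!text.startsWith(prefix)) return null;

  const rest = text.slice(prefix.length).trim();
  if (rest === "only me" || rest === "reset") return { kind: "only_me" };

  const match = /^(?:allow|listen to) (.+)$/.exec(rest);
  const name = match?.[1];
  return name ? { kind: "allow", name } : null;
}
```

For example, the transcript "Openclaw, allow Bob." parses to an allow command for "bob", while speech that does not begin with the wake word returns null and would be forwarded to the agent as normal input.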

Switch via tool actions (optional)

  • allow_speaker with user (id / @mention / name)
  • only_me
  • status

Notes

  • Lowest latency comes from streamingSTT=true and streamingTTS=true.
  • Deepgram TTS is streamed over HTTP in Ogg/Opus so Discord can play it immediately.
Usage Guidance
Before installing, be aware of these points:

  • Credentials: The plugin requires a Discord bot token and a Deepgram API key. Decide whether those keys will be stored in OpenClaw/Clawdbot config or environment variables; prefer least-privilege tokens for the Discord bot (only Connect/Speak/Voice Activity), and don't reuse high-privilege tokens.
  • Agent access: Voice input is forwarded into your embedded agent via runEmbeddedPiAgent. The plugin intentionally supplies an extra system prompt and does not enforce a restrictive 'lane', meaning the invoked agent may have access to its usual tools and persisted session data. If your agent has tools that can access external services or secrets, voice input could indirectly trigger them. If you don't want that, do not enable this plugin or inspect/modify the runEmbeddedPiAgent call to restrict tool access.
  • Session persistence: Transcripts and session IDs are stored via the platform session store/workspace. If you handle sensitive conversations, verify where the session store is located and who can read it.
  • Test safely: Try this in a throwaway Discord server with a bot that has minimal permissions and with a non-production Deepgram key. Review and, if needed, modify the code to (a) explicitly restrict which tools the agent may use when invoked by voice, (b) avoid sending or persisting sensitive context, and (c) require an explicit opt-in to auto-join channels.
  • If you lack trust in the source (homepage unknown, owner ID only): prefer official/verified plugins or conduct a code review. The behavior is plausible for the stated purpose, but the privilege surface (embedded agent invocation + persisted sessions + undocumented system-prompt injection) merits caution.
Capability Analysis
Type: OpenClaw Skill
Name: discord-voice-deepgram
Version: 1.0.0

The OpenClaw Deepgram Discord Voice skill bundle appears benign. It integrates Discord voice channels with Deepgram STT/TTS and the OpenClaw agent. All network calls are directed to legitimate Discord and Deepgram APIs. Sensitive API keys (DISCORD_TOKEN, DEEPGRAM_API_KEY) are handled via standard configuration or environment variables. The `index.ts` file passes transcribed user input to the core agent, which is an inherent prompt injection risk in any LLM-based system, but the plugin includes an `extraSystemPrompt` to guide the agent's behavior, acting as a defense rather than an attack. There is no evidence of data exfiltration, unauthorized execution, persistence mechanisms, or obfuscation. The `SKILL.md` provides instructions for users and the OpenClaw platform, not malicious directives for the AI agent.
Capability Assessment
Purpose & Capability
Name/description match the code: this is a Discord voice plugin that uses Deepgram for STT/TTS and routes transcripts to the agent. However, the registry metadata listed no required env vars while the SKILL.md and code expect a Discord token and a Deepgram API key (DISCORD_TOKEN / DEEPGRAM_API_KEY or config.deepgram.apiKey). That mismatch is an inconsistency to be aware of.
Instruction Scope
The SKILL.md and code instruct the plugin to join voice channels, stream audio to Deepgram, and forward transcripts to the embedded agent. The code builds an extraSystemPrompt and calls runEmbeddedPiAgent (the agent is told it has access to its normal tools/skills and the user's Discord ID). The plugin also reads/writes the session store and agent workspace via core-bridge. Those actions go beyond simple STT/TTS plumbing because they give the invoked agent contextual info and access to its usual toolset and persisted session data — a potential surprise/privilege escalation if you weren't expecting that.
Install Mechanism
This is effectively an instruction-plus-source package (package.json present). There's no packaged install spec in the registry, so install is manual via npm (npm install). Dependencies are standard npm packages (discord.js, @discordjs/voice, ws, etc.) from normal registries — no obscure download URLs or extract steps were found.
Credentials
The plugin legitimately needs a Discord bot token and a Deepgram API key. The code reads Deepgram keys from config or environment and attempts to get the Discord token from the host OpenClaw/Clawdbot main config (mainConfig.channels.discord.token or mainConfig.discord.token) rather than directly requiring an env var. This is plausible but should be called out: the plugin expects access to your platform's Discord token storage and may also read Deepgram keys from env/config, so credential placement matters.
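The lookup described here can be sketched as a single resolver, which makes it easier to see where each credential may come from. This is an illustration only: resolveCredentials and the interface names are hypothetical, the precedence order (config before environment) is an assumption, and the DISCORD_TOKEN fallback is included because the README lists that variable, not because the reviewed code is known to read it.

```typescript
interface MainConfig {
  channels?: { discord?: { token?: string } };
  discord?: { token?: string };
}

interface PluginConfig {
  deepgram?: { apiKey?: string };
}

type Env = Record<string, string | undefined>;

interface Credentials {
  discordToken: string;
  deepgramApiKey: string;
}

// Resolve both credentials, failing loudly when one is missing.
function resolveCredentials(
  main: MainConfig,
  plugin: PluginConfig,
  env: Env
): Credentials {
  // Assumed order: host config locations first, then the environment.
  const discordToken =
    main.channels?.discord?.token ?? main.discord?.token ?? env["DISCORD_TOKEN"];
  const deepgramApiKey = plugin.deepgram?.apiKey ?? env["DEEPGRAM_API_KEY"];

  if (!discordToken) throw new Error("Discord bot token not found");
  if (!deepgramApiKey) throw new Error("Deepgram API key not found");
  return { discordToken, deepgramApiKey };
}
```

Passing the environment in as a parameter (for example process.env in Node.js) keeps the lookup order testable and makes the credential sources explicit.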
Persistence & Privilege
The plugin loads Clawdbot core modules and uses them to resolve agent workspace, session store, and to run an embedded agent. It also creates/updates session entries (saving a session store). It intentionally removed a commented-out 'lane' restriction and passes an extra system prompt telling the agent it 'has access to all your normal tools and skills'. That combination (embedded agent invocation + persisted session state + broad tool access) increases the blast radius of voice-triggered operations and is not clearly surfaced in SKILL.md.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install discord-voice-deepgram
  3. After installation, invoke the skill by name or use /discord-voice-deepgram
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Deepgram Discord Voice skill for Clawdbot/OpenClaw.

  • Enables real-time voice conversations in Discord voice channels using Deepgram streaming STT and low-latency TTS.
  • Configurable to listen only to a specified primary user, with optional voice-activated speaker switching.
  • Supports wake-word activated voice commands for controlling permissions.
  • Provides example configurations and clear setup instructions.
  • Requires DISCORD_TOKEN and DEEPGRAM_API_KEY environment variables.
  • Delivers agent replies via Deepgram Aura TTS, streamed directly to the channel for minimal latency.
Metadata
Slug discord-voice-deepgram
Version 1.0.0
License
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Discord Voice Using Deepgram?

Voice-channel conversations in Discord using Deepgram streaming STT + low-latency TTS. It is an AI Agent Skill for Claude Code / OpenClaw, with 1448 downloads so far.

How do I install Discord Voice Using Deepgram?

Run "/install discord-voice-deepgram" in the OpenClaw or Claude Code chat, then set the DISCORD_TOKEN and DEEPGRAM_API_KEY credentials that the README lists under Requirements.

Is Discord Voice Using Deepgram free?

The listing describes Discord Voice Using Deepgram as free to download, install, and use. The License field in the Metadata is empty, and the plugin requires your own Deepgram API key.

Which platforms does Discord Voice Using Deepgram support?

Discord Voice Using Deepgram is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Discord Voice Using Deepgram?

It is built and maintained by adriel1006 (@adriel1006); the current version is v1.0.0.
