Description

Provides a patch for Clawdbot that fixes TTS auto-replies on inbound voice memos by disabling block streaming so that the final payload reaches the TTS pipeline.

README (SKILL.md)

Discord Voice Memo Upgrades - Skill Documentation

Name: discord voice memo upgrade
Author: koto9x

Overview

This skill provides a core patch for Moltbot that fixes voice memo TTS auto-replies. The issue occurs when block streaming prevents the final payload from reaching the TTS synthesis pipeline.

Type

Core Patch / Documentation

This is not a traditional plugin that extends functionality - it's a documentation package with patch files for core Clawdbot modifications.

Use Case

Use this if you're experiencing:

Voice memos not triggering TTS responses
TTS working for text messages but not audio messages
TTS auto mode = "inbound" not functioning

Installation Methods

Method 1: Manual Patch (Recommended for Development)

# 1. Locate your clawdbot installation
#    (assumes a global npm layout: <prefix>/bin/clawdbot and <prefix>/lib/node_modules)
CLAWDBOT_PATH="$(which clawdbot)"
CLAWDBOT_DIR="$(dirname "$(dirname "$CLAWDBOT_PATH")")"

# 2. Back up the original files
cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js.backup"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js.backup"

# 3. Apply the patch (run from the directory that contains patch/)
cp patch/dispatch-from-config.js "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/"
cp patch/tts.js "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/"

# 4. Restart clawdbot
clawdbot restart
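
The same backup-and-copy steps can also be scripted. The following is a minimal Node.js sketch, not part of the skill: applyPatch, TARGETS and both arguments are hypothetical names, and packageDir is assumed to be the installed package directory (for example "$CLAWDBOT_DIR/lib/node_modules/clawdbot").

```javascript
// Sketch only: mirrors the manual backup-and-copy steps above.
// `applyPatch` and `TARGETS` are hypothetical names, not Clawdbot APIs.
const fs = require("node:fs");
const path = require("node:path");

// The two files named in the manual steps, relative to the package directory.
const TARGETS = [
  { patch: "dispatch-from-config.js", dest: "dist/auto-reply/reply/dispatch-from-config.js" },
  { patch: "tts.js", dest: "dist/tts/tts.js" },
];

function applyPatch(packageDir, patchDir) {
  const applied = [];
  for (const { patch, dest } of TARGETS) {
    const target = path.join(packageDir, dest);
    const backup = `${target}.backup`;
    // Only create the backup once, so a second run cannot overwrite
    // the pristine original with an already patched file.
    if (!fs.existsSync(backup)) fs.copyFileSync(target, backup);
    fs.copyFileSync(path.join(patchDir, patch), target);
    applied.push(target);
  }
  return applied;
}
```

Unlike the plain cp commands, this sketch refuses to overwrite an existing .backup file, which keeps the revert path safe if the patch is applied twice. Restarting clawdbot is still a separate manual step.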

Method 2: Wait for Upstream

If this patch gets accepted into core Clawdbot, you can simply update:

npm install -g clawdbot@latest

Configuration

No additional configuration needed beyond existing TTS settings. Ensure you have:

{
  "messages": {
    "tts": {
      "auto": "inbound",  // or "always"
      "provider": "elevenlabs",  // or "openai" or "edge"
      "elevenlabs": {
        "apiKey": "your-key-here"
      }
    }
  }
}
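
As a quick sanity check before testing, the settings above can be validated programmatically. This is a hedged sketch: checkTtsConfig is a hypothetical helper, and it assumes that each keyed provider stores its key under a block named after the provider (as the elevenlabs example does) and that the "edge" provider needs no key. Keys supplied through other means, such as environment variables, would not be seen by it.

```javascript
// Sketch only: `checkTtsConfig` is a hypothetical helper, not a Clawdbot API.
// It assumes the config shape shown above: messages.tts.{auto, provider, <provider>.apiKey}.
function checkTtsConfig(config) {
  const tts = config?.messages?.tts ?? {};
  const problems = [];
  if (!["inbound", "always"].includes(tts.auto)) {
    problems.push(`auto is ${JSON.stringify(tts.auto)}; expected "inbound" or "always"`);
  }
  if (!tts.provider) {
    problems.push("provider is not set");
  } else if (tts.provider !== "edge" && !tts[tts.provider]?.apiKey) {
    // Assumption: "edge" needs no key; other providers keep apiKey under their own name.
    problems.push(`no apiKey found under "${tts.provider}"`);
  }
  return problems;
}
```

An empty array means the three prerequisites look satisfied; each string in a non-empty result names one setting to revisit.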

How to Test

Configure TTS with auto: "inbound"
Send a voice memo to your bot

Check logs for debug output:

[TTS-DEBUG] inboundAudio=true ttsAutoResolved=inbound ttsWillFire=true
[TTS-APPLY] PASSED all checks, proceeding to textToSpeech
[TTS-SPEECH] ...

Verify bot responds with audio

Debug Logging

The patch includes extensive debug logging. To view:

# Logs will show in your clawdbot console
clawdbot gateway start

Look for:

[TTS-DEBUG] - Shows TTS detection logic
[TTS-APPLY] - Shows TTS payload processing decisions
[TTS-SPEECH] - Shows TTS synthesis attempt
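
When the console output is long, it can help to count the three tags in captured log text and see which stage is missing. The sketch below is illustrative only: summarizeTtsLog is a hypothetical helper, and it assumes the log has been saved or piped into a string.

```javascript
// Sketch only: `summarizeTtsLog` is a hypothetical helper for captured console output.
const TTS_TAGS = ["[TTS-DEBUG]", "[TTS-APPLY]", "[TTS-SPEECH]"];

function summarizeTtsLog(logText) {
  const counts = Object.fromEntries(TTS_TAGS.map((tag) => [tag, 0]));
  for (const line of logText.split(/\r?\n/)) {
    for (const tag of TTS_TAGS) {
      if (line.includes(tag)) counts[tag] += 1;
    }
  }
  // First tag, in pipeline order, that never appeared; null if all three were seen.
  const firstMissing = TTS_TAGS.find((tag) => counts[tag] === 0) ?? null;
  return { counts, firstMissing };
}
```

If firstMissing is "[TTS-APPLY]", for example, detection ran but the payload never reached the TTS processing step, which points back at block streaming still being enabled.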

Production Deployment

Important: Before deploying to production, consider:

Remove debug logging - The console.log statements should be removed or made configurable
Test thoroughly - Ensure voice memos work correctly
Monitor performance - Disabling block streaming may impact streaming behavior

To remove debug logging, edit the patched files and remove lines containing:

console.log('[TTS-DEBUG]'
console.log('[TTS-APPLY]'
console.log('[TTS-SPEECH]'
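
Removing those lines by hand is error-prone, so a small filter can do it. This is a sketch under a stated assumption: every tagged console.log call sits on a single line. A call that spans several lines would be left half-removed, so diff the result before installing it. stripTtsDebugLines is a hypothetical name.

```javascript
// Sketch only: `stripTtsDebugLines` is a hypothetical helper.
// It drops every line that starts one of the three tagged console.log calls.
// Assumption: each debug statement fits on one line; review the diff afterwards.
const DEBUG_LOG_PATTERN = /console\.log\(\s*['"`]\[TTS-(?:DEBUG|APPLY|SPEECH)\]/;

function stripTtsDebugLines(source) {
  return source
    .split("\n")
    .filter((line) => !DEBUG_LOG_PATTERN.test(line))
    .join("\n");
}
```

Typical use would be to read each patch file, pass its text through stripTtsDebugLines, and write the result to a new file that is then copied into place instead of the original patch.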

Reverting

If you need to revert the patch:

# Restore backups
CLAWDBOT_PATH="$(which clawdbot)"
CLAWDBOT_DIR="$(dirname "$(dirname "$CLAWDBOT_PATH")")"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js.backup" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js.backup" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js"

clawdbot restart

Technical Details

The Problem

Block streaming is used to send incremental text chunks to the user as they're generated. However, TTS synthesis hooks into the "final" payload type by default. When block streaming is enabled:

Text chunks are sent as "block" payloads
The final assembled text is sent as a "final" payload
The block streaming optimization then drops the final payload, because its text has already been sent
TTS never fires because it only processes "final" payloads
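
The sequence above can be reproduced with a toy model. Everything in this sketch is hypothetical (the payload shape, the deliver function, the blockStreaming flag); it illustrates the described behaviour and is not Clawdbot's actual delivery code.

```javascript
// Toy model only: payload shapes and names are illustrative, not Clawdbot internals.
function deliver(payloads, { blockStreaming }) {
  const sentToUser = [];
  const seenByTts = [];
  for (const payload of payloads) {
    if (payload.kind === "block") {
      // Incremental chunks are only sent individually when block streaming is on.
      if (blockStreaming) sentToUser.push(payload.text);
      continue;
    }
    // payload.kind === "final"
    if (blockStreaming) continue; // text already sent as blocks, so the final payload is dropped
    sentToUser.push(payload.text);
    seenByTts.push(payload.text); // TTS only ever looks at final payloads
  }
  return { sentToUser, seenByTts };
}

const payloads = [
  { kind: "block", text: "Hello, " },
  { kind: "block", text: "world." },
  { kind: "final", text: "Hello, world." },
];
```

Calling deliver(payloads, { blockStreaming: true }) leaves seenByTts empty, while { blockStreaming: false } hands the full final text to the TTS side. That difference is the whole bug.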

The Solution

The patch adds detection logic to identify when TTS should fire:

Inbound message has audio attachment (isInboundAudioContext())
TTS auto mode is "inbound" or "always"
Valid TTS provider and API key configured

When these conditions are met, block streaming is temporarily disabled for that specific reply, ensuring the final payload reaches the TTS pipeline.
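
Expressed as code, the decision might look like the sketch below. The function and field names are illustrative assumptions rather than the patch's real signatures; only the three conditions and the disableBlockStreaming option come from the description above.

```javascript
// Sketch only: names and argument shapes are illustrative assumptions.
// The real checks live in the patched dispatch-from-config.js.
function ttsWillFire({ inboundAudio, ttsAuto, providerReady }) {
  const autoEnabled = ttsAuto === "inbound" || ttsAuto === "always";
  // All three conditions from the list above must hold.
  return Boolean(inboundAudio && autoEnabled && providerReady);
}

// Block streaming is turned off only for replies where TTS is expected to run.
function replyOptionsFor(state) {
  return { disableBlockStreaming: ttsWillFire(state) };
}
```

Because the flag is computed per reply, ordinary text conversations keep their streaming behaviour; only voice-memo replies that will be synthesized give it up.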

Code Flow

dispatchReplyFromConfig()
  ├─ isInboundAudioContext(ctx) → detects audio
  ├─ resolveSessionTtsAuto(ctx, cfg) → gets TTS settings
  ├─ ttsWillFire = conditions met?
  └─ getReplyFromConfig({ disableBlockStreaming: ttsWillFire })
       └─ maybeApplyTtsToPayload() receives final payload
            └─ textToSpeech() synthesizes audio

Compatibility

Clawdbot: 1.0.0+
Node.js: 18+
Platforms: All platforms supported by Clawdbot

Known Issues

Debug logging is verbose (should be removed for production)
Modifies compiled dist files (not source)
May need to reapply after clawdbot updates

Contributing

To improve this patch:

Test with different TTS providers (OpenAI, ElevenLabs, Edge)
Test with different auto modes ("always", "inbound", "tagged")
Suggest optimizations to reduce debug logging overhead
Propose integration into core Clawdbot source

Support

If you encounter issues:

Check logs for [TTS-DEBUG] output
Verify TTS configuration is correct
Ensure API keys are valid
Check that block streaming was actually disabled (disableBlockStreaming: true in logs)

License

Same as Moltbot.

Usage Guidance

This package is a focused core patch that appears to do what it says, but do NOT apply the provided patch directly to a production instance as-is. Actionable steps:

- Inspect the two patch files yourself and verify no unexpected network calls or hardcoded endpoints exist.
- Remove or convert the console.log debug lines before applying to any environment that contains real user data or secrets (dispatch-from-config.js logs message bodies; tts.js logs provider and partial API key values).
- Back up the original dist files (SKILL.md shows backup commands) and test in an isolated/staging instance first.
- Prefer submitting the minimal logical change (disableBlockStreaming: ttsWillFire) as a PR to upstream Clawdbot rather than repeatedly patching compiled dist files locally.
- After applying, monitor logs for accidental leaks and ensure any logged API key fragments are not retained in centralized logs.

Capability Analysis

Type: OpenClaw Skill
Name: discord-voice-memo-upgrade
Version: 1.0.0

The skill is classified as suspicious due to the presence of extensive `console.log` debug statements in `patch/dispatch-from-config.js` and `patch/tts.js`. These logs output internal state, including truncated message bodies (`Body`) and API key status (e.g., `apiKey=SET(xxxx...)` or `MISSING`). While the documentation (`SKILL.md`, `README.md`, `PATCH.md`, `CHANGELOG.md`, `UPLOAD.md`) explicitly states these are debug logs and should be removed for production, their inclusion constitutes a risky capability that could expose sensitive information if deployed without modification.

There is no clear evidence of intentional malicious behavior or unauthorized data exfiltration to external endpoints, as API keys are used for legitimate TTS services and the instructions are for manual patching by a human administrator, not for AI agent prompt injection.

Capability Assessment

✓ Purpose & Capability

The files, README and SKILL.md consistently describe a small core change: detect inbound audio and set disableBlockStreaming so the final payload reaches the TTS pipeline. The included patch files modify the exact dist files named in the documentation; nothing unrelated (e.g., cloud provider credentials, unrelated system hooks) is requested or included.

⚠ Instruction Scope

Runtime instructions tell you to overwrite files inside node_modules/clawdbot/dist and restart Clawdbot — that is consistent with a core patch but is intrusive. The patched code emits verbose console.log debug messages that include message bodies (ctx.Body slice) and prints a portion of API keys; this causes sensitive user content and credential fragments to be written to process logs. SKILL.md acknowledges debug logging should be removed for production, but the provided patch as-is directs the agent/operator to install code that will log secrets.

ℹ Install Mechanism

No remote install or download is used — the skill is instruction-only and bundles the patch files for manual copy. That lowers supply-chain risk (no arbitrary URL downloads), but the installation requires write access to node_modules and manual replacement of compiled dist files, which is an operational risk and can be error-prone.

⚠ Credentials

The package does not request external environment variables, which is reasonable. However, the code reads Clawdbot config/prefs and API key fields (OpenAI/ElevenLabs/etc.) and then logs their status — including printing the first 8 chars of an API key — which risks credential exposure in logs. Reading Clawdbot session store and prefs is within scope for TTS detection, but logging those values is disproportionate to the stated fix and creates a data-leak risk.
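
If some diagnostic output is worth keeping, it can report state without echoing secrets or message content. The following is a sketch of that idea with hypothetical helper names; it is not code from the patch.

```javascript
// Sketch only: hypothetical helpers showing redacted diagnostics.
// Report whether a key exists without printing any of its characters.
function describeApiKey(apiKey) {
  return typeof apiKey === "string" && apiKey.length > 0 ? "SET" : "MISSING";
}

// Report the size of a message body instead of a slice of its text.
function describeBody(body) {
  return `len=${typeof body === "string" ? body.length : 0}`;
}
```

A log line built from these values, such as apiKey=SET body=len=42, still shows whether the checks passed while keeping key fragments and user text out of the logs.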

ℹ Persistence & Privilege

The skill does not request elevated platform privileges and 'always' is false. It does, however, instruct modification of compiled dist files inside the installed Clawdbot package; this change persists until reverted and may be overwritten by updates. The package does not modify other skills' configs or agent-wide settings beyond the targeted dist files.

Version History

v1.0.0

- Initial release of Discord Voice Memo Upgrades core patch for Moltbot.
- Fixes TTS auto-reply for voice memos by ensuring the final payload is processed, even with block streaming enabled.
- Adds detailed debug logging for TTS detection and processing.
- Provides manual patch instructions and reversion steps.
- No configuration changes required beyond current TTS settings.
- Documentation includes detailed troubleshooting, production considerations, compatibility, and support guidance.

Metadata

Slug discord-voice-memo-upgrade

Version 1.0.0

License —

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is discord voice memo upgrade?

It is a patch for Clawdbot that fixes TTS auto-replies on inbound voice memos by disabling block streaming so that the final payload reaches the TTS pipeline. It is an AI Agent Skill for Claude Code / OpenClaw, with 1658 downloads so far.

How do I install discord voice memo upgrade?

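Run "/install discord-voice-memo-upgrade" in the OpenClaw or Claude Code chat to install the skill package. The skill is instruction-only, so the bundled patch files still have to be copied into the Clawdbot installation manually, as described under Installation Methods.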

Is discord voice memo upgrade free?

Yes, discord voice memo upgrade is completely free (open-source). You can download, install and use it at no cost.

Which platforms does discord voice memo upgrade support?

discord voice memo upgrade is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created discord voice memo upgrade?

It is built and maintained by koto9x (@koto9x); the current version is v1.0.0.
