Description

Provides a patch for Clawdbot that fixes TTS auto-replies on inbound voice memos by disabling block streaming so that the final payload reaches the TTS pipeline.

Usage Instructions (SKILL.md)

Discord Voice Memo Upgrades - Skill Documentation

Name: discord voice memo upgrade
Author: koto9x

Overview

This skill provides a core patch for Moltbot that fixes voice memo TTS auto-replies. The issue occurs when block streaming prevents the final payload from reaching the TTS synthesis pipeline.

Type

Core Patch / Documentation

This is not a traditional plugin that extends functionality - it's a documentation package with patch files for core Clawdbot modifications.

Use Case

Use this if you're experiencing:

Voice memos not triggering TTS responses
TTS working for text messages but not audio messages
TTS auto mode = "inbound" not functioning

Installation Methods

Method 1: Manual Patch (Recommended for Development)

# 1. Locate your clawdbot installation
CLAWDBOT_PATH="$(which clawdbot)"
CLAWDBOT_DIR="$(dirname "$(dirname "$CLAWDBOT_PATH")")"

# 2. Backup original files (paths are quoted so directories with spaces still work)
cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js.backup"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js.backup"

# 3. Apply patch (run from the directory that contains patch/)
cp patch/dispatch-from-config.js "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/"
cp patch/tts.js "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/"

# 4. Restart clawdbot
clawdbot restart

Method 2: Wait for Upstream

If this patch gets accepted into core Clawdbot, you can simply update:

npm install -g clawdbot@latest

Configuration

No additional configuration needed beyond existing TTS settings. Ensure you have:

{
  "messages": {
    "tts": {
      "auto": "inbound",  // or "always"
      "provider": "openai",  // or "elevenlabs" or "edge"
      "elevenlabs": {
        "apiKey": "your-key-here"
      }
    }
  }
}
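As a quick sanity check before testing, the settings above can be validated with a few lines of JavaScript. This is only a sketch: `checkTtsConfig` is a hypothetical helper, not part of Clawdbot, and it inspects only the fields shown in the snippet above (it does not verify API keys).

```javascript
// Hypothetical helper: validates only the fields shown in the config snippet.
const AUTO_MODES_FOR_VOICE = ["inbound", "always"];
const KNOWN_PROVIDERS = ["openai", "elevenlabs", "edge"];

function checkTtsConfig(config) {
  const tts = config?.messages?.tts;
  if (!tts) {
    return ["messages.tts is missing"];
  }
  const problems = [];
  if (!AUTO_MODES_FOR_VOICE.includes(tts.auto)) {
    problems.push(`messages.tts.auto is "${tts.auto}", expected "inbound" or "always"`);
  }
  if (!KNOWN_PROVIDERS.includes(tts.provider)) {
    problems.push(`messages.tts.provider is "${tts.provider}", expected one of: ${KNOWN_PROVIDERS.join(", ")}`);
  }
  return problems;
}

const example = { messages: { tts: { auto: "inbound", provider: "openai" } } };
console.log(checkTtsConfig(example)); // an empty list means nothing to report
```

An empty result only means the two fields look plausible; whether the provider actually accepts the configured key still has to be tested with a real voice memo.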

How to Test

Configure TTS with auto: "inbound"
Send a voice memo to your bot

Check logs for debug output:

[TTS-DEBUG] inboundAudio=true ttsAutoResolved=inbound ttsWillFire=true
[TTS-APPLY] PASSED all checks, proceeding to textToSpeech
[TTS-SPEECH] ...

Verify bot responds with audio
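When scanning a long log, it can help to pull the key=value fields out of the tagged lines. The sketch below assumes the log lines look exactly like the samples above; `parseTtsDebugLine` is a hypothetical helper, and real lines from the patch may carry different or additional fields.

```javascript
// Hypothetical helper: parses lines shaped like the sample debug output above.
function parseTtsDebugLine(line) {
  const match = /^\[(TTS-[A-Z]+)\]\s*(.*)$/.exec(line.trim());
  if (!match) return null; // not a tagged TTS line
  const [, tag, rest] = match;
  const fields = {};
  for (const token of rest.split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return { tag, fields };
}

const parsed = parseTtsDebugLine(
  "[TTS-DEBUG] inboundAudio=true ttsAutoResolved=inbound ttsWillFire=true"
);
console.log(parsed.tag, parsed.fields.ttsWillFire);
```

Values are kept as strings, so `ttsWillFire` comes back as `"true"` rather than a boolean.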

Debug Logging

The patch includes extensive debug logging. To view:

# Logs will show in your clawdbot console
clawdbot gateway start

Look for:

[TTS-DEBUG] - Shows TTS detection logic
[TTS-APPLY] - Shows TTS payload processing decisions
[TTS-SPEECH] - Shows TTS synthesis attempt

Production Deployment

Important: Before deploying to production, consider:

Remove debug logging - The console.log statements should be removed or made configurable
Test thoroughly - Ensure voice memos work correctly
Monitor performance - With block streaming disabled for a reply, the text is no longer sent in incremental chunks, so replies to voice memos arrive only once they are complete
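One way to make the debug output configurable rather than deleting it is to route it through a small gated logger. This is a sketch under assumptions: `TTS_DEBUG` is a hypothetical environment variable and `createTtsDebugLogger` a hypothetical helper; neither exists in Clawdbot or in the patch.

```javascript
// Assumption: TTS_DEBUG is a made-up environment variable, not a Clawdbot setting.
function createTtsDebugLogger(enabled, sink = console.log) {
  return function ttsDebug(tag, message) {
    if (!enabled) return false; // logging is off: emit nothing
    sink(`[${tag}] ${message}`);
    return true;
  };
}

// Enabled only when the process is started with TTS_DEBUG=1.
const ttsDebug = createTtsDebugLogger(process.env.TTS_DEBUG === "1");
ttsDebug("TTS-DEBUG", "inboundAudio=true ttsAutoResolved=inbound ttsWillFire=true");
```

Passing the sink as a parameter keeps the logger testable and makes it easy to swap `console.log` for whatever logging facility the deployment already uses.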

To remove debug logging, edit the patched files and remove lines containing:

console.log('[TTS-DEBUG]'
console.log('[TTS-APPLY]'
console.log('[TTS-SPEECH]'
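Removing those lines by hand is error-prone in compiled files, so a small filter can help. The sketch below is an assumption-laden example: `stripTtsDebugLines` is a hypothetical helper, it tolerates any quote style after `console.log(`, and it only handles statements that fit on one line. A `console.log` call spanning several lines would be left half-removed, so review the result before copying it into place.

```javascript
// Matches single-line console.log calls that start with one of the three tags.
const TTS_DEBUG_CALL = /console\.log\(\s*['"`]\[TTS-(?:DEBUG|APPLY|SPEECH)\]/;

// Hypothetical helper: drops matching lines and reports how many were removed.
function stripTtsDebugLines(source) {
  const kept = [];
  let removedCount = 0;
  for (const line of source.split("\n")) {
    if (TTS_DEBUG_CALL.test(line)) {
      removedCount += 1;
    } else {
      kept.push(line);
    }
  }
  return { source: kept.join("\n"), removedCount };
}

const sample = [
  "const ttsWillFire = inboundAudio && autoOk;",
  "console.log('[TTS-DEBUG] ttsWillFire=' + ttsWillFire);",
  "return ttsWillFire;",
].join("\n");
console.log(stripTtsDebugLines(sample).removedCount); // 1
```

To apply it to the patch files, read each file as a string, pass it through the function, and write the returned `source` back after checking the removed count looks plausible.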

Reverting

If you need to revert the patch:

# Restore backups
CLAWDBOT_PATH="$(which clawdbot)"
CLAWDBOT_DIR="$(dirname "$(dirname "$CLAWDBOT_PATH")")"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js.backup" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/auto-reply/reply/dispatch-from-config.js"

cp "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js.backup" \
   "$CLAWDBOT_DIR/lib/node_modules/clawdbot/dist/tts/tts.js"

clawdbot restart

Technical Details

The Problem

Block streaming is used to send incremental text chunks to the user as they're generated. However, TTS synthesis hooks into the "final" payload type by default. When block streaming is enabled:

Text chunks are sent as "block" payloads
The final assembled text is sent as a "final" payload
But block streaming optimization drops the final payload (text already sent)
TTS never fires because it only processes "final" payloads
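The four steps above can be illustrated with a toy model. This is not Clawdbot code: `deliverReply` and the payload shape are invented for the example, and the only behavior modeled is that TTS sees a payload when its kind is "final".

```javascript
// Toy model of the delivery path; none of these names come from Clawdbot.
function deliverReply(chunks, { blockStreaming }) {
  const sentToUser = [];
  const seenByTts = [];
  const emit = (payload) => {
    sentToUser.push(payload);
    // TTS only processes "final" payloads.
    if (payload.kind === "final") seenByTts.push(payload.text);
  };

  if (blockStreaming) {
    for (const text of chunks) emit({ kind: "block", text });
    // Optimization being modeled: the text was already sent as blocks,
    // so the "final" payload is dropped and TTS never runs.
  } else {
    emit({ kind: "final", text: chunks.join("") });
  }
  return { sentToUser, seenByTts };
}

const streamed = deliverReply(["Hello ", "world"], { blockStreaming: true });
const unstreamed = deliverReply(["Hello ", "world"], { blockStreaming: false });
console.log(streamed.seenByTts.length); // 0: TTS never sees a "final" payload
console.log(unstreamed.seenByTts);      // [ 'Hello world' ]
```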

The Solution

The patch adds detection logic to identify when TTS should fire:

Inbound message has audio attachment (isInboundAudioContext())
TTS auto mode is "inbound" or "always"
Valid TTS provider and API key configured

When these conditions are met, block streaming is temporarily disabled for that specific reply, ensuring the final payload reaches the TTS pipeline.
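The decision described above boils down to one boolean. The sketch below is a simplified stand-in, not the patch itself: the function takes plain values instead of the real `ctx` and `cfg` objects, and `providerReady` is a hypothetical flag standing for "valid TTS provider and API key configured".

```javascript
// Simplified stand-in for the detection logic; inputs are plain values
// rather than the real Clawdbot context and config objects.
function ttsWillFire({ inboundHasAudio, autoMode, providerReady }) {
  const autoOk = autoMode === "inbound" || autoMode === "always";
  return Boolean(inboundHasAudio) && autoOk && Boolean(providerReady);
}

// Block streaming is disabled only for replies where TTS is going to run.
const willFire = ttsWillFire({
  inboundHasAudio: true,
  autoMode: "inbound",
  providerReady: true,
});
const replyOptions = { disableBlockStreaming: willFire };
console.log(replyOptions);
```

Because the flag is computed per reply, ordinary text conversations keep incremental streaming; only replies to audio messages wait for the full text.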

Code Flow

dispatchReplyFromConfig()
  ├─ isInboundAudioContext(ctx) → detects audio
  ├─ resolveSessionTtsAuto(ctx, cfg) → gets TTS settings
  ├─ ttsWillFire = conditions met?
  └─ getReplyFromConfig({ disableBlockStreaming: ttsWillFire })
       └─ maybeApplyTtsToPayload() receives final payload
            └─ textToSpeech() synthesizes audio

Compatibility

Clawdbot: 1.0.0+
Node.js: 18+
Platforms: All platforms supported by Clawdbot

Known Issues

Debug logging is verbose (should be removed for production)
Modifies compiled dist files (not source)
May need to reapply after clawdbot updates

Contributing

To improve this patch:

Test with different TTS providers (OpenAI, ElevenLabs, Edge)
Test with different auto modes ("always", "inbound", "tagged")
Suggest optimizations to reduce debug logging overhead
Propose integration into core Clawdbot source

Support

If you encounter issues:

Check logs for [TTS-DEBUG] output
Verify TTS configuration is correct
Ensure API keys are valid
Check that block streaming was actually disabled (disableBlockStreaming: true in logs)

License

Same as Moltbot.

Security Recommendations

This package is a focused core patch that appears to do what it says, but do NOT apply the provided patch directly to a production instance as-is. Actionable steps:

- Inspect the two patch files yourself and verify no unexpected network calls or hardcoded endpoints exist.
- Remove or convert the console.log debug lines before applying to any environment that contains real user data or secrets (dispatch-from-config.js logs message bodies; tts.js logs provider and partial API key values).
- Back up the original dist files (SKILL.md shows backup commands) and test in an isolated/staging instance first.
- Prefer submitting the minimal logical change (disableBlockStreaming: ttsWillFire) as a PR to upstream Clawdbot rather than repeatedly patching compiled dist files locally.
- After applying, monitor logs for accidental leaks and ensure any logged API key fragments are not retained in centralized logs.
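If the debug lines are converted rather than removed, one option is to log only whether a value is present, never a fragment of it. The helpers below are hypothetical and are not taken from the patch; they sketch what a sanitized replacement for the key and message-body logging could look like.

```javascript
// Hypothetical helpers: report presence or size, never the content itself.
function describeSecret(value) {
  return typeof value === "string" && value.length > 0 ? "SET" : "MISSING";
}

function describeBody(body) {
  return typeof body === "string" ? `<${body.length} chars>` : "<none>";
}

// Example values only; no real key or message is involved.
console.log(`apiKey=${describeSecret("example-key")} body=${describeBody("hello")}`);
```

This keeps the diagnostic value of the original output (is a key configured, was there a message body) without writing user content or credential fragments to process logs.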

Functional Analysis

Type: OpenClaw Skill
Name: discord-voice-memo-upgrade
Version: 1.0.0

The skill is classified as suspicious due to the presence of extensive `console.log` debug statements in `patch/dispatch-from-config.js` and `patch/tts.js`. These logs output internal state, including truncated message bodies (`Body`) and API key status (e.g., `apiKey=SET(xxxx...)` or `MISSING`). While the documentation (`SKILL.md`, `README.md`, `PATCH.md`, `CHANGELOG.md`, `UPLOAD.md`) explicitly states these are debug logs and should be removed for production, their inclusion constitutes a risky capability that could expose sensitive information if deployed without modification. There is no clear evidence of intentional malicious behavior or unauthorized data exfiltration to external endpoints, as API keys are used for legitimate TTS services and the instructions are for manual patching by a human administrator, not for AI agent prompt injection.

Capability Assessment

✓ Purpose & Capability

The files, README and SKILL.md consistently describe a small core change: detect inbound audio and set disableBlockStreaming so the final payload reaches the TTS pipeline. The included patch files modify the exact dist files named in the documentation; nothing unrelated (e.g., cloud provider credentials, unrelated system hooks) is requested or included.

⚠ Instruction Scope

Runtime instructions tell you to overwrite files inside node_modules/clawdbot/dist and restart Clawdbot — that is consistent with a core patch but is intrusive. The patched code emits verbose console.log debug messages that include message bodies (ctx.Body slice) and prints a portion of API keys; this causes sensitive user content and credential fragments to be written to process logs. SKILL.md acknowledges debug logging should be removed for production, but the provided patch as-is directs the agent/operator to install code that will log secrets.

ℹ Install Mechanism

No remote install or download is used — the skill is instruction-only and bundles the patch files for manual copy. That lowers supply-chain risk (no arbitrary URL downloads), but the installation requires write access to node_modules and manual replacement of compiled dist files, which is an operational risk and can be error-prone.

⚠ Credentials

The package does not request external environment variables, which is reasonable. However, the code reads Clawdbot config/prefs and API key fields (OpenAI/ElevenLabs/etc.) and then logs their status — including printing the first 8 chars of an API key — which risks credential exposure in logs. Reading Clawdbot session store and prefs is within scope for TTS detection, but logging those values is disproportionate to the stated fix and creates a data-leak risk.

ℹ Persistence & Privilege

The skill does not request elevated platform privileges and 'always' is false. It does, however, instruct modification of compiled dist files inside the installed Clawdbot package; this change persists until reverted and may be overwritten by updates. The package does not modify other skills' configs or agent-wide settings beyond the targeted dist files.

Version History

v1.0.0

- Initial release of Discord Voice Memo Upgrades core patch for Moltbot.
- Fixes TTS auto-reply for voice memos by ensuring the final payload is processed, even with block streaming enabled.
- Adds detailed debug logging for TTS detection and processing.
- Provides manual patch instructions and reversion steps.
- No configuration changes required beyond current TTS settings.
- Documentation includes detailed troubleshooting, production considerations, compatibility, and support guidance.

Metadata

Slug discord-voice-memo-upgrade

Version 1.0.0

License not listed

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is discord voice memo upgrade?

Provides a patch for Clawdbot that fixes TTS auto-replies on inbound voice memos by disabling block streaming so that the final payload reaches the TTS pipeline. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1658 cumulative downloads to date.

How do I install discord voice memo upgrade?

Run the command "/install discord-voice-memo-upgrade" in an OpenClaw or Claude Code chat to install it in one step, with no additional configuration required.

Is discord voice memo upgrade free?

Yes, discord voice memo upgrade is completely free (open source) and can be freely downloaded, installed, and used.

Which platforms does discord voice memo upgrade support?

discord voice memo upgrade is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed discord voice memo upgrade?

It is developed and maintained by koto9x (@koto9x); the current version is v1.0.0.
