
Feishu Voice Note via FFmpeg

Author 17329971 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 69 · Favorites: 0 · Current installs: 0 · Versions: 1
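Install in OpenClaw
/install feishu-voice-note-ffmpeg
Description
Fixes the Feishu IM voice-bubble problem by using ffmpeg to convert the mp3 output of a TTS engine into the ogg-opus format that Feishu supports. Use cases: (1) TTS replies from a Feishu bot should appear as voice bubbles rather than file attachments, (2) Edge TTS or another TTS engine that only outputs mp3/webm needs to be adapted to Feishu, (3) custom TT...
Usage (SKILL.md)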

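Feishu voice bubbles with ffmpeg

In a Feishu bot, an audio message is displayed as a playable voice bubble only when it is sent in ogg-opus format. Audio sent in any other format (mp3, for example) shows up as a file attachment and cannot be played inline.

When to use

This approach can be applied directly in the following situations:

  • TTS already generates audio correctly, but Feishu only shows it as an attachment
  • You want to adapt existing mp3 / webm-opus output so that it appears as a Feishu voice bubble
  • You are building Feishu voice integration for OpenClaw, a custom bot, or a custom message pipeline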

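How it works

TTS engine (Edge TTS)
  → outputs mp3 (Edge TTS natively supports only mp3 and webm-opus)
    → ffmpeg transcodes to ogg-opus
      → Feishu API receives ogg → voice bubble is displayed ✅

Why transcoding is needed:

  • Edge TTS supports only the audio-24khz-48kbitrate-mono-mp3 (mp3) and webm-opus formats
  • Feishu officially recognizes only ogg-opus as a voice message (msg_type: audio)
  • Feishu does not recognize opus audio in a webm container and may treat it as video or as an unknown format
  • In Feishu, an mp3 file can only be sent as a file attachment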

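Command officially recommended by Feishu

ffmpeg -i input.mp3 -acodec libopus -ac 1 -ar 16000 output.opus

Parameters:

  • -acodec libopus: use the Opus encoder
  • -ac 1: mono (the standard for voice messages)
  • -ar 16000: 16 kHz sample rate (a balance between speech quality and file size)

The .opus output extension makes ffmpeg write the Opus stream into an Ogg container, which is the ogg-opus format Feishu expects.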
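
The command above can be wrapped for use from a pipeline. The following is a minimal Python sketch, not part of the original skill: the function names build_ffmpeg_args and transcode_to_opus are hypothetical, and the -y flag (overwrite an existing output file) is an addition that the recommended command does not include.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_args(src: Path, dst: Path) -> list[str]:
    """Return the argument list for the mp3 -> Ogg/Opus conversion."""
    return [
        "ffmpeg",
        "-y",                  # assumption: overwrite dst if it already exists
        "-i", str(src),        # input file (mp3 from the TTS engine)
        "-acodec", "libopus",  # Opus encoder
        "-ac", "1",            # mono
        "-ar", "16000",        # 16 kHz sample rate
        str(dst),              # a .opus name selects the Ogg/Opus muxer
    ]


def transcode_to_opus(src: Path) -> Path:
    """Run ffmpeg and return the path of the .opus file written next to src."""
    dst = src.with_suffix(".opus")
    # Passing a list (no shell=True) keeps file names from being shell-interpreted.
    subprocess.run(build_ffmpeg_args(src, dst), check=True, capture_output=True)
    return dst
```

Building the argument list in a separate function makes the exact command easy to log and to check without running ffmpeg.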

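Integrating with OpenClaw

Overview

In the TTS provider's synthesize function, check whether the current channel requires a voice bubble (determined from the target parameter). If it does:

  1. Call the target TTS engine to generate an mp3
  2. Automatically invoke ffmpeg to convert it to ogg-opus
  3. Return the .opus file path to the message-sending pipeline
  4. When the Feishu channel detects fileType: "opus", it sends the message with msg_type: "audio", which produces a voice bubble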
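
Step 4 can be illustrated with a small Python sketch of how a channel might choose the message type. This is an assumption-laden illustration, not OpenClaw's actual code: feishu_msg_type and build_message_body are hypothetical names, and the body fields (receive_id, msg_type, content, file_key) reflect a common reading of Feishu's send-message API that should be checked against the current official reference.

```python
import json


def feishu_msg_type(file_type: str) -> str:
    """Map the uploaded file type to the Feishu message type used to send it."""
    # Only opus is sent as a voice message; everything else stays a file attachment.
    return "audio" if file_type == "opus" else "file"


def build_message_body(receive_id: str, file_key: str, file_type: str) -> dict:
    """Build a send-message request body (field names are assumptions)."""
    return {
        "receive_id": receive_id,
        "msg_type": feishu_msg_type(file_type),
        # The content field is assumed to be a JSON-encoded string.
        "content": json.dumps({"file_key": file_key}),
    }
```

The file_key is assumed to come from a prior file upload; that upload call is outside the scope of this sketch.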

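Key integration point

TTS provider synthesize()
  → generate a temporary mp3 file
  → if target === "voice-note" (triggered automatically by the Feishu channel):
    → ffmpeg -i temp.mp3 ... temp.opus
    → return the temp.opus path
  → otherwise return the mp3 path directly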
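
The branching above might look like the following Python sketch. OpenClaw's real synthesize signature is not shown in this document, so the parameters here are hypothetical: tts_to_mp3 and transcode are injected callables standing in for the TTS engine and the ffmpeg step.

```python
from pathlib import Path
from typing import Callable


def synthesize(
    text: str,
    target: str,
    tts_to_mp3: Callable[[str], Path],
    transcode: Callable[[Path], Path],
) -> Path:
    """Return the audio file to send: .opus for voice notes, mp3 otherwise."""
    mp3_path = tts_to_mp3(text)  # step 1: the TTS engine writes a temporary mp3
    if target == "voice-note":   # the Feishu channel asks for a voice bubble
        return transcode(mp3_path)
    return mp3_path              # other channels receive the mp3 unchanged
```

Injecting the two steps as callables keeps the format decision inside the audio pipeline and lets the branch be exercised without a TTS engine or ffmpeg installed.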

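Error handling

  • If ffmpeg transcoding fails (ffmpeg not installed, bad arguments, and so on), fall back to returning the original mp3 file
  • After the fallback, the mp3 is sent as a file attachment, so the failure does not cause a crash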
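
A minimal Python sketch of this fallback, under the assumption that ffmpeg is launched as a subprocess. The name transcode_or_fallback and the ffmpeg_bin parameter are hypothetical, and -y is an added flag that overwrites an existing output file.

```python
import subprocess
from pathlib import Path


def transcode_or_fallback(mp3_path: Path, ffmpeg_bin: str = "ffmpeg") -> Path:
    """Return the .opus path on success, or the original mp3 path on failure."""
    opus_path = mp3_path.with_suffix(".opus")
    cmd = [
        ffmpeg_bin, "-y", "-i", str(mp3_path),
        "-acodec", "libopus", "-ac", "1", "-ar", "16000",
        str(opus_path),
    ]
    try:
        subprocess.run(cmd, check=True, capture_output=True)
    except (OSError, subprocess.CalledProcessError):
        # OSError: ffmpeg is missing or cannot be started.
        # CalledProcessError: ffmpeg ran but exited with a non-zero status.
        return mp3_path
    return opus_path
```

A caller can compare the returned suffix with ".opus" to know whether the message will go out as a voice bubble or as an attachment.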

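Format verification

The transcoded opus file can be verified as follows:

# Inspect the file format
ffprobe output.opus

# Confirm the codec (Windows)
ffprobe -show_streams output.opus | findstr codec

# Confirm the codec (Linux/macOS)
ffprobe -show_streams output.opus | grep codec

# Feishu compatibility checklist
# The file extension must be .opus
# The MIME type should be audio/opus or audio/ogg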
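
The same check can be automated. The Python sketch below asks ffprobe for JSON output and inspects the container and codec; the function names probe_file and is_ogg_opus are hypothetical, and the expected values ("ogg" as format name, "opus" as codec name) are what ffprobe is assumed to report for a file produced by the command above.

```python
import json
import subprocess


def probe_file(path: str) -> dict:
    """Run ffprobe and return its JSON description of the file."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def is_ogg_opus(probe: dict) -> bool:
    """True if the probe result describes a single Opus audio stream in Ogg."""
    format_names = probe.get("format", {}).get("format_name", "").split(",")
    audio_codecs = [
        s.get("codec_name")
        for s in probe.get("streams", [])
        if s.get("codec_type") == "audio"
    ]
    return "ogg" in format_names and audio_codecs == ["opus"]
```

Calling is_ogg_opus(probe_file("output.opus")) before upload gives the pipeline a chance to fall back to the mp3 if the output is not what Feishu expects.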

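Implementation advice

  • Prefer to transcode after the TTS provider produces its output and before the message is sent
  • Avoid pushing transcoding logic into upper-level business code; media format adaptation should stay inside the audio pipeline as far as possible
  • Make sure failures degrade gracefully first, and only then aim for "always send a voice bubble"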

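Known limitations

  • ffmpeg must be installed and available on the system PATH
  • Transcoding adds roughly 50-200 ms of latency (depending on audio length)
  • Temporary files need to be cleaned up promptly to avoid wasting disk space
  • After the TTS provider or message-channel component is updated, the integration code may need to be reapplied
  • The Feishu API requires opus files to stay under a size limit (voice messages of a few seconds are normally fine)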
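
Two of these limitations, the PATH requirement and temporary-file cleanup, can be handled with standard-library helpers. The following Python sketch is illustrative only; ffmpeg_available, audio_workdir, and the "feishu-voice-" directory prefix are hypothetical names.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


def ffmpeg_available() -> bool:
    """True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


@contextmanager
def audio_workdir():
    """Yield a temporary directory that is deleted, with its files, on exit."""
    with tempfile.TemporaryDirectory(prefix="feishu-voice-") as tmp:
        yield Path(tmp)
```

The mp3 and opus files would be written inside the with block, and the upload to Feishu has to finish before the block exits, because the directory and everything in it is removed at that point.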
Security recommendations
This skill appears to do what it says (mp3 -> opus conversion for Feishu), but before installing or enabling it consider the following:

  • Declare required binaries: SKILL.md requires ffmpeg and ffprobe in PATH but the skill metadata lists none. Ensure ffmpeg/ffprobe are installed from a trusted source and update the metadata so deployers know the dependency.
  • Temp files & permissions: the integration creates temporary audio files. Ensure they are written to a safe temp directory with least privilege, cleaned up promptly, and that the agent cannot be coerced into keeping or exposing them.
  • Shell/sanitization risk: ffmpeg/ffprobe are invoked via command lines in the doc. If filenames or parameters are built from user input, make sure to sanitize/escape them or use library bindings to avoid shell injection.
  • Input validation & limits: enforce size/duration limits on incoming TTS audio to avoid resource exhaustion or large uploads to Feishu. Confirm Feishu file-size limits in your environment.
  • Cross-platform notes: SKILL.md includes a Windows-specific findstr example for ffprobe output; adjust verification commands for Linux/macOS (grep) if your runtime is not Windows.
  • Security posture: run ffmpeg in a confined environment (container or sandbox) when processing untrusted audio, because ffmpeg has had parsing vulnerabilities historically. Keep ffmpeg up to date.

If you accept these operational conditions (installing ffmpeg, handling temp files and sanitization), the skill is functionally coherent. If you cannot guarantee safe handling of temporary files or command-line invocations, treat the skill with caution.
Functional analysis
Type: OpenClaw Skill
Name: feishu-voice-note-ffmpeg
Version: 1.0.0
The skill bundle consists of documentation (SKILL.md) explaining how to use ffmpeg to convert MP3 audio to the ogg-opus format required for Feishu (Lark) voice bubbles. It provides standard ffmpeg commands and integration logic for OpenClaw pipelines without any executable code, malicious instructions, or suspicious network/file activities.
Capability assessment
Purpose & Capability
The name/description and instructions consistently describe converting TTS mp3/webm to Feishu-compatible ogg-opus so Feishu will show a voice bubble. That capability legitimately requires ffmpeg and audio file I/O, so the purpose and capability are coherent. However, the skill's metadata declares no required binaries while the docs explicitly require ffmpeg in PATH — a discrepancy that should be corrected.
Instruction Scope
SKILL.md stays within scope: it describes generating a temp mp3, invoking ffmpeg to transcode to opus, returning the .opus path, and a downgrade path if transcode fails. It does not instruct reading unrelated files or unrelated environment variables. It does instruct use of command-line tools (ffmpeg, ffprobe, and a Windows findstr example) and temporary files, which implies file-system access and shell execution that must be handled safely.
Install Mechanism
This is instruction-only with no install spec (low install risk). But because the instructions require ffmpeg/ffprobe to be present on PATH, the skill should declare that binary requirement in metadata so deployers know to install it. No remote downloads or extracts are present.
Credentials
No environment variables or credentials are requested or required. That is proportionate for the stated function. The only implicit requirement is filesystem access to create/read temporary files and an executable ffmpeg on PATH.
Persistence & Privilege
The skill does not request persistent presence, does not modify other skills, and is not set to always:true. It is user-invocable and can be called autonomously by the agent (platform default) which is expected for a runtime integration skill.
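How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install feishu-voice-note-ffmpeg
  3. Once installation completes, invoke the Skill by name or trigger it with /feishu-voice-note-ffmpeg
  4. Provide the inputs described in the Skill's parameter notes to get structured output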
Version history
v1.0.0
Initial publish: document Feishu voice-bubble format requirements and the ffmpeg conversion path from MP3 to Ogg/Opus.
Metadata
Slug feishu-voice-note-ffmpeg
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
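FAQ

What is Feishu Voice Note via FFmpeg?

Fixes the Feishu IM voice-bubble problem by using ffmpeg to convert the mp3 output of a TTS engine into the ogg-opus format that Feishu supports. Use cases: (1) TTS replies from a Feishu bot should appear as voice bubbles rather than file attachments, (2) Edge TTS or another TTS engine that only outputs mp3/webm needs to be adapted to Feishu, (3) custom TT... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 69 times so far.

How do I install Feishu Voice Note via FFmpeg?

Run the command "/install feishu-voice-note-ffmpeg" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Feishu Voice Note via FFmpeg free?

Yes. Feishu Voice Note via FFmpeg is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Feishu Voice Note via FFmpeg support?

Feishu Voice Note via FFmpeg is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Feishu Voice Note via FFmpeg?

It is developed and maintained by 17329971 (@17329971); the current version is v1.0.0.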
