Feishu Voice Note via FFmpeg

by 17329971 · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install feishu-voice-note-ffmpeg

Description

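Solves the Feishu IM voice-bubble problem: uses ffmpeg to convert the mp3 output of a TTS engine into the ogg-opus format that Feishu supports. Use cases: (1) TTS replies from a Feishu bot should appear as voice bubbles rather than file attachments, (2) Edge TTS or another TTS engine that only outputs mp3/webm needs to be adapted for Feishu, (3) custom TT...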

README (SKILL.md)

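Feishu voice bubbles via ffmpeg

In a Feishu bot, a voice message appears as a playable voice bubble only when it is sent in ogg-opus format. Audio sent in any other format shows up as a file attachment and cannot be played inline.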

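Usage

This approach can be applied directly in the following situations:

TTS already generates audio correctly, but Feishu shows it only as an attachment
You want to adapt existing mp3 / webm-opus output into Feishu voice bubbles
You are building Feishu voice integration for OpenClaw, a custom bot, or a custom message pipeline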

How it works

TTS engine (Edge TTS)
  → outputs mp3 (Edge TTS natively supports only mp3 and webm-opus)
    → ffmpeg transcodes to ogg-opus
      → Feishu API receives ogg → shows a voice bubble ✅

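Why transcoding is needed:

Edge TTS supports only audio-24khz-48kbitrate-mono-mp3 (mp3) and webm-opus output
Feishu recognizes only ogg-opus as a voice message (msg_type: audio)
Feishu does not recognize opus in a webm container and may treat it as video or an unknown format
mp3 files can only be sent in Feishu as file attachments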

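Command recommended by Feishu

ffmpeg -i input.mp3 -acodec libopus -ac 1 -ar 16000 output.opus

Parameters:

-acodec libopus: encode with the Opus encoder
-ac 1: mono (standard for voice messages)
-ar 16000: 16 kHz sample rate (a balance between speech quality and file size)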
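
When the command is launched from code rather than typed in a shell, passing the arguments as a list avoids quoting problems with unusual file names. The following is a minimal sketch in Python; the function name build_opus_command and the extra -y flag (overwrite an existing output file) are assumptions added for illustration and are not part of the recommended command.

```python
from typing import List


def build_opus_command(src: str, dst: str, ffmpeg_bin: str = "ffmpeg") -> List[str]:
    """Return the argv list for the mp3 -> ogg-opus conversion.

    Hypothetical helper: mirrors the recommended command, plus -y.
    """
    return [
        ffmpeg_bin,
        "-y",                   # assumption: overwrite dst so unattended runs never prompt
        "-i", src,              # input file (for example the TTS mp3)
        "-acodec", "libopus",   # Opus encoder
        "-ac", "1",             # mono
        "-ar", "16000",         # 16 kHz sample rate
        dst,                    # output path, expected to end in .opus
    ]
```

Because the result is a plain list, it can be handed to subprocess.run without shell=True.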

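Integrating with OpenClaw

Overview

In the TTS provider's synthesize function, check whether the current channel requires a voice bubble (determined from the target parameter). If it does:

Call the target TTS engine to generate an mp3
Run ffmpeg automatically to convert it to ogg-opus
Return the .opus file path to the message-sending pipeline
The Feishu channel detects fileType: "opus" and sends the message with msg_type: "audio" → voice bubble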
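
The last step, choosing the message type from the file type, might look roughly like the sketch below. It assumes the audio has already been uploaded to Feishu and that a file key came back; the function name build_message_body and the exact body fields (receive_id, content with a file_key) reflect one reading of the Feishu message API and should be checked against the official documentation before use.

```python
import json
from typing import Dict


def build_message_body(receive_id: str, file_key: str, file_type: str) -> Dict[str, str]:
    """Build a send-message body; hypothetical helper, field names are assumptions."""
    # opus files go out as voice messages; anything else falls back to a file attachment
    msg_type = "audio" if file_type == "opus" else "file"
    return {
        "receive_id": receive_id,
        "msg_type": msg_type,
        # assumption: content is a JSON-encoded string carrying the uploaded file key
        "content": json.dumps({"file_key": file_key}),
    }
```

Keeping this decision in the channel layer means the TTS provider only has to return the right file, not know anything about Feishu message types.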

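Key integration point

TTS provider synthesize()
  → generate a temporary mp3 file
  → if target === "voice-note" (triggered automatically by the Feishu channel):
    → ffmpeg -i temp.mp3 ... temp.opus
    → return the temp.opus path
  → otherwise return the mp3 path directly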
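
The flow above can be sketched as a small function. This is an illustration in Python, not OpenClaw's actual provider interface: the two callables tts_to_mp3 and mp3_to_opus are hypothetical stand-ins for the real TTS engine and the ffmpeg step, and only the "voice-note" target value comes from the flow itself.

```python
from typing import Callable


def synthesize(
    text: str,
    target: str,
    tts_to_mp3: Callable[[str], str],   # hypothetical: text -> path of a temporary mp3
    mp3_to_opus: Callable[[str], str],  # hypothetical: mp3 path -> opus path
) -> str:
    """Return the path of the audio file to hand to the message pipeline."""
    mp3_path = tts_to_mp3(text)
    if target == "voice-note":
        # The channel asked for a voice bubble, so convert before returning
        return mp3_to_opus(mp3_path)
    # Every other target keeps the original mp3
    return mp3_path
```

Injecting the two steps as parameters keeps the branching logic testable without a TTS engine or ffmpeg installed.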

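Error handling

If the ffmpeg transcode fails (ffmpeg not installed, bad arguments, etc.), fall back to returning the original mp3 file
After the fallback, the mp3 is sent as a file attachment, and nothing crashes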
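
One possible shape for this fallback is sketched below in Python. The function name transcode_or_fallback, the 30-second timeout, and the -y flag are assumptions chosen for illustration; the ffmpeg encoding arguments are the ones from the recommended command.

```python
import shutil
import subprocess
from pathlib import Path


def transcode_or_fallback(mp3_path: str, ffmpeg_bin: str = "ffmpeg",
                          timeout_s: float = 30.0) -> str:
    """Return the .opus path on success, or the original mp3 path on any failure."""
    if shutil.which(ffmpeg_bin) is None:
        return mp3_path  # ffmpeg is not on PATH: degrade to a file attachment

    opus_path = str(Path(mp3_path).with_suffix(".opus"))
    cmd = [ffmpeg_bin, "-y", "-i", mp3_path,
           "-acodec", "libopus", "-ac", "1", "-ar", "16000", opus_path]
    try:
        # List arguments and no shell, so file names are never shell-interpreted
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout_s)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
        Path(opus_path).unlink(missing_ok=True)  # drop any partial output
        return mp3_path
    return opus_path
```

The caller can tell which branch was taken from the returned file extension and pick the message type accordingly.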

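Format verification

The transcoded opus file can be verified as follows:

# Inspect the file format
ffprobe output.opus

# Confirm the codec (findstr is Windows-only; use grep on Linux/macOS)
ffprobe -show_streams output.opus | findstr codec

# Confirm Feishu compatibility
# The file extension must be .opus
# The MIME type should be audio/opus or audio/ogg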
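
For an automated check that does not depend on findstr or grep, ffprobe can emit JSON that is then parsed in code. The sketch below assumes ffprobe was run as ffprobe -v error -show_entries stream=codec_name,channels -of json output.opus and that its standard output is passed in as a string; the function name looks_like_voice_opus is hypothetical. The sample rate is deliberately not checked, since ffprobe may report a different rate for Opus streams than the one given to -ar.

```python
import json


def looks_like_voice_opus(ffprobe_json: str) -> bool:
    """True if the ffprobe JSON lists at least one mono Opus stream.

    Hypothetical helper; expects the output of ffprobe with -of json.
    """
    streams = json.loads(ffprobe_json).get("streams", [])
    return any(
        s.get("codec_name") == "opus" and int(s.get("channels", 0)) == 1
        for s in streams
    )
```

A check like this only confirms the codec and channel count; the file extension and MIME type still need to be verified separately.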

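Implementation advice

Transcode after the TTS provider produces its output and before the message is sent
Keep transcoding out of higher-level business logic; media format adaptation belongs inside the audio pipeline
Make sure failures degrade gracefully first, then aim for "always send a voice bubble"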

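Known limitations

ffmpeg must be installed and on the system PATH
Transcoding adds roughly 50-200 ms of latency (depending on audio duration)
Temporary files need to be cleaned up promptly to avoid filling the disk
After the TTS provider or message channel components are updated, the integration code may need to be reapplied
The Feishu API limits the size of opus files (voice messages a few seconds long are normally fine)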
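
Temporary-file cleanup can be handled with a small helper that runs once the message has been sent. This is a sketch under the assumption that the pipeline knows the paths it created; the function name cleanup_temp_files is hypothetical.

```python
import os
from typing import Iterable, List


def cleanup_temp_files(paths: Iterable[str]) -> List[str]:
    """Delete the given files and return the ones that were actually removed."""
    removed: List[str] = []
    for path in paths:
        try:
            os.remove(path)
            removed.append(path)
        except FileNotFoundError:
            # Already gone (for example a fallback run that never produced an .opus file)
            pass
    return removed
```

Calling it from a finally block around the send step keeps the temp directory clean even when sending fails.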

Usage Guidance

This skill appears to do what it says (mp3 -> opus conversion for Feishu), but before installing or enabling it consider the following:

- Declare required binaries: SKILL.md requires ffmpeg and ffprobe in PATH but the skill metadata lists none. Ensure ffmpeg/ffprobe are installed from a trusted source and update metadata so deployers know the dependency.
- Temp files & permissions: the integration creates temporary audio files. Ensure they are written to a safe temp directory with least privilege, cleaned up promptly, and that the agent cannot be coerced into keeping or exposing them.
- Shell/sanitization risk: ffmpeg/ffprobe are invoked via command lines in the doc. If filenames or parameters are built from user input, make sure to sanitize/escape them or use library bindings to avoid shell injection.
- Input validation & limits: enforce size/duration limits on incoming TTS audio to avoid resource exhaustion or large uploads to Feishu. Confirm Feishu file-size limits in your environment.
- Cross-platform notes: SKILL.md includes a Windows-specific findstr example for ffprobe output; adjust verification commands for Linux/macOS (grep) if your runtime is not Windows.
- Security posture: run ffmpeg in a confined environment (container or sandbox) when processing untrusted audio, because ffmpeg has had parsing vulnerabilities historically. Keep ffmpeg up to date.

If you accept these operational conditions (installing ffmpeg, handling temp files and sanitization), the skill is functionally coherent. If you cannot guarantee safe handling of temporary files or command-line invocations, treat the skill with caution.

Capability Analysis

Type: OpenClaw Skill
Name: feishu-voice-note-ffmpeg
Version: 1.0.0

The skill bundle consists of documentation (SKILL.md) explaining how to use ffmpeg to convert MP3 audio to the ogg-opus format required for Feishu (Lark) voice bubbles. It provides standard ffmpeg commands and integration logic for OpenClaw pipelines without any executable code, malicious instructions, or suspicious network/file activities.

Capability Assessment

ℹ Purpose & Capability

The name/description and instructions consistently describe converting TTS mp3/webm to Feishu-compatible ogg-opus so Feishu will show a voice bubble. That capability legitimately requires ffmpeg and audio file I/O, so the purpose and capability are coherent. However, the skill's metadata declares no required binaries while the docs explicitly require ffmpeg in PATH — a discrepancy that should be corrected.

✓ Instruction Scope

SKILL.md stays within scope: it describes generating a temp mp3, invoking ffmpeg to transcode to opus, returning the .opus path, and a downgrade path if transcode fails. It does not instruct reading unrelated files or unrelated environment variables. It does instruct use of command-line tools (ffmpeg, ffprobe, and a Windows findstr example) and temporary files, which implies file-system access and shell execution that must be handled safely.

ℹ Install Mechanism

This is instruction-only with no install spec (low install risk). But because the instructions require ffmpeg/ffprobe to be present on PATH, the skill should declare that binary requirement in metadata so deployers know to install it. No remote downloads or extracts are present.

✓ Credentials

No environment variables or credentials are requested or required. That is proportionate for the stated function. The only implicit requirement is filesystem access to create/read temporary files and an executable ffmpeg on PATH.

✓ Persistence & Privilege

The skill does not request persistent presence, does not modify other skills, and is not set to always:true. It is user-invocable and can be called autonomously by the agent (platform default) which is expected for a runtime integration skill.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install feishu-voice-note-ffmpeg
After installation, invoke the skill by name or use /feishu-voice-note-ffmpeg
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial publish: document Feishu voice-bubble format requirements and the ffmpeg conversion path from MP3 to Ogg/Opus.

Metadata

Slug feishu-voice-note-ffmpeg

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Feishu Voice Note via FFmpeg?

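Solves the Feishu IM voice-bubble problem: uses ffmpeg to convert the mp3 output of a TTS engine into the ogg-opus format that Feishu supports. Use cases: (1) TTS replies from a Feishu bot should appear as voice bubbles rather than file attachments, (2) Edge TTS or another TTS engine that only outputs mp3/webm needs to be adapted for Feishu, (3) custom TT... It is an AI Agent Skill for Claude Code / OpenClaw, with 69 downloads so far.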

How do I install Feishu Voice Note via FFmpeg?

Run "/install feishu-voice-note-ffmpeg" in the OpenClaw or Claude Code chat to install it in one step. The skill itself needs no further configuration, but ffmpeg must already be installed and on the system PATH.

Is Feishu Voice Note via FFmpeg free?

Yes, Feishu Voice Note via FFmpeg is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Feishu Voice Note via FFmpeg support?

Feishu Voice Note via FFmpeg is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Feishu Voice Note via FFmpeg?

It is built and maintained by 17329971 (@17329971); the current version is v1.0.0.
