pengzhuowen

Feishu Voice Loop

Author: ZoePeng · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 381
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install feishu-voice-loop
Description
Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player.
Usage (SKILL.md)

Feishu Voice Loop

Provide a reusable three-step voice loop for OpenClaw:

  1. accept text or voice input
  2. generate speech with OpenAI TTS
  3. return the audio to Feishu or a web player

When the input is voice, transcribe it to text first, then continue through the same output pipeline.

Quick start

Prerequisites:

  • OPENAI_API_KEY is set for TTS
  • Feishu app credentials exist in ~/.openclaw/openclaw.json under channels.feishu.appId/appSecret, or are passed explicitly
  • ffmpeg and ffprobe are installed and available
  • local audio transcription is configured in ~/.openclaw/openclaw.json under tools.media.audio.models
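
The prerequisites above can be checked before running either script. The following is a minimal sketch, not part of the skill: the helper names `lookup`, `preflight`, and `load_config` are hypothetical, and it only covers the config-file route for Feishu credentials (they may also be passed explicitly).

```python
import json
import os
import shutil
from pathlib import Path


def lookup(config: dict, dotted_key: str):
    """Walk a nested dict by a dotted key; return None if any part is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def load_config(path: Path = Path.home() / ".openclaw" / "openclaw.json") -> dict:
    """Load the OpenClaw config, or return an empty dict if the file is absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def preflight(config: dict, env=None, which=shutil.which) -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    for key in ("channels.feishu.appId", "channels.feishu.appSecret"):
        if not lookup(config, key):
            problems.append(f"missing {key} in openclaw.json")
    for binary in ("ffmpeg", "ffprobe"):
        if which(binary) is None:
            problems.append(f"{binary} not found on PATH")
    if not lookup(config, "tools.media.audio.models"):
        problems.append("missing tools.media.audio.models in openclaw.json")
    return problems
```

Calling `preflight(load_config())` returns an empty list when everything is in place, and otherwise one message per missing item.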

Main scripts:

  • scripts/openai_tts_feishu.py
  • scripts/transcribe_audio.py

Tasks

1. Transcribe voice input

Use this when you have a local .ogg, .opus, .wav, or similar file and want text.

python3 scripts/transcribe_audio.py /path/to/input.ogg

This script reuses the existing Whisper CLI configuration from ~/.openclaw/openclaw.json.
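
Because the transcription step depends on whatever is configured under tools.media.audio.models, it can help to look at that entry before running the script. The sketch below only reads and prints the first entry. It makes no assumption about the entry's fields, and the function names are hypothetical.

```python
import json
from pathlib import Path


def first_audio_model(config: dict):
    """Return tools.media.audio.models[0] from a parsed config, or None."""
    models = (
        config.get("tools", {})
        .get("media", {})
        .get("audio", {})
        .get("models")
    )
    if isinstance(models, list) and models:
        return models[0]
    return None


def describe_transcriber(config_path: Path) -> str:
    """Pretty-print the configured entry so it can be reviewed before use."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    entry = first_audio_model(config)
    if entry is None:
        return "no transcription model configured"
    return json.dumps(entry, indent=2, ensure_ascii=False)
```

For example, `print(describe_transcriber(Path.home() / ".openclaw" / "openclaw.json"))` shows the entry the script would reuse.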

2. Generate and send voice output

Use this when you already have text and want to send a Feishu voice message.

python3 scripts/openai_tts_feishu.py \
  --to <feishu_open_id> \
  --text "This is a voice test." \
  --voice alloy \
  --model gpt-4o-mini-tts

The script will:

  1. call OpenAI audio/speech
  2. save WAV audio temporarily
  3. convert to Feishu-friendly Opus via ffmpeg
  4. upload the file to Feishu
  5. send an audio message to the target open_id
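
The steps above might look roughly like the following. This is a sketch under stated assumptions, not the contents of scripts/openai_tts_feishu.py: the function names are hypothetical, the ffmpeg options (mono, 16 kHz) are one reasonable choice for Opus voice messages, and the message body follows my understanding of the Feishu Open Platform message API. The functions only build the request, command, and payload. Nothing is sent.

```python
import json
import urllib.request

OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key, text, voice="alloy",
                         model="gpt-4o-mini-tts", instructions=None):
    """Build (but do not send) the POST request for OpenAI text-to-speech."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": "wav",  # step 2 saves WAV before conversion
    }
    if instructions:
        payload["instructions"] = instructions
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def opus_convert_command(wav_path, opus_path, ffmpeg="ffmpeg"):
    """argv for WAV -> Opus conversion; pass the list to subprocess.run."""
    # Assumption: mono 16 kHz Opus; the real script may use other settings.
    return [ffmpeg, "-y", "-i", str(wav_path),
            "-c:a", "libopus", "-ar", "16000", "-ac", "1", str(opus_path)]


def feishu_audio_message(open_id, file_key):
    """JSON body for an audio message; file_key comes from the upload step."""
    return {
        "receive_id": open_id,
        "msg_type": "audio",
        # Feishu expects the content field as a JSON-encoded string.
        "content": json.dumps({"file_key": file_key}),
    }
```

Keeping request construction separate from sending makes each step easy to inspect and to test without network access.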

3. Run the full voice loop

Use this skill when the goal is a reusable voice interaction pipeline:

  1. transcribe input audio to text
  2. decide or generate the reply text
  3. synthesize reply audio with OpenAI TTS
  4. send the reply back to Feishu

Read references/input-output-workflow.md when building or explaining the end-to-end loop.
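
As an illustration of how the four loop steps compose, here is a small sketch in which each step is passed in as a callable. The function `run_voice_loop` and its parameters are hypothetical. In practice the callables would wrap the two scripts and whatever generates the reply text.

```python
from typing import Callable


def run_voice_loop(
    audio_path: str,
    transcribe: Callable[[str], str],
    reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    send: Callable[[bytes], None],
) -> str:
    """Run transcribe -> reply -> synthesize -> send; return the reply text."""
    transcript = transcribe(audio_path).strip()
    if not transcript:
        # Stop early rather than synthesizing a reply to silence.
        raise ValueError(f"no speech recognised in {audio_path}")
    reply_text = reply(transcript)
    audio = synthesize(reply_text)
    send(audio)
    return reply_text
```

Injecting the steps keeps the loop independent of any particular transcription tool or delivery channel, so a web player can be swapped in for Feishu by changing only `send`.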

Default output style

Default preset is stored in references/presets.md.

Unless the user asks otherwise, use:

  • model: gpt-4o-mini-tts
  • voice: alloy
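  • default style: a young, Japanese-style male voice; gentle with a slightly flirtatious edge; close and private, as if speaking by the listener's ear; natural, not a broadcaster's delivery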

When the user asks for a different flavor, either:

  • pass a custom --instructions
  • or adapt one of the presets in references/presets.md
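
One way to resolve the final TTS options is sketched below. It assumes presets are available as a plain name-to-instructions mapping, which is a hypothetical representation: the actual layout of references/presets.md is not shown here. The default style text would be supplied through that mapping too.

```python
# Defaults taken from the list above.
DEFAULTS = {"model": "gpt-4o-mini-tts", "voice": "alloy"}


def resolve_tts_options(presets, preset=None, instructions=None,
                        model=None, voice=None):
    """Merge defaults, an optional named preset, and explicit overrides."""
    options = dict(DEFAULTS)
    if model:
        options["model"] = model
    if voice:
        options["voice"] = voice
    if instructions:
        # Custom instructions take priority over any named preset.
        options["instructions"] = instructions
    elif preset is not None:
        if preset not in presets:
            raise KeyError(f"unknown preset: {preset}")
        options["instructions"] = presets[preset]
    return options
```

With this ordering, explicit instructions always override a preset, and an unknown preset name fails loudly instead of silently falling back.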

Handle failures

Common failure cases:

  • Missing OPENAI_API_KEY → ask for API key / env setup
  • HTTP 429 from OpenAI → rate limit hit, or a billing/quota issue
  • missing Feishu app credentials → configure channels.feishu.appId/appSecret
  • missing ffmpeg or ffprobe → install locally before retrying
  • missing transcription model config → configure tools.media.audio.models

When OpenAI billing is not enabled, say so directly instead of pretending the voice was generated.
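
The failure cases above could be turned into direct user-facing messages along these lines. This is a sketch only: `MissingPrerequisite` and `explain_failure` are hypothetical names, and the real scripts may report errors differently.

```python
import urllib.error


class MissingPrerequisite(Exception):
    """Hypothetical error for an absent key, credential, or config entry."""


def explain_failure(error: Exception) -> str:
    """Map a failure to a plain statement of what went wrong."""
    if isinstance(error, MissingPrerequisite):
        return f"Setup needed: {error}"
    if isinstance(error, urllib.error.HTTPError):
        if error.code == 429:
            return ("OpenAI returned HTTP 429: check billing, quota, or "
                    "rate limits. No audio was generated.")
        return f"HTTP {error.code}: {error.reason}. No audio was generated."
    if isinstance(error, FileNotFoundError):
        # Raised by subprocess when ffmpeg/ffprobe cannot be found.
        return f"Required program or file not found: {error.filename}"
    return f"Unexpected failure: {error}"
```

Each message states plainly that no audio was produced, in line with the guidance not to pretend the voice was generated.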

Packaging and sharing

Package with:

python3 /Users/zoepeng/.openclaw/lib/node_modules/openclaw/skills/skill-creator/scripts/package_skill.py \
  /Users/zoepeng/.openclaw/workspace/skills/openai-feishu-voice

The resulting .skill file can be shared or uploaded wherever the user distributes skills.

Resources

scripts/openai_tts_feishu.py

Use for deterministic TTS generation and Feishu delivery.

scripts/transcribe_audio.py

Use for deterministic local audio transcription via the configured Whisper CLI.

references/presets.md

Read when the user asks for a different voice direction or wants named presets.

references/input-output-workflow.md

Read when packaging or explaining the complete voice-in / voice-out solution.

Security recommendations
Before installing or running this skill:

  • Expect to provide OPENAI_API_KEY and Feishu app credentials; the code reads ~/.openclaw/openclaw.json if you don't pass --app-id/--app-secret. The package metadata did not declare these requirements. Double-check before trusting the skill.
  • Inspect your ~/.openclaw/openclaw.json: the transcription script will execute the command listed under tools.media.audio.models[0]. If that entry points to an unexpected binary or shell command, it could run arbitrary local code. Only use the skill if that config is safe.
  • The TTS script hardcodes /opt/homebrew/bin/ffmpeg (may fail on other OSes); consider editing the script to use a generic 'ffmpeg' on PATH or your correct ffmpeg location.
  • Running the skill will send audio/text data to api.openai.com and open.feishu.cn (OpenAI and Feishu). Don’t use it with sensitive data unless you accept that transmission.
  • The included voice presets explicitly instruct the model to simulate flirtatious/teenage voices (e.g., "teenage boy" and private-sibling scenarios). This is potentially abusive/unsafe. Remove or edit presets that are inappropriate before use.
  • If you decide to proceed: run the scripts in a controlled environment first, verify network destinations, and confirm the config entries and commands they will run. If you need a cleaner setup, add required env/config declarations to the skill metadata and replace the hardcoded ffmpeg path.
Functional analysis
Type: OpenClaw Skill
Name: feishu-voice-loop
Version: 1.0.0

The skill bundle provides a legitimate voice-processing pipeline for Feishu, integrating OpenAI TTS and local audio transcription. It handles credentials through the standard OpenClaw configuration file (~/.openclaw/openclaw.json) and uses subprocesses for media processing (ffmpeg/ffprobe) and transcription CLI tools. No evidence of data exfiltration, malicious instructions, or unauthorized remote execution was found; the code logic is consistent with the stated purpose of building a voice-based chat interface.
Capability assessment
Purpose & Capability
The code and SKILL.md align with the described purpose (transcribe local audio, call OpenAI TTS, and post audio to Feishu). However the skill package metadata claims no required env vars or config paths, while the instructions and code actually require OPENAI_API_KEY and Feishu credentials stored in ~/.openclaw/openclaw.json (or passed in). That mismatch is unexpected and reduces trust.
Instruction Scope
The runtime instructions and both scripts read the user's ~/.openclaw/openclaw.json and will execute a CLI command taken from that config for transcription. Executing commands sourced from a user-controlled config is functionally reasonable for a pluggable transcription CLI, but it grants the skill the ability to run arbitrary local commands (whatever is configured). The scripts also transmit data to external endpoints (api.openai.com and open.feishu.cn) — which is intended, but users must understand audio/transcripts and API keys will be sent externally. Additionally, the presets.md contains instructions to produce flirtatious/sexualized "teenage boy" voices, which raises safety and policy concerns and is outside ordinary benign assistant use.
Install Mechanism
There is no install spec (instruction-only), so nothing is written during installation — low install risk. One code issue: openai_tts_feishu.py invokes ffmpeg using a hardcoded path (/opt/homebrew/bin/ffmpeg) while calling ffprobe as 'ffprobe'. This may cause failures on non-macOS/Homebrew systems and is brittle; it may also indicate the author tested only a specific environment.
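
One way to avoid the hardcoded path is to look the binary up on PATH first and keep the Homebrew location only as a fallback. The sketch below shows the idea. `resolve_binary` is a hypothetical helper, not something the skill ships.

```python
import os
import shutil


def resolve_binary(name: str, fallbacks=()) -> str:
    """Prefer a PATH lookup; fall back to known absolute locations."""
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    raise FileNotFoundError(
        f"{name} not found on PATH or in {list(fallbacks)}"
    )
```

For example, `resolve_binary("ffmpeg", fallbacks=["/opt/homebrew/bin/ffmpeg"])` works on Homebrew systems as before, and on other systems wherever ffmpeg is on PATH.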
Credentials
The skill needs an OPENAI_API_KEY and Feishu appId/appSecret (via ~/.openclaw/openclaw.json or CLI args) and requires ffmpeg/ffprobe — all are proportionate to the stated functionality. However the registry metadata declared no required env vars or config paths, which is inconsistent and misleading. Also note: transcription runs whatever command is configured under tools.media.audio.models[0] — that config may itself contain shell commands or point to other tools, so validate that config before use.
Persistence & Privilege
The skill does not request persistent/always-on presence and does not modify other skills' configuration. It runs only when invoked; normal autonomous invocation is allowed by default but not set to always:true.
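How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install feishu-voice-loop
  3. Once installed, invoke the skill by name or trigger it with /feishu-voice-loop
  4. Provide the required inputs as described in the skill's parameter documentation to get structured output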
Version history
v1.0.0
Initial public release of Feishu Voice Loop. Supports text or voice input, local transcription, OpenAI TTS speech generation, and Feishu audio delivery.
Metadata
Slug feishu-voice-loop
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Versions 1
FAQ

What is Feishu Voice Loop?

Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 381 total downloads to date.

How do I install Feishu Voice Loop?

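Run the command "/install feishu-voice-loop" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but running the skill requires the prerequisites listed under Quick start: OPENAI_API_KEY, Feishu app credentials, ffmpeg and ffprobe, and a transcription model entry in ~/.openclaw/openclaw.json.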

Is Feishu Voice Loop free?

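Yes. Feishu Voice Loop itself is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used. The OpenAI TTS calls it makes require an OpenAI API key with billing enabled.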

Which platforms does Feishu Voice Loop support?

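Feishu Voice Loop is listed as cross-platform and is intended to run in any environment where OpenClaw / Claude Code is deployed. Note that the TTS script hardcodes /opt/homebrew/bin/ffmpeg, so it may fail on systems other than macOS with Homebrew until that path is adjusted.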

Who developed Feishu Voice Loop?

It is developed and maintained by ZoePeng (@pengzhuowen). The current version is v1.0.0.
