
Feishu Voice Loop

Name: Feishu Voice Loop
Author: pengzhuowen

By ZoePeng · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 381

Install in OpenClaw

/install feishu-voice-loop

Description

Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player.

Usage (SKILL.md)

Feishu Voice Loop

Provide a reusable three-step voice loop for OpenClaw:

accept text or voice input
generate speech with OpenAI TTS
return the audio to Feishu or a web player

When the input is voice, transcribe it to text first, then continue through the same output pipeline.

Quick start

Prerequisites:

OPENAI_API_KEY is set for TTS
Feishu app credentials exist in ~/.openclaw/openclaw.json under channels.feishu.appId/appSecret, or are passed explicitly
ffmpeg and ffprobe are installed and available
local audio transcription is configured in ~/.openclaw/openclaw.json under tools.media.audio.models
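
The prerequisites above can be checked before running either script. The following is a minimal sketch, not part of the skill: the helper names `dig` and `missing_prerequisites` are hypothetical, and it assumes the Feishu credentials live in the config file rather than being passed explicitly.

```python
import json
import os
import shutil
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"


def dig(data, dotted_key):
    """Follow a dotted key path through nested dicts; return None if any part is missing."""
    for part in dotted_key.split("."):
        if not isinstance(data, dict) or part not in data:
            return None
        data = data[part]
    return data


def missing_prerequisites(env, config, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    for key in ("channels.feishu.appId", "channels.feishu.appSecret"):
        if not dig(config, key):
            problems.append(f"{key} is missing from openclaw.json")
    if not dig(config, "tools.media.audio.models"):
        problems.append("tools.media.audio.models is missing or empty")
    for tool in ("ffmpeg", "ffprobe"):
        if which(tool) is None:
            problems.append(f"{tool} is not on PATH")
    return problems


if __name__ == "__main__":
    config = json.loads(CONFIG_PATH.read_text()) if CONFIG_PATH.exists() else {}
    for problem in missing_prerequisites(os.environ, config):
        print("missing:", problem)
```

Passing `env`, `config`, and `which` as arguments keeps the check testable without touching the real environment.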

Main scripts:

scripts/openai_tts_feishu.py
scripts/transcribe_audio.py

Tasks

1. Transcribe voice input

Use this when you have a local .ogg, .opus, .wav, or similar file and want text.

python3 scripts/transcribe_audio.py /path/to/input.ogg

This script reuses the existing Whisper CLI configuration from ~/.openclaw/openclaw.json.
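
Because the transcription command comes from the config file, it is worth reviewing before it runs. The sketch below is an illustration only. The layout of an entry under tools.media.audio.models is not shown in this document, so the `command` and `args` fields are an assumed, hypothetical shape, and the function names are invented for the example.

```python
import shlex
import shutil


def build_transcribe_argv(model_entry, audio_path):
    """Turn a model entry into an argv list suitable for subprocess.run.

    Assumed (hypothetical) entry shape: {"command": "whisper", "args": ["--model", "base"]}.
    The real openclaw.json schema may differ.
    """
    command = model_entry["command"]
    args = [str(a) for a in model_entry.get("args", [])]
    # An argv list (no shell=True) keeps the audio path from being interpreted by a shell.
    return [command, *args, str(audio_path)]


def describe(argv, which=shutil.which):
    """Render the command for human review and report where the binary resolves."""
    resolved = which(argv[0])
    return f"{shlex.join(argv)}  (resolves to: {resolved or 'NOT FOUND'})"
```

Printing `describe(argv)` before calling `subprocess.run(argv, check=True)` gives a chance to spot an unexpected binary in the config.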

2. Generate and send voice output

Use this when you already have text and want to send a Feishu voice message.

python3 scripts/openai_tts_feishu.py \
  --to <feishu_open_id> \
  --text "这条是语音测试。" \
  --voice alloy \
  --model gpt-4o-mini-tts

The script will:

call OpenAI audio/speech
save WAV audio temporarily
convert to Feishu-friendly Opus via ffmpeg
upload the file to Feishu
send an audio message to the target open_id
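
The five steps above map onto a handful of request bodies and command lines. The sketch below builds them without performing any network or subprocess call. It is based on the publicly documented OpenAI speech endpoint and the Feishu file upload and message endpoints, not on the script's source, so the actual script may differ. All function names are hypothetical.

```python
import json

# Publicly documented endpoints (assumed to be the ones the script targets).
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"
FEISHU_UPLOAD_URL = "https://open.feishu.cn/open-apis/im/v1/files"
FEISHU_MESSAGE_URL = "https://open.feishu.cn/open-apis/im/v1/messages?receive_id_type=open_id"


def speech_payload(text, voice="alloy", model="gpt-4o-mini-tts", instructions=None):
    """Step 1: JSON body for the OpenAI speech request, asking for WAV output."""
    payload = {"model": model, "voice": voice, "input": text, "response_format": "wav"}
    if instructions:
        payload["instructions"] = instructions
    return payload


def ffmpeg_opus_argv(wav_path, opus_path, ffmpeg="ffmpeg"):
    """Step 3: convert WAV to mono 16 kHz Opus, the form Feishu voice messages accept."""
    return [ffmpeg, "-y", "-i", wav_path, "-c:a", "libopus", "-ac", "1", "-ar", "16000", opus_path]


def ffprobe_duration_argv(path, ffprobe="ffprobe"):
    """Ask ffprobe for the duration in seconds, printed as a bare number."""
    return [ffprobe, "-v", "error", "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", path]


def upload_form(file_name, duration_ms):
    """Step 4: multipart form fields for the Feishu upload (the file part is added separately)."""
    return {"file_type": "opus", "file_name": file_name, "duration": str(duration_ms)}


def audio_message(open_id, file_key):
    """Step 5: message body; Feishu expects `content` as a JSON-encoded string."""
    return {"receive_id": open_id, "msg_type": "audio",
            "content": json.dumps({"file_key": file_key})}
```

Both Feishu calls also need an `Authorization: Bearer <tenant_access_token>` header, obtained from the app ID and secret.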

3. Run the full voice loop

Use this skill when the goal is a reusable voice interaction pipeline:

transcribe input audio to text
decide or generate the reply text
synthesize reply audio with OpenAI TTS
send the reply back to Feishu
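
One way to picture the loop is as four pluggable callables run in sequence. This is a sketch under that assumption, not the skill's implementation: `run_voice_loop`, `LoopResult`, and the parameter names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopResult:
    input_text: str
    reply_text: str
    audio_size: int


def run_voice_loop(
    *,
    transcribe: Callable[[str], str],
    reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    send: Callable[[bytes], None],
    audio_path: Optional[str] = None,
    text: Optional[str] = None,
) -> LoopResult:
    """Run transcribe -> reply -> synthesize -> send; text input skips transcription."""
    if text is None:
        if audio_path is None:
            raise ValueError("provide either text or audio_path")
        text = transcribe(audio_path)
    reply_text = reply(text)
    audio = synthesize(reply_text)
    send(audio)
    return LoopResult(input_text=text, reply_text=reply_text, audio_size=len(audio))
```

In practice `transcribe` and `send` would wrap the two scripts, while `reply` is whatever produces the answer text. Keeping them injectable lets the loop be exercised with stubs.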

Read references/input-output-workflow.md when building or explaining the end-to-end loop.

Default output style

Default preset is stored in references/presets.md.

Unless the user asks otherwise, use:

model: gpt-4o-mini-tts
voice: alloy
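default style: young Japanese-style male voice, gentle with a slightly flirtatious edge, close-to-the-ear private-chat feel, natural, not announcer-like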

When the user asks for a different flavor, either:

pass a custom --instructions
or adapt one of the presets in references/presets.md

Handle failures

Common failure cases:

Missing OPENAI_API_KEY → ask for API key / env setup
HTTP 429 from OpenAI → billing or quota issue
missing Feishu app credentials → configure channels.feishu.appId/appSecret
missing ffmpeg or ffprobe → install locally before retrying
missing transcription model config → configure tools.media.audio.models

When OpenAI billing is not enabled, say so directly instead of pretending the voice was generated.
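
The failure cases above can be turned into a single lookup that returns the message to show the user. This is an illustrative sketch. The function name and keyword arguments are hypothetical, and the wording of each hint is only a suggestion.

```python
from typing import Optional


def failure_hint(
    *,
    api_key: Optional[str],
    openai_status: Optional[int] = None,
    has_feishu_credentials: bool = True,
    has_ffmpeg_tools: bool = True,
    has_transcription_model: bool = True,
) -> Optional[str]:
    """Return the first applicable hint, or None when nothing looks wrong."""
    if not api_key:
        return "OPENAI_API_KEY is missing: ask for the API key or env setup."
    if openai_status == 429:
        # Report the real cause instead of claiming that audio was generated.
        return "OpenAI returned HTTP 429: billing or quota issue; no audio was generated."
    if not has_feishu_credentials:
        return "Feishu credentials are missing: configure channels.feishu.appId/appSecret."
    if not has_ffmpeg_tools:
        return "ffmpeg or ffprobe is missing: install it locally before retrying."
    if not has_transcription_model:
        return "No transcription model configured: set tools.media.audio.models."
    return None
```

For example, `failure_hint(api_key="sk-...", openai_status=429)` yields the quota message, which can be sent as plain text in place of the voice reply.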

Packaging and sharing

Package with:

python3 /Users/zoepeng/.openclaw/lib/node_modules/openclaw/skills/skill-creator/scripts/package_skill.py \
  /Users/zoepeng/.openclaw/workspace/skills/openai-feishu-voice

The resulting .skill file can be shared or uploaded wherever the user distributes skills.

Resources

scripts/openai_tts_feishu.py

Use for deterministic TTS generation and Feishu delivery.

scripts/transcribe_audio.py

Use for deterministic local audio transcription via the configured Whisper CLI.

references/presets.md

Read when the user asks for a different voice direction or wants named presets.

references/input-output-workflow.md

Read when packaging or explaining the complete voice-in / voice-out solution.

Security recommendations

Before installing or running this skill:

- Expect to provide OPENAI_API_KEY and Feishu app credentials; the code reads ~/.openclaw/openclaw.json if you don't pass --app-id/--app-secret. The package metadata did not declare these requirements, so double-check before trusting the skill.
- Inspect your ~/.openclaw/openclaw.json: the transcription script will execute the command listed under tools.media.audio.models[0]. If that entry points to an unexpected binary or shell command, it could run arbitrary local code. Only use the skill if that config is safe.
- The TTS script hardcodes /opt/homebrew/bin/ffmpeg (may fail on other OSes); consider editing the script to use a generic 'ffmpeg' on PATH or your correct ffmpeg location.
- Running the skill will send audio/text data to api.openai.com and open.feishu.cn (OpenAI and Feishu). Don’t use it with sensitive data unless you accept that transmission.
- The included voice presets explicitly instruct the model to simulate flirtatious/teenage voices (e.g., "teenage boy" and private-sibling scenarios). This is potentially abusive/unsafe. Remove or edit presets that are inappropriate before use.
- If you decide to proceed: run the scripts in a controlled environment first, verify network destinations, and confirm the config entries and commands they will run. If you need a cleaner setup, add required env/config declarations to the skill metadata and replace the hardcoded ffmpeg path.

Functional analysis

Type: OpenClaw Skill
Name: feishu-voice-loop
Version: 1.0.0

The skill bundle provides a legitimate voice-processing pipeline for Feishu, integrating OpenAI TTS and local audio transcription. It handles credentials through the standard OpenClaw configuration file (~/.openclaw/openclaw.json) and uses subprocesses for media processing (ffmpeg/ffprobe) and transcription CLI tools. No evidence of data exfiltration, malicious instructions, or unauthorized remote execution was found; the code logic is consistent with the stated purpose of building a voice-based chat interface.

Capability assessment

ℹ Purpose & Capability

The code and SKILL.md align with the described purpose (transcribe local audio, call OpenAI TTS, and post audio to Feishu). However, the skill package metadata claims no required env vars or config paths, while the instructions and code actually require OPENAI_API_KEY and Feishu credentials stored in ~/.openclaw/openclaw.json (or passed in). That mismatch is unexpected and reduces trust.

⚠ Instruction Scope

The runtime instructions and both scripts read the user's ~/.openclaw/openclaw.json and will execute a CLI command taken from that config for transcription. Executing commands sourced from a user-controlled config is functionally reasonable for a pluggable transcription CLI, but it grants the skill the ability to run arbitrary local commands (whatever is configured). The scripts also transmit data to external endpoints (api.openai.com and open.feishu.cn) — which is intended, but users must understand audio/transcripts and API keys will be sent externally. Additionally, the presets.md contains instructions to produce flirtatious/sexualized "teenage boy" voices, which raises safety and policy concerns and is outside ordinary benign assistant use.

ℹ Install Mechanism

There is no install spec (instruction-only), so nothing is written during installation — low install risk. One code issue: openai_tts_feishu.py invokes ffmpeg using a hardcoded path (/opt/homebrew/bin/ffmpeg) while calling ffprobe as 'ffprobe'. This may cause failures on non-macOS/Homebrew systems and is brittle; it may also indicate the author tested only a specific environment.
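
A portable alternative to the hardcoded path is to resolve both tools from PATH at startup. The sketch below assumes nothing about the script beyond what the note above states. `resolve_tool` and the `FFMPEG_PATH` / `FFPROBE_PATH` override variables are hypothetical.

```python
import os
import shutil


def resolve_tool(name, env=None, which=shutil.which):
    """Return an executable path for `name`, preferring an explicit override.

    The <NAME>_PATH environment variable (e.g. FFMPEG_PATH) is a hypothetical
    convention for this sketch, not something the skill defines.
    """
    env = os.environ if env is None else env
    override = env.get(f"{name.upper()}_PATH")
    if override:
        return override
    found = which(name)
    if found is None:
        raise FileNotFoundError(f"{name} not found on PATH; install it or set {name.upper()}_PATH")
    return found
```

Calling `resolve_tool("ffmpeg")` and `resolve_tool("ffprobe")` once at startup makes a missing binary fail early with a clear message, on macOS with Homebrew as well as on other systems.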

⚠ Credentials

The skill needs an OPENAI_API_KEY and Feishu appId/appSecret (via ~/.openclaw/openclaw.json or CLI args) and requires ffmpeg/ffprobe; all are proportionate to the stated functionality. However, the registry metadata declared no required env vars or config paths, which is inconsistent and misleading. Also note that transcription runs whatever command is configured under tools.media.audio.models[0]. That config may itself contain shell commands or point to other tools, so validate it before use.

✓ Persistence & Privilege

The skill does not request persistent/always-on presence and does not modify other skills' configuration. It runs only when invoked; normal autonomous invocation is allowed by default but not set to always:true.

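How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install feishu-voice-loop
Once installed, call the Skill by name or trigger it with /feishu-voice-loop
Provide the required inputs according to the Skill's parameter description to get structured output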

Version history

v1.0.0

Initial public release of Feishu Voice Loop. Supports text or voice input, local transcription, OpenAI TTS speech generation, and Feishu audio delivery.

Metadata

Slug feishu-voice-loop

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

FAQ