/install liber-speechapi
Liber SpeechAPI
Use this skill for three related tasks:
- handle Telegram/openclaw voice-message workflows end to end
- convert user-provided text to speech on demand
- convert user-provided audio to text on demand
Follow this workflow
- Read references/config.md to resolve configuration from .env and config.json.
- Read references/workflow.md for Telegram/openclaw voice-message handling.
- Read references/api.md when you need endpoint and payload details.
- Read references/parameters.md for detailed ASR/TTS parameter meanings and defaults.
- Use scripts/summarize_for_voice.py only when a reply must be shortened for voice playback.
- Use scripts/liber_speech_client.py for deterministic ASR/TTS calls instead of rewriting HTTP request logic.
Environment selection
Prefer a shared python-env skill if it is available in the current environment.
If python-env is not available, use the local Python environment for this skill.
When running local Python commands:
- use Python 3.11 if available
- allow Python 3.10 when 3.11 is unavailable
- install only the minimal dependencies required by the bundled scripts
- do not hardcode secrets; read them from .env
Configuration model
.env
Load core service settings from .env in priority order:
- environment variables (LIBER_API_BASE_URL and LIBER_API_KEY)
- the ~/.openclaw/.env file (for global configuration)
- the skill directory's .env file
- the current working directory's .env file
Environment variables take the highest priority, followed by the global config file ~/.openclaw/.env, then local skill directory, and finally the current working directory.
Required settings:
- LIBER_API_BASE_URL
- LIBER_API_KEY
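The priority order described above could be resolved roughly as follows. This is a minimal sketch, not the bundled client's actual logic: the helper names parse_env_file and resolve_settings are hypothetical, and the parser only understands simple KEY=VALUE lines.

```python
import os
from pathlib import Path

REQUIRED_KEYS = ("LIBER_API_BASE_URL", "LIBER_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks, comments, and missing files."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_settings(skill_dir, cwd=None, environ=None, home=None):
    """Take each required key from the highest-priority source that defines it."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    # Highest priority first: environment, global file, skill dir, working dir.
    sources = [
        dict(environ),
        parse_env_file(home / ".openclaw" / ".env"),
        parse_env_file(Path(skill_dir) / ".env"),
        parse_env_file(cwd / ".env"),
    ]
    settings = {}
    for key in REQUIRED_KEYS:
        for source in sources:
            if source.get(key):
                settings[key] = source[key]
                break
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        # Report only the names of missing keys, never any secret values.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return settings
```

Resolving each key independently means a key set only in a lower-priority file is still picked up when a higher-priority source defines the other one.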
config.json
Load detailed defaults from speechapi_config.json in ~/.openclaw/workspace/config/ to prevent overwrites during skill updates.
Fall back to the local config.json if the external config doesn't exist.
Key behavior:
- values of "default" or null are omitted from API requests
- Telegram-specific voice replies use global.telegram_tts_format
- direct text-to-speech uses tts.format as its default output format
- direct speech-to-text uses global.asr_output as its default output mode
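The key behavior above can be illustrated with a short sketch. It assumes the dotted names map onto nested JSON objects (for example a "global" object holding "telegram_tts_format"), which the references do not confirm here; the function names are hypothetical.

```python
def build_payload(params):
    """Drop entries whose value is None or the string "default"."""
    return {
        key: value
        for key, value in params.items()
        if value is not None and value != "default"
    }


def default_tts_format(config, for_telegram):
    """Pick the default TTS format (assumed nested config layout)."""
    if for_telegram:
        return config["global"]["telegram_tts_format"]
    return config["tts"]["format"]
```

With this sketch, a parameter set to "default" or null simply never reaches the request, so the service applies its own default for it.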
Direct text-to-speech
When the user explicitly asks to convert text to speech:
- use scripts/liber_speech_client.py tts
- default to wav unless the caller explicitly requests another format
- include audio_prompt only when clone audio is enabled and the file exists
- return the TTS result URL or saved output path to the caller
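As an illustration of these rules, the request options for a direct TTS call might be assembled like this. It is a sketch only: build_tts_request is a hypothetical helper, and the "text" and "format" field names are assumptions; only audio_prompt is taken from the text above.

```python
from pathlib import Path


def build_tts_request(text, fmt=None, clone_enabled=False, clone_audio_path=None):
    """Assemble TTS options: wav by default, audio_prompt only when usable."""
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    request = {"text": text.strip(), "format": fmt or "wav"}
    # Skip cloning silently when it is disabled or the file is missing.
    if clone_enabled and clone_audio_path and Path(clone_audio_path).is_file():
        request["audio_prompt"] = str(clone_audio_path)
    return request
```

A missing clone file leaves audio_prompt out rather than raising, so synthesis can proceed without cloning.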
Direct speech-to-text
When the user explicitly asks to convert audio to text:
- use scripts/liber_speech_client.py asr
- default to structured json output
- return plain text only when the caller explicitly wants transcript text only
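The choice between structured and plain output could be handled in a small post-processing step such as the one below. This is a hedged sketch: format_asr_result is a hypothetical helper, and it assumes the ASR result is a JSON object with a "text" field holding the recognized transcript.

```python
import json


def format_asr_result(result, transcript_only=False):
    """Return the full result as JSON by default, or just the transcript."""
    if transcript_only:
        return result.get("text", "")
    return json.dumps(result, ensure_ascii=False)
```

Passing ensure_ascii=False keeps non-Latin transcripts readable in the JSON output.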
Telegram/openclaw workflow
For incoming Telegram voice/audio:
- download or access the local audio file
- send it to ASR and extract the recognized text
- send the transcript to openclaw
- if the final reply is too long for voice, shorten it to within the configured summary limit
- synthesize the final spoken reply with Telegram-compatible ogg_opus
- return the resulting audio URL or saved output path to the caller
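The steps above can be sketched as a single function with each stage injected as a callable. All names here (handle_voice_message and the four callables) are hypothetical stand-ins for the bundled scripts and the openclaw call, and measuring the summary limit in characters is an assumption.

```python
def handle_voice_message(audio_path, transcribe, ask_openclaw, summarize,
                         synthesize, summary_limit):
    """Run ASR, get a reply, shorten it if needed, and synthesize ogg_opus."""
    transcript = transcribe(audio_path)["text"]
    reply = ask_openclaw(transcript)
    # Shorten only when the reply exceeds the configured limit.
    if len(reply) > summary_limit:
        spoken = summarize(reply, summary_limit)
    else:
        spoken = reply
    # Telegram voice replies always use ogg_opus.
    audio = synthesize(spoken, "ogg_opus")
    return {"transcript": transcript, "summary": spoken, "synthesis": audio}
```

Injecting the stages keeps the flow testable without network access, since each callable can be replaced by a stub.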
Telegram-specific guidance
For Telegram voice replies:
- force ogg_opus output
- keep spoken output concise and natural
- if the original answer is verbose, preserve intent and key facts but compress aggressively
- avoid reading markdown, code blocks, tables, or long lists verbatim
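One way to avoid reading markup aloud is to strip it before synthesis. The function below is a crude, regex-based sketch and not what scripts/summarize_for_voice.py necessarily does; plain_for_voice is a hypothetical name, and it drops fenced code and table rows entirely rather than paraphrasing them.

```python
import re

FENCE = "`" * 3  # a fenced code block delimiter


def plain_for_voice(text):
    """Remove common markdown constructs so the text reads naturally aloud."""
    # Drop fenced code blocks entirely.
    text = re.sub(re.escape(FENCE) + r".*?" + re.escape(FENCE), " ", text, flags=re.DOTALL)
    # Drop table rows (lines that start and end with a pipe).
    text = re.sub(r"^[ \t]*\|.*\|[ \t]*$", " ", text, flags=re.MULTILINE)
    # Unwrap inline code spans.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Remove heading markers and list markers at line starts.
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]*", "", text, flags=re.MULTILINE)
    text = re.sub(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)
    # Unwrap asterisk emphasis such as **bold** or *italic*.
    text = re.sub(r"\*{1,3}([^*\n]+)\*{1,3}", r"\1", text)
    # Collapse all whitespace, including newlines, into single spaces.
    return re.sub(r"\s+", " ", text).strip()
```

Because the emphasis rule is purely textual, it can also strip asterisks used as arithmetic operators, so a real implementation would likely want a proper markdown parser.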
Safety and robustness
- never print or log API keys
- validate input file existence before ASR
- validate text is non-empty before TTS
- use request timeouts
- handle HTTP failures with clear error messages
- if TTS clone audio is configured but missing, continue without cloning instead of failing
- if summarization fails, fall back to conservative truncation rather than blocking the reply
- default direct ASR output to JSON and default direct TTS output to WAV unless the caller requests otherwise
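The summarization fallback in particular can be sketched as follows. The names shorten_for_voice and truncate_conservatively are hypothetical, the limit is assumed to be a character count, and "conservative truncation" is interpreted here as cutting back to the last complete sentence or word.

```python
def truncate_conservatively(text, limit):
    """Cut to at most `limit` characters, preferring a sentence or word boundary."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Prefer the last sentence end inside the cut.
    end = max(cut.rfind(mark) for mark in (". ", "! ", "? "))
    if end > 0:
        return cut[: end + 1]
    # Otherwise fall back to the last word boundary.
    space = cut.rfind(" ")
    return (cut[:space] if space > 0 else cut).rstrip()


def shorten_for_voice(text, limit, summarize=None):
    """Try the summarizer first; never let its failure block the reply."""
    if len(text) <= limit:
        return text
    if summarize is not None:
        try:
            summary = summarize(text, limit)
            if summary and len(summary) <= limit:
                return summary
        except Exception:
            pass  # fall through to truncation
    return truncate_conservatively(text, limit)
```

A summary that is empty or still over the limit is treated the same as a failure, so the result is always within the limit.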
Expected outputs
Depending on the task, return one of:
- structured ASR JSON
- plain transcript text
- concise voice-ready text
- TTS result URL
- saved audio file path
- a structured JSON object containing transcript, summary, and synthesis result
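- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install liber-speechapi
- Once installation finishes, call the Skill by name or trigger it with /liber-speechapi
- Provide the required inputs according to the Skill's parameter descriptions to get structured output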
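What is liber-speechapi?
Handle Telegram voice messages with ASR, summarize replies, and provide TTS; also support direct text-to-speech and speech-to-text conversion with environmen... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 95 downloads to date.
How do I install liber-speechapi?
Run the command "/install liber-speechapi" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is liber-speechapi free?
Yes. liber-speechapi is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.
Which platforms does liber-speechapi support?
liber-speechapi is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed liber-speechapi?
It is developed and maintained by chang (@liberalchang); the current version is v1.1.0.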