Liber SpeechAPI
Use this skill for three related tasks:
- handle Telegram/openclaw voice-message workflows end to end
- convert user-provided text to speech on demand
- convert user-provided audio to text on demand
Follow this workflow
- Read references/config.md to resolve configuration from .env and config.json.
- Read references/workflow.md for Telegram/openclaw voice-message handling.
- Read references/api.md when you need endpoint and payload details.
- Read references/parameters.md for detailed ASR/TTS parameter meanings and defaults.
- Use scripts/summarize_for_voice.py only when a reply must be shortened for voice playback.
- Use scripts/liber_speech_client.py for deterministic ASR/TTS calls instead of rewriting HTTP request logic.
Environment selection
Prefer a shared python-env skill if it is available in the current environment.
If python-env is not available, use the local Python environment for this skill.
When running local Python commands:
- use Python 3.11 if available
- allow Python 3.10 when 3.11 is unavailable
- install only the minimal dependencies required by the bundled scripts
- do not hardcode secrets; read them from .env
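The interpreter preference above can be sketched as a small lookup. This is only an illustration: the executable names python3.11 and python3.10 and the helper name choose_python are assumptions, not part of the bundled scripts.

```python
import shutil

# Preferred interpreters, best first (executable names are an assumption).
PREFERRED = ("python3.11", "python3.10")


def choose_python(which=shutil.which):
    """Return the path of the first preferred interpreter found, or None."""
    for name in PREFERRED:
        path = which(name)
        if path:
            return path
    return None
```

Passing the lookup function as a parameter keeps the choice easy to check without depending on what is installed on the current machine.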
Configuration model
.env
Load core service settings from the following sources, in priority order:
- environment variables (LIBER_API_BASE_URL and LIBER_API_KEY)
- the ~/.openclaw/.env file (for global configuration)
- the skill directory's .env file
- the current working directory's .env file
Environment variables take the highest priority, followed by the global config file ~/.openclaw/.env, then the skill directory's .env file, and finally the current working directory's .env file.
Required settings:
- LIBER_API_BASE_URL
- LIBER_API_KEY
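The priority order and the required settings can be sketched roughly as follows. This is a minimal illustration, not the bundled implementation: the function names are hypothetical, and the parser assumes plain KEY=VALUE lines, which may be simpler than what the real scripts accept.

```python
import os
from pathlib import Path

REQUIRED = ("LIBER_API_BASE_URL", "LIBER_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks, comments, and missing files."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def merge_by_priority(*sources_high_to_low):
    """Merge dicts so that earlier (higher-priority) sources win."""
    merged = {}
    for source in reversed(sources_high_to_low):
        merged.update(source)
    return merged


def resolve_settings(skill_dir, cwd=None, home=None, environ=None):
    """Resolve the required settings using the documented priority order."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    merged = merge_by_priority(
        {k: environ[k] for k in REQUIRED if environ.get(k)},  # 1. environment
        parse_env_file(home / ".openclaw" / ".env"),          # 2. global file
        parse_env_file(Path(skill_dir) / ".env"),             # 3. skill directory
        parse_env_file(cwd / ".env"),                         # 4. working directory
    )
    missing = [k for k in REQUIRED if not merged.get(k)]
    if missing:
        # Report only the names of missing settings, never their values.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return {k: merged[k] for k in REQUIRED}
```

Each setting is resolved independently, so a key set in the environment can coexist with a base URL that comes from one of the .env files.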
config.json
Load detailed defaults from speechapi_config.json in ~/.openclaw/workspace/config/, so that skill updates do not overwrite them.
Fall back to the local config.json if the external config does not exist.
Key behavior:
- values of "default" or null are omitted from API requests
- Telegram-specific voice replies use global.telegram_tts_format
- direct text-to-speech uses tts.format as its default output format
- direct speech-to-text uses global.asr_output as its default output mode
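The config fallback and the omission rule might look like this in practice. This is a sketch under assumptions: load_config and build_payload are hypothetical names, and the parameter names in the usage note are invented for illustration.

```python
import json
from pathlib import Path

# External location named in this document; it survives skill updates.
EXTERNAL_CONFIG = (
    Path.home() / ".openclaw" / "workspace" / "config" / "speechapi_config.json"
)


def load_config(skill_dir, external=EXTERNAL_CONFIG):
    """Prefer the external config; fall back to the skill's local config.json."""
    for candidate in (Path(external), Path(skill_dir) / "config.json"):
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    return {}


def build_payload(params):
    """Drop "default" and None (JSON null) values so the service uses its own defaults."""
    return {
        key: value
        for key, value in params.items()
        if value is not None and value != "default"
    }
```

For example, build_payload({"format": "wav", "speed": "default", "voice": None}) keeps only the format entry; speed and voice are made-up parameter names here.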
Direct text-to-speech
When the user explicitly asks to convert text to speech:
- use scripts/liber_speech_client.py tts
- default to wav unless the caller explicitly requests another format
- include audio_prompt only when clone audio is enabled and the file exists
- return the TTS result URL or saved output path to the caller
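A rough sketch of how these rules could combine when assembling a TTS request. The config keys clone_enabled and clone_audio_path and the request field text are assumptions made for illustration; only tts.format and audio_prompt come from this document, and references/api.md remains the authority on the real payload.

```python
from pathlib import Path


def build_tts_request(text, config, requested_format=None):
    """Assemble TTS parameters: caller format, then tts.format, then wav."""
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    tts = config.get("tts", {})
    fmt = requested_format or tts.get("format")
    if fmt in (None, "default"):
        fmt = "wav"
    request = {"text": text.strip(), "format": fmt}
    # Hypothetical config keys: attach the clone prompt only if it is
    # enabled and the file is really there; otherwise continue without it.
    clone_path = tts.get("clone_audio_path")
    if tts.get("clone_enabled") and clone_path and Path(clone_path).is_file():
        request["audio_prompt"] = clone_path
    return request
```

A missing clone file simply leaves audio_prompt out, which matches the guidance to continue without cloning rather than fail.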
Direct speech-to-text
When the user explicitly asks to convert audio to text:
- use scripts/liber_speech_client.py asr
- default to structured json output
- return plain text only when the caller explicitly wants transcript text only
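The output-mode choice can be illustrated with a short helper. It assumes the ASR response is a JSON object with a text field and that the plain-text mode is called "text"; both are assumptions, as is the name format_asr_result.

```python
import json


def format_asr_result(response, output=None, config=None):
    """Return structured JSON by default, or only the transcript on request."""
    mode = output or (config or {}).get("global", {}).get("asr_output")
    if mode in (None, "default"):
        mode = "json"
    if mode == "text":
        # Assumed response shape: the transcript lives under "text".
        return response.get("text", "")
    return json.dumps(response, ensure_ascii=False)
```

An explicit caller choice wins over global.asr_output, and JSON is the result when neither is set.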
Telegram/openclaw workflow
For incoming Telegram voice/audio:
- download or access the local audio file
- send it to ASR and extract the recognized text
- send the transcript to openclaw
- if the final reply is too long for voice, shorten it to within the configured summary limit
- synthesize the final spoken reply with Telegram-compatible ogg_opus
- return the resulting audio URL or saved output path to the caller
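The steps above can be sketched as one function with the ASR, agent, summarizer, and TTS calls passed in as plain callables. All names are hypothetical, the default summary_limit of 300 characters is an assumption, and the ASR result is assumed to carry the transcript under text.

```python
def handle_voice_message(audio_path, asr, agent, summarize, tts, summary_limit=300):
    """Run ASR -> agent -> optional shortening -> TTS for one voice message.

    The callables are stand-ins for the real integrations:
      asr(path) -> dict with a "text" field (assumed shape)
      agent(transcript) -> reply text
      summarize(text, limit) -> shorter text
      tts(text, fmt) -> result URL or saved path
    File validation is left to the asr callable in this sketch.
    """
    transcript = asr(audio_path)["text"]
    reply = agent(transcript)
    spoken = reply
    if len(reply) > summary_limit:
        spoken = summarize(reply, summary_limit)
    # Telegram voice replies always use ogg_opus.
    audio = tts(spoken, "ogg_opus")
    return {"transcript": transcript, "summary": spoken, "synthesis": audio}
```

Injecting the callables keeps the control flow visible and lets each stage be replaced by a stub when checking the flow.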
Telegram-specific guidance
For Telegram voice replies:
- force ogg_opus output
- keep spoken output concise and natural
- if the original answer is verbose, preserve intent and key facts but compress aggressively
- avoid reading markdown, code blocks, tables, or long lists verbatim
Safety and robustness
- never print or log API keys
- validate input file existence before ASR
- validate text is non-empty before TTS
- use request timeouts
- handle HTTP failures with clear error messages
- if TTS clone audio is configured but missing, continue without cloning instead of failing
- if summarization fails, fall back to conservative truncation rather than blocking the reply
- default direct ASR output to JSON and default direct TTS output to WAV unless the caller requests otherwise
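The summarization fallback can be illustrated with a conservative truncation helper. This is a sketch, assuming a character-based limit and a summarizer callable with the signature summarize(text, limit); both function names are hypothetical.

```python
def truncate_conservatively(text, limit):
    """Cut text to at most `limit` characters, preferring sentence or word ends."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Prefer the last complete sentence inside the cut.
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    if end != -1:
        return cut[: end + 1]
    # Otherwise stop at the last word boundary, if there is one.
    space = cut.rfind(" ")
    return cut[:space] if space > 0 else cut


def summarize_or_truncate(text, limit, summarize):
    """Use the summarizer when it works; never let its failure block the reply."""
    try:
        summary = summarize(text, limit)
        if summary and len(summary) <= limit:
            return summary
    except Exception:
        pass  # fall through to truncation
    return truncate_conservatively(text, limit)
```

An empty or over-long summary is treated like a failure, so the caller always receives text within the limit.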
Expected outputs
Depending on the task, return one of:
- structured ASR JSON
- plain transcript text
- concise voice-ready text
- TTS result URL
- saved audio file path
- a structured JSON object containing transcript, summary, and synthesis result
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install liber-speechapi
- After installation, invoke the skill by name or use /liber-speechapi
- Provide required inputs per the skill's parameter spec and get structured output
What is liber-speechapi?
Handle Telegram voice messages with ASR, summarize replies, and provide TTS; also support direct text-to-speech and speech-to-text conversion with environmen... It is an AI Agent Skill for Claude Code / OpenClaw, with 95 downloads so far.
How do I install liber-speechapi?
Run "/install liber-speechapi" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is liber-speechapi free?
Yes, liber-speechapi is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does liber-speechapi support?
liber-speechapi is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created liber-speechapi?
It is built and maintained by chang (@liberalchang); the current version is v1.1.0.