liberalchang

liber-speechapi

Author: chang · v1.1.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 95 · Favorites: 0 · Current installs: 0 · Versions: 3
Install in OpenClaw
/install liber-speechapi
Description
Handle Telegram voice messages with ASR, summarize replies, and provide TTS; also support direct text-to-speech and speech-to-text conversion with environmen...
Usage (SKILL.md)

Liber SpeechAPI

Use this skill for three related tasks:

  • handle Telegram/openclaw voice-message workflows end to end
  • convert user-provided text to speech on demand
  • convert user-provided audio to text on demand

Follow this workflow

  1. Read references/config.md to resolve configuration from .env and config.json.
  2. Read references/workflow.md for Telegram/openclaw voice-message handling.
  3. Read references/api.md when you need endpoint and payload details.
  4. Read references/parameters.md for detailed ASR/TTS parameter meanings and defaults.
  5. Use scripts/summarize_for_voice.py only when a reply must be shortened for voice playback.
  6. Use scripts/liber_speech_client.py for deterministic ASR/TTS calls instead of rewriting HTTP request logic.

Environment selection

Prefer a shared python-env skill if it is available in the current environment.

If python-env is not available, use the local Python environment for this skill.

When running local Python commands:

  • use Python 3.11 if available
  • allow Python 3.10 when 3.11 is unavailable
  • install only the minimal dependencies required by the bundled scripts
  • do not hardcode secrets; read them from .env

Configuration model

.env

Load core service settings from .env in priority order:

  1. environment variables (LIBER_API_BASE_URL and LIBER_API_KEY)
  2. ~/.openclaw/.env file (for global configuration)
  3. the skill directory's .env file
  4. the current working directory's .env file

Environment variables take the highest priority, followed by the global config file ~/.openclaw/.env, then local skill directory, and finally the current working directory.

Required settings:

  • LIBER_API_BASE_URL
  • LIBER_API_KEY
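
The priority order above can be sketched as a small resolver. This is an illustration only, not the bundled client's code: the names parse_env_file and resolve_settings, their parameters, and the simple KEY=VALUE parser are assumptions. Only the two variable names and the four lookup locations come from the text above.

```python
import os
from pathlib import Path

REQUIRED = ("LIBER_API_BASE_URL", "LIBER_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; ignore blanks, comments, and missing files."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_settings(skill_dir, cwd=None, environ=None, home=None):
    """Return the required settings, honoring the documented priority order."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    # Highest priority first: process env, global file, skill dir, cwd.
    sources = [
        dict(environ),
        parse_env_file(home / ".openclaw" / ".env"),
        parse_env_file(Path(skill_dir) / ".env"),
        parse_env_file(cwd / ".env"),
    ]
    settings = {}
    for name in REQUIRED:
        for source in sources:
            if source.get(name):
                settings[name] = source[name]
                break
    missing = [name for name in REQUIRED if name not in settings]
    if missing:
        # Report only the missing names, never any values.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return settings
```

Each setting is resolved independently, so in this sketch the base URL may come from the global file while the key comes from a process environment variable.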

config.json

Load detailed defaults from speechapi_config.json in ~/.openclaw/workspace/config/ so that skill updates do not overwrite them. Fall back to the local config.json if the external config does not exist.

Key behavior:

  • values of "default" or null are omitted from API requests
  • Telegram-specific voice replies use global.telegram_tts_format
  • direct text-to-speech uses tts.format as its default output format
  • direct speech-to-text uses global.asr_output as its default output mode
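
A minimal sketch of the behavior listed above, assuming the dotted names map to nested "global" and "tts" sections in the JSON config. The helper names build_payload and pick_tts_format are hypothetical, and the sample keys used with them (such as speed) are placeholders rather than documented API parameters.

```python
def build_payload(defaults, overrides=None):
    """Merge config defaults with caller overrides, then drop unset values.

    A value of None (JSON null) or the string "default" means "let the
    server decide", so the key is left out of the request entirely.
    """
    merged = dict(defaults)
    merged.update(overrides or {})
    return {
        key: value
        for key, value in merged.items()
        if value is not None and value != "default"
    }


def pick_tts_format(config, for_telegram):
    """Choose the output format: Telegram replies and direct TTS use different keys."""
    if for_telegram:
        return config["global"]["telegram_tts_format"]
    return config["tts"]["format"]
```

For example, build_payload({"format": "wav", "speed": "default"}) yields {"format": "wav"}, so the placeholder speed value never reaches the API.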

Direct text-to-speech

When the user explicitly asks to convert text to speech:

  1. use scripts/liber_speech_client.py tts
  2. default to wav unless the caller explicitly requests another format
  3. include audio_prompt only when clone audio is enabled and the file exists
  4. return the TTS result URL or saved output path to the caller
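
The decision logic in steps 2 and 3 can be illustrated without touching the network. The function plan_tts_request and the field names text and format are assumptions made for this sketch; only audio_prompt and the wav default are taken from the steps above, and the real request shape is whatever references/api.md defines.

```python
from pathlib import Path


def plan_tts_request(text, fmt=None, clone_enabled=False, clone_path=None):
    """Build the fields for a direct TTS call, without sending anything."""
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    # Default to wav unless the caller asked for another format.
    request = {"text": text.strip(), "format": fmt or "wav"}
    # Include the clone reference only when cloning is on and the file exists;
    # a missing file silently disables cloning instead of failing the call.
    if clone_enabled and clone_path and Path(clone_path).is_file():
        request["audio_prompt"] = str(clone_path)
    return request
```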

Direct speech-to-text

When the user explicitly asks to convert audio to text:

  1. use scripts/liber_speech_client.py asr
  2. default to structured json output
  3. return plain text only when the caller explicitly wants transcript text only

Telegram/openclaw workflow

For incoming Telegram voice/audio:

  1. download or access the local audio file
  2. send it to ASR and extract the recognized text
  3. send the transcript to openclaw
  4. if the final reply is too long for voice, shorten it to within the configured summary limit
  5. synthesize the final spoken reply with Telegram-compatible ogg_opus
  6. return the resulting audio URL or saved output path to the caller
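
The six steps above can be sketched as one function that receives each stage as a callable. Everything here is hypothetical scaffolding: handle_voice_message and its parameter names are not part of the skill, the summary limit is assumed to be a character count, and the real ASR, openclaw, and TTS calls would be supplied by the caller.

```python
def handle_voice_message(audio_path, transcribe, ask_agent, summarize,
                         synthesize, summary_limit):
    """Run ASR -> agent -> optional shortening -> Telegram-compatible TTS."""
    # Steps 1-2: the caller has already located the file; get the transcript.
    transcript = transcribe(audio_path)
    # Step 3: hand the transcript to the agent and collect its reply.
    reply = ask_agent(transcript)
    # Step 4: shorten only when the reply exceeds the configured limit.
    spoken = reply
    if len(reply) > summary_limit:
        spoken = summarize(reply, summary_limit)
    # Step 5: Telegram voice replies always use ogg_opus.
    audio = synthesize(spoken, "ogg_opus")
    # Step 6: return everything the caller may need in one structure.
    return {"transcript": transcript, "summary": spoken, "synthesis": audio}
```

Passing the stages in as callables keeps the sketch testable with stubs and leaves the HTTP details to scripts/liber_speech_client.py.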

Telegram-specific guidance

For Telegram voice replies:

  • force ogg_opus output
  • keep spoken output concise and natural
  • if the original answer is verbose, preserve intent and key facts but compress aggressively
  • avoid reading markdown, code blocks, tables, or long lists verbatim
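
One way to avoid reading markup aloud is to strip it before synthesis. The function below is a rough, regex-based sketch and not the method used by scripts/summarize_for_voice.py. The name to_spoken_text is hypothetical, and the patterns cover only common Markdown constructs (fenced code, table rows, inline code, links, headings, bullets, asterisk emphasis).

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def to_spoken_text(markdown):
    """Reduce Markdown to plain text that is reasonable to read aloud."""
    # Drop fenced code blocks entirely.
    text = re.sub(re.escape(FENCE) + r".*?" + re.escape(FENCE), " ",
                  markdown, flags=re.DOTALL)
    # Drop table rows (lines that start and end with a pipe).
    text = re.sub(r"^[ \t]*\|.*\|[ \t]*$", " ", text, flags=re.MULTILINE)
    # Unwrap inline code and keep only the visible text of links.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Remove heading, bullet, and numbered-list markers at line starts.
    text = re.sub(r"^[ \t]{0,3}(#{1,6}|[-*+]|\d+\.)[ \t]+", "", text,
                  flags=re.MULTILINE)
    # Remove asterisk emphasis; underscores stay so names like ogg_opus survive.
    text = re.sub(r"\*+", "", text)
    return re.sub(r"\s+", " ", text).strip()
```

Dropping code blocks and tables outright, rather than paraphrasing them, is a deliberate simplification here; a summarizer could instead replace them with a short spoken note.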

Safety and robustness

  • never print or log API keys
  • validate input file existence before ASR
  • validate text is non-empty before TTS
  • use request timeouts
  • handle HTTP failures with clear error messages
  • if TTS clone audio is configured but missing, continue without cloning instead of failing
  • if summarization fails, fall back to conservative truncation rather than blocking the reply
  • default direct ASR output to JSON and default direct TTS output to WAV unless the caller requests otherwise
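
The summarization fallback in the list above might look like the following. This is a sketch under stated assumptions: truncate_for_voice and summarize_or_truncate are hypothetical names, the limit is treated as a character count, and "conservative truncation" is interpreted as cutting at the last sentence or word boundary that fits.

```python
def truncate_for_voice(text, limit):
    """Cut text to at most `limit` characters, preferring a clean boundary."""
    text = " ".join(text.split())  # normalize whitespace first
    if len(text) <= limit:
        return text
    window = text[:limit]
    # Try sentence endings first, then fall back to the last space.
    for marks in ((". ", "! ", "? "), (" ",)):
        cut = max(window.rfind(mark) for mark in marks)
        if cut > 0:
            return window[:cut + 1].rstrip()
    return window  # a single very long token: hard cut


def summarize_or_truncate(text, limit, summarize):
    """Use the summarizer when it works; never let its failure block the reply."""
    try:
        summary = summarize(text, limit)
    except Exception:
        summary = ""
    if not summary or len(summary) > limit:
        return truncate_for_voice(text, limit)
    return summary
```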

Expected outputs

Depending on the task, return one of:

  • structured ASR JSON
  • plain transcript text
  • concise voice-ready text
  • TTS result URL
  • saved audio file path
  • a structured JSON object containing transcript, summary, and synthesis result
Security recommendations
This skill behaves like a legitimate ASR/TTS client, but several red flags deserve attention before you install it:

  • The SKILL.md and scripts require LIBER_API_BASE_URL and LIBER_API_KEY, yet the registry metadata declares no required env vars. Treat the client as needing an API key even if the registry does not show it.
  • The code searches for and loads ~/.openclaw/.env and ~/.openclaw/workspace/config/speechapi_config.json. If you store other secrets in those files, the skill could read them. Review those files and remove unrelated secrets before use, or run the skill in an isolated account or profile.
  • The client disables proxy env vars (requests.Session().trust_env = False), which prevents use of the system http_proxy/https_proxy. If your environment relies on a proxy for auditing or egress controls, this behavior could bypass it. Consider running in a network-isolated environment or blocking outbound access except to a trusted base URL.
  • The packaged .env contains a test local API URL and key; do not assume those are real credentials. Provide your real LIBER_API_KEY only if you trust the service endpoint (LIBER_API_BASE_URL) and have inspected the code.

Recommended actions: inspect the full scripts yourself or run the skill in a disposable container or VM, restrict network egress to known endpoints, and do not place other secrets in ~/.openclaw files while testing. If you need to proceed, update the registry metadata to declare the required env vars (LIBER_API_BASE_URL and LIBER_API_KEY) so the requirement is explicit.
Functional analysis
Type: OpenClaw Skill
Name: liber-speechapi
Version: 1.1.0

The liber-speechapi skill provides a well-structured integration for Speech-to-Text (ASR) and Text-to-Speech (TTS) workflows using the Liber SpeechAPI. The bundle includes a Python client (liber_speech_client.py) for API interaction and a summarization script (summarize_for_voice.py) that shortens text for voice replies. The skill does access a global configuration file in the user's home directory (~/.openclaw/.env), but this is explicitly documented as a feature for sharing credentials across skills. The code follows safety best practices, such as instructing the AI agent not to log API keys and validating input formats, and shows no evidence of malicious intent or data exfiltration.
Capability tags
requires-oauth-token · requires-sensitive-credentials
Capability assessment
Purpose & Capability
The skill's declared registry metadata lists no required environment variables or primary credential, but SKILL.md and the included scripts clearly require LIBER_API_BASE_URL and LIBER_API_KEY. Requesting an API key and base URL is coherent with a remote ASR/TTS client, but the metadata omission is an incoherence that could mislead users about what secrets are needed.
Instruction Scope
Runtime instructions and the bundled client read multiple locations: environment variables, ~/.openclaw/.env, the skill directory .env, current working directory .env, and ~/.openclaw/workspace/config/speechapi_config.json. Reading global ~/.openclaw files and the user's current .env expands scope beyond a single-skill sandbox and can surface unrelated configuration; the instructions also ask the agent to prefer a shared python-env skill or fall back to local Python, which gives the skill flexibility to execute code in different environments.
Install Mechanism
There is no install spec (instruction-only install), but code files are included. requirements.txt is minimal (requests). Absence of an install step reduces supply-chain complexity, but the included Python scripts will be executed at runtime and perform network calls; this is a moderate risk compared with fully reviewed packaged installs.
Credentials
Requesting LIBER_API_BASE_URL and LIBER_API_KEY is proportionate to an ASR/TTS client, but the skill also reads global config locations (~/.openclaw/.env and ~/.openclaw/workspace/config/speechapi_config.json) that may contain other user settings. The client session sets requests.Session().trust_env = False (ignores http_proxy/https_proxy), which can bypass system proxy/monitoring and is notable for network egress/credential exfiltration risk.
Persistence & Privilege
The skill does not request always:true and does not modify other skills. It reads persisted config files in the user's home directory and writes results under the skill directory if downloading TTS. Reading/writing those user-scoped paths is reasonable for configuration and results, but the global config access increases the blast radius if credentials are present there.
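How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat: /install liber-speechapi
  3. Once installed, invoke the skill by name or trigger it with /liber-speechapi
  4. Provide the inputs described in the skill's parameter documentation to get structured output.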
Version history
v1.1.0
  • Updated configuration priority for .env: environment variables > ~/.openclaw/.env > local skill directory > current directory.
  • Now loads speechapi_config.json from ~/.openclaw/workspace/config/ as the primary config, with fallback to local config.json.
  • Removed config_template.json and added speechapi_config.json as the canonical config file.
  • Clarified documentation to resolve config precedence and external config locations.
v1.0.1
  • Replaced example configuration files: added .env_template and config_template.json, removed .env and config.json.
  • No changes to user-facing workflows or feature set.
  • Documentation remains unchanged except for updated sample configuration support.
v1.0.0
Initial release of the liber-speechapi skill, supporting three core speech workflows.
  • End-to-end handling of Telegram and openclaw voice messages with ASR, summarization, and Telegram-compatible TTS.
  • On-demand text-to-speech synthesis and speech-to-text transcription with configurable output formats.
  • Supports environment-based configuration, including environment variable and JSON config management.
  • Optional voice cloning using a reference audio file.
  • Robust error handling with fallback behavior and privacy safeguards.
  • Prefers a shared python-env skill if available; otherwise uses the local Python environment with minimal dependencies.
Metadata
Slug: liber-speechapi
Version: 1.1.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 3
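FAQ

What is liber-speechapi?

Handle Telegram voice messages with ASR, summarize replies, and provide TTS; also support direct text-to-speech and speech-to-text conversion with environmen... It is an AI agent skill plugin for Claude Code / OpenClaw, with 95 total downloads to date.

How do I install liber-speechapi?

Run the command "/install liber-speechapi" in an OpenClaw or Claude Code chat to install it in one step. Installation itself needs no extra configuration, but the skill requires LIBER_API_BASE_URL and LIBER_API_KEY to be set before it can call the service.

Is liber-speechapi free?

Yes. liber-speechapi is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does liber-speechapi support?

liber-speechapi is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops liber-speechapi?

It is developed and maintained by chang (@liberalchang); the current version is v1.1.0.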
