
Speech to Text (Yandex SpeechKit)

Name: Speech to Text (Yandex SpeechKit)
Author: bzsega

By Sergey Mikhaylov · v1.1.8

cross-platform ✓ Security check passed

Total downloads: 708

Install in OpenClaw

/install sergei-mikhailov-stt

Description

Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...

Security recommendations

This skill appears to be what it says: a local Python/FFmpeg-based STT plugin that calls Yandex SpeechKit and expects your Yandex API key and folder ID. Before installing, ensure you trust the skill source (the repo owner), run the included bash check.sh and setup.sh in a safe environment, and prefer injecting credentials into ~/.openclaw/openclaw.json (as documented) rather than leaving them in plaintext in the skill folder. If you want extra caution, inspect setup.sh before running and run the skill inside an isolated environment (dedicated user or container).

Functional analysis

Type: OpenClaw Skill
Name: sergei-mikhailov-stt
Version: 1.1.8

The skill provides legitimate Speech-to-Text functionality using Yandex SpeechKit and follows OpenClaw best practices. It includes well-structured scripts for setup (setup.sh), diagnostics (check.sh), and audio processing (audio_processor.py) using FFmpeg. Security instructions in SKILL.md are defensive, explicitly directing the AI agent to protect API keys and avoid unauthorized file modifications. No evidence of data exfiltration, malicious command execution, or prompt injection attacks was found; all network activity is directed to official Yandex Cloud endpoints.

Capability assessment

✓ Purpose & Capability

Name and description match the implementation: the code converts audio with ffmpeg and calls Yandex SpeechKit. Required binaries (ffmpeg, python3) and env vars (YANDEX_API_KEY, YANDEX_FOLDER_ID) are expected and justified for this purpose.
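As a rough illustration of the requirements named above, the following sketch checks that ffmpeg and python3 are on the PATH and that both environment variables are set. The function name `missing_prerequisites` and its parameters are hypothetical; this is not the skill's own check.sh logic, only one way such a check could look.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

REQUIRED_BINARIES = ("ffmpeg", "python3")
REQUIRED_ENV_VARS = ("YANDEX_API_KEY", "YANDEX_FOLDER_ID")


def missing_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return descriptions of whatever is missing (empty list if nothing is)."""
    problems = []
    for binary in REQUIRED_BINARIES:
        if which(binary) is None:
            problems.append(f"binary not found on PATH: {binary}")
    for name in REQUIRED_ENV_VARS:
        # Treat empty or whitespace-only values as unset.
        if not env.get(name, "").strip():
            problems.append(f"environment variable not set: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Passing `env` and `which` as parameters keeps the check testable without touching the real environment.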

✓ Instruction Scope

SKILL.md and scripts limit activity to validating/converting local audio files, loading config, and calling the Yandex API. The skill reads config from the skill folder and (for diagnostics) ~/.openclaw/openclaw.json to find injected env entries — this is appropriate for setup/validation and is documented. There are no instructions to read unrelated system files, shell history, or to transmit data to arbitrary endpoints beyond Yandex SpeechKit.
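The conversion step mentioned above could be sketched as follows. The listing does not say which target format the skill produces, so mono Opus in an OGG container is an assumption here, as are the function names; the ffmpeg build must also include libopus for this command to work.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build an ffmpeg invocation that re-encodes src as mono Opus audio (assumed target)."""
    return [
        "ffmpeg",
        "-y",               # overwrite dst if it already exists
        "-i", str(src),     # input file
        "-ac", "1",         # downmix to a single channel
        "-c:a", "libopus",  # encode audio with the Opus codec
        str(dst),
    ]


def convert(src: Path, dst: Path) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
```

Building the argument list separately from running it avoids shell quoting issues and makes the command easy to inspect or log.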

ℹ Install Mechanism

No registry install spec was provided (instruction-only), but the package includes setup.sh which creates a local Python venv and installs dependencies from requirements.txt. This is normal for Python skills; there are no opaque remote downloads or extracted archives from untrusted URLs in the manifest.

✓ Credentials

Only YANDEX_API_KEY and YANDEX_FOLDER_ID (plus optional STT_* vars) are required. These are the expected credentials for calling Yandex SpeechKit. The SKILL.md and code avoid exposing keys and instruct users how to store them in OpenClaw config or a .env.
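A minimal sketch of how the two credentials might be resolved without exposing them. The precedence shown (process environment first, then a .env file) and the names `parse_env_file`, `load_credentials`, and `redact` are assumptions for illustration, not the skill's documented behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_credentials(
    env: Mapping[str, str] = os.environ,
    env_file: Optional[Path] = None,
) -> dict[str, str]:
    """Resolve both credentials, preferring the process environment (assumed order)."""
    fallback = parse_env_file(env_file) if env_file and env_file.is_file() else {}
    creds = {}
    for name in ("YANDEX_API_KEY", "YANDEX_FOLDER_ID"):
        value = env.get(name) or fallback.get(name)
        if not value:
            raise KeyError(f"{name} is not configured")
        creds[name] = value
    return creds


def redact(secret: str) -> str:
    """Mask a secret for logging, keeping at most the last four characters."""
    if len(secret) <= 4:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]
```

Logging `redact(creds["YANDEX_API_KEY"])` instead of the raw value is one way to keep keys out of diagnostics output.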

✓ Persistence & Privilege

The skill does not request always:true or other elevated persistent privileges. It creates/uses a local venv and config files in its own skill directory and does not modify other skills or system-wide settings without explicit user action.

How to use

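Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install sergei-mikhailov-stt
Once installed, invoke the skill by name or trigger it with /sergei-mikhailov-stt
Provide the inputs described in the skill's parameter documentation to get structured output.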

Version history

v1.1.8

Update outdated README instructions (setup.sh, clawhub update); add Security section to SKILL.md

v1.1.7

Fix: eliminate cd && chaining to avoid approval prompts; add API connectivity check to diagnostics; fix temp_dir to use absolute path

v1.1.6

Fix: generate .env inline when env.example is missing from ClawHub package; add check.sh diagnostic script

v1.1.5

Fix: generate .env inline when env.example is missing from package

v1.1.4

Fix: generate .env inline when env.example is missing from package

v1.1.3

Add check.sh diagnostic script for setup verification

v1.1.2

- Updated README.md with clarified documentation and usage steps.
- Improved instructions for setting up environment and configuration.
- No code or functional changes; this is a documentation update only.

v1.1.1

- Added a setup.sh script for one-command installation and setup (creates virtual environment, installs dependencies, and copies config examples).
- Updated documentation for streamlined setup and first-run experience, including a new "Quick Start" section.
- Clarified file size limits and emphasized the Yandex SpeechKit 1 MB request limit.
- Minor corrections and formatting improvements in configuration and troubleshooting docs.

v1.1.0

- Added CLAUDE.md documentation file.
- Updated assets/config.example.json and scripts/stt_processor.py with unspecified changes.
- No user-facing feature changes detailed.

v1.0.4

- Broadened skill description from "Telegram voice messages" to "voice messages" for any messenger.
- Updated metadata structure for improved compatibility.
- Clarified that the skill works with all OpenClaw-connected messengers, not just Telegram.
- No changes to core functionality or requirements.

v1.0.3

- No code or configuration changes in this release.
- Documentation (SKILL.md) updated for clarity and troubleshooting.
- Usage instructions, configuration steps, and error handling information revised.

v1.0.2

- Added detailed installation instructions, including skill installation and Python virtual environment setup.
- Updated configuration guidance to prioritize using OpenClaw config (openclaw.json) for API keys.
- Expanded error handling section with user-friendly error messages and actionable next steps.
- Improved troubleshooting section for owners, specifying log checks and service account roles.
- Clarified usage of the `.env` file as an alternative configuration method.

v1.0.1

- Updated SKILL.md metadata section to YAML format for improved compatibility.
- No functional changes to skill logic or usage. Documentation only.

v1.0.0

sergei-mikhailov-stt version 1.0.0
- Initial release providing speech-to-text conversion for Telegram voice messages.
- Supports audio file validation and processing (OGG, WAV, MP3) using ffmpeg.
- Integrates Yandex SpeechKit as the default STT provider, with the option to add more providers.
- Handles file size, format verification, error reporting, and usage of environment-based API credentials.
- Provides structured results including recognized text, language, confidence, provider, and processing time information.
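
The release notes above mention format and size validation, the 1 MB Yandex SpeechKit request limit, and a structured result. A minimal sketch of what those pieces might look like follows. The class and field names, the extension-based format check, and the exact byte value of the limit are all assumptions; the skill's actual implementation may differ.

```python
from dataclasses import dataclass
from pathlib import Path

# 1 MB request limit cited in the v1.1.1 notes; the exact byte count is assumed.
MAX_REQUEST_BYTES = 1024 * 1024
SUPPORTED_SUFFIXES = {".ogg", ".wav", ".mp3"}


@dataclass
class SttResult:
    """Hypothetical container for the fields listed in the v1.0.0 notes."""
    text: str
    language: str
    confidence: float
    provider: str
    processing_time_s: float


def validate_audio(path: Path) -> None:
    """Raise ValueError if the file is missing, has an unsupported extension, or is too large."""
    if not path.is_file():
        raise ValueError(f"not a file: {path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported format: {path.suffix}")
    size = path.stat().st_size
    if size > MAX_REQUEST_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_REQUEST_BYTES}")
```

Checking the size before calling the provider lets the skill report an actionable error locally rather than waiting for the API to reject the request.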

Metadata

Slug sergei-mikhailov-stt

Version 1.1.8

License not specified

Total installs 0

Current installs 0

Version count 14

FAQ

What is Speech to Text (Yandex SpeechKit)?

Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes... It is an AI agent skill plugin for Claude Code / OpenClaw, with 708 total downloads to date.

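How do I install Speech to Text (Yandex SpeechKit)?

Run the command /install sergei-mikhailov-stt in the OpenClaw or Claude Code chat box to install it in one step. Before first use, the skill still needs YANDEX_API_KEY and YANDEX_FOLDER_ID to be configured.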

Is Speech to Text (Yandex SpeechKit) free?

Yes. Speech to Text (Yandex SpeechKit) is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Speech to Text (Yandex SpeechKit) support?

Speech to Text (Yandex SpeechKit) is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Speech to Text (Yandex SpeechKit)?

It is developed and maintained by Sergey Mikhaylov (@bzsega). The current version is v1.1.8.