okaris

Speech To Text

by Ömer Karışman · v0.1.5
cross-platform ✓ Security Clean
Downloads: 3704
Stars: 0
Active Installs: 35
Versions: 2
Install in OpenClaw
/install speech-to-text
Description
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
Usage Guidance
Install the inference.sh CLI only if you trust the provider, and prefer the documented manual checksum verification path when possible. Review `infsh` commands before running them, and do not submit confidential, regulated, or private recordings unless inference.sh is approved for that data under your account.
Capability Analysis
Type: OpenClaw Skill
Name: speech-to-text
Version: 0.1.5
The skill bundle is classified as suspicious primarily due to the installation method described in `SKILL.md`. It instructs the AI agent to execute `curl -fsSL https://cli.inference.sh | sh`, which downloads and runs a shell script directly from a remote server. This practice is a significant security risk: it creates a supply chain vulnerability and the potential for Remote Code Execution (RCE) if the `cli.inference.sh` domain or server were compromised. While the documentation attempts to explain the script's benign nature, the method itself is inherently insecure and could lead to arbitrary code execution on the host system.
Capability Assessment
Purpose & Capability
The name, description, examples, and model list consistently describe transcribing, translating, timestamping, and captioning audio through inference.sh Whisper apps.
Instruction Scope
Runtime tool scope is limited to `Bash(infsh *)`, which is broader than only transcription commands but still confined to the inference.sh CLI and matches the documented workflows.
Install Mechanism
The quick start recommends `curl -fsSL https://cli.inference.sh | sh && infsh login`; the artifact discloses this remote installer and mentions checksum/manual verification, but users still need to trust the installer source.
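The manual verification path comes down to saving the installer to a file, comparing its digest with a published value, and only then running it. The following is a minimal Python sketch of the comparison step only. It assumes the checksum is a SHA-256 hex digest, and the file name `install.sh` and the expected value are placeholders; this listing does not say where inference.sh publishes its checksums or which algorithm it uses.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string.

    The expected value is normalised (whitespace stripped, lowercased)
    because published checksums are often copied with a trailing newline.
    """
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)


if __name__ == "__main__":
    # Hypothetical inputs: a previously downloaded installer and the
    # digest copied from the provider's documentation.
    installer = Path("install.sh")
    published = "replace-with-published-sha256"
    if installer.exists() and matches_checksum(installer, published):
        print("Checksum matches; review the script before running it.")
    else:
        print("Checksum mismatch or file missing; do not run the installer.")
```

Saving the script to disk before checking it also gives you a chance to read it, which piping `curl` straight into `sh` does not.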
Credentials
Submitting audio URLs to inference.sh is expected for a cloud transcription skill, but recordings may contain sensitive meeting, interview, or voice-note content and the privacy warning could be clearer.
Persistence & Privilege
The skill requires `infsh login`, implying a local service session, but the artifact shows no elevated permissions, background processes, destructive actions, unrelated credential access, or hidden persistence.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install speech-to-text
  3. After installation, invoke the skill by name or use /speech-to-text
  4. Provide required inputs per the skill's parameter spec and get structured output
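The release notes describe the structured output as JSON with text, segments, and a detected language. As an illustration of consuming such output for captions, the sketch below converts segments into SRT subtitle blocks. The field names (`segments`, `start`, `end`, `text`, `language`) and the use of seconds for timestamps are assumptions made for this example; check the skill's actual output before relying on them.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(result: dict) -> str:
    """Build SRT text from a transcription result.

    Assumes result["segments"] is a list of dicts with hypothetical
    keys "start", "end" (seconds) and "text".
    """
    blocks = []
    for index, seg in enumerate(result.get("segments", []), start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hand-written sample in the assumed shape, not real skill output.
    sample = json.loads(
        '{"text": "Hello there.", "language": "en", "segments": '
        '[{"start": 0.0, "end": 1.2, "text": " Hello there."}]}'
    )
    print(segments_to_srt(sample))
```

Working in integer milliseconds avoids floating-point drift when splitting a timestamp into hours, minutes, and seconds.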
Version History
v0.1.5
- Updated documentation for clear setup instructions using inference.sh CLI.
- Detailed available Whisper model options, usage examples, and input formats.
- Added new sections on extracting audio from video, translation, and video subtitle workflows.
- Enhanced guidance for supported languages and output structure.
- Improved 'Related Skills' for easy access to complementary AI tools.
v0.1.0
- Initial release of speech-to-text skill.
- Transcribe audio to text using Whisper models via inference.sh CLI.
- Supports transcription, translation, multi-language, and timestamps.
- Includes Fast Whisper Large V3 and Whisper V3 Large model options.
- Provides example workflows for meetings, podcasts, subtitles, and more.
- Output is returned as structured JSON with text, segments, and detected language.
Metadata
Slug speech-to-text
Version 0.1.5
License
All-time Installs 35
Active Installs 35
Total Versions 2
Frequently Asked Questions

What is Speech To Text?

Speech To Text is an AI agent skill for Claude Code / OpenClaw that transcribes audio to text with Whisper models via the inference.sh CLI. It offers two models, Fast Whisper Large V3 and Whisper V3 Large, and its capabilities include transcription, translation, and timestamps. It has 3704 downloads so far.

How do I install Speech To Text?

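Run "/install speech-to-text" in the OpenClaw or Claude Code chat to install the skill in one step. The skill also depends on the inference.sh CLI (`infsh`) and an `infsh login` session, as noted under Install Mechanism.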

Is Speech To Text free?

Yes, Speech To Text is free and open-source. You can download, install, and use it at no cost.

Which platforms does Speech To Text support?

Speech To Text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Speech To Text?

It is built and maintained by Ömer Karışman (@okaris); the current version is v0.1.5.
