
Speech To Text

Name: Speech To Text
Author: okaris

by Ömer Karışman · v0.1.5

cross-platform ✓ Security Clean

Downloads: 3704

Install in OpenClaw

/install speech-to-text

Description

Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...

Usage Guidance

Install the inference.sh CLI only if you trust the provider, and prefer the documented manual checksum verification path when possible. Review `infsh` commands before running them, and do not submit confidential, regulated, or private recordings unless inference.sh is approved for that data under your account.

Capability Analysis

Type: OpenClaw Skill
Name: speech-to-text
Version: 0.1.5

The skill bundle is classified as suspicious primarily due to the installation method described in `SKILL.md`. It instructs the AI agent to execute `curl -fsSL https://cli.inference.sh | sh`, which downloads and runs a shell script directly from a remote server. This practice is a significant security risk, creating a supply chain vulnerability and potential for Remote Code Execution (RCE) if the `cli.inference.sh` domain or server were compromised. While the documentation attempts to explain the script's benign nature, the method itself is inherently insecure and could lead to arbitrary code execution on the host system.

Capability Assessment

✓ Purpose & Capability

The name, description, examples, and model list consistently describe transcribing, translating, timestamping, and captioning audio through inference.sh Whisper apps.

ℹ Instruction Scope

Runtime tool scope is limited to `Bash(infsh *)`, which is broader than only transcription commands but still confined to the inference.sh CLI and matches the documented workflows.

ℹ Install Mechanism

The quick start recommends `curl -fsSL https://cli.inference.sh | sh && infsh login`; the artifact discloses this remote installer and mentions checksum/manual verification, but users still need to trust the installer source.
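As a hedged illustration of the manual verification path mentioned above, the sketch below checks an installer that has already been saved to disk against a SHA-256 digest before anything is executed. The file name `install.sh` and the expected digest are hypothetical placeholders; this listing does not state the real checksum or where inference.sh publishes it.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installer_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published digest.

    `expected_hex` is an assumption here: it stands for whatever checksum
    the provider documents, copied from a source you trust.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    installer = Path("install.sh")  # hypothetical: saved with curl -o first
    expected = "<published sha256 digest>"  # hypothetical placeholder
    if installer.exists() and installer_matches(installer, expected):
        print("Checksum matches; review the script, then run it.")
    else:
        print("Missing file or checksum mismatch; do not run the installer.")
```

Saving the script first and checking it separates the download from the execution, which is the step that `curl ... | sh` skips. A matching digest only shows the file is the one the provider published; it does not replace reading the script.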

ℹ Credentials

Submitting audio URLs to inference.sh is expected for a cloud transcription skill, but recordings may contain sensitive meeting, interview, or voice-note content and the privacy warning could be clearer.

ℹ Persistence & Privilege

The skill requires `infsh login`, implying a local service session, but the artifact shows no elevated permissions, background processes, destructive actions, unrelated credential access, or hidden persistence.

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install speech-to-text
3. After installation, invoke the skill by name or use /speech-to-text
4. Provide the required inputs per the skill's parameter spec and get structured output.
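The version notes describe the structured output as JSON with text, segments, and detected language. As a rough sketch of the captioning workflow, the code below turns such a result into SRT subtitle text. The field names (`segments`, `start`, `end`, `text`, `language`) and the sample values are assumptions for illustration; the skill's actual schema is not shown in this listing.

```python
import json


def to_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(result: dict) -> str:
    """Build SRT caption text from an assumed {"segments": [...]} result."""
    blocks = []
    for index, seg in enumerate(result.get("segments", []), start=1):
        start = to_srt_time(seg["start"])
        end = to_srt_time(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive caption blocks.
    return "\n".join(blocks)


# Hypothetical output, shaped after the description in the version notes.
sample = json.loads("""
{
  "text": "Hello there. Second line.",
  "language": "en",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 4.0, "text": "Second line."}
  ]
}
""")

if __name__ == "__main__":
    print(segments_to_srt(sample))
```

If the real output uses different keys or time units, only the lookups inside `segments_to_srt` should need to change.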

Version History

v0.1.5

- Updated documentation for clear setup instructions using inference.sh CLI.
- Detailed available Whisper model options, usage examples, and input formats.
- Added new sections on extracting audio from video, translation, and video subtitle workflows.
- Enhanced guidance for supported languages and output structure.
- Improved 'Related Skills' for easy access to complementary AI tools.

v0.1.0

- Initial release of speech-to-text skill.
- Transcribe audio to text using Whisper models via inference.sh CLI.
- Supports transcription, translation, multi-language, and timestamps.
- Includes Fast Whisper Large V3 and Whisper V3 Large model options.
- Provides example workflows for meetings, podcasts, subtitles, and more.
- Output is returned as structured JSON with text, segments, and detected language.

Metadata

Slug speech-to-text

Version 0.1.5

License —

All-time Installs 35

Active Installs 35

Total Versions 2

Frequently Asked Questions

What is Speech To Text?

Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,... It is an AI Agent Skill for Claude Code / OpenClaw, with 3704 downloads so far.

How do I install Speech To Text?

Run "/install speech-to-text" in the OpenClaw or Claude Code chat to install the skill in one step. The skill itself then expects the inference.sh CLI to be installed and authenticated with `infsh login`.

Is Speech To Text free?

Yes, Speech To Text is free to download, install, and use. The listing describes it as open-source, although the License field in the metadata is not specified.

Which platforms does Speech To Text support?

Speech To Text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Speech To Text?

It is built and maintained by Ömer Karışman (@okaris); the current version is v0.1.5.
