
Speech to text

Name: Speech to text
Author: bohnwuks

by Ian Santos · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 466

Install in OpenClaw

/install stt

Description

Transcribes Brazilian Portuguese audio files to text, supporting multiple formats and including timestamps.

Usage Guidance

This skill appears to do exactly what it says: local transcription using OpenAI Whisper. Before installing, consider: (1) Whisper downloads model files on first run, which requires network access and can use significant disk space; (2) you must install FFmpeg separately and ensure the 'inbound' folder path matches the path the script computes (the mkdir path in SKILL.md may not match your environment); (3) the script reads and moves any audio files placed in that folder and writes JSON transcriptions to an output directory, so do not place sensitive audio there unless you are comfortable with it being processed and stored locally; (4) review requirements.txt and install dependencies in an isolated environment (virtualenv or container) to limit system-wide impact. I reviewed the provided files and saw no code that exfiltrates data or requests unrelated credentials; for extra assurance, run the script in a sandboxed environment and inspect the full (non-truncated) stt_processor.py before production use.

Capability Analysis

Type: OpenClaw Skill
Name: stt
Version: 1.0.0

The skill provides a legitimate speech-to-text service using the OpenAI Whisper library. The core logic in `stt_processor.py` implements standard file-processing patterns, including monitoring an inbound directory, transcribing audio files, and moving processed files to success/failure folders. No evidence of data exfiltration, malicious execution, or prompt injection was found; the code and instructions in `SKILL.md` are consistent with the stated purpose.
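The file-processing pattern described above can be sketched roughly as follows. This is an illustration only, not the contents of `stt_processor.py`: the function name `process_inbound`, its parameters, and the injected `transcribe` callable are hypothetical, and the extension list is taken from the formats named in the version notes.

```python
import json
import shutil
from pathlib import Path
from typing import Callable, List

# Formats listed in the skill's release notes.
AUDIO_EXTENSIONS = {".ogg", ".wav", ".mp3", ".m4a", ".flac", ".aac", ".opus"}


def process_inbound(inbound: Path, output: Path, processed: Path, failed: Path,
                    transcribe: Callable[[Path], dict]) -> List[str]:
    """Transcribe each audio file once, then move it out of the inbound folder.

    Hypothetical helper: `transcribe` stands in for whatever the real
    script calls to run Whisper on a single file.
    """
    for folder in (output, processed, failed):
        folder.mkdir(parents=True, exist_ok=True)

    handled = []
    # sorted() materializes the listing first, so moving files is safe.
    for audio in sorted(inbound.iterdir()):
        if not audio.is_file() or audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # ignore sub-folders and non-audio files
        try:
            result = transcribe(audio)
            out_file = output / f"{audio.stem}.json"
            out_file.write_text(
                json.dumps(result, ensure_ascii=False, indent=2),
                encoding="utf-8",
            )
            shutil.move(str(audio), str(processed / audio.name))
        except Exception:
            # Any failure parks the file so it is not retried forever.
            shutil.move(str(audio), str(failed / audio.name))
        handled.append(audio.name)
    return handled
```

Passing the transcriber in as a callable keeps the folder logic testable without loading a Whisper model; a monitoring loop would simply call this function on an interval.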

Capability Assessment

✓ Purpose & Capability

Name and description match the included code and SKILL.md: the package implements a Whisper-based transcriber, supports the listed audio formats and timestamps, and saves/moves files as expected. No unrelated credentials, binaries, or config paths are requested.

ℹ Instruction Scope

SKILL.md instructions are narrowly scoped to installing Python deps, FFmpeg, creating an inbound folder, and running the script. The script operates on a local media/inbound folder and writes transcriptions to an output directory. Two minor issues to be aware of: (1) SKILL.md asks you to create ../../../media/inbound, which is relative to your current working directory, whereas the script computes its media path relative to its own location (workspace_dir = Path(__file__).parent.parent.parent then .parent / 'media'); verify the exact folder path used in your deployment so that files are not missed; (2) Whisper will download model weights on first run (network and significant disk usage), which is expected but notable.
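To see where that path arithmetic lands, a small sketch can replay it against any candidate script location. It assumes the `.parent / 'media'` step is applied to `workspace_dir` and that the inbound folder sits directly under `media`; the helper name `resolve_inbound_dir` and the example install path are hypothetical.

```python
from pathlib import Path


def resolve_inbound_dir(script_file: str) -> Path:
    """Replay the path computation quoted in the review (assumed reading)."""
    # Two levels above the script's own directory.
    workspace_dir = Path(script_file).parent.parent.parent
    # One more level up, then into 'media'.
    media_dir = workspace_dir.parent / "media"
    return media_dir / "inbound"


if __name__ == "__main__":
    # Hypothetical install location, for illustration only.
    example = "/home/user/openclaw/workspace/skills/stt/stt_processor.py"
    print(resolve_inbound_dir(example))
```

Under this reading the inbound folder ends up three levels above the script's directory, so the `mkdir` command from SKILL.md matches only when it is run from the directory that contains the script.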

✓ Install Mechanism

There is no installer in the registry spec; installation is via pip install -r requirements.txt and a separate FFmpeg install. Dependencies come from PyPI and standard package managers — no suspicious external URLs, archive downloads, or extract-on-disk steps are present in the manifest.
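Because FFmpeg and the Python dependencies are installed separately, a quick preflight check before the first run may save a failed transcription. The sketch below assumes the Whisper dependency is importable as `whisper` (the module name used by the openai-whisper package); the `preflight` function is hypothetical and not part of the skill.

```python
import importlib.util
import shutil


def preflight() -> dict:
    """Report whether the two prerequisites named in the review are visible."""
    return {
        # FFmpeg must be on PATH for Whisper to decode audio.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when the module is not installed.
        "whisper": importlib.util.find_spec("whisper") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Run it inside the same virtualenv or container used for `pip install -r requirements.txt`, since the check only reflects the interpreter that executes it.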

✓ Credentials

The skill declares no environment variables or credentials and the code does not read secret env vars. It only reads/writes local filesystem paths (inbound, output, processed/failed). There are no requests for unrelated credentials.

✓ Persistence & Privilege

The skill does not request always:true and is user-invocable only. It does not modify other skills or system-wide agent settings. Its runtime behavior (processing local files, saving results) is consistent with its purpose.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install stt
After installation, invoke the skill by name or use /stt
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Speech-to-Text (STT) Skill – Initial Release
- Transcribes audio files to text using OpenAI Whisper, optimized for Brazilian Portuguese.
- Supports audio formats: OGG, WAV, MP3, M4A, FLAC, AAC, OPUS.
- Offers transcription with timestamps.
- Provides tools for transcribing individual files, batch processing, and folder monitoring.
- Includes setup instructions and usage examples.
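As a rough illustration of transcription with timestamps, the sketch below formats the `segments` list that Whisper's `transcribe` call returns (each segment carries `start`, `end`, and `text`). The helper names, the `base` model size, and the `language="pt"` setting are assumptions for the example, not details confirmed from the skill's code.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments) -> list:
    """Turn Whisper-style segment dicts into '[start -> end] text' lines."""
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_with_timestamps(audio_path: str, model_name: str = "base") -> list:
    """Hypothetical wrapper; model size and language are assumptions."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path, language="pt")
    return segments_to_lines(result["segments"])
```

Keeping the formatting helpers separate from the Whisper call means they can be checked on hand-written segments without downloading any model weights.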

Metadata

Slug stt

Version 1.0.0

License MIT-0

All-time Installs 5

Active Installs 5

Total Versions 1

Frequently Asked Questions

What is Speech to text?

Speech to text transcribes Brazilian Portuguese audio files to text, supporting multiple formats and including timestamps. It is an AI Agent Skill for Claude Code / OpenClaw, with 466 downloads so far.

How do I install Speech to text?

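Run "/install stt" in the OpenClaw or Claude Code chat to install the skill. Note that, per the review above, FFmpeg and the Python dependencies in requirements.txt must still be installed separately before the script can transcribe audio.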

Is Speech to text free?

Yes, Speech to text is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Speech to text support?

Speech to text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Speech to text?

It is built and maintained by Ian Santos (@bohnwuks); the current version is v1.0.0.
