Description

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

README (SKILL.md)

Speech is Cheap (SIC) Skill

Name: Speech is Cheap Transcribe
Author: ilyakam

Fast, accurate, and incredibly inexpensive automatic speech-to-text transcription service.

🚀 Why use this skill?

Disruptive Pricing: $0.06 - $0.12 per hour (2-15x cheaper than Deepgram or OpenAI).
Extreme Speed: 100 minutes of audio transcribes in ~1 minute.
Multilingual: Supports 100 languages with auto-detection.
Agent-Ready: Designed for high-volume, automated pipelines.
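
As a rough illustration of the pricing claim, the sketch below multiplies a number of audio hours by the two per-hour figures quoted above. The function name `estimate_cost` is hypothetical, and the README does not say which rate applies to which plan or option, so the result is only a range.

```shell
#!/usr/bin/env bash
# estimate_cost HOURS
# Prints "LOW HIGH" in USD for HOURS of audio, using the $0.06 and $0.12
# per-hour figures quoted in this README. Hypothetical helper, not part of
# the skill; actual billing rules are not described here.
estimate_cost() {
  local hours="$1"
  # LC_ALL=C keeps "." as the decimal separator regardless of locale.
  LC_ALL=C awk -v h="$hours" 'BEGIN { printf "%.2f %.2f\n", h * 0.06, h * 0.12 }'
}

# Example: estimate_cost 100   # prints "6.00 12.00"
```

For 100 hours of audio, the quoted rates work out to between $6.00 and $12.00.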

🛠 Setup

1. Get an API Key

Sign up at speechischeap.com. Use code CH5 for $5 off.

2. Configure Authentication

This skill looks for your API key in the SIC_API_KEY environment variable.

Add this to your .env or agent config:

SIC_API_KEY=your_key_here
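
If you keep the key in a `.env` file, a wrapper can load it and fail early when the variable is still empty. This is a minimal sketch, not part of the skill: `load_env_file` and `require_sic_key` are hypothetical helper names, and the sketch assumes the `.env` file contains only plain `KEY=value` lines, because sourcing it executes the file as shell code.

```shell
#!/usr/bin/env bash
# load_env_file [PATH]
# Sources a KEY=value file (default ./.env) and exports every variable it
# sets. ASSUMPTION: the file is trusted and holds only simple assignments.
load_env_file() {
  local file="${1:-./.env}"
  case "$file" in
    */*) ;;                 # already has a directory component
    *) file="./$file" ;;    # avoid a PATH lookup by the "." builtin
  esac
  [ -f "$file" ] || return 0
  set -a                    # export everything assigned while sourcing
  . "$file"
  set +a
}

# require_sic_key
# Succeeds only if SIC_API_KEY is set and non-empty.
require_sic_key() {
  if [ -z "${SIC_API_KEY:-}" ]; then
    echo "SIC_API_KEY is not set; export it or add it to your .env" >&2
    return 1
  fi
}

# Example:
#   load_env_file && require_sic_key || exit 1
```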

📖 Usage

🤖 TL;DR for Agents

When this skill is installed, you can transcribe any URL from an OpenClaw session and get the JSON results immediately by running: ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

Transcribe a URL

# Basic transcription
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

# Advanced transcription with options
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3" \
  --speakers --words --labels \
  --language "en" \
  --format "srt" \
  --private

Transcribe a Local File

Use this for audio already on your disk. The script handles the upload automatically.

# Upload and transcribe local media
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav"

# Upload with webhook callback
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav" --webhook "https://mysite.com/callback"

# Note: For local files, the skill handles the multi-part upload to
# https://upload.speechischeap.com before starting the transcription.

Supported Options

--speakers: Enable speaker diarization
--words: Enable word-level timestamps
--labels: Enable audio labeling (music, noise, etc.)
--stream: Enable streaming output
--private: Do not store audio/transcript (privacy mode)
--language <code>: ISO language code (e.g., 'en', 'es')
--confidence <float>: Minimum confidence threshold (default 0.5)
--format <fmt>: Output format (json, srt, vtt, webvtt)
--webhook <url>: URL to receive job completion payload
--segment-duration <n>: Segment duration in seconds (default 30)

Check Job Status

./skills/asr/scripts/asr.sh status "job-id-here"
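
When no webhook is configured, the status command can be called in a loop until the job finishes. The sketch below is one way to do that under stated assumptions: the JSON field `status` and the values `completed` and `failed` are hypothetical (this README does not document the response shape), as are the `wait_for_job` function and the `ASR_CMD` override variable. Check a real response and adjust the patterns before relying on it.

```shell
#!/usr/bin/env bash
# Path to the skill's CLI; ASR_CMD is a hypothetical override for testing.
ASR_CMD="${ASR_CMD:-./skills/asr/scripts/asr.sh}"

# wait_for_job JOB_ID [INTERVAL_SECONDS] [MAX_TRIES]
# Polls "asr.sh status JOB_ID" until the output looks finished.
# ASSUMPTION: the status JSON contains "status":"completed" or
# "status":"failed"; these names are not confirmed by the README.
# Returns 0 on completion (printing the JSON), 1 if the command fails,
# 2 if the job reports failure, 3 on timeout.
wait_for_job() {
  local job_id="$1" interval="${2:-5}" max_tries="${3:-60}"
  local i=0 out
  while [ "$i" -lt "$max_tries" ]; do
    out="$("$ASR_CMD" status "$job_id")" || return 1
    case "$out" in
      *'"status":"completed"'*|*'"status": "completed"'*)
        printf '%s\n' "$out"
        return 0 ;;
      *'"status":"failed"'*|*'"status": "failed"'*)
        printf '%s\n' "$out" >&2
        return 2 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for job $job_id" >&2
  return 3
}

# Example: wait_for_job "job-id-here" 5 60 > result.json
```

The string match avoids a dependency on a JSON parser, at the cost of being sensitive to whitespace and field ordering in the response.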

🤖 For Agents

The asr.sh command-line tool returns JSON by default when successful, making it easy to pipe into other tools or parse directly.

If the SIC_API_KEY is missing, the tool will provide a clear error message and a direct link to the signup page.

Usage Guidance

This skill appears to do exactly what it claims: it uploads audio (local files or URLs) to Speech is Cheap and returns transcription job JSON. Before installing, consider:

(1) Trust the vendor: your audio and potentially sensitive content will be sent to api.speechischeap.com / upload.speechischeap.com.
(2) Protect the SIC_API_KEY like any API secret.
(3) Be cautious when passing URLs: the remote service will fetch them (risk of exposing internal endpoints).
(4) If you use webhooks, the skill will include your webhook URL in job requests (ensure the target is trusted).
(5) Ensure curl is available on the agent host (the script expects it).

If you need stronger privacy, verify the provider's 'private' behavior and retention policy, or avoid sending sensitive audio.

Capability Analysis

Type: OpenClaw Skill
Name: asr
Version: 1.2.0

The skill bundle provides an automatic speech-to-text transcription service. The `SKILL.md` and `README.md` files contain clear, functional instructions for the OpenClaw agent to use the `asr.sh` script, without any evidence of prompt injection attempts to subvert the agent's behavior or exfiltrate data. The `asr.sh` script correctly handles the `SIC_API_KEY` environment variable and interacts with the `speechischeap.com` API for transcription, including local file uploads and webhook callbacks. While the script can upload local files and accept arbitrary webhook URLs, these are core functionalities of an ASR service and are not used with malicious intent within the provided code or instructions. There is no evidence of data exfiltration to unauthorized endpoints, malicious execution patterns, or obfuscation.

Capability Assessment

✓ Purpose & Capability

Name, SKILL.md, manifest, and the included scripts all implement an automatic speech-to-text client that calls Speech is Cheap APIs. The single required secret (SIC_API_KEY) is appropriate for an external ASR service. There are no unrelated credentials, surprising binaries, or unrelated install steps.

ℹ Instruction Scope

Runtime instructions and the script only perform expected actions: submit a URL or upload a local file to the service, poll job status, and accept a webhook URL for callbacks. Important operational notes: (1) local files are uploaded (input_file=@...), so any local audio passed to the skill will be transmitted to upload.speechischeap.com; (2) when transcribing by URL, the remote API will fetch the provided URL — supplying internal/private URLs could allow that external service to retrieve internal resources (SSRF risk). These behaviours are expected for a transcription client but are privacy/safety considerations rather than incoherence.
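
One way to reduce the URL-fetch risk described above is to screen URLs on the agent side before they reach the skill. The function below is a conservative sketch, not part of `asr.sh`: `is_public_looking_url` is a hypothetical name, it checks only the literal host string, and it cannot catch a public DNS name that resolves to an internal address.

```shell
#!/usr/bin/env bash
# is_public_looking_url URL
# Returns 0 if URL uses http(s) and its host is not an obvious loopback,
# link-local, or RFC 1918 private address; returns 1 otherwise.
# Limitations: string matching only, no DNS resolution; IPv6 literals are
# rejected outright; a hostname such as "10.example.com" is also rejected.
is_public_looking_url() {
  local url="$1" host
  case "$url" in
    http://*|https://*) ;;
    *) return 1 ;;
  esac
  host="${url#*://}"        # drop the scheme
  host="${host%%[/?#]*}"    # drop path, query, and fragment
  host="${host##*@}"        # drop any userinfo
  case "$host" in
    \[*) return 1 ;;        # IPv6 literal, rejected conservatively
  esac
  host="${host%%:*}"        # drop the port
  host="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"
  case "$host" in
    ''|localhost|*.localhost|*.local|*.internal) return 1 ;;
    127.*|10.*|192.168.*|169.254.*|0.0.0.0) return 1 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 1 ;;
  esac
  return 0
}

# Example:
#   url="https://example.com/audio.mp3"
#   is_public_looking_url "$url" && ./skills/asr/scripts/asr.sh transcribe --url "$url"
```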

ℹ Install Mechanism

There is no install spec (instruction-only skill) and the script is included as a plain bash file — this is low risk. Minor inconsistency: the script uses curl but the skill metadata lists no required binaries; agents should ensure curl (or an equivalent HTTP client) is available.
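
Because the metadata does not declare curl, an agent can verify it before the first call. A minimal sketch; `require_bin` is a hypothetical helper name, not something shipped with the skill.

```shell
#!/usr/bin/env bash
# require_bin NAME...
# Reports each named binary that is missing from PATH on stderr and
# returns 1 if any is missing, 0 otherwise.
require_bin() {
  local missing=0 bin
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      echo "missing required binary: $bin" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: require_bin curl || exit 1
```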

✓ Credentials

Only SIC_API_KEY is declared and used by the script. The environment requirement is proportional to the skill's purpose and is documented in SKILL.md and manifest.json. No other environment variables or secret-scoped config paths are accessed.

✓ Persistence & Privilege

The skill does not request always:true, does not declare elevated persistence, and does not modify other skills or global config. It runs as a CLI wrapper and relies on the agent invoking its commands.

Version History

v1.2.0

tag 1.2.0
Tagger: Ilya Kaminsky

## [1.2.0] - 2026-02-01

### Added

- Add `publish.sh` script for publishing the skill to ClawHub

### Fixed

- Ensure the script is executable by publishing it with OpenClaw

v1.1.0

## [1.1.0] - 2026-02-01

### Changed

- Update `SKILL.md` based on feedback from an OpenClaw agent

### Fixed

- Make `asr.sh` executable

v1.0.0

## [1.0.0] - 2026-02-01

### Added

- Main functionality for the Speech is Cheap ASR skill

Metadata

Slug asr

Version 1.2.0

License —

All-time Installs 10

Active Installs 10

Total Versions 3

Frequently Asked Questions

What is Speech is Cheap Transcribe?

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats. It is an AI Agent Skill for Claude Code / OpenClaw, with 2734 downloads so far.

How do I install Speech is Cheap Transcribe?

Run "/install asr" in the OpenClaw or Claude Code chat to install it in one step. Before the first transcription, set the SIC_API_KEY environment variable as described under Setup.

Is Speech is Cheap Transcribe free?

The skill itself is free to download, install, and use. The Speech is Cheap transcription service it calls is paid: the README quotes $0.06 - $0.12 per hour of audio.

Which platforms does Speech is Cheap Transcribe support?

Speech is Cheap Transcribe runs anywhere OpenClaw / Claude Code is available. The bundled asr.sh is a bash script that expects curl on the host.

Who created Speech is Cheap Transcribe?

It is built and maintained by ilyakam (@ilyakam); the current version is v1.2.0.
