
Speech is Cheap Transcribe

Author: ilyakam · GitHub · v1.2.0

cross-platform ✓ Security check passed

Total downloads: 2734

Install in OpenClaw

/install asr

Description

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

Usage instructions (SKILL.md)

Speech is Cheap (SIC) Skill

Fast, accurate, and incredibly inexpensive automatic speech-to-text transcription service.

🚀 Why use this skill?

Disruptive Pricing: $0.06 - $0.12 per hour (2-15x cheaper than Deepgram or OpenAI).
Extreme Speed: 100 minutes of audio transcribes in ~1 minute.
Multilingual: Supports 100 languages with auto-detection.
Agent-Ready: Designed for high-volume, automated pipelines.

🛠 Setup

1. Get an API Key

2. Configure Authentication

This skill looks for your API key in the SIC_API_KEY environment variable.

Add this to your .env or agent config:

SIC_API_KEY=your_key_here
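If the key lives in a .env file, it still has to reach the environment of the shell that runs the skill. The following is a minimal sketch of one way to do that, assuming the file contains only simple VAR=value lines (it is sourced as shell code, so it must be trusted). The helper name `load_sic_env` is hypothetical and not part of asr.sh.

```shell
# Hypothetical helper (not part of asr.sh): export every VAR=value line from
# an env file into the current shell, then confirm SIC_API_KEY is non-empty.
load_sic_env() {
  local env_file="${1:-./.env}"
  if [ ! -f "$env_file" ]; then
    echo "no such file: $env_file" >&2
    return 1
  fi
  set -a            # auto-export every variable assigned while sourcing
  . "$env_file"     # pass a path containing a slash, e.g. ./.env
  set +a
  if [ -z "${SIC_API_KEY:-}" ]; then
    echo "SIC_API_KEY is not set in $env_file" >&2
    return 1
  fi
}

# Usage:
# load_sic_env ./.env && ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"
```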

📖 Usage

🤖 TL;DR for Agents

When this skill is installed, you can transcribe any URL from an OpenClaw session and get the JSON results immediately by running: ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

Transcribe a URL

# Basic transcription
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

# Advanced transcription with options
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3" \
  --speakers --words --labels \
  --language "en" \
  --format "srt" \
  --private

Transcribe a Local File

Perfect for processing audio already on your disk. This handles the upload automatically.

# Upload and transcribe local media
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav"

# Upload with webhook callback
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav" --webhook "https://mysite.com/callback"

# Note: For local files, the skill handles the multi-part upload to
# https://upload.speechischeap.com before starting the transcription.

Supported Options

--speakers: Enable speaker diarization
--words: Enable word-level timestamps
--labels: Enable audio labeling (music, noise, etc.)
--stream: Enable streaming output
--private: Do not store audio/transcript (privacy mode)
--language <code>: ISO language code (e.g., 'en', 'es')
--confidence <float>: Minimum confidence threshold (default 0.5)
--format <fmt>: Output format (json, srt, vtt, webvtt)
--webhook <url>: URL to receive job completion payload
--segment-duration <n>: Segment duration in seconds (default 30)
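The value passed to --format can be checked locally before a job is submitted. This is a minimal sketch, assuming the four formats listed above are the only accepted values. `is_supported_format` is a hypothetical helper, not part of asr.sh.

```shell
# Hypothetical helper (not part of asr.sh): succeed only for the --format
# values listed in the option table (json, srt, vtt, webvtt).
is_supported_format() {
  case "$1" in
    json|srt|vtt|webvtt) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: fall back to the default output when the requested format is unknown.
# fmt="srt"
# is_supported_format "$fmt" || fmt="json"
# ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3" --format "$fmt"
```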

Check Job Status

./skills/asr/scripts/asr.sh status "job-id-here"
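When no webhook is configured, a caller may want to repeat the status check until the job finishes. The sketch below shows only the generic retry loop. The fields of the status payload are not documented here, so the completion test is left to a caller-supplied command. `poll_until` and `job_is_done` are hypothetical names.

```shell
# poll_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0 or MAX_ATTEMPTS is
# reached, sleeping DELAY_SECONDS between attempts.
poll_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage with a caller-supplied predicate (hypothetical job_is_done, which would
# run `asr.sh status` and inspect the returned JSON):
# poll_until 30 2 job_is_done "job-id-here"
```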

🤖 For Agents

The asr.sh command-line tool returns JSON by default when successful, making it easy to pipe into other tools or parse directly.

If the SIC_API_KEY is missing, the tool will provide a clear error message and a direct link to the signup page.
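Because JSON is written to standard output on success, a pipeline can save it to a file and branch on the exit code. The sketch below assumes asr.sh exits non-zero on failure, which this page does not state explicitly. `run_and_capture` is a hypothetical helper.

```shell
# run_and_capture OUTFILE COMMAND [ARGS...]
# Hypothetical helper: saves COMMAND's stdout to OUTFILE. If COMMAND fails,
# the partial file is removed and COMMAND's exit code is returned.
run_and_capture() {
  local outfile="$1" rc
  shift
  "$@" > "$outfile" || {
    rc=$?
    rm -f "$outfile"
    return "$rc"
  }
}

# Usage (assumes asr.sh signals failure through its exit code):
# run_and_capture result.json ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"
```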

Security recommendations

This skill appears to do exactly what it claims: it uploads audio (local files or URLs) to Speech is Cheap and returns transcription job JSON. Before installing, consider: (1) Trust the vendor — your audio and potentially sensitive content will be sent to api.speechischeap.com / upload.speechischeap.com; (2) Protect the SIC_API_KEY like any API secret; (3) Be cautious when passing URLs — the remote service will fetch them (risk of exposing internal endpoints); (4) If you use webhooks, the skill will include your webhook URL in job requests (ensure the target is trusted); (5) Ensure curl is available on the agent host (script expects it). If you need stronger privacy, verify the provider's 'private' behavior and retention policy or avoid sending sensitive audio.

Functional analysis

Type: OpenClaw Skill
Name: asr
Version: 1.2.0

The skill bundle provides an automatic speech-to-text transcription service. The `SKILL.md` and `README.md` files contain clear, functional instructions for the OpenClaw agent to use the `asr.sh` script, without any evidence of prompt injection attempts to subvert the agent's behavior or exfiltrate data. The `asr.sh` script correctly handles the `SIC_API_KEY` environment variable and interacts with the `speechischeap.com` API for transcription, including local file uploads and webhook callbacks. While the script can upload local files and accept arbitrary webhook URLs, these are core functionalities of an ASR service and are not used with malicious intent within the provided code or instructions. There is no evidence of data exfiltration to unauthorized endpoints, malicious execution patterns, or obfuscation.

Capability assessment

✓ Purpose & Capability

Name, SKILL.md, manifest, and the included scripts all implement an automatic speech-to-text client that calls Speech is Cheap APIs. The single required secret (SIC_API_KEY) is appropriate for an external ASR service. There are no unrelated credentials, surprising binaries, or unrelated install steps.

ℹ Instruction Scope

Runtime instructions and the script only perform expected actions: submit a URL or upload a local file to the service, poll job status, and accept a webhook URL for callbacks. Important operational notes: (1) local files are uploaded (input_file=@...), so any local audio passed to the skill will be transmitted to upload.speechischeap.com; (2) when transcribing by URL, the remote API will fetch the provided URL — supplying internal/private URLs could allow that external service to retrieve internal resources (SSRF risk). These behaviours are expected for a transcription client but are privacy/safety considerations rather than incoherence.
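One way to reduce the SSRF exposure described above is to filter URLs before they are handed to the skill. The following is a minimal sketch based on string matching only: it does not resolve DNS, so a public hostname that points at an internal address would still pass, and a public hostname that happens to begin with a private-looking prefix would be rejected. `is_public_url` is a hypothetical helper, not part of asr.sh.

```shell
# Hypothetical guard (not part of asr.sh): reject URLs whose host is obviously
# internal. A first filter only, not a complete SSRF defence.
is_public_url() {
  local url="$1" host
  case "$url" in
    http://*|https://*) ;;
    *) return 1 ;;                 # only http(s) URLs are accepted
  esac
  host="${url#*://}"               # drop the scheme
  host="${host%%[/?#]*}"           # drop path, query, and fragment
  host="${host##*@}"               # drop userinfo (user:pass@)
  host="${host%%:*}"               # drop the port
  case "$host" in
    ""|localhost|*.local|*.internal) return 1 ;;
    127.*|10.*|192.168.*|169.254.*|0.0.0.0) return 1 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 1 ;;
    \[*) return 1 ;;               # IPv6 literals are rejected wholesale here
  esac
  return 0
}

# Usage:
# url="https://example.com/audio.mp3"
# is_public_url "$url" && ./skills/asr/scripts/asr.sh transcribe --url "$url"
```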

ℹ Install Mechanism

There is no install spec (instruction-only skill) and the script is included as a plain bash file — this is low risk. Minor inconsistency: the script uses curl but the skill metadata lists no required binaries; agents should ensure curl (or an equivalent HTTP client) is available.
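Since the metadata does not declare curl as a requirement, the agent host can verify it before the first run. A minimal sketch follows. `require_binary` is a hypothetical helper, not part of asr.sh.

```shell
# require_binary NAME
# Hypothetical helper: succeed only if NAME resolves to a command on PATH.
require_binary() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing required binary: $1" >&2
    return 1
  fi
}

# Usage:
# require_binary curl && ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"
```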

✓ Credentials

Only SIC_API_KEY is declared and used by the script. The environment requirement is proportional to the skill's purpose and is documented in SKILL.md and manifest.json. No other environment variables or secret-scoped config paths are accessed.

✓ Persistence & Privilege

The skill does not request always:true, does not declare elevated persistence, and does not modify other skills or global config. It runs as a CLI wrapper and relies on the agent invoking its commands.

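How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install asr
Once installation completes, call the Skill by name or trigger it with /asr.
Provide the required inputs as described in the Skill's parameter documentation to get structured output.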

Version history

v1.2.0

tag 1.2.0
Tagger: Ilya Kaminsky

## [1.2.0] - 2026-02-01
### Added
- Add `publish.sh` script for publishing the skill to ClawHub
### Fixed
- Ensure the script is executable by publishing it with OpenClaw

v1.1.0

## [1.1.0] - 2026-02-01
### Changed
- Update `SKILL.md` based on feedback from an OpenClaw agent
### Fixed
- Make `asr.sh` executable

v1.0.0

## [1.0.0] - 2026-02-01
### Added
- Main functionality for the Speech is Cheap ASR skill

Metadata

Slug asr

Version 1.2.0

License —

Total installs 10

Current installs 10

Version count 3

FAQ