
Elevenlabs Transcribe

Name: Elevenlabs Transcribe
Author: paulasjes

by PaulAsjes · GitHub ↗ · v1.0.1

cross-platform ⚠ suspicious

Downloads: 2551

Install in OpenClaw

/install elevenlabs-transcribe

Description

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.

README (SKILL.md)

ElevenLabs Speech-to-Text

Official ElevenLabs skill for speech-to-text transcription.

Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.

Prerequisites

ffmpeg installed (brew install ffmpeg on macOS)
ELEVENLABS_API_KEY environment variable set
Python 3.8+ (dependencies auto-install on first run)

Usage

{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]

Examples

Batch Transcription

Transcribe a local audio file:

{baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:

{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:

{baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Realtime Streaming

Stream from a URL (e.g., live radio, podcast):

{baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:

{baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):

{baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Quiet Mode for Agents

Suppress status messages on stderr:

{baseDir}/scripts/transcribe.sh --mic --quiet

Options

Option	Description
`--diarize`	Identify different speakers in the audio
`--lang CODE`	ISO language hint (e.g., `en`, `pt`, `es`, `fr`)
`--json`	Output full JSON with timestamps and metadata
`--events`	Tag audio events (laughter, music, applause)
`--realtime`	Stream local file instead of batch processing
`--partials`	Show interim transcripts during realtime mode
`-q, --quiet`	Suppress status messages (recommended for agents)
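
As a rough illustration of how the options above combine, the following Python sketch assembles an argument list for the wrapper script. The helper name `build_transcribe_args` and its parameters are hypothetical and not part of the skill. The sketch only builds the list and does not run anything. The source does not say whether every flag combination is accepted, so treat combinations beyond the documented examples as untested.

```python
from typing import List, Optional


def build_transcribe_args(
    script: str,
    source: Optional[str] = None,
    *,
    url: Optional[str] = None,
    mic: bool = False,
    diarize: bool = False,
    lang: Optional[str] = None,
    json_output: bool = False,
    events: bool = False,
    realtime: bool = False,
    partials: bool = False,
    quiet: bool = False,
) -> List[str]:
    """Return an argv list for transcribe.sh using only the documented flags."""
    # Exactly one input kind is expected: a local file, a stream URL, or the mic.
    chosen = [source is not None, url is not None, mic]
    if sum(chosen) != 1:
        raise ValueError("pass exactly one of: source file, url, mic")

    argv = [script]
    if source is not None:
        argv.append(source)
    elif url is not None:
        argv += ["--url", url]
    else:
        argv.append("--mic")

    # Append optional flags in the order they appear in the options table.
    if diarize:
        argv.append("--diarize")
    if lang is not None:
        argv += ["--lang", lang]
    if json_output:
        argv.append("--json")
    if events:
        argv.append("--events")
    if realtime:
        argv.append("--realtime")
    if partials:
        argv.append("--partials")
    if quiet:
        argv.append("--quiet")
    return argv
```

The resulting list could be passed to `subprocess.run(argv, capture_output=True, text=True)`; passing a list rather than a shell string avoids quoting problems with file names that contain spaces.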

Output Format

Text Mode (default)

Plain text transcription:

The quick brown fox jumps over the lazy dog.

JSON Mode (`--json`)

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
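
A minimal sketch of consuming this JSON, assuming the output has the shape shown above (a top-level `words` list whose entries carry `text`, `type`, and `speaker_id`). The function name `words_by_speaker` is hypothetical. Entries whose `type` is not `"word"` are skipped, since only the `"word"` type appears in the sample.

```python
import json
from collections import defaultdict
from typing import Dict


def words_by_speaker(raw: str) -> Dict[str, str]:
    """Group word texts by speaker_id from the --json output."""
    data = json.loads(raw)
    grouped = defaultdict(list)
    for item in data.get("words", []):
        # Only the "word" type is shown in the sample; ignore anything else.
        if item.get("type") != "word":
            continue
        # speaker_id is assumed to be present only when --diarize is used.
        grouped[item.get("speaker_id", "unknown")].append(item["text"])
    return {speaker: " ".join(words) for speaker, words in grouped.items()}
```

Feeding it the captured stdout of a `--diarize --json` run would give one joined string per speaker, in the order the speakers first appear.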

Realtime Mode

Final transcripts print as they're committed. With --partials, interim transcripts also appear, prefixed with [partial]:

[partial] The quick
[partial] The quick brown fox
The quick brown fox jumps over the lazy dog.
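
An agent reading this output line by line needs to tell interim text from committed text. The sketch below assumes the exact `[partial] ` prefix shown in the example and treats every other non-empty line as final. The name `classify_lines` is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

PARTIAL_PREFIX = "[partial] "


def classify_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("partial" | "final", text) for each non-empty output line."""
    for line in lines:
        text = line.rstrip("\n")
        if not text:
            continue
        if text.startswith(PARTIAL_PREFIX):
            # Interim transcript: may be replaced by a later line.
            yield ("partial", text[len(PARTIAL_PREFIX):])
        else:
            # Committed transcript: safe to store or act on.
            yield ("final", text)
```

Because it accepts any iterable of strings, the same function works on a list of captured lines or directly on the `stdout` pipe of a running process.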

Supported Formats

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus
Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: Up to 3GB file size, 10 hours duration
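
The formats and limits above can be checked locally before uploading. This is a sketch under stated assumptions: the extension sets are a guess at how the listed format names map to file suffixes, "3GB" is interpreted as 3 × 1024³ bytes, and the helper `check_media` is hypothetical. The service may apply these limits differently.

```python
from pathlib import PurePath
from typing import List

# Assumed suffixes for the documented audio and video formats.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac", ".aiff", ".opus"}
VIDEO_EXTS = {".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm", ".mpeg", ".3gp"}
MAX_BYTES = 3 * 1024**3      # assumption: "3GB" read as 3 GiB
MAX_SECONDS = 10 * 60 * 60   # 10 hours


def check_media(filename: str, size_bytes: int, duration_s: float = 0.0) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in AUDIO_EXTS | VIDEO_EXTS:
        problems.append(f"unrecognized extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 3 GB")
    if duration_s > MAX_SECONDS:
        problems.append("longer than 10 hours")
    return problems
```

The size can come from `os.path.getsize(path)`. The duration is left as a caller-supplied value because measuring it requires probing the file, for example with ffmpeg's tooling.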

Error Handling

The script exits with non-zero status on errors:

Missing API key: Set ELEVENLABS_API_KEY environment variable
File not found: Check the file path exists
Missing ffmpeg: Install with your package manager
API errors: Check API key validity and rate limits
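
The first three failure cases can be caught before the script is invoked. The following is a minimal preflight sketch rather than the skill's own checks. The function `preflight` and its injectable `env`, `which`, and `exists` parameters are hypothetical and exist only to keep the example testable.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def preflight(
    audio_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
    exists: Callable[[str], bool] = os.path.isfile,
) -> List[str]:
    """Return human-readable problems mirroring the documented error cases."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("Missing API key: set ELEVENLABS_API_KEY")
    if which("ffmpeg") is None:
        problems.append("Missing ffmpeg: install with your package manager")
    if audio_path is not None and not exists(audio_path):
        problems.append(f"File not found: {audio_path}")
    return problems
```

Calling `preflight("recording.mp3")` with the defaults inspects the real environment. API errors such as an invalid key or rate limiting can only be detected from the script's non-zero exit status after a run.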

When to Use Each Mode

Scenario	Command
Transcribe a recording	`./transcribe.sh file.mp3`
Meeting with multiple speakers	`./transcribe.sh meeting.mp3 --diarize`
Live radio/podcast stream	`./transcribe.sh --url <url>`
Voice input from user	`./transcribe.sh --mic --quiet`
Need word timestamps	`./transcribe.sh file.mp3 --json`

Usage Guidance

This skill's code behaves like a normal ElevenLabs transcription client, but before installing:

1) Confirm the publisher: SKILL.md claims 'Official ElevenLabs' but the source/owner are not ElevenLabs; prefer official plugins from the vendor when possible.
2) Review and protect your ELEVENLABS_API_KEY (use a scoped/test key if possible).
3) Be aware the script will create a local .venv and pip-install packages from PyPI (network activity); consider installing in an isolated environment/container.
4) Note load_dotenv() will read a .env file in the skill directory and could load other env vars; remove secrets you don't want read.
5) If you need stronger supply-chain guarantees, request that all requirements be pinned with verified hashes for every platform-specific package, or run the code review/install inside an isolated sandbox first.

Capability Analysis

Type: OpenClaw Skill
Name: elevenlabs-transcribe
Version: 1.0.1

The skill bundle is benign. It provides audio transcription functionality using the ElevenLabs API, supporting local files, URLs, and microphone input. The `SKILL.md` contains no prompt injection attempts. The `transcribe.sh` script securely sets up a Python virtual environment and installs dependencies from `requirements.txt` with SHA256 hashes for supply chain security. The `transcribe.py` script uses `os.getenv` for the API key and `sounddevice` for microphone access, both of which are directly aligned with the skill's stated purpose and do not show any evidence of data exfiltration to unauthorized endpoints, malicious execution, or persistence mechanisms.

Capability Assessment

ℹ Purpose & Capability

The scripts and declared requirements (ffmpeg, python3, ELEVENLABS_API_KEY) align with a speech-to-text skill using ElevenLabs. However, SKILL.md calls this the 'Official ElevenLabs skill' while the registry 'Source' is unknown and the owner ID does not obviously belong to ElevenLabs — possible impersonation or mislabeling.

✓ Instruction Scope

The runtime instructions and scripts stay within the stated purpose: convert audio (file, mic, URL) to text and send audio to ElevenLabs via their SDK. One minor scope note: the Python code calls load_dotenv(), which will read a local .env file if present — that can surface other environment variables from disk (not declared in requires.env).
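
To see which variable names a local .env file would expose before running the skill, a small audit helper can list them without loading anything into the environment. This is a simplified sketch: `dotenv_keys` is a hypothetical name, and the parsing covers only plain `KEY=value` lines with an optional `export ` prefix, not the full syntax that python-dotenv accepts.

```python
from typing import List


def dotenv_keys(text: str) -> List[str]:
    """List variable names defined in .env-style text (simplified parsing)."""
    keys = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, sep, _value = line.partition("=")
        # Keep only lines that actually look like NAME=value.
        if sep and name.strip():
            keys.append(name.strip())
    return keys
```

Anything other than `ELEVENLABS_API_KEY` in the returned list is a candidate to move out of the skill directory.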

ℹ Install Mechanism

There is no platform install spec, but the provided shell wrapper auto-creates a local virtualenv and runs pip install -r requirements.txt. Main dependencies are pinned with hashes for supply-chain integrity (elevenlabs, pydub, python-dotenv), but some platform-specific packages (sounddevice, numpy) are not hashed. pip installs from PyPI on first run (network activity) and writes a .venv directory under the skill folder.
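
The gap described above (some entries pinned with hashes, others not) can be checked mechanically. The sketch below scans requirements-style text and reports entries that carry no `--hash=` option. The function name `unhashed_requirements` is hypothetical, and the parsing is deliberately simple: it joins backslash line continuations and skips comment and option lines, but does not implement pip's full requirements grammar.

```python
from typing import List


def unhashed_requirements(text: str) -> List[str]:
    """Return requirement specifiers that have no --hash= option."""
    # Join backslash-continued lines so each requirement is one logical line.
    logical = text.replace("\\\n", " ")
    missing = []
    for raw in logical.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and option lines such as -r or --index-url.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "--hash=" not in line:
            missing.append(line.split()[0])
    return missing
```

Running it on the contents of `scripts/requirements.txt` would list the unhashed entries; an empty result is what pip's `--require-hashes` mode expects.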

✓ Credentials

Only ELEVENLABS_API_KEY is declared and used; that is appropriate for a transcription client. Note that load_dotenv() may read a .env file from disk and load additional env vars implicitly. The code will transmit audio and the API key (via the ElevenLabs SDK) to ElevenLabs' service — this is expected behavior but worth confirming you're comfortable sending audio to that provider.

✓ Persistence & Privilege

The skill does not request always:true and won't be force-included. It sets up a per-skill .venv and an installed marker in the skill directory; it doesn't modify other skills or system-wide agent settings.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install elevenlabs-transcribe
After installation, invoke the skill by name or use /elevenlabs-transcribe
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

- Updated script and file locations to the new scripts/ directory for better organization.
- Usage examples and documentation now reference {baseDir}/scripts/transcribe.sh.
- requirements.txt, transcribe.py, and transcribe.sh moved into scripts/ directory.
- Old top-level script and requirement files removed; new versions added in scripts/.
- No changes to user-facing options or functionality.

v1.0.0

Initial release of elevenlabs-transcribe skill:
- Transcribe audio to text using ElevenLabs Scribe with support for over 90 languages.
- Batch process local files, transcribe from URLs, or record directly from microphone in real time.
- Features include speaker diarization, event tagging (e.g., laughter, music), and JSON output with word-level timestamps.
- Supports a wide range of audio and video formats up to 3GB/10 hours in duration.
- Quiet mode available to suppress status messages for seamless automation and agent integration.
- Automatic error handling for missing prerequisites (API key, ffmpeg, file not found, API errors).

Metadata

Slug elevenlabs-transcribe

Version 1.0.1

License —

All-time Installs 3

Active Installs 2

Total Versions 2

Frequently Asked Questions

What is Elevenlabs Transcribe?

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files. It is an AI Agent Skill for Claude Code / OpenClaw, with 2551 downloads so far.

How do I install Elevenlabs Transcribe?

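Run "/install elevenlabs-transcribe" in the OpenClaw or Claude Code chat to install it in one step. Before the first run, make sure the prerequisites are in place: ffmpeg, Python 3.8+, and the ELEVENLABS_API_KEY environment variable.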

Is Elevenlabs Transcribe free?

The skill itself can be downloaded and installed at no cost. No license is listed in the metadata, and transcription requires an ElevenLabs API key, so usage is subject to ElevenLabs' own terms and pricing.

Which platforms does Elevenlabs Transcribe support?

Elevenlabs Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Elevenlabs Transcribe?

It is built and maintained by PaulAsjes (@paulasjes); the current version is v1.0.1.
