paulasjes

Elevenlabs Transcribe

by PaulAsjes · GitHub ↗ · v1.0.1
cross-platform ⚠ suspicious
Downloads: 2551 · Stars: 2 · Active Installs: 2 · Versions: 2
Install in OpenClaw
/install elevenlabs-transcribe
Description
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
README (SKILL.md)

ElevenLabs Speech-to-Text

Official ElevenLabs skill for speech-to-text transcription.

Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.

Prerequisites

  • ffmpeg installed (brew install ffmpeg on macOS)
  • ELEVENLABS_API_KEY environment variable set
  • Python 3.8+ (dependencies auto-install on first run)
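The prerequisites above can be verified before the first run. The following is a minimal sketch, not part of the skill itself: the function name `check_prerequisites` and its injectable parameters are hypothetical, and it only checks that `ffmpeg` is on the PATH, that `ELEVENLABS_API_KEY` is non-empty, and that the interpreter is at least Python 3.8.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of problems; an empty list means every check passed.

    The parameters are hypothetical hooks so the checks can be exercised
    without touching the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    if version < (3, 8):
        problems.append("Python 3.8+ is required")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

Returning a list rather than raising on the first failure lets a caller report every missing prerequisite at once.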

Usage

{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]

Examples

Batch Transcription

Transcribe a local audio file:

{baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:

{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:

{baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Realtime Streaming

Stream from a URL (e.g., live radio, podcast):

{baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:

{baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):

{baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Quiet Mode for Agents

Suppress status messages on stderr:

{baseDir}/scripts/transcribe.sh --mic --quiet

Options

Option        Description
--diarize     Identify different speakers in the audio
--lang CODE   ISO language hint (e.g., en, pt, es, fr)
--json        Output full JSON with timestamps and metadata
--events      Tag audio events (laughter, music, applause)
--realtime    Stream local file instead of batch processing
--partials    Show interim transcripts during realtime mode
-q, --quiet   Suppress status messages (recommended for agents)
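When an agent or another program drives the script, it can help to assemble the argument list in one place. The sketch below is an illustration only: `build_command` and its keyword names are hypothetical, the flags are the ones listed above, and the script path is whatever `{baseDir}/scripts/transcribe.sh` resolves to on your system.

```python
from typing import List, Optional


def build_command(script: str, audio_file: Optional[str] = None, *,
                  url: Optional[str] = None, mic: bool = False,
                  diarize: bool = False, lang: Optional[str] = None,
                  json_output: bool = False, events: bool = False,
                  realtime: bool = False, partials: bool = False,
                  quiet: bool = False) -> List[str]:
    """Build an argv list for transcribe.sh from the documented options."""
    chosen = [bool(audio_file), bool(url), mic]
    if sum(chosen) != 1:
        raise ValueError("choose exactly one of audio_file, url, or mic")

    cmd = [script]
    if audio_file:
        cmd.append(audio_file)
    elif url:
        cmd += ["--url", url]
    else:
        cmd.append("--mic")

    if diarize:
        cmd.append("--diarize")
    if lang:
        cmd += ["--lang", lang]
    if json_output:
        cmd.append("--json")
    if events:
        cmd.append("--events")
    if realtime:
        cmd.append("--realtime")
    if partials:
        cmd.append("--partials")
    if quiet:
        cmd.append("--quiet")
    return cmd


if __name__ == "__main__":
    print(build_command("./transcribe.sh", "meeting.mp3", diarize=True, json_output=True))
```

Passing the result to `subprocess.run` as a list avoids shell quoting issues with file names that contain spaces.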

Output Format

Text Mode (default)

Plain text transcription:

The quick brown fox jumps over the lazy dog.

JSON Mode (--json)

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
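A consumer of the JSON output might want speaker turns rather than individual words. The sketch below assumes only the fields shown in the sample above (`words`, `text`, `type`, `speaker_id`); the function name `speaker_turns` and the extra sample words are hypothetical, and skipping entries whose `type` is not `"word"` is an assumption about what else the list may contain.

```python
import json


def speaker_turns(response):
    """Collapse consecutive words from the same speaker into (speaker_id, text) pairs."""
    turns = []
    for word in response.get("words", []):
        if word.get("type") != "word":
            continue  # assumption: non-word entries carry no transcript text we need
        speaker = word.get("speaker_id")
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word["text"])
        else:
            turns.append((speaker, [word["text"]]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


if __name__ == "__main__":
    # Hypothetical sample in the shape of the documented JSON output.
    sample = json.loads("""
    {
      "text": "The quick Hello",
      "words": [
        {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"},
        {"text": "quick", "start": 0.2, "end": 0.4, "type": "word", "speaker_id": "speaker_0"},
        {"text": "Hello", "start": 0.9, "end": 1.2, "type": "word", "speaker_id": "speaker_1"}
      ]
    }
    """)
    for speaker, text in speaker_turns(sample):
        print(f"{speaker}: {text}")
```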

Realtime Mode

Final transcripts print as they're committed. With --partials:

[partial] The quick
[partial] The quick brown fox
The quick brown fox jumps over the lazy dog.
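A program reading this output needs to tell interim lines from committed ones. The sketch below assumes the format shown above, where interim lines start with the literal prefix `[partial] ` and every other non-empty line is a final transcript; the function name `split_transcript_lines` is hypothetical.

```python
PARTIAL_PREFIX = "[partial] "


def split_transcript_lines(lines):
    """Separate realtime output into (partials, finals), dropping blank lines."""
    partials, finals = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith(PARTIAL_PREFIX):
            partials.append(line[len(PARTIAL_PREFIX):])
        else:
            finals.append(line)
    return partials, finals


if __name__ == "__main__":
    output = [
        "[partial] The quick\n",
        "[partial] The quick brown fox\n",
        "The quick brown fox jumps over the lazy dog.\n",
    ]
    partials, finals = split_transcript_lines(output)
    print("partials:", partials)
    print("finals:", finals)
```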

Supported Formats

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus

Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: Up to 3GB file size, 10 hours duration
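The format list and limits above can be checked locally before any audio is uploaded. This is a rough sketch under stated assumptions: the mapping from format names to file extensions is a guess, "3GB" is read as 3 GiB, and the function name `precheck` is hypothetical. The service remains the authority on what it accepts.

```python
import os

# Assumption: common file extensions for the documented format names.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac", ".aiff", ".aif", ".opus",
    ".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".mpeg", ".mpg", ".3gp",
}
MAX_BYTES = 3 * 1024 ** 3   # assumption: "3GB" interpreted as 3 GiB
MAX_SECONDS = 10 * 60 * 60  # 10 hours


def precheck(path, size_bytes, duration_seconds=None):
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 3GB")
    if duration_seconds is not None and duration_seconds > MAX_SECONDS:
        problems.append("duration is longer than 10 hours")
    return problems


if __name__ == "__main__":
    print(precheck("recording.mp3", 5_000_000, duration_seconds=1800))
```

In practice `size_bytes` would come from `os.path.getsize(path)`; it is a parameter here so the check can run without a real file.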

Error Handling

The script exits with non-zero status on errors:

  • Missing API key: Set ELEVENLABS_API_KEY environment variable
  • File not found: Check the file path exists
  • Missing ffmpeg: Install with your package manager
  • API errors: Check API key validity and rate limits
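Because the script signals failure through its exit status, a caller can wrap it and turn a non-zero status into an exception. The sketch below is one way to do that with the standard library; `run_transcribe` is a hypothetical name, and treating stderr as the place where the error message appears is an assumption.

```python
import subprocess


def run_transcribe(cmd, runner=subprocess.run):
    """Run a transcription command; return stdout, or raise on a non-zero exit status."""
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: the script writes its error message to stderr.
        message = result.stderr.strip() or "no error output"
        raise RuntimeError(f"transcription failed (exit {result.returncode}): {message}")
    return result.stdout.strip()
```

A call might look like `run_transcribe(["./transcribe.sh", "recording.mp3", "--quiet"])`, with the path adjusted to where the skill is installed.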

When to Use Each Mode

Scenario                         Command
Transcribe a recording           ./transcribe.sh file.mp3
Meeting with multiple speakers   ./transcribe.sh meeting.mp3 --diarize
Live radio/podcast stream        ./transcribe.sh --url <url>
Voice input from user            ./transcribe.sh --mic --quiet
Need word timestamps             ./transcribe.sh file.mp3 --json
Usage Guidance
This skill's code behaves like a normal ElevenLabs transcription client, but before installing:
  1. Confirm the publisher. SKILL.md claims 'Official ElevenLabs' but the source/owner are not ElevenLabs; prefer official plugins from the vendor when possible.
  2. Review and protect your ELEVENLABS_API_KEY (use a scoped/test key if possible).
  3. Be aware the script will create a local .venv and pip-install packages from PyPI (network activity); consider installing in an isolated environment/container.
  4. Note load_dotenv() will read a .env file in the skill directory and could load other env vars; remove secrets you don't want read.
  5. If you need stronger supply-chain guarantees, request that all requirements be pinned with verified hashes for every platform-specific package, or run the code review/install inside an isolated sandbox first.
Capability Analysis
Type: OpenClaw Skill
Name: elevenlabs-transcribe
Version: 1.0.1
The skill bundle is benign. It provides audio transcription functionality using the ElevenLabs API, supporting local files, URLs, and microphone input. The `SKILL.md` contains no prompt injection attempts. The `transcribe.sh` script securely sets up a Python virtual environment and installs dependencies from `requirements.txt` with SHA256 hashes for supply chain security. The `transcribe.py` script uses `os.getenv` for the API key and `sounddevice` for microphone access, both of which are directly aligned with the skill's stated purpose and do not show any evidence of data exfiltration to unauthorized endpoints, malicious execution, or persistence mechanisms.
Capability Assessment
Purpose & Capability
The scripts and declared requirements (ffmpeg, python3, ELEVENLABS_API_KEY) align with a speech-to-text skill using ElevenLabs. However, SKILL.md calls this the 'Official ElevenLabs skill' while the registry 'Source' is unknown and the owner ID does not obviously belong to ElevenLabs — possible impersonation or mislabeling.
Instruction Scope
The runtime instructions and scripts stay within the stated purpose: convert audio (file, mic, URL) to text and send audio to ElevenLabs via their SDK. One minor scope note: the Python code calls load_dotenv(), which will read a local .env file if present — that can surface other environment variables from disk (not declared in requires.env).
Install Mechanism
There is no platform install spec, but the provided shell wrapper auto-creates a local virtualenv and runs pip install -r requirements.txt. Main dependencies are pinned with hashes for supply-chain integrity (elevenlabs, pydub, python-dotenv), but some platform-specific packages (sounddevice, numpy) are not hashed. pip installs from PyPI on first run (network activity) and writes a .venv directory under the skill folder.
Credentials
Only ELEVENLABS_API_KEY is declared and used; that is appropriate for a transcription client. Note that load_dotenv() may read a .env file from disk and load additional env vars implicitly. The code will transmit audio and the API key (via the ElevenLabs SDK) to ElevenLabs' service — this is expected behavior but worth confirming you're comfortable sending audio to that provider.
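To see which variables a .env file in the skill directory would expose, its keys can be listed before the skill runs. The sketch below is a deliberately simplified reader and not python-dotenv's actual parser: it handles only `NAME=value` lines, an optional `export ` prefix, and `#` comments. The function names `dotenv_keys` and `unexpected_keys` are hypothetical.

```python
def dotenv_keys(text):
    """Return variable names found in .env-style text (simplified syntax only)."""
    keys = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, separator, _value = line.partition("=")
        if separator and name.strip():
            keys.append(name.strip())
    return keys


def unexpected_keys(text, expected=("ELEVENLABS_API_KEY",)):
    """Return keys present in the text that the skill does not declare."""
    return [key for key in dotenv_keys(text) if key not in expected]


if __name__ == "__main__":
    sample = "# local secrets\nELEVENLABS_API_KEY=abc\nexport OTHER_SECRET = x\n"
    print(unexpected_keys(sample))
```

Only names are reported, never values, so the output is safe to paste into a review.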
Persistence & Privilege
The skill does not request always:true and won't be force-included. It sets up a per-skill .venv and an installed marker in the skill directory; it doesn't modify other skills or system-wide agent settings.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install elevenlabs-transcribe
  3. After installation, invoke the skill by name or use /elevenlabs-transcribe
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
  • Updated script and file locations to the new scripts/ directory for better organization.
  • Usage examples and documentation now reference {baseDir}/scripts/transcribe.sh.
  • requirements.txt, transcribe.py, and transcribe.sh moved into scripts/ directory.
  • Old top-level script and requirement files removed; new versions added in scripts/.
  • No changes to user-facing options or functionality.
v1.0.0
Initial release of elevenlabs-transcribe skill:
  • Transcribe audio to text using ElevenLabs Scribe with support for over 90 languages.
  • Batch process local files, transcribe from URLs, or record directly from microphone in real time.
  • Features include speaker diarization, event tagging (e.g., laughter, music), and JSON output with word-level timestamps.
  • Supports a wide range of audio and video formats up to 3GB/10 hours in duration.
  • Quiet mode available to suppress status messages for seamless automation and agent integration.
  • Automatic error handling for missing prerequisites (API key, ffmpeg, file not found, API errors).
Metadata
Slug elevenlabs-transcribe
Version 1.0.1
License
All-time Installs 3
Active Installs 2
Total Versions 2
Frequently Asked Questions

What is Elevenlabs Transcribe?

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files. It is an AI Agent Skill for Claude Code / OpenClaw, with 2551 downloads so far.

How do I install Elevenlabs Transcribe?

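Run "/install elevenlabs-transcribe" in the OpenClaw or Claude Code chat to install it in one step. The prerequisites listed in the README (ffmpeg, the ELEVENLABS_API_KEY environment variable, and Python 3.8+) still need to be in place before the skill can run.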

Is Elevenlabs Transcribe free?

Yes, Elevenlabs Transcribe is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Elevenlabs Transcribe support?

Elevenlabs Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Elevenlabs Transcribe?

It is built and maintained by PaulAsjes (@paulasjes); the current version is v1.0.1.
