
Elevenlabs Transcribe

Name: Elevenlabs Transcribe
Author: paulasjes

By PaulAsjes · GitHub ↗ · v1.0.1

cross-platform ⚠ suspicious

Total downloads: 2551

Install in OpenClaw

/install elevenlabs-transcribe

Description

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.

Usage Instructions (SKILL.md)

ElevenLabs Speech-to-Text

Official ElevenLabs skill for speech-to-text transcription.

Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.

Prerequisites

ffmpeg installed (brew install ffmpeg on macOS)
ELEVENLABS_API_KEY environment variable set
Python 3.8+ (dependencies auto-install on first run)

Usage

{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]

Examples

Batch Transcription

Transcribe a local audio file:

{baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:

{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:

{baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Realtime Streaming

Stream from a URL (e.g., live radio, podcast):

{baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:

{baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):

{baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Quiet Mode for Agents

Suppress status messages on stderr:

{baseDir}/scripts/transcribe.sh --mic --quiet
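For an agent that calls the script from Python, quiet mode pairs naturally with capturing the two output streams separately. The sketch below is a minimal illustration, assuming the transcript is written to stdout (the text above only states that status messages go to stderr). `run_and_capture` is a hypothetical helper name, and the demo uses a stand-in Python command so that it runs without the skill installed.

```python
import subprocess
import sys
from typing import List


def run_and_capture(argv: List[str]) -> str:
    """Run a command, return its stdout, and raise if it exits non-zero."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    if completed.returncode != 0:
        # The skill documents a non-zero exit status on errors.
        detail = completed.stderr.strip() or f"exit code {completed.returncode}"
        raise RuntimeError(detail)
    return completed.stdout.strip()


# Stand-in command that mimics "text on stdout, status on stderr".
demo = [
    sys.executable,
    "-c",
    "import sys; print('hello world'); print('status', file=sys.stderr)",
]
transcript = run_and_capture(demo)
```

In real use, the argument list would look something like `[script_path, "recording.mp3", "--quiet"]`, where `script_path` is the resolved location of `{baseDir}/scripts/transcribe.sh`.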

Options

| Option | Description |
|---|---|
| `--diarize` | Identify different speakers in the audio |
| `--lang CODE` | ISO language hint (e.g., `en`, `pt`, `es`, `fr`) |
| `--json` | Output full JSON with timestamps and metadata |
| `--events` | Tag audio events (laughter, music, applause) |
| `--realtime` | Stream local file instead of batch processing |
| `--partials` | Show interim transcripts during realtime mode |
| `-q, --quiet` | Suppress status messages (recommended for agents) |
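The options above can be assembled programmatically when an agent builds the command line. This is a sketch only: `build_command` and its keyword names are hypothetical, and the rule that exactly one input source (file, URL, or microphone) is given is an assumption inferred from the three usage forms, not something the skill documents.

```python
from typing import List, Optional


def build_command(
    script: str,
    audio_file: Optional[str] = None,
    url: Optional[str] = None,
    mic: bool = False,
    diarize: bool = False,
    lang: Optional[str] = None,
    json_output: bool = False,
    events: bool = False,
    realtime: bool = False,
    partials: bool = False,
    quiet: bool = False,
) -> List[str]:
    """Assemble an argv list for transcribe.sh from the documented flags."""
    # Assumption: the three usage forms are mutually exclusive.
    if sum([audio_file is not None, url is not None, mic]) != 1:
        raise ValueError("choose exactly one of audio_file, url, or mic")

    argv = [script]
    if audio_file is not None:
        argv.append(audio_file)
    elif url is not None:
        argv += ["--url", url]
    else:
        argv.append("--mic")

    if lang:
        argv += ["--lang", lang]
    # Boolean flags, emitted in the order they appear in the options table.
    flags = [
        (diarize, "--diarize"),
        (json_output, "--json"),
        (events, "--events"),
        (realtime, "--realtime"),
        (partials, "--partials"),
        (quiet, "--quiet"),
    ]
    argv += [flag for enabled, flag in flags if enabled]
    return argv
```

For example, `build_command("transcribe.sh", audio_file="meeting.mp3", diarize=True, json_output=True)` yields the same arguments as the diarized JSON example earlier in this document.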

Output Format

Text Mode (default)

Plain text transcription:

The quick brown fox jumps over the lazy dog.

JSON Mode (`--json`)

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
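A consumer of the JSON output might group words by speaker. The following sketch parses the sample payload shown above; `words_by_speaker` is a hypothetical helper, and skipping entries whose `type` is not `"word"` is a defensive assumption, since the sample only shows `"word"` entries.

```python
import json
from typing import Dict, List

# The sample payload from the JSON Mode section above.
sample = """
{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
"""


def words_by_speaker(payload: dict) -> Dict[str, List[str]]:
    """Collect word texts per speaker_id, preserving their order."""
    grouped: Dict[str, List[str]] = {}
    for item in payload.get("words", []):
        if item.get("type") != "word":
            continue  # assumption: other entry types may exist; ignore them
        speaker = item.get("speaker_id", "unknown")
        grouped.setdefault(speaker, []).append(item["text"])
    return grouped


payload = json.loads(sample)
grouped = words_by_speaker(payload)
```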

Realtime Mode

Final transcripts print as they're committed. With --partials:

[partial] The quick
[partial] The quick brown fox
The quick brown fox jumps over the lazy dog.
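When reading this output line by line, interim and committed transcripts can be separated by the `[partial]` prefix. A minimal sketch, assuming the prefix appears exactly as in the sample above; `split_transcript_lines` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple

PARTIAL_PREFIX = "[partial] "


def split_transcript_lines(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (final_lines, partial_lines) from realtime output."""
    finals: List[str] = []
    partials: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(PARTIAL_PREFIX):
            partials.append(line[len(PARTIAL_PREFIX):])
        else:
            finals.append(line)
    return finals, partials


output = [
    "[partial] The quick\n",
    "[partial] The quick brown fox\n",
    "The quick brown fox jumps over the lazy dog.\n",
]
finals, partials = split_transcript_lines(output)
```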

Supported Formats

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus
Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: Up to 3GB file size, 10 hours duration
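A caller could screen files against these formats and the size limit before uploading. The sketch below rests on several assumptions: the mapping from format names to file extensions (including `3gp` for 3GPP) is a guess, "3GB" is interpreted as 3 × 1024³ bytes, and the 10-hour duration limit is not checked because that would require probing the media. `check_upload` is a hypothetical helper.

```python
import os
from typing import List

# Assumed extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {
    "mp3", "wav", "m4a", "flac", "ogg", "webm", "aac", "aiff", "opus",
    "mp4", "avi", "mkv", "mov", "wmv", "flv", "mpeg", "3gp",
}
MAX_BYTES = 3 * 1024 ** 3  # assumption: "3GB" read as binary gigabytes


def check_upload(path: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems: List[str] = []
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

In practice the size would come from `os.path.getsize(path)`; it is passed in here so the check can be exercised without a real file.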

Error Handling

The script exits with non-zero status on errors:

Missing API key: Set ELEVENLABS_API_KEY environment variable
File not found: Check the file path exists
Missing ffmpeg: Install with your package manager
API errors: Check API key validity and rate limits
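The first three conditions can be checked before invoking the script at all. This is a sketch of such a pre-flight check, not the script's own logic; `preflight` is a hypothetical helper, and the `env` and `which` parameters exist only so the check can be exercised without touching the real environment.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def preflight(
    audio_file: Optional[str],
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems matching the documented error cases."""
    env = os.environ if env is None else env
    problems: List[str] = []
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # audio_file is None for --url and --mic modes, which need no local file.
    if audio_file is not None and not os.path.isfile(audio_file):
        problems.append(f"file not found: {audio_file}")
    return problems
```

Calling `preflight("recording.mp3")` with the defaults inspects the current process environment and `PATH`.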

When to Use Each Mode

| Scenario | Command |
|---|---|
| Transcribe a recording | `./transcribe.sh file.mp3` |
| Meeting with multiple speakers | `./transcribe.sh meeting.mp3 --diarize` |
| Live radio/podcast stream | `./transcribe.sh --url <url>` |
| Voice input from user | `./transcribe.sh --mic --quiet` |
| Need word timestamps | `./transcribe.sh file.mp3 --json` |

Security Recommendations

This skill's code behaves like a normal ElevenLabs transcription client, but before installing:

1) Confirm the publisher: SKILL.md claims 'Official ElevenLabs', but the source and owner are not ElevenLabs; prefer official plugins from the vendor when possible.
2) Review and protect your ELEVENLABS_API_KEY (use a scoped or test key if possible).
3) Be aware that the script creates a local .venv and pip-installs packages from PyPI (network activity); consider installing in an isolated environment or container.
4) Note that load_dotenv() reads a .env file in the skill directory and could load other environment variables; remove any secrets you don't want read.
5) If you need stronger supply-chain guarantees, request that all requirements be pinned with verified hashes for every platform-specific package, or run the code review and install inside an isolated sandbox first.

Functional Analysis

Type: OpenClaw Skill
Name: elevenlabs-transcribe
Version: 1.0.1

The skill bundle is benign. It provides audio transcription functionality using the ElevenLabs API, supporting local files, URLs, and microphone input. The `SKILL.md` contains no prompt injection attempts. The `transcribe.sh` script securely sets up a Python virtual environment and installs dependencies from `requirements.txt` with SHA256 hashes for supply chain security. The `transcribe.py` script uses `os.getenv` for the API key and `sounddevice` for microphone access, both of which are directly aligned with the skill's stated purpose and do not show any evidence of data exfiltration to unauthorized endpoints, malicious execution, or persistence mechanisms.

Capability Assessment

ℹ Purpose & Capability

The scripts and declared requirements (ffmpeg, python3, ELEVENLABS_API_KEY) align with a speech-to-text skill using ElevenLabs. However, SKILL.md calls this the 'Official ElevenLabs skill' while the registry 'Source' is unknown and the owner ID does not obviously belong to ElevenLabs — possible impersonation or mislabeling.

✓ Instruction Scope

The runtime instructions and scripts stay within the stated purpose: convert audio (file, mic, URL) to text and send audio to ElevenLabs via their SDK. One minor scope note: the Python code calls load_dotenv(), which will read a local .env file if present — that can surface other environment variables from disk (not declared in requires.env).
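To see which variable names a `.env` file would expose before running the skill, the file can be scanned for keys. This sketch deliberately avoids importing python-dotenv and uses a simplified reading of the `KEY=value` format (comments, blank lines, and an optional `export ` prefix); the real parser handles more cases, so treat the result as an approximation. `dotenv_keys` is a hypothetical helper.

```python
from typing import List


def dotenv_keys(text: str) -> List[str]:
    """List variable names defined in .env-style text (simplified parsing)."""
    keys: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key = line.split("=", 1)[0].strip()
        if key:
            keys.append(key)
    return keys


# Hypothetical file contents, for illustration only.
example = "ELEVENLABS_API_KEY=abc\n# a comment\nexport OTHER_SECRET=xyz\n"
found = dotenv_keys(example)
```

Any name in the result other than `ELEVENLABS_API_KEY` is a candidate for removal from that file.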

ℹ Install Mechanism

There is no platform install spec, but the provided shell wrapper auto-creates a local virtualenv and runs pip install -r requirements.txt. Main dependencies are pinned with hashes for supply-chain integrity (elevenlabs, pydub, python-dotenv), but some platform-specific packages (sounddevice, numpy) are not hashed. pip installs from PyPI on first run (network activity) and writes a .venv directory under the skill folder.
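One way to find the unhashed entries is to scan `requirements.txt` for requirement lines that carry no `--hash=` option. The sketch below assumes the standard pip layout, in which hashes follow the requirement on backslash-continued lines; `unhashed_requirements` is a hypothetical helper, and the version numbers and hash in the sample are placeholders, not the skill's actual pins.

```python
from typing import List


def unhashed_requirements(text: str) -> List[str]:
    """Return requirement specifiers that have no --hash= option."""
    # Join backslash continuations so each requirement is one logical line.
    logical = text.replace("\\\n", " ")
    missing: List[str] = []
    for raw in logical.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and option-only lines such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "--hash=" not in line:
            missing.append(line.split()[0])
    return missing


# Placeholder content for illustration only.
sample = (
    "elevenlabs==0.0.0 \\\n"
    "    --hash=sha256:0000\n"
    "sounddevice==0.0.0\n"
)
missing = unhashed_requirements(sample)
```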

✓ Credentials

Only ELEVENLABS_API_KEY is declared and used; that is appropriate for a transcription client. Note that load_dotenv() may read a .env file from disk and load additional env vars implicitly. The code will transmit audio and the API key (via the ElevenLabs SDK) to ElevenLabs' service — this is expected behavior but worth confirming you're comfortable sending audio to that provider.

✓ Persistence & Privilege

The skill does not request always:true and won't be force-included. It sets up a per-skill .venv and an installed marker in the skill directory; it doesn't modify other skills or system-wide agent settings.

How to Use

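Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install elevenlabs-transcribe
Once installed, invoke the skill by name or trigger it with /elevenlabs-transcribe
Provide the required inputs as described in the skill's parameter documentation to get structured output.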

Version History

v1.0.1

- Updated script and file locations to the new scripts/ directory for better organization.
- Usage examples and documentation now reference {baseDir}/scripts/transcribe.sh.
- requirements.txt, transcribe.py, and transcribe.sh moved into scripts/ directory.
- Old top-level script and requirement files removed; new versions added in scripts/.
- No changes to user-facing options or functionality.

v1.0.0

Initial release of elevenlabs-transcribe skill:
- Transcribe audio to text using ElevenLabs Scribe with support for over 90 languages.
- Batch process local files, transcribe from URLs, or record directly from microphone in real time.
- Features include speaker diarization, event tagging (e.g., laughter, music), and JSON output with word-level timestamps.
- Supports a wide range of audio and video formats up to 3GB/10 hours in duration.
- Quiet mode available to suppress status messages for seamless automation and agent integration.
- Automatic error handling for missing prerequisites (API key, ffmpeg, file not found, API errors).

Metadata

Slug: elevenlabs-transcribe

Version: 1.0.1

License: not specified

Total installs: 3

Current installs: 2

Versions: 2

FAQ