paulasjes

Elevenlabs Transcribe

Author: PaulAsjes · GitHub · v1.0.1
cross-platform ⚠ suspicious
Total downloads: 2551
Favorites: 2
Current installs: 2
Versions: 2
Install in OpenClaw
/install elevenlabs-transcribe
Description
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
Usage instructions (SKILL.md)

ElevenLabs Speech-to-Text

Official ElevenLabs skill for speech-to-text transcription.

Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.

Prerequisites

  • ffmpeg installed (brew install ffmpeg on macOS)
  • ELEVENLABS_API_KEY environment variable set
  • Python 3.8+ (dependencies auto-install on first run)
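
The prerequisites above can be verified before the first run. The following is a minimal sketch, not part of the skill: the function name `check_prerequisites` and its injectable parameters are hypothetical, and it only checks that ffmpeg is on PATH, that ELEVENLABS_API_KEY is non-empty, and that the interpreter is Python 3.8 or newer.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of problems; an empty list means the prerequisites look met.

    `env`, `which`, and `version` are injectable only to make the check testable.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    if version < (3, 8):
        problems.append("Python 3.8+ is required")
    return problems


# Report anything missing on the current machine.
for problem in check_prerequisites():
    print("missing prerequisite:", problem)
```

Running this before invoking the script gives an agent one place to report every missing prerequisite, rather than discovering them one failure at a time.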

Usage

{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]

Examples

Batch Transcription

Transcribe a local audio file:

{baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:

{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:

{baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Realtime Streaming

Stream from a URL (e.g., live radio, podcast):

{baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:

{baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):

{baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Quiet Mode for Agents

Suppress status messages on stderr:

{baseDir}/scripts/transcribe.sh --mic --quiet

Options

Option        Description
--diarize     Identify different speakers in the audio
--lang CODE   ISO language hint (e.g., en, pt, es, fr)
--json        Output full JSON with timestamps and metadata
--events      Tag audio events (laughter, music, applause)
--realtime    Stream local file instead of batch processing
--partials    Show interim transcripts during realtime mode
-q, --quiet   Suppress status messages (recommended for agents)
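
When an agent calls the script programmatically, the options above can be assembled into an argument list. The sketch below is illustrative only: `build_command` and its keyword names are hypothetical, and the rule that exactly one input (file, --url, or --mic) must be given is an assumption drawn from the three usage forms, not something the script is confirmed to enforce.

```python
from typing import List, Optional


def build_command(script: str, source: Optional[str] = None, *,
                  url: Optional[str] = None, mic: bool = False,
                  diarize: bool = False, lang: Optional[str] = None,
                  json_output: bool = False, events: bool = False,
                  realtime: bool = False, partials: bool = False,
                  quiet: bool = False) -> List[str]:
    """Build an argv list for transcribe.sh from the documented options."""
    # Assumption: exactly one input source is allowed per invocation.
    if sum([source is not None, url is not None, mic]) != 1:
        raise ValueError("give exactly one of: source file, url, mic")

    cmd = [script]
    if source is not None:
        cmd.append(source)
    elif url is not None:
        cmd += ["--url", url]
    else:
        cmd.append("--mic")

    # Flags are appended in the same order as the options table.
    if diarize:
        cmd.append("--diarize")
    if lang:
        cmd += ["--lang", lang]
    if json_output:
        cmd.append("--json")
    if events:
        cmd.append("--events")
    if realtime:
        cmd.append("--realtime")
    if partials:
        cmd.append("--partials")
    if quiet:
        cmd.append("--quiet")
    return cmd


print(build_command("scripts/transcribe.sh", "meeting.mp3", diarize=True, json_output=True))
```

Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.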

Output Format

Text Mode (default)

Plain text transcription:

The quick brown fox jumps over the lazy dog.

JSON Mode (--json)

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
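
The `words` array can be turned into per-speaker turns. This is a minimal sketch that assumes only the fields shown in the sample above (`text`, `start`, `end`, `type`, `speaker_id`); the helper name `speaker_turns` is hypothetical. Entries whose `type` is not "word" are skipped, and a missing `speaker_id` is treated as "unknown".

```python
import json


def speaker_turns(response):
    """Merge consecutive words from the same speaker into turns."""
    turns = []
    for word in response.get("words", []):
        if word.get("type") != "word":
            continue  # ignore any non-word entries
        speaker = word.get("speaker_id", "unknown")
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + word["text"]
            turns[-1]["end"] = word["end"]
        else:
            turns.append({"speaker": speaker, "start": word["start"],
                          "end": word["end"], "text": word["text"]})
    return turns


# Made-up sample in the same shape as the --json output.
sample = json.loads("""
{"text": "Hi there. Hello.",
 "words": [
   {"text": "Hi", "start": 0.0, "end": 0.2, "type": "word", "speaker_id": "speaker_0"},
   {"text": "there.", "start": 0.25, "end": 0.5, "type": "word", "speaker_id": "speaker_0"},
   {"text": "Hello.", "start": 0.9, "end": 1.2, "type": "word", "speaker_id": "speaker_1"}
 ]}
""")
for turn in speaker_turns(sample):
    print(f'{turn["speaker"]} [{turn["start"]:.2f}-{turn["end"]:.2f}]: {turn["text"]}')
```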

Realtime Mode

Final transcripts print as they're committed. With --partials:

[partial] The quick
[partial] The quick brown fox
The quick brown fox jumps over the lazy dog.
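
A consumer of the realtime output may want to separate interim lines from committed ones. The sketch below assumes the line format shown above, where partials carry a literal `[partial] ` prefix and every other non-empty line is a final transcript; the function name `split_realtime_lines` is hypothetical.

```python
PARTIAL_PREFIX = "[partial] "


def split_realtime_lines(lines):
    """Yield ("partial", text) or ("final", text) for each non-empty line."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith(PARTIAL_PREFIX):
            yield ("partial", line[len(PARTIAL_PREFIX):])
        else:
            yield ("final", line)


output = [
    "[partial] The quick\n",
    "[partial] The quick brown fox\n",
    "The quick brown fox jumps over the lazy dog.\n",
]
for kind, text in split_realtime_lines(output):
    print(kind, "->", text)
```

Because the function accepts any iterable of lines, it can also be fed the stdout pipe of a running process line by line.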

Supported Formats

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus

Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: Up to 3GB file size, 10 hours duration
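
A local pre-check can reject obviously unsuitable files before any upload. The following is a rough sketch under stated assumptions: the mapping from format names to file extensions is a guess (the script may detect formats differently), "3GB" is interpreted as 3 GiB, and the 10-hour duration limit is not checked because that would require probing the media. The name `precheck` is hypothetical.

```python
import os

# Assumed extension mapping for the listed audio and video formats.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac", ".aiff", ".opus",
    ".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".mpeg", ".3gp",
}
MAX_BYTES = 3 * 1024 ** 3  # assumption: "3GB" read as 3 GiB


def precheck(path, size_bytes=None):
    """Return None if the file looks acceptable, else a short reason string."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported extension: {ext or '(none)'}"
    # size_bytes can be passed in to skip the filesystem lookup.
    size = os.path.getsize(path) if size_bytes is None else size_bytes
    if size > MAX_BYTES:
        return "file exceeds the 3 GB limit"
    return None


print(precheck("notes.txt", size_bytes=10))
```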

Error Handling

The script exits with non-zero status on errors:

  • Missing API key: Set ELEVENLABS_API_KEY environment variable
  • File not found: Check the file path exists
  • Missing ffmpeg: Install with your package manager
  • API errors: Check API key validity and rate limits
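
Since the script signals failure through its exit status, a caller can wrap it as follows. This is a minimal sketch: `run_transcribe` is a hypothetical helper, and it assumes the script writes its error message to stderr, which the source states only for status messages.

```python
import subprocess


def run_transcribe(cmd, timeout=None):
    """Run a transcription command; return stdout text or raise RuntimeError."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError as exc:
        # The script path itself could not be executed.
        raise RuntimeError(f"script not found: {cmd[0]}") from exc
    if result.returncode != 0:
        # Non-zero exit: surface the exit status and whatever was written to stderr.
        raise RuntimeError(
            f"transcribe failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout.strip()
```

A call such as `run_transcribe(["scripts/transcribe.sh", "recording.mp3", "--quiet"])` would then either return the transcript or raise with the script's own error text.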

When to Use Each Mode

Scenario                         Command
Transcribe a recording           ./transcribe.sh file.mp3
Meeting with multiple speakers   ./transcribe.sh meeting.mp3 --diarize
Live radio/podcast stream        ./transcribe.sh --url <url>
Voice input from user            ./transcribe.sh --mic --quiet
Need word timestamps             ./transcribe.sh file.mp3 --json
Security recommendations
This skill's code behaves like a normal ElevenLabs transcription client, but before installing:

  1. Confirm the publisher. SKILL.md claims 'Official ElevenLabs', but the source and owner are not ElevenLabs; prefer official plugins from the vendor when possible.
  2. Review and protect your ELEVENLABS_API_KEY (use a scoped or test key if possible).
  3. Be aware that the script creates a local .venv and pip-installs packages from PyPI (network activity); consider installing in an isolated environment or container.
  4. Note that load_dotenv() reads a .env file in the skill directory and could load other environment variables; remove secrets you don't want read.
  5. If you need stronger supply-chain guarantees, request that all requirements be pinned with verified hashes for every platform-specific package, or run the code review and install inside an isolated sandbox first.
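
To act on the .env warning, one option is to list which variable names a .env file would expose besides the API key. The sketch below is a simplified illustration: `unexpected_env_names` is a hypothetical helper, and its parser handles only plain `NAME=value` and `export NAME=value` lines, not the full syntax that python-dotenv accepts.

```python
from pathlib import Path


def unexpected_env_names(dotenv_path, allowed=("ELEVENLABS_API_KEY",)):
    """Return variable names defined in a .env file that are not in `allowed`."""
    path = Path(dotenv_path)
    if not path.is_file():
        return []
    names = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name not in allowed:
            names.append(name)
    return names
```

Only names are returned, never values, so the result is safe to print or log.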
Functional analysis
Type: OpenClaw Skill
Name: elevenlabs-transcribe
Version: 1.0.1

The skill bundle is benign. It provides audio transcription functionality using the ElevenLabs API, supporting local files, URLs, and microphone input. The `SKILL.md` contains no prompt injection attempts. The `transcribe.sh` script securely sets up a Python virtual environment and installs dependencies from `requirements.txt` with SHA256 hashes for supply chain security. The `transcribe.py` script uses `os.getenv` for the API key and `sounddevice` for microphone access, both of which are directly aligned with the skill's stated purpose and do not show any evidence of data exfiltration to unauthorized endpoints, malicious execution, or persistence mechanisms.
Capability assessment
Purpose & Capability
The scripts and declared requirements (ffmpeg, python3, ELEVENLABS_API_KEY) align with a speech-to-text skill using ElevenLabs. However, SKILL.md calls this the 'Official ElevenLabs skill' while the registry 'Source' is unknown and the owner ID does not obviously belong to ElevenLabs — possible impersonation or mislabeling.
Instruction Scope
The runtime instructions and scripts stay within the stated purpose: convert audio (file, mic, URL) to text and send audio to ElevenLabs via their SDK. One minor scope note: the Python code calls load_dotenv(), which will read a local .env file if present — that can surface other environment variables from disk (not declared in requires.env).
Install Mechanism
There is no platform install spec, but the provided shell wrapper auto-creates a local virtualenv and runs pip install -r requirements.txt. Main dependencies are pinned with hashes for supply-chain integrity (elevenlabs, pydub, python-dotenv), but some platform-specific packages (sounddevice, numpy) are not hashed. pip installs from PyPI on first run (network activity) and writes a .venv directory under the skill folder.
Credentials
Only ELEVENLABS_API_KEY is declared and used; that is appropriate for a transcription client. Note that load_dotenv() may read a .env file from disk and load additional env vars implicitly. The code will transmit audio and the API key (via the ElevenLabs SDK) to ElevenLabs' service — this is expected behavior but worth confirming you're comfortable sending audio to that provider.
Persistence & Privilege
The skill does not request always:true and won't be force-included. It sets up a per-skill .venv and an installed marker in the skill directory; it doesn't modify other skills or system-wide agent settings.
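How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install elevenlabs-transcribe
  3. Once installed, invoke the skill by name or trigger it with /elevenlabs-transcribe.
  4. Provide the inputs described in the skill's options to get structured output.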
Version history
v1.0.1
  • Updated script and file locations to the new scripts/ directory for better organization.
  • Usage examples and documentation now reference {baseDir}/scripts/transcribe.sh.
  • requirements.txt, transcribe.py, and transcribe.sh moved into scripts/ directory.
  • Old top-level script and requirement files removed; new versions added in scripts/.
  • No changes to user-facing options or functionality.
v1.0.0
Initial release of elevenlabs-transcribe skill:
  • Transcribe audio to text using ElevenLabs Scribe with support for over 90 languages.
  • Batch process local files, transcribe from URLs, or record directly from microphone in real time.
  • Features include speaker diarization, event tagging (e.g., laughter, music), and JSON output with word-level timestamps.
  • Supports a wide range of audio and video formats up to 3GB/10 hours in duration.
  • Quiet mode available to suppress status messages for seamless automation and agent integration.
  • Automatic error handling for missing prerequisites (API key, ffmpeg, file not found, API errors).
Metadata
Slug: elevenlabs-transcribe
Version: 1.0.1
License:
Total installs: 3
Current installs: 2
Version count: 2
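FAQ

What is Elevenlabs Transcribe?

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files. It is an AI agent skill plugin for Claude Code / OpenClaw, with 2551 total downloads to date.

How do I install Elevenlabs Transcribe?

Run the command /install elevenlabs-transcribe in the OpenClaw or Claude Code chat box to install it in one step. The skill itself still needs the items listed under Prerequisites (ffmpeg, ELEVENLABS_API_KEY, and Python 3.8+).

Is Elevenlabs Transcribe free?

Yes. Elevenlabs Transcribe is completely free (free and open source) to download, install, and use.

Which platforms does Elevenlabs Transcribe support?

Elevenlabs Transcribe is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Elevenlabs Transcribe?

It is developed and maintained by PaulAsjes (@paulasjes); the current version is v1.0.1.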
