Transcribee
by itsfabioroma · v1.2.1
3185 Downloads
6 Stars
10 Active Installs
4 Versions
Install in OpenClaw
/install transcribee
Description
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.
Usage Guidance
Before installing or enabling this skill:
1) Expect it to require two API keys (ELEVEN_LABS_API_KEY and ANTHROPIC_API_KEY) and system binaries (yt-dlp, ffmpeg), even though the registry listing omitted them. Verify and supply keys only if you trust those services.
2) Be aware that audio and transcripts are uploaded to external services; if privacy-sensitive audio will be transcribed, consider running a local-only alternative.
3) The skill will read your ~/Documents/transcripts/ library and write new transcript folders there. Review or sandbox it if you don't want that folder modified.
4) Verify the skill's source (there is no homepage listed). Prefer installing from a trusted repo, and inspect the .env.example and index.ts for any extra endpoints or hardcoded secrets.
5) If you allow autonomous agent invocation, consider restricting its access or running the skill manually the first few times to confirm behavior.
If you want to go ahead, run it in an isolated environment, provide least-privilege API keys, and review the code for any hidden network calls not documented in README/CLAUDE.md.
Capability Analysis
Type: OpenClaw Skill
Name: transcribee
Version: 1.2.1
The skill is classified as benign. It transparently uses `yt-dlp` and `ffmpeg` for media processing and `ElevenLabs` and `Anthropic` APIs for transcription and categorization, which aligns with its stated purpose. External command execution is handled using `execFileAsync`, which is a safer method than `exec` as it prevents shell injection. Output files are saved to a user-owned directory (`~/Documents/transcripts`). There is no evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the OpenClaw agent itself; the prompt engineering observed is for the internal Anthropic LLM used for categorization.
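The difference between `execFile` and `exec` noted above can be illustrated with a minimal sketch. It assumes Node.js; `buildYtDlpArgs` and `downloadAudio` are hypothetical helpers written for illustration, not the skill's actual code, and the yt-dlp flags shown are one plausible way to extract audio.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

// Promisified execFile; the name mirrors the one cited in the analysis.
const execFileAsync = promisify(execFile);

// Hypothetical helper: builds the yt-dlp argument vector for audio extraction.
// The URL is one array element, so no shell ever parses it.
function buildYtDlpArgs(url: string, outputTemplate: string): string[] {
  return ["-x", "--audio-format", "mp3", "-o", outputTemplate, "--", url];
}

// Hypothetical wrapper: runs yt-dlp directly, without spawning a shell.
async function downloadAudio(url: string, outputTemplate: string): Promise<string> {
  const { stdout } = await execFileAsync("yt-dlp", buildYtDlpArgs(url, outputTemplate));
  return stdout;
}

// A hostile "URL" stays a single inert argument instead of becoming a second command.
const hostile = "https://example.com/v; rm -rf ~";
console.log(buildYtDlpArgs(hostile, "audio.%(ext)s"));
```

With `exec`, the same string would be interpolated into a shell command line, where the `;` would start a new command. With `execFile`, yt-dlp simply receives an odd-looking URL and rejects it.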
Capability Assessment
Purpose & Capability
Name/description (transcribing YouTube/local files with diarization and auto-organization) matches the included code. However, the registry metadata claims no required env vars/binaries while the code clearly requires ELEVEN_LABS_API_KEY and ANTHROPIC_API_KEY and expects yt-dlp/ffmpeg. README/CLAUDE.md mention Instagram and TikTok support, but the shipped wrapper (transcribe.sh) warns only about YouTube, so the scope and claims are inconsistent.
Instruction Scope
Runtime instructions and scripts run yt-dlp/ffmpeg (downloads/extracts media), call ElevenLabs and Anthropic SDKs, and read/write the user's library at ~/Documents/transcripts/. The code reads existing transcripts to decide categories. It does not appear to access unrelated system credentials, but it will transmit user audio/transcripts to external services (ElevenLabs and Anthropic). This is a privacy/telemetry consideration that is expected but worth noting.
Install Mechanism
There is no install spec; the package includes a package.json and pnpm lock (uses npm packages 'elevenlabs' and '@anthropic-ai/sdk'). This is a moderate-risk, standard npm dependency surface; no arbitrary download URLs or extract-from-remote artifacts were found. Running pnpm install will pull dependencies from public registries.
Credentials
The code requires ELEVEN_LABS_API_KEY and ANTHROPIC_API_KEY (and expects a local .env in the skill directory), and expects system binaries yt-dlp and ffmpeg. The registry metadata reported no required env vars or binaries, which is a clear mismatch. Requesting API keys for the transcription and classification services is reasonable for the stated purpose, but the omission in metadata is a red flag (it hides required credentials).
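Because the registry metadata does not declare these credentials, a quick preflight check can confirm they are present before a run. The sketch below is an assumption about how such a check might look: the variable names come from the analysis above, `missingEnvVars` is a hypothetical helper, and it reads only the process environment (it does not parse a .env file).

```typescript
// Variable names taken from the analysis above; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = ["ELEVEN_LABS_API_KEY", "ANTHROPIC_API_KEY"] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}

// Report anything missing in the current environment.
const missing = missingEnvVars(process.env);
if (missing.length > 0) {
  console.error(`Missing required environment variables: ${missing.join(", ")}`);
}
```

Taking the environment as a parameter keeps the check easy to exercise with a plain object, without touching real keys.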
Persistence & Privilege
The skill sets always:false and does not request elevated system privileges. It reads/writes a folder under the user's home (~/Documents/transcripts) and creates temporary audio in OS tmpdir. Autonomous invocation is allowed (platform default); weigh that together with the external API access if you intend to allow agent-initiated runs.
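The file-system footprint described above (a library under the home directory plus scratch audio in the OS temp directory) can be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: `withTempDir` and the `transcribee-` prefix are hypothetical.

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { join } from "node:path";

// Library location named in the analysis.
const libraryDir = join(homedir(), "Documents", "transcripts");

// Hypothetical helper: runs `work` with a private temp directory and always removes it.
function withTempDir<T>(work: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), "transcribee-"));
  try {
    return work(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// The callback sees an existing directory; afterwards it has been cleaned up.
const usedDir = withTempDir((dir) => dir);
console.log(libraryDir, existsSync(usedDir));
```

The helper is synchronous to keep the sketch short. Real download and transcription steps are asynchronous, so an equivalent would need to await the work before removing the directory.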
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install transcribee
- After installation, invoke the skill by name or use /transcribee
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.2.1
Add TikTok support to docs
v1.2.0
Rich metadata + cleaner output (2 files default, --raw for timestamps)
v1.1.0
Add Instagram Reels support
v1.0.0
- Initial release of Transcribee.
- Transcribe YouTube videos and local audio/video files with speaker diarization.
- Outputs clean, speaker-labeled transcripts ready for analysis.
- Supports various audio (mp3, m4a, wav, ogg, flac) and video (mp4, mkv, webm, mov, avi) formats.
- Organizes transcripts and metadata in a structured output directory.
- Includes troubleshooting steps and clear usage instructions.
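The supported formats listed for v1.0.0 lend themselves to a simple extension check. The following is a minimal sketch: the extension lists come from the release notes above, while `classifyMedia` and the `MediaKind` type are hypothetical names, and the skill may detect formats differently.

```typescript
import { extname } from "node:path";

// Extension lists taken from the v1.0.0 release notes.
const AUDIO_EXTENSIONS = new Set(["mp3", "m4a", "wav", "ogg", "flac"]);
const VIDEO_EXTENSIONS = new Set(["mp4", "mkv", "webm", "mov", "avi"]);

type MediaKind = "audio" | "video" | "unsupported";

// Hypothetical helper: classifies a local file by its (case-insensitive) extension.
function classifyMedia(filePath: string): MediaKind {
  const ext = extname(filePath).slice(1).toLowerCase();
  if (AUDIO_EXTENSIONS.has(ext)) return "audio";
  if (VIDEO_EXTENSIONS.has(ext)) return "video";
  return "unsupported";
}

console.log(classifyMedia("interview.M4A"), classifyMedia("talk.webm"), classifyMedia("notes.txt"));
```

A check like this only inspects the file name; a video file would still need its audio track extracted (for example with ffmpeg) before transcription.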
Frequently Asked Questions
What is Transcribee?
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis. It is an AI Agent Skill for Claude Code / OpenClaw, with 3185 downloads so far.
How do I install Transcribee?
Run "/install transcribee" in the OpenClaw or Claude Code chat to install it in one step. To run, the skill also needs the ELEVEN_LABS_API_KEY and ANTHROPIC_API_KEY keys and the yt-dlp and ffmpeg binaries.
Is Transcribee free?
Yes, Transcribee is free and open-source: you can download, install and use it at no cost. The ElevenLabs and Anthropic API keys it requires are obtained separately from those services.
Which platforms does Transcribee support?
Transcribee is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Transcribee?
It is built and maintained by itsfabioroma (@itsfabioroma); the current version is v1.2.1.