
transcription

Name: transcription
Author: djismgaming

by djismgaming · GitHub ↗ · v1.0.1 · MIT-0

cross-platform ⚠ suspicious

Downloads: 482

Install in OpenClaw

/install transcription

Description

Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r...

Usage Guidance

This skill runs the included Python scripts and invokes ffmpeg, then uploads whatever audio/video you provide to the hardcoded endpoint http://192.168.0.11:8080/v1. Before installing or using it:

1) Confirm that 192.168.0.11 is a trusted, intended Whisper service, or modify the scripts to point to a trusted endpoint or to read the endpoint from a config file or environment variable.
2) Ensure ffmpeg is installed and the Python 'requests' package is available (the skill metadata does not declare these).
3) Do not send sensitive audio to the skill until you have verified the receiving host and its retention/privacy policies.
4) If you need to use the official OpenAI-hosted API, update the endpoint and authentication accordingly rather than relying on the hardcoded address.
5) For extra caution, run the scripts on a sandbox machine and test with non-sensitive files first.
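The first recommendation above mentions reading the endpoint from an environment variable. A minimal sketch of what that could look like, assuming a hypothetical variable name WHISPER_BASE_URL (the skill does not define one) and falling back to the address the scripts currently hardcode:

```python
import os

# Address currently hardcoded in the skill's scripts, per this listing.
DEFAULT_BASE_URL = "http://192.168.0.11:8080/v1"

# Hypothetical variable name; the skill itself does not define one.
ENV_VAR = "WHISPER_BASE_URL"


def resolve_base_url(env=None):
    """Return the endpoint from the environment, or the hardcoded default."""
    env = os.environ if env is None else env
    value = env.get(ENV_VAR, "").strip()
    # Drop a trailing slash so request paths can be appended consistently.
    return (value or DEFAULT_BASE_URL).rstrip("/")
```

With something like this in place, pointing the scripts at a different server would only require setting WHISPER_BASE_URL before running them, with no edits to the scripts.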

Capability Analysis

Type: OpenClaw Skill
Name: transcription
Version: 1.0.1

The transcription skill bundle is designed to transcribe audio and video files using a local OpenAI Whisper API endpoint (192.168.0.11). It uses ffmpeg to extract audio from video files, along with Python's standard subprocess module and the third-party requests package. The code logic is transparent, aligns with the stated purpose in SKILL.md, and contains no evidence of malicious intent, data exfiltration, or prompt injection. A minor unreachable code block in scripts/transcribe_audio.py appears to be a harmless copy-paste error.

Capability Assessment

ℹ Purpose & Capability

The name and description say 'OpenAI Whisper API' and the code calls a Whisper-compatible endpoint, so the two are coherent. Minor mismatch: SKILL.md and the scripts point to a hardcoded local endpoint (http://192.168.0.11:8080/v1) rather than the public OpenAI cloud API. This is plausible for a self-hosted Whisper server but should be made explicit to users.
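To see why the hardcoded address cannot be the public OpenAI API, it helps to check what kind of host the URL names. The sketch below uses only the Python standard library; the function name describe_endpoint is illustrative and not part of the skill.

```python
import ipaddress
from urllib.parse import urlsplit


def describe_endpoint(url):
    """Report the scheme, host, port, and whether the host is a private IP literal."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        is_private = ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal (for example a DNS name); cannot be judged here.
        is_private = None
    return {
        "scheme": parts.scheme,
        "host": host,
        "port": parts.port,
        "private_ip": is_private,
    }


print(describe_endpoint("http://192.168.0.11:8080/v1"))
# {'scheme': 'http', 'host': '192.168.0.11', 'port': 8080, 'private_ip': True}
```

The address falls in the 192.168.0.0/16 private range, so uploads stay on whatever local network the machine is attached to, and the scheme is plain HTTP rather than HTTPS.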

⚠ Instruction Scope

Runtime instructions and scripts will upload user-supplied audio/video files to a fixed HTTP endpoint (192.168.0.11:8080) and call ffmpeg locally. That means any file you provide will be transmitted to that host; ensure that host is trusted and network access is intended. The instructions do not provide an option to override the endpoint via an environment variable or configuration.
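The flow described here (extract audio locally with ffmpeg, then upload it) can be sketched roughly as follows. This is not the skill's actual code: the function names are hypothetical, the ffmpeg options (mono, 16 kHz output) are a common choice for speech models rather than something the listing states, and the /audio/transcriptions path assumes the server follows the OpenAI-compatible route layout. The sketch only builds the command and the URL; it does not run ffmpeg or send any data.

```python
from pathlib import Path

# Video formats named in the skill's release notes.
VIDEO_SUFFIXES = {".mp4", ".mov", ".mkv"}


def needs_audio_extraction(path):
    """Return True when the file looks like a video that needs its audio extracted."""
    return Path(path).suffix.lower() in VIDEO_SUFFIXES


def build_ffmpeg_command(src, dst):
    """Build an ffmpeg argument list that drops video and writes mono 16 kHz audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", str(src),  # input file
        "-vn",           # discard the video stream
        "-ac", "1",      # one audio channel (assumed setting)
        "-ar", "16000",  # 16 kHz sample rate (assumed setting)
        str(dst),
    ]


def transcription_url(base_url):
    """Append the assumed OpenAI-compatible transcription route to the base URL."""
    return base_url.rstrip("/") + "/audio/transcriptions"
```

The argument list can be passed to subprocess.run(cmd, check=True). Building it as a list rather than a shell string avoids quoting problems with file names that contain spaces.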

ℹ Install Mechanism

There is no install spec (instruction-only), which limits disk writes, but the skill ships Python scripts that have external runtime dependencies. The SKILL metadata does not declare required binaries or Python dependencies, even though the scripts call ffmpeg and import the requests library.
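Because these dependencies are undeclared, a quick preflight check before first use can save a confusing failure. A minimal sketch using only the standard library; the function name missing_requirements is illustrative and not part of the skill:

```python
import importlib.util
import shutil


def missing_requirements(binaries=("ffmpeg",), modules=("requests",)):
    """Return a list describing each binary or Python module that cannot be found."""
    missing = []
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(name) is None:
            missing.append(f"binary: {name}")
    for name in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(name) is None:
            missing.append(f"python module: {name}")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("all requirements found" if not problems else "\n".join(problems))
```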

ℹ Credentials

The skill requests no credentials or environment variables (good), but it hardcodes an HTTP endpoint and model name. Because no auth is declared, the endpoint is assumed unauthenticated; ensure this is correct for your environment. Also the skill fails to declare that it requires ffmpeg and Python requests.
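If the endpoint does turn out to require authentication, one way to supply a token without hardcoding it is an optional environment variable. A minimal sketch, assuming a hypothetical variable name WHISPER_API_KEY and bearer-token authentication of the kind the OpenAI-hosted API uses; the skill declares neither:

```python
import os


def build_headers(env=None):
    """Return request headers, adding a bearer token only when one is configured."""
    env = os.environ if env is None else env
    # WHISPER_API_KEY is a hypothetical name chosen for this sketch.
    token = env.get("WHISPER_API_KEY", "").strip()
    return {"Authorization": f"Bearer {token}"} if token else {}
```

An unauthenticated local server keeps working unchanged, since the function returns an empty dictionary when no token is set.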

✓ Persistence & Privilege

The always flag is false, and the skill doesn't request persistent platform privileges or modify other skills. It runs as-invoked and does not declare any elevated host privileges.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install transcription
After installation, invoke the skill by name or use /transcription
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

- Added a new script: scripts/transcribe.sh.
- Updated documentation to include manual script usage instructions for direct command-line transcription.
- Removed language selection and auto-detect language features from feature list.
- Simplified usage examples to focus on running scripts via command line.

v1.0.0

Transcription skill initial release:
- Transcribe audio and video files using a local OpenAI Whisper API.
- Supports a wide range of audio (mp3, wav, ogg, etc.) and video (mp4, mov, mkv, etc.) formats.
- Automatic language detection or specify language for transcription.
- Extract timestamps, choose output formats (text, JSON, SRT, VTT).
- Batch processing: send multiple files for simultaneous transcription.
- Automatically extracts audio from video files before transcription.

Metadata

Slug transcription

Version 1.0.1

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 2

Frequently Asked Questions

What is transcription?

Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r... It is an AI Agent Skill for Claude Code / OpenClaw, with 482 downloads so far.

How do I install transcription?

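Run "/install transcription" in the OpenClaw or Claude Code chat to install it in one step. The scripts also need ffmpeg and the Python requests package, which the skill metadata does not declare, so install those separately.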

Is transcription free?

Yes, transcription is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does transcription support?

transcription is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created transcription?

It is built and maintained by djismgaming (@djismgaming); the current version is v1.0.1.
