
Douyin Transcriber

Name: Douyin Transcriber
Author: don068589

By Don Li · GitHub · v1.0.5 · MIT-0

cross-platform ⚠ suspicious

116

Total downloads

Current installs

Versions

Install in OpenClaw

/install douyin-transcriber

Description

Transcribe speech from audio or video files, automatically extracting audio and converting to text using Docker Whisper ASR for Douyin/TikTok media.

Usage (SKILL.md)

Douyin Transcriber

Transcribe audio/video files to text using local Docker Whisper ASR.

Quick Start

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@/path/to/video.mp4"

The container has built-in ffmpeg for automatic audio extraction.

Prerequisites

Tool	Purpose	Install
Docker	Whisper ASR	Docker Desktop
ffmpeg	Audio extraction	`winget install Gyan.FFmpeg` (Windows)

Deploy Whisper ASR:

docker run -d -p PORT:PORT -e ASR_MODEL=small -e ASR_ENGINE=faster_whisper --name whisper-asr onerahmet/openai-whisper-asr-webservice:latest

Workflow

Step 1: Extract Audio from Video

ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y

Parameters:

-ar 16000: 16 kHz sample rate
-ac 1: Mono channel
-c:a pcm_s16le: 16-bit PCM
-y: Overwrite the output file without prompting
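For scripted use, the Step 1 command can be wrapped in a small helper. This is a minimal sketch, assuming `ffmpeg` is on the PATH; the function names `build_extract_command` and `extract_audio` are hypothetical and not part of the skill.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_command(video: str, wav: Optional[str] = None) -> List[str]:
    """Return an ffmpeg argv mirroring the Step 1 command (hypothetical helper)."""
    # Default output: same path as the input with a .wav suffix.
    out = wav or str(Path(video).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-i", video,
        "-ar", "16000",        # 16 kHz sample rate
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit PCM
        out,
        "-y",                  # overwrite an existing output file
    ]


def extract_audio(video: str, wav: Optional[str] = None) -> str:
    """Run ffmpeg and return the path of the WAV file it wrote."""
    cmd = build_extract_command(video, wav)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-2]
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.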

Step 2: Transcribe

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav"

Optional: specify language

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav" -F "language=zh"
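The same upload can be made from Python with only the standard library. This is a sketch under stated assumptions: it sends the form fields named in the curl commands (`audio_file`, and `language` when given), the `port` argument stands in for the PORT placeholder, and the helper names are hypothetical. Whether the service reads `language` from the form body or from the query string is not confirmed here.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple


def build_multipart(filename: str, data: bytes,
                    language: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode the form fields from the curl commands; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    if language is not None:
        parts.append(
            (f"--{boundary}\r\n"
             'Content-Disposition: form-data; name="language"\r\n\r\n'
             f"{language}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="audio_file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + data + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, port: int, language: Optional[str] = None,
               timeout: float = 600.0) -> str:
    """POST a media file to the local /asr endpoint and return the raw response text."""
    p = Path(path)
    body, content_type = build_multipart(p.name, p.read_bytes(), language)
    req = urllib.request.Request(
        f"http://localhost:{port}/asr",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

The generous default timeout reflects the transcription times listed under Model Selection; long files on larger models may need more.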

Step 3: Parse Result

Response format:

{
  "text": "Transcribed content...",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": "First sentence"},
    {"start": 2.5, "end": 5.0, "text": "Second sentence"}
  ],
  "language": "zh"
}
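As an illustration of parsing, the segments can be turned into SRT-style subtitles. This sketch assumes the response is JSON with exactly the shape shown above (`segments` as a list of objects with `start`, `end`, and `text`); the function names are hypothetical.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(response_text: str) -> str:
    """Convert the JSON response into numbered SRT cues."""
    result = json.loads(response_text)
    cues = []
    for index, seg in enumerate(result.get("segments", []), start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(cues)
```

For the sample response above, the first cue spans `00:00:00,000 --> 00:00:02,500` with the text `First sentence`.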

Model Selection

Model	Size	Time for 5-min video	Accuracy
tiny	75MB	~30s	Fair
base	142MB	~1min	Good
small	466MB	~3min	Better (recommended)
medium	1.5GB	~8min	Best

Change model via environment variable: -e ASR_MODEL=medium
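The table can be used to pick a model for a given time budget. This is a rough sketch that assumes transcription time scales linearly with media length, which the table does not state; the timings are copied from the table and the function names are hypothetical.

```python
from typing import Optional

# Approximate seconds needed for a 5-minute (300 s) video, from the table above.
SECONDS_PER_5_MIN = {"tiny": 30, "base": 60, "small": 180, "medium": 480}
# Models ordered from most to least accurate, following the table's Accuracy column.
MODELS_BY_ACCURACY = ["medium", "small", "base", "tiny"]


def estimate_seconds(model: str, media_seconds: float) -> float:
    """Linearly scale the table's 5-minute timing (an assumption)."""
    return SECONDS_PER_5_MIN[model] * media_seconds / 300


def pick_model(media_seconds: float, budget_seconds: float) -> Optional[str]:
    """Return the most accurate model whose estimate fits the budget, or None."""
    for model in MODELS_BY_ACCURACY:
        if estimate_seconds(model, media_seconds) <= budget_seconds:
            return model
    return None
```

Actual timings depend on hardware, so the estimate is only a starting point for choosing `ASR_MODEL`.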

Supported Formats

Video: mp4, mkv, avi, mov, flv, wmv, webm, m4v

Audio: wav, m4a, mp3, aac, ogg, flac, wma, opus
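A script can use these lists to decide whether a file needs the Step 1 extraction before upload. This is a minimal sketch; the extension sets are copied from the lists above and `media_kind` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional

VIDEO_EXTS = {"mp4", "mkv", "avi", "mov", "flv", "wmv", "webm", "m4v"}
AUDIO_EXTS = {"wav", "m4a", "mp3", "aac", "ogg", "flac", "wma", "opus"}


def media_kind(path: str) -> Optional[str]:
    """Return "video", "audio", or None for an unsupported extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in VIDEO_EXTS:
        return "video"
    if ext in AUDIO_EXTS:
        return "audio"
    return None
```

The check looks only at the file extension, not at the container or codec actually inside the file.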

Troubleshooting

Issue	Solution
Docker not available	Install Docker Desktop
Container start fails	Check port availability
Transcription timeout	Use smaller model or split audio
ffmpeg not found	`winget install Gyan.FFmpeg` (Windows)
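For the "split audio" remedy, ffmpeg's segment muxer can cut a WAV file into fixed-length chunks. This is a sketch, assuming the input is the WAV from Step 1; the 300-second default and the chunk naming pattern are illustrative choices, and `build_split_command` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def build_split_command(wav: str, chunk_seconds: int = 300) -> List[str]:
    """Return an ffmpeg argv that splits a WAV file into numbered chunks."""
    # Output pattern: audio.wav -> audio_000.wav, audio_001.wav, ...
    pattern = str(Path(wav).with_name(Path(wav).stem + "_%03d.wav"))
    return [
        "ffmpeg",
        "-i", wav,
        "-f", "segment",                      # segment muxer
        "-segment_time", str(chunk_seconds),  # target chunk length in seconds
        "-c", "copy",                         # copy the PCM stream, no re-encode
        pattern,
    ]
```

Each chunk can then be posted to `/asr` separately. Segment timestamps in each response restart at zero, so offsets need to be added back when merging results.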

Related Modules

douyin-fetcher - Video download
douyin-analyzer - Content analysis
douyin-orchestrator - Workflow coordination

Security recommendations

This skill appears to do what it says (local transcription) but has several practical and security gaps you should address before running it:

- Metadata mismatch: the SKILL.md requires Docker and ffmpeg but the skill metadata lists none. Assume you need Docker and ffmpeg.
- Untrusted image: the instructions pull onerahmet/openai-whisper-asr-webservice:latest from Docker Hub. Prefer a well-known repo or a pinned digest (sha256) and inspect the Dockerfile/source before running. Avoid :latest.
- Run safely: execute the container in an isolated VM or sandbox, not on a critical host. Use --rm, drop capabilities, run as non-root user, bind-mount only the directory with audio (read-only if possible), and restrict network access if you don't want the container to contact the internet.
- Scan the image: use tools like trivy/snyk/clair to scan the image for vulnerabilities and malware signatures before running.
- Port and config: the SKILL.md uses a PORT placeholder; confirm what port to expose and avoid binding to privileged or widely routable host ports.
- Ask the author for provenance: request a homepage or source repository, a specific release/tag or digest, and minimal runtime flags recommended for secure execution. If you cannot verify the image or source, run a locally built, audited ASR container instead.

Given these issues (metadata omissions and an unpinned third‑party Docker image), treat the skill as suspicious until you can verify the container source and run it in a hardened environment.
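Some of these recommendations can be expressed as a stricter `docker run` invocation. The sketch below builds such a command line; it is an illustration, not a vetted configuration. The digest must come from an image you have verified yourself, the port numbers are parameters standing in for the PORT placeholder, and whether this image starts correctly with all capabilities dropped has not been checked here. `hardened_run_argv` is a hypothetical helper.

```python
import re
from typing import List

IMAGE = "onerahmet/openai-whisper-asr-webservice"


def hardened_run_argv(digest: str, host_port: int, container_port: int,
                      model: str = "small") -> List[str]:
    """Build a docker run argv pinned to a digest and bound to loopback only."""
    if not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError("digest must be 64 lowercase hex characters")
    return [
        "docker", "run", "-d", "--rm",
        "--name", "whisper-asr",
        "-p", f"127.0.0.1:{host_port}:{container_port}",  # not reachable from other hosts
        "--cap-drop", "ALL",                              # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        "-e", f"ASR_MODEL={model}",
        "-e", "ASR_ENGINE=faster_whisper",
        f"{IMAGE}@sha256:{digest}",                       # pinned digest instead of :latest
    ]
```

Rejecting anything that is not a full sha256 digest keeps a tag such as `latest` from slipping through.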

Functional analysis

Type: OpenClaw Skill
Name: douyin-transcriber
Version: 1.0.5

The skill bundle provides standard instructions for transcribing audio and video files using ffmpeg and a local Docker-based Whisper ASR service (onerahmet/openai-whisper-asr-webservice). All commands, including the ffmpeg parameters and curl requests to localhost, are consistent with the stated purpose of media transcription and do not exhibit any signs of malicious intent, data exfiltration, or prompt injection.

Capability assessment

ℹ Purpose & Capability

Name/description (Douyin Transcriber using Docker Whisper ASR) matches the SKILL.md workflow (ffmpeg -> Docker container ASR -> curl to localhost). However the registry metadata claims no required binaries or env vars while the instructions clearly require Docker and ffmpeg and recommend container env vars (ASR_MODEL/ASR_ENGINE). This metadata/instruction mismatch is inconsistent.

⚠ Instruction Scope

Instructions ask operators to run 'docker run' to pull and run an HTTP ASR service and to run ffmpeg locally and curl audio to localhost. They do not request unrelated system files or credentials, but they do (a) use an unspecified placeholder PORT, (b) assume ability to run Docker (which implies daemon/root access), and (c) direct pulling/execution of a remote image. The steps grant the container network/host-execution potential that isn't described in metadata.

⚠ Install Mechanism

No formal install spec (instruction-only), but the SKILL.md instructs pulling a Docker image 'onerahmet/openai-whisper-asr-webservice:latest' from Docker Hub. Pulling and running an unpinned, third‑party image (latest tag, unknown maintainer) is higher risk because images can contain arbitrary code. No guidance to pin a digest, verify source, or run the container with reduced privileges.

ℹ Credentials

The skill does not request credentials or secret environment variables. It recommends container env vars for model selection (ASR_MODEL, ASR_ENGINE) which are non-sensitive. However, running Docker implies access to the Docker daemon (privileged), which can be used to access the host; that privilege is disproportionate relative to a metadata claim of 'no required binaries'.

✓ Persistence & Privilege

The skill is not marked always:true and has no install that forces persistent presence. It instructs running a container that exposes an HTTP port (user-controlled). The skill itself does not request elevated platform privileges beyond normal Docker usage, but the act of running arbitrary containers increases blast radius if the image is malicious.

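How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install douyin-transcriber
Once installed, invoke the Skill by name or trigger it with /douyin-transcriber.
Provide the inputs described in the Skill's parameter documentation to receive structured output.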

Version history

v1.0.5

- Added clear usage instructions and workflow for audio/video transcription using Docker Whisper ASR.
- Detailed prerequisite tools and installation steps.
- Included command examples for extracting audio, transcribing, specifying language, and parsing results.
- Provided table for model selection, supported formats, and troubleshooting common issues.
- Listed related modules for extended Douyin/TikTok workflows.

v1.0.4

- Added detailed usage instructions and quick start guide for transcribing media files with Docker Whisper ASR.
- Included prerequisites, installation steps, and workflow for extracting and transcribing audio/video.
- Provided model selection table and format support list.
- Added troubleshooting section for common issues.
- Linked related modules for an integrated workflow.

v1.0.3

- Improved documentation for setup and usage, including quick start instructions and example commands.
- Added details on model selection, supported formats, and configuration options.
- Clarified integration with Docker Whisper ASR and automatic audio extraction using ffmpeg.
- Listed related modules and expanded guidance for transcription workflows.

v1.0.2

- Updated documentation to improve clarity and provide a concise English overview.
- Added quick start guide and streamlined usage instructions.
- Listed supported audio and video formats explicitly.
- Provided model selection table and performance estimates.
- Summarized prerequisite tools and deployment steps.
- Removed redundant/obsolete information and improved configuration examples.

v1.0.1

- Added comprehensive documentation for skill features, usage, and configuration.
- Clarified support for both local Docker Whisper ASR and optional cloud APIs.
- Provided detailed setup and deployment instructions, including Docker commands.
- Included example commands for curl and Python usage.
- Listed dependencies, related modules, and expected transcription times.
- License information (MIT-0) clearly stated.

v2.0.0

- Major upgrade: modular architecture, browser DOM extraction, DASH support, Docker Whisper, structured output format, extended troubleshooting guide.

v1.0.0

Initial release - Audio transcription module with Whisper support

Metadata

Slug douyin-transcriber

Version 1.0.5

License MIT-0

Total installs 0

Current installs 0

Version count 7

FAQ