
抖音视频提取文案 (Douyin Video Transcript Extraction)

Name: 抖音视频提取文案
Author: liuzheng60

by liuzheng60 · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install douyin-transcribe-lz

Description

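Extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use this skill when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. It gets past the login wall by reading the video element through Playwright.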

README (SKILL.md)

Douyin Video Transcription Skill

Extracts speech from a Douyin short link and converts it to Chinese text with Whisper.
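
Users often paste a whole share message rather than a bare link. The following is a minimal sketch of pulling the v.douyin.com short link out of such text before the workflow starts. The helper name `extract_short_link` and the character set allowed in the link token are assumptions for illustration, not part of the skill.

```python
import re

# Assumption: the token after v.douyin.com/ uses letters, digits, "_" and "-".
SHORT_LINK_RE = re.compile(r"https?://v\.douyin\.com/[A-Za-z0-9_\-]+/?")

def extract_short_link(share_text):
    """Return the first v.douyin.com short link found in share_text, or None."""
    match = SHORT_LINK_RE.search(share_text)
    return match.group(0) if match else None
```

If the helper returns None, it is probably safer to ask the user for the link again than to guess.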

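Workflow

Step 1: Capture the video URL with Playwright

Preferred method: read the URL from the video element's src (the most reliable approach, and it gets past the login wall):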

import asyncio
from playwright.async_api import async_playwright

async def get_douyin_video_url(short_url):
    """Open the short link in headless Chromium and return the video URL, or None."""
    video_src = None
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            context = await browser.new_context(
                user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
            )
            page = await context.new_page()
            await page.goto(short_url, wait_until="domcontentloaded", timeout=30000)
            await asyncio.sleep(8)  # wait for JS to render and populate the video src

            video_src = await page.evaluate("""
                () => {
                    const videos = document.querySelectorAll('video');
                    for (const v of videos) {
                        // Prefer a direct src that looks like a Douyin MP4 URL
                        if (v.src && v.src.includes('douyin') && v.src.includes('.mp4')) return v.src;
                        // Otherwise fall back to the first <source> child with a src
                        const sources = v.querySelectorAll('source');
                        for (const s of sources) { if (s.src) return s.src; }
                    }
                    return null;
                }
            """)
        finally:
            await browser.close()  # release the browser even if navigation fails
    return video_src

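Why this method instead of network interception: video.src is already populated even when the login modal covers the video element, whereas network interception fails behind the login wall.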
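
The fixed 8-second sleep in step 1 waits longer than needed on fast loads and may be too short on slow ones. As a rough sketch, a small polling helper could retry the DOM query until it yields a URL. The name `poll_until` and its default timings are assumptions for illustration, not part of the skill.

```python
import asyncio
import time

async def poll_until(probe, timeout=15.0, interval=0.5):
    """Await probe() repeatedly until it returns a truthy value or timeout passes.

    probe is a zero-argument callable that returns an awaitable.
    Returns the first truthy result, or None if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = await probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None
        await asyncio.sleep(interval)
```

Inside get_douyin_video_url, the sleep and the single evaluate call could then become something like `video_src = await poll_until(lambda: page.evaluate(js))`, where `js` holds the same extraction snippet.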

Step 2: Download the video

import requests

def download_douyin_video(url, output_path, referer):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Referer": referer,
    }
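    # stream=True keeps the whole file from being loaded into memory at once
    with requests.get(url, headers=headers, stream=True, timeout=60) as resp:
        resp.raise_for_status()  # fail on 403/404 instead of saving an error page
        with open(output_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1024 * 1024):
                if chunk:
                    f.write(chunk)
    return output_path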

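Set referer to "https://www.douyin.com/" or to the resolved video page URL.
Douyin's CDN serves video from hosts such as v26-web.douyinvod.com. These signed URLs stay valid only for a matter of hours, so download soon after capture.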
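
Before downloading, it can help to confirm that the captured URL points at the CDN rather than at a page or a placeholder. This sketch assumes the CDN hosts all sit under douyinvod.com, which is inferred only from the single hostname mentioned above; other hosts may exist. The helper name `looks_like_douyin_cdn` is hypothetical.

```python
from urllib.parse import urlparse

def looks_like_douyin_cdn(url):
    """Return True if url's host is douyinvod.com or one of its subdomains."""
    # Assumption: video is served from *.douyinvod.com (e.g. v26-web.douyinvod.com).
    host = (urlparse(url).hostname or "").lower()
    return host == "douyinvod.com" or host.endswith(".douyinvod.com")
```

A False result is a hint to re-run the capture step, not proof that the URL is unusable.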

Step 3: Transcribe with Whisper

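import os
import shutil

import imageio_ffmpeg
import whisper

video_path = "douyin_video.mp4"  # the file saved in step 2

# Whisper runs a binary named "ffmpeg", but imageio-ffmpeg ships it under a
# versioned filename, so copy it to the expected name and put its folder on PATH.
ffmpeg_exe = imageio_ffmpeg.get_ffmpeg_exe()
ffmpeg_dir = os.path.dirname(ffmpeg_exe)
if ffmpeg_dir:  # skip if the returned path is a bare command with no directory
    alias = os.path.join(ffmpeg_dir, "ffmpeg.exe" if os.name == "nt" else "ffmpeg")
    if not os.path.exists(alias):
        shutil.copy(ffmpeg_exe, alias)  # also copies the executable permission bit
    os.environ["PATH"] = ffmpeg_dir + os.pathsep + os.environ.get("PATH", "")

model = whisper.load_model("medium")  # ~1.4 GB, cached after the first run
result = model.transcribe(video_path, language="zh", verbose=True, task="transcribe")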

Save the output:

transcript.txt: the full text plus timestamped segments (for the user to read)
transcript.json: the raw Whisper output (for programmatic use)
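
The two output files could be written roughly as follows. This is a sketch that assumes `result` is the dictionary returned by model.transcribe, with a "text" string and a "segments" list whose entries carry "start", "end", and "text". The helper names and the timestamp layout are illustrative choices, not the bundled script's actual format.

```python
import json

def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS (fractions are truncated)."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def save_transcript(result, txt_path="transcript.txt", json_path="transcript.json"):
    """Write a readable transcript and the raw Whisper result to disk."""
    lines = [result.get("text", "").strip(), ""]
    for seg in result.get("segments", []):
        span = f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}]"
        lines.append(f"{span} {seg['text'].strip()}")
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    # ensure_ascii=False keeps Chinese characters readable in the JSON file
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(result, f, ensure_ascii=False, indent=2)
```

Writing both files as UTF-8 avoids garbled Chinese text on systems whose default encoding is something else.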

Required Dependencies

Install once:

pip install playwright openai-whisper "imageio[ffmpeg]" requests
playwright install chromium
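
A quick check that the packages are importable can save a failed run later. The sketch below assumes the import names playwright, whisper, imageio_ffmpeg, and requests for the four packages above. The helper `missing_modules` is hypothetical, and it does not verify that the Chromium browser itself was downloaded.

```python
import importlib.util

# Import names (not pip names) for the packages installed above.
REQUIRED_MODULES = ["playwright", "whisper", "imageio_ffmpeg", "requests"]

def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```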

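Key Points

Login-wall workaround: JS populates video.src before the login modal appears. Network interception misses it; a direct DOM query captures it.
CDN URL lifetime: Douyin's signed CDN URLs (v26-web.douyinvod.com/...?a=...) are valid for a limited time, up to roughly 24 hours. Download right after capture.
Model choice: use medium for Chinese; base is faster but less accurate. The first run downloads the model (~1.4 GB), which is then cached.
FFmpeg requirement: Whisper needs ffmpeg. imageio[ffmpeg] provides a binary automatically, and imageio_ffmpeg.get_ffmpeg_exe() returns its path.
Fallback: if Playwright fails, use the bundled fetch_douyin_video.py script for alternative URL extraction.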

Bundled Resources

scripts/fetch_douyin_video.py: complete end-to-end script (capture → download → transcribe)
references/whisper_usage.md: Whisper API options and Chinese-language tips

Usage Guidance

Before installing, make sure you are comfortable running local Python/browser tooling and downloading unpinned dependencies and Whisper models. Use the skill only for Douyin videos you are allowed to access and transcribe, and run it in a dedicated folder because it writes fixed output filenames such as douyin_video.mp4, transcript.txt, and transcript.json.

Capability Analysis

Type: OpenClaw Skill
Name: douyin-transcribe-lz
Version: 1.0.0

The skill bundle provides a legitimate tool for downloading and transcribing Douyin videos using Playwright and OpenAI's Whisper model. The code in `scripts/fetch_douyin_video.py` and the instructions in `SKILL.md` are well-documented and align with the stated purpose. There are no signs of data exfiltration, malicious execution, or prompt injection; the inclusion of `imageio_ffmpeg` and the modification of the system PATH are standard practices for ensuring the availability of the FFmpeg binary required by Whisper.

Capability Assessment

ℹ Purpose & Capability

The artifacts are coherent with the stated purpose: they capture a Douyin video URL, download the MP4, and run Whisper transcription. Notable evidence: SKILL.md says it extracts from the video element and is '可绕过登录墙' ('can bypass the login wall').

✓ Instruction Scope

The skill instructions are scoped to cases where the user provides a Douyin short link and asks to extract/convert/transcribe speech; the bundled script requires a CLI URL argument and does not show autonomous use beyond that workflow.

ℹ Install Mechanism

There is no formal install spec, but SKILL.md instructs users to install unpinned Python packages and Playwright Chromium via 'pip install playwright openai-whisper imageio[ffmpeg] requests' and 'playwright install chromium'.

ℹ Credentials

The local browser launch, network download, FFmpeg setup, and Whisper model loading are proportionate to video transcription, but they do involve local execution, network access, and large dependency/model downloads.

✓ Persistence & Privilege

No credentials, browser profiles, cookies, background agents, or privilege escalation are requested. The script writes expected local output files such as douyin_video.mp4, transcript.txt, and transcript.json.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install douyin-transcribe-lz
After installation, invoke the skill by name or use /douyin-transcribe-lz
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

- Initial release of douyin-transcribe: extract audio from Douyin short links and transcribe to Chinese text using Whisper.
- Uses Playwright to extract video URLs directly from <video> elements, bypassing login walls.
- Downloads signed Douyin CDN video URLs, handling Referer headers and short-lived URLs.
- Integrates Whisper for accurate Mandarin speech-to-text transcription, with medium model default for quality.
- Automatic FFmpeg setup via imageio for audio extraction.
- Includes fallback script and documentation for full workflow guidance.

Metadata

Slug douyin-transcribe-lz

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is 抖音视频提取文案?

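It extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use it when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. It gets past the login wall by reading the video element through Playwright. It is an AI Agent Skill for Claude Code / OpenClaw, with 77 downloads so far.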

How do I install 抖音视频提取文案?

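Run "/install douyin-transcribe-lz" in the OpenClaw or Claude Code chat to install the skill in one step. The Python dependencies listed in SKILL.md (the pip install command and playwright install chromium) still need to be installed once before first use.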

Is 抖音视频提取文案 free?

Yes, 抖音视频提取文案 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 抖音视频提取文案 support?

抖音视频提取文案 is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 抖音视频提取文案?

It is built and maintained by liuzheng60 (@liuzheng60); the current version is v1.0.0.
