
Douyin Video Transcript Extraction (抖音视频提取文案)

Name: 抖音视频提取文案
Author: liuzheng60

By liuzheng60 · GitHub · v1.0.0 · MIT-0

Cross-platform ✓ Security check passed

Install in OpenClaw

/install douyin-transcribe-lz

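Description

Extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use this skill when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. It gets past the login wall by reading the video element with Playwright.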
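To illustrate the trigger condition described above, here is a minimal sketch of spotting a Douyin short link in a user message. The helper name find_douyin_short_link and the exact pattern are assumptions for illustration, not part of the skill; real short links may contain characters this pattern does not cover.

```python
import re
from typing import Optional

# Assumed shape of a short link: optional scheme, v.douyin.com, one path segment.
_SHORT_LINK = re.compile(r"(?:https?://)?v\.douyin\.com/[A-Za-z0-9_-]+/?")


def find_douyin_short_link(text: str) -> Optional[str]:
    """Return the first Douyin short link found in text, or None."""
    match = _SHORT_LINK.search(text)
    if match is None:
        return None
    link = match.group(0)
    # Share text often omits the scheme; add https:// so the link can be opened.
    if not link.startswith("http"):
        link = "https://" + link
    return link
```

The result can be passed straight to the URL-capture step; a None result suggests the message does not contain a link this skill should handle.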

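Usage Instructions (SKILL.md)

Douyin Video Transcription Skill

Extracts speech from a Douyin short link and converts it to Chinese text with Whisper.

Workflow

Step 1: Capture the video URL with Playwright

Preferred method: read the src of the video element (the most reliable approach; it works even behind the login wall):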

import asyncio
from playwright.async_api import async_playwright

async def get_douyin_video_url(short_url):
    """Return the video URL found in the page's <video> element, or None."""
    video_src = None
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            context = await browser.new_context(
                user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
            )
            page = await context.new_page()
            await page.goto(short_url, wait_until="domcontentloaded", timeout=30000)
            await asyncio.sleep(8)  # wait for JS to render and populate video src

            # Check each <video>: first its own src, then any nested <source> tags
            video_src = await page.evaluate("""
                () => {
                    const videos = document.querySelectorAll('video');
                    for (const v of videos) {
                        if (v.src && v.src.includes('douyin') && v.src.includes('.mp4')) return v.src;
                        const sources = v.querySelectorAll('source');
                        for (const s of sources) { if (s.src) return s.src; }
                    }
                    return null;
                }
            """)
        finally:
            await browser.close()  # always release the browser, even on errors
    return video_src

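Why this method instead of network interception: video.src is already populated even when the login modal covers the video element, whereas network interception fails behind the login wall.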
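The capture code above waits a fixed 8 seconds before querying the DOM. As an optional variation, the same DOM query could be repeated until it returns a value. The sketch below is a generic polling helper; poll_until, its parameters, and its defaults are hypothetical names introduced here, not part of the skill.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def poll_until(
    probe: Callable[[], Awaitable[Optional[str]]],
    timeout: float = 15.0,
    interval: float = 0.5,
) -> Optional[str]:
    """Call probe repeatedly until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = await probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None  # give up; the caller decides what to do next
        await asyncio.sleep(interval)
```

Inside get_douyin_video_url, the probe could be a lambda that calls page.evaluate with the same JavaScript snippet, so a fast page returns early and a slow one gets more time than the fixed sleep allows.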

Step 2: Download the video

import requests

def download_douyin_video(url, output_path, referer):
    """Stream the video at url to output_path and return that path."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Referer": referer,
    }
    resp = requests.get(url, headers=headers, stream=True, timeout=60)
    resp.raise_for_status()  # fail early instead of saving an error page as the video
    with open(output_path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1024 * 1024):  # 1 MiB chunks
            if chunk:
                f.write(chunk)
    return output_path

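referer: use "https://www.douyin.com/" or the resolved video page URL.
The Douyin CDN serves videos from v26-web.douyinvod.com; these signed URLs stay valid only for a matter of hours, so download promptly.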
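Before downloading, it may help to sanity-check the captured URL and build the request headers in one place. The sketch below assumes that Douyin CDN hosts sit under douyinvod.com, generalizing from the single host named above; that generalization and the helper names are assumptions, not confirmed behavior.

```python
from urllib.parse import urlparse

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def looks_like_douyin_cdn(url: str) -> bool:
    """True if url is http(s) and its host is douyinvod.com or a subdomain of it."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Assumption: CDN hosts share the douyinvod.com suffix seen in the source.
    on_cdn = host == "douyinvod.com" or host.endswith(".douyinvod.com")
    return parsed.scheme in ("http", "https") and on_cdn


def build_download_headers(referer: str = "https://www.douyin.com/") -> dict:
    """Headers for the download request: browser User-Agent plus Referer."""
    return {"User-Agent": USER_AGENT, "Referer": referer}
```

A False result is a prompt to inspect the URL, not proof that it is wrong, since the video may be served from a host this check does not know about.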

Step 3: Transcribe with Whisper

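import os
import shutil

import imageio_ffmpeg
import whisper

# Make sure an executable named "ffmpeg" is on PATH (Whisper invokes it by name)
ffmpeg_exe = imageio_ffmpeg.get_ffmpeg_exe()
ffmpeg_dir = os.path.dirname(ffmpeg_exe)
os.environ["PATH"] = ffmpeg_dir + os.pathsep + os.environ.get("PATH", "")
# The bundled binary may have a platform-specific file name, so copy it to the plain one
ffmpeg_name = "ffmpeg.exe" if os.name == "nt" else "ffmpeg"
ffmpeg_copy = os.path.join(ffmpeg_dir, ffmpeg_name)
if not os.path.exists(ffmpeg_copy):
    shutil.copy(ffmpeg_exe, ffmpeg_copy)

video_path = "douyin_video.mp4"  # the file downloaded in step 2
model = whisper.load_model("medium")  # ~1.4 GB, cached after the first run
result = model.transcribe(video_path, language="zh", verbose=True, task="transcribe")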

Save the output:

transcript.txt: the full text plus timestamped segments (for the user to read)
transcript.json: the raw Whisper output (for programmatic use)
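The two output files could be written along these lines. This is a minimal sketch that relies on the "text" and "segments" keys of the dictionary returned by Whisper's transcribe; the function names and the mm:ss timestamp layout are assumptions, and the bundled script may format its output differently.

```python
import json
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as mm:ss."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def save_transcript(result: dict, out_dir: str = ".") -> None:
    """Write transcript.txt (readable) and transcript.json (raw) into out_dir."""
    out = Path(out_dir)
    # Full text first, then one line per timestamped segment.
    lines = [result.get("text", "").strip(), ""]
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    (out / "transcript.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    # ensure_ascii=False keeps Chinese characters readable in the JSON file.
    (out / "transcript.json").write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Calling save_transcript(result) after model.transcribe writes both files to the current directory.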

Required Dependencies

Install once:

pip install playwright openai-whisper imageio[ffmpeg] requests
playwright install chromium
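A quick way to confirm the Python packages are importable before running the workflow is sketched below. The helper name missing_modules is hypothetical; note that import names differ from pip names (openai-whisper is imported as whisper, and imageio[ffmpeg] provides imageio_ffmpeg). The check does not verify that the Chromium browser itself was installed.

```python
import importlib.util

# Import names for the packages installed by the pip command above.
REQUIRED_MODULES = ("playwright", "whisper", "imageio_ffmpeg", "requests")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from missing_modules() means all four packages can be imported; any names it returns point at what still needs installing.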

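Key Points

Login wall workaround: JS populates video.src before the login modal appears. Network interception misses it; a direct DOM query captures it.
CDN URL lifetime: Douyin's signed CDN URLs (v26-web.douyinvod.com/...?a=...) stay valid for a limited time, about 24 hours at most. Download immediately after capture.
Model choice: use medium for Chinese; base is faster but less accurate. The first run downloads the model (~1.4 GB); later runs use the cached copy.
FFmpeg requirement: Whisper needs ffmpeg; imageio[ffmpeg] supplies it automatically. Get its path with imageio_ffmpeg.get_ffmpeg_exe().
Fallback: if the Playwright capture fails, use the bundled fetch_douyin_video.py script for alternative URL extraction.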
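For the fallback point above, the bundled script could be launched as a subprocess. The sketch below only builds the command line; passing the URL as a single positional argument is an assumption, since the listing states only that the script takes a URL on the command line. The helper name and the skill_dir parameter are hypothetical.

```python
import sys
from pathlib import Path


def build_fallback_command(short_url: str, skill_dir: str = ".") -> list:
    """Build the argv list for running the bundled script with the current Python."""
    script = Path(skill_dir) / "scripts" / "fetch_douyin_video.py"
    # Assumption: the script accepts the short URL as its only positional argument.
    return [sys.executable, str(script), short_url]
```

The returned list can be handed to subprocess.run(..., check=True) so that a non-zero exit from the script raises an error instead of failing silently.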

Bundled Resources

scripts/fetch_douyin_video.py: complete end-to-end script (capture → download → transcribe)
references/whisper_usage.md: Whisper API options and Chinese-language tips

Security Usage Advice

Before installing, make sure you are comfortable running local Python/browser tooling and downloading unpinned dependencies and Whisper models. Use the skill only for Douyin videos you are allowed to access and transcribe, and run it in a dedicated folder because it writes fixed output filenames such as douyin_video.mp4, transcript.txt, and transcript.json.
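One way to follow the dedicated-folder advice is to create a fresh directory per run and derive the fixed output filenames from it. This is a minimal sketch; make_run_dir, output_paths, and the "runs" base folder are hypothetical, and whether the bundled script accepts custom output paths is not stated in the listing.

```python
import os
import tempfile

OUTPUT_NAMES = ("douyin_video.mp4", "transcript.txt", "transcript.json")


def make_run_dir(base_dir: str = "runs") -> str:
    """Create and return a new, uniquely named directory under base_dir."""
    os.makedirs(base_dir, exist_ok=True)
    return tempfile.mkdtemp(prefix="douyin_", dir=base_dir)


def output_paths(run_dir: str) -> dict:
    """Map each fixed output filename to its full path inside run_dir."""
    return {name: os.path.join(run_dir, name) for name in OUTPUT_NAMES}
```

The path for douyin_video.mp4 can be passed as output_path to download_douyin_video from the SKILL.md instructions, so each run keeps its files separate from earlier ones.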

Functional Analysis

Type: OpenClaw Skill
Name: douyin-transcribe-lz
Version: 1.0.0

The skill bundle provides a legitimate tool for downloading and transcribing Douyin videos using Playwright and OpenAI's Whisper model. The code in `scripts/fetch_douyin_video.py` and the instructions in `SKILL.md` are well-documented and align with the stated purpose. There are no signs of data exfiltration, malicious execution, or prompt injection; the inclusion of `imageio_ffmpeg` and the modification of the PATH environment variable are standard practices for ensuring the availability of the FFmpeg binary required by Whisper.

Capability Assessment

ℹ Purpose & Capability

The artifacts are coherent with the stated purpose: they capture a Douyin video URL, download the MP4, and run Whisper transcription. Notable evidence: SKILL.md says it extracts from the video element and is '可绕过登录墙' ('can bypass the login wall').

✓ Instruction Scope

The skill instructions are scoped to cases where the user provides a Douyin short link and asks to extract/convert/transcribe speech; the bundled script requires a CLI URL argument and does not show autonomous use beyond that workflow.

ℹ Install Mechanism

There is no formal install spec, but SKILL.md instructs users to install unpinned Python packages and Playwright Chromium via 'pip install playwright openai-whisper imageio[ffmpeg] requests' and 'playwright install chromium'.

ℹ Credentials

The local browser launch, network download, FFmpeg setup, and Whisper model loading are proportionate to video transcription, but they do involve local execution, network access, and large dependency/model downloads.

✓ Persistence & Privilege

No credentials, browser profiles, cookies, background agents, or privilege escalation are requested. The script writes expected local output files such as douyin_video.mp4, transcript.txt, and transcript.json.

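How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install douyin-transcribe-lz
Once installed, invoke the Skill by name or trigger it with /douyin-transcribe-lz
Provide the inputs described in the Skill's parameter notes to get structured output.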

Version History

v1.0.0

- Initial release of douyin-transcribe: extract audio from Douyin short links and transcribe to Chinese text using Whisper.
- Uses Playwright to extract video URLs directly from <video> elements, bypassing login walls.
- Downloads signed Douyin CDN video URLs, handling Referer headers and short-lived URLs.
- Integrates Whisper for accurate Mandarin speech-to-text transcription, with medium model default for quality.
- Automatic FFmpeg setup via imageio for audio extraction.
- Includes fallback script and documentation for full workflow guidance.

Metadata

Slug douyin-transcribe-lz

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Version count 1

FAQ