
Douyin Video Transcript Extraction

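Author: liuzheng60 · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 77 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install douyin-transcribe-lz
Description
Extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use this skill when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. It works around the login wall by reading the video element through Playwright.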
Usage (SKILL.md)

Douyin Video Transcription Skill

Extract the speech from a Douyin short link and convert it to Chinese text with Whisper.
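
Share text copied from the Douyin app usually wraps the short link in other words, so it can help to pull the link out before starting the workflow. The sketch below is illustrative only: the helper name find_short_link and the token pattern after v.douyin.com/ are assumptions, not a documented link format.

```python
import re

# Assumed shape of a Douyin short link: https://v.douyin.com/<token>/
# The token alphabet is a guess for illustration, not a documented format.
SHORT_LINK_RE = re.compile(r"https?://v\.douyin\.com/[A-Za-z0-9_-]+/?")


def find_short_link(text):
    """Return the first v.douyin.com short link found in text, or None."""
    match = SHORT_LINK_RE.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    shared = "Take a look https://v.douyin.com/abc123/ copy this and open Douyin"
    print(find_short_link(shared))
```

Returning None for text without a link lets the caller ask the user for a valid short link instead of launching a browser on arbitrary input.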

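Workflow

Step 1: Capture the video URL with Playwright

Preferred method: read the src of the video element (the most reliable approach, and it works even behind the login wall):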

import asyncio
from playwright.async_api import async_playwright

async def get_douyin_video_url(short_url):
    """Open the short link in headless Chromium and return the video URL, or None."""
    video_src = None
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            context = await browser.new_context(
                user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
            )
            page = await context.new_page()
            await page.goto(short_url, wait_until="domcontentloaded", timeout=30000)
            await asyncio.sleep(8)  # wait for the page's JS to render and populate video.src

            # Check each <video> element's own src first, then its <source> children.
            video_src = await page.evaluate("""
                () => {
                    const videos = document.querySelectorAll('video');
                    for (const v of videos) {
                        if (v.src && v.src.includes('douyin') && v.src.includes('.mp4')) return v.src;
                        const sources = v.querySelectorAll('source');
                        for (const s of sources) { if (s.src) return s.src; }
                    }
                    return null;
                }
            """)
        finally:
            # Close the browser even if navigation or evaluation raises.
            await browser.close()
    return video_src

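Why this method rather than network interception: video.src is already populated even when the login modal covers the video element, whereas network interception fails behind the login wall.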
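
Because the URL is read from the DOM, a fixed sleep can in principle be replaced by polling until a video element actually exposes a URL. The following is a minimal sketch of that idea using Playwright's page.wait_for_function; the helper name wait_for_video_src, the 15-second default, and the http prefix filter are assumptions for illustration, not part of the skill.

```python
# JavaScript predicate: returns the first http(s) video URL on the page,
# or null while none is present yet (null keeps the polling going).
FIND_VIDEO_SRC_JS = """
() => {
    for (const v of document.querySelectorAll('video')) {
        const src = v.currentSrc || v.src;
        if (src && src.startsWith('http')) return src;
        for (const s of v.querySelectorAll('source')) {
            if (s.src && s.src.startsWith('http')) return s.src;
        }
    }
    return null;
}
"""


async def wait_for_video_src(page, timeout_ms=15000):
    """Poll the DOM until a <video> exposes an http(s) URL, then return it.

    `page` is expected to be a Playwright Page. Playwright raises its own
    TimeoutError if the predicate stays falsy for timeout_ms milliseconds.
    """
    handle = await page.wait_for_function(FIND_VIDEO_SRC_JS, timeout=timeout_ms)
    return await handle.json_value()
```

Inside a function like get_douyin_video_url, this would be called right after page.goto in place of the sleep. Whether it returns sooner than a fixed delay depends on how quickly the page fills in the element.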

Step 2: Download the video

import requests

def download_douyin_video(url, output_path, referer):
    """Stream the video at url to output_path and return the path."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Referer": referer,
    }
    with requests.get(url, headers=headers, stream=True, timeout=60) as resp:
        resp.raise_for_status()  # fail on 403/404 instead of saving an error page as video
        with open(output_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1024 * 1024):  # 1 MiB chunks
                if chunk:
                    f.write(chunk)
    return output_path
  • referer = "https://www.douyin.com/" or the resolved video page URL
  • The Douyin CDN uses v26-web.douyinvod.com; these URLs remain valid for only a limited number of hours
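
Since the CDN URL can expire, a download may end early or return nothing, leaving a truncated file behind. One way to guard against that, sketched here with the standard library only, is to write to a temporary file and rename it once data has arrived. The helper name save_chunks and the .part suffix are assumptions for illustration.

```python
import os


def save_chunks(chunks, output_path):
    """Write an iterable of byte chunks to output_path via a temporary file.

    Returns the number of bytes written. Raises ValueError if the stream
    was empty, which may mean the request was rejected or the URL expired.
    """
    tmp_path = output_path + ".part"
    written = 0
    try:
        with open(tmp_path, "wb") as f:
            for chunk in chunks:
                if chunk:
                    f.write(chunk)
                    written += len(chunk)
        if written == 0:
            raise ValueError("no data received for " + output_path)
        os.replace(tmp_path, output_path)  # atomic when both paths share a filesystem
    finally:
        # Clean up the partial file if the rename did not happen.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return written
```

With requests, the iterable would be resp.iter_content(chunk_size=1024 * 1024), so the final file only appears once the whole stream has been written.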

Step 3: Transcribe with Whisper

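import os
import shutil

import imageio_ffmpeg
import whisper

# Whisper runs an executable literally named "ffmpeg", so expose the binary
# bundled with imageio-ffmpeg under that name and put its directory on PATH.
ffmpeg_exe = imageio_ffmpeg.get_ffmpeg_exe()
ffmpeg_dir = os.path.dirname(ffmpeg_exe)
ffmpeg_alias = os.path.join(ffmpeg_dir, "ffmpeg.exe" if os.name == "nt" else "ffmpeg")
if not os.path.exists(ffmpeg_alias):
    shutil.copy(ffmpeg_exe, ffmpeg_alias)  # shutil.copy also keeps the executable bit
os.environ["PATH"] = ffmpeg_dir + os.pathsep + os.environ.get("PATH", "")

video_path = "douyin_video.mp4"  # the file downloaded in step 2
model = whisper.load_model("medium")  # ~1.4GB, cached after the first run
result = model.transcribe(video_path, language="zh", verbose=True, task="transcribe")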

Save the output:

  • transcript.txt: the full text plus timestamped segments (for the user to read)
  • transcript.json: the raw Whisper output (for programmatic use)
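
The two output files could be written roughly as follows. This is a sketch that assumes result is the dictionary returned by model.transcribe, with a "text" string and a "segments" list whose items carry "start", "end", and "text"; the helper names and the MM:SS.mmm timestamp layout are illustrative choices, not the bundled script's format.

```python
import json


def format_timestamp(seconds):
    """Format a time in seconds as MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    minutes, ms = divmod(total_ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def render_transcript(result):
    """Return the full text followed by one timestamped line per segment."""
    lines = [result["text"].strip(), ""]
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} -> {end}] {seg['text'].strip()}")
    return "\n".join(lines) + "\n"


def save_transcript(result, txt_path="transcript.txt", json_path="transcript.json"):
    """Write the readable transcript and the raw result side by side."""
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(render_transcript(result))
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese characters readable in the JSON file.
        json.dump(result, f, ensure_ascii=False, indent=2)
```

Opening both files with encoding="utf-8" avoids platform-default encodings that may not be able to represent Chinese text.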

Required dependencies

Install once:

pip install playwright openai-whisper imageio[ffmpeg] requests
playwright install chromium
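
Before running the workflow, it may be useful to confirm that the packages are importable in the current environment. The sketch below maps each pip name from the install command to the module name used in the code above; the helper name missing_packages is an assumption for illustration.

```python
import importlib.util

# pip package name -> module name imported by the workflow code
REQUIRED_MODULES = {
    "playwright": "playwright",
    "openai-whisper": "whisper",
    "imageio[ffmpeg]": "imageio_ffmpeg",
    "requests": "requests",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required Python packages are importable.")
```

This only checks the Python packages; the Chromium build installed by playwright install chromium is not covered by it.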

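Key points

  1. Login wall workaround: the page's JS populates video.src before the login modal appears. Network interception misses the URL, while a direct DOM query captures it.
  2. CDN URL lifetime: Douyin's signed CDN URLs (v26-web.douyinvod.com/...?a=...) are valid for about 24 hours. Download immediately after capture.
  3. Model choice: use medium for Chinese; base is faster but less accurate. The first run downloads the model (~1.4GB), which is cached afterwards.
  4. FFmpeg requirement: Whisper needs ffmpeg; imageio[ffmpeg] provides it automatically, and imageio_ffmpeg.get_ffmpeg_exe() returns its path.
  5. Fallback: if Playwright fails, use the bundled fetch_douyin_video.py script as an alternative way to extract the URL.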
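
For the FFmpeg requirement, a quick pre-flight check can tell whether an executable named ffmpeg is reachable before Whisper tries to run it. This is a minimal sketch using only the standard library; the helper name ffmpeg_available is an assumption for illustration.

```python
import shutil


def ffmpeg_available(path=None):
    """Return True if an executable named ffmpeg is found.

    `path` is a search path string in PATH format; None means use the
    current PATH environment variable.
    """
    return shutil.which("ffmpeg", path=path) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found on PATH")
    else:
        print("ffmpeg not found; add the imageio-ffmpeg binary's directory to PATH")
```

Running the check after PATH has been adjusted gives a clearer failure message than an error raised from inside the transcription call.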

Bundled resources

  • scripts/fetch_douyin_video.py: complete end-to-end script (capture → download → transcribe)
  • references/whisper_usage.md: Whisper API options and Chinese-language tips
Safe usage advice
Before installing, make sure you are comfortable running local Python/browser tooling and downloading unpinned dependencies and Whisper models. Use the skill only for Douyin videos you are allowed to access and transcribe, and run it in a dedicated folder because it writes fixed output filenames such as douyin_video.mp4, transcript.txt, and transcript.json.
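
Because the output filenames are fixed, one way to keep runs from overwriting each other is to give each video its own folder. The sketch below derives a folder name from a hash of the short link; the helper name output_dir_for and the base directory douyin_transcripts are hypothetical, not something the skill provides.

```python
import hashlib
import os


def output_dir_for(short_url, base_dir="douyin_transcripts"):
    """Create and return a per-video folder named after a hash of the link."""
    digest = hashlib.sha256(short_url.encode("utf-8")).hexdigest()[:12]
    path = os.path.join(base_dir, digest)
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    folder = output_dir_for("https://v.douyin.com/xxx/")
    print("Write douyin_video.mp4 and the transcripts into", folder)
```

The same link always maps to the same folder, so a repeated run replaces only that video's files.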
Functional analysis
Type: OpenClaw Skill
Name: douyin-transcribe-lz
Version: 1.0.0
The skill bundle provides a legitimate tool for downloading and transcribing Douyin videos using Playwright and OpenAI's Whisper model. The code in `scripts/fetch_douyin_video.py` and the instructions in `SKILL.md` are well-documented and align with the stated purpose. There are no signs of data exfiltration, malicious execution, or prompt injection; the inclusion of `imageio_ffmpeg` and the modification of the system PATH are standard practices for ensuring the availability of the FFmpeg binary required by Whisper.
Capability assessment
Purpose & Capability
The artifacts are coherent with the stated purpose: they capture a Douyin video URL, download the MP4, and run Whisper transcription. Notable evidence: SKILL.md says it extracts from the video element and is '可绕过登录墙' ('can bypass the login wall').
Instruction Scope
The skill instructions are scoped to cases where the user provides a Douyin short link and asks to extract/convert/transcribe speech; the bundled script requires a CLI URL argument and does not show autonomous use beyond that workflow.
Install Mechanism
There is no formal install spec, but SKILL.md instructs users to install unpinned Python packages and Playwright Chromium via 'pip install playwright openai-whisper imageio[ffmpeg] requests' and 'playwright install chromium'.
Credentials
The local browser launch, network download, FFmpeg setup, and Whisper model loading are proportionate to video transcription, but they do involve local execution, network access, and large dependency/model downloads.
Persistence & Privilege
No credentials, browser profiles, cookies, background agents, or privilege escalation are requested. The script writes expected local output files such as douyin_video.mp4, transcript.txt, and transcript.json.
How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install douyin-transcribe-lz
  3. Once installed, invoke the skill by name or trigger it with /douyin-transcribe-lz
  4. Provide the inputs described in the skill's parameter notes to get structured output
Version history
v1.0.0
- Initial release of douyin-transcribe: extract audio from Douyin short links and transcribe to Chinese text using Whisper.
- Uses Playwright to extract video URLs directly from <video> elements, bypassing login walls.
- Downloads signed Douyin CDN video URLs, handling Referer headers and short-lived URLs.
- Integrates Whisper for accurate Mandarin speech-to-text transcription, with medium model default for quality.
- Automatic FFmpeg setup via imageio for audio extraction.
- Includes fallback script and documentation for full workflow guidance.
Metadata
Slug: douyin-transcribe-lz
Version: 1.0.0
License: MIT-0
Cumulative installs: 0
Current installs: 0
Historical versions: 1
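FAQ

What is Douyin Video Transcript Extraction?

It extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use this skill when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. It works around the login wall by reading the video element through Playwright. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 77 times so far.

How do I install Douyin Video Transcript Extraction?

Run the command "/install douyin-transcribe-lz" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Douyin Video Transcript Extraction free?

Yes. Douyin Video Transcript Extraction is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Douyin Video Transcript Extraction support?

Douyin Video Transcript Extraction is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Douyin Video Transcript Extraction?

It is developed and maintained by liuzheng60 (@liuzheng60); the current version is v1.0.0.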
