terrycarter1985

抖音爆款爬虫 (Douyin Viral-Video Scraper)

by terrycarter1985 · GitHub ↗ · v1.1.0 · MIT-0
cross-platform ⚠ suspicious
40 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install douyin-scraper-2
Description
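Scrapes viral Douyin videos and their caption data. Supports natural-language search requests (for example, "search for seafood videos") and retrieves data through browser automation or scripts.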
README (SKILL.md)

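抖音爆款爬虫 (Douyin Viral-Video Scraper) Skill

Natural-Language Entry Point

This skill activates automatically when the user makes a request like any of the following:

  • "Search for seafood videos" (搜索一下海鲜视频)
  • "See what is on the Douyin hot list" (看看抖音热榜有什么)
  • "Find some video copy related to selling seafood" (找一些海鲜售卖相关的视频文案)
  • "Search Douyin for content about XX" (帮我搜抖音上关于XX的内容)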

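Workflow

Step 1: Parse the Intent

Extract the following from the user's natural-language request:

  • Action type: search (keyword search) | hot (hot list) | video (single video link)
  • Keyword: the search term taken from the request (e.g. "seafood videos" (海鲜视频) → keyword="海鲜")
  • Count: 10 by default; if the user specifies a number, use that instead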
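
The three fields above can be sketched in Python as follows. This is only an illustration, not the skill's actual parser: the names Intent, parse_intent, HOT_HINTS, URL_RE, and LIMIT_RE are hypothetical, the hint words and the count pattern are assumptions, and the keyword is assumed to be extracted by the agent and passed in.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical hints and patterns; the skill's real rules are not shown in this README.
HOT_HINTS = ("热榜", "hot list", "trending")
URL_RE = re.compile(r"https?://\S+")
LIMIT_RE = re.compile(r"(\d+)\s*(?:条|个|results?|videos?)")


@dataclass
class Intent:
    action: str             # "search" | "hot" | "video"
    keyword: Optional[str]  # search term, or the link when action == "video"
    limit: int = 10         # Step 1 default


def parse_intent(text: str, keyword: Optional[str] = None) -> Intent:
    """Classify a request; `keyword` is assumed to be supplied by the agent."""
    limit_match = LIMIT_RE.search(text)
    limit = int(limit_match.group(1)) if limit_match else 10

    url_match = URL_RE.search(text)
    if url_match:
        # A link in the request means the user wants a single video.
        return Intent("video", url_match.group(0), limit)
    if any(hint in text.lower() for hint in HOT_HINTS):
        return Intent("hot", None, limit)
    return Intent("search", keyword, limit)
```

Checking for a link first keeps a pasted video URL from being mistaken for a keyword search.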

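Step 2: Run the Search

Option A: Browser automation (recommended; requires the user to be logged in to Douyin)

Use the OpenClaw browser tool with profile="user", which uses the browser the user is already logged in with:

1. browser open → https://www.douyin.com/search/{keyword}?type=video
   (use profile="user" if a logged-in session is needed)
2. browser snapshot → capture the page structure
3. Extract the video card data from the snapshot
4. Organize the results and return them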
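
Because keywords are usually Chinese, the {keyword} placeholder in the search URL may need percent-encoding before the page is opened. A minimal sketch, assuming the browser tool does not already encode the URL itself; build_search_url and SEARCH_URL are hypothetical names:

```python
from urllib.parse import quote

# URL template taken from step 1 of the browser flow.
SEARCH_URL = "https://www.douyin.com/search/{keyword}?type=video"


def build_search_url(keyword: str) -> str:
    """Fill the template, percent-encoding the keyword as UTF-8."""
    cleaned = keyword.strip()
    if not cleaned:
        raise ValueError("keyword must not be empty")
    # safe="" also encodes "/" so a keyword cannot alter the URL path.
    return SEARCH_URL.format(keyword=quote(cleaned, safe=""))
```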

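Important:

  • Douyin detects automated access; without a login, a CAPTCHA is very likely
  • profile="user" reuses the user's existing login session, which greatly reduces the chance of triggering risk controls
  • If a CAPTCHA page appears, tell the user that manual verification is required, or switch to the script's mock mode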

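Option B: Command-line scripts (fallback)

# Python version, mock mode (does not launch a browser; returns sample data)
python3 scripts/scraper.py search --keyword "海鲜" --limit 10 --mock

# Python version, real scraping (requires Playwright and an available browser)
python3 scripts/scraper.py search --keyword "海鲜" --limit 10

# Node.js version
node scripts/douyin_scraper.js search "海鲜" 10

⚠️ When no browser is available or the request is blocked, the scripts fall back to mock data automatically and print a notice.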
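
The fallback behavior described above could look roughly like the sketch below. It is not the code in scripts/scraper.py: search_with_fallback and mock_results are hypothetical names, and the real scraper is stood in for by a `scrape` callable that the caller provides.

```python
from typing import Callable, Dict, List, Tuple

Video = Dict[str, object]


def mock_results(keyword: str, limit: int) -> List[Video]:
    """Build placeholder records (illustrative values only)."""
    return [
        {
            "title": f"{keyword} sample video {i}",
            "author": f"author{i}",
            "play_count": 10_000 * i,
            "like_count": 1_000 * i,
            "url": f"https://www.douyin.com/search/{keyword}",
            "tags": [keyword],
        }
        for i in range(1, limit + 1)
    ]


def search_with_fallback(
    keyword: str,
    limit: int,
    scrape: Callable[[str, int], List[Video]],
) -> Tuple[List[Video], bool]:
    """Return (results, is_mock); fall back to mock data if `scrape` fails."""
    try:
        results = scrape(keyword, limit)
        if results:
            return results[:limit], False
        reason = "no results extracted"
    except Exception as exc:  # e.g. browser missing, timeout, CAPTCHA page
        reason = f"{type(exc).__name__}: {exc}"
    print(f"[warn] real scrape failed ({reason}); returning mock data")
    return mock_results(keyword, limit), True
```

Returning the is_mock flag alongside the data lets the caller label sample results explicitly instead of presenting them as real.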

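Step 3: Return the Results

Format the results so they are easy to read:

🔍 Search results for "海鲜" (N total):

1. 🎬 Title | 👤 Author | ▶️ Play count | ❤️ Like count
   🔗 Link
   📝 Description/tags

2. ...

Mock data must be labeled explicitly:

⚠️ The following is sample data (no real results were retrieved; possible causes: CAPTCHA interception or no available browser)

1. 🎬 Seafood-related video 1 | 👤 Author 1 | ▶️ 10,000 | ❤️ 1,000
   🔗 https://www.douyin.com/search/海鲜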
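
A small formatter can render records in the layout shown above. This is a sketch under assumptions: format_results and MOCK_BANNER are hypothetical names, the English label text is illustrative, and each record is assumed to carry the fields title, author, play_count, like_count, url, and tags.

```python
from typing import Dict, List

MOCK_BANNER = "⚠️ The following is sample data (no real results were retrieved)"


def format_results(keyword: str, videos: List[Dict], is_mock: bool = False) -> str:
    """Render video records as the numbered, emoji-labeled list from Step 3."""
    lines: List[str] = []
    if is_mock:
        # Mock data must be labeled before any result is shown.
        lines += [MOCK_BANNER, ""]
    lines += [f'🔍 Search results for "{keyword}" ({len(videos)} total):', ""]
    for index, video in enumerate(videos, start=1):
        lines.append(
            f"{index}. 🎬 {video['title']} | 👤 {video['author']} | "
            f"▶️ {video['play_count']:,} | ❤️ {video['like_count']:,}"
        )
        lines.append(f"   🔗 {video['url']}")
        tags = video.get("tags") or []
        if tags:
            lines.append(f"   📝 {', '.join(tags)}")
        lines.append("")
    return "\n".join(lines).rstrip()
```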

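Fetching the Hot List

browser open → https://www.douyin.com/hot
browser snapshot → extract the hot-list entries

Or use the script: python3 scripts/scraper.py hot --limit 20 --mock

Single Video Details

When the user provides a video link, open it with the browser tool and extract the details.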

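Scripts

| Script | Language | Notes |
| --- | --- | --- |
| scripts/scraper.py | Python | Supports the --mock flag; falls back to mock data automatically when no browser is available |
| scripts/douyin_scraper.js | Node.js | Depends on Playwright; returns mock data when no browser is available |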

Output Format

JSON:

[{"title": "Video title", "author": "Author", "play_count": 1000000, "like_count": 50000, "url": "https://...", "tags": ["tag1"]}]
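
Downstream code may want to check scraper output against this shape before using it. The sketch below assumes every field in the sample record is required and typed as shown; the README does not state this, and load_videos and REQUIRED_FIELDS are hypothetical names.

```python
import json
from typing import Any, Dict, List

# Field names and types inferred from the single sample record (an assumption).
REQUIRED_FIELDS = {
    "title": str,
    "author": str,
    "play_count": int,
    "like_count": int,
    "url": str,
    "tags": list,
}


def load_videos(payload: str) -> List[Dict[str, Any]]:
    """Parse a JSON array of video records and validate each field's type."""
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of video records")
    for position, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {position}: expected a JSON object")
        for field, expected in REQUIRED_FIELDS.items():
            if field not in record:
                raise ValueError(f"record {position}: missing field {field!r}")
            if not isinstance(record[field], expected):
                raise ValueError(
                    f"record {position}: {field!r} should be {expected.__name__}"
                )
    return records
```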

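Notes

  1. Follow platform rules: use the skill reasonably and avoid frequent requests
  2. Login session: a browser with profile="user" is recommended to avoid CAPTCHAs
  3. Request interval: wait at least 2 seconds between consecutive operations
  4. Data usage: for learning and research only
  5. Risk controls: accessing Douyin search or the hot list without logging in is very likely to trigger a CAPTCHA; this is expected behavior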
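
The 2-second minimum between consecutive operations can be enforced with a small throttle. This is a minimal sketch, not part of the skill's scripts; MinIntervalThrottle is a hypothetical name, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Blocks so that consecutive wait() calls are at least `interval` seconds apart."""

    def __init__(
        self,
        interval: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous wait() finished

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (self._clock() - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling throttle.wait() immediately before each browser or script operation keeps requests spaced out; the first call never sleeps.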

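Troubleshooting

| Problem | Solution |
| --- | --- |
| CAPTCHA interception | Use a browser with profile="user", or tell the user that manual verification is required |
| Browser timeout | Check the network and increase the wait time |
| Script returns mock data | Expected when no browser is available; switch to the browser tool |
| Page structure changed | Update the selectors used on the snapshot |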
Usage Guidance
Install only if you are comfortable with browser automation against Douyin. Prefer an isolated browser profile, do not reuse your main logged-in session unless you explicitly intend to, and treat script results as demo/mock data unless you verify that real page extraction was implemented.
Capability Assessment
Purpose & Capability
The stated purpose is Douyin search/hot-list scraping, and browser automation is coherent with that purpose, but the included scripts largely generate mock video records instead of extracting real page data.
Instruction Scope
The skill uses broad natural-language auto-activation examples and recommends profile="user" to reuse a logged-in session without a clear consent step or privacy warning.
Install Mechanism
Installation creates a Python virtual environment, installs Playwright/Chromium, may run npm install, and can build a Docker image; these are disclosed and broadly consistent with browser automation, though somewhat heavy.
Credentials
Network access to Douyin is expected, but using the user's authenticated browser profile to reduce anti-bot checks is higher-impact than needed for many searches and is under-scoped.
Persistence & Privilege
No background persistence, credential storage, destructive behavior, or privilege escalation was found; persistent changes are mainly installed dependencies, browser binaries, optional Docker image, and output files.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install douyin-scraper-2
  3. After installation, invoke the skill by name or use /douyin-scraper-2
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.0
Added a natural-language search entry point and a --mock flag, automatic fallback when no browser is available, and updated SKILL.md/README
Metadata
Slug douyin-scraper-2
Version 1.1.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is 抖音爆款爬虫?

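It scrapes viral Douyin videos and their caption data, supports natural-language search requests (for example, "search for seafood videos"), and retrieves data through browser automation or scripts. It is an AI Agent Skill for Claude Code / OpenClaw, with 40 downloads so far.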

How do I install 抖音爆款爬虫?

Run "/install douyin-scraper-2" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is 抖音爆款爬虫 free?

Yes, 抖音爆款爬虫 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 抖音爆款爬虫 support?

抖音爆款爬虫 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 抖音爆款爬虫?

It is built and maintained by terrycarter1985 (@terrycarter1985); the current version is v1.1.0.
