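Description

Collects content from Bilibili, Douyin, YouTube, and Zhihu through the local media-agent-crawler HTTP service, with no MCP client installation required. Use it when the user wants to collect content from these platforms and the application is already running on the local machine (default http://127.0.0.1:39002).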

Usage (SKILL.md)

media-crawler-local

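Name: 社交媒体研究助手Skill (Social Media Research Assistant Skill)
Author: sansan-mei

Calls the local HTTP service directly, bypassing the MCP client configuration of OpenClaw/Cursor.

Pre-flight checks

Extract the following from the user's message or context first, and ask only for what is missing:

Operation type: collect content / query archives / read task data
Target link or keyword
Platform (can be inferred from the link)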
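
Inferring the platform from a link can be sketched as a host-suffix lookup. The host lists below are assumptions made for illustration; the service's own inference rules are not documented here.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical host suffixes per platform (assumption, not taken from the service).
PLATFORM_HOSTS = {
    "bilibili": ("bilibili.com", "b23.tv"),
    "douyin": ("douyin.com",),
    "youtube": ("youtube.com", "youtu.be"),
    "zhihu": ("zhihu.com",),
}


def infer_platform(link: str) -> Optional[str]:
    """Return the platform name for a link, or None when the host is not recognized."""
    host = (urlparse(link).hostname or "").lower()
    for platform, suffixes in PLATFORM_HOSTS.items():
        if any(host == s or host.endswith("." + s) for s in suffixes):
            return platform
    return None
```

When the function returns None (for example, for a bare BV ID or a keyword), the platform still has to be asked for or taken from context.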

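Tools

Bilibili tools

Tool	Required parameters	Description
`crawl_bilibili`	`url`	Video URL or BV ID
`crawl_bilibili_search`	`keyword`	Collects the search results for a keyword
`crawl_bilibili_uploader`	`mid`	Uploader's numeric ID; collects the uploader's video list
`crawl_bilibili_popular`	None	Collects popular videos
`crawl_bilibili_weekly`	None (optional `number`)	Weekly must-watch list; omit `number` to fetch the latest issue
`crawl_bilibili_history`	None (optional `max/view_at/business/ps/type/page_count`)	Aggregated watch-history collection; when `page_count` is omitted, it follows `dailyRecommendPageCount`

All Bilibili tools accept an optional cookies parameter (a string obtained from the browser plugin).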

Other platforms

Tool	Required parameters	Description
`crawl_douyin`	`url`	Douyin video URL
`crawl_youtube`	`url`	YouTube video URL or video ID
`crawl_zhihu`	`url`	Zhihu question or answer URL
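
The required parameters in the two crawl-tool tables above can be checked on the client before a call is sent. The following is a minimal sketch; `REQUIRED_PARAMS` and `missing_params` are hypothetical helper names, and the service may apply its own validation.

```python
from typing import Any, Dict, List

# Required parameters per crawl tool, copied from the tables above.
REQUIRED_PARAMS = {
    "crawl_bilibili": ("url",),
    "crawl_bilibili_search": ("keyword",),
    "crawl_bilibili_uploader": ("mid",),
    "crawl_bilibili_popular": (),
    "crawl_bilibili_weekly": (),
    "crawl_bilibili_history": (),
    "crawl_douyin": ("url",),
    "crawl_youtube": ("url",),
    "crawl_zhihu": ("url",),
}


def missing_params(tool: str, arguments: Dict[str, Any]) -> List[str]:
    """Return the required parameter names that are absent or empty for a tool."""
    if tool not in REQUIRED_PARAMS:
        raise ValueError(f"unknown crawl tool: {tool}")
    return [name for name in REQUIRED_PARAMS[tool] if arguments.get(name) in (None, "")]
```

An empty result means the call can proceed; otherwise the returned names are what to ask the user for.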

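Archives and data access

Tool	Required parameters	Optional parameters	Description
`list_archives`	None	`platform` / `keyword` / `limit` / `sort_by` / `created_after`	Lists archived tasks; returns at most 50 entries by default, newest first
`get_task_data`	`task_id`	`type`	Reads the data files in the task directory

list_archives parameters:

sort_by: date (default, creation time descending) or status (running → failed → unknown → finished)
created_after: ISO date, e.g. 2026-03-18 or 2026-03-18T10:00:00Z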
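
A small sketch of assembling list_archives arguments with the constraints described above. The helper name `build_list_archives_args` is hypothetical, and the client-side checks are an illustration; the service may validate differently.

```python
from datetime import datetime
from typing import Any, Dict, Optional

SORT_KEYS = ("date", "status")  # the two documented sort_by values


def build_list_archives_args(
    platform: Optional[str] = None,
    keyword: Optional[str] = None,
    limit: Optional[int] = None,
    sort_by: Optional[str] = None,
    created_after: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the arguments object, dropping every parameter that was not given."""
    if sort_by is not None and sort_by not in SORT_KEYS:
        raise ValueError(f"sort_by must be one of {SORT_KEYS}")
    if created_after is not None:
        # Raises ValueError on a malformed date; a trailing "Z" is rewritten
        # because older Python versions do not parse it in fromisoformat.
        datetime.fromisoformat(created_after.replace("Z", "+00:00"))
    candidates = {
        "platform": platform,
        "keyword": keyword,
        "limit": limit,
        "sort_by": sort_by,
        "created_after": created_after,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```

Omitted parameters are left out entirely so that the service-side defaults (50 entries, sorted by date) apply.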

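get_task_data accepts the following type values (aliases included):

type value	Data returned
`comments` / `comment`	Comments
`danmaku`	Danmaku (bullet comments)
`subtitles` / `subtitle` / `caption` / `captions`	Subtitles
`detail` / `info`	Video/post details
`all` / `full`	Full aggregated data
`summary` / `ai_summary`	AI summary
omitted	All recognizable files in the directory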
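
The alias table above can be mirrored on the client, for example to reject a typo before sending a request. In this sketch the first name in each row is treated as the canonical one, which is an assumption; the service accepts every alias directly.

```python
from typing import Optional

# Alias -> canonical name; canonical choice (first name per row) is an assumption.
TYPE_ALIASES = {
    "comments": "comments", "comment": "comments",
    "danmaku": "danmaku",
    "subtitles": "subtitles", "subtitle": "subtitles",
    "caption": "subtitles", "captions": "subtitles",
    "detail": "detail", "info": "detail",
    "all": "all", "full": "all",
    "summary": "summary", "ai_summary": "summary",
}


def canonical_type(value: Optional[str]) -> Optional[str]:
    """Map a type value to its canonical name; None means the parameter is omitted."""
    if value is None:
        return None
    try:
        return TYPE_ALIASES[value]
    except KeyError:
        raise ValueError(f"unsupported type: {value}") from None
```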

HTTP endpoints

The service listens on http://127.0.0.1:39002 by default; set the BIL_CRAWL_URL environment variable to override it.
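
Resolving the base URL might look like the sketch below. The precedence (explicit argument, then BIL_CRAWL_URL, then the default) is an assumption modelled on the optional [base_url] argument of the scripts; `resolve_base_url` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "http://127.0.0.1:39002"


def resolve_base_url(explicit: Optional[str] = None,
                     env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the base URL and strip any trailing slash so paths can be appended."""
    env = os.environ if env is None else env
    return (explicit or env.get("BIL_CRAWL_URL") or DEFAULT_BASE_URL).rstrip("/")
```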

Crawl endpoint (REST)

POST /start-crawl/{platform}/{encodedUrl}
Content-Type: application/json

{ "source": "ai" }

encodedUrl must be encoded with encodeURIComponent; platform is one of bilibili / douyin / youtube / zhihu.
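
A minimal sketch of building this request with the Python standard library. `quote` with the `safe` set below leaves unescaped the same characters as JavaScript's encodeURIComponent; `build_start_crawl_request` is a hypothetical helper name.

```python
import json
import urllib.request
from urllib.parse import quote

PLATFORMS = ("bilibili", "douyin", "youtube", "zhihu")


def build_start_crawl_request(base_url: str, platform: str, target_url: str) -> urllib.request.Request:
    """Build (without sending) the POST /start-crawl/{platform}/{encodedUrl} request."""
    if platform not in PLATFORMS:
        raise ValueError(f"platform must be one of {PLATFORMS}")
    # Equivalent of encodeURIComponent: ":" and "/" are percent-encoded too.
    encoded = quote(target_url, safe="-_.!~*'()")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/start-crawl/{platform}/{encoded}",
        data=json.dumps({"source": "ai"}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the response shape is not documented here, so it is left unparsed.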

MCP endpoint (JSON-RPC 2.0)

POST /mcp
Accept: application/json, text/event-stream
Content-Type: application/json

Request body format:

{ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": { "name": "<tool>", "arguments": { } } }
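
The request body and headers above can be assembled as in this sketch. `build_tool_call` is a hypothetical helper; parsing the response is omitted because the Accept header allows either JSON or an event stream.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_tool_call(base_url: str, tool: str,
                    arguments: Optional[Dict[str, Any]] = None,
                    request_id: int = 1) -> urllib.request.Request:
    """Build (without sending) a JSON-RPC 2.0 tools/call request for POST /mcp."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json, text/event-stream",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```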

Script usage

All scripts live in skills/media-crawler-local/scripts/; run them from the root of the openclaw workspace.

1. Quick crawl (REST, `crawl.sh`)

bash skills/media-crawler-local/scripts/crawl.sh <platform> <url> [base_url]

Example:

bash skills/media-crawler-local/scripts/crawl.sh bilibili "https://www.bilibili.com/video/BV1xx411c7mD"

2. Crawl via MCP (`crawl_mcp.sh`, only for tools that take a url)

bash skills/media-crawler-local/scripts/crawl_mcp.sh <tool_name> <target_url> [base_url]

Example:

bash skills/media-crawler-local/scripts/crawl_mcp.sh crawl_bilibili "https://www.bilibili.com/video/BV1xx411c7mD"

Supported tools: crawl_bilibili / crawl_douyin / crawl_youtube / crawl_zhihu

For all other tools (crawl_bilibili_search / crawl_bilibili_uploader / crawl_bilibili_popular / crawl_bilibili_weekly / crawl_bilibili_history / list_archives / get_task_data), use mcp_tool.sh.
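
The choice between the two MCP scripts can be sketched as a small dispatcher that returns the argv to run. Routing a url tool to mcp_tool.sh whenever extra arguments (such as cookies) are present is an assumption, based on crawl_mcp.sh accepting only a target URL; `script_command` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List

SCRIPTS_DIR = "skills/media-crawler-local/scripts/"
URL_TOOLS = {"crawl_bilibili", "crawl_douyin", "crawl_youtube", "crawl_zhihu"}


def script_command(tool: str, arguments: Dict[str, Any]) -> List[str]:
    """Return the command line for a tool call, ready for subprocess.run."""
    if tool in URL_TOOLS and set(arguments) == {"url"}:
        return ["bash", SCRIPTS_DIR + "crawl_mcp.sh", tool, arguments["url"]]
    # Everything else goes through the generic wrapper with a JSON argument string.
    return ["bash", SCRIPTS_DIR + "mcp_tool.sh", tool, json.dumps(arguments, ensure_ascii=False)]
```

Returning a list rather than a shell string avoids quoting problems with URLs and JSON when the command is passed to `subprocess.run`.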

3. Archive query (`list_archives_mcp.sh`)

bash skills/media-crawler-local/scripts/list_archives_mcp.sh [platform] [keyword] [limit] [base_url]

Example:

bash skills/media-crawler-local/scripts/list_archives_mcp.sh bilibili "蛋神" 20

4. Generic tool call (`mcp_tool.sh`)

bash skills/media-crawler-local/scripts/mcp_tool.sh <tool_name> [args_json] [base_url]

Examples:

# Bilibili search
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_search '{"keyword":"蛋神"}'

# Uploader's video list
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_uploader '{"mid":"123456"}'

# Popular videos
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_popular '{}'

# Weekly must-watch (latest issue)
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_weekly '{}'

# Weekly must-watch (specific issue)
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_weekly '{"number":364}'

# Watch history (default parameters)
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_history '{}'

# Watch history (explicit first-page cursor)
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_history '{"max":0,"view_at":0,"business":"","ps":20,"type":"all"}'

# Watch history (explicit number of pages)
bash skills/media-crawler-local/scripts/mcp_tool.sh crawl_bilibili_history '{"page_count":2}'

# Read a task's comment data
bash skills/media-crawler-local/scripts/mcp_tool.sh get_task_data '{"task_id":"BV1xx411c7mD-123456","type":"comments"}'

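Workflow

1. Health check: GET / (if the connection fails, remind the user to start the application first).
2. Start the crawl: prefer the REST endpoint (crawl.sh); use MCP (mcp_tool.sh) when extra tool parameters are needed.
3. Handle the result:
- Give the user a brief summary (task ID, status, key fields)
- When there is a lot of content, show only the first few items and note that get_task_data can be used to read or filter the rest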
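
The health-check step can be sketched as follows. Treating any HTTP response, including an error status, as proof that the service is running is an assumption; `is_service_up` and its `opener` parameter are hypothetical and exist only to keep the function testable offline.

```python
import urllib.error
import urllib.request


def is_service_up(base_url: str, timeout: float = 3.0,
                  opener=urllib.request.urlopen) -> bool:
    """Return True if GET / gets any HTTP response, False if the connection fails."""
    try:
        with opener(base_url.rstrip("/") + "/", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        return False
```

A False result is the cue to remind the user to start the application before any crawl is attempted.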

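Troubleshooting

Error	Action
Connection failed	Remind the user to start the Electron app first (`bun run start` / `dev`)
401 / 403	Ask the user to check whether cookies are already in the store, or to re-import them from the plugin
429	Back off according to the returned `Retry-After`; do not retry in rapid succession
5xx	Retry at most once, then return an error summary and suggestions
`task_id` not found	Use `list_archives` first to look up the correct task ID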
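
The HTTP rows of the table above can be expressed as a small decision function. This is a sketch under assumptions: only the delta-seconds form of `Retry-After` is parsed, and the 5-second fallback and 1-second pause before the single 5xx retry are illustrative values, not documented behavior.

```python
from typing import Optional, Tuple

DEFAULT_BACKOFF_SECONDS = 5.0  # assumed fallback when Retry-After is absent or not numeric


def retry_plan(status: int, retry_after: Optional[str] = None,
               attempt: int = 0) -> Tuple[bool, float, str]:
    """Return (should_retry, delay_seconds, hint) for an HTTP status code."""
    if status in (401, 403):
        return (False, 0.0, "check that cookies are in the store or re-import them from the plugin")
    if status == 429:
        numeric = retry_after is not None and retry_after.strip().isdigit()
        delay = float(retry_after) if numeric else DEFAULT_BACKOFF_SECONDS
        return (True, delay, "rate limited; back off before retrying")
    if 500 <= status < 600:
        # At most one retry: only the first attempt (attempt == 0) is retried.
        return (attempt < 1, 1.0, "server error; report a summary if the retry also fails")
    return (False, 0.0, "")
```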

Security recommendations

This skill is a local client for a crawler running on your machine. Before installing or using it:

1) Ensure the crawler service (Electron app) is really running on localhost (or on a host you trust). If you set BIL_CRAWL_URL, do not point it at an untrusted remote server, because the skill will send URLs/arguments there.
2) Be cautious when supplying 'cookies' strings (they can contain login tokens).
3) The package metadata omits required binaries: the scripts call curl and node; make sure those are present and review their versions.
4) If you want stronger assurance, inspect or run the actual media-agent-crawler service code (the skill only calls that service).

Overall, the skill appears coherent for its stated purpose, with the above operational caveats.

Functional analysis

Type: OpenClaw Skill
Name: media-research-crawl-skill
Version: 1.0.2

The skill bundle is a legitimate utility designed to interface with a local media crawler service (defaulting to http://127.0.0.1:39002) for platforms like Bilibili, Douyin, YouTube, and Zhihu. The provided bash scripts (e.g., crawl.sh, mcp_tool.sh) act as wrappers for curl commands to interact with the service's REST and JSON-RPC endpoints. The instructions in SKILL.md are well-aligned with the stated purpose, and there is no evidence of data exfiltration, malicious execution, or harmful prompt injection.

Capability assessment

✓ Purpose & Capability

The name/description match the actual behavior: the skill is a client for a local media-agent-crawler service (default http://127.0.0.1:39002). The included scripts and SKILL.md describe only crawl/list/get operations for the declared platforms and do not request unrelated cloud credentials or access to unrelated subsystems.

✓ Instruction Scope

Runtime instructions and scripts only construct JSON payloads and call REST (/start-crawl/...) or MCP (/mcp) endpoints on the configured base URL. They do not read arbitrary files, shell history, or system credentials. The SKILL.md correctly documents endpoints, parameters (including optional cookies), and expected behavior.

✓ Install Mechanism

There is no install spec (instruction-only skill) and no downloads/extraction. The provided scripts are simple wrappers that use curl and node via inline node -e snippets; they do not install external code.

ℹ Credentials

The skill does not require credentials and does not declare required env vars in registry metadata, but SKILL.md and scripts reference an optional BIL_CRAWL_URL environment variable. The scripts also assume command-line tools (curl, node) exist even though 'required binaries' lists none. Also note the optional 'cookies' parameter can contain session cookies — providing those grants the crawler access to account-protected content and should be considered sensitive.

✓ Persistence & Privilege

always:false and no special persistence is requested. The skill does not modify other skills or system settings. The agent may invoke it autonomously (default), which is normal — this simply allows the agent to call the local crawler when relevant.

Version history

v1.0.2

- Removed support for the Xiaohongshu (xhs) platform; the related tool descriptions and script invocations have been updated
- Removed xhs from all related parameter and script descriptions in the documentation
- The remaining platforms (Bilibili/Douyin/YouTube/Zhihu) are unchanged

v1.0.1

- Added crawl_bilibili_history, a tool for aggregated collection of Bilibili watch history, with optional parameters (such as page count and first-page cursor).
- Added examples of calling crawl_bilibili_history with various parameters to mcp_tool.sh and the related documentation.
- Documented that crawl_mcp.sh supports only some tools; the rest, including crawl_bilibili_history, must be called through mcp_tool.sh.

v1.0.0

Initial release of media-crawler-local.
- Enables local collection of Bilibili, Douyin, Xiaohongshu, YouTube, Zhihu content via a media-agent-crawler HTTP service running on the user's machine.
- Operates directly through local HTTP endpoints without requiring MCP client configuration.
- Adds bash scripts for common actions: collecting content, querying task archives, and tool invocation.
- Supports a range of tools for content collection, archive listings, and data retrieval with flexible parameters.
- Guidance included for error handling and normal operation workflow.

Metadata

Slug media-research-crawl-skill

Version 1.0.2

License MIT-0

Total installs 0

Current installs 0

Released versions 3

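FAQ

What is 社交媒体研究助手Skill?

It collects content from Bilibili, Douyin, YouTube, and Zhihu through the local media-agent-crawler HTTP service, with no MCP client installation required. Use it when the user wants to collect content from these platforms and the application is already running on the local machine (default http://127.0.0.1:39002). It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 272 times to date.

How do I install 社交媒体研究助手Skill?

Run the command "/install media-research-crawl-skill" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is 社交媒体研究助手Skill free?

Yes. 社交媒体研究助手Skill is completely free, is released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does 社交媒体研究助手Skill run on?

社交媒体研究助手Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed 社交媒体研究助手Skill?

It is developed and maintained by 梅花三十三 (@sansan-mei); the current version is v1.0.2.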
