dr-xiaoming

Social Media Data Collector

by Dr-xiaoming · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ✓ Security Clean · 68 downloads · 0 stars · 0 active installs · 1 version
Install in OpenClaw
/install social-media-data-collector
Description
Multi-platform social media data collection and aggregation for content performance tracking. Use when: (1) collecting engagement metrics (views/likes/commen...
README (SKILL.md)

Social Media Data Collector

Overview

Collect engagement metrics from 13+ platforms and aggregate them into a structured format (Feishu Bitable or CSV). Three-tier approach: API first → browser scrape fallback → manual flag.

Execution Flow

  1. Classify platforms by data access method (see references/platform-guide.md)
  2. API tier — call APIs for platforms with programmatic access
  3. Browser tier — Playwright render + text extraction for remaining
  4. Aggregate — normalize data, write to target (bitable/CSV)
  5. Cleanup — remove screenshots, temp files, browser cache
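
The tiered fallback in the flow above could be sketched as follows. This is a minimal illustration, not the code in scripts/: `collect_one`, `api_fetch`, and `browser_fetch` are hypothetical names, and the fetchers are passed in as plain callables so the sketch runs without network access.

```python
from typing import Callable, Optional

Metrics = dict  # e.g. {"likes": 120, "comments": 8}
Fetcher = Callable[[str], Optional[Metrics]]


def collect_one(url: str,
                api_fetch: Optional[Fetcher],
                browser_fetch: Optional[Fetcher]) -> dict:
    """Try the API tier, then the browser tier, then flag for manual review."""
    for tier, fetch in (("api", api_fetch), ("browser", browser_fetch)):
        if fetch is None:
            continue  # no fetcher available for this tier
        try:
            metrics = fetch(url)
        except Exception:
            metrics = None  # a failed tier falls through to the next one
        if metrics:  # empty or missing data is never accepted as a result
            return {"url": url, "tier": tier, "metrics": metrics}
    # Neither tier produced data: flag the URL instead of guessing values.
    return {"url": url, "tier": "manual", "metrics": None}
```

Recording the tier that produced each result makes it easy to see afterwards which URLs still need a manual check.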

Platform Tiers

| Tier | Platforms | Method |
|---|---|---|
| API-first | 抖音, 微博, 快手, B站, 今日头条, 小红书 | TikHub API / BlueAI Crawler |
| Browser-scrape | 百家号, 汽车之家, 易车, 视频号, 斗鱼, 皮皮虾 | Playwright headless |
| API+scrape | 懂车帝 | TikHub (limited) + scrape |
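
For step 1 of the execution flow, the tier assignments above can be held in a lookup table. The sketch below is one possible encoding: the tier labels (`"api"`, `"browser"`, `"api+scrape"`) and the `group_by_tier` helper are hypothetical, and sending unknown platforms to `"manual"` is an assumption, not documented behavior.

```python
from collections import defaultdict

# Platform -> tier, copied from the Platform Tiers table.
PLATFORM_TIERS = {
    "抖音": "api", "微博": "api", "快手": "api",
    "B站": "api", "今日头条": "api", "小红书": "api",
    "百家号": "browser", "汽车之家": "browser", "易车": "browser",
    "视频号": "browser", "斗鱼": "browser", "皮皮虾": "browser",
    "懂车帝": "api+scrape",
}


def group_by_tier(platforms):
    """Group platform names by tier; unknown names go to "manual" (assumed)."""
    groups = defaultdict(list)
    for name in platforms:
        groups[PLATFORM_TIERS.get(name, "manual")].append(name)
    return dict(groups)


print(group_by_tier(["抖音", "百家号", "懂车帝"]))
```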

Model Strategy (Token Optimization)

Problem

Using opus/sonnet for the entire pipeline wastes tokens on mechanical tasks.

Recommended Model Split

| Phase | Model | Why |
|---|---|---|
| Planning & classification | opus/sonnet | Needs reasoning |
| API calls & JSON parsing | haiku/flash | Mechanical, no reasoning needed |
| Browser text extraction | Code (no LLM) | Pure Python, no model call |
| Data normalization | haiku/flash | Simple mapping |
| Report/summary | sonnet | Needs synthesis |

Implementation

  • Use scripts/collect_api.py for API tier — zero LLM tokens (pure code)
  • Use scripts/collect_browser.py for browser tier — zero LLM tokens (pure code)
  • Only invoke LLM for: planning which platforms to hit, handling errors, writing summaries

Token Budget Estimate (per 13-platform run)

  • With current approach (all-opus): ~80k tokens
  • With optimized approach (code scripts + haiku routing): ~5k tokens
  • Savings: ~94% (about 75k of 80k tokens)
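
The savings figure follows directly from the two estimates. A quick check, using a hypothetical `token_savings` helper and the estimates above as inputs:

```python
def token_savings(baseline_tokens: int, optimized_tokens: int) -> float:
    """Fraction of tokens saved relative to the baseline."""
    return 1 - optimized_tokens / baseline_tokens


savings = token_savings(80_000, 5_000)
print(f"{savings:.2%}")  # 93.75%, which rounds to the 94% quoted above
```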

Key Commands

# Full collection run
python3 scripts/collect_api.py --config /tmp/sm-collect/config.json

# Browser scrape specific platforms  
python3 scripts/collect_browser.py --platforms "百家号,汽车之家,视频号"

# Write to bitable
python3 scripts/write_bitable.py --app-token XXX --table-id YYY --data /tmp/sm-collect/results.json

# Cleanup
rm -rf /tmp/sm-collect/ /tmp/screenshots/

Bitable Field Mapping

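| Bitable field | Type | Notes |
|---|---|---|
| 播放量 (views) | text | Text with a "万" (10,000) suffix |
| 点赞 (likes) | number | Plain number |
| 评论 (comments) | number | Plain number |
| 分享 (shares) | number | Plain number |
| 收藏 (favorites) | number | Plain number |
| 互动量合计 (total interactions) | text | Text with a "万" (10,000) suffix |
| 数据统计日期 (data snapshot date) | text | Format "2026.5.15" |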

⚠️ Note: 播放量 and 互动量合计 are text fields, not number fields. Passing a number triggers a TextFieldConvFail error.
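
A record that respects these field types might be built as in the sketch below. `to_wan_text` and `build_fields` are hypothetical helpers, not part of scripts/write_bitable.py. Two details are assumptions: counts of 10,000 or more are divided by 10,000 and shown with one decimal place, and 互动量合计 is taken to be the sum of likes, comments, shares, and favorites.

```python
from datetime import date


def to_wan_text(n: int) -> str:
    """Render a count as text, adding a "万" suffix from 10,000 up (assumed rule)."""
    if n >= 10_000:
        return f"{n / 10_000:.1f}万"
    return str(n)


def build_fields(views: int, likes: int, comments: int,
                 shares: int, favorites: int, stat_date: date) -> dict:
    """Map raw counts to Bitable fields, keeping the two text fields as strings."""
    total = likes + comments + shares + favorites  # assumed definition of the total
    return {
        "播放量": to_wan_text(views),       # text field: must be a string
        "点赞": likes,                      # number fields stay numeric
        "评论": comments,
        "分享": shares,
        "收藏": favorites,
        "互动量合计": to_wan_text(total),   # text field: must be a string
        # No zero padding, matching the "2026.5.15" format.
        "数据统计日期": f"{stat_date.year}.{stat_date.month}.{stat_date.day}",
    }


fields = build_fields(123456, 100, 20, 3, 4, date(2026, 5, 15))
```

Here `fields["播放量"]` is the string `"12.3万"`, while `fields["点赞"]` stays an integer.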

Cleanup Protocol

After each collection run, delete:

  • /tmp/sm-collect/ (intermediate JSON)
  • /tmp/screenshots/ (browser screenshots)
  • /tmp/subagent-out/ (if spawned sub-agents)
  • Any .json temp files in workspace
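
A scripted version of this cleanup, limited to the three directories listed above, might look like the following sketch. The `cleanup` helper is hypothetical. It leaves workspace .json files alone on purpose, since deleting those safely requires knowing which files this run created.

```python
import shutil
from pathlib import Path

# Directories named in the cleanup protocol.
RUN_DIRS = ("/tmp/sm-collect", "/tmp/screenshots", "/tmp/subagent-out")


def cleanup(dirs=RUN_DIRS) -> list:
    """Delete each listed directory if it exists; return the paths removed."""
    removed = []
    for d in dirs:
        path = Path(d)
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(str(path))
    return removed
```

Returning the removed paths gives the caller something to log, and passing an explicit list keeps the deletion scoped to known locations.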

Error Handling

  • API 403/401 → token expired, refresh and retry once
  • Browser timeout → increase to 25s, retry with wait_until="domcontentloaded"
  • Platform redirects → check URL is correct (易车 hao vs sv domain!)
  • Empty data → flag for manual check, don't guess
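
The first two rules above can be sketched as small wrappers. Every name here is hypothetical: `AuthError` stands in for an HTTP 401/403 response, the built-in `TimeoutError` stands in for the browser library's own timeout exception, and the 10-second first attempt with `wait_until="load"` is an assumed starting point rather than a documented setting.

```python
class AuthError(Exception):
    """Stands in for an HTTP 401/403 response from the API tier."""


def call_with_refresh(call, refresh_token):
    """Run call(); on an auth failure, refresh the token and retry exactly once."""
    try:
        return call()
    except AuthError:
        refresh_token()
        return call()  # a second AuthError propagates to the caller


def load_with_retry(load, first_timeout_ms=10_000):
    """Load a page; on timeout, retry once with 25 s and a lighter wait condition."""
    try:
        return load(timeout_ms=first_timeout_ms, wait_until="load")
    except TimeoutError:
        return load(timeout_ms=25_000, wait_until="domcontentloaded")
```

With Playwright, `load` would typically wrap `page.goto(url, timeout=..., wait_until=...)`, and the `except` clause would catch Playwright's own timeout error instead of the built-in one.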

Platform-Specific Notes

See references/platform-guide.md for detailed per-platform experience including:

  • Authentication requirements
  • URL patterns and gotchas
  • Data extraction selectors
  • Known limitations
Usage Guidance
This skill appears safe for its stated purpose if you intend to collect social media metrics and update a Feishu Bitable. Before using it, confirm the exact URLs, table ID, and record IDs, use least-privilege TikHub and Feishu credentials, and restrict cleanup to files created by this run.
Capability Analysis
Type: OpenClaw Skill · Name: social-media-data-collector · Version: 1.0.0
The skill bundle is a social media data aggregator designed to collect engagement metrics from over 13 platforms and sync them to Feishu Bitable. It utilizes the TikHub API (scripts/collect_api.py) and Playwright-based scraping (scripts/collect_browser.py) as a fallback. The implementation is transparent, includes a cleanup protocol for temporary files, and lacks any indicators of malicious intent, such as unauthorized data exfiltration or hidden backdoors. The use of third-party APIs (api.tikhub.io) and browser automation is consistent with the stated goal of cross-platform data tracking.
Capability Tags
requires-oauth-token, requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The scripts match the stated purpose: collect engagement metrics through APIs or browser scraping and write normalized results to Feishu Bitable. The Bitable write capability is expected but can modify user business data.
Instruction Scope
The execution flow is disclosed and task-focused. The cleanup instructions include broader temporary-file deletion language that should be interpreted narrowly.
Install Mechanism
There is no automatic install mechanism, but the browser tier requires Playwright/Chromium and suggests an unpinned pip install command if missing.
Credentials
Network access to TikHub, Feishu, and the target social platforms is proportional to the skill purpose. Users should still treat submitted URLs, metrics, and credentials as external data flows.
Persistence & Privilege
No background persistence or token storage is shown. The skill does use Feishu app credentials and Bitable table identifiers to update records, so least-privilege credentials are important.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install social-media-data-collector
  3. After installation, invoke the skill by name or use /social-media-data-collector
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: 13-platform data collection with API + browser scraping, Feishu bitable integration
Metadata
Slug social-media-data-collector
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Social Media Data Collector?

Multi-platform social media data collection and aggregation for content performance tracking. Use when: (1) collecting engagement metrics (views/likes/commen... It is an AI Agent Skill for Claude Code / OpenClaw, with 68 downloads so far.

How do I install Social Media Data Collector?

Run "/install social-media-data-collector" in the OpenClaw or Claude Code chat to install it in one step. The browser tier additionally requires Playwright and Chromium.

Is Social Media Data Collector free?

Yes, Social Media Data Collector is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Social Media Data Collector support?

Social Media Data Collector is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Social Media Data Collector?

It is built and maintained by Dr-xiaoming (@dr-xiaoming); the current version is v1.0.0.
