Description

Fetch social media content and save to Obsidian. Supports Twitter/X, Reddit, GitHub, HackerNews, Bilibili, Weibo, Xiaohongshu and 30+ platforms via bb-browse...

README (SKILL.md)

claw-social-feed

Name: Website Content Scraped into Obsidian
Author: glassmarbles

Fetch social media timelines into Obsidian vaults. Multi-platform, incremental sync, smart filtering, auto-tagging.

Core dependency: bb-browser (via --openclaw flag to reuse the OpenClaw browser session). Supports 36 platforms via bb-browser adapters — see references/platforms.md.

Workflow

User config (config.yaml)
      │
      ▼
fetch_save.py
      │
      ├── Dedup accounts
      ├── Read state.json (last fetch cursor)
      │
      ▼
bb-browser site <platform>/<cmd> --openclaw --json
      │
      ▼
Filter → Tag → Write to Obsidian
      │
      ▼
Update state.json
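
The fetch step in the diagram above can be sketched as a small Python wrapper around the bb-browser call. This is a minimal sketch, not the code in fetch_save.py: the function names are hypothetical, the output is assumed to be a JSON array of posts, and how the username is passed to bb-browser is not shown in this README, so it is left out here.

```python
import json
import subprocess


def build_command(platform: str, cmd: str) -> list[str]:
    """Build the argv list for one bb-browser call.

    Passing a list (rather than a shell string) avoids shell injection.
    """
    return ["bb-browser", "site", f"{platform}/{cmd}", "--openclaw", "--json"]


def fetch_posts(platform: str, cmd: str, runner=subprocess.run) -> list[dict]:
    """Run bb-browser and parse its output.

    Assumption: stdout is a JSON array of post objects.
    `runner` is injectable so the function can be exercised without bb-browser.
    """
    result = runner(
        build_command(platform, cmd),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Keeping command construction separate from execution makes it easy to log or dry-run the exact command before anything touches the browser session.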

Quick Start

1. Install bb-browser

# Requires Node.js 18+
npm install -g bb-browser

# Verify
bb-browser --version

2. Configure accounts

Edit config.yaml:

accounts:
  - platform: twitter
    username: your_target_handle
  - platform: hackernews
    username: your_username

vault_base: ~/Documents/Obsidian Vault/SocialFeed

fetch:
  count: 20

filters:
  min_text_length: 30
  skip_retweet_no_comment: true
  skip_link_only: true
  blocked_keywords: []

tagging:
  enabled: true
  keywords:
    AI / LLM / GPT / Claude: AI
    Python / JavaScript / Rust: coding

3. Run

python3 scripts/fetch_save.py --verbose

4. Check output

Content lands in vault_base/@username/ — one .md file per post, with Obsidian YAML frontmatter (platform, author, date, URL, likes, tags).
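
As an illustration of the output layout described above, the following sketch renders one post as a Markdown file with YAML frontmatter and writes it under vault_base/@username/. The post dictionary keys (id, author, text, and so on) and the use of the post id as the file name are assumptions made for this example; the actual script may name files and fields differently.

```python
from pathlib import Path


def render_post(post: dict) -> str:
    """Render a post as Markdown with Obsidian-style YAML frontmatter."""
    tags = ", ".join(post.get("tags", []))
    lines = [
        "---",
        f"platform: {post['platform']}",
        f"author: {post['author']}",
        f"date: {post['date']}",
        f"url: {post['url']}",
        f"likes: {post['likes']}",
        f"tags: [{tags}]",
        "---",
        "",
        post["text"],
        "",
    ]
    return "\n".join(lines)


def save_post(vault_base: Path, post: dict) -> Path:
    """Write one .md file per post under vault_base/@<author>/."""
    folder = vault_base / f"@{post['author']}"
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{post['id']}.md"  # hypothetical naming scheme
    path.write_text(render_post(post), encoding="utf-8")
    return path
```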

Config Reference

accounts

accounts:
  - platform: twitter
    username: dotey

platform: must match a bb-browser supported platform (see references/platforms.md)
username: the platform-native user identifier
Deduplication: platform + username must be unique within the list
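
The deduplication rule above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it treats platform and username as case-sensitive, which is an assumption, and keeps the first occurrence of each pair.

```python
def dedup_accounts(accounts: list[dict]) -> list[dict]:
    """Drop repeated (platform, username) pairs, keeping the first occurrence."""
    seen = set()
    unique = []
    for account in accounts:
        key = (account["platform"], account["username"])
        if key in seen:
            continue  # same platform + username already listed
        seen.add(key)
        unique.append(account)
    return unique
```

The same username on two different platforms is kept, since the key is the pair rather than the username alone.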

filters

| Field | Type | Default | Description |
|---|---|---|---|
| `min_text_length` | int | 30 | Skip posts below this character count |
| `skip_retweet_no_comment` | bool | true | Skip retweets with no original comment |
| `skip_link_only` | bool | true | Skip posts that are links/images with little text |
| `blocked_keywords` | list | [] | Skip posts containing any of these keywords |
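
One way the four filters could be combined is sketched below. The post fields is_retweet and comment are hypothetical, and "link only" is interpreted here as "nothing is left once URLs are removed"; the real script may use different fields and thresholds.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def should_skip(post: dict, filters: dict) -> bool:
    """Return True if the post should be dropped according to the filters."""
    text = post.get("text", "")

    # min_text_length: skip posts below this character count
    if len(text) < filters.get("min_text_length", 30):
        return True

    # skip_retweet_no_comment: retweets without an original comment
    if filters.get("skip_retweet_no_comment", True):
        if post.get("is_retweet") and not post.get("comment"):
            return True

    # skip_link_only: assumed to mean "no text remains after removing URLs"
    if filters.get("skip_link_only", True):
        if not URL_RE.sub("", text).strip():
            return True

    # blocked_keywords: case-insensitive substring match (an assumption)
    lowered = text.lower()
    return any(k.lower() in lowered for k in filters.get("blocked_keywords", []))
```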

tagging

Auto-tag posts by keyword matching. Matching is case-insensitive, and synonyms separated by / are treated as OR:

tagging:
  enabled: true
  keywords:
    AI / LLM / 大模型: AI
    skill / Skills: skill
    Python / JavaScript: coding
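
The keyword rules above could be applied roughly as follows. This is a sketch with hypothetical function names that uses plain case-insensitive substring matching; whether the real script matches substrings or whole words is not stated, and substring matching can over-match short keywords (for example "AI" inside "maintain").

```python
def parse_rules(keywords: dict) -> list[tuple[list[str], str]]:
    """Turn {"AI / LLM": "AI"} into [(["ai", "llm"], "AI")]."""
    rules = []
    for pattern, tag in keywords.items():
        synonyms = [s.strip().lower() for s in pattern.split("/") if s.strip()]
        rules.append((synonyms, tag))
    return rules


def tags_for(text: str, keywords: dict) -> list[str]:
    """Return the tags whose synonyms appear in the text (any synonym = match)."""
    lowered = text.lower()
    tags = []
    for synonyms, tag in parse_rules(keywords):
        if tag not in tags and any(s in lowered for s in synonyms):
            tags.append(tag)
    return tags
```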

fetch.count

fetch:
  count: 20  # default 20, max 100

twitter/tweets returns ~20 tweets, newest first, by default. For scheduled syncs, set count to 50–100 so that posts from high-frequency accounts are not missed between sync intervals.

Incremental Sync

state.json tracks the last-fetched timestamp per account. On re-run:

Skips posts with created_at ≤ last_fetch
Saves only new content
Updates last_fetch timestamp
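
The three steps above can be sketched like this. The state layout (a dict mapping "platform:username" to an ISO 8601 timestamp) and the choice to advance last_fetch to the newest saved post are assumptions for illustration; the actual state.json format may differ.

```python
from datetime import datetime, timezone


def sync_account(state: dict, key: str, posts: list[dict]) -> list[dict]:
    """Return only posts newer than the stored cursor and advance the cursor.

    `state` maps an account key to an ISO 8601 timestamp (hypothetical layout).
    Each post is expected to carry a timezone-aware `created_at` datetime.
    """
    raw = state.get(key)
    last_fetch = datetime.fromisoformat(raw) if raw else None

    # Skip posts with created_at <= last_fetch
    new_posts = [
        p for p in posts
        if last_fetch is None or p["created_at"] > last_fetch
    ]

    # Advance the cursor only when something new was seen
    if new_posts:
        newest = max(p["created_at"] for p in new_posts)
        state[key] = newest.isoformat()
    return new_posts
```

Running the function twice with the same posts returns an empty list the second time, which is what keeps repeated syncs free of duplicates.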

Missed-run compensation: if a cron job missed a run (e.g., machine was off), the next run will backfill content within catchup_window_days (default 3 days).

To force re-fetch an account: delete its entry in state.json or delete the corresponding .md files.

Scheduled Sync

To enable automatic sync, ask the agent:

"Sync every morning at 9am" or "Sync every Monday at 8am"

The agent will create a cron job that runs in isolated mode with incremental sync — no duplicates.

Troubleshooting

bb-browser: command not found

The script auto-detects the bb-browser path. If it still fails, confirm that the npm global bin directory is in your PATH, or install bb-browser via npm install -g bb-browser.

twitter/search returns webpack module error

Use twitter/tweets instead of twitter/search. This is a known bb-browser adapter compatibility issue.

Platform returns 401 Unauthorized

The OpenClaw browser needs to be logged into that platform. Open the site manually in the browser, log in once, then retry.

File already exists but want to re-fetch

Delete the corresponding entry in state.json or delete the .md files for that account.

Usage Guidance

This skill appears to do what it says: it runs bb-browser to fetch posts, filters/tags them, and writes .md files into an Obsidian vault. Before installing or running it:

(1) Be prepared to install bb-browser (npm global) and ensure Node.js is acceptable on your machine.
(2) Review and set vault_base in config.yaml so files go where you expect.
(3) Run with --dry-run/--verbose first to observe behavior.
(4) The skill’s scheduled sync flow will create cron jobs if the agent follows SKILL.md. Only allow that if you want automatic system-level cron modifications.
(5) Be aware that bb-browser --openclaw reuses the OpenClaw browser session (cookies/logins). If you do not want adapters to access logged-in sessions (bookmarks/notifications/private feeds), avoid reusing the browser session or log out of those sites first.

Finally, inspect the config and the .claw-social-feed-state.json after a run so you understand what was fetched and when.

Capability Analysis

Type: OpenClaw Skill
Name: claw-social-feed
Version: 0.1.2

The skill bundle is a functional tool designed to fetch social media content and save it to an Obsidian vault. The core logic in `scripts/fetch_save.py` uses the `bb-browser` utility to retrieve data and performs local file operations and filtering that are entirely consistent with its stated purpose. While it uses `subprocess.run` and interacts with the filesystem, it does so safely (passing arguments as a list to prevent shell injection) and lacks any indicators of malicious intent, such as data exfiltration, credential theft, or unauthorized remote access.

Capability Assessment

✓ Purpose & Capability

Name/description, config.yaml, platforms.md and scripts/fetch_save.py all align: the skill calls bb-browser to fetch posts, filters/tags them, and writes markdown files to an Obsidian vault. There are no unrelated credentials or unexplained binaries requested.

ℹ Instruction Scope

SKILL.md directs installing and using bb-browser and instructs the agent to create a scheduled sync (cron job). The code itself reads/writes config.yaml, a local state file, and writes files into a user-specified vault path — all expected for this purpose. Two things to note: (1) SKILL.md promises the agent will create cron jobs (system scheduling changes) — that is beyond mere file I/O and should be consented to, and (2) the workflow relies on bb-browser --openclaw to reuse the OpenClaw browser session, which means any logged-in session/data in that browser could be used by bb-browser adapters.

ℹ Install Mechanism

The repository contains no automated install spec; the instructions ask the user to install bb-browser via npm (global). This is a typical approach but requires Node/npm and a global install; there is no opaque download or embedded binary. No files in the skill perform arbitrary remote downloads during install.

✓ Credentials

The skill declares no required environment variables or credentials and the Python script does not read secrets from env vars. It does probe common paths (home/.nvm, /usr/local/bin) to locate bb-browser and will read/write files under the user's home directory (config.yaml, state file, and the specified Obsidian vault). These accesses are proportionate to the stated purpose, but because bb-browser uses the OpenClaw browser session, it may access any web sessions/cookies present in that browser — that is a functional requirement but a privacy consideration.

ℹ Persistence & Privilege

always is false (normal). However SKILL.md states the agent will create cron jobs to enable scheduled syncs. Creating/modifying crontab entries is a system-level action outside the script itself; users should explicitly approve such changes. The skill does not claim or request permanent privileged presence beyond that.

Version History

v0.1.2

claw-social-feed 0.1.2
- Updated skill description and documentation to English for broader accessibility.
- Expanded platform list to explicitly include HackerNews, Bilibili, Weibo, Xiaohongshu, and 30+ others via bb-browser.
- Simplified setup instructions and configuration examples for clarity.
- Improved config reference, workflow overview, and troubleshooting sections.
- No code or logic changes in this version; documentation only.

v0.1.1

- Improved troubleshooting instructions by clarifying bb-browser path detection and setup steps.
- No functional changes to the core logic or features.

v0.1.0

- Initial release of claw-social-feed.
- Fetch and save social media timelines from 36 supported platforms (via bb-browser) into Obsidian vaults.
- Supports incremental sync, content filtering, auto tagging, and scheduled sync.
- Easy configuration via config.yaml, supporting account lists, custom filters, and tag rules.
- Each post saved as an individual Markdown file with Obsidian YAML frontmatter.
- Automatic compensation for missed cron runs and manual re-fetch controls provided.

Metadata

Slug claw-social-feed

Version 0.1.2

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 3

Frequently Asked Questions

What is Website Content Scraped into Obsidian?

Fetch social media content and save to Obsidian. Supports Twitter/X, Reddit, GitHub, HackerNews, Bilibili, Weibo, Xiaohongshu and 30+ platforms via bb-browse... It is an AI Agent Skill for Claude Code / OpenClaw, with 142 downloads so far.

How do I install Website Content Scraped into Obsidian?

Run "/install claw-social-feed" in the OpenClaw or Claude Code chat to install the skill in one step. Fetching content additionally requires bb-browser, which is installed separately via npm (see Quick Start).

Is Website Content Scraped into Obsidian free?

Yes, Website Content Scraped into Obsidian is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Website Content Scraped into Obsidian support?

Website Content Scraped into Obsidian runs anywhere OpenClaw / Claude Code is available. As content sources, it supports 36 platforms via bb-browser adapters (see references/platforms.md).

Who created Website Content Scraped into Obsidian?

It is built and maintained by Glassmarbles (@glassmarbles); the current version is v0.1.2.
