Website Content Scraped into Obsidian
/install claw-social-feed
claw-social-feed
Fetch social media timelines into Obsidian vaults. Multi-platform, incremental sync, smart filtering, auto-tagging.
Core dependency: bb-browser (via --openclaw flag to reuse the OpenClaw browser session). Supports 36 platforms via bb-browser adapters — see references/platforms.md.
Workflow
User config (config.yaml)
│
▼
fetch_save.py
│
├── Dedup accounts
├── Read state.json (last fetch cursor)
│
▼
bb-browser site <platform>/<cmd> --openclaw --json
│
▼
Filter → Tag → Write to Obsidian
│
▼
Update state.json
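The bb-browser step in the diagram above can be sketched as a thin subprocess wrapper. This is a minimal illustration, not the actual contents of fetch_save.py: the helper names build_command and fetch_raw are hypothetical, and it assumes that --json makes the CLI print a single JSON document to stdout. Per-account arguments such as the username are left out because the source does not show how they are passed.

```python
import json
import subprocess


def build_command(platform: str, cmd: str) -> list[str]:
    """Build the argv for one bb-browser call, mirroring the workflow diagram."""
    return ["bb-browser", "site", f"{platform}/{cmd}", "--openclaw", "--json"]


def fetch_raw(platform: str, cmd: str, timeout: int = 120):
    """Run bb-browser and parse its stdout as JSON.

    Assumption: with --json the CLI writes one JSON document to stdout.
    """
    result = subprocess.run(
        build_command(platform, cmd),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the command as a list (rather than a shell string) avoids quoting problems if a platform or command name ever contains unusual characters.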
Quick Start
1. Install bb-browser
# Requires Node.js 18+
npm install -g bb-browser
# Verify
bb-browser --version
2. Configure accounts
Edit config.yaml:
accounts:
  - platform: twitter
    username: your_target_handle
  - platform: hackernews
    username: your_username
vault_base: ~/Documents/Obsidian Vault/SocialFeed
fetch:
  count: 20
filters:
  min_text_length: 30
  skip_retweet_no_comment: true
  skip_link_only: true
  blocked_keywords: []
tagging:
  enabled: true
  keywords:
    AI / LLM / GPT / Claude: AI
    Python / JavaScript / Rust: coding
3. Run
python3 scripts/fetch_save.py --verbose
4. Check output
Content lands in vault_base/@username/ — one .md file per post, with Obsidian YAML frontmatter (platform, author, date, URL, likes, tags).
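As an illustration of the per-post output described above, the following sketch renders one post as Markdown with YAML frontmatter. The function name render_note, the lowercase key names, and the text field are assumptions; the skill's real field names and layout may differ.

```python
import json


def render_note(post: dict) -> str:
    """Render one post as Markdown with YAML frontmatter (hypothetical layout)."""
    lines = ["---"]
    for key in ("platform", "author", "date", "url", "likes"):
        # json.dumps yields a quoted, escaped scalar that is also valid YAML
        lines.append(f"{key}: {json.dumps(post[key], ensure_ascii=False)}")
    # A JSON array is a valid YAML flow sequence, e.g. tags: ["AI", "coding"]
    lines.append("tags: " + json.dumps(post.get("tags", []), ensure_ascii=False))
    lines.append("---")
    lines.append("")
    lines.append(post["text"])
    return "\n".join(lines) + "\n"
```

Serializing values with json.dumps sidesteps manual YAML escaping for titles or URLs that contain colons or quotes.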
Config Reference
accounts
accounts:
  - platform: twitter
    username: dotey
- platform: must match a bb-browser supported platform (see references/platforms.md)
- username: the platform-native user identifier
- Deduplication: platform + username must be unique within the list
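The uniqueness rule on platform + username could be enforced along these lines. This is a sketch only: dedup_accounts is a hypothetical helper, and it assumes exact, case-sensitive matching, which the source does not specify.

```python
def dedup_accounts(accounts: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (platform, username) pair, in order."""
    seen = set()
    unique = []
    for account in accounts:
        key = (account["platform"], account["username"])
        if key in seen:
            continue  # duplicate entry, already scheduled for fetching
        seen.add(key)
        unique.append(account)
    return unique
```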
filters
| Field | Type | Default | Description |
|---|---|---|---|
| min_text_length | int | 30 | Skip posts below this character count |
| skip_retweet_no_comment | bool | true | Skip retweets with no original comment |
| skip_link_only | bool | true | Skip posts that are links/images with little text |
| blocked_keywords | list | [] | Skip posts containing any of these keywords |
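To make the four filter fields concrete, here is one possible reading of them as a single predicate. Several details are assumptions rather than documented behavior: the post fields text, is_retweet, and comment are hypothetical, "link-only" is interpreted as "nothing remains once URLs are removed", and keyword matching is treated as case-insensitive.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def should_keep(post: dict, filters: dict) -> bool:
    """Return True if the post passes every enabled filter."""
    text = post.get("text", "")

    if len(text) < filters.get("min_text_length", 30):
        return False

    # Assumed fields: is_retweet (bool) and comment (the user's own remark)
    if filters.get("skip_retweet_no_comment", True):
        if post.get("is_retweet") and not post.get("comment", "").strip():
            return False

    # Assumed meaning of "link-only": no text is left after stripping URLs
    if filters.get("skip_link_only", True):
        if not URL_RE.sub("", text).strip():
            return False

    lowered = text.lower()
    for keyword in filters.get("blocked_keywords", []):
        if keyword.lower() in lowered:
            return False

    return True
```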
tagging
Auto-tag based on keyword matching (case-insensitive, / separated synonyms = OR):
tagging:
  enabled: true
  keywords:
    AI / LLM / 大模型: AI
    skill / Skills: skill
    Python / JavaScript: coding
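The matching rule (case-insensitive, with / separating synonyms that are OR-ed together) might look like the sketch below. The helper names are hypothetical, and plain substring matching is an assumption: under it a short keyword such as AI would also match inside longer words, so the real implementation may well use word boundaries instead.

```python
def build_tag_rules(keywords: dict[str, str]) -> list[tuple[list[str], str]]:
    """Split each 'a / b / c' key into lowercase synonyms paired with its tag."""
    rules = []
    for pattern, tag in keywords.items():
        synonyms = [s.strip().lower() for s in pattern.split("/") if s.strip()]
        rules.append((synonyms, tag))
    return rules


def tag_post(text: str, rules: list[tuple[list[str], str]]) -> list[str]:
    """Return the tags whose synonym list has at least one match in the text."""
    lowered = text.lower()
    tags = []
    for synonyms, tag in rules:
        # any() implements the OR across '/'-separated synonyms
        if tag not in tags and any(s in lowered for s in synonyms):
            tags.append(tag)
    return tags
```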
fetch.count
fetch:
  count: 20  # default 20, max 100
twitter/tweets returns ~20 tweets newest-first by default. For scheduled syncs, set to 50–100 to avoid missing posts from high-frequency accounts between sync intervals.
Incremental Sync
state.json tracks the last-fetched timestamp per account. On re-run:
- Skips posts with created_at ≤ last_fetch
- Saves only new content
- Updates last_fetch timestamp
Missed-run compensation: if a cron job missed a run (e.g., machine was off), the next run will backfill content within catchup_window_days (default 3 days).
To force re-fetch an account: delete its entry in state.json or delete the corresponding .md files.
Scheduled Sync
To enable automatic sync, ask the agent:
"Sync every morning at 9am" or "Sync every Monday at 8am"
The agent will create a cron job that runs in isolated mode with incremental sync — no duplicates.
Troubleshooting
bb-browser: command not found
The script auto-detects the bb-browser executable on your PATH. If it still fails, confirm that the npm global bin directory is in your PATH, or install via npm install -g bb-browser.
twitter/search returns webpack module error
Use twitter/tweets instead of twitter/search. This is a known bb-browser adapter compatibility issue.
Platform returns 401 Unauthorized
The OpenClaw browser needs to be logged into that platform. Open the site manually in the browser, log in once, then retry.
File already exists but want to re-fetch
Delete the corresponding entry in state.json or delete the .md files for that account.
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install claw-social-feed
- After installation, invoke the skill by name or use /claw-social-feed
- Provide required inputs per the skill's parameter spec and get structured output
What is Website Content Scraped into Obsidian?
Fetch social media content and save to Obsidian. Supports Twitter/X, Reddit, GitHub, HackerNews, Bilibili, Weibo, Xiaohongshu and 30+ platforms via bb-browse... It is an AI Agent Skill for Claude Code / OpenClaw, with 142 downloads so far.
How do I install Website Content Scraped into Obsidian?
Run "/install claw-social-feed" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Website Content Scraped into Obsidian free?
Yes, Website Content Scraped into Obsidian is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Website Content Scraped into Obsidian support?
Website Content Scraped into Obsidian is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Website Content Scraped into Obsidian?
It is built and maintained by Glassmarbles (@glassmarbles); the current version is v0.1.2.