
Kb Collector

Author: arbiger · GitHub · v1.2.1
cross-platform ⚠ suspicious
Total downloads: 455
Favorites: 0
Current installs: 1
Versions: 4
Install in OpenClaw
/install kb-collector
Description
Knowledge Base Collector - save YouTube, URLs, text to Obsidian with AI summarization. Auto-transcribes videos, fetches pages, supports weekly/monthly digest...
Usage instructions (SKILL.md)

KB Collector

Knowledge Base Collector - Save YouTube, URLs, and text to Obsidian with automatic transcription and summarization.

Features

  • YouTube Collection - Download audio, transcribe with Whisper, auto-summarize
  • URL Collection - Fetch and summarize web pages
  • Plain Text - Direct save with tags
  • Digest - Weekly/Monthly/Yearly review emails
  • Nightly Research - Automated AI/LLM/tech trend tracking

Installation

# Install dependencies
pip install yt-dlp faster-whisper requests beautifulsoup4

# For AI summarization (optional)
pip install openai anthropic

Usage (Python Version - Recommended)

# Collect YouTube video
python3 scripts/collect.py youtube "https://youtu.be/xxxxx" "stock,investing"

# Collect URL
python3 scripts/collect.py url "https://example.com/article" "python,api"

# Collect plain text
python3 scripts/collect.py text "My note content" "tag1,tag2"
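
The Python commands above take three positional arguments: a mode, the content, and a comma-separated tag list. A minimal sketch of how such a command line could be parsed follows. The names `build_parser` and `parse_tags` are hypothetical and are not taken from scripts/collect.py, whose actual parsing may differ.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring: collect.py <mode> <content> [tags]."""
    parser = argparse.ArgumentParser(prog="collect.py")
    parser.add_argument("mode", choices=["youtube", "url", "text"])
    parser.add_argument("content", help="video URL, page URL, or note text")
    # Tags are optional here; that is an assumption, not documented behavior.
    parser.add_argument("tags", nargs="?", default="")
    return parser


def parse_tags(raw: str) -> List[str]:
    """Split 'a, b,,a' into ['a', 'b']: trim, drop empties and duplicates."""
    tags: List[str] = []
    for part in raw.split(","):
        tag = part.strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


args = build_parser().parse_args(
    ["youtube", "https://youtu.be/xxxxx", "stock,investing"]
)
tags = parse_tags(args.tags)
```

Note that the legacy Bash version below uses a different argument order (content, tags, mode), so the two command lines are not interchangeable.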

Usage (Bash Version - Legacy)

# Collect YouTube
./scripts/collect.sh "https://youtu.be/xxxxx" "stock,investing" youtube

# Collect URL
./scripts/collect.sh "https://example.com/article" "python,api" url

# Collect plain text
./scripts/collect.sh "My note" "tag1,tag2" text

Nightly Research (New!)

Automated AI/LLM/tech trend tracking - runs daily and saves to Obsidian.

# Save to Obsidian only
./scripts/nightly-research.sh --save

# Save to Obsidian AND send email
./scripts/nightly-research.sh --save --send

# Send email only
./scripts/nightly-research.sh --send

Features

  • Searches multiple sources (Hacker News, Reddit, Twitter)
  • LLM summarization (optional)
  • Saves to Obsidian with tags
  • Optional email digest

Cron Setup (optional)

# Run every night at 10 PM
0 22 * * * /path/to/nightly-research.sh --save --send

Configuration

Edit the script to customize:

VAULT_PATH = os.path.expanduser("~/Documents/YourVault")
NOTE_AUTHOR = "YourName"

Output Format

Notes saved to: {VAULT_PATH}/yyyy-mm-dd-title.md

---
created: 2026-03-03T12:00:00
source: https://...
tags: [stock, investing]
author: George
---

# Title

> **TLDR:** Summary here...

---

Content...

---
*Saved: 2026-03-03*
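
The note layout above can be reproduced with a few lines of string formatting. The following is only a sketch: `slugify`, `note_filename`, and `render_note` are hypothetical helpers, and the way the real scripts turn a title into a filename is an assumption (the source only states the `yyyy-mm-dd-title.md` pattern).

```python
import re
from datetime import datetime
from typing import List


def slugify(title: str) -> str:
    """Assumed filename rule: lowercase, non-alphanumerics become hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def note_filename(title: str, created: datetime) -> str:
    """Build a name following the documented yyyy-mm-dd-title.md pattern."""
    return f"{created:%Y-%m-%d}-{slugify(title)}.md"


def render_note(title: str, summary: str, content: str, source: str,
                tags: List[str], author: str, created: datetime) -> str:
    """Render frontmatter, TLDR line, body, and footer as documented above."""
    lines = [
        "---",
        f"created: {created.isoformat(timespec='seconds')}",
        f"source: {source}",
        f"tags: [{', '.join(tags)}]",
        f"author: {author}",
        "---",
        "",
        f"# {title}",
        "",
        f"> **TLDR:** {summary}",
        "",
        "---",
        "",
        content,
        "",
        "---",
        f"*Saved: {created:%Y-%m-%d}*",
        "",
    ]
    return "\n".join(lines)


created = datetime(2026, 3, 3, 12, 0, 0)
note = render_note("My First Note!", "Summary here...", "Content...",
                   "https://example.com/article", ["stock", "investing"],
                   "YourName", created)
```

Writing `note` to `VAULT_PATH / note_filename(...)` would then produce a file in the documented location.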

Dependencies

  • yt-dlp
  • faster-whisper (for transcription)
  • requests + beautifulsoup4 (for URL fetching)
  • Optional: openai/anthropic (for AI summarization)

Credits

Automated note-taking workflow for Obsidian.

Security recommendations
What to check before installing or using this skill:

  • Expectation mismatch: The registry/metadata declare no env vars, but the scripts read TAVILY_API_KEY, OBSIDIAN_VAULT, and RECIPIENT. Confirm whether you should provide any API keys and where those keys will be used.
  • Vault path & recipient: The Python and shell scripts default to a specific user's vault path (/Users/george/... or ~/Documents/Georges/Knowledge). Edit VAULT_PATH/VAULT/OBSIDIAN_VAULT to point to your own vault before running, and replace the hard-coded RECIPIENT ([email protected]) with your address or remove email sending if undesired.
  • External network and email: nightly-research.sh contacts https://api.tavily.com, and digest/nightly can send mail via 'gog gmail send'. If you enable those features, you will send search queries and possibly note content to external services. Only set TAVILY_API_KEY if you trust Tavily and understand their data use. Inspect and verify how 'gog' is configured for Gmail on your machine; it may reuse stored credentials to send mail.
  • Data exfil channels: The main exfil vectors here are (1) posting queries/results to Tavily, and (2) sending digests via the 'gog' tool. There is no obfuscated code or hidden endpoints, but these channels can leak note contents if misconfigured.
  • Run in a safe environment first: Execute scripts in a sandbox or a test account/vault, with no sensitive notes present. Replace hard-coded values, and run with network disabled if you want only local behavior.
  • Dependency hygiene: The SKILL.md asks you to pip install yt-dlp, faster-whisper, etc. Those packages and the external binaries (yt-dlp, whisper, gog) will run code on your machine. Install them from official sources and review their own security considerations.
  • Ask the author / request metadata: The skill lacks homepage/author contact and doesn't declare env vars. If you plan to use it long-term, ask the publisher to add explicit docs for required credentials and configurable defaults (vault path, recipient), or update the skill to avoid hard-coded user-specific paths and recipients.

If you are uncomfortable with network calls or automatic email sending, either remove/disable those parts of the scripts or decline to install. If you proceed, make the environment variables explicit and verify behavior with small, non-sensitive test data first.
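
One way to "make the environment variables explicit" is to read them in a single place and refuse to run when the vault path is missing, rather than falling back to a hard-coded default. The sketch below is illustrative only: `Settings` and `load_settings` are hypothetical names, not part of the skill, and treating RECIPIENT and TAVILY_API_KEY as optional is an assumption.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    vault: Path                    # from OBSIDIAN_VAULT (required here)
    recipient: Optional[str]       # from RECIPIENT; None disables email
    tavily_api_key: Optional[str]  # from TAVILY_API_KEY; None disables search


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three variables named in the review; never use a default path."""
    vault_raw = env.get("OBSIDIAN_VAULT", "").strip()
    if not vault_raw:
        raise ValueError(
            "OBSIDIAN_VAULT is not set; refusing to fall back to a hard-coded path"
        )
    recipient = env.get("RECIPIENT", "").strip() or None
    api_key = env.get("TAVILY_API_KEY", "").strip() or None
    return Settings(Path(vault_raw).expanduser(), recipient, api_key)


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"OBSIDIAN_VAULT": "/tmp/test-vault"})
```

With this shape, email sending and Tavily searches stay off unless the corresponding variable is set deliberately.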
Functional analysis
Type: OpenClaw Skill
Name: kb-collector
Version: 1.2.1

The skill bundle contains hardcoded configurations that pose a significant privacy and data exfiltration risk, most notably a fixed email recipient ('[email protected]') in 'scripts/digest.sh' that would send note summaries to a third party by default. Additionally, 'scripts/collect.sh' and 'scripts/collect.py' utilize hardcoded local file paths and include unused or undefined dependencies such as 'yfinance' and 'web_fetch'. While these appear to be artifacts of a personal workflow shared without proper sanitization, the hardcoded transmission of user data to an external domain is a high-risk behavior.
Capability assessment
Purpose & Capability
Name/description align with the included scripts: downloading YouTube audio, transcribing, fetching pages, saving to an Obsidian vault, and generating digests/nightly research. Some minor mismatches: SKILL.md claims it 'searches multiple sources (Hacker News, Reddit, Twitter)', but nightly-research.sh performs searches via the Tavily API only (it does not independently query those sites). Overall capabilities match the stated purpose.
Instruction Scope
SKILL.md and scripts instruct the agent to fetch remote web pages and call external services (Tavily API via curl, and send email via the 'gog gmail send' tool). The scripts also write files into an Obsidian vault path and remove temporary audio files. The SKILL metadata declared no required env vars, but the scripts read/expect environment variables (TAVILY_API_KEY, OBSIDIAN_VAULT, RECIPIENT) and use a hard-coded email recipient and hard-coded vault paths (/Users/george/... and ~/Documents/Georges/Knowledge). These runtime actions (external API calls and email sending) are outside the declared requirements and should be explicitly disclosed.
Install Mechanism
There is no formal install spec in the registry (instruction-only), which is lower risk from an automatic installer perspective. The SKILL.md tells the user to pip install packages (yt-dlp, faster-whisper, requests, beautifulsoup4, optional openai/anthropic). That is consistent with the code. No downloads from arbitrary URLs or archive extraction are present. Because the code relies on external binaries (yt-dlp, whisper) and a third-party CLI tool 'gog', the user must install those manually — the absence of an install spec means the skill won't auto-install them but the runtime will fail or behave unexpectedly if they are missing.
Credentials
Registry lists no required environment variables or credentials, yet scripts make use of environment vars: TAVILY_API_KEY (sent to api.tavily.com), OBSIDIAN_VAULT (overrides VAULT), and RECIPIENT. digest.sh and nightly-research.sh also use or assume the presence of an email-sending tool ('gog') which uses credentials not declared here. The scripts also include hard-coded local paths and a hard-coded recipient email ([email protected]). Asking for or using API keys and email-sending capabilities is proportionate to the feature set — but they should be declared, and the hard-coded recipient is suspicious/unexpected behavior that could lead to unintended data exfiltration.
Persistence & Privilege
The skill does not request permanent platform-level privileges (always: false) and does not modify other skills' configuration. It writes notes to a user-visible Obsidian vault and temporary files in /tmp, which is expected given its purpose. Autonomous invocation is allowed (default) — combined with the environment/credential concerns above this increases potential impact, but the skill alone does not request 'always' or system-wide config changes.
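How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install kb-collector
  3. Once installation finishes, invoke the skill by name or trigger it with /kb-collector
  4. Provide the inputs described in the skill's parameter documentation to get structured output.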
Version history
v1.2.1
  • Minor update to nightly-research.sh script.
  • No user-facing changes to documentation or features.
  • Maintains all existing functionality for nightly research automation.
v1.2.0
  • Added nightly research automation script for AI/LLM/tech trend tracking, with options to save notes to Obsidian and send email digests.
  • Updated SKILL.md with documentation for the new nightly research feature, including usage instructions and cron setup.
  • Added package-lock.json for improved dependency management.
  • Enhanced digest script and overall skill documentation for clarity and new functionality.
v1.1.0
Python-based collector script added, improved documentation and options.
  • Added Python script (scripts/collect.py) for collecting YouTube, URLs, and text to Obsidian
  • SKILL.md updated: clearer instructions, Python usage recommended, improved formatting, new install/usage/configuration sections
  • Dependency list expanded (yt-dlp, faster-whisper, requests, beautifulsoup4; openai/anthropic optional)
  • Bash script instructions moved to a "legacy" section
  • Digest and summarization features mentioned but unchanged in function
  • Output note format documentation updated
v1.0.0
Version 1.0.0 of kb-collector
  • Initial release of Knowledge Base Collector.
  • Save articles, plain text, and YouTube videos to Obsidian vaults.
  • Auto-transcribe YouTube videos using Whisper and summarize content.
  • Fetch and summarize web page URLs; auto-tag content.
  • Supports digest emails (weekly, monthly, yearly) sent via Gmail.
  • Simple trigger commands for collecting and reviewing content.
Metadata
Slug kb-collector
Version 1.2.1
License
Total installs 2
Current installs 1
Version count 4
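FAQ

What is Kb Collector?

Knowledge Base Collector - save YouTube, URLs, text to Obsidian with AI summarization. Auto-transcribes videos, fetches pages, supports weekly/monthly digest... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 455 total downloads to date.

How do I install Kb Collector?

Run the command "/install kb-collector" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Kb Collector free?

Yes. Kb Collector is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Kb Collector support?

Kb Collector is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Kb Collector?

It is developed and maintained by arbiger (@arbiger); the current version is v1.2.1.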
