Description

Personal knowledge base that captures web content (articles, tweets/threads, videos, podcasts, images, PDFs) and makes it retrievable for future conversation...

README (SKILL.md)

Link Library — Personal Content Knowledge Base

Name: Link Library
Author: nowhitestar

Save web content with full original text, generate summaries and tags, retrieve semantically.

Core Rules

Always save original full text — summaries are for retrieval, originals are for re-reading
Detect interest, don't demand commands — if user engages with a link, offer to save
Twitter/X is first-class — tweets, threads, and articles are fully supported

Interest Detection

When user shares a link, evaluate interest signals:

Auto-save (no confirmation needed):

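User explicitly says "save", "bookmark", "note this down" (记一下), or "put it in the knowledge base" (放进知识库)
User asks "summarize this for me" (帮我总结一下); asking for a summary implies the link is save-worthy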

Offer to save (ask once):

User shares link + positive commentary ("this one is good" / 这篇不错, "interesting" / 有意思, "learned something" / 学到了)
User asks follow-up questions about link content
User discusses link content substantively

Don't save:

User shares link just for quick reference in conversation
User says "no need to save" (不用保存) or similar
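The three tiers above can be sketched as a small classifier. This is a minimal illustration, not part of the skill: the `Action` enum, the `classify_interest` function, and the English phrase lists are hypothetical, and plain substring matching is a deliberate simplification (a real agent would judge intent from the whole conversation).

```python
from enum import Enum


class Action(Enum):
    AUTO_SAVE = "auto_save"
    OFFER = "offer"
    SKIP = "skip"


# Hypothetical phrase lists: the Chinese phrases come from the rules above,
# the English ones are illustrative additions.
DECLINE = ("不用保存", "don't save", "no need to save")
EXPLICIT = ("save", "bookmark", "记一下", "放进知识库", "帮我总结一下", "summarize")
POSITIVE = ("这篇不错", "有意思", "学到了", "interesting", "great read")


def classify_interest(message: str, asked_follow_up: bool = False) -> Action:
    """Map a message that accompanies a link to a save decision."""
    text = message.lower()
    # A refusal wins over everything else, including the word "save" inside it.
    if any(p in text for p in DECLINE):
        return Action.SKIP
    if any(p in text for p in EXPLICIT):
        return Action.AUTO_SAVE
    if asked_follow_up or any(p in text for p in POSITIVE):
        return Action.OFFER
    # A bare link shared for quick reference is not saved.
    return Action.SKIP
```

Checking the refusal first matters because "no need to save" also contains the explicit trigger word "save".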

Data Location

All entries in ~/.openclaw/workspace-main/library/:

library/
├── articles/     # Web articles, blog posts, WeChat, Zhihu
├── tweets/       # Twitter/X posts and threads
├── videos/       # YouTube, Bilibili
├── podcasts/     # Podcast episodes
├── papers/       # Academic papers, PDFs
├── images/       # Infographics, visual content
└── misc/         # Everything else

Content Types & Fetch Methods

| Type | URL Patterns | Fetch Method | Template |
| --- | --- | --- | --- |
| article | Generic web, blog, /post/ | `web_fetch` or `curl -s "https://r.jina.ai/URL"` | `article.md` |
| wechat | mp.weixin.qq.com | `cd ~/.agent-reach/tools/wechat-article-for-ai && python3 main.py "URL"` | `article.md` |
| tweet | x.com, twitter.com /status/ | `xreach tweet URL --json` | `tweet.md` |
| thread | x.com, twitter.com (thread) | `xreach thread URL --json` | `tweet.md` |
| video | youtube.com, youtu.be | `yt-dlp --dump-json "URL"` + subtitle extraction | `video.md` |
| bilibili | bilibili.com | `yt-dlp --dump-json "URL"` + subtitle extraction | `video.md` |
| paper | arxiv.org, .pdf links | `web_fetch` or browser | `paper.md` |
| podcast | Podcast platforms | `web_fetch` metadata | `podcast.md` |
| image | Image URLs | Download + describe | `image.md` |
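As a rough sketch of how the URL patterns in the table above could be matched, the hypothetical `classify_url` helper below maps a URL to a content type. The image extension list is an assumption, podcasts are left out because the table names no specific platforms, and telling a thread from a single tweet is assumed to need the fetched data rather than the URL alone.

```python
from urllib.parse import urlparse


def classify_url(url: str) -> str:
    """Return the content type from the table above for a given URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()

    def on(domain: str) -> bool:
        # Match the domain itself or any subdomain (www., m., mobile.).
        return host == domain or host.endswith("." + domain)

    if on("mp.weixin.qq.com"):
        return "wechat"
    if (on("x.com") or on("twitter.com")) and "/status/" in path:
        return "tweet"  # thread vs. single tweet is not visible in the URL
    if on("youtube.com") or on("youtu.be"):
        return "video"
    if on("bilibili.com"):
        return "bilibili"
    if on("arxiv.org") or path.endswith(".pdf"):
        return "paper"
    if path.endswith((".png", ".jpg", ".jpeg", ".gif", ".webp")):
        return "image"
    # Everything else falls back to the generic article fetch.
    return "article"
```

Matching on the parsed hostname rather than on a substring keeps a host such as netflix.com from being mistaken for x.com.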

Twitter/X Fetch Details

# Single tweet
xreach tweet URL_OR_ID --json

# Full thread
xreach thread URL_OR_ID --json

# User timeline (for context)
xreach tweets @username -n 20 --json

Extract from JSON: full_text, user.screen_name, created_at, entities, media URLs. For threads: concatenate all tweets in order as full content.
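A minimal sketch of the thread concatenation step follows. The payload shape is an assumption, not documented `xreach` output: a JSON list of tweet objects (or a single object), each carrying `full_text`, `created_at`, and `user.screen_name` as named above. The `thread_to_text` helper is hypothetical.

```python
import json


def thread_to_text(raw_json: str) -> dict:
    """Join a thread's tweets, in the order given, into one body of text."""
    tweets = json.loads(raw_json)
    if isinstance(tweets, dict):  # a single tweet rather than a list
        tweets = [tweets]
    if not tweets:
        raise ValueError("no tweets in payload")
    first = tweets[0]
    return {
        "author": "@" + first["user"]["screen_name"],
        "date_published": first["created_at"],
        # Blank lines between tweets keep the thread readable as one text.
        "content": "\n\n".join(t["full_text"].strip() for t in tweets),
    }
```

The tweets are not re-sorted; the sketch assumes the fetch tool already returns them in thread order.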

Video Subtitle Extraction

# Download subtitles
yt-dlp --write-sub --write-auto-sub --sub-lang "zh-Hans,zh,en" \
  --convert-subs vtt --skip-download -o "/tmp/%(id)s" "URL"
# Then read the .vtt file as transcript
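Reading the .vtt file as a transcript usually means dropping the header, cue timings, and inline tags. The hypothetical `vtt_to_transcript` helper below is one way to do that with the standard library. Collapsing consecutive duplicate lines is an assumption about auto-generated captions, which often repeat the previous line in the next cue.

```python
import re

# A cue timing row such as "00:00:01.000 --> 00:00:04.000" (hours optional).
TIMING = re.compile(r"^\d{2,}:\d{2}(:\d{2})?\.\d{3}\s+-->\s+")
# Inline tags such as <c>, </c> or <00:00:01.500>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_transcript(vtt: str) -> str:
    """Strip WebVTT headers, cue timings and tags; drop consecutive repeats."""
    out: list[str] = []
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n")):
        rows = block.strip().split("\n")
        # Blocks without a timing row (header, NOTE, STYLE) are skipped.
        idx = next((i for i, r in enumerate(rows) if TIMING.match(r.strip())), None)
        if idx is None:
            continue
        for row in rows[idx + 1:]:
            text = TAG.sub("", row).strip()
            if text and (not out or out[-1] != text):
                out.append(text)
    return "\n".join(out)
```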

Entry Structure

Every entry has two parts:

1. YAML Frontmatter (structured metadata)

title: "..."
source: "..."           # Platform/domain
url: "..."              # Original URL
author: "..."           # Author or @handle
date_published: "..."   # When content was created
date_saved: "..."       # When we saved it
last_updated: "..."     # Last modification
type: article|tweet|video|podcast|paper|image
tags: [tag1, tag2, ...]
status: unread|read|reviewed
priority: low|normal|high
related: []             # Paths to related entries

2. Markdown Body (content)

# {title}

## Summary
2-3 sentence summary.

## Key Points
- Point 1
- Point 2

## Original Content
THE FULL ORIGINAL TEXT — not truncated, not summarized.
This is the authoritative source for re-reading and quoting.

## Quotes
> Notable quotes worth highlighting

## Notes
Personal observations, connections, action items.

## Related
- [[library/tweets/related-tweet]]
- [[library/articles/related-article]]

⚠️ MANDATORY: Always save original full text in "Original Content" section. Summaries and key points are for quick retrieval. The original text is for accurate re-reading and quoting. Never skip saving the full content.
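The two-part structure can be sketched as a renderer that refuses to write an entry without its original text. Everything here is illustrative: `render_entry` is a hypothetical helper, the `---` delimiters and the default `status` and `priority` values are assumptions, and the Quotes, Notes, and Related sections are omitted for brevity.

```python
import json
from datetime import date


def render_entry(meta: dict, summary: str, key_points: list[str], original: str) -> str:
    """Render frontmatter plus the main body sections described above."""
    if not original.strip():
        # The rules above make the full original text mandatory.
        raise ValueError("original content is required")
    today = date.today().isoformat()
    # Assumed defaults; anything passed in meta overrides them.
    fields = {"date_saved": today, "last_updated": today,
              "status": "unread", "priority": "normal", "related": [], **meta}
    # JSON strings and lists are valid YAML flow values, so json.dumps handles quoting.
    front = "\n".join(f"{k}: {json.dumps(v, ensure_ascii=False)}" for k, v in fields.items())
    points = "\n".join(f"- {p}" for p in key_points)
    return (
        f"---\n{front}\n---\n\n# {meta['title']}\n\n"
        f"## Summary\n{summary}\n\n## Key Points\n{points}\n\n"
        f"## Original Content\n{original}\n"
    )
```

Using `json.dumps` for the values avoids a YAML dependency while still escaping quotes and colons in titles.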

Filename Convention

<slugified-title>-<YYYY-MM-DD>.md

Examples:

library/articles/yc-why-not-work-and-startup-2026-03-12.md
library/tweets/garry-tan-on-yc-advice-2026-03-13.md
library/videos/how-to-build-agents-2026-03-13.md
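One possible way to produce such names is sketched below. The `entry_filename` helper, the 60-character cap, and the "untitled" fallback are assumptions, as is the choice to use the save date (the convention does not say which date the suffix refers to).

```python
import re
import unicodedata
from datetime import date


def entry_filename(title: str, saved: date, max_len: int = 60) -> str:
    """Build '<slugified-title>-<YYYY-MM-DD>.md' from a title and a date."""
    # Fold accents to ASCII; characters with no ASCII form are dropped.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode()
    # Collapse every run of non-alphanumerics into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    slug = slug[:max_len].rstrip("-") or "untitled"
    return f"{slug}-{saved.isoformat()}.md"
```

Note the limitation: a title written entirely in Chinese collapses to "untitled" under this ASCII-only policy, so a library with many such titles would need transliteration or a slug that keeps Unicode letters.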

Save Workflow

Detect URL — Parse link from user message
Identify type — Match URL pattern to content type
Check dedup — memory_search("URL or title") to avoid duplicates
Fetch content — Use appropriate method from table above
Generate metadata — Title, summary, key points, tags (3-7)
Write entry — Use template, fill frontmatter + full original text
Confirm — Tell user: title, tags, and where it's saved
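The dedup step works best when the same link shared twice compares equal. The hypothetical `normalize_url` helper below is a sketch of one normalization that could run before the `memory_search` lookup; forcing `https`, stripping `www.`, and dropping `utm_*` parameters are all assumptions that may not suit every site.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def normalize_url(url: str) -> str:
    """Reduce a URL to a canonical form suitable for duplicate checks."""
    p = urlparse(url.strip())
    host = (p.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    netloc = f"{host}:{p.port}" if p.port else host
    # Drop tracking parameters but keep everything else in its original order.
    query = [(k, v) for k, v in parse_qsl(p.query, keep_blank_values=True)
             if not k.lower().startswith("utm_")]
    # The fragment is discarded and a trailing slash is ignored.
    return urlunparse(("https", netloc, p.path.rstrip("/"), "", urlencode(query), ""))
```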

Search & Retrieval

# Semantic search
memory_search("startup methodology")
memory_search("Garry Tan's tweets")
memory_search("AI agent video tutorials")

# Read specific entry
memory_get("library/tweets/garry-tan-on-yc-2026-03-13.md")

When returning search results, show:

Title + source + date
Summary (2 lines max)
Tags
Offer to show full original text
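A result card following the four items above might be rendered as in this sketch. The `format_result` helper, the field names on `entry`, the `|` separator, and the wording of the final offer are all hypothetical; the date shown is assumed to be `date_published`.

```python
import textwrap


def format_result(entry: dict, width: int = 80) -> str:
    """Render one search hit: header, summary capped at two lines, tags, offer."""
    header = f"{entry['title']} | {entry['source']} | {entry['date_published']}"
    # max_lines truncates long summaries and appends the placeholder.
    summary = textwrap.wrap(entry["summary"], width=width, max_lines=2, placeholder=" ...")
    tags = " ".join(f"#{t}" for t in entry["tags"])
    offer = "Ask for the full original text if you want to re-read it."
    return "\n".join([header, *summary, f"Tags: {tags}", offer])
```

`textwrap` counts characters rather than display columns, so CJK summaries will appear wider than the nominal width.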

Writing Reference Mode

When user asks to write something using saved content:

Search library for relevant entries
Read full original text of top matches
Synthesize insights, cite sources inline
Format citations as [[library/type/entry-name]]

Templates

Located in templates/:

article.md — Web articles, blog posts, newsletters
tweet.md — Twitter/X posts and threads
video.md — Videos with transcript
podcast.md — Podcast episodes
paper.md — Academic papers
image.md — Visual content

Best Practices

Save originals religiously — summaries lose nuance
Tag consistently — reuse existing tags, keep vocabulary tight
Link related entries — build a knowledge graph over time
Don't over-ask — if interest is clear, just save and confirm

Usage Guidance

Before installing or enabling this skill:

(1) Treat it as capable of writing persistent files under ~/.openclaw/workspace-main/library/ and of sending URLs/content to third parties (e.g., r.jina.ai). Do not use it with sensitive or corporate links until you trust those endpoints.
(2) Confirm which binaries and local scripts it requires (yt-dlp, xreach, curl, python3, the local wechat script) and inspect/install them from trusted sources; the skill currently declares none.
(3) Consider disabling or changing the auto-save policy: require explicit user confirmation before saving full original text.
(4) If you need to use it, run it in a sandboxed account or VM and audit the files it writes and the network calls it makes.
(5) What would increase confidence: an explicit dependency/install section, a clear list of required credentials and how they're used/stored, and removal or opt-in control over third-party fetch endpoints (or an option to fetch content locally rather than via r.jina.ai).

Capability Analysis

Type: OpenClaw Skill
Name: link-library
Version: 1.0.0

The link-library skill enables an AI agent to archive web content into a local knowledge base using high-risk shell commands (curl, yt-dlp, python3) and network access. While these capabilities are aligned with the stated purpose of saving articles and tweets, the instructions in SKILL.md provide command templates that lack input sanitization for URLs, potentially exposing the agent to shell injection vulnerabilities. The skill also performs extensive file operations in the ~/.openclaw/ directory and relies on external services like r.jina.ai for content extraction.
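One common mitigation for the shell injection concern above is to validate the URL and pass it as a single argument without going through a shell. The sketch below illustrates that idea for the `yt-dlp --dump-json` command; it is not how the skill currently works, and `validated_url`, `build_fetch_command`, and `run_fetch` are hypothetical helpers.

```python
import subprocess
from urllib.parse import urlparse


def validated_url(url: str) -> str:
    """Accept only absolute http(s) URLs without whitespace or control characters."""
    if any(ch.isspace() or ord(ch) < 32 for ch in url):
        raise ValueError("URL contains whitespace or control characters")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"unsupported URL: {url!r}")
    return url


def build_fetch_command(url: str) -> list[str]:
    """Return an argv list, so no shell ever parses the URL."""
    # "--" ends option parsing, so the URL can never be read as a flag.
    return ["yt-dlp", "--dump-json", "--", validated_url(url)]


def run_fetch(url: str, timeout: int = 60) -> str:
    # shell=False is the default: the URL is passed as one argument,
    # so characters like ; $ ( ) ` have no special meaning.
    result = subprocess.run(build_fetch_command(url), capture_output=True,
                            text=True, timeout=timeout, check=True)
    return result.stdout
```

A URL such as `https://a.com/$(whoami)` passes validation but stays inert, because nothing in this path expands shell syntax.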

Capability Assessment

⚠ Purpose & Capability

The skill claims to capture web content and make it retrievable, which matches the instructions. However, the registry metadata declares no required binaries or credentials, while SKILL.md explicitly invokes many external tools (curl/r.jina.ai, yt-dlp, xreach, a local python wechat script, etc.). Those tools are necessary for the described capabilities but are not declared, an inconsistency that should be resolved before the skill is trusted.

⚠ Instruction Scope

The instructions direct the agent to fetch remote content and to always save the "full original text" into a local library directory (~/.openclaw/workspace-main/library/). They also call third-party fetch endpoints (e.g., https://r.jina.ai/URL), run yt-dlp to download subtitles/media, and run a local script at ~/.agent-reach/tools/wechat-article-for-ai. SKILL.md also defines an auto-save policy that, in some cases, saves without confirmation. These behaviors expand the scope into network I/O, file writes, and potential disclosure of URLs/content to external services.

ℹ Install Mechanism

This is an instruction-only skill (no install spec), which carries low install risk on its own. However, the skill expects many third-party command-line tools and a local script, and gives no guidance on installing them. The lack of a declared install mechanism or dependency list is an operational and security gap: an operator may unknowingly run commands that fail, or run unreviewed CLIs.

⚠ Credentials

The skill declares no required environment variables or credentials, yet the fetch methods (xreach for Twitter/X, a WeChat python script, yt-dlp for some sites) commonly require API tokens, cookies, or authenticated access. Also, using remote services like r.jina.ai to fetch page text transmits user-shared URLs (and potentially page content) to a third party. With no declared credentials and no mention of where sensitive tokens are stored, credential handling is left ambiguous.

⚠ Persistence & Privilege

The skill writes content persistently to a specific path under the user's home directory and mandates saving full original text (potentially storing sensitive or copyrighted material). Although always:false, the skill allows autonomous invocation and includes auto-save rules that can save without explicit confirmation in some cases, which combined with network fetches to third parties increases privacy/exfiltration risk.

Version History

v1.0.0

link-library v1.0.0: Initial Release
- Introduces a personal knowledge base for capturing and retrieving web content (articles, tweets, videos, podcasts, images, PDFs).
- Automatically detects user interest signals to save content, even without explicit save commands.
- Supports semantic search, retrieval, and citation of saved content for writing and discussions.
- Organizes content by type and platform, always saving original full text with metadata, summaries, and tags.
- Provides detailed templates, workflows, and best practices for consistent content management.

Metadata

Slug link-library

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Link Library?

Personal knowledge base that captures web content (articles, tweets/threads, videos, podcasts, images, PDFs) and makes it retrievable for future conversation... It is an AI Agent Skill for Claude Code / OpenClaw, with 222 downloads so far.

How do I install Link Library?

Run "/install link-library" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Link Library free?

Yes, Link Library is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Link Library support?

Link Library is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Link Library?

It is built and maintained by 不白 (@nowhitestar); the current version is v1.0.0.
