nemo4110

llm-wiki SKILL inspired by Karpathy

Author: T0M0R1N · GitHub · v1.3.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 243 · Favorites: 0 · Current installs: 0 · Versions: 8
Install in OpenClaw
/install 041-llm-wiki
Description
Karpathy's llm-wiki pattern implementation — cumulative knowledge management for AI agents
Usage (SKILL.md)

CLI Reference

Protocol Mode (Recommended)

Use natural language with your agent:

"Please ingest sources/paper.pdf into wiki"
"Query wiki: What is the difference between Transformer and RNN?"
"Check wiki health"

CLI Mode (Optional)

After installing dependencies:

# Show wiki status overview
python -m src.llm_wiki status

# Run health check
python -m src.llm_wiki lint

# Show help
python -m src.llm_wiki --help

Note: In CLI mode, the ingest and query commands provide only auxiliary functions (such as listing pages). Actual content processing requires natural-language interaction with the agent.

LLM-Wiki

Karpathy's llm-wiki pattern implementation — cumulative knowledge management for AI agents.

Core Philosophy: LLM as programmer, Wiki as codebase, User as product manager.

Why SKILL Form?

| Dimension    | Standalone App (e.g. Sage-Wiki) | This SKILL Implementation |
|--------------|---------------------------------|---------------------------|
| Architecture | Go + SQLite + Embedded Frontend | Pure Markdown             |
| Deployment   | Requires running service        | Zero deployment           |
| Integration  | Indirect via MCP                | Native commands           |
| Code Size    | ~10k lines                      | ~500 lines                |
| Data Format  | Proprietary                     | Plain text Markdown       |
| Editor       | Locked in app                   | Obsidian/VSCode/Any       |

Features

  • Protocol-driven: Works with natural language (no installation required)
  • Pure Markdown: No database, no lock-in, git-native
  • Wiki-style links: [[PageName]] format, Obsidian-compatible
  • Cumulative learning: Every query can create new knowledge
  • Health checks: Orphan pages, dead links, stale content detection
  • Optional CLI: Python scripts for automation and batch operations
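
The health checks listed above can be illustrated with a small sketch. This is not the project's lint implementation: the helper names (extract_links, lint_pages, load_pages) and the in-memory page layout are assumptions made for illustration, and stale-content detection is left out.

```python
import re
from pathlib import Path

# Matches [[PageName]], [[PageName|alias]] and [[PageName#Heading]];
# captures the page name only.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def extract_links(text):
    """Return the set of page names referenced by [[...]] links in text."""
    return {m.group(1).strip() for m in WIKI_LINK.finditer(text)}


def lint_pages(pages, root="index"):
    """Report dead links and orphan pages.

    pages maps page name -> Markdown body (hypothetical in-memory layout).
    The root page is never reported as an orphan.
    """
    dead, linked = set(), set()
    for name, body in pages.items():
        for target in extract_links(body):
            if target not in pages:
                dead.add(target)      # link points at a page that does not exist
            elif target != name:
                linked.add(target)    # self-links do not count as inbound links
    orphans = sorted(p for p in pages if p != root and p not in linked)
    return {"dead_links": sorted(dead), "orphans": orphans}


def load_pages(wiki_dir="wiki"):
    """Read every *.md file in wiki_dir, keyed by file name without extension."""
    return {p.stem: p.read_text(encoding="utf-8") for p in Path(wiki_dir).glob("*.md")}
```

Calling lint_pages(load_pages("wiki")) would then give a report for a real wiki directory, assuming page names equal file names.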

Quick Start

# 1. Clone
git clone https://github.com/Nemo4110/llm-wiki.git
cd llm-wiki

# 2. Add source material
cp ~/Downloads/paper.pdf sources/

# 3. Tell your agent
"Please ingest sources/paper.pdf into wiki"

Installation

Protocol Mode (Recommended)

No installation needed. Agent reads CLAUDE.md and operates directly.

CLI Mode (Optional)

Using uv (Fastest)

# Create a virtual environment and install dependencies into it
# (uv pip install targets the .venv in the current directory by default)
uv venv
uv pip install -r src/requirements.txt

# Activate environment (Windows)
.venv\Scripts\activate
# Or Linux/macOS
source .venv/bin/activate

Using conda

# Create environment
conda create -n llm-wiki python=3.11

# Activate environment
conda activate llm-wiki

# Install dependencies
pip install -r src/requirements.txt

Using pip

# Create virtual environment
python -m venv .venv

# Activate environment
source .venv/bin/activate  # Linux/macOS
.venv\Scripts\activate     # Windows

# Install dependencies
pip install -r src/requirements.txt

Verify Installation

python -c "from src.llm_wiki.core import WikiManager; print('✓ Installation successful')"

Important Dependency Notes:

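| Dependency | Version  | Purpose        | Notes                            |
|------------|----------|----------------|----------------------------------|
| click      | >=8.0.0  | CLI framework  | -                                |
| pyyaml     | >=6.0    | YAML parsing   | -                                |
| pymupdf    | >=1.25.0 | PDF processing | Primary PDF engine, best for CJK |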

Optional dependencies (for enhanced features):

  • numpy >=1.24.0 — Vector operations for embedding retrieval
  • httpx >=0.27.0 — HTTP client for Ollama/local services
  • openai >=1.0.0 — OpenAI embedding API
  • mcp >=1.0.0 — MCP SDK for remote embedding providers

Fallback PDF dependencies:

  • pdfplumber >=0.11.8: table extraction fallback (this minimum version is required for the CVE-2025-64512 security fix)
  • pdfminer.six >=20251107: underlying PDF library used by the fallback

Project Structure

llm-wiki/
├── CLAUDE.md           # ⭐ Core protocol: Agent behavior guidelines
├── AGENTS.md           # Agent implementation guide (CLI usage)
├── SKILL.md            # This file, machine-readable specification
├── log.md              # Timeline log (append-only)
├── sources/            # Raw materials (user-managed + tool-fetched; Agent forbidden from writing LLM-generated content)
│   └── README.md
├── wiki/               # Generated knowledge pages (Agent-managed)
│   ├── index.md        # Entry index
│   └── *.md            # Topic pages
├── assets/             # Templates and configuration
│   ├── page_template.md
│   └── ingest_rules.md
├── src/                # SKILL implementation (optional, for CLI)
│   ├── llm_wiki/
│   └── requirements.txt
├── scripts/            # Auxiliary scripts
├── hooks/              # Platform hooks (optional)
└── examples/           # Example wiki

About sources/: Excluded from git by default to avoid repository bloat. Wiki only retains extracted knowledge; original files are managed separately (cloud storage, Zotero, etc.). See sources/README.md for tracking specific files.

How It Works

Data Flow

+----------+     +--------------------+     +--------------+
| sources/ |---->|   LLM Processing   |---->|    wiki/     |
|  (Raw)   |     | (Extract + Link)   |     | (Structured) |
+----------+     +--------------------+     +--------------+
                          |
                          v
                    +----------+
                    |  log.md  |
                    | (Record) |
                    +----------+
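
As a rough illustration of the last box in the diagram, the sketch below records one ingest step in an append-only log. The entry format and both function names are hypothetical; the actual layout of log.md is defined by the project's protocol files and is not shown here.

```python
from datetime import date


def format_log_entry(action, source, pages, day=None):
    """Build one log line. The line format is an assumption for illustration."""
    stamp = (day or date.today()).isoformat()
    links = ", ".join(f"[[{p}]]" for p in pages)
    return f"- {stamp} {action}: {source} -> {links}\n"


def append_log(entry, log_path="log.md"):
    """Append the entry; mode "a" only adds to the end and never rewrites earlier lines."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(entry)
```

Opening the file in append mode is what keeps the log append-only at the file level: earlier entries are never read or modified.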

Key Design

  1. CLAUDE.md as Protocol: Defines agent behavior standards that any person or agent can follow
  2. Pure Markdown: No database, no lock-in, native git version control
  3. Bidirectional Links: [[PageName]] format, compatible with Obsidian
  4. Cumulative Learning: Each query can generate new wiki pages, so knowledge accumulates continuously

Query Mechanism

Current Implementation: Symbolic Navigation + LLM Synthesis (Default)

By default, this SKILL does not require Embedding/vector retrieval. Queries are completed through:

User asks question
         |
         v
+-------------------------------+
|  1. Read index.md             |  <-- Human/Agent-maintained category index
|     Locate relevant topics    |
+-------------------------------+
         |
         v
+-------------------------------+
|  2. Read relevant pages       |  <-- Discover associations through [[links]]
|     and their link neighbors  |
+-------------------------------+
         |
         v
+-------------------------------+
|  3. LLM Synthesis             |  <-- Generate answers based on read content
|     Generate with citations   |  Citation format: [[PageName]]
+-------------------------------+

Optional Enhancement: After you enable the embedding settings in config.yaml, the CLI command query --semantic adds hybrid search (Keyword Match + Vector Search + Link Traversal) for faster, more accurate retrieval.
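
One way to picture how three such signals could be combined is a weighted sum per page. The sketch below is an assumption-heavy illustration, not the scoring used by query --semantic: the function names, the weights, and the way each signal is normalized are all hypothetical.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_score(query_terms, query_vec, page_text, page_vec,
                 linked_from_hit, weights=(0.4, 0.5, 0.1)):
    """Combine keyword, vector and link signals into one score (weights are made up)."""
    words = set(re.findall(r"[\w-]+", page_text.lower()))
    # Keyword signal: fraction of query terms that occur in the page.
    keyword = sum(t.lower() in words for t in query_terms) / max(len(query_terms), 1)
    # Vector signal: similarity between query and page embeddings.
    vector = cosine(query_vec, page_vec)
    # Link signal: 1.0 if the page is linked from a page that already matched.
    link = 1.0 if linked_from_hit else 0.0
    w_kw, w_vec, w_link = weights
    return w_kw * keyword + w_vec * vector + w_link * link
```

Pages would then be ranked by this score. The embeddings themselves would come from whichever provider is configured, which is outside this sketch.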

Example Flow:

User asks: "What is LoRA?"

  1. Agent reads wiki/index.md, finds [[LoRA]] under "AI/ML" topic
  2. Agent reads wiki/LoRA.md, discovers links to [[Fine-tuning]], [[Adapter]]
  3. Agent synthesizes answer:

    LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method — see [[LoRA]]. Compared to traditional [[Fine-tuning]], it only trains low-rank matrices...
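
The first two steps of this flow (locate topics in the index, then read pages and their link neighbors) can be sketched as plain link traversal. The function names and the title-in-question matching rule are assumptions for illustration; in protocol mode the agent makes this selection itself, and step 3 (synthesis) is left to the LLM.

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def links_in(text):
    """Return linked page names in order of appearance."""
    return [m.group(1).strip() for m in WIKI_LINK.finditer(text)]


def gather_context(question, pages, index_name="index", hops=1):
    """Pick pages to read: index entries named in the question, then link neighbors.

    pages maps page name -> Markdown body (hypothetical in-memory layout).
    """
    q = question.lower()
    selected = []
    # Step 1: index entries whose title appears in the question.
    for name in links_in(pages.get(index_name, "")):
        if name in pages and name.lower() in q and name not in selected:
            selected.append(name)
    # Step 2: follow [[links]] outward from the selected pages.
    frontier = list(selected)
    for _ in range(hops):
        next_frontier = []
        for name in frontier:
            for target in links_in(pages[name]):
                if target in pages and target not in selected:
                    selected.append(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return selected
```

The returned page names are what would be read and passed to the LLM as context, with citations written back as [[PageName]].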

Why is Embedding Optional?

| Consideration | Current Solution            | Embedding Solution                                      |
|---------------|-----------------------------|---------------------------------------------------------|
| Dependencies  | Zero external dependencies  | Requires Embedding API or local model                   |
| Cost          | No additional fees          | Charged per token/request                               |
| Privacy       | Data not uploaded           | Must send content to external service                   |
| Accuracy      | Precise links, explainable  | Approximate similarity, may retrieve irrelevant content |
| Scale         | Suitable for 0-500 pages    | Essential for large scale (1000+ pages)                 |

Conclusion: For personal/small team knowledge bases, maintaining index.md and page links is simpler and more effective than introducing Embedding. Embedding is available as an opt-in CLI enhancement when scale demands it.

When to Use Embedding?

Consider enabling config.yaml embedding settings when your wiki shows:

  • The page count exceeds 500 and the manual index becomes hard to maintain
  • Queries are mostly fuzzy and semantic ("that paper about optimization...")
  • You need cross-topic associations (pages that are semantically related even though the user did not mention their keywords)

See ROADMAP.md for details.

Example Workflows

Scenario 1: Researching New Domain

# Day 1: Add materials
sources/
├── 2026-04-10-transformer-paper.pdf
└── 2026-04-10-lora-blog.md
User: Please ingest new materials
Claude: Processed 2 materials
  - Created [[Transformer]] — Attention mechanism architecture
  - Created [[LoRA]] — Low-rank adaptation
  - Created [[Self-Attention]] — Self-attention details
  - Updated wiki/index.md

Scenario 2: Deep Exploration

User: What's the difference between LoRA and full fine-tuning?

Claude (reads wiki/index.md → finds [[LoRA]] → synthesizes answer):

Based on our wiki:

**LoRA** — [[LoRA]]
- Freezes pre-trained weights, only trains low-rank matrices
- Trainable parameter count reduced by up to ~10,000x

**Full Fine-tuning**
- Updates all parameters
- High computational cost

Key differences:
1. Parameters: LoRA trains only 0.1%-1% of parameters [[Parameter Efficiency]]
2. Storage: LoRA only saves adapters, not full model
3. Composition: Multiple LoRAs can be stacked [[LoRA Composition]]

Was this helpful? Should I archive this answer to wiki?

Scenario 3: Knowledge Accumulation

User: Yes, archive this answer

Claude: Created [[LoRA vs Full Fine-tuning]]
- Extracted comparison points from conversation
- Linked to [[LoRA]] and [[Fine-tuning]]
- Added to FAQ section in wiki/index.md
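
The archiving step can be pictured as rendering a Markdown page with outgoing [[links]] and writing it under wiki/. The page layout below is a guess for illustration only (the project's real template lives in assets/page_template.md and is not reproduced here), and render_page and save_page are hypothetical helpers.

```python
from pathlib import Path


def render_page(title, body, related):
    """Render a minimal wiki page: heading, body, and a list of related links."""
    lines = [f"# {title}", "", body.strip(), ""]
    if related:
        lines += ["## Related", ""]
        lines += [f"- [[{name}]]" for name in related]
        lines += [""]
    return "\n".join(lines)


def save_page(title, text, wiki_dir="wiki"):
    """Write the page as <wiki_dir>/<title>.md, assuming page name equals file name."""
    path = Path(wiki_dir) / f"{title}.md"
    path.write_text(text, encoding="utf-8")
    return path
```

For the scenario above, render_page("LoRA vs Full Fine-tuning", answer_text, ["LoRA", "Fine-tuning"]) would produce the page body, where answer_text stands for the comparison extracted from the conversation.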

Using with Obsidian

  1. Open wiki/ directory in Obsidian
  2. Enjoy graph view, quick navigation, beautiful rendering
  3. Claude Code handles maintenance, Obsidian handles reading and thinking

Comparison with Alternatives

| Solution           | Characteristics                                   | Best For                                      |
|--------------------|---------------------------------------------------|-----------------------------------------------|
| This SKILL         | Zero dependencies, pure text, Claude Code native  | Personal knowledge management, research notes |
| Sage-Wiki          | Full-featured, multimodal, standalone app         | Team knowledge base, enterprise deployment    |
| Obsidian + Plugins | Strong visualization, rich community              | Existing Obsidian workflow                    |
| Notion/Logseq      | Collaborative, real-time sync                     | Multi-user collaboration, mobile access       |

Documentation

  • CLAUDE.md — User-facing protocol (read this first)
  • AGENTS.md — Implementation guide for agent developers
  • SKILL.md — This file, machine-readable specification
  • ROADMAP.md — Future plans

Contributing

Issues and PRs welcome!

Current TODO

  • MCP server wrapper (for other Agents)
  • Obsidian plugin (one-click sync)
  • Incremental embedding for faster retrieval
  • Multi-language support

License

MIT — free to use, modify, and distribute.


Inspired by Karpathy's llm-wiki

Security recommendations
This skill appears to implement the advertised LLM-wiki functionality, but it also instructs agents to fetch web pages, run Playwright/curl, and optionally call remote embedding providers (OpenAI/Ollama/MCP). Before installing or enabling it:

  • Verify network policy: the skill's spec did not declare network capability; confirm whether your agent environment will allow outbound network requests and whether you are comfortable with that.
  • Check config.yaml and defaults: embedding.enabled defaults to false, but if you enable embeddings you may need to provide API keys. Do NOT store sensitive secrets in repository files; prefer runtime environment variables in a secure store. The skill does not declare required env vars in registry metadata, so manually inspect config.yaml and any env interpolation before use.
  • Review scripts and provider code (create_provider / mcp integration): MCP can be configured to use stdio/command transports; confirm there is no unintended command execution path you don't want.
  • Limit write scope: the agent will write to wiki/ and log.md and may write fetched files to sources/. Keep sensitive files out of these directories or run the skill in an isolated sandbox/repo copy.
  • Avoid external network calls if you do not want them: keep embedding.enabled: false and avoid invoking ingest flows that require web fetch. Prefer dry-run or CLI read-only modes first to observe behavior.
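
To make the advice about runtime environment variables concrete, here is a minimal sketch of resolving a secret placeholder at load time instead of committing the secret. The ${NAME} placeholder syntax and the interpolate helper are assumptions for illustration; inspect the skill's own config.py to see how it actually performs interpolation.

```python
import os
import re

# Hypothetical placeholder syntax: ${VARIABLE_NAME}
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env=None):
    """Replace ${NAME} placeholders with values from env (defaults to os.environ).

    Raises KeyError for an unset variable so a missing secret fails loudly
    rather than silently becoming an empty string.
    """
    env = os.environ if env is None else env

    def repl(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(repl, value)
```

With this pattern a config value such as ${OPENAI_API_KEY} stays a placeholder in the repository, and the real key exists only in the process environment.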
Capability tags
crypto · can-make-purchases · requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description, code, and SKILL.md consistently implement a local markdown-based wiki (ingest, link, relink, query, lint). Requested host capabilities (filesystem-read, filesystem-write, llm-completion) match the core purpose. However, the skill's documented workflows and code also assume the ability to fetch web resources and call remote embedding providers (httpx, OpenAI, Ollama, MCP), which is not declared in the top-level capabilities — an omission that should be clarified.
Instruction Scope
SKILL.md / AGENTS.md explicitly instruct agents to run network fetches (curl, Playwright, httpx), to write downloaded files into sources/, to create and back-update wiki pages, and to run CLI scripts. These instructions give the agent permission to download arbitrary URLs, spawn Playwright browsers, and modify the repo's wiki and log files. That scope is appropriate for an ingesting wiki, but the skill fails to explicitly declare or limit network usage and does not enumerate the external endpoints/providers it may contact, which is a transparency concern.
Install Mechanism
No registry-level install spec was provided (instruction-only skill), but the repository includes installation instructions and a requirements.txt. The declared dependencies are common (click, pyyaml, pymupdf, httpx, openai, mcp). There are no downloads from arbitrary URLs in the install spec. Overall install risk is moderate and typical for a Python CLI project; nothing obviously malicious in the provided install guidance.
Credentials
The skill lists providers (openai, ollama, mcp) and config.yaml interpolation supports environment variables (config.py), yet the registry metadata declares no required env vars or primary credential. That mismatch means the skill can be configured to use powerful remote APIs (which require API keys) without those needs being surfaced during installation. Users may accidentally enable/expose external providers. Also, MCP transport can be configured to run commands/stdio — this deserves attention before enabling.
Persistence & Privilege
The skill does not request always:true and does not appear to modify other skills or system-wide agent settings. It writes to local wiki/, log.md, and sources/ (when fetching) which are normal for its purpose. Model invocation is allowed (disable-model-invocation: false) — this is the platform default and expected for skills.
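How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install 041-llm-wiki
  3. After installation, invoke the Skill by name or trigger it with /041-llm-wiki
  4. Provide the inputs described in the Skill's parameter documentation to get structured output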
Version history
v1.3.0
llm-wiki v1.1.4 Changelog
  • Added a new file: `docs/README.cn.md` (Chinese README).
  • No changes to code or functionality; documentation improvement only.
v1.1.3
Version 1.3.0 introduces relationship linking and merging capabilities for wiki pages.
  • Added new CLI commands and core functions for dynamic linking (`link`) and batch relationship discovery (`relink`) between wiki pages.
  • Introduced `src/llm_wiki/linker.py` and `src/llm_wiki/merge.py` modules to support relationship discovery and merge strategies.
  • Enhanced `ingest` workflow to automatically suggest and update page relationships, with options for light or deep linking.
  • Updated function documentation to reflect new workflows, triggers, and relationship management steps.
  • Increased version number to 1.3.0 to reflect these significant new knowledge graph and linking features.
v1.1.2
  • Added `log.md` for timeline/changelog tracking.
  • Added `wiki/index.md` as a structured index entry point.
  • Updated and reorganized dependency list to include `pymupdf`, `numpy`, `httpx`, `openai`, and `mcp` for enhanced PDF and embedding support.
  • Clarified and expanded CLI and installation instructions in documentation.
  • Updated security notes about fallback PDF handling in the README.
  • Improved documentation across AGENTS.md, ROADMAP.md, and new file handling details.
v1.1.0
  • Introduced stub page creation for any new [[Dead Link]] encountered during content ingestion.
  • Added new source files: config.py, embeddings.py, and retrieval.py to enhance functionality.
  • Removed redundant files: log.md and wiki/index.md as part of project restructuring.
v1.0.4
  • Updated the repository URL in metadata from "https://github.com/yourname/llm-wiki" to "https://github.com/Nemo4110/llm-wiki.git".
  • No functional, interface, or code changes were made.
v1.0.2
No changes detected in this version.
  • Version bumped to 1.0.2, but no code or documentation changes from previous release.
  • All features, structure, and instructions remain unchanged.
v1.0.1
Version 1.0.1
  • Added `src/llm_wiki/__main__.py` to support direct CLI invocation via `python -m src.llm_wiki`.
  • Expanded documentation in SKILL.md to include a detailed CLI reference and installation notes.
  • Added security warnings and dependency version requirements for PDF processing libraries.
  • No changes to core functions or protocol; this is a documentation and usability update.
v1.0.0
Initial release of llm-wiki skill: cumulative knowledge management for AI agents.
  • Implements Karpathy’s llm-wiki pattern for managing and querying knowledge.
  • Supports ingestion of source material, wiki querying, and lint/health checks.
  • Compatible with Claude Code, OpenClaw, and generic LLM-agent platforms.
  • Offers both protocol (no dependencies) and CLI (Python + click/pyyaml) modes.
  • Provides clear separation of sources, wiki content, scripts, and examples.
  • Detailed installation options and accessible documentation included.
Metadata
Slug 041-llm-wiki
Version 1.3.0
License MIT-0
Total installs 0
Current installs 0
Versions 8
FAQ

What is llm-wiki SKILL inspired by Karpathy?

Karpathy's llm-wiki pattern implementation: cumulative knowledge management for AI agents. It is an AI agent Skill plugin for Claude Code / OpenClaw, with 243 total downloads to date.

How do I install llm-wiki SKILL inspired by Karpathy?

Run the command "/install 041-llm-wiki" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is llm-wiki SKILL inspired by Karpathy free?

Yes. llm-wiki SKILL inspired by Karpathy is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does llm-wiki SKILL inspired by Karpathy support?

llm-wiki SKILL inspired by Karpathy is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed llm-wiki SKILL inspired by Karpathy?

It is developed and maintained by T0M0R1N (@nemo4110). The current version is v1.3.0.
