hosuke

Llmwiki

Author: Huang Geyang · GitHub · v0.8.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 297
Favorites: 0
Current installs: 0
Versions: 17
Install in OpenClaw
/install llmwiki
Description
LLM-powered personal knowledge base. Raw documents in, an LLM compiles them into a structured interlinked wiki with trilingual articles, emergent taxonomy, a...
Usage (SKILL.md)

llmwiki

A personal knowledge base that an LLM compiles rather than merely stores. Raw documents go in; an LLM writes trilingual (EN / 中文 / 日本語) wiki articles with [[wiki-links]], backlinks, and an emergent taxonomy. The MCP server dispatches every tool through llmwiki/operations.py, and the CLI exposes the same registry via llmbase ops call. Individual HTTP/CLI wrappers are being migrated onto the registry over time.

Setup

pip install llmwiki

mkdir my-kb && cd my-kb

cat > .env << 'EOF'
LLMBASE_API_KEY=sk-your-key
LLMBASE_BASE_URL=https://your-endpoint/v1
LLMBASE_MODEL=your-model
# Optional: LLMBASE_FALLBACK_MODELS=backup-1,backup-2
EOF

cat > config.yaml << 'EOF'
llm:
  max_tokens: 16384
paths:
  raw: "./raw"
  wiki: "./wiki"
EOF
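
Before running any command, it can help to confirm that the .env file written above defines the keys the tool reads. The following is a minimal sketch, assuming the plain KEY=value format shown in the heredoc. The helper names parse_env and missing_keys are hypothetical and are not part of llmwiki, and treating only LLMBASE_API_KEY as required is an assumption.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(
    env: Dict[str, str],
    required: Iterable[str] = ("LLMBASE_API_KEY",),  # assumption: only the key is mandatory
) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# Same content as the .env heredoc in the Setup section.
sample = """LLMBASE_API_KEY=sk-your-key
LLMBASE_BASE_URL=https://your-endpoint/v1
LLMBASE_MODEL=your-model
# Optional: LLMBASE_FALLBACK_MODELS=backup-1,backup-2
"""

env = parse_env(sample)
print(missing_keys(env))
```

To check a real file, read it with open(".env", encoding="utf-8").read() and pass the text to parse_env.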

Commands

Command                                       Description
llmbase ingest url <url>                      Ingest a web article
llmbase ingest pdf <file>                     Ingest a PDF (auto-chunks)
llmbase ingest file <file>                    Ingest any local file
llmbase ingest dir <dir>                      Ingest all files from a directory
llmbase ingest cbeta-learn --batch 10         Corpus plugin: Buddhist canon
llmbase ingest ctext-book 论语 /analects/zh   Corpus plugin: Chinese classics
llmbase compile new                           Compile new raw docs incrementally (3-layer dedup)
llmbase compile all                           Full rebuild
llmbase compile index                         Rebuild index + aliases
llmbase query "<q>"                           Ask a question (single-pass; add --deep for multi-step research)
llmbase query "<q>" --tone wenyan             📜 classical Chinese voice
llmbase query "<q>" --tone scholar            🎓 academic voice
llmbase query "<q>" --tone eli5               👶 simple voice
llmbase query "<q>" --tone caveman            🦴 primitive voice
llmbase query "<q>" --file-back               File answer back into the wiki
llmbase lint check                            8-category structural health check
llmbase lint heal                             Check → fix → re-check → report
llmbase lint deep                             LLM deep quality analysis
llmbase web                                   Web UI at :5555
llmbase serve                                 Agent HTTP API at :5556
llmbase mcp                                   Start MCP server (stdio)
llmbase stats                                 KB statistics

MCP Integration (for AI clients)

{
  "mcpServers": {
    "llmwiki": {
      "command": "python",
      "args": ["-m", "llmwiki", "--base-dir", "/path/to/my-kb"]
    }
  }
}
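
If the client configuration is generated by a script rather than edited by hand, the JSON above can be produced programmatically. This is a small sketch; the function name mcp_config is hypothetical, while the command and arguments are copied from the example above.

```python
import json


def mcp_config(base_dir: str, server_name: str = "llmwiki") -> dict:
    """Build the mcpServers entry shown above for a given KB directory."""
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": ["-m", "llmwiki", "--base-dir", base_dir],
            }
        }
    }


config = mcp_config("/path/to/my-kb")
print(json.dumps(config, indent=2))
```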

Tools exposed by the MCP server:

Tool             Purpose
kb_search        Full-text search over compiled concepts
kb_search_raw    Verbatim full-text fallback over raw/ sources (v0.6.2+)
kb_ask           Deep-research Q&A with tone modes
kb_get           Get article by slug or alias (, kong, emptiness all work)
kb_list          List articles, filter by tag
kb_backlinks     Find articles citing a given article
kb_taxonomy      Multilingual category tree
kb_stats         Article count, word count
kb_xici          Guided reading (导读)
kb_ingest        Ingest a URL
kb_compile       Compile raw → wiki
kb_lint          Health check / auto-fix
kb_export / kb_export_article / kb_export_tag / kb_export_graph    Structured export for downstream projects

All tools are declared in llmwiki/operations.py — downstream projects register custom ops via operations.register(...) and they become available on CLI + MCP automatically.
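
The signature of operations.register is not shown in this document, so the sketch below does not call it. Instead it illustrates the general registry pattern that the paragraph above describes: one table of named operations that every front end dispatches through. The Registry class, its methods, and the kb_word_count op are all hypothetical stand-ins, not llmwiki's API.

```python
from typing import Any, Callable, Dict, List


class Registry:
    """Stand-in registry mapping an op name to a callable (not llmwiki's API)."""

    def __init__(self) -> None:
        self._ops: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        """Add an operation; refuse to silently replace an existing one."""
        if name in self._ops:
            raise ValueError(f"operation already registered: {name}")
        self._ops[name] = func

    def call(self, name: str, **kwargs: Any) -> Any:
        """Single dispatch point that a CLI or MCP front end would share."""
        return self._ops[name](**kwargs)

    def names(self) -> List[str]:
        """List registered operation names in sorted order."""
        return sorted(self._ops)


registry = Registry()
# Hypothetical custom op registered by a downstream project.
registry.register("kb_word_count", lambda text: len(text.split()))

print(registry.names())
print(registry.call("kb_word_count", text="raw documents in, wiki out"))
```

Because both front ends would go through call, an op registered once is reachable from either without extra wiring, which is the property the paragraph above attributes to the real registry.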

Agents mounted on this server can answer from compiled concepts, fall back to raw sources with kb_search_raw when compile glossed a detail, ingest new material mid-session, and trigger healing.

Workflows

Build a KB from scratch

llmbase ingest url https://example.com/topic
llmbase ingest pdf ./paper.pdf
llmbase compile new
llmbase query "What are the key concepts?"
llmbase lint heal
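
The same sequence can be scripted. The sketch below builds the command list from the workflow above and, by default, only prints it; the interactive query step is left out. The function names build_steps and run_steps are hypothetical, and the llmbase subcommands are the ones listed in the Commands table.

```python
import shlex
import subprocess
from typing import List, Sequence


def build_steps(urls: Sequence[str], pdfs: Sequence[str]) -> List[List[str]]:
    """Return ingest, compile, and heal commands in the order shown above."""
    steps = [["llmbase", "ingest", "url", url] for url in urls]
    steps += [["llmbase", "ingest", "pdf", pdf] for pdf in pdfs]
    steps.append(["llmbase", "compile", "new"])
    steps.append(["llmbase", "lint", "heal"])
    return steps


def run_steps(steps: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in steps:
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(list(cmd), check=True)


steps = build_steps(["https://example.com/topic"], ["./paper.pdf"])
run_steps(steps)  # dry run: prints the four commands without executing them
```

Call run_steps(steps, dry_run=False) from inside the KB directory to execute the commands for real.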

Autonomous mode (deploy once, server keeps learning)

# config.yaml
worker:
  enabled: true
  learn_source: cbeta         # built-in: cbeta | wikisource | both; custom via register_learn_source()
  learn_interval_hours: 6
  compile_interval_hours: 1
  health_check_interval_hours: 24

health:
  auto_fix_broken_links: true
  max_stubs_per_run: 10

The worker starts under the production WSGI entrypoint (wsgi.py → start_worker_thread). Deploy with gunicorn wsgi:app; llmbase web alone does not self-start the worker.
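
To see how the three interval settings interact, the sketch below works out which tasks would be due at a given time. It is an illustration of interval-based scheduling in general, not the worker's actual implementation; the due_tasks function and the task names are hypothetical, and only the interval values come from the config.yaml example above.

```python
from typing import Dict, List, Optional

# Interval values copied from the config.yaml example above, in hours.
INTERVALS_HOURS: Dict[str, float] = {
    "learn": 6,
    "compile": 1,
    "health_check": 24,
}


def due_tasks(
    last_run: Dict[str, Optional[float]],
    now: float,
    intervals: Dict[str, float] = INTERVALS_HOURS,
) -> List[str]:
    """Return tasks whose interval has elapsed; never-run tasks are always due.

    All times are expressed in hours since an arbitrary start.
    """
    due = []
    for name, interval in intervals.items():
        last = last_run.get(name)
        if last is None or now - last >= interval:
            due.append(name)
    return due


last_run = {"learn": 0.0, "compile": 0.0, "health_check": 0.0}
print(due_tasks(last_run, now=1.0))   # only the 1-hour compile interval has elapsed
print(due_tasks(last_run, now=6.5))   # learn and compile are both due
```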

Daily use as agent memory

  1. Agent receives a task → calls kb_search for relevant concepts
  2. If the compiled answer is too abstract → calls kb_search_raw for verbatim detail
  3. Learns something new → calls kb_ingest with the URL
  4. Optionally kb_compile to fold it into concepts for next session
  5. Periodically kb_lint heals the graph
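
Steps 1 and 2 amount to a search with a fallback, which can be sketched as follows. The call_tool callable stands in for whatever MCP client the agent uses, and the "query" argument name is an assumption, since tool parameters are not listed in this document. The fallback condition here is simply "too few hits", a simplification of "the compiled answer is too abstract".

```python
from typing import Any, Callable, Dict, List

# A tool call takes a tool name and an argument dict and returns a list of hits.
ToolCall = Callable[[str, Dict[str, Any]], List[str]]


def recall(call_tool: ToolCall, question: str, min_hits: int = 1) -> Dict[str, Any]:
    """Search compiled concepts first; fall back to raw sources if too little comes back."""
    hits = call_tool("kb_search", {"query": question})
    if len(hits) >= min_hits:
        return {"layer": "concepts", "hits": hits}
    raw_hits = call_tool("kb_search_raw", {"query": question})
    return {"layer": "raw", "hits": raw_hits}


def fake_call_tool(tool: str, args: Dict[str, Any]) -> List[str]:
    """Stand-in for a real MCP client, returning canned results for the demo."""
    canned = {
        "kb_search": [],
        "kb_search_raw": ["raw/source.md: verbatim passage"],
    }
    return canned[tool]


result = recall(fake_call_tool, "What does the source say verbatim?")
print(result["layer"], result["hits"])
```

A real agent would replace fake_call_tool with its MCP client and would likely judge abstraction from the content of the hits rather than their count.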

Key Concepts

  • Synthesis, not archiving — LLM reads raw material and writes composed articles; storage is the cheap part
  • Two-layer recall — kb_search (concepts) + kb_search_raw (verbatim raw sources)
  • Trilingual default — every article has EN / 中文 / 日本語 sections
  • Additive evolution (叠加进化) — new data merges into existing concepts, never overwrites
  • Domain-agnostic — taxonomy emerges per-domain, nothing hardcoded
  • Self-healing — 7-step auto-fix pipeline repairs drift
  • Alias resolution — [[参禅]] resolves to can-chan.md, across scripts and simplified/traditional forms
  • Registry-backed ops — MCP dispatches every tool through operations.py; CLI exposes the same registry via llmbase ops list / llmbase ops call; direct HTTP/CLI wrappers are being migrated onto the registry
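
Alias resolution can be pictured as a lookup from every known spelling to one canonical slug. The sketch below is an assumption-laden illustration: the ALIASES table, the resolve and resolve_links helpers, and the [[target|label]] pipe syntax are hypothetical, and a real KB would build its table from the compiled index rather than hardcode it.

```python
import re
from typing import Dict, List, Optional

# Hypothetical alias table mapping each spelling to a canonical slug.
ALIASES: Dict[str, str] = {
    "参禅": "can-chan",      # simplified Chinese
    "參禪": "can-chan",      # traditional Chinese
    "can-chan": "can-chan",  # the slug itself
}

# Matches [[target]] and, by assumption, [[target|label]].
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def resolve(target: str, aliases: Dict[str, str] = ALIASES) -> Optional[str]:
    """Map a link target to its article filename, or None if unknown."""
    key = target.strip()
    slug = aliases.get(key) or aliases.get(key.lower())
    return f"{slug}.md" if slug else None


def resolve_links(text: str) -> List[Optional[str]]:
    """Resolve every wiki-link found in the text, in order of appearance."""
    return [resolve(match.group(1)) for match in WIKI_LINK.finditer(text)]


print(resolve_links("See [[参禅]] and [[參禪|Chan practice]] but not [[unknown]]."))
```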

Tips

  • --file-back saves Q&A answers into the wiki so future queries benefit
  • --tone wenyan for Chinese users (classical Chinese responses)
  • Run llmbase lint heal after large ingestion batches
  • Web UI /health has buttons for every repair op
  • Knowledge graph at /graph — density slider for large KBs
  • Timeline at /explore — requires entities: { enabled: true } in config

Security & Privacy

  • All data stays local — wiki files are plain markdown on your filesystem
  • LLM API key — user-supplied, loaded from .env
  • Network access — user-initiated (URL ingest, SSRF-protected) plus corpus plugins (cbeta-learn, wikisource-learn, ctext-book) and the autonomous worker when enabled
  • Web server — optional; binds 0.0.0.0, so it is LAN-accessible by default. For public exposure, front it with a reverse proxy or override the bind address
  • API secret — cloud deployments (with PORT env) gate most mutating endpoints behind LLMBASE_API_SECRET (auto-generated if unset). Note: /api/ask is open by default and writes Q&A back via file_back; only promotion to concepts requires the secret
  • Autonomous worker — opt-in via config, disabled by default
  • No telemetry — nothing is sent anywhere except the configured LLM API
Security recommendations
This skill appears to be what it claims (a local, LLM-backed personal wiki) but there are two things to check before installing: 1) The SKILL.md asks you to 'pip install llmwiki' and to provide an LLM API key and optional base URL/model — verify the PyPI package matches the GitHub repo (review the repo and PyPI page) so you know what code will run. 2) The registry metadata did not list these env vars/install steps even though SKILL.md does — treat that as an inconsistency and prefer the SKILL.md but verify sources. Operational cautions: run the package in a sandboxed environment or VM if possible; don't enable the autonomous worker or public web UI until you confirm configuration (bind to localhost, require auth); avoid ingesting sensitive local files unless you understand where the wiki stores and transmits data; and restrict the LLM API key scope/usage and monitor network activity. If you want to proceed, audit the GitHub repo (especially any setup/operations scripts) or run tests in a controlled environment first.
Functional analysis
Type: OpenClaw Skill Name: llmwiki Version: 0.8.0 The llmwiki skill is a personal knowledge base tool that uses LLMs to compile documents into a structured wiki. It includes features for ingesting web content, PDFs, and specific historical corpora (CBETA, Wikisource). While it requests network and filesystem permissions, these are consistent with its stated functionality of fetching content and storing markdown files locally. The documentation in SKILL.md includes security notes regarding SSRF protection and web server exposure, and there are no indicators of data exfiltration, unauthorized execution, or malicious prompt injection.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
The SKILL.md describes an LLM-powered personal KB (ingest, compile, query, web UI, MCP). The declared requirements in SKILL.md (LLM API key, optional base URL/model, network/filesystem/server permissions) are coherent with that purpose. However, the registry metadata above lists no required env vars and no install spec, which conflicts with the SKILL.md.
Instruction Scope
Runtime instructions include pip install, setting LLMBASE_* env vars, ingesting arbitrary URLs and local files/dirs, compiling to local wiki paths, and starting optional servers and an autonomous worker. Those actions are expected for a KB but permit broad operations (fetching remote URLs, reading local files, and writing under raw/ and wiki/). The worker auto‑fetch behavior is opt‑in but could fetch external sources if enabled.
Install Mechanism
SKILL.md lists 'pip install llmwiki' (a standard PyPI install). Installing from PyPI is common but runs third‑party code on the host. The registry previously indicated 'No install spec' which is inconsistent with the instruction; that mismatch should be resolved (confirm package identity and provenance on PyPI/GitHub before installing).
Credentials
SKILL.md requires an LLM API key (LLMBASE_API_KEY) and offers optional LLMBASE_BASE_URL, LLMBASE_MODEL, and fallback list — these are proportionate for an LLM-driven tool. Again, registry metadata earlier reported no required env vars, so there's an inconsistency between declared runtime requirements and the registry record.
Persistence & Privilege
The skill can run a web UI, an agent HTTP API, and an MCP server and supports an autonomous worker when enabled. 'always' is false and autonomous invocation is platform default; the skill does not request forced/global persistence. Still, starting network services and a persistent worker are privileged actions the user should intentionally enable and configure (e.g., bind interfaces, firewall, authentication).
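How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install llmwiki
  3. Once installation finishes, invoke the Skill by name or trigger it with /llmwiki
  4. Provide the inputs described in the Skill's parameter notes to get structured output
Version history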
v0.8.0
BREAKING: rename tools/ → llmwiki/ package namespace. All imports change from 'from tools.xxx' to 'from llmwiki.xxx'. MCP invocation: python -m llmwiki. CLI (llmbase) and PyPI name (llmwiki) unchanged.
v0.7.10
v0.7.10: wikisource — preserve {{*|content}} small-note template (王弼注/河上公章句/七家注 content was being stripped at ingest)
v0.7.9
tools.anchor: locate_span + normalize_text — annotation→annotated-span alignment primitive for kepan, citations, targeted comments. Pure string algorithm; offsets into ORIGINAL content; normalize_text exposed for JS frontend mirror. siwen 议 己.
v0.7.8
v0.7.8: chat_with_meta + reasoning_budget (议 5-甲乙). Siwen 5th-batch post-mortem — 11h wenguan failure root cause was absence of finish_reason='length' detection. chat_with_meta surfaces finish_reason/usage(incl. reasoning_tokens)/attempts/truncated. reasoning_budget(max_tokens, tokens_per_char, safety=0.8) is a pure calculator, no upstream model table. chat() is a thin wrapper — zero v0.7.x break. docs/pipelines.md adds 'Choosing the cid' + 'Sizing chunks' pattern sections (戊). 议 丙 aggregate_and_fallback deferred, 议 丁 siwen-side. 3 Codex rounds fixed dict-shaped usage / inf overflow / int-too-large-for-float. 419 tests.
v0.7.7
tools.pipeline 议 D — composable multi-stage primitives: run_stage contextmanager (driver guarantees ok/failed/partial terminal per run), rebuild_state (log is truth, state is view), StageLock (atomic tempfile+os.link + fcntl.flock breaker, strict TTL). Opaque stage/key/meta; no DAG, no scheduler. 13 Codex review rounds. 80 pipeline tests; 391 total.
v0.7.2
/api/articles/lite gains ?tag=<slug> server-side filter (index.json-backed, no frontmatter parse) + opt-in browser cache via LLMBASE_LITE_CACHE_MAX_AGE env var. ETag now keyed on tag param so distinct slices never share a 304. Driven by siwen.ink (~13k articles) sidebar payload pain. Default Cache-Control behaviour unchanged.
v0.6.8
Web-UI compile button survives navigation (closes #7): GET /api/worker/status reports {busy:bool}; Ingest.tsx polls on mount, recovers in-flight compile state, mounted-ref guards on every post-await setState; typed ApiError in lib/api.ts. (Backfilled to PyPI 2026-04-18.)
v0.6.7
Hardened /api/ask model override (require raw API_SECRET when secret set); fixed URL-slug corruption (#5) + heal_urly_slugs pass; LLMBASE_HTTP_TIMEOUT/CONNECT_TIMEOUT env vars (#6); LLMBASE_MODEL_ALLOWLIST; llmbase -v/-vv/-vvv CLI verbosity. (Backfilled to PyPI 2026-04-18.)
v0.7.1
Section-slicing API for long articles: tools/sections.py + kb_get_sections op + GET /api/articles/{slug}/sections + kb_get section= subtree extraction. Anchor format h{level}-{slug-short}-{hash6}, stable across cosmetic title edits + sibling reorder. Codex pre-commit caught 5 issues (HIGH: path-traversal x2; MEDIUM: fence-close, ATX heading edge cases; LOW: hash collisions) — all fixed.
v0.6.9
Mermaid render in Markdown component (lazy-loaded, theme-aware) + deep-nest CSS for ul/ol up to 8 layers (bullet rotation + outline rail). Frontend-only release driven by 斯文·太虛間 (太虛大師全書 reading library).
v0.6.6
v0.6.6: per-request model override on /api/ask + UTF-8 surrogate sanitize. (A) /api/ask body now accepts `model` field, threaded through kb_ask Operation → query()/query_with_search() → chat(). (C) Lone surrogates (U+D800-U+DFFF) sneaking in via half-decoded HTML/PDF ingest no longer crash deep RAG — sanitized at chat_with_context and at ingest write.
v0.6.5
v0.6.5: fix #4 — .env now discovered correctly under pipx/PyPI installs. New lookup order: LLMBASE_ENV_FILE → $PWD/.env (when config.yaml declares llmbase paths) → ~/.config/llmbase/.env → package dir. Shell exports still win.
v0.6.4
v0.6.4: /api/articles scales — pagination (limit/cursor/tag/q/fields), new /api/articles/lite endpoint, RFC 7232 ETag + 304 on articles/taxonomy. 12k-article sidebar load: 3.66MB/3.5s → 500KB/0.3s.
v0.6.3
v0.6.3: TF-IDF prefilter in query_with_search caps the LLM selector prompt at O(top_k), unblocking kb_ask deep=true and promote=true on KBs above ~10k articles (observed: 11,625-article KB previously exceeded upstream context windows). Configurable via config.yaml::query.prefilter_threshold (500) and .prefilter_top_k (200).
v0.6.2
v0.6.2: kb_search_raw raw-source fallback + lint perf cache; SKILL.md corrected for CLI command name (llmbase not llmwiki), accurate security posture, and up-to-date MCP tool list
v0.1.1
Fix security metadata: declare credentials, permissions, install steps. Add Security section.
v0.1.0
Initial release: CLI commands, workflows, MCP integration, self-healing KB
Metadata
Slug llmwiki
Version 0.8.0
License MIT-0
Total installs 0
Current installs 0
Version count 17
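FAQ

What is Llmwiki?

LLM-powered personal knowledge base. Raw documents in, an LLM compiles them into a structured interlinked wiki with trilingual articles, emergent taxonomy, a... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 297 total downloads to date.

How do I install Llmwiki?

Run the command "/install llmwiki" in the OpenClaw or Claude Code chat box for a one-step install. The Skill itself needs no further configuration, but the llmwiki package and LLM API key described under Setup are still required.

Is Llmwiki free?

Yes. Llmwiki is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Llmwiki support?

Llmwiki is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Llmwiki?

It is developed and maintained by Huang Geyang (@hosuke); the current version is v0.8.0.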
