《毛选》1-7卷文本查询 (Selected Works of Mao Zedong, Volumes 1-7: Text Search)

by henryczq · GitHub ↗ · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
207 Downloads · 0 Stars · 0 Active Installs · 2 Versions
Install in OpenClaw
/install mao-selected-works
Description
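Trigger: use when the user wants to search the full text of 《毛泽东选集》 (Selected Works of Mao Zedong), locate the original text by volume or article, or find content by title or keyword. This skill targets OpenClaw. By default it runs locally with structured retrieval and keyword retrieval; hybrid retrieval, which combines vector recall and reranking, is used only after it is explicitly enabled in the configuration file.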
README (SKILL.md)

Selected Works of Mao Search

Overview

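This skill gives OpenClaw a local knowledge base for 《毛泽东选集》 (Selected Works of Mao Zedong). It has two parts:

  • scripts/build_index.py: builds a SQLite FTS index from Markdown and, depending on the configuration, optionally generates a vector index
  • scripts/search.py: queries articles and paragraphs by volume, chapter, title, keyword, or hybrid retrieval

The default mode does not depend on any external API and uses only local structured retrieval and full-text search. Embeddings are used only when rag.enabled is explicitly turned on in config/search.json, and reranking runs only when rag.rerank.enabled is turned on as well.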
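
The gating described above can be sketched as a small function. This is a minimal illustration, not the skill's actual code: the function name `effective_mode` is hypothetical, and the nested layout of the config dictionary is assumed from the key names rag.enabled and rag.rerank.enabled. The returned labels reuse the retrieval-method names listed under the output requirements.

```python
def effective_mode(config: dict) -> str:
    """Pick a retrieval mode from a parsed config/search.json dictionary.

    Assumption: rag.enabled and rag.rerank.enabled are nested booleans
    that default to False when missing.
    """
    rag = config.get("rag") or {}
    if not rag.get("enabled", False):
        # Local-only default: structured and full-text search, no API calls.
        return "lexical"
    rerank = rag.get("rerank") or {}
    if rerank.get("enabled", False):
        # Vector recall followed by reranking.
        return "hybrid-rerank"
    # Vector recall without reranking.
    return "hybrid"
```

Note that in this sketch rag.rerank.enabled has no effect on its own: reranking only applies once rag.enabled is also true, which matches the wording above.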

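Default models and platform conventions:

  • Embedding model: BAAI/bge-m3
  • Rerank model: BAAI/bge-reranker-v2-m3
  • Embedding requests are batched at 64 items by default, to stay within the per-batch limit of common platforms
  • If the user has not configured base_url / api_key_env separately for embedding / rerank, they inherit rag.api.base_url / rag.api.api_key_env by default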
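
The inheritance and batching conventions above might look roughly like this in Python. It is a sketch under stated assumptions: the helper names `resolve_api` and `batched` are hypothetical, and placing the per-component settings under rag.embedding and rag.rerank is a guess based on the key names in this README.

```python
import os
from typing import Iterator, Mapping, Optional


def resolve_api(rag: dict, component: str,
                environ: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve base_url and API key for 'embedding' or 'rerank'.

    Assumption: component settings live under rag[component] and fall
    back to the shared rag.api block when they are missing or empty.
    """
    environ = os.environ if environ is None else environ
    shared = rag.get("api") or {}
    own = rag.get(component) or {}
    base_url = own.get("base_url") or shared.get("base_url")
    key_env = own.get("api_key_env") or shared.get("api_key_env")
    api_key = environ.get(key_env) if key_env else None
    return {"base_url": base_url, "api_key_env": key_env, "api_key": api_key}


def batched(items: list, batch_size: int = 64) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

With 130 paragraphs, `batched` would produce requests of 64, 64, and 2 items.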

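When to use

Invoke this skill in these situations:

  • The user wants to find which volume of 《毛选》 something is in, or where an article or topic appears
  • The user gives a title, alias, or keyword and wants the matching article or related paragraphs returned
  • The user asks to enable hybrid retrieval and provides a usable embedding / rerank API configuration

Do not invoke this skill directly in these situations:

  • The user is only discussing ideas and does not need to locate the original text of 《毛选》
  • The current task already has the exact file path and text excerpt, so there is no need to rebuild the index or search again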

Workflow

1. Build the index

Run directly:

python scripts/build_index.py

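By default this builds:

  • Document-level index: find articles by volume, chapter, title, alias, or full text
  • Paragraph-level index: find matching passages by keyword

By default the indexing tool scans the Markdown files under the data/ directory directly.

If rag.enabled is turned on in config/search.json and embedding is configured, a vector index is generated as well.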
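
To make the two index levels concrete, here is a minimal sketch of building a document-level and a paragraph-level FTS5 table with Python's standard sqlite3 module. It is not the code in scripts/build_index.py: the table names, column names, and the blank-line paragraph split are all assumptions, and it requires an SQLite build with FTS5 enabled. Tokenizer choice for Chinese text is left at the FTS5 default here; the source does not say which tokenizer the skill uses.

```python
import sqlite3
from pathlib import Path


def build_index(data_dir: str, db_path: str = ":memory:") -> sqlite3.Connection:
    """Index every Markdown file under data_dir at two granularities."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: one row per article, one row per paragraph.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS documents "
        "USING fts5(path UNINDEXED, title, body)"
    )
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS paragraphs "
        "USING fts5(path UNINDEXED, position UNINDEXED, text)"
    )
    for md_file in sorted(Path(data_dir).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        # Use the first Markdown heading as the title, else the file name.
        title = next(
            (ln.lstrip("# ").strip() for ln in text.splitlines() if ln.startswith("#")),
            md_file.stem,
        )
        conn.execute(
            "INSERT INTO documents VALUES (?, ?, ?)", (str(md_file), title, text)
        )
        # Treat blank-line-separated blocks (headings included) as paragraphs.
        blocks = [p.strip() for p in text.split("\n\n") if p.strip()]
        for position, block in enumerate(blocks):
            conn.execute(
                "INSERT INTO paragraphs VALUES (?, ?, ?)",
                (str(md_file), position, block),
            )
    conn.commit()
    return conn
```

A call such as `build_index("data")` would return a connection whose `paragraphs` table can be queried with an FTS5 `MATCH` expression.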

2. Configuration

Setting the API key through the environment variable MAO_SKILL_API_KEY is recommended.

Use the configuration management script:

# Show the current configuration
python scripts/config.py show

# Change the configuration
python scripts/config.py set rag.api.base_url "https://api.siliconflow.cn/v1"
python scripts/config.py set rag.api.api_key_env "MAO_SKILL_API_KEY"
python scripts/config.py set rag.enabled true
python scripts/config.py set chunk_size 1024
python scripts/config.py set chunk_overlap 100
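
The set commands above address nested settings with dotted keys. How scripts/config.py parses them is not shown in the source; the following is only a plausible sketch, with the hypothetical helpers `parse_value` and `set_dotted`, of turning a dotted key and a command-line string into a nested JSON-style update.

```python
import json


def parse_value(raw: str):
    """Interpret CLI text as JSON when possible, else keep it as a string."""
    try:
        return json.loads(raw)  # "true" becomes True, "1024" becomes 1024
    except json.JSONDecodeError:
        return raw  # URLs and variable names stay plain strings


def set_dotted(config: dict, dotted_key: str, raw_value: str) -> dict:
    """Set config['a']['b']['c'] for the key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = parse_value(raw_value)
    return config
```

Under these assumptions, `set_dotted(cfg, "rag.enabled", "true")` stores the boolean `True` rather than the string `"true"`, which matters if the consumer checks the flag by truthiness.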

Other model parameters (such as embedding.model, embedding.batch_size, rerank.model) are edited directly in config/search.json.
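
The chunk_size and chunk_overlap values set in the commands above suggest a sliding-window split before embedding. The source does not say whether the units are characters or tokens, nor how the skill splits text, so the sketch below simply assumes characters; `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters (assumed unit:
    characters, not tokens).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, provided it is shorter than chunk_overlap.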

3. Query

Common query patterns:

List articles by volume:

python scripts/search.py catalog --volume 第一卷

Locate an article directly by volume and chapter:

python scripts/search.py show --volume 1 --chapter 3

Find an article by title:

python scripts/search.py show --title 实践论

Search paragraphs by keyword:

python scripts/search.py search "调查研究"
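
A keyword search of this kind could be sketched as an FTS5 phrase query with a substring fallback. The fallback is an assumption: this README lists both lexical and lexical-like as retrieval methods but does not define them, so the sketch guesses that lexical-like denotes a SQL LIKE scan. The table `paragraphs(path, text)` and the function name are likewise hypothetical.

```python
import sqlite3


def search_paragraphs(conn: sqlite3.Connection, keyword: str, limit: int = 5):
    """Return (method, rows) for a keyword, trying FTS5 first, then LIKE."""
    # Quote the keyword as a single FTS5 phrase; embedded quotes are doubled.
    phrase = '"' + keyword.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT path, text FROM paragraphs "
        "WHERE paragraphs MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    ).fetchall()
    if rows:
        return "lexical", rows
    # Assumed fallback: plain substring scan when the tokenizer finds nothing.
    # LIKE wildcards (% and _) inside the keyword are not escaped in this sketch.
    rows = conn.execute(
        "SELECT path, text FROM paragraphs WHERE text LIKE ? LIMIT ?",
        ("%" + keyword + "%", limit),
    ).fetchall()
    return "lexical-like", rows
```

A substring fallback is one way to cope with text that a word-oriented tokenizer does not segment the way the query expects; reporting which path produced the hit keeps the result auditable.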

Explicitly enable hybrid retrieval:

python scripts/search.py search "统一战线" --mode hybrid
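
The source does not describe how the skill merges keyword hits with vector-recall hits in hybrid mode. As one common option, swapped in here purely for illustration, reciprocal rank fusion (RRF) combines several ranked lists without needing comparable scores. The function name and the constant `k = 60` are conventional choices, not values taken from this skill.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a keyword ranking `["p1", "p2", "p3"]` with a vector ranking `["p3", "p1", "p4"]` places `p1` first, because it ranks near the top of both lists. A reranker, when enabled, would then rescore the fused candidates.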

Test model connectivity:

python scripts/search.py test-model --target embedding
python scripts/search.py test-model --target rerank

If rag.api.base_url is not configured, or the environment variable MAO_SKILL_API_KEY is not set, the command tells the user to set them first.
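
A preflight check along these lines could run before any network call. This is a hedged sketch rather than the skill's code: `missing_settings` is a hypothetical helper, and falling back to MAO_SKILL_API_KEY when rag.api.api_key_env is absent is an assumption.

```python
import os
from typing import Mapping, Optional


def missing_settings(config: dict,
                     environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """List what must be set before embedding or rerank calls can work."""
    environ = os.environ if environ is None else environ
    api = (config.get("rag") or {}).get("api") or {}
    problems = []
    if not api.get("base_url"):
        problems.append("rag.api.base_url is not configured")
    # Assumed default variable name when api_key_env is not configured.
    key_env = api.get("api_key_env") or "MAO_SKILL_API_KEY"
    if not environ.get(key_env):
        problems.append(f"environment variable {key_env} is not set")
    return problems
```

An empty list means both prerequisites are present; otherwise each entry can be shown to the user as a setup hint.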

Combine volume, chapter, title, and keyword filters:

python scripts/search.py search "调查研究" --volume 1 --chapter 7 --title 本本主义
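
One way to picture the combined filter is as optional conditions joined with AND and passed to SQL as bound parameters. The column names `volume`, `chapter`, and `title`, the substring match on title, and the AND semantics are all assumptions made for this sketch; the skill's real rules are in references/retrieval-rules.md.

```python
from typing import Optional


def build_filter(volume: Optional[int] = None,
                 chapter: Optional[int] = None,
                 title: Optional[str] = None) -> tuple[str, list]:
    """Return a WHERE fragment and its parameters for the given filters."""
    clauses, params = [], []
    if volume is not None:
        clauses.append("volume = ?")
        params.append(volume)
    if chapter is not None:
        clauses.append("chapter = ?")
        params.append(chapter)
    if title:
        # Substring match so a partial title can still hit.
        clauses.append("title LIKE ?")
        params.append(f"%{title}%")
    # "1 = 1" keeps the SQL valid when no filter was supplied.
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return where, params
```

Binding values as parameters instead of formatting them into the SQL string avoids quoting problems with user-supplied titles.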

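Output requirements

When this skill is invoked, the result must lead with a verifiable source:

  • Volume number
  • Chapter number
  • Article title
  • Date (if available)
  • Summary of the matching paragraph
  • Source file path or document ID
  • Retrieval method: catalog, lexical, lexical-like, hybrid, hybrid-rerank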
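
The fields above map naturally onto a small record type. The dataclass below is an illustrative sketch only: the class name, field names, and JSON serialization are assumptions, and the authoritative structure is whatever references/output-schema.md defines.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Retrieval-method labels as listed in this README.
METHODS = {"catalog", "lexical", "lexical-like", "hybrid", "hybrid-rerank"}


@dataclass
class SearchHit:
    volume: int
    chapter: int
    title: str
    snippet: str               # summary of the matching paragraph
    source: str                # source file path or document ID
    method: str                # one of METHODS
    date: Optional[str] = None  # only present when the corpus provides it


def to_json(hit: SearchHit) -> str:
    """Serialize a hit, rejecting unknown retrieval-method labels."""
    if hit.method not in METHODS:
        raise ValueError(f"unknown retrieval method: {hit.method}")
    # ensure_ascii=False keeps Chinese titles readable in the output.
    return json.dumps(asdict(hit), ensure_ascii=False)
```

Keeping the source and method on every hit lets a reader check each returned passage against the corpus.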

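If nothing matches, do not fabricate an answer. State clearly:

  • That nothing was found in the current corpus
  • Whether the title or the keyword failed to match
  • If the user agrees, suggest adding aliases, tidying metadata, or enabling hybrid retrieval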
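
The no-match rules above can be sketched as a message builder. The function name, its parameters, and the exact wording are hypothetical; the point is only that the report names what failed instead of inventing a result.

```python
from typing import Optional


def describe_miss(title: Optional[str] = None, keyword: Optional[str] = None,
                  title_hit: bool = False, keyword_hit: bool = False) -> str:
    """Build a no-match report that says which part of the query failed."""
    lines = ["No match found in the current corpus."]
    if title is not None and not title_hit:
        lines.append(f"Title not matched: {title}")
    if keyword is not None and not keyword_hit:
        lines.append(f"Keyword not matched: {keyword}")
    lines.append(
        "With the user's consent, possible next steps are adding aliases, "
        "tidying metadata, or enabling hybrid retrieval."
    )
    return "\n".join(lines)
```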

References

  • Data format: references/corpus-format.md
  • Retrieval rules: references/retrieval-rules.md
  • Output structure: references/output-schema.md
Usage Guidance
This skill appears to be a local search tool over the included corpus and is coherent with that purpose, but pay attention before enabling the hybrid/vector (RAG) mode. By default the SKILL.md states it operates locally, but if you turn on rag.enabled the scripts will call embedding/reranker APIs and will look for an API key. Check the following before installing or using:

  • Keep rag.enabled false unless you intentionally want remote embedding/reranking. With rag.enabled=false the skill uses local FTS/lexical search only.
  • If you do enable rag, set a dedicated environment variable (e.g., MAO_SKILL_API_KEY) and a dedicated base_url for only this skill. Do NOT let the skill inherit a platform-wide rag.api.api_key_env value, since that could expose a global API key used by other integrations. Confirm config/search.json is edited to point to a credential you control.
  • Inspect the scripts (scripts/search.py, build_index.py, common.py) to confirm precisely what is sent to the remote endpoint (full text, excerpts, or just hashes). If you are privacy-sensitive, keep RAG disabled or run embedding with a local model/provider you trust.
  • Note the skill includes ~400 local Markdown files (large corpus). Confirm you are allowed to host/use this corpus and that local storage use is acceptable.
Capability Assessment
Purpose & Capability
Name/description, included data/ corpus and the search/build_index scripts align with a local retrieval/search skill for 《毛泽东选集》. The code and docs describe building a local SQLite/FTS index and optional vector indexes — these are expected for the stated purpose.
Instruction Scope
SKILL.md limits external calls to an opt-in RAG mode (rag.enabled) and documents local-only defaults. However the runtime instructions include commands to 'test-model', enable 'rag.enabled' and set API keys; those commands will cause network calls outside of the local corpus if enabled. The instructions otherwise stay within the stated retrieval scope and reference only the local data/ directory.
Install Mechanism
No install spec is declared (instruction-only install). The skill includes Python scripts (build_index.py, search.py, etc.) and reads local Markdown files — no dubious remote download/install behavior is declared.
Credentials
The manifest declares no required env vars, but SKILL.md recommends MAO_SKILL_API_KEY for embeddings and documents that embedding/rerank configuration will inherit rag.api.base_url and rag.api.api_key_env by default. Automatically inheriting a global rag.* API key setting is disproportionate: it risks the skill using platform/global credentials without a clear explicit opt-in. Also the SKILL.md example default base_url (https://api.siliconflow.cn/v1) is a third‑party endpoint; if enabled it would send text/embeddings to an external service. Require explicit, per-skill credential configuration rather than implicit inheritance.
Persistence & Privilege
always is false and there is no install routine that requests permanent system-wide presence or writes to other skill configs. The skill stores and reads its own data/ and config/search.json only.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install mao-selected-works
  3. After installation, invoke the skill by name or use /mao-selected-works
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Improved script compatibility
v1.0.0
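《毛选》mao-selected-works v1.0.0 provides local, structured, keyword-based retrieval over 《毛泽东选集》.

  • Multi-dimensional retrieval: query by catalog, title, or keyword; users can choose exact matching or combined search by volume, chapter, or paragraph.
  • Hybrid search and reranking: integrates vector-based semantic retrieval with traditional retrieval, supports compatible APIs through explicit configuration, and offers result reranking to improve search accuracy.
  • CLI toolchain: built-in command-line tools cover index building, content retrieval, and configuration management.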
Metadata
Slug mao-selected-works
Version 1.0.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is 《毛选》1-7卷文本查询?

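Trigger: use when the user wants to search the full text of 《毛泽东选集》 (Selected Works of Mao Zedong), locate the original text by volume or article, or find content by title or keyword. This skill targets OpenClaw. By default it runs locally with structured retrieval and keyword retrieval; hybrid retrieval, which combines vector recall and reranking, is used only after it is explicitly enabled in the configuration file. It is an AI Agent Skill for Claude Code / OpenClaw, with 207 downloads so far.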

How do I install 《毛选》1-7卷文本查询?

Run "/install mao-selected-works" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is 《毛选》1-7卷文本查询 free?

Yes, 《毛选》1-7卷文本查询 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 《毛选》1-7卷文本查询 support?

《毛选》1-7卷文本查询 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 《毛选》1-7卷文本查询?

It is built and maintained by henryczq (@henryczq); the current version is v1.0.1.
