
Knowledge Base Archiver (知识库归档系统)

Author: 141553 · GitHub · v1.1.0 · MIT-0
cross-platform ⚠ suspicious
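Total downloads: 95 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install kb-archiver
Description
Smart local knowledge-base archiving system, v1.1.0. Supports AI classification, batch archiving, full-text search, and statistics reports. Files are classified and archived into a local knowledge base automatically, and a full-text index is extracted so that searches return within seconds. Small files are stored locally; large files can be sent to cloud storage. Supports Excel, Word, PPT, PDF, TXT, and other formats. When the user needs to: archive files, build a knowledge base, run full-text search over document contents, manage large numbers of work...
Usage (SKILL.md)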

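Knowledge Base Archiver v1.1.0

A smart local knowledge-base archiving solution with AI classification, batch archiving, full-text search, and statistics reports.

Quick start

Archive a single file

# Basic usage (keyword classification)
node _scripts/archive.mjs /path/to/file.xlsx

# Specify a category
node _scripts/archive.mjs /path/to/file.xlsx "工作文件"

# AI classification
node _scripts/archive.mjs /path/to/file.xlsx --ai-classify

Batch archiving

# Archive an entire folder
node _scripts/archive.mjs /path/to/folder/

# Filter by file type
node _scripts/archive.mjs /path/to/folder/ --pattern "*.xlsx"

# Batch AI classification
node _scripts/archive.mjs /path/to/folder/ --ai-classify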
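
The --pattern filter can be pictured as a glob match against file names. The sketch below is illustrative only: globToRegExp and filterByPattern are hypothetical helpers, not functions taken from archive.mjs, and it assumes that only the * and ? wildcards matter and that matching is case-insensitive.

```javascript
// Hypothetical helper: turn a simple glob such as "*.xlsx" into a RegExp.
// Only "*" (any run of characters) and "?" (one character) are supported.
function globToRegExp(glob) {
  // Escape every regex metacharacter except the two wildcards.
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const body = escaped.replace(/\*/g, '.*').replace(/\?/g, '.');
  return new RegExp(`^${body}$`, 'i'); // case-insensitive is an assumption
}

// Keep only the file names that match the glob.
function filterByPattern(fileNames, glob) {
  const re = globToRegExp(glob);
  return fileNames.filter((name) => re.test(name));
}

console.log(filterByPattern(['a.xlsx', 'b.docx', 'C.XLSX'], '*.xlsx'));
```

Escaping before expanding the wildcards keeps a literal dot in "*.xlsx" from matching arbitrary characters.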

Search

# Search for a keyword
node _scripts/archive.mjs search "门店"

# Filter by category
node _scripts/archive.mjs search "数据" --category "工作文件"

Statistics

node _scripts/archive.mjs stats
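
As a rough picture of what a statistics pass over the archive manifest could compute, here is a minimal sketch. The entry shape ({ category, size }) and the summarize helper are assumptions made for illustration; the real layout of _manifest.json is not documented here.

```javascript
// Assumed manifest entry shape: { category: string, size: number (bytes) }.
function summarize(entries) {
  const byCategory = {};
  let totalBytes = 0;
  for (const { category, size } of entries) {
    byCategory[category] = (byCategory[category] || 0) + 1; // per-category count
    totalBytes += size;                                     // overall size
  }
  return { total: entries.length, totalBytes, byCategory };
}

console.log(summarize([
  { category: '工作文件', size: 2048 },
  { category: '工作文件', size: 1024 },
  { category: '参考资料', size: 512 },
]));
```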

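Directory structure

knowledge-base/
├── 工作文件/          ← work files: data reports, sales results, etc.
├── 方案文档/          ← plan documents: plans, proposals, strategies, etc.
├── 参考资料/          ← reference material: talking-point scripts, templates, training tutorials, etc.
├── 其他文档/          ← other documents: uncategorized
├── _index/            ← full-text index
│   ├── _manifest.json ← archive manifest
│   └── *.txt          ← index files
└── _scripts/
    └── archive.mjs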

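Categories

| Category | Keywords | Description |
| --- | --- | --- |
| 工作文件 (work files) | 数据, 报表, 统计, 门店, 业绩, 订单 (data, report, statistics, store, performance, order) | Day-to-day operational data |
| 方案文档 (plan documents) | 方案, 计划, 策略, 制度, 规范 (proposal, plan, strategy, policy, standard) | Planning documents |
| 参考资料 (reference material) | 话术, 模板, 培训, 教程, 案例 (talking points, template, training, tutorial, case study) | Learning and reference material |
| 其他文档 (other documents) | - | Anything that fits none of the above |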
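
Keyword classification along these lines can be sketched as a first-match lookup. The keyword lists are copied from the table above; the matching rule (first category whose keyword appears in the text wins) and the classifyByKeyword name are assumptions, not the script's confirmed logic.

```javascript
// Keyword lists from the categories table; order decides ties (an assumption).
const CATEGORY_KEYWORDS = [
  ['工作文件', ['数据', '报表', '统计', '门店', '业绩', '订单']],
  ['方案文档', ['方案', '计划', '策略', '制度', '规范']],
  ['参考资料', ['话术', '模板', '培训', '教程', '案例']],
];
const DEFAULT_CATEGORY = '其他文档';

// Return the first category with a keyword contained in the text.
function classifyByKeyword(text) {
  for (const [category, keywords] of CATEGORY_KEYWORDS) {
    if (keywords.some((kw) => text.includes(kw))) return category;
  }
  return DEFAULT_CATEGORY;
}

console.log(classifyByKeyword('门店销售报表.xlsx'));
console.log(classifyByKeyword('readme.txt'));
```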

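AI classification

Pass --ai-classify to enable AI classification:

  • Runs semantic analysis on the file name plus a content summary
  • Picks the best-fitting category automatically
  • Falls back to keyword matching automatically when the AI is unavailable

Configuration (optional):

# Set environment variables
export OPENCLAW_MODEL="your-model"
export OPENCLAW_API_ENDPOINT="http://localhost:11434/api/chat"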
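
The fallback behaviour can be sketched as a small wrapper that tries the AI classifier first and drops to keyword matching on any failure. Everything here is hypothetical: classifyWithFallback is not a function from archive.mjs, both classifiers are passed in as plain functions so no real API is assumed, and treating an unrecognized label as a failure is a design choice made for this sketch.

```javascript
// Hypothetical wrapper: AI first, keyword matching as the safety net.
function classifyWithFallback(text, aiClassify, keywordClassify, allowed) {
  try {
    const answer = String(aiClassify(text)).trim();
    // Accept the AI answer only if it names a known category.
    if (allowed.includes(answer)) return { category: answer, source: 'ai' };
  } catch (err) {
    // AI unavailable (no endpoint, timeout, bad response): fall through.
  }
  return { category: keywordClassify(text), source: 'keyword' };
}

const allowed = ['工作文件', '方案文档', '参考资料', '其他文档'];
const failingAi = () => { throw new Error('endpoint unreachable'); };
const result = classifyWithFallback('季度计划', failingAi, () => '方案文档', allowed);
console.log(result);
```

Returning the source alongside the category makes it easy to log how often the fallback path is taken.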

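Supported formats

| Format | Extraction method | Description |
| --- | --- | --- |
| .xlsx | Python openpyxl | Excel spreadsheets |
| .docx | ZIP parsing | Word documents |
| .pptx | ZIP parsing | PowerPoint |
| .pdf | Direct read | PDF text |
| .txt/.csv/.md/.json/.xml/.html/.log | Direct read | Text files |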
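
One way to picture the dispatch implied by this table is an extension-to-extractor lookup. This is a sketch under assumptions: the labels ('python-openpyxl', 'zip', 'direct') and the pickExtractor helper are illustrative names, and lower-casing the extension is a choice made here rather than documented behaviour.

```javascript
const path = require('path');

// Extension-to-extractor table mirroring the supported-formats list.
const EXTRACTORS = {
  '.xlsx': 'python-openpyxl',
  '.docx': 'zip',
  '.pptx': 'zip',
  '.pdf': 'direct',
};
for (const ext of ['.txt', '.csv', '.md', '.json', '.xml', '.html', '.log']) {
  EXTRACTORS[ext] = 'direct';
}

// Returns the extractor label, or null when the format is not supported.
function pickExtractor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return EXTRACTORS[ext] ?? null;
}

console.log(pickExtractor('/data/Report.XLSX'));
console.log(pickExtractor('/data/setup.exe'));
```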

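Cloud storage integration (optional)

Tencent Cloud COS, AWS S3, Alibaba Cloud OSS, and similar services are supported; edit the configuration in the script: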

const CLOUD_STORAGE = {
  enabled: true,
  type: 'cos',
  bucket: 'mybucket-1250000000',
  prefix: 'knowledge-base/',
  command: (filepath, remotePath) => `coscmd upload "${filepath}" "${remotePath}"`,
};

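FAQ

Q: How are large files handled?

A: Files larger than 10 MB are uploaded to cloud storage automatically (requires configuration), and only the index is kept locally. If cloud storage is not configured, the upload is skipped but the index is still created.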
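
The size rule can be sketched as a small decision helper. The planArchive name is hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption; the script may use a different constant.

```javascript
// Assumed threshold: 10 MB interpreted as 10 * 1024 * 1024 bytes.
const LARGE_FILE_BYTES = 10 * 1024 * 1024;

function planArchive(sizeBytes, cloudEnabled) {
  const isLarge = sizeBytes > LARGE_FILE_BYTES;
  return {
    upload: isLarge && cloudEnabled, // large files are uploaded only when cloud storage is configured
    createIndex: true,               // the index is created in every case
  };
}

console.log(planArchive(50 * 1024 * 1024, true));
console.log(planArchive(50 * 1024 * 1024, false));
```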

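Q: What about encrypted or password-protected files?

A: Content cannot be extracted from encrypted Office files; an error message is recorded instead. Decrypt the file before archiving it.

Q: What if a file is corrupted and cannot be read?

A: The script catches the error and records it without interrupting batch processing. The index entry for a corrupted file is marked [提取失败: ...] ("extraction failed").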
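
The "catch, record, continue" behaviour can be sketched as a per-file try/catch inside the batch loop. The extractAll helper and the result shape are hypothetical; only the [提取失败: ...] marker comes from the answer above.

```javascript
// Hypothetical batch loop: one failing file must not stop the rest.
function extractAll(files, extract) {
  return files.map((file) => {
    try {
      return { file, text: extract(file), ok: true };
    } catch (err) {
      // Record the failure in place of the extracted text and keep going.
      return { file, text: `[提取失败: ${err.message}]`, ok: false };
    }
  });
}

// Stand-in extractor that fails on one file.
const fakeExtract = (file) => {
  if (file === 'broken.xlsx') throw new Error('corrupt archive');
  return `contents of ${file}`;
};
console.log(extractAll(['a.txt', 'broken.xlsx', 'b.txt'], fakeExtract));
```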

Q: How do I configure cloud storage?

Tencent Cloud COS:

# Install coscmd
pip install coscmd
# Configure
coscmd config -a <SecretId> -s <SecretKey> -b <Bucket> -r <Region>

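AWS S3:

# Install aws-cli
pip install awscli
# Configure
aws configure

Alibaba Cloud OSS:

# Download ossutil (the download is a zip archive; unpack it before running the binary)
wget https://gosspublic.alicdn.com/ossutil/1.7.14/ossutil-v1.7.14-linux-amd64.zip
# Configure
./ossutil config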

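Q: How do I skip files that already exist during batch archiving?

A: The script checks the manifest for a record with the same file name and size; files that are already archived are skipped automatically.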
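
A name-plus-size check like this can be sketched in a few lines. The manifest shape (an array of { name, size } records) and the helper names are assumptions for illustration. Note that matching on name and size alone cannot tell apart two different files that share both.

```javascript
// Assumed manifest shape: an array of { name, size } records.
function isAlreadyArchived(manifest, name, size) {
  return manifest.some((entry) => entry.name === name && entry.size === size);
}

// Split a batch into files to process and files to skip.
function partitionBatch(manifest, candidates) {
  const todo = [];
  const skipped = [];
  for (const file of candidates) {
    (isAlreadyArchived(manifest, file.name, file.size) ? skipped : todo).push(file.name);
  }
  return { todo, skipped };
}

const manifest = [{ name: 'sales.xlsx', size: 1200 }];
console.log(partitionBatch(manifest, [
  { name: 'sales.xlsx', size: 1200 }, // same name and size: skipped
  { name: 'sales.xlsx', size: 1500 }, // same name, new size: processed
  { name: 'plan.docx', size: 800 },
]));
```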

Q: What if a search returns too many results?

A: Use the --category option to filter by category and narrow the search.
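
Category filtering amounts to narrowing the candidate set before the keyword match. A minimal sketch, assuming index entries shaped like { name, category, text } and a hypothetical searchIndex helper; the real index layout under _index/ may differ.

```javascript
// Assumed entry shape: { name, category, text }.
function searchIndex(entries, keyword, category) {
  const needle = keyword.toLowerCase();
  return entries
    .filter((e) => !category || e.category === category) // optional category filter
    .filter((e) => e.text.toLowerCase().includes(needle)) // plain substring match
    .map((e) => e.name);
}

const entries = [
  { name: 'q1.xlsx', category: '工作文件', text: '门店 数据 汇总' },
  { name: 'guide.md', category: '参考资料', text: '数据 录入 教程' },
];
console.log(searchIndex(entries, '数据'));
console.log(searchIndex(entries, '数据', '工作文件'));
```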

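Edge cases

| Scenario | Handling |
| --- | --- |
| File does not exist | Exits with an error |
| Duplicate file name | A numeric suffix is appended automatically |
| Unsupported format | An index entry is created and marked as not extractable |
| AI classification fails | Falls back to keyword classification automatically |
| Cloud upload fails | The error is recorded and the local index is still created |
| Batch run interrupted | Files already processed are saved; rerun to continue |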
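
The duplicate-name rule can be sketched as follows. The exact suffix format the script uses is not documented, so the name_1.ext scheme and the uniqueName helper are assumptions for illustration.

```javascript
const path = require('path');

// Hypothetical naming scheme: report.xlsx, report_1.xlsx, report_2.xlsx, ...
// `taken` is a Set of file names already present in the target folder.
function uniqueName(fileName, taken) {
  if (!taken.has(fileName)) return fileName;
  const { name, ext } = path.parse(fileName);
  for (let i = 1; ; i += 1) {
    const candidate = `${name}_${i}${ext}`;
    if (!taken.has(candidate)) return candidate;
  }
}

const taken = new Set(['report.xlsx', 'report_1.xlsx']);
console.log(uniqueName('report.xlsx', taken));
console.log(uniqueName('fresh.docx', taken));
```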

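Changelog

v1.1.0

  • ✨ AI classification (based on semantic analysis)
  • ✨ Batch archiving of entire folders
  • ✨ Search command (with highlighted matches)
  • ✨ Stats command (per-category and overall statistics)
  • ✨ Progress bar
  • ✨ File filtering (--pattern)
  • 📝 Documentation improvements (FAQ, edge cases)

v1.0.0

  • 🎉 Initial release
  • ✨ Automatic classification (keyword matching)
  • ✨ Full-text index
  • ✨ Cloud storage support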
Security recommendations
This skill appears to implement the claimed archiving and search features, but take precautions before running it on sensitive data:

  • Metadata omission: The skill metadata declares no env vars, yet the script uses OPENCLAW_MODEL and OPENCLAW_API_ENDPOINT and supports cloud uploads that require secrets. Treat any model endpoints or cloud credentials you provide as able to receive extracted document content.
  • Shell execution risks: The script uses execSync with interpolated file paths and runs Python code via shell. Filenames containing quotes or special characters could cause command-line injection in some cases. Avoid running the tool on untrusted filenames or run it in a sandbox/container.
  • Data exfiltration surface: If you enable cloud storage or set OPENCLAW_API_ENDPOINT to a remote service, document contents (including large-file uploads or AI classification payloads) may be transmitted off your machine. Only configure endpoints/credentials you trust.
  • Recommended actions before use: review the full archive.mjs content (especially how file paths are used in execSync), run the script in an isolated environment (container or VM), test with non-sensitive sample files, and prefer local-only operation (keep CLOUD_STORAGE.enabled=false and do not set OPENCLAW_API_ENDPOINT) if you must index private documents.

If you want higher confidence, provide the full (non-truncated) script or confirm whether filenames are sanitized and whether the script escapes shell arguments when calling external commands.
Functional analysis
Type: OpenClaw Skill · Name: kb-archiver · Version: 1.1.0
The skill is classified as suspicious due to multiple critical Command Injection vulnerabilities in `_scripts/archive.mjs`. The script frequently uses `execSync` to invoke Python for document parsing (Excel, Word, PPT) and cloud storage tools, interpolating file paths directly into shell commands without adequate sanitization. While these behaviors appear intended for the stated purpose of archiving and indexing files, the lack of input validation allows for arbitrary code execution via maliciously crafted filenames. No evidence of intentional data exfiltration or backdoors was found, but the implementation is highly insecure.
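
One common mitigation for this class of issue is to avoid the shell entirely and pass the file path as a separate argument with Node's execFileSync. The sketch below is not the script's code: runWithPathArg is a hypothetical helper, and the child process is just Node echoing its last argument so the example stays self-contained.

```javascript
const { execFileSync } = require('child_process');

// Sketch: the file path travels as its own argv entry, so no shell parses it.
function runWithPathArg(command, fixedArgs, filePath) {
  return execFileSync(command, [...fixedArgs, filePath], { encoding: 'utf8' });
}

// A file name that would be dangerous inside a shell string is plain data here.
const tricky = 'report"; echo injected; ".txt';
const echoLastArg = 'console.log(process.argv[process.argv.length - 1])';
const out = runWithPathArg(process.execPath, ['-e', echoLastArg], tricky);
console.log(out.trim() === tricky);
```

The same pattern would apply to calls such as python3 or a cloud CLI: keep the command and its fixed options in an array, and append the path as the final element instead of interpolating it into a command string.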
Capability assessment
Purpose & Capability
The name/description (local KB archiver with AI classification, indexing, cloud upload) align with the included script's functionality. The script implements extraction, indexing, AI classification and optional cloud upload, which are expected for this purpose. However, the SKILL metadata declares no required env vars/credentials while the README and code reference OPENCLAW_MODEL / OPENCLAW_API_ENDPOINT and optional cloud upload configuration—an inconsistency between declared requirements and actual runtime behavior.
Instruction Scope
SKILL.md advises running the included Node script which will read arbitrary files/folders to extract content, call external tools, and optionally upload large files. The script executes shell commands (execSync) to call 'openclaw chat', python3 snippets, and user-defined cloud commands. Those instructions allow the agent to read and process any filesystem path you pass in, and to send file contents to configured external endpoints (OpenClaw API, cloud storage) — which is expected functionality but also a potential data-exfiltration vector if misconfigured.
Install Mechanism
There is no install spec (instruction-only plus a script file), so nothing is automatically downloaded during install. That reduces installer-level risk. The SKILL.md recommends installing third-party CLIs (coscmd, awscli, ossutil) via pip/wget; those are external installation steps the user would perform, and the script will call them if cloud upload is enabled.
Credentials
Metadata lists no required environment variables or credentials, but the script reads environment variables (OPENCLAW_MODEL, OPENCLAW_API_ENDPOINT) and has hooks for cloud storage commands that require cloud secrets (COS/S3/OSS). Asking for cloud credentials or model endpoints is proportionate to optional cloud/upload and AI classification features — but these env vars/credentials are not declared in the skill metadata, which is a mismatch and means a user might not realize what secrets the skill can use or require.
Persistence & Privilege
The skill is not marked always:true and does not request system-wide config paths. It runs as a script when invoked and does not claim persistent or privileged presence in the agent environment.
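How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install kb-archiver
  3. Once installed, invoke the skill by name or trigger it with /kb-archiver
  4. Provide the inputs described in the skill's parameter notes to get structured output
Version history
v1.1.0
v1.1.0: AI classification, batch archiving, search command, stats command, documentation improvements
v1.0.0
Initial release: automatic classification, full-text index, tiered storage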
Metadata
Slug: kb-archiver
Version: 1.1.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
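Frequently asked questions

What is Knowledge Base Archiver?

Smart local knowledge-base archiving system, v1.1.0. Supports AI classification, batch archiving, full-text search, and statistics reports. Files are classified and archived into a local knowledge base automatically, and a full-text index is extracted so that searches return within seconds. Small files are stored locally; large files can be sent to cloud storage. Supports Excel, Word, PPT, PDF, TXT, and other formats. When the user needs to: archive files, build a knowledge base, run full-text search over document contents, manage large numbers of work... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 95 times so far.

How do I install Knowledge Base Archiver?

Run the command "/install kb-archiver" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required to install it.

Is Knowledge Base Archiver free?

Yes. Knowledge Base Archiver is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does Knowledge Base Archiver support?

Knowledge Base Archiver is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Knowledge Base Archiver?

It is developed and maintained by 141553 (@141553). The current version is v1.1.0.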
