
Knowledge Importer

Name: Knowledge Importer
Author: sunshinegw

Author: sunshinegw · GitHub · v1.3.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 119

Install in OpenClaw

/install knowledge-importer

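Description

Converts documents in Word, Excel, PPT, PDF, MD, and other formats to Markdown and saves them to an Obsidian knowledge base. Images can be uploaded to an image host, producing external URL links. Use this skill when the user needs to: 1) import documents into the knowledge base, 2) convert files to MD format, or 3) extract document content while preserving images.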

Usage instructions (SKILL.md)

Knowledge Importer

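Converts documents in a variety of formats to Markdown and saves them to the knowledge base.

Environment configuration

Before first use, configure the following environment variables or edit scripts/config.py:

# Image host server address
export DUFS_SERVER_URL="http://<your-server-ip>:<port>"

# Knowledge base storage path
export KNOWLEDGE_BASE_PATH="/path/to/your/Obsidian/vault"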
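
A minimal sketch of how a script could read these two settings from the environment. The `Settings` type and the `load_settings` helper are hypothetical names used for illustration; the bundled scripts/config.py may resolve these values differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    server_url: Optional[str]  # None means no image host is configured
    knowledge_base: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH (hypothetical helper)."""
    # Strip a trailing slash so later URL joins do not produce "//".
    server_url = env.get("DUFS_SERVER_URL", "").strip().rstrip("/") or None
    kb_path = env.get("KNOWLEDGE_BASE_PATH", "").strip()
    if not kb_path:
        raise ValueError("KNOWLEDGE_BASE_PATH is not set")
    return Settings(server_url=server_url,
                    knowledge_base=Path(kb_path).expanduser())
```

Failing early when KNOWLEDGE_BASE_PATH is missing avoids writing Markdown files into an unintended directory.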

Supported formats

Format	Extension	Library	Image handling
Word	.docx	python-docx	✅ Image host / Base64
Excel	.xlsx / .xls	openpyxl	-
PPT	.pptx	python-pptx	✅ Image host / Base64
PDF	.pdf	pdfplumber	✅ Image host / Base64
Markdown	.md	Native support	-
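
The table above can be expressed as a small lookup, sketched here under the assumption that the format is chosen purely by file extension. `SUPPORTED_FORMATS`, `FORMATS_WITH_IMAGES`, and `detect_format` are hypothetical names, not identifiers from import_doc.py.

```python
from pathlib import Path

# Extension -> format name, mirroring the supported-formats table.
SUPPORTED_FORMATS = {
    ".docx": "Word",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".pptx": "PPT",
    ".pdf": "PDF",
    ".md": "Markdown",
}

# Formats whose embedded images are handled, per the same table.
FORMATS_WITH_IMAGES = {"Word", "PPT", "PDF"}


def detect_format(path: str) -> str:
    """Return the format name for a file, or raise for unsupported types."""
    suffix = Path(path).suffix.lower()
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None
```

For example, `detect_format("Report.DOCX")` resolves to `"Word"` because the extension is compared case-insensitively.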

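Image handling

Option 1: Upload to an image host (recommended)

Once an image host server is configured, images are uploaded to it and referenced by external URL:

DUFs_CONFIG = {
    "server_url": "http://<your-server-ip>:<port>",
    "timeout": 30,
    "retry_times": 3,
}

Upload path: http://<your-server-ip>:<port>/Picture/<uuid>.png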
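
A minimal sketch of an upload that follows the path pattern and the timeout and retry settings shown above, using only the Python standard library. The review text on this page states that images are sent with HTTP PUT; everything else here (the function names, the injectable `opener` parameter, retrying immediately without a delay) is an assumption made for illustration, not the bundled implementation.

```python
import urllib.error
import urllib.request
import uuid


def build_upload_url(server_url: str, ext: str = ".png") -> str:
    """Build <server_url>/Picture/<uuid><ext> for a new image."""
    return f"{server_url.rstrip('/')}/Picture/{uuid.uuid4()}{ext}"


def upload_image(data: bytes, server_url: str, timeout: float = 30,
                 retry_times: int = 3, opener=urllib.request.urlopen) -> str:
    """PUT the image bytes and return the resulting URL.

    Raises RuntimeError (chained to the last network error) if every
    attempt fails.
    """
    url = build_upload_url(server_url)
    last_error = None
    for _ in range(retry_times):
        request = urllib.request.Request(url, data=data, method="PUT")
        try:
            with opener(request, timeout=timeout):
                return url
        except (urllib.error.URLError, OSError) as exc:
            last_error = exc  # remember the failure and try again
    raise RuntimeError(f"Upload failed after {retry_times} attempts") from last_error
```

The `opener` parameter exists only so the function can be exercised without a live server; in normal use the default `urllib.request.urlopen` is called.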

Option 2: Inline Base64 (fallback)

If the image host is unavailable, the skill automatically falls back to inline Base64:

![image](data:image/png;base64,iVBORw0KGgo...)
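
A short sketch of the fallback idea: try an upload first and embed the image as a Base64 data URI when no uploader is available or the upload fails. Both function names and the `upload` callable are hypothetical; how the bundled script detects an unavailable image host is not documented here.

```python
import base64


def to_data_uri_markdown(data: bytes, mime: str = "image/png",
                         alt: str = "image") -> str:
    """Embed image bytes as a Base64 data URI in a Markdown image tag."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"![{alt}](data:{mime};base64,{encoded})"


def image_markdown(data: bytes, upload=None) -> str:
    """Prefer an uploaded URL; fall back to Base64 if upload is missing or fails."""
    if upload is not None:
        try:
            return f"![image]({upload(data)})"
        except Exception:
            pass  # image host unavailable: fall through to inline embedding
    return to_data_uri_markdown(data)
```

Inline Base64 keeps the note self-contained but makes the Markdown file considerably larger, which is one reason to prefer the image host when it is reachable.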

Obsidian CLI prerequisites

Enable the CLI in Obsidian:
- Settings → General → Command line interface → Enable
- Follow the prompts to complete registration

CLI command format:

xvfb-run obsidian create name="<file name>" content="<content>"
xvfb-run obsidian append file="<file>" content="<content>"
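
A sketch of calling these two commands from Python, assuming the CLI accepts the `name=`, `file=`, and `content=` arguments exactly as shown above. The helper names are hypothetical. Passing an argument list to `subprocess.run` bypasses the shell, so content containing quotes or newlines needs no manual escaping.

```python
import subprocess
from typing import List


def build_create_command(name: str, content: str) -> List[str]:
    """Argument list for: xvfb-run obsidian create name=... content=..."""
    return ["xvfb-run", "obsidian", "create", f"name={name}", f"content={content}"]


def build_append_command(file: str, content: str) -> List[str]:
    """Argument list for: xvfb-run obsidian append file=... content=..."""
    return ["xvfb-run", "obsidian", "append", f"file={file}", f"content={content}"]


def run_obsidian(command: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)
```

Very large documents may exceed the operating system's command-line length limit, so writing the Markdown file directly into the vault path can be the more robust route for long content.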

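Knowledge base directory structure

Default path: $KNOWLEDGE_BASE_PATH (see Environment configuration)

Directory structure (two-level classification)

$KNOWLEDGE_BASE_PATH/
├── 申报方案/
│   └── <industry/product>/
├── 解决方案/
│   └── <industry/product>/
├── 技术文档/
│   └── <industry/product>/
└── <other category>/

Classification principles

申报方案/ (application proposals): application documents, bid documents, construction-plan applications, and similar
解决方案/ (solutions): customer-facing solution documents
技术文档/ (technical documentation): product usage notes and technical deployment documents
<other category>/: define your own as needed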
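
The two-level layout can be sketched as a path builder. `target_path` is a hypothetical helper; it assumes the output file takes the source file's name without its extension, as the conversion rules in this document state, and that the second level is optional.

```python
from pathlib import Path
from typing import Optional


def target_path(knowledge_base: str, category: str, source_file: str,
                subcategory: Optional[str] = None) -> Path:
    """Return <knowledge_base>/<category>[/<subcategory>]/<source stem>.md."""
    folder = Path(knowledge_base) / category
    if subcategory:
        folder = folder / subcategory
    return folder / (Path(source_file).stem + ".md")
```

A caller would typically create the folder with `path.parent.mkdir(parents=True, exist_ok=True)` before writing, and check whether the file already exists to avoid overwriting a note.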

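Usage

1. Convert a single file

Import /path/to/document.docx into the knowledge base

2. Specify an output directory

Import the file into [directory name]

Conversion rules

File name: the original file name is kept (without the extension)
Images: images in Word/PPT/PDF files are extracted and uploaded to the image host (since v1.3.0, PPT images are not extracted by default; see the version history)
Tables: tables in Excel/PDF files are rendered as Markdown tables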
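
As an illustration of the table rule, the sketch below renders a list of rows as a Markdown table. It accepts rows in the shape that `openpyxl`'s `iter_rows(values_only=True)` or `pdfplumber`'s `extract_tables()` produce (sequences of cell values, with `None` for empty cells). Treating the first row as the header is an assumption, and `rows_to_markdown` is a hypothetical name.

```python
from typing import Optional, Sequence


def rows_to_markdown(rows: Sequence[Sequence[Optional[object]]]) -> str:
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row):
        cells = ["" if c is None else str(c) for c in row]
        cells += [""] * (width - len(cells))  # pad short rows
        # Escape pipes and flatten newlines so each row stays on one line.
        cells = [c.replace("|", "\\|").replace("\n", " ") for c in cells]
        return "| " + " | ".join(cells) + " |"

    lines = [fmt(rows[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in rows[1:]]
    return "\n".join(lines)
```

Merged cells and multi-row headers are not representable in basic Markdown tables, so a real converter has to flatten or approximate them.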

Running the script

# Change to the scripts directory
cd skills/knowledge-importer/scripts

# Set the environment variables (or edit config.py)
export DUFS_SERVER_URL="http://192.168.1.100:5000"
export KNOWLEDGE_BASE_PATH="/path/to/Obsidian"

# Convert a single file
python3 import_doc.py /path/to/document.docx

# Convert a folder in batch
python3 import_doc.py --batch /path/to/folder

# Show help
python3 import_doc.py --help

Installing dependencies

pip3 install python-docx python-pptx openpyxl pdfplumber
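
A quick way to confirm the packages above are importable before running a conversion. The mapping from pip package name to module name is standard (`python-docx` installs `docx`, `python-pptx` installs `pptx`); the `missing_packages` helper itself is a hypothetical convenience, not part of the skill.

```python
import importlib.util
from typing import Dict, List

# pip package name -> importable module name
REQUIRED = {
    "python-docx": "docx",
    "python-pptx": "pptx",
    "openpyxl": "openpyxl",
    "pdfplumber": "pdfplumber",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names whose modules cannot be found."""
    return [pkg for pkg, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: pip3 install " + " ".join(missing))
    else:
        print("All dependencies are installed.")
```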

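Recommended image hosts

Dufs: a lightweight file server with an upload API
- Docker deployment: docker run -v /path:/data -p 5000:5000 sigoden/dufs
PicList: supports multiple image hosts
Lsky Pro (兰空图床): a self-hosted image hosting solution

Security recommendations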

This skill appears to do what it says: convert docs to Markdown and upload any extracted images to the image host you configure. Before using it:
1) Only set DUFS_SERVER_URL to a server you trust (images will be PUT there); do not point it at unknown or public endpoints if images may contain sensitive info.
2) Set KNOWLEDGE_BASE_PATH carefully to avoid overwriting important files.
3) Ensure dependencies (python-docx, python-pptx, openpyxl, pdfplumber) and the Obsidian CLI/xvfb-run are installed if you intend to use CLI features.
4) Review scripts/config.py (or config.py.example) to confirm upload path, retry behavior, and timeout meet your safety needs.
If you want greater assurance, inspect the full import_doc.py (already bundled) and do a test run on non-sensitive documents first.

Functional analysis

Type: OpenClaw Skill
Name: knowledge-importer
Version: 1.3.0

The skill bundle provides a legitimate utility for converting various document formats (Word, Excel, PPT, PDF) into Markdown for use in Obsidian knowledge bases. The core logic in `scripts/import_doc.py` uses standard document processing libraries (python-docx, pdfplumber, etc.) and includes a feature to upload extracted images to a user-defined file server (Dufs) via HTTP PUT requests. No evidence of data exfiltration, malicious command execution, or prompt injection was found; the script's behavior aligns strictly with its documented purpose.

Capability assessment

✓ Purpose & Capability

Name/description (document -> Markdown, save to Obsidian, upload images) align with the included script: import_doc.py parses docx/pptx/pdf/xlsx/md, extracts images and uploads them to a configurable image host, and writes Markdown files into a user-specified knowledge base path.

ℹ Instruction Scope

SKILL.md instructs running the local import script, configuring DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH, and using the Obsidian CLI (via xvfb-run). The instructions cause network PUT uploads to the configured image host and write files into the provided vault path — this is expected but means the image host you specify will receive uploaded images, so only set it to a trusted endpoint and avoid importing sensitive images.

✓ Install Mechanism

No install spec (instruction-only with a bundled script). Dependencies are standard Python packages listed in README/SKILL.md (python-docx, python-pptx, openpyxl, pdfplumber). No third-party downloads or obscure URLs in the bundle.

ℹ Credentials

The skill requires configuring DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH (documented in SKILL.md/README) but the registry metadata listed no required env vars — minor metadata omission. No unexpected credentials are requested.

✓ Persistence & Privilege

always:false and user-invocable:true. The skill does not request permanent platform presence or modify other skills; it writes files only to the user-specified knowledge base path and temporary asset dirs.

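How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install knowledge-importer
Once installed, invoke the Skill by name or trigger it with /knowledge-importer
Provide the inputs described in the Skill's parameter documentation to get structured output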

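Version history

v1.3.0

feat: PPT import no longer extracts images by default, saving time and image-host storage (enable it with the extract_images=True parameter)

v1.2.0

feat: added automatic classification; the --category parameter specifies the category directory, and categories are otherwise matched automatically from keywords in the file name (申报方案/解决方案/技术文档/世校赛/职业教育, etc.)

v1.1.0

fix: PDF embedded-image extraction now uses PyMuPDF instead of pdf2image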

Metadata

Slug knowledge-importer

Version 1.3.0

License MIT-0

Total installs 0

Current installs 0

Version count 3

FAQ