
Knowledge Importer

by sunshinegw · GitHub ↗ · v1.3.0 · MIT-0
cross-platform · ✓ Security Clean · 119 downloads · 1 star · 0 active installs · 3 versions
Install in OpenClaw
/install knowledge-importer
Description
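Converts documents in Word, Excel, PPT, PDF, MD, and other formats to Markdown and saves them to an Obsidian knowledge base. Images can be uploaded to an image host, producing external URL links. Use this skill when the user needs to: 1) import documents into the knowledge base, 2) convert files to MD format, or 3) extract document content while preserving images.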
README (SKILL.md)

Knowledge Importer

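Converts documents in various formats to Markdown and saves them to the knowledge base.

Environment Configuration

Before first use, set the following environment variables or edit scripts/config.py:

# Image host server URL
export DUFS_SERVER_URL="http://<your-server-ip>:<port>"

# Knowledge base storage path
export KNOWLEDGE_BASE_PATH="/your/Obsidian/path"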
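
As a rough illustration of how a script could pick up these two settings, the sketch below reads the environment variables and falls back to placeholder defaults. The `load_settings` helper and its default values are assumptions made for this example; the bundled scripts/config.py may be organized differently.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Read the two documented settings, falling back to placeholder defaults."""
    env = os.environ if env is None else env
    # Hypothetical defaults; replace them with your own server and vault path.
    server_url = env.get("DUFS_SERVER_URL", "http://127.0.0.1:5000").rstrip("/")
    kb_path = Path(env.get("KNOWLEDGE_BASE_PATH", "~/Obsidian")).expanduser()
    return {"server_url": server_url, "knowledge_base_path": kb_path}


if __name__ == "__main__":
    print(load_settings())
```

Stripping the trailing slash from the server URL keeps later path joins such as `/Picture/...` from producing a double slash.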

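Supported Formats

| Format | Extension | Library | Image handling |
| --- | --- | --- | --- |
| Word | .docx | python-docx | ✅ Image host / Base64 |
| Excel | .xlsx / .xls | openpyxl | - |
| PPT | .pptx | python-pptx | ✅ Image host / Base64 |
| PDF | .pdf | pdfplumber | ✅ Image host / Base64 |
| Markdown | .md | Native support | - |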

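Image Handling

Option 1: Image host upload (recommended)

Once an image host server is configured, images are uploaded to it and referenced by external URL: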

DUFs_CONFIG = {
    "server_url": "http://<your-server-ip>:<port>",
    "timeout": 30,
    "retry_times": 3,
}

Upload path: http://<your-server-ip>:<port>/Picture/<uuid>.png
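
The upload step can be sketched roughly as follows: build a `/Picture/<uuid>.png` URL, send the bytes with an HTTP PUT, and retry up to `retry_times`. This is a minimal sketch, not the code in import_doc.py. The function names, the hex UUID form, the one-second pause, and the injectable `send` callable are all assumptions made for this example.

```python
import time
import urllib.request
import uuid


def build_upload_url(server_url, ext="png"):
    """Return <server_url>/Picture/<uuid>.<ext>, following the documented path layout."""
    # Assumption: a hex UUID; the README does not say which UUID form is used.
    return f"{server_url.rstrip('/')}/Picture/{uuid.uuid4().hex}.{ext}"


def put_bytes(url, data, timeout):
    """Send one HTTP PUT with the image bytes; raises on network or HTTP errors."""
    req = urllib.request.Request(url, data=data, method="PUT")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def upload_image(data, config, send=put_bytes, pause=1.0):
    """Try the upload up to retry_times; return the URL, or None if every attempt fails."""
    url = build_upload_url(config["server_url"])
    for attempt in range(config["retry_times"]):
        try:
            send(url, data, config["timeout"])
            return url
        except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
            if attempt < config["retry_times"] - 1:
                time.sleep(pause)
    return None
```

Returning `None` instead of raising lets the caller decide what to do when the host is unreachable, for example falling back to Base64 embedding.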

Option 2: Base64 embedding (fallback)

If the image host is unavailable, the skill automatically falls back to embedding images as Base64:

![image](data:image/png;base64,iVBORw0KGgo...)
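
For illustration, the fallback can be sketched as a small helper that produces a data-URI image link like the one above. The names `to_data_uri_markdown` and `image_markdown`, and the `upload` callable that returns a URL or `None`, are hypothetical and not taken from the bundled script.

```python
import base64


def to_data_uri_markdown(data, mime="image/png", alt="image"):
    """Embed raw image bytes as a Markdown image with a Base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"![{alt}](data:{mime};base64,{encoded})"


def image_markdown(data, upload=None):
    """Prefer an external URL from `upload`; fall back to Base64 if it yields nothing."""
    url = upload(data) if upload is not None else None
    return f"![image]({url})" if url else to_data_uri_markdown(data)
```

Base64 grows the payload by about a third, so large embedded images make the resulting Markdown file noticeably heavier than an external link would.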

Obsidian CLI Prerequisites

  1. Enable the CLI in Obsidian

    • Settings → General → Command line interface → Enable
    • Follow the prompts to complete registration
  2. CLI command format

    xvfb-run obsidian create name="File name" content="Content"
    xvfb-run obsidian append file="File" content="Content"
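
If these commands are driven from Python, one option is to pass an argument list to `subprocess.run` so that quotes and newlines in the note content need no shell escaping. The sketch below assumes only the two subcommands shown above and the `key=value` argument form. `obsidian_command` and `run_obsidian` are hypothetical helpers, not part of the skill.

```python
import subprocess


def obsidian_command(action, **fields):
    """Build an argument list such as ['xvfb-run', 'obsidian', 'create', 'name=Note']."""
    if action not in ("create", "append"):
        raise ValueError(f"unsupported action: {action}")
    return ["xvfb-run", "obsidian", action] + [f"{k}={v}" for k, v in fields.items()]


def run_obsidian(action, **fields):
    """Run the command without a shell; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(obsidian_command(action, **fields), check=True)
```

For example, `obsidian_command("create", name="Note", content="Hello world")` yields the same arguments the shell would pass for the first command above.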
    

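Knowledge Base Directory Structure

Default path: $KNOWLEDGE_BASE_PATH (see Environment Configuration)

Directory structure (two-level categories)

知识库/  (knowledge base root)
├── 申报方案/  (application proposals)
│   └── <industry/product>/
├── 解决方案/  (solutions)
│   └── <industry/product>/
├── 技术文档/  (technical documentation)
│   └── <industry/product>/
└── <other category>/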

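Categorization Rules

  • 申报方案/ (application proposals): application documents, bid documents, construction-plan applications, etc.
  • 解决方案/ (solutions): customer-facing solution documents
  • 技术文档/ (technical documentation): product usage notes and technical deployment documents
  • <other category>/: define as needed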
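
One simple way to map a file onto this two-level layout is to match keywords in the file name. The sketch below illustrates that idea. The keyword table, the `其他` default, and the helper names are invented for this example and do not reflect the skill's actual keyword list.

```python
from pathlib import Path

# Hypothetical keyword table; the skill's real keywords are not shown in this README.
CATEGORY_KEYWORDS = {
    "申报方案": ["申报", "投标"],
    "解决方案": ["解决方案"],
    "技术文档": ["部署", "使用"],
}


def pick_category(filename, default="其他"):
    """Return the first category whose keyword appears in the file name."""
    stem = Path(filename).stem
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in stem for word in keywords):
            return category
    return default


def target_dir(kb_root, filename, subcategory=None):
    """Build <kb_root>/<category>/[<industry or product>] for a source file."""
    path = Path(kb_root) / pick_category(filename)
    return path / subcategory if subcategory else path
```

Because the first match wins, the order of entries in the table decides ties when a file name contains keywords from more than one category.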

Usage

1. Single-file conversion

Import /path/to/document.docx into the knowledge base

2. Specify an output directory

Import the file into [directory name]

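Conversion Rules

  • File name: the original file name is kept (without its extension)
  • Images: images in Word/PPT/PDF files are extracted and uploaded to the image host (since v1.3.0, PPT images are skipped by default; see Version History)
  • Tables: tables in Excel/PDF files are rendered as Markdown tables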
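
The file-name and table rules above can be illustrated with a short sketch. It assumes table data arrives as a list of rows, which is the shape returned by openpyxl's `iter_rows(values_only=True)` or pdfplumber's `extract_table()`. The helper names are hypothetical, and the bundled script may format tables differently.

```python
from pathlib import Path


def output_name(source_path):
    """Keep the original file name, swapping the extension for .md."""
    return Path(source_path).stem + ".md"


def rows_to_markdown(rows):
    """Render a list of rows (first row is the header) as a Markdown pipe table."""
    if not rows:
        return ""
    width = max(len(r) for r in rows)

    def cell(value):
        text = "" if value is None else str(value)
        # Escape pipes and flatten newlines so each row stays on one line.
        return text.replace("|", "\\|").replace("\n", " ").strip()

    def line(row):
        padded = list(row) + [None] * (width - len(row))
        return "| " + " | ".join(cell(v) for v in padded) + " |"

    header, *body = rows
    separator = "| " + " | ".join("---" for _ in range(width)) + " |"
    return "\n".join([line(header), separator, *(line(r) for r in body)])
```

Empty cells are rendered as blanks rather than the string `None`, which matters because both libraries return `None` for missing values.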

Running the Script

# Enter the script directory
cd skills/knowledge-importer/scripts

# Set environment variables (or edit config.py)
export DUFS_SERVER_URL="http://192.168.1.100:5000"
export KNOWLEDGE_BASE_PATH="/path/to/Obsidian"

# Convert a single file
python3 import_doc.py /path/to/document.docx

# Batch conversion
python3 import_doc.py --batch /path/to/folder

# Show help
python3 import_doc.py --help
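
To preview which files a batch run over a folder would likely touch, a small helper can filter by the extensions listed under Supported Formats. This is only a sketch. `find_documents` is a hypothetical name, and the README does not say whether `--batch` descends into subfolders, so recursion is left as an opt-in flag here.

```python
from pathlib import Path

# Extensions taken from the Supported Formats table.
SUPPORTED = {".docx", ".xlsx", ".xls", ".pptx", ".pdf", ".md"}


def find_documents(folder, recursive=False):
    """Return supported files in `folder`, sorted by path; extension match ignores case."""
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.iterdir()
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in SUPPORTED)


if __name__ == "__main__":
    for path in find_documents("."):
        print(path)
```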

Installing Dependencies

pip3 install python-docx python-pptx openpyxl pdfplumber

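Recommended Image Hosts

  • Dufs: lightweight file server with an upload API
    • Docker deployment: docker run -v /path:/data -p 5000:5000 sigoden/dufs
  • PicList: supports multiple image hosts
  • 兰空图床 (Lsky Pro): self-hosted image-host solution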
Usage Guidance
This skill appears to do what it says: convert docs to Markdown and upload any extracted images to the image host you configure. Before using it:
  1. Only set DUFS_SERVER_URL to a server you trust (images will be PUT there); do not point it at unknown or public endpoints if images may contain sensitive info.
  2. Set KNOWLEDGE_BASE_PATH carefully to avoid overwriting important files.
  3. Ensure dependencies (python-docx, python-pptx, openpyxl, pdfplumber) and the Obsidian CLI/xvfb-run are installed if you intend to use CLI features.
  4. Review scripts/config.py (or config.py.example) to confirm upload path, retry behavior, and timeout meet your safety needs.
If you want greater assurance, inspect the full import_doc.py (already bundled) and do a test run on non-sensitive documents first.
Capability Analysis
Type: OpenClaw Skill
Name: knowledge-importer
Version: 1.3.0
The skill bundle provides a legitimate utility for converting various document formats (Word, Excel, PPT, PDF) into Markdown for use in Obsidian knowledge bases. The core logic in `scripts/import_doc.py` uses standard document processing libraries (python-docx, pdfplumber, etc.) and includes a feature to upload extracted images to a user-defined file server (Dufs) via HTTP PUT requests. No evidence of data exfiltration, malicious command execution, or prompt injection was found; the script's behavior aligns strictly with its documented purpose.
Capability Assessment
Purpose & Capability
Name/description (document -> Markdown, save to Obsidian, upload images) align with the included script: import_doc.py parses docx/pptx/pdf/xlsx/md, extracts images and uploads them to a configurable image host, and writes Markdown files into a user-specified knowledge base path.
Instruction Scope
SKILL.md instructs running the local import script, configuring DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH, and using the Obsidian CLI (via xvfb-run). The instructions cause network PUT uploads to the configured image host and write files into the provided vault path — this is expected but means the image host you specify will receive uploaded images, so only set it to a trusted endpoint and avoid importing sensitive images.
Install Mechanism
No install spec (instruction-only with a bundled script). Dependencies are standard Python packages listed in README/SKILL.md (python-docx, python-pptx, openpyxl, pdfplumber). No third-party downloads or obscure URLs in the bundle.
Credentials
The skill requires configuring DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH (documented in SKILL.md/README) but the registry metadata listed no required env vars — minor metadata omission. No unexpected credentials are requested.
Persistence & Privilege
always:false and user-invocable:true. The skill does not request permanent platform presence or modify other skills; it writes files only to the user-specified knowledge base path and temporary asset dirs.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install knowledge-importer
  3. After installation, invoke the skill by name or use /knowledge-importer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.0
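feat: PPT import no longer extracts images by default, saving time and image-host space (can be enabled with the extract_images=True parameter)
v1.2.0
feat: add automatic categorization; the --category parameter specifies the category directory, and categories are otherwise matched automatically from keywords in the file name (申报方案/解决方案/技术文档/世校赛/职业教育, etc.)
v1.1.0
fix: extraction of embedded PDF images now uses PyMuPDF instead of pdf2image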
Metadata
Slug knowledge-importer
Version 1.3.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 3
Frequently Asked Questions

What is Knowledge Importer?

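Knowledge Importer converts documents in Word, Excel, PPT, PDF, MD, and other formats to Markdown and saves them to an Obsidian knowledge base. Images can be uploaded to an image host, producing external URL links. Use it when you need to: 1) import documents into the knowledge base, 2) convert files to MD format, or 3) extract document content while preserving images. It is an AI Agent Skill for Claude Code / OpenClaw, with 119 downloads so far.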

How do I install Knowledge Importer?

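Run "/install knowledge-importer" in the OpenClaw or Claude Code chat to install it in one step. Before the first import, set DUFS_SERVER_URL and KNOWLEDGE_BASE_PATH and install the Python dependencies listed in the README.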

Is Knowledge Importer free?

Yes, Knowledge Importer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Knowledge Importer support?

Knowledge Importer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Knowledge Importer?

It is built and maintained by sunshinegw (@sunshinegw); the current version is v1.3.0.
