Description

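A document processing and conversion skill built on the MarkItDown tool. It batch-converts PDF, Word, PowerPoint, Excel, image, audio, and other file formats to Markdown, and suits document digitization, knowledge-base construction, and content extraction.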

README (SKILL.md)

# Document Conversion Skill (convert-markdown)

Name: 文档整理技能 (convert-markdown)
Author: byteuser1977

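## Overview

MarkItDown is a multi-purpose document conversion tool developed by Microsoft that converts a wide range of file formats to Markdown. This skill provides a complete document-processing workflow, including:

- **Multi-format support**: PDF, DOCX, PPTX, XLSX, images, audio, HTML, CSV, JSON, ZIP, EPub, YouTube URLs, and more
- **Structure preservation**: keeps headings, lists, tables, links, and other important document structure
- **Batch processing**: recursive directory traversal and batch conversion
- **OCR**: text recognition for images and scanned PDFs (depends on additional tooling such as Tesseract)
- **Audio transcription**: speech-to-text for audio files
- **Extensibility**: optional dependency groups can be installed on demand to fit different use cases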

## Quick Start

### 1. Environment setup

Make sure Python 3.10 or later is installed. A virtual environment is recommended:

```bash
# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
# Windows:
.venv\Scripts\activate
# Linux/macOS:
source .venv/bin/activate
```

### 2. Install MarkItDown

```bash
# Install with all optional features (recommended)
pip install 'markitdown[all]'

# Or install support for specific formats only
pip install 'markitdown[pdf,docx,pptx]'
```
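
Optional dependency groups (names as published by the upstream MarkItDown package):
- `[all]` - all optional dependencies
- `[pdf]` - PDF text extraction
- `[docx]` - Word documents
- `[pptx]` - PowerPoint presentations
- `[xlsx]` - Excel workbooks
- `[audio-transcription]` - audio transcription
- `[youtube-transcription]` - YouTube video transcripts

Images and HTML are handled by the base package and need no extra group. OCR is not provided by any of these groups; it relies on a separately installed engine such as Tesseract.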

### 3. Basic usage

#### NPX CLI (recommended)

This skill ships an NPX CLI tool that can be invoked directly with `npx`:

```bash
# Show help
npx convert-markdown

# Convert a single file
npx convert-markdown convert --input document.pdf --output document.md

# Convert a directory
npx convert-markdown convert --input ./docs --output ./markdown

# Batch-convert (restricted to the given formats)
npx convert-markdown batch --source ./docs --target ./markdown --include .pdf,.docx

# Overwrite existing files
npx convert-markdown convert --input document.pdf --output document.md --overwrite
```

**CLI commands:**

| Command | Description | Options |
|---------|-------------|---------|
| `convert` | Convert a file or directory | `--input`, `--output`, `--overwrite` |
| `batch` | Batch-convert a directory | `--source`, `--target`, `--include`, `--exclude` |
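
How `batch` applies `--include` and `--exclude` is not spelled out here. The following is a minimal sketch of extension-based filtering, assuming both options take comma-separated extensions such as `.pdf,.docx` and that exclusion takes precedence over inclusion. `parse_extensions` and `select_files` are hypothetical helper names, not functions from the skill's scripts.

```python
from pathlib import Path
from typing import Iterable, List, Optional, Set


def parse_extensions(spec: Optional[str]) -> Set[str]:
    """Turn ".pdf,.docx" (or "pdf, DOCX") into {".pdf", ".docx"}."""
    if not spec:
        return set()
    exts = set()
    for part in spec.split(","):
        part = part.strip().lower()
        if part:
            exts.add(part if part.startswith(".") else "." + part)
    return exts


def select_files(paths: Iterable[Path], include: Optional[str] = None,
                 exclude: Optional[str] = None) -> List[Path]:
    """Keep paths whose suffix is included (if given) and not excluded."""
    included = parse_extensions(include)
    excluded = parse_extensions(exclude)
    selected = []
    for path in paths:
        suffix = path.suffix.lower()
        if suffix in excluded:
            continue  # assumption: exclusion takes precedence
        if included and suffix not in included:
            continue
        selected.append(path)
    return selected


candidates = [Path("a.pdf"), Path("b.DOCX"), Path("c.png"), Path("notes.txt")]
print(select_files(candidates, include=".pdf,.docx"))
```

Matching is case-insensitive so that `b.DOCX` is treated like `b.docx`; whether the real CLI behaves the same way is an assumption.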

#### MarkItDown command line

Convert a single file:
```bash
markitdown document.pdf > document.md
markitdown presentation.pptx -o slides.md
```
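
Batch-process a directory. The `markitdown` command converts one input file per invocation, so batches are driven by a shell loop:
```bash
# Convert every supported file in the current directory
for f in *.pdf *.docx *.pptx; do
  [ -e "$f" ] && markitdown "$f" -o "${f%.*}.md"
done

# Recurse into subdirectories
find ./docs -type f \( -name '*.pdf' -o -name '*.docx' \) \
  -exec sh -c 'markitdown "$1" -o "${1%.*}.md"' _ {} \;

# Write output to a separate directory
mkdir -p ./output
for f in ./source/*.pdf; do
  [ -e "$f" ] && markitdown "$f" -o "./output/$(basename "${f%.*}").md"
done
```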

#### Python API

```python
from markitdown import MarkItDown

# Create a converter instance
md = MarkItDown()

# Convert a file
result = md.convert("document.pdf")
print(result.text_content)

# Save the converted Markdown
with open("output.md", "w", encoding="utf-8") as f:
    f.write(result.text_content)
```
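
## Common tasks

### Task 1: Batch-convert knowledge-base documents

Convert a large document set to Markdown so it can be fed into a search index:

```bash
# Create the output directory
mkdir -p converted_docs

# Batch-convert and save (the markitdown command itself takes one file per call,
# so the skill's batch command is used here)
npx convert-markdown batch --source ./source_documents --target ./converted_docs
```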
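
### Task 2: Process scanned PDFs

Scanned PDFs contain page images rather than a text layer, so an OCR engine (such as Tesseract) must be installed in addition to the PDF dependencies:

```bash
pip install 'markitdown[pdf]'  # PDF support
markitdown scanned_document.pdf -o text.md
```

If the output is empty, the PDF most likely has no text layer and needs to go through OCR first.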

### Task 3: Extract table data

MarkItDown preserves the original table structure:

```bash
markitdown financial_report.xlsx > report.md
# Tables in the output are rendered as Markdown tables
```
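
Once a spreadsheet has been converted, its tables can be read back out of the Markdown for downstream use. The following is a minimal sketch, assuming simple pipe tables with a header row, a `---` separator row, and no escaped `|` characters inside cells. `parse_markdown_tables` is a hypothetical helper, not part of MarkItDown.

```python
import re
from typing import Dict, List

# Matches separator rows such as "| --- | --- |" or "|:-----|-----:|".
_SEPARATOR = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?\s*$")


def _split_row(line: str) -> List[str]:
    """Split "| a | b |" into ["a", "b"], dropping one outer pipe per side."""
    stripped = line.strip()
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|"):
        stripped = stripped[:-1]
    return [cell.strip() for cell in stripped.split("|")]


def parse_markdown_tables(text: str) -> List[List[Dict[str, str]]]:
    """Return one list of row dicts per pipe table found in the text."""
    lines = text.splitlines()
    tables = []
    i = 0
    while i < len(lines) - 1:
        # A table starts with a header line followed by a separator line.
        if "|" in lines[i] and _SEPARATOR.match(lines[i + 1]):
            header = _split_row(lines[i])
            rows = []
            i += 2
            while i < len(lines) and "|" in lines[i]:
                rows.append(dict(zip(header, _split_row(lines[i]))))
                i += 1
            tables.append(rows)
        else:
            i += 1
    return tables


sample = """\
| Quarter | Revenue |
| --- | --- |
| Q1 | 100 |
| Q2 | 120 |
"""
print(parse_markdown_tables(sample)[0])
```

Cell values stay as strings; converting them to numbers is left to the caller, since the source does not say how MarkItDown formats numeric cells.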

### Task 4: Process multimedia files

Image OCR and audio transcription are supported:

```bash
# Extract text from an image
markitdown screenshot.png > extracted_text.md

# Transcribe audio to text
markitdown podcast.mp3 > transcript.md
```

### Task 5: Integrate into an automated workflow

Use it from a Python script:

```python
from pathlib import Path
from markitdown import MarkItDown

def convert_directory(input_dir, output_dir):
    """Convert every supported file under input_dir, mirroring its layout."""
    md = MarkItDown()
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    for file_path in input_path.rglob("*"):
        if file_path.is_file():
            try:
                result = md.convert(str(file_path))
                rel_path = file_path.relative_to(input_path)
                output_file = output_path / rel_path.with_suffix('.md')
                output_file.parent.mkdir(parents=True, exist_ok=True)
                output_file.write_text(result.text_content, encoding='utf-8')
                print(f"✓ {file_path} -> {output_file}")
            except Exception as e:
                # Unsupported or corrupted files are reported and skipped
                print(f"✗ {file_path}: {e}")

# Example usage
convert_directory("./raw_docs/", "./markdown_docs/")
```
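
One caveat with the `with_suffix('.md')` mapping above: `report.pdf` and `report.docx` in the same folder both map to `report.md`, so the later conversion overwrites the earlier one. The following is a minimal sketch of one way around this, which keeps the original extension in the output name. `collision_free_name` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path


def collision_free_name(rel_path: Path) -> Path:
    """Map docs/report.pdf to docs/report.pdf.md so sibling formats stay apart."""
    return rel_path.with_name(rel_path.name + ".md")


for name in ("docs/report.pdf", "docs/report.docx"):
    print(collision_free_name(Path(name)).as_posix())
```

The trade-off is a less tidy file name; a counter suffix or a per-format subfolder would work as well.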

## Advanced configuration

### Custom conversion options

```python
from markitdown import MarkItDown

# Stream-based conversion (for handling large files);
# the stream must be opened in binary mode
with open("large_file.pdf", "rb") as f:
    md = MarkItDown()
    result = md.convert_stream(f)
    print(result.text_content)
```
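
### Plugin system

MarkItDown supports custom converters. To handle a special format, subclass `DocumentConverter`:

```python
from markitdown import DocumentConverter, DocumentConverterResult, MarkItDown, StreamInfo

class CustomConverter(DocumentConverter):
    def accepts(self, file_stream, stream_info: StreamInfo, **kwargs) -> bool:
        # Claim only the (hypothetical) ".custom" extension
        return (stream_info.extension or "").lower() == ".custom"

    def convert(self, file_stream, stream_info: StreamInfo, **kwargs):
        # Implement the custom conversion logic here
        text = file_stream.read().decode("utf-8", errors="replace")
        return DocumentConverterResult(markdown=text)

# Register the converter
md = MarkItDown()
md.register_converter(CustomConverter())
```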

### MCP server integration

MarkItDown provides a Model Context Protocol (MCP) server that integrates with LLM applications such as Claude Desktop:

```bash
# Install the MCP server (published as the separate markitdown-mcp package)
pip install markitdown-mcp

# Configure Claude Desktop to use it by adding the following
# to claude_desktop_config.json:
# "mcpServers": {
#   "markitdown": {
#     "command": "markitdown-mcp",
#     "args": []
#   }
# }
```
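
Editing the configuration by hand risks clobbering servers that are already registered. The following is a minimal sketch of merging one entry into an existing `mcpServers` mapping. It works on an in-memory dictionary only, because the location of `claude_desktop_config.json` varies by platform and is not given here. `add_mcp_server` is a hypothetical helper, and the `markitdown-mcp` command name is an assumption based on the upstream package name.

```python
import json
from typing import Any, Dict, List


def add_mcp_server(config: Dict[str, Any], name: str, command: str,
                   args: List[str]) -> Dict[str, Any]:
    """Return a copy of config with one entry added under "mcpServers"."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


# Existing entries are preserved; the input dictionary is not modified.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
merged = add_mcp_server(existing, "markitdown", "markitdown-mcp", [])
print(json.dumps(merged, indent=2))
```

To apply it to a real file, load the JSON with `json.load`, pass the result through `add_mcp_server`, and write it back with `json.dump`.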
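
## Best practices

1. **Installation strategy**: use `[all]` in production to ensure format compatibility; install only the groups you need in resource-constrained environments
2. **Memory management**: for very large files, prefer `convert_stream()` or process files in smaller batches to limit memory use
3. **Error handling**: conversion can fail (corrupted files, unsupported formats), so catch exceptions and log them
4. **Consistent encoding**: always read and write Markdown files as UTF-8
5. **File organization**: mirror the input directory structure in the output directory for easier maintenance and traceability
6. **Performance**: parallelize batch conversions (multiple processes or threads) for higher throughput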
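
Practices 3 to 6 can be combined in one helper. The following is a minimal sketch that converts files on a thread pool, mirrors the input layout, writes UTF-8, and collects failures instead of aborting. The conversion itself is passed in as a callable so the sketch does not depend on any particular library. `mirrored_output` and `convert_many` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def mirrored_output(src: Path, in_root: Path, out_root: Path) -> Path:
    """Map in_root/a/b.pdf to out_root/a/b.md, keeping the folder layout."""
    return out_root / src.relative_to(in_root).with_suffix(".md")


def convert_many(files: Iterable[Path], in_root: Path, out_root: Path,
                 to_markdown: Callable[[Path], str],
                 max_workers: int = 4) -> Tuple[Dict[Path, Path], Dict[Path, str]]:
    """Convert files concurrently; return (successes, failures)."""
    done: Dict[Path, Path] = {}
    failed: Dict[Path, str] = {}

    def work(src: Path) -> Path:
        dest = mirrored_output(src, in_root, out_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(to_markdown(src), encoding="utf-8")
        return dest

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(work, src): src for src in files}
        for future in as_completed(futures):
            src = futures[future]
            try:
                done[src] = future.result()
            except Exception as exc:  # record the failure and keep going
                failed[src] = f"{type(exc).__name__}: {exc}"
    return done, failed
```

With MarkItDown, the callable could be `lambda p: MarkItDown().convert(str(p)).text_content`. Creating a converter per call is the conservative choice here, since the source does not say whether one `MarkItDown` instance can be shared safely across threads. For CPU-bound formats, a process pool may scale better than threads.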

## Troubleshooting

| Problem | Likely cause | Fix |
|---------|--------------|-----|
| `ModuleNotFoundError` | Dependencies not installed | Re-run `pip install 'markitdown[all]'` |
| OCR does not work | Tesseract is missing | Install the Tesseract OCR engine |
| Image conversion fails | PIL/Pillow is missing | `pip install pillow` |
| YouTube conversion fails | yt-dlp is not installed | `pip install yt-dlp` |
| Out of memory | File is too large | Use `convert_stream()` or process in batches |
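
Most rows in the table come down to a missing module or executable, which can be checked up front. The following is a minimal diagnostic sketch using only the standard library. `check_environment` is a hypothetical helper; the import names `PIL` (Pillow) and `yt_dlp` (yt-dlp) and the `tesseract` executable name are the usual ones for those projects and are assumed to be what the skill relies on.

```python
import importlib.util
import shutil
import sys
from typing import Dict, Iterable, Tuple


def check_environment(modules: Iterable[str] = ("markitdown", "PIL", "yt_dlp"),
                      tools: Iterable[str] = ("tesseract",),
                      min_python: Tuple[int, int] = (3, 10)) -> Dict[str, bool]:
    """Report which prerequisites from the troubleshooting table are present."""
    report = {"python>=%d.%d" % min_python: sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        report["module:" + name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        report["tool:" + tool] = shutil.which(tool) is not None
    return report


for item, ok in check_environment().items():
    print(("OK      " if ok else "MISSING ") + item)
```

Only top-level module names should be passed in, since `find_spec` imports parent packages when given a dotted name.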

## Resource directories

This skill includes the following resource directories:

- **scripts/** - executable scripts (examples and tools)
- **references/** - reference documentation and detailed API notes
- **assets/** - templates and configuration files (currently empty)

## Related links

- [MarkItDown PyPI](https://pypi.org/project/markitdown/)
- [GitHub repository](https://github.com/microsoft/markitdown)
- [MCP server documentation](https://github.com/microsoft/markitdown/tree/main/packages/markitdown-mcp)

## Changelog

- 2026-03-12 - v1.0.3:
  - Fix: corrected the path to convert_markonverter.py in the CLI script
  - Improvement: updated the version number to match package.json
  - Maintenance: removed redundant Node.js wrapper configuration

- 2026-03-09 - Initial version; skill template created on top of MarkItDown 0.1.0+

Usage Guidance

This package appears to be a legitimate converter wrapper, but exercise caution before installing or running it:

1. The registry metadata omits runtime requirements, while SKILL.md and the scripts require Python 3.10+ (and optionally Node.js/NPX and system Tesseract for OCR). Provision those runtimes yourself.
2. The skill depends on the third-party Python package markitdown. Review that package on PyPI/GitHub to confirm its provenance and recent versions before installing it with pip.
3. Because the code executes local Python code and third-party library logic, run it in an isolated environment (virtualenv, container) and avoid giving it access to sensitive directories.
4. OCR and YouTube download features may require system tools (Tesseract, ffmpeg) and additional Python extras.
5. The bundle claims "Microsoft + OpenClaw Community", but the skill homepage/source is unknown. Verify the upstream project URLs and repository before trusting it in production.

Capability Analysis

Type: OpenClaw Skill
Name: convert-markdown
Version: 1.0.3

The skill bundle is a legitimate implementation of a document conversion utility based on Microsoft's MarkItDown library. The code consists of standard Python and Node.js wrappers (e.g., `scripts/cli.py`, `bin/convert-markdown.js`) designed to facilitate batch processing and indexing of various file formats like PDF and Office documents. There is no evidence of data exfiltration, malicious execution, or prompt injection; all file system operations are aligned with the stated purpose of converting documents to Markdown.

Capability Assessment

ℹ Purpose & Capability

The skill's name and description align with the included scripts (Python wrappers around a MarkItDown converter). However, the registry metadata claims no required binaries or environment variables, while SKILL.md and the scripts explicitly require Python 3.10+ (and optionally Node.js/NPX and system Tesseract for OCR). This mismatch between declared requirements and actual runtime needs is a significant inconsistency.

✓ Instruction Scope

Runtime instructions only describe installing the MarkItDown Python package and running local conversion tools; the code accesses local files/directories (as expected), writes Markdown outputs and metadata (paths, sizes, timestamps). There are no automatic network exfiltration endpoints or hidden data-sending steps in the scripts. Documentation includes examples that fetch Tesseract language data from GitHub raw URLs, but those are manual instructions, not performed automatically by the skill.

⚠ Install Mechanism

There is no install specification for the platform even though the bundle contains executable code. The manifest lists pip package dependencies (markitdown[docx,xlsx,pdf]>=0.1.5) and package.json provides a Node wrapper, but the registry 'requirements' section declared none — this inconsistency means the runtime environment may lack required interpreters/packages unless the user manually installs them. The code itself does not download arbitrary archives or use obscure URLs.

✓ Credentials

The skill requests no environment variables or credentials. The scripts operate on local files and produce local outputs; metadata collected (original_path, file sizes, timestamps, authors from document metadata) is reasonable for an indexing/convert tool. There are no unexpected secret or cloud credentials requested.

✓ Persistence & Privilege

always:false and the skill does not request persistent system-wide privileges. It does not modify other skills or global agent settings. It can start services (e.g., user can run an MCP server), but that is optional and user-initiated per the docs.

Version History

v1.0.3

v1.0.3 includes CLI bug fixes and maintenance:

- Fixed the path to convert_markonverter.py in the CLI script
- Updated the version number to match package.json
- Removed redundant Node.js wrapper configuration

v1.0.2

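- Added NPX CLI support with a bin/convert-markdown.js entry script.
- Added manifest.json and package.json configuration and the scripts/cli.py script for command-line and cross-platform use.
- Documented `npx convert-markdown` usage in SKILL.md.
- Kept compatibility with the MarkItDown tool and its extension mechanisms.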

v1.0.1

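- First release of the convert-markdown skill, which uses MarkItDown to batch-convert documents in many formats to Markdown.
- Structured conversion of PDF, Word, PowerPoint, Excel, image, audio, and other formats.
- OCR (images and scanned PDFs), audio transcription, recursive batch processing, and an extensible plugin mechanism.
- Command-line and Python API usage, suited to knowledge-base construction and automation.
- MCP server integration, which exposes document conversion to LLM applications.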

v1.0.0

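- Initial release, based on MarkItDown 0.1.0+.
- Batch conversion of PDF, Word, PPT, Excel, image, audio, and other formats to Markdown.
- Structure preservation, OCR, and audio transcription.
- Both command-line and Python API usage.
- Detailed usage guidance, common task scenarios, and troubleshooting notes.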

Metadata

Slug convert-markdown

Version 1.0.3

License MIT-0

All-time Installs 2

Active Installs 2

Total Versions 4

Frequently Asked Questions

What is 文档整理技能 (convert-markdown)?

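A document processing and conversion skill built on the MarkItDown tool. It batch-converts PDF, Word, PowerPoint, Excel, image, audio, and other file formats to Markdown, and suits document digitization, knowledge-base construction, and content extraction. It is an AI Agent Skill for Claude Code / OpenClaw, with 394 downloads so far.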

How do I install 文档整理技能 (convert-markdown)?

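Run "/install convert-markdown" in the OpenClaw or Claude Code chat to install the skill in one step. The Python 3.10+ runtime and the markitdown package are not declared in the registry metadata, so provision them separately as described under Usage Guidance.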

Is 文档整理技能 (convert-markdown) free?

Yes, 文档整理技能 (convert-markdown) is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 文档整理技能 (convert-markdown) support?

文档整理技能 (convert-markdown) is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 文档整理技能 (convert-markdown)?

It is built and maintained by byteuser1977 (@byteuser1977); the current version is v1.0.3.
