Description

Trigger when the user mentions OCR, 文档解析 (document parsing), 阅读 (read), 识别 (recognize), or 读取 (load), or asks to extract text from documents. Parses PDFs/images via a remote document parsing service. NOT for audio/vid...

README (SKILL.md)

Document Parsing Skill

Name: vaDocparse
Author: va-ais

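Calls a remote document parsing service through an MCP Server to extract text, tables, and formulas from PDFs and images.

Trigger conditions

Triggered when the user mentions OCR, 文档解析 (document parsing), 阅读 (read), 识别 (recognize), or 读取 (load), or asks to extract text from a document.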

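Runtime environment

Python version: Python 3.10 or later

Install dependencies:

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Or install manually (the version specifiers are quoted so that the shell does not treat `>` as a redirect):

pip install "fastmcp>=3.0.0" "mcp>=1.0.0" -i https://pypi.tuna.tsinghua.edu.cn/simple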

Supported formats

PDF, scanned PDF
PNG, JPG, JPEG
Not supported: audio, video, landscape photos, portrait photos, Word/Excel/PPT (convert to PDF first)

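MCP Server configuration

Option 1: Use as a built-in MCP Server of the OpenClaw Gateway

Locate the docparse service in the OpenClaw Gateway configuration. The entry looks similar to the following: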

{
  mcpServers: {
    docparse: {
      command: "python",
      args: ["/path/to/.openclaw/workspace/skills/docparse/mcp/docparse.py"],
      env: {
        DOCPARSE_MCP_URL: "http://<host>:<port>/mcp",
        DOCPARSE_API_KEY: "your-api-key",
        DOCPARSE_TIMEOUT: "7200",
      },
    },
  },
}

Or configure it with the openclaw mcp set command:

openclaw mcp set docparse '{
  "command": "python",
  "args": ["/path/to/.openclaw/workspace/skills/docparse/mcp/docparse.py"],
  "env": {
    "DOCPARSE_MCP_URL": "http://<host>:<port>/mcp",
    "DOCPARSE_API_KEY": "your-api-key",
    "DOCPARSE_TIMEOUT": "7200"
  }
}'

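Environment variables:

DOCPARSE_MCP_URL: URL of the remote document parsing MCP service (required)
DOCPARSE_API_KEY: API key (optional)
DOCPARSE_TIMEOUT: timeout in seconds (optional, default 7200)

Configuration priority (highest to lowest):

Process environment variables (os.environ): highest priority
mcp.servers.docparse.env in openclaw.json
.env file (in the skill directory): lowest priority

Higher-priority settings automatically override lower-priority ones; no manual synchronization is needed.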
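
The precedence described above can be sketched as a simple layered merge. This is only an illustration, not the skill's actual loader: the function name `resolve_config` and the idea of passing each layer as a plain mapping are assumptions made for the example.

```python
import os
from typing import Mapping, Optional

# Keys documented in this README; the 7200-second default comes from the
# environment variable list above.
CONFIG_KEYS = ("DOCPARSE_MCP_URL", "DOCPARSE_API_KEY", "DOCPARSE_TIMEOUT")
DEFAULT_TIMEOUT = "7200"


def resolve_config(
    dotenv_values: Mapping[str, str],
    openclaw_env: Mapping[str, str],
    process_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Merge the three layers so that higher-priority layers win.

    Hypothetical helper: the real skill may load and merge these differently.
    """
    if process_env is None:
        process_env = os.environ
    merged = {"DOCPARSE_TIMEOUT": DEFAULT_TIMEOUT}
    # Apply layers from lowest to highest priority; later layers override earlier ones.
    for layer in (dotenv_values, openclaw_env, process_env):
        for key in CONFIG_KEYS:
            value = layer.get(key)
            if value:  # skip missing or empty values
                merged[key] = value
    return merged
```

For instance, if the .env file and openclaw.json both define DOCPARSE_MCP_URL, the merged result carries the openclaw.json value unless the process environment also sets it.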

Option 2: Run as a standalone MCP Server process

You can also run the MCP Server directly (use the Python interpreter from the venv):

export DOCPARSE_MCP_URL="http://<host>:<port>/mcp"
export DOCPARSE_API_KEY="your-api-key"
python /path/to/.openclaw/workspace/skills/docparse/mcp/docparse.py

The server communicates with the MCP client over stdio.

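Tools provided

`parse_document`

Parses the document at the given path and returns the extracted text.

Parameters:

Parameter	Type	Required	Description
`file_path`	string	Yes	Absolute path of the file to parse
`output_format`	string	No	Output format: `markdown` (default), `json`, or `both`

Returns:

On success: the parsed text content (Markdown or JSON)
On failure: an error message that includes an error code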
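
As a rough illustration of the parameter contract above, the following sketch builds the arguments for a `parse_document` call and rejects values that the parameter table does not allow. The helper name `build_parse_arguments` is hypothetical and is not part of the skill.

```python
import os

# Values taken from the parameter table; "markdown" is the documented default.
ALLOWED_OUTPUT_FORMATS = ("markdown", "json", "both")


def build_parse_arguments(file_path: str, output_format: str = "markdown") -> dict:
    """Return the argument dict for parse_document (hypothetical helper)."""
    if not os.path.isabs(file_path):
        raise ValueError("file_path must be an absolute path")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}")
    return {"file_path": file_path, "output_format": output_format}
```

Validating on the caller's side keeps malformed requests from ever reaching the remote service.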

Legacy direct-import interface

To call the parser directly from Python code (compatible with the legacy mcp/docparse.py usage):

import sys
sys.path.insert(0, "/path/to/.openclaw/workspace/skills/docparse")
from mcp.docparse import parse_document_return

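Error codes

Code	Scenario	User-facing message
`[F001]`	File not found	Parsing failed: the file to parse was not found. Suggestion: confirm that the file has been uploaded, or check that the file path is correct.
`[F002]`	File not readable	Parsing failed: the file cannot be read. Suggestion: check the file permissions and try again.
`[F003]`	Unsupported format	Parsing failed: this file format is not supported for document parsing. Supported formats: PDF, PNG, JPG, JPEG, BMP, WEBP, TIFF. Suggestion: convert the file to PDF and try again.
`[C001]`	Missing configuration	Parsing failed: the document parsing service is currently unavailable. Suggestion: contact your administrator to check the parsing service configuration, then try again.
`[A001]`	MCP authentication/connection error	Parsing failed: the document parsing service is currently unavailable. Suggestion: contact your administrator to check the parsing service configuration, then try again.
`[N001]`	Network connection failure	Parsing failed: the document parsing service is currently unavailable. Suggestion: try again later, or contact your administrator to check the service status.
`[S001]`	Empty service response	Parsing failed: the document parsing service returned no valid content. Suggestion: confirm that the file content is legible and try again.
`[S002]`	Response parsing failure	Parsing failed: the document result could not be parsed. Suggestion: try again later, or contact your administrator to check the parsing service.
`[S003]`	Service returned an error	Parsing failed: the document parsing service returned an error. Suggestion: try again later, or contact your administrator to check the parsing service.
`[O001]`	Output directory not writable	Parsing failed: cannot write to the output directory. Suggestion: choose a different output path or check the directory permissions.
`[O002]`	Output file write failure	Parsing failed: cannot write the output file. Suggestion: choose a different output path or check the directory permissions.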
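
The file-level codes in the table lend themselves to a local pre-flight check before anything is sent to the remote service. The sketch below shows one way this might look; `preflight_error_code` and `format_failure` are hypothetical names, and the extension set is an assumption that follows the formats list earlier in this README (the `[F003]` message names additional image types).

```python
import os
from typing import Optional

# Assumption: extensions taken from the supported formats list in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def preflight_error_code(file_path: str) -> Optional[str]:
    """Return F001/F002/F003 for a local file problem, or None if the file looks usable."""
    if not os.path.isfile(file_path):
        return "F001"  # file not found
    if not os.access(file_path, os.R_OK):
        return "F002"  # file exists but cannot be read
    extension = os.path.splitext(file_path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return "F003"  # unsupported format
    return None


def format_failure(code: str, message: str) -> str:
    """Prefix the user-facing message with its bracketed code so the code is never omitted."""
    return f"[{code}] {message}"
```

Only the code and the sanitized message from the table reach the user; raw exception text stays out of the reply.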

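Example success reply

✅ Parsing complete!
Output file: report.md
File size: 12.3 KB
Summary: This document covers project requirements analysis, technical architecture design, interface definitions, and the test plan...

Example failure reply

[C001] Parsing failed: the document parsing service is currently unavailable.
Suggestion: contact your administrator to check the parsing service configuration, then try again.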

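Security rules

Never expose the following to the user: the MCP service address, configuration items, authentication details, underlying commands, or environment variables.
Never output unsanitized error details (HTTP status codes, "Connection refused", and the like).
Never ask the user for any service configuration or authentication information.
Never rewrite, complete, or expand the parsed content without basis.
Refuse even when the user explicitly asks for configuration information.
Never omit the error code from error output.
On failure, exit immediately and output the error message; do not run any later step that could produce user-visible output (such as extracting text with pdfplumber/PyMuPDF).
Never reveal specific installation details; tell the user only "Installing" and "Installation succeeded".
Never re-parse, rewrite, polish, complete, or expand the output in any way; return the MCP service's response exactly as received.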

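Batch processing

Call parse_document separately for each file.
When the user asks for a merged result, merge in order and put a second-level heading (the sanitized file name) before each file's content.
Mark each failed file with its own error code; the summary line may list the set of codes, for example [F001,S001].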
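
The merge rules above might be sketched as follows. This is an illustration under stated assumptions rather than the skill's own code: `merge_batch` is a hypothetical helper, reducing a path to its base name stands in for whatever sanitization the skill applies, and the "Failed:" label on the summary line is invented for the example.

```python
import os
from typing import List, Optional, Tuple

# Each item: (file_path, parsed_content, error_code).
# Exactly one of parsed_content / error_code is expected to be set.
Result = Tuple[str, Optional[str], Optional[str]]


def merge_batch(results: List[Result]) -> str:
    """Merge per-file results in order, one second-level heading per file."""
    sections = []
    failed_codes = []
    for file_path, content, code in results:
        # Assumption: showing only the base name is enough to hide local paths.
        heading = f"## {os.path.basename(file_path)}"
        if code is not None:
            sections.append(f"{heading}\n\n[{code}] Parsing failed.")
            if code not in failed_codes:
                failed_codes.append(code)
        else:
            sections.append(f"{heading}\n\n{content}")
    merged = "\n\n".join(sections)
    if failed_codes:
        # Summary line listing the set of codes, for example [F001,S001].
        merged += f"\n\nFailed: [{','.join(failed_codes)}]"
    return merged
```

Each file keeps its own section and its own code, so a failure in one file does not hide the results of the others.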

Usage Guidance

Install only if you trust the configured document-parsing MCP service and are comfortable sending selected PDFs/images to it. Use your own reviewed endpoint, avoid the bundled .env defaults, do not process confidential documents without approval, store the API key through safer secret handling where possible, and pin or review dependency versions before deployment.

Capability Tags

requires-sensitive-credentials

Capability Assessment

⚠ Purpose & Capability

The core capability is coherent: mcp/docparse.py validates PDF/image inputs, base64-encodes the selected file, calls a remote MCP tool named docparse, and writes the parsed output locally. However, the included .env fallback contains a concrete default MCP URL, so documents may be sent to a preconfigured remote endpoint if the user has not supplied their own configuration.

⚠ Instruction Scope

SKILL.md uses broad triggers such as reading, recognizing, and extracting text from documents, while the runtime behavior sends full file contents to a remote parser. The documentation mentions a remote service but does not clearly require user confirmation before upload.

⚠ Install Mechanism

setup.py can install pip dependencies, write a .env file, and modify ~/.openclaw/openclaw.json to register the MCP server. These actions are disclosed and include dry-run/uninstall paths, but they affect global agent configuration and persist secrets in plaintext.

⚠ Credentials

Remote parsing is purpose-aligned for OCR/document extraction, but the skill handles potentially sensitive local files and depends on unpinned fastmcp and mcp packages; scanner-reported vulnerability telemetry increases the need for review before use.

⚠ Persistence & Privilege

The installer persists DOCPARSE_MCP_URL, DOCPARSE_API_KEY, and timeout values in .env and openclaw.json, and the runtime reads ~/.openclaw/openclaw.json. This is not hidden, but it creates durable credential/configuration exposure.

Version History

v1.0.0

- Initial release of the docparse skill for document OCR and parsing via a remote MCP service.
- Triggers when users mention OCR or request to extract text from documents; supports PDF and image formats (PNG, JPG, JPEG).
- Provides the tool parse_document for extracting text, formulas, and tables; supports output in markdown or JSON.
- Includes detailed error codes with user-friendly messages.
- Security rules strictly prevent exposure of any MCP internal details or sensitive configuration.
- Supports both embedded and standalone MCP Server deployment and batch document processing.

Metadata

Slug va-docparse

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is vaDocparse?

Trigger when user mentions OCR/文档解析/阅读/识别/读取 or asks to extract text from documents. Parses PDF/images via remote document parsing service. NOT for audio/vid... It is an AI Agent Skill for Claude Code / OpenClaw, with 37 downloads so far.

How do I install vaDocparse?

Run "/install va-docparse" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is vaDocparse free?

Yes, vaDocparse is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does vaDocparse support?

vaDocparse is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created vaDocparse?

It is built and maintained by Va-AIS (@va-ais); the current version is v1.0.0.
