
PDF to DOCX

Author: mzlzyCA · v0.4.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 201
Favorites: 0
Current installs: 2
Versions: 6
Install in OpenClaw
/install pdf-to-docx
Description
Convert PDF documents to Word (.docx) format using MinerU. Transforms PDF files into editable Word documents preserving layout, text, tables, and formatting....
Usage (SKILL.md)

PDF to DOCX

Convert PDF files to editable Word (.docx) format using MinerU.

⚠️ Token required. flash-extract does not support DOCX output. You must configure a token via mineru-open-api auth before using this skill.

⚠️ Output to file required. DOCX is a binary format and cannot be streamed to stdout, so you must always specify -o <directory>.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Authentication

Token required — create one at https://mineru.net/apiManage/token:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
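
Before running a conversion, it can help to confirm that both prerequisites are in place. The following is a minimal sketch, not part of the skill: check_mineru_ready is a hypothetical helper name, and it only covers the MINERU_TOKEN environment-variable route (a token stored by the interactive mineru-open-api auth flow is not detected here). The optional first argument exists only so the check can be exercised against a different command name.

```shell
# check_mineru_ready [CLI_NAME]
# Hypothetical helper: returns 0 if the CLI is on PATH and MINERU_TOKEN is set,
# 1 if the CLI is missing, 2 if the token variable is empty or unset.
check_mineru_ready() {
  cli="${1:-mineru-open-api}"
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "$cli not found on PATH" >&2
    return 1
  fi
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; export it or run: $cli auth" >&2
    return 2
  fi
  return 0
}

# Example:
# check_mineru_ready && mineru-open-api extract report.pdf -f docx -o ./out/
```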

Quick Start

# Convert PDF to DOCX (token required, -o is mandatory)
mineru-open-api extract report.pdf -f docx -o ./out/

# From URL
mineru-open-api extract https://example.com/report.pdf -f docx -o ./out/

# With language hint
mineru-open-api extract report.pdf -f docx --language en -o ./out/

# With VLM model for better layout accuracy (complex PDFs)
mineru-open-api extract report.pdf -f docx --model vlm -o ./out/

# Batch convert multiple PDFs
mineru-open-api extract *.pdf -f docx -o ./out/
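
The batch form above hands every file to a single invocation. If you would rather see which individual file failed, a per-file loop is one option. This is a sketch under assumptions: convert_all and the MINERU_CLI override variable are hypothetical names introduced here (the override exists only so the loop can be dry-run with echo), and the helper handles local files only. It uses just the extract, -f docx, and -o arguments shown in this section.

```shell
# convert_all OUT_DIR FILE...
# Hypothetical wrapper: converts each local PDF separately and reports failures.
# MINERU_CLI is an assumed override for dry runs, e.g. MINERU_CLI=echo.
convert_all() {
  out_dir="$1"
  shift
  failed=0
  for pdf in "$@"; do
    # Skip anything that is not a regular file (e.g. an unmatched *.pdf glob).
    if [ ! -f "$pdf" ]; then
      echo "skip (not a file): $pdf" >&2
      continue
    fi
    if ! ${MINERU_CLI:-mineru-open-api} extract "$pdf" -f docx -o "$out_dir"; then
      echo "failed: $pdf" >&2
      failed=$((failed + 1))
    fi
  done
  # Exit status 0 only if every attempted conversion succeeded.
  [ "$failed" -eq 0 ]
}

# Example:
# convert_all ./out/ *.pdf
```

Running it once with MINERU_CLI=echo prints the commands that would be executed without contacting the service.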

Capabilities

  • Supported input: .pdf (local file or URL)
  • Output format: Word (.docx) via -f docx
  • Token required (mineru-open-api auth or MINERU_TOKEN env)
  • -o <dir> is mandatory: DOCX cannot stream to stdout
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (e.g. 1-10)
  • Batch mode supported: extract *.pdf -f docx -o ./out/

Notes

  • flash-extract does NOT support DOCX output — always use extract with token
  • DOCX output cannot be streamed to stdout; -o flag is required
  • Use --model vlm for PDFs with complex layouts, tables, or mixed content
  • Use --model pipeline if you need guaranteed fidelity with no hallucination risk
  • Output directory will be created if it does not exist
  • All progress/status messages go to stderr
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
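
Because progress and status messages go to stderr, they can be captured separately from anything a command prints on stdout. A minimal sketch, where run_logged is a hypothetical helper and the log file name is arbitrary:

```shell
# run_logged LOG_FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND, appending its stderr to LOG_FILE.
# stdout and the command's exit status pass through unchanged.
run_logged() {
  log_file="$1"
  shift
  "$@" 2>>"$log_file"
}

# Example (flags as documented in this section):
# run_logged convert.log mineru-open-api extract report.pdf -f docx --model pipeline -o ./out/
```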
Security Recommendations
This skill appears coherent, but consider these practical precautions before installing:
  1. MINERU_TOKEN grants MinerU access to perform conversions. Do not supply it if you don't trust MinerU or the token's scope.
  2. Converted PDFs are uploaded to the service (implicit in using an external API). Avoid sending sensitive or confidential documents unless you have reviewed MinerU's privacy/security policy.
  3. Prefer installing from the official GitHub repo or a vetted npm package; inspect the mineru-open-api package source if you can.
  4. If you only need occasional conversions, create a token with minimal scope and revoke it when finished.
  5. The agent will read any local file you ask it to convert, so avoid giving it broad instructions that could cause it to scan filesystem locations you didn't intend to share.
Analysis
Type: OpenClaw Skill
Name: pdf-to-docx
Version: 0.4.0
The skill bundle provides a legitimate interface for the MinerU document intelligence engine (by Shanghai AI Lab) to convert PDF files to DOCX format. It correctly identifies the need for an API token (MINERU_TOKEN) and utilizes the official 'mineru-open-api' CLI tool via npm or Go, with no evidence of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The skill name and description match the declared dependencies: it requires the mineru-open-api CLI and a MINERU_TOKEN, both of which are directly used by the SKILL.md commands.
Instruction Scope
SKILL.md only instructs the agent to run mineru-open-api commands, authenticate with MINERU_TOKEN, and read local PDF files or URLs. There are no instructions to access unrelated files, other env vars, or external endpoints beyond MinerU.
Install Mechanism
Install methods are standard: npm install -g mineru-open-api or go install from the GitHub repo. These are expected for a CLI tool. (As always, installing third-party packages has inherent supply-chain risk; see user guidance.)
Credentials
Only one credential (MINERU_TOKEN) is required and it's used for authenticating to the MinerU service — proportional to the described functionality.
Persistence & Privilege
The skill is not always-enabled and does not request system-wide configuration changes or access to other skills' credentials. It behaves like a normal user-invokable CLI integration.
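How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install pdf-to-docx
  3. Once installed, invoke the skill by name or trigger it with /pdf-to-docx.
  4. Provide the required inputs as described in the skill's parameters to get structured output.
Version History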
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.1
SEO optimization v0.2.1
v0.2.0
SEO optimization v0.2.0
v1.0.1
Minor update
v1.0.0
Initial release
Metadata
Slug: pdf-to-docx
Version: 0.4.0
License: MIT-0
Total installs: 2
Current installs: 2
Historical versions: 6
FAQ

What is PDF to DOCX?

Convert PDF documents to Word (.docx) format using MinerU. Transforms PDF files into editable Word documents preserving layout, text, tables, and formatting.... It is an AI agent skill plugin for Claude Code / OpenClaw, with 201 total downloads to date.

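How do I install PDF to DOCX?

Run the command /install pdf-to-docx in the OpenClaw or Claude Code chat box. The skill itself installs in one step with no extra configuration; running conversions still requires the mineru-open-api CLI and a MinerU token, as described in the usage section.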

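Is PDF to DOCX free?

Yes. PDF to DOCX is free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does PDF to DOCX support?

PDF to DOCX is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed PDF to DOCX?

It is developed and maintained by mzlzyCA (@mzlzyca). The current version is v0.4.0.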
