
PDF to DOCX

by mzlzyCA · GitHub ↗ · v0.4.0 · MIT-0
cross-platform ✓ Security Clean
201 Downloads · 0 Stars · 2 Active Installs · 6 Versions
Install in OpenClaw
/install pdf-to-docx
Description
Convert PDF documents to Word (.docx) format using MinerU. Transforms PDF files into editable Word documents preserving layout, text, tables, and formatting....
README (SKILL.md)

PDF to DOCX

Convert PDF files to editable Word (.docx) format using MinerU.

⚠️ Token required. flash-extract does not support DOCX output. You must configure a token via mineru-open-api auth before using this skill.

⚠️ Output to file required. DOCX is a binary format and cannot be streamed to stdout, so you must always specify -o <directory>.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Authentication

Token required — create one at https://mineru.net/apiManage/token:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
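
If you rely on the environment variable, a script can fail fast before any conversion is attempted. The following is a minimal sketch, not part of the CLI: the function name require_mineru_token is hypothetical, and the check only looks at MINERU_TOKEN, so it would not detect a token saved through the interactive mineru-open-api auth setup.

```shell
# Hypothetical helper (not part of mineru-open-api): stop early when
# MINERU_TOKEN is missing from the environment.
# Assumption: the token is supplied via the environment variable; a token
# stored by `mineru-open-api auth` is not visible to this check.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export it" >&2
    return 1
  fi
}
```

Call require_mineru_token at the top of a conversion script and exit if it returns a non-zero status.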

Quick Start

# Convert PDF to DOCX (token required, -o is mandatory)
mineru-open-api extract report.pdf -f docx -o ./out/

# From URL
mineru-open-api extract https://example.com/report.pdf -f docx -o ./out/

# With language hint
mineru-open-api extract report.pdf -f docx --language en -o ./out/

# With VLM model for better layout accuracy (complex PDFs)
mineru-open-api extract report.pdf -f docx --model vlm -o ./out/

# Batch convert multiple PDFs
mineru-open-api extract *.pdf -f docx -o ./out/
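
The batch form above hands every matching file to a single invocation. If you would rather get per-file error reporting, one option is to loop over the files yourself. This is a sketch under stated assumptions: convert_each and the MINERU_BIN override are hypothetical names introduced here for illustration, and only the extract, -f docx, and -o options shown above are used.

```shell
# Hypothetical wrapper: convert each PDF in its own invocation so one bad
# file does not stop the rest.
# MINERU_BIN is an assumed override for the CLI name (handy for dry runs);
# it is not an option of mineru-open-api itself.
convert_each() {
  out_dir=$1
  shift
  failed=0
  for pdf in "$@"; do
    # An unmatched glob such as *.pdf arrives as a literal string; skip it.
    if [ ! -f "$pdf" ]; then
      echo "skip (not a file): $pdf" >&2
      failed=$((failed + 1))
      continue
    fi
    if ! "${MINERU_BIN:-mineru-open-api}" extract "$pdf" -f docx -o "$out_dir"; then
      echo "failed: $pdf" >&2
      failed=$((failed + 1))
    fi
  done
  # Exit status is zero only when every file converted.
  [ "$failed" -eq 0 ]
}
```

Usage would look like convert_each ./out/ *.pdf, with the return status telling a calling script whether any file was skipped or failed.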

Capabilities

  • Supported input: .pdf (local file or URL)
  • Output format: Word (.docx) via -f docx
  • Token required (mineru-open-api auth or MINERU_TOKEN env)
  • -o <dir> is mandatory: DOCX cannot stream to stdout
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (e.g. 1-10)
  • Batch mode supported: extract *.pdf -f docx -o ./out/

Notes

  • flash-extract does NOT support DOCX output — always use extract with token
  • DOCX output cannot be streamed to stdout; -o flag is required
  • Use --model vlm for PDFs with complex layouts, tables, or mixed content
  • Use --model pipeline if you need guaranteed fidelity with no hallucination risk
  • Output directory will be created if it does not exist
  • All progress/status messages go to stderr
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
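
Because progress and status messages go to stderr, they can be redirected to a log file and shown only when a conversion fails. The sketch below assumes nothing about the message format; convert_logged and the MINERU_BIN override are hypothetical names used for illustration.

```shell
# Hypothetical wrapper: keep stderr (progress/status) in a log file and
# print its last lines only when the conversion fails.
# MINERU_BIN is an assumed override for the CLI name, not a CLI option.
convert_logged() {
  pdf=$1
  out_dir=$2
  log=$3
  if "${MINERU_BIN:-mineru-open-api}" extract "$pdf" -f docx -o "$out_dir" 2>"$log"; then
    return 0
  fi
  echo "conversion failed; last log lines:" >&2
  tail -n 5 "$log" >&2
  return 1
}
```

For example, convert_logged report.pdf ./out/ convert.log keeps the terminal quiet on success and leaves convert.log behind for inspection.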
Usage Guidance
This skill appears coherent, but consider these practical precautions before installing:
  1. MINERU_TOKEN grants MinerU access to perform conversions; do not supply it if you don't trust MinerU or the token's scope.
  2. PDFs are uploaded to the service for conversion (implicit in using an external API); avoid sending sensitive or confidential documents unless you have reviewed MinerU's privacy and security policy.
  3. Prefer installing from the official GitHub repo or a vetted npm package; inspect the mineru-open-api package source if you can.
  4. If you only need occasional conversions, create a token with minimal scope and revoke it when finished.
  5. The agent will read any local file you ask it to convert, so avoid giving it broad instructions that could cause it to scan filesystem locations you didn't intend to share.
Capability Analysis
Type: OpenClaw Skill
Name: pdf-to-docx
Version: 0.4.0
The skill bundle provides a legitimate interface for the MinerU document intelligence engine (by Shanghai AI Lab) to convert PDF files to DOCX format. It correctly identifies the need for an API token (MINERU_TOKEN) and utilizes the official 'mineru-open-api' CLI tool via npm or Go, with no evidence of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The skill name and description match the declared dependencies: it requires the mineru-open-api CLI and a MINERU_TOKEN, both of which are used directly by the SKILL.md commands.
Instruction Scope
SKILL.md only instructs the agent to run mineru-open-api commands, authenticate with MINERU_TOKEN, and read local PDF files or URLs. There are no instructions to access unrelated files, other env vars, or external endpoints beyond MinerU.
Install Mechanism
Install methods are standard: npm install -g mineru-open-api, or go install from the GitHub repo. These are expected for a CLI tool. (As always, installing third-party packages carries inherent supply-chain risk; see the usage guidance.)
Credentials
Only one credential (MINERU_TOKEN) is required and it's used for authenticating to the MinerU service — proportional to the described functionality.
Persistence & Privilege
The skill is not always-enabled and does not request system-wide configuration changes or access to other skills' credentials. It behaves like a normal user-invokable CLI integration.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-to-docx
  3. After installation, invoke the skill by name or use /pdf-to-docx
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.1
SEO optimization v0.2.1
v0.2.0
SEO optimization v0.2.0
v1.0.1
Minor update
v1.0.0
Initial release
Metadata
Slug pdf-to-docx
Version 0.4.0
License MIT-0
All-time Installs 2
Active Installs 2
Total Versions 6
Frequently Asked Questions

What is PDF to DOCX?

PDF to DOCX is an AI agent skill for Claude Code / OpenClaw, with 201 downloads so far. Its description reads: "Convert PDF documents to Word (.docx) format using MinerU. Transforms PDF files into editable Word documents preserving layout, text, tables, and formatting...."

How do I install PDF to DOCX?

Run "/install pdf-to-docx" in the OpenClaw or Claude Code chat to install the skill in one step. Before first use, install the mineru-open-api CLI and configure a MinerU token.

Is PDF to DOCX free?

Yes. The PDF to DOCX skill is licensed under MIT-0, so you can download, install, and use it at no cost.

Which platforms does PDF to DOCX support?

PDF to DOCX is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF to DOCX?

It is built and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.
