PDF to HTML

by mzlzyCA · GitHub ↗ · v0.4.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 182 · Stars: 0 · Active Installs: 1 · Versions: 6
Install in OpenClaw
/install pdf-to-html
Description
Convert PDF documents to HTML using MinerU. Transforms PDF files into web-ready HTML with structure and formatting preserved. Features: PDF to HTML conversio...
README (SKILL.md)

PDF to HTML

Convert PDF files to HTML using MinerU.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Convert PDF to HTML (requires token)
mineru-open-api extract report.pdf -f html -o ./out/

# From URL
mineru-open-api extract https://example.com/report.pdf -f html -o ./out/

# With language hint
mineru-open-api extract report.pdf -f html --language en -o ./out/
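
The single-file commands above can be wrapped in a loop for a folder of PDFs. The following is a minimal sketch, not part of the skill itself: it uses only the extract, -f html and -o options shown above, and writes each document to its own output directory. The function name convert_all and the MINERU_BIN variable are hypothetical; MINERU_BIN is not a documented setting and exists only so the loop can be dry-run with another command.

```shell
#!/usr/bin/env bash
# Sketch: convert every PDF in a directory to HTML, one output directory per file.
# MINERU_BIN is a hypothetical override (not a documented MinerU setting) so the
# loop can be dry-run, e.g. MINERU_BIN=echo.
convert_all() {
  local src_dir="$1" out_dir="$2"
  local bin="${MINERU_BIN:-mineru-open-api}"
  local pdf name failed=0
  for pdf in "$src_dir"/*.pdf; do
    [ -e "$pdf" ] || continue            # the glob matched nothing
    name="$(basename "$pdf" .pdf)"
    mkdir -p "$out_dir/$name"
    # Keep going after a failure, but remember that one occurred.
    "$bin" extract "$pdf" -f html -o "$out_dir/$name/" || failed=1
  done
  return "$failed"
}
```

Calling MINERU_BIN=echo convert_all ./pdfs ./out prints the commands that would run without contacting the service; dropping the override runs the real CLI.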

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
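
In scripts it can help to fail early when the token is missing, rather than partway through a conversion. This is a minimal sketch with a hypothetical helper name, require_token. It only checks the environment-variable path; a token stored by the interactive mineru-open-api auth setup would not be detected by this check.

```shell
# Sketch: stop early when MINERU_TOKEN is not exported.
# Covers only the environment-variable path; a token saved by
# `mineru-open-api auth` is not visible to this check.
require_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export it" >&2
    return 1
  fi
}
```

A script would typically call require_token || exit 1 before its first extract command.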

Capabilities

  • Supported input: .pdf (local file or URL)
  • Output format: HTML (-f html)
  • HTML output requires extract with token — not available in flash-extract
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (e.g. 1-10)
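
The options listed above can be sketched as one helper. This assumes that --pages and --language may be combined in a single extract call, which the list suggests but does not state outright. The function name extract_pages and the MINERU_BIN override are hypothetical and serve only for illustration and dry runs.

```shell
# Sketch: convert a page range with a language hint.
# Assumption: --pages and --language can be combined in one extract call.
# MINERU_BIN is a hypothetical override used only for dry runs.
extract_pages() {
  local input="$1" pages="$2"
  local lang="${3:-ch}"          # ch is the documented default language
  local out_dir="${4:-./out/}"
  "${MINERU_BIN:-mineru-open-api}" extract "$input" -f html \
    --language "$lang" --pages "$pages" -o "$out_dir"
}
```

For example, extract_pages report.pdf 1-10 en ./out/ would request pages 1-10 of an English document.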

Notes

  • HTML output (-f html) is only available via extract with token
  • Output goes to stdout by default; use -o <dir> to save to a file
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
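
Because document content goes to stdout and progress messages go to stderr, the two streams can be captured separately with ordinary shell redirection. A minimal sketch follows; the function name extract_to_file and the MINERU_BIN override are hypothetical, and no -o option is passed so that the HTML stays on stdout.

```shell
# Sketch: write the HTML from stdout to one file and progress messages
# from stderr to another. MINERU_BIN is a hypothetical override used
# only for dry runs.
extract_to_file() {
  local input="$1" html_file="$2" log_file="$3"
  "${MINERU_BIN:-mineru-open-api}" extract "$input" -f html \
    > "$html_file" 2> "$log_file"
}
```

For example, extract_to_file report.pdf report.html convert.log leaves only document content in report.html and keeps status output in convert.log.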
Usage Guidance
This skill is essentially documentation for using the MinerU CLI and appears coherent. Before installing:
  1. Verify mineru-open-api is the official MinerU package (check the npm page and the GitHub repo linked in SKILL.md).
  2. Create a dedicated MinerU token with minimal scope and do not reuse other service credentials.
  3. If you install via npm, review the package's install scripts and source code if you require tighter supply-chain control.
  4. Prefer running the CLI in a sandbox or CI runner if you are processing untrusted PDFs.
  5. Avoid embedding the MINERU_TOKEN in shared logs or public code; set it as a restricted environment variable.
Capability Analysis
Type: OpenClaw Skill
Name: pdf-to-html
Version: 0.4.0
The skill is a legitimate wrapper for the MinerU document intelligence engine, facilitating PDF-to-HTML conversion via the 'mineru-open-api' CLI. It requires a standard API token (MINERU_TOKEN) and uses official installation paths from npm and GitHub (opendatalab/MinerU-Ecosystem), with no evidence of malicious intent or data exfiltration.
Capability Assessment
Purpose & Capability
The skill is an instruction-only wrapper for the MinerU CLI. Declared requirements (mineru-open-api binary and MINERU_TOKEN) directly match the described functionality (calling mineru-open-api extract to produce HTML). There are no unrelated binaries or extra credential claims.
Instruction Scope
SKILL.md instructs the agent to run the mineru-open-api CLI (extract, auth) against local files or URLs and to use MINERU_TOKEN. It does not instruct reading other environment variables, unrelated system files, or exfiltrating data to unexpected endpoints.
Install Mechanism
Install options are npm (mineru-open-api) or go install from a GitHub repo (github.com/opendatalab/...). Both are standard, traceable mechanisms. No downloads from untrusted shorteners or personal IPs are used. (As usual with npm, postinstall scripts are possible; review package sources if you require stricter controls.)
Credentials
Only a single token (MINERU_TOKEN) is required and is justified by the CLI's auth flow. No other credentials or config paths are requested. Users should confirm the token's scope and avoid reusing high-privilege tokens.
Persistence & Privilege
The skill is not always-enabled and does not request persistent modification of other skills or system-wide settings. Autonomous invocation is allowed but this is the platform default and not a reason to flag the skill by itself.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-to-html
  3. After installation, invoke the skill by name or use /pdf-to-html
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.1
SEO optimization v0.2.1
v0.2.0
SEO optimization v0.2.0
v1.0.1
Minor update
v1.0.0
Initial release
Metadata
Slug pdf-to-html
Version 0.4.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 6
Frequently Asked Questions

What is PDF to HTML?

Convert PDF documents to HTML using MinerU. Transforms PDF files into web-ready HTML with structure and formatting preserved. Features: PDF to HTML conversio... It is an AI Agent Skill for Claude Code / OpenClaw, with 182 downloads so far.

How do I install PDF to HTML?

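Run "/install pdf-to-html" in the OpenClaw or Claude Code chat to install the skill in one step. The skill itself needs no further setup, but the conversion commands require the mineru-open-api CLI and a MinerU token.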

Is PDF to HTML free?

Yes, PDF to HTML is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PDF to HTML support?

PDF to HTML is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF to HTML?

It is built and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.
