
PDF to Markdown - Extract Text, Tables, Formulas from PDF

Name: PDF to Markdown - Extract Text, Tables, Formulas from PDF
Author: tanis90

by tanis90 · v1.0.4 · MIT-0

cross-platform ✓ Security Clean

Downloads: 365

Install in OpenClaw

/install pdftomd

Description

PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa...

README (SKILL.md)

PDF to Markdown - Extract Text, Tables, Formulas from PDF

Convert PDF files to clean Markdown using the MinerU Open API. No API key required.

Quick Start

# Convert a local PDF to Markdown
mineru-open-api flash-extract report.pdf

# Convert a PDF from URL (no download needed)
mineru-open-api flash-extract https://cdn-mineru.openxlab.org.cn/demo/example.pdf

# Save to file
mineru-open-api flash-extract report.pdf -o ./output/

# Convert specific pages
mineru-open-api flash-extract report.pdf --pages 1-10

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

Extracts text, tables, and formulas from PDF
Supports both local files and URLs directly
Page range selection with --pages
Language hint with --language (default: ch, use en for English)
No API key, no signup, no authentication
Max 10MB / 20 pages per document
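
The options listed above can be combined in a small wrapper. The following is a minimal sketch, not part of the skill: the function name flash_extract and the MINERU_BIN variable are hypothetical, and it assumes --language, --pages and -o can all follow the input argument, as the Quick Start examples suggest for --pages and -o. It defaults the language hint to en, whereas the CLI itself defaults to ch.

```shell
# Hypothetical wrapper: MINERU_BIN and flash_extract are illustrative names,
# not part of the CLI. Only flash-extract, --language, --pages and -o come
# from the skill's documentation.
MINERU_BIN=${MINERU_BIN:-mineru-open-api}

flash_extract() {
  src=$1          # local path or URL
  lang=${2:-en}   # language hint; the CLI's own default is ch
  pages=${3:-}    # optional page range such as 1-10
  out=${4:-}      # optional output directory

  # Rebuild the argument list, adding optional flags only when given.
  set -- flash-extract "$src" --language "$lang"
  if [ -n "$pages" ]; then set -- "$@" --pages "$pages"; fi
  if [ -n "$out" ]; then set -- "$@" -o "$out"; fi
  "$MINERU_BIN" "$@"
}
```

Calling flash_extract report.pdf en 1-10 ./output/ would run the CLI with an English language hint on pages 1-10 and write to ./output/. Setting MINERU_BIN=echo turns the wrapper into a dry run that only prints the arguments.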

When to Use

User asks to "read", "extract", "convert", or "parse" a PDF
User shares a PDF file or PDF link and asks for its content
User wants to summarize or analyze a PDF document
User needs PDF content in Markdown format

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Flow

flash-extract sends the document to the MinerU API (mineru.net) for processing and returns Markdown. This is a stateless API call — no account, no persistent storage. MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU

Notes

Output is Markdown only; images/tables/formulas may be replaced with placeholders
For larger files (up to 200MB/600 pages) or precision extraction with full assets, use mineru-open-api extract (requires auth via mineru-open-api auth)
If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
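
The size and page limits mentioned in this document (10MB / 20 pages for flash-extract, 200MB / 600 pages for extract) can be turned into a simple selection rule. This is a sketch under stated assumptions: pick_subcommand is a hypothetical helper, 1 MB is taken to mean 1024 * 1024 bytes (the skill does not specify), and the caller supplies the page count because the shell has no built-in way to read it from a PDF.

```shell
# pick_subcommand BYTES PAGES
# Prints the subcommand whose documented limits fit the document.
# Assumption: 1 MB = 1024 * 1024 bytes; the skill does not say which unit it uses.
pick_subcommand() {
  bytes=$1
  pages=$2
  mb=$((1024 * 1024))
  if [ "$bytes" -le $((10 * mb)) ] && [ "$pages" -le 20 ]; then
    echo flash-extract      # no auth needed
  elif [ "$bytes" -le $((200 * mb)) ] && [ "$pages" -le 600 ]; then
    echo extract            # requires auth via mineru-open-api auth
  else
    echo "document exceeds documented limits" >&2
    return 1
  fi
}
```

The byte count can be obtained with wc -c on the file; a 1 MB, 5-page document selects flash-extract, while the same document at 21 pages selects extract.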

Usage Guidance

This skill appears to do what it claims (call the mineru-open-api CLI to convert PDFs to Markdown), but it uploads the PDFs to an external MinerU API without authentication. Before installing or using it:

1) Do not send sensitive or confidential PDFs unless you trust mineru.net and understand its retention/privacy policy.
2) Verify the mineru-open-api package source (npm/uv) or the GitHub repo referenced in the SKILL.md to ensure you install the official CLI and not a malicious package.
3) If you need offline/local processing for privacy, prefer local extraction tools instead.
4) Test with non-sensitive sample documents first, and inspect the installed binary (or its source) if you require higher assurance.
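
The first point above can be backed by an explicit opt-in before any upload. The following is only a sketch of one possible guard: upload_allowed and the ALLOW_PDF_UPLOAD variable are hypothetical names, not options of mineru-open-api.

```shell
# Hypothetical opt-in gate: ALLOW_PDF_UPLOAD and upload_allowed are
# illustrative names, not options of mineru-open-api.
upload_allowed() {
  if [ "${ALLOW_PDF_UPLOAD:-0}" = "1" ]; then
    return 0
  fi
  echo "refusing to upload: set ALLOW_PDF_UPLOAD=1 after reviewing the document" >&2
  return 1
}
```

A conversion can then be chained as upload_allowed && mineru-open-api flash-extract report.pdf, so nothing leaves the machine unless the variable has been set deliberately.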

Capability Analysis

Type: OpenClaw Skill
Name: pdftomd
Version: 1.0.4

The skill is a wrapper for the MinerU Open API (mineru.net) used to convert PDF files to Markdown. It transparently discloses that documents are sent to an external API for processing and provides standard installation methods via npm, uv, or go. No malicious patterns, hidden data exfiltration, or harmful prompt injections were found in SKILL.md or the associated metadata.

Capability Assessment

✓ Purpose & Capability

The skill is a PDF→Markdown converter and declares/uses a single CLI binary (mineru-open-api) and CLI commands that match that purpose. The install options (npm/uv/go) and referenced repo align with the MinerU project named in the README.

ℹ Instruction Scope

SKILL.md's runtime instructions are narrow and restricted to invoking mineru-open-api on local files or URLs. However, the instructions explicitly send documents to an external MinerU API (mineru.net). That is coherent with the described capability but has privacy implications: any PDF you convert is uploaded to a remote service.

ℹ Install Mechanism

Installation is via standard package ecosystems (npm, uv, go install), which is reasonable for a CLI. The risk is moderate: lower than an arbitrary download, because packages come from registries and a GitHub path is provided, but you should still verify the package source, version, and code before installing.
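
As a small first step toward that verification, it helps to see which binary a name actually resolves to before running it. This sketch uses the POSIX command -v builtin; resolve_cli is a hypothetical helper name, and checking the path is no substitute for reviewing the package source.

```shell
# resolve_cli NAME: print the path that NAME resolves to on PATH, or fail.
# resolve_cli is an illustrative helper, not part of the skill.
resolve_cli() {
  command -v "$1" || {
    echo "$1 is not installed or not on PATH" >&2
    return 1
  }
}
```

Running resolve_cli mineru-open-api prints the location of the installed CLI, which can then be compared against the directory the chosen package manager is expected to install into.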

✓ Credentials

The skill requests no environment variables, credentials, or config paths. That is proportionate to the stated functionality. The lack of auth is consistent with the claim that small files require no API key, but means uploads are unauthenticated.

✓ Persistence & Privilege

The skill does not request persistent/always-on privileges, does not modify other skills, and has no special system path requirements. It installs a single CLI binary into the environment, which is expected behavior.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install pdftomd
After installation, invoke the skill by name or use /pdftomd
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.4

- Added "uv" as a new install method for the CLI.
- Updated install instructions to mention downloading from the official website if package managers are unavailable.
- Removed the dedicated "Install" section to streamline documentation.
- No changes to functionality or usage.

v1.0.3

- Added npm as a new installation option for mineru-open-api.
- Updated Homebrew install method to npm and removed platform specificity.
- Minor adjustment to install instructions for broader compatibility.
- No changes to core functionality or usage.

v1.0.2

- Added installation instructions for the Go toolchain (go install) to the metadata.
- Users can now install mineru-open-api via go install in addition to Homebrew.

v1.0.1

- Homebrew is now the primary (and only) install method listed; curl and PowerShell install instructions have been removed.
- Installation information updated: references now point to the GitHub source and clarify open-source license (Apache-2.0).
- Data privacy section replaced with a clearer Data Flow section describing how documents are processed.
- Minor wording improvements for clarity and consistency throughout the documentation.
- No changes to core functionality or usage.

v1.0.0

- Initial release of PDF to Markdown converter.
- Extracts text, tables, and formulas from PDF files to clean Markdown.
- Supports both local PDF files and direct URLs.
- No authentication or API key required; open-source CLI.
- Allows page range selection and language hints.
- Maximum file size of 10MB or 20 pages per document.

Metadata

Slug pdftomd

Version 1.0.4

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 5

Frequently Asked Questions

What is PDF to Markdown - Extract Text, Tables, Formulas from PDF?

PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa... It is an AI Agent Skill for Claude Code / OpenClaw, with 365 downloads so far.

How do I install PDF to Markdown - Extract Text, Tables, Formulas from PDF?

Run "/install pdftomd" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is PDF to Markdown - Extract Text, Tables, Formulas from PDF free?

Yes, PDF to Markdown - Extract Text, Tables, Formulas from PDF is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PDF to Markdown - Extract Text, Tables, Formulas from PDF support?

PDF to Markdown - Extract Text, Tables, Formulas from PDF is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF to Markdown - Extract Text, Tables, Formulas from PDF?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.4.
