
PDF to Markdown - Extract Text, Tables, Formulas from PDF

by tanis90 · GitHub ↗ · v1.0.4 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 365 · Stars: 0 · Active Installs: 0 · Versions: 5
Install in OpenClaw
/install pdftomd
Description
PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa...
README (SKILL.md)

PDF to Markdown - Extract Text, Tables, Formulas from PDF

Convert PDF files to clean Markdown using MinerU Open API. No API key required.

Quick Start

# Convert a local PDF to Markdown
mineru-open-api flash-extract report.pdf

# Convert a PDF from URL (no download needed)
mineru-open-api flash-extract https://cdn-mineru.openxlab.org.cn/demo/example.pdf

# Save to file
mineru-open-api flash-extract report.pdf -o ./output/

# Convert specific pages
mineru-open-api flash-extract report.pdf --pages 1-10
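
The commands above handle one file at a time. As a minimal sketch, a folder of PDFs could be converted in a loop. The helper name convert_dir and the MINERU_BIN override are hypothetical and not part of the CLI; only flash-extract and -o come from the examples above, and whether the CLI creates the output directory itself is not stated, so the sketch creates it first.

```shell
# Hypothetical helper: run flash-extract on every PDF in a directory.
# MINERU_BIN is an assumed override so the loop can be dry-run (e.g. MINERU_BIN=echo).
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

convert_dir() {
  src_dir="$1"   # directory containing the PDFs
  out_dir="$2"   # directory passed to -o
  mkdir -p "$out_dir"
  for pdf in "$src_dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip when the glob matches nothing
    "$MINERU_BIN" flash-extract "$pdf" -o "$out_dir" || echo "failed: $pdf" >&2
  done
}
```

Setting MINERU_BIN=echo before calling convert_dir ./pdfs ./output prints the commands instead of running them, which is a cheap way to check the file list before anything is uploaded.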

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

  • Extracts text, tables, and formulas from PDF
  • Supports both local files and URLs directly
  • Page range selection with --pages
  • Language hint with --language (default: ch, use en for English)
  • No API key, no signup, no authentication
  • Max 10MB / 20 pages per document

When to Use

  • User asks to "read", "extract", "convert", or "parse" a PDF
  • User shares a PDF file or PDF link and asks for its content
  • User wants to summarize or analyze a PDF document
  • User needs PDF content in Markdown format

CLI Reference

Run mineru-open-api flash-extract --help for all available options.
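
Before relying on the help output, it may be worth confirming the binary is actually on PATH. The sketch below uses the standard command -v check; the function name have_cli is hypothetical.

```shell
# Returns 0 when the named command is found on PATH.
have_cli() {
  command -v "$1" >/dev/null 2>&1
}

if have_cli mineru-open-api; then
  mineru-open-api flash-extract --help
else
  echo "mineru-open-api not found on PATH" >&2
fi
```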

Data Flow

flash-extract sends the document to the MinerU API (mineru.net) for processing and returns Markdown. This is a stateless API call — no account, no persistent storage. MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU

Notes

  • Output is Markdown only; images/tables/formulas may be replaced with placeholders
  • For larger files (up to 200MB/600 pages) or precision extraction with full assets, use mineru-open-api extract (requires auth via mineru-open-api auth)
  • If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
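
The size limits quoted in this README (10MB for flash-extract, 200MB for extract) suggest a simple pre-check before choosing a subcommand. This is a sketch under stated assumptions: pick_subcommand is a hypothetical helper, "MB" is taken as 1024 × 1024 bytes (the service may count differently), and the page limits are not checked because that would need a PDF parser.

```shell
# Assumption: 1 MB = 1024 * 1024 bytes; the real service limit may be measured differently.
FLASH_MAX_BYTES=$((10 * 1024 * 1024))
EXTRACT_MAX_BYTES=$((200 * 1024 * 1024))

# Hypothetical helper: print the subcommand whose size limit fits the file.
pick_subcommand() {
  size=$(wc -c < "$1" | tr -d ' ')
  if [ "$size" -le "$FLASH_MAX_BYTES" ]; then
    echo "flash-extract"
  elif [ "$size" -le "$EXTRACT_MAX_BYTES" ]; then
    echo "extract"          # requires auth via: mineru-open-api auth
  else
    echo "too-large"
    return 1
  fi
}
```

The helper only prints a name; a document under 10MB can still exceed the 20-page limit, so the result is a hint rather than a guarantee.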
Usage Guidance
This skill appears to do what it claims (call the mineru-open-api CLI to convert PDFs to Markdown), but it uploads the PDFs to an external MinerU API without authentication. Before installing or using it:
  1. Do not send sensitive or confidential PDFs unless you trust mineru.net and understand its retention/privacy policy.
  2. Verify the mineru-open-api package source (npm/uv) or the GitHub repo referenced in the SKILL.md to ensure you install the official CLI and not a malicious package.
  3. If you need offline/local processing for privacy, prefer local extraction tools instead.
  4. Test with non-sensitive sample documents first, and inspect the installed binary (or its source) if you require higher assurance.
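
Because every conversion uploads the file to a remote service, one way to act on this guidance is a confirmation step in front of the CLI. The wrapper below is a hypothetical sketch, not part of the skill or the CLI; it only proceeds on an explicit yes.

```shell
# Hypothetical wrapper: ask before a file leaves the machine.
confirm_upload() {
  printf 'Upload %s to the remote MinerU API? [y/N] ' "$1" >&2
  read -r answer || answer=""   # treat end-of-input as "no"
  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: confirm_upload report.pdf && mineru-open-api flash-extract report.pdf
```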
Capability Analysis
Type: OpenClaw Skill · Name: pdftomd · Version: 1.0.4
The skill is a wrapper for the MinerU Open API (mineru.net) used to convert PDF files to Markdown. It transparently discloses that documents are sent to an external API for processing and provides standard installation methods via npm, uv, or go. No malicious patterns, hidden data exfiltration, or harmful prompt injections were found in SKILL.md or the associated metadata.
Capability Assessment
Purpose & Capability
The skill is a PDF→Markdown converter and declares/uses a single CLI binary (mineru-open-api) and CLI commands that match that purpose. The install options (npm/uv/go) and referenced repo align with the MinerU project named in the README.
Instruction Scope
SKILL.md's runtime instructions are narrow and restricted to invoking mineru-open-api on local files or URLs. However, the instructions explicitly send documents to an external MinerU API (mineru.net). That is coherent with the described capability but has privacy implications: any PDF you convert is uploaded to a remote service.
Install Mechanism
Installation is via standard package ecosystems (npm, uv, go install), which is reasonable for a CLI. The risk is moderate, and lower than for an arbitrary download, because packages come from registries and a GitHub path is provided; you should still verify the package source, version, and code before installing.
Credentials
The skill requests no environment variables, credentials, or config paths. That is proportionate to the stated functionality. The lack of auth is consistent with the claim that small files require no API key, but means uploads are unauthenticated.
Persistence & Privilege
The skill does not request persistent/always-on privileges, does not modify other skills, and has no special system path requirements. It installs a single CLI binary into the environment, which is expected behavior.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdftomd
  3. After installation, invoke the skill by name or use /pdftomd
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.4
  • Added "uv" as a new install method for the CLI.
  • Updated install instructions to mention downloading from the official website if package managers are unavailable.
  • Removed the dedicated "Install" section to streamline documentation.
  • No changes to functionality or usage.
v1.0.3
  • Added npm as a new installation option for mineru-open-api.
  • Updated Homebrew install method to npm and removed platform specificity.
  • Minor adjustment to install instructions for broader compatibility.
  • No changes to core functionality or usage.
v1.0.2
  • Added installation instructions for the Go toolchain (go install) to the metadata.
  • Users can now install mineru-open-api via go install in addition to Homebrew.
v1.0.1
  • Homebrew is now the primary (and only) install method listed; curl and PowerShell install instructions have been removed.
  • Installation information updated: references now point to the GitHub source and clarify the open-source license (Apache-2.0).
  • Data privacy section replaced with a clearer Data Flow section describing how documents are processed.
  • Minor wording improvements for clarity and consistency throughout the documentation.
  • No changes to core functionality or usage.
v1.0.0
  • Initial release of PDF to Markdown converter.
  • Extracts text, tables, and formulas from PDF files to clean Markdown.
  • Supports both local PDF files and direct URLs.
  • No authentication or API key required; open-source CLI.
  • Allows page range selection and language hints.
  • Maximum file size of 10MB or 20 pages per document.
Metadata
Slug pdftomd
Version 1.0.4
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 5
Frequently Asked Questions

What is PDF to Markdown - Extract Text, Tables, Formulas from PDF?

PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa... It is an AI Agent Skill for Claude Code / OpenClaw, with 365 downloads so far.

How do I install PDF to Markdown - Extract Text, Tables, Formulas from PDF?

Run "/install pdftomd" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is PDF to Markdown - Extract Text, Tables, Formulas from PDF free?

Yes, PDF to Markdown - Extract Text, Tables, Formulas from PDF is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PDF to Markdown - Extract Text, Tables, Formulas from PDF support?

PDF to Markdown - Extract Text, Tables, Formulas from PDF is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF to Markdown - Extract Text, Tables, Formulas from PDF?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.4.
