tanis90

PDF to Markdown - Extract Text, Tables, Formulas from PDF

Author: tanis90 · GitHub · v1.0.4 · MIT-0
cross-platform ✓ Security check passed
365 total downloads
0 favorites
0 current installs
5 versions
Install in OpenClaw
/install pdftomd
Description
PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa...
Usage (SKILL.md)

PDF to Markdown - Extract Text, Tables, Formulas from PDF

Convert PDF files to clean Markdown using MinerU Open API. No API key required.

Quick Start

# Convert a local PDF to Markdown
mineru-open-api flash-extract report.pdf

# Convert a PDF from URL (no download needed)
mineru-open-api flash-extract https://cdn-mineru.openxlab.org.cn/demo/example.pdf

# Save to file
mineru-open-api flash-extract report.pdf -o ./output/

# Convert specific pages
mineru-open-api flash-extract report.pdf --pages 1-10
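
For a folder of PDFs, the commands above can be wrapped in a loop. This is a minimal sketch, not part of the skill: `convert_all` and the `MINERU_BIN` override are hypothetical names introduced here, and only the `flash-extract <file> -o <dir>` form is taken from the Quick Start. The listing does not say whether the CLI creates the output directory itself, so the sketch creates it first.

```shell
#!/bin/sh
# Convert every PDF in a directory with flash-extract (illustrative sketch).
# MINERU_BIN is a hypothetical override for this script, not a CLI option.
convert_all() {
    src_dir=$1
    out_dir=$2
    bin=${MINERU_BIN:-mineru-open-api}
    mkdir -p "$out_dir" || return 1
    failed=0
    for pdf in "$src_dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory holds no PDFs.
        [ -e "$pdf" ] || continue
        # Same form as the Quick Start: flash-extract <file> -o <dir>/
        if ! "$bin" flash-extract "$pdf" -o "${out_dir%/}/"; then
            echo "failed: $pdf" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `convert_all ./pdfs ./output` would run one `flash-extract` per file and return non-zero if any conversion failed. Each file is still subject to the per-document limits listed under Capabilities.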

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

  • Extracts text, tables, and formulas from PDF
  • Supports both local files and URLs directly
  • Page range selection with --pages
  • Language hint with --language (default: ch, use en for English)
  • No API key, no signup, no authentication
  • Max 10MB / 20 pages per document

When to Use

  • User asks to "read", "extract", "convert", or "parse" a PDF
  • User shares a PDF file or PDF link and asks for its content
  • User wants to summarize or analyze a PDF document
  • User needs PDF content in Markdown format

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Flow

flash-extract sends the document to the MinerU API (mineru.net) for processing and returns Markdown. This is a stateless API call — no account, no persistent storage. MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU

Notes

  • Output is Markdown only; images/tables/formulas may be replaced with placeholders
  • For larger files (up to 200MB/600 pages) or precision extraction with full assets, use mineru-open-api extract (requires auth via mineru-open-api auth)
  • If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
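
The size limit in the notes above can be checked before a document is sent. The sketch below is an illustration under stated assumptions: `pick_subcommand` and `FLASH_MAX_BYTES` are hypothetical names, and "10MB" is read as 10 × 1024 × 1024 bytes, which the listing does not specify. The 20-page limit is not checked, since counting pages needs a PDF parser.

```shell
#!/bin/sh
# Limit from the listing: flash-extract accepts documents up to 10MB.
# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
FLASH_MAX_BYTES=$((10 * 1024 * 1024))

# Print the subcommand suited to a file's size (hypothetical helper).
pick_subcommand() {
    file=$1
    [ -f "$file" ] || { echo "not a file: $file" >&2; return 2; }
    # wc -c pads its output on some systems; strip the spaces.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -le "$FLASH_MAX_BYTES" ]; then
        echo flash-extract
    else
        echo extract
    fi
}
```

If the helper prints `extract`, the file is too large for the unauthenticated path and needs `mineru-open-api auth` first, as described in the notes.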
Security recommendations
This skill appears to do what it claims (call the mineru-open-api CLI to convert PDFs to Markdown), but it uploads the PDFs to an external MinerU API without authentication. Before installing or using it: 1) Do not send sensitive or confidential PDFs unless you trust mineru.net and understand its retention/privacy policy. 2) Verify the mineru-open-api package source (npm/uv) or the GitHub repo referenced in the SKILL.md to ensure you install the official CLI and not a malicious package. 3) If you need offline/local processing for privacy, prefer local extraction tools instead. 4) Test with non-sensitive sample documents first, and inspect the installed binary (or its source) if you require higher assurance.
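
As a first step toward inspecting the installed binary, it helps to see which executable the shell would actually run. The sketch below uses only the POSIX `command -v` builtin; `locate_cli` is a hypothetical helper name, and the check shows where the command resolves on `PATH`, not whether the package is the official one.

```shell
#!/bin/sh
# Report where a command resolves on PATH, or fail if it is absent.
# locate_cli is an illustrative helper, not part of the skill or the CLI.
locate_cli() {
    name=${1:-mineru-open-api}
    resolved=$(command -v "$name") || {
        echo "$name not found on PATH" >&2
        return 1
    }
    echo "$resolved"
}
```

Running `locate_cli` with no argument looks up `mineru-open-api`. The printed path can then be compared against the location where npm, uv, or go would have installed it.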
Functional analysis
Type: OpenClaw Skill
Name: pdftomd
Version: 1.0.4
The skill is a wrapper for the MinerU Open API (mineru.net) used to convert PDF files to Markdown. It transparently discloses that documents are sent to an external API for processing and provides standard installation methods via npm, uv, or go. No malicious patterns, hidden data exfiltration, or harmful prompt injections were found in SKILL.md or the associated metadata.
Capability assessment
Purpose & Capability
The skill is a PDF→Markdown converter and declares/uses a single CLI binary (mineru-open-api) and CLI commands that match that purpose. The install options (npm/uv/go) and referenced repo align with the MinerU project named in the README.
Instruction Scope
SKILL.md's runtime instructions are narrow and restricted to invoking mineru-open-api on local files or URLs. However, the instructions explicitly send documents to an external MinerU API (mineru.net). That is coherent with the described capability but has privacy implications: any PDF you convert is uploaded to a remote service.
Install Mechanism
Installation is via standard package ecosystems (npm, uv, go install) which is reasonable for a CLI. This is moderate-risk compared to an arbitrary download because packages come from registries and a GitHub path is provided; you should still verify the package source, version, and code before installing.
Credentials
The skill requests no environment variables, credentials, or config paths. That is proportionate to the stated functionality. The lack of auth is consistent with the claim that small files require no API key, but means uploads are unauthenticated.
Persistence & Privilege
The skill does not request persistent/always-on privileges, does not modify other skills, and has no special system path requirements. It installs a single CLI binary into the environment, which is expected behavior.
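How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install pdftomd
  3. After installation, invoke the Skill by name or trigger it with /pdftomd
  4. Provide the inputs described in the Skill's parameter documentation to get structured output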
Version history
v1.0.4
- Added "uv" as a new install method for the CLI.
- Updated install instructions to mention downloading from the official website if package managers are unavailable.
- Removed the dedicated "Install" section to streamline documentation.
- No changes to functionality or usage.
v1.0.3
- Added npm as a new installation option for mineru-open-api.
- Updated Homebrew install method to npm and removed platform specificity.
- Minor adjustment to install instructions for broader compatibility.
- No changes to core functionality or usage.
v1.0.2
- Added installation instructions for the Go toolchain (go install) to the metadata.
- Users can now install mineru-open-api via go install in addition to Homebrew.
v1.0.1
- Homebrew is now the primary (and only) install method listed; curl and PowerShell install instructions have been removed.
- Installation information updated: references now point to the GitHub source and clarify open-source license (Apache-2.0).
- Data privacy section replaced with a clearer Data Flow section describing how documents are processed.
- Minor wording improvements for clarity and consistency throughout the documentation.
- No changes to core functionality or usage.
v1.0.0
- Initial release of PDF to Markdown converter.
- Extracts text, tables, and formulas from PDF files to clean Markdown.
- Supports both local PDF files and direct URLs.
- No authentication or API key required; open-source CLI.
- Allows page range selection and language hints.
- Maximum file size of 10MB or 20 pages per document.
Metadata
Slug pdftomd
Version 1.0.4
License MIT-0
Total installs 0
Current installs 0
Number of versions 5
FAQ

What is PDF to Markdown - Extract Text, Tables, Formulas from PDF?

PDF to Markdown converter - extract text, tables and formulas from PDF files to clean Markdown. Use when converting PDF documents, extracting PDF content, pa... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 365 total downloads to date.

How do I install PDF to Markdown - Extract Text, Tables, Formulas from PDF?

Run the command "/install pdftomd" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is PDF to Markdown - Extract Text, Tables, Formulas from PDF free?

Yes. PDF to Markdown - Extract Text, Tables, Formulas from PDF is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does PDF to Markdown - Extract Text, Tables, Formulas from PDF support?

PDF to Markdown - Extract Text, Tables, Formulas from PDF is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed PDF to Markdown - Extract Text, Tables, Formulas from PDF?

It is developed and maintained by tanis90 (@tanis90); the current version is v1.0.4.
