Downloads: 894
Stars: 0
Active Installs: 7
Versions: 1
Install in OpenClaw
/install mineru-pdf
Description
Parse PDF documents with MinerU MCP to extract text, tables, and formulas. Supports multiple backends including MLX-accelerated inference on Apple Silicon.
Usage Guidance
This skill appears to do what it says: parse PDFs via MinerU MCP or the included Python wrapper. Before installing or running it: (1) ensure you trust the uvx/mcp-mineru package source and be aware that model downloads may occur on first use; (2) run parse.py with an explicit output_dir to avoid accidental writes to sensitive locations; (3) do not run test.sh without inspecting or replacing its default PDF path (it points to an inbound media file under ~/.openclaw); and (4) if you need stronger isolation, run the tool in a sandbox or VM since it will create persistent files and may download model artifacts.
Capability Analysis
Type: OpenClaw Skill
Name: mineru-pdf
Version: 1.0.0
The skill is classified as suspicious due to a shell injection vulnerability found in the `test.sh` script. The script directly interpolates the `$PDF_FILE` variable into a Python string executed by `uvx`, which could allow an attacker to inject arbitrary commands if the `PDF_FILE` variable contains malicious characters. While `parse.py` uses `argparse` for robust input handling, the `test.sh` script demonstrates a critical RCE risk pattern. No evidence of intentional malicious behavior (e.g., data exfiltration, persistence, or prompt injection against the agent) was found in other files like `SKILL.md` or `parse.py`.
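The injection pattern described above, and one way to avoid it, can be sketched in Python. This is a minimal illustration, not the contents of `test.sh`: the helper names `unsafe_source` and `safe_command` and the inline `SNIPPET` are hypothetical, and `sys.executable` stands in for the `uvx` invocation so the sketch runs without MinerU installed. The idea is that the file path travels as a separate argv element instead of being pasted into source text.

```python
import subprocess
import sys

# Hypothetical stand-in for the Python snippet the test script runs.
SNIPPET = "import sys; print(sys.argv[1])"


def unsafe_source(pdf_file: str) -> str:
    """Anti-pattern: the path is pasted into Python source text."""
    return f"print('{pdf_file}')"


def safe_command(pdf_file: str) -> list[str]:
    """Safer: the path is a separate argv element and is never parsed as code."""
    return [sys.executable, "-c", SNIPPET, pdf_file]


if __name__ == "__main__":
    hostile = "x'); import os; os.system('echo injected'); ('"
    # The unsafe form turns the file name into executable statements.
    print(unsafe_source(hostile))
    # The safe form echoes the file name back as plain data.
    result = subprocess.run(
        safe_command(hostile), capture_output=True, text=True, check=True
    )
    print(result.stdout.strip())
```

With the safe form, a file name containing quotes or semicolons is treated as data, so the child process only ever sees it as `sys.argv[1]`.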
Capability Assessment
Purpose & Capability
Name/description match the included files and instructions: the SKILL.md and parse.py call MinerU components (via uvx/mcp-mineru or direct Python), and the declared required binary (uvx) is actually used in examples. There are no unrelated binaries or unexpected credential requests.
Instruction Scope
Instructions focus on parsing PDFs and saving outputs. parse.py reads a user-supplied PDF and writes parsed files to an output directory (persistent storage). Note: examples use absolute local paths (e.g., /Users/lwj04/...), and test.sh has a default PDF path under .openclaw/media/inbound, so running test.sh unmodified could act on that inbound file. This behavior is expected for a parsing tool, but users should be aware that it writes persistent files and that the example paths are hard-coded.
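The advice to pass an explicit output directory could be enforced with a small guard such as the one below. This is a sketch under stated assumptions: `resolve_output_dir` and `SENSITIVE_DIRS` are hypothetical names, the blocked locations are illustrative, and nothing here reflects how parse.py actually handles its arguments.

```python
from pathlib import Path

# Illustrative list only; adjust to the locations you consider sensitive.
SENSITIVE_DIRS = (Path.home() / ".ssh", Path.home() / ".openclaw")


def resolve_output_dir(raw: str) -> Path:
    """Return an absolute output directory, refusing sensitive locations."""
    if not raw:
        raise ValueError("an explicit output directory is required")
    out = Path(raw).expanduser().resolve()
    for blocked in SENSITIVE_DIRS:
        blocked = blocked.resolve()
        # Reject the blocked directory itself and anything beneath it.
        if out == blocked or blocked in out.parents:
            raise ValueError(f"refusing to write under {blocked}")
    return out
```

Resolving the path before the check means relative paths and `~` are normalized first, so a value like `~/.openclaw/media` is rejected rather than slipping past a plain string comparison.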
Install Mechanism
No registry install spec is required by the platform; SKILL.md recommends installing via uvx / mcp-mineru (a package-managed installation). There are no downloads from unknown URLs or archive extractions in the skill files themselves.
Credentials
The skill declares no environment variables or credentials and only depends on the uvx binary and the MinerU Python package. That is proportionate for a PDF-parsing wrapper which either invokes uvx/mcp-mineru or imports mineru modules.
Persistence & Privilege
`always` is set to false, and the skill does not request elevated system-wide privileges or modify other skills' configs. It writes output files to user-specified directories (intentional persistence), which is expected for this use case.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install mineru-pdf
- After installation, invoke the skill by name or use /mineru-pdf
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of mineru-pdf, a PDF parser supporting text, tables, and formulas with Apple Silicon optimization.
- Parse PDF documents using MinerU MCP to extract structured content (text, tables, formulas)
- Supports multiple backends including MLX for Apple Silicon and a general pipeline
- Provides both a direct parsing tool (persistent output) and MinerU MCP integration (temporary output)
- Handles advanced options: specific page ranges, backend selection, table/formula toggles
- Returns structured Markdown output with metadata, Markdown tables, and LaTeX for formulas
- Supports PDF and various image formats; built-in OCR for scanned documents
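As an illustration of the page-range option mentioned in the list above, the sketch below expands a range specification into page numbers. The `"1-3,5"` syntax and the function name `parse_page_ranges` are assumptions made for this example; the listing does not state which format the skill accepts.

```python
def parse_page_ranges(spec: str) -> list[int]:
    """Expand a spec such as "1-3,5" into sorted, unique 1-based page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

For example, `parse_page_ranges("1-3,5")` returns `[1, 2, 3, 5]`, and a reversed range such as `"3-1"` raises `ValueError`.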
Frequently Asked Questions
What is Mineru Pdf?
Parse PDF documents with MinerU MCP to extract text, tables, and formulas. Supports multiple backends including MLX-accelerated inference on Apple Silicon. It is an AI Agent Skill for Claude Code / OpenClaw, with 894 downloads so far.
How do I install Mineru Pdf?
Run "/install mineru-pdf" in the OpenClaw or Claude Code chat to install the skill. The skill depends on the uvx binary and the MinerU Python package, and model downloads may occur on first use.
Is Mineru Pdf free?
Yes, Mineru Pdf is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Mineru Pdf support?
Mineru Pdf runs anywhere OpenClaw / Claude Code is available. The MLX backend targets Apple Silicon; a general pipeline backend is also supported.
Who created Mineru Pdf?
It is built and maintained by Etoile04 (@etoile04); the current version is v1.0.0.