Downloads: 1551
Stars: 0
Active Installs: 1
Versions: 3
Install in OpenClaw
/install pdf-parser-mineru
Description
PDF document parsing tool based on local MinerU, supports converting PDF to Markdown, JSON, and other machine-readable formats.
Usage Guidance
This skill is internally coherent: it installs MinerU and runs the mineru CLI to convert PDFs to Markdown/JSON. Before installing, consider the following:
- mineru is a third-party PyPI package. Review its project page and dependencies, and prefer installing into an isolated virtual environment or container.
- MinerU may download models or contact network endpoints at install time or at runtime. If you need offline or sandboxed processing, verify or block network access.
- The skill requires absolute file paths and can read any PDF you point it at. Avoid supplying sensitive documents to untrusted third-party binaries.
- The included install.sh looks safe but runs pip installs and assumes Python 3.10–3.13. Run it manually rather than automatically if you want to inspect it first.
If you want stronger assurance, review the mineru package source and any model download behavior before use.
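Before running install.sh by hand, it can help to confirm that the interpreter falls inside the Python 3.10–3.13 range noted above. The following is a minimal sketch under that assumption; `is_supported` and the two bounds are hypothetical names for illustration, not part of the skill.

```python
import sys

# Range taken from the guidance above; treat the bounds as an assumption.
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) version falls in the assumed range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "outside the assumed range"
    print(f"Python {current} is {status}")
```

Only the major and minor components are compared, so any patch release of 3.10 through 3.13 passes.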
Capability Analysis
Type: OpenClaw Skill
Name: pdf-parser-mineru
Version: 1.0.2
The OpenClaw AgentSkills skill bundle for PDF parsing using MinerU is classified as benign. All files, including `SKILL.md`, `install.sh`, and `script/pdf_parser.py`, align with the stated purpose of converting PDFs to Markdown or JSON. The `install.sh` script uses standard Python package management tools (`pip`, `uv`) to install the `mineru` dependency without any suspicious remote execution or persistence mechanisms. Crucially, the `script/pdf_parser.py` uses `subprocess.run` with a list of arguments, which safely prevents shell injection vulnerabilities from user-controlled parameters like `file_path` and `output_dir`. There are no indications of prompt injection attempts, data exfiltration, unauthorized network activity, or other malicious behaviors.
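The point about `subprocess.run` with a list of arguments can be illustrated with a small sketch. It does not reproduce `script/pdf_parser.py`: a stand-in Python child process is used instead of the mineru CLI, and `run_echo_child` is a hypothetical helper. It shows that each list element reaches the child as one literal argument, so shell metacharacters in a path are never interpreted.

```python
import json
import subprocess
import sys


def run_echo_child(file_path, output_dir):
    """Pass both values as list elements and return what the child received."""
    # Stand-in for the real CLI: the child just echoes its arguments as JSON.
    child_code = "import json, sys; print(json.dumps(sys.argv[1:]))"
    argv = [sys.executable, "-c", child_code, file_path, output_dir]
    # No shell=True: argv is handed to the OS as-is, with no shell parsing.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# A path containing shell metacharacters arrives unchanged as a single argument.
hostile_path = "/tmp/report.pdf; echo injected && $(whoami)"
received = run_echo_child(hostile_path, "/tmp/out")
print(received)
```

Had the same string been interpolated into a single command line and run with `shell=True`, the `;`, `&&`, and `$(...)` parts would have been interpreted by the shell.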
Capability Assessment
Purpose & Capability
Name/description match the included files: SKILL.md documents running MinerU and the repository provides an install script and a Python wrapper that invokes the mineru CLI. Required capabilities (MinerU installation, Python) are proportional to the stated parsing functionality.
Instruction Scope
Runtime instructions and the Python script stay within the skill's scope: they accept an absolute file path and output directory, run a local mineru CLI process, and read/return generated files. The script sets a couple of local env vars to control device selection for the subprocess but does not read or transmit unrelated system secrets or contact hidden endpoints itself. Note: mineru (the third-party tool) may perform network activity or model downloads — that behavior is external to the skill and should be reviewed if you need offline guarantees.
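Since the skill expects an absolute file path and an output directory, a caller may want to validate both before handing them over. The following is a rough sketch of such a check; `validate_inputs` is a hypothetical helper, and the `.pdf` suffix check is an added assumption, not documented behavior of the skill.

```python
from pathlib import Path


def validate_inputs(file_path: str, output_dir: str) -> tuple[Path, Path]:
    """Reject relative paths and non-PDF inputs before invoking the parser."""
    pdf = Path(file_path)
    out = Path(output_dir)
    if not pdf.is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    if not out.is_absolute():
        raise ValueError(f"output_dir must be absolute: {output_dir!r}")
    # Assumption for this sketch: inputs are identified by a .pdf suffix.
    if pdf.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {pdf.name!r}")
    return pdf, out


if __name__ == "__main__":
    here = Path.cwd()
    print(validate_inputs(str(here / "paper.pdf"), str(here / "out")))
```

The check is purely syntactic: it does not test that the file exists or that the caller is allowed to read it.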
Install Mechanism
There is no platform install spec in the registry metadata, but the included install.sh runs pip and 'uv pip install -U "mineru[all]"'. Installing MinerU via PyPI is expected here; it is a moderate-risk operation (pulling packages from PyPI and possibly downloading models or data at runtime). No obscure URLs, shorteners, or direct archive downloads are used in the provided scripts.
Credentials
The skill requests no environment variables or credentials. The code sets PYTORCH_ENABLE_MPS_FALLBACK and MPS_DEVICE locally for the mineru subprocess (device control only). There are no requests for unrelated secrets or config paths.
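Setting variables "locally for the subprocess" usually means passing a modified copy of the environment through the `env` parameter, which leaves the parent process untouched. Here is a minimal sketch of that pattern, again with a stand-in Python child in place of mineru. The variable names come from the text above, but the values `"1"` and `"0"` are placeholders (the source does not state them), and `run_with_overrides` is a hypothetical helper.

```python
import json
import os
import subprocess
import sys

# Names from the assessment above; the values are placeholders for this sketch.
DEVICE_OVERRIDES = {
    "PYTORCH_ENABLE_MPS_FALLBACK": "1",
    "MPS_DEVICE": "0",
}


def run_with_overrides(argv, overrides):
    """Run argv with extra variables visible only to the child process."""
    env = os.environ.copy()  # copy, so os.environ itself is never modified
    env.update(overrides)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# Stand-in child that reports which of the two variables it can see.
child_code = (
    "import json, os; "
    "print(json.dumps({k: os.environ.get(k) for k in "
    "('PYTORCH_ENABLE_MPS_FALLBACK', 'MPS_DEVICE')}))"
)
parent_before = {name: os.environ.get(name) for name in DEVICE_OVERRIDES}
result = run_with_overrides([sys.executable, "-c", child_code], DEVICE_OVERRIDES)
seen_by_child = json.loads(result.stdout)
parent_after = {name: os.environ.get(name) for name in DEVICE_OVERRIDES}
print(seen_by_child)
```

Because only the copy is changed, the overrides disappear when the child exits and do not leak into other work done by the calling process.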
Persistence & Privilege
Skill flags are standard (always: false, agent invocation allowed). The package does not request permanent system changes or modify other skills' configs. install.sh and the Python script only install MinerU and run it; they do not attempt to persist credentials or enable automatic always-on behavior.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install pdf-parser-mineru
- After installation, invoke the skill by name or use /pdf-parser-mineru
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.2
- Added a Chinese-language documentation file: SKILL_zh.md
- No changes to code or functionality; documentation is now available in both English and Chinese.
v1.0.1
- Fixed the front-matter fields in SKILL.md, merging the original YAML name field and description statement into name/description form.
- No other substantive changes.
v1.0.0
Initial release with MinerU-based PDF parsing and conversion tools.
- Added pdf_to_markdown: Convert PDFs to Markdown with structure, formulas, tables, and image extraction using MinerU.
- Added pdf_to_json: Convert PDFs to structured JSON with detailed layout, blocks, images, tables, and formulas.
- Both tools support OCR, formula/table extraction toggles, multi-language, page range selection, and multiple parsing backends.
- Included setup and system requirements, backend selection tips, troubleshooting, and usage scenarios.
Frequently Asked Questions
What is pdf-parser-mineru?
PDF document parsing tool based on local MinerU, supports converting PDF to Markdown, JSON, and other machine-readable formats. It is an AI Agent Skill for Claude Code / OpenClaw, with 1551 downloads so far.
How do I install pdf-parser-mineru?
Run "/install pdf-parser-mineru" in the OpenClaw or Claude Code chat. MinerU itself is installed by the bundled install.sh, which assumes Python 3.10–3.13.
Is pdf-parser-mineru free?
Yes, pdf-parser-mineru is completely free (open-source). You can download, install and use it at no cost.
Which platforms does pdf-parser-mineru support?
pdf-parser-mineru is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created pdf-parser-mineru?
It is built and maintained by baokui (@baokui); the current version is v1.0.2.