a-i-r

MinerU PDF Extractor

Author: A-I-R · GitHub · v1.0.5
cross-platform · ✓ Security check passed
Total downloads: 940
Favorites: 2
Current installs: 2
Versions: 6
Install in OpenClaw
/install mineru-pdf-extractor
Description
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Security Recommendations
This skill appears to be what it claims: a set of scripts that call the MinerU API to parse PDFs. Before installing or running it:
  1. Set MINERU_TOKEN (or MINERU_API_KEY). SKILL.md requires it even though the top-level registry metadata omits it.
  2. Review the included shell scripts, and run them only if you trust the source and the MinerU endpoints listed (mineru.net, mineru.oss-cn-shanghai.aliyuncs.com, cdn-mineru.openxlab.org.cn).
  3. The scripts use curl and unzip, and may use jq or python3 if present; install jq or python3 if you want improved JSON handling.
  4. Treat your MinerU token as sensitive: do not expose it in public repos or logs, and consider least-privilege options with the provider.
  5. If you will process sensitive PDFs, verify the provider's privacy policy before uploading.
Overall: coherent and low-risk for its stated purpose; the only items to fix are the metadata inaccuracy and the tooling notes mentioned above.
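The token advice above can be illustrated with a small Python sketch. It is not taken from the skill's shell scripts: the helper names `resolve_mineru_token` and `redact` are hypothetical, and the precedence of MINERU_TOKEN over MINERU_API_KEY is an assumption made for illustration.

```python
import os


def resolve_mineru_token(env=None):
    """Return the token from MINERU_TOKEN, falling back to MINERU_API_KEY.

    The lookup order is an assumption; the listing only says that either
    variable may be used.
    """
    env = os.environ if env is None else env
    for name in ("MINERU_TOKEN", "MINERU_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("Set MINERU_TOKEN or MINERU_API_KEY before running the scripts")


def redact(token, keep=4):
    """Mask a token for logging, keeping only the last `keep` characters."""
    if len(token) <= keep:
        return "*" * len(token)
    return "*" * (len(token) - keep) + token[-keep:]
```

Logging `redact(resolve_mineru_token())` instead of the raw value is one way to keep the token out of logs and shell history.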
Functional Analysis
Type: OpenClaw Skill
Name: mineru-pdf-extractor
Version: 1.0.5
The skill bundle is classified as benign. It demonstrates strong security practices, including explicit input sanitization (`validate_filename`, `validate_dirname`, `escape_json`), URL validation (whitelisting `cdn-mineru.openxlab.org.cn` for downloads in `scripts/local_file_step4_download.sh` and `scripts/online_file_step2_poll_result.sh`), and ZIP file integrity checks (`unzip -t`). The documentation (`SKILL.md`, `docs/*.md`) clearly outlines these security measures and their purpose, indicating a deliberate effort to prevent common vulnerabilities such as directory traversal and injection attacks. No evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the AI agent was found.
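For readers unfamiliar with host allowlisting and filename sanitization, here is a minimal Python analogue of the checks described above. It does not reproduce the skill's shell functions: `is_allowed_download_url` and `is_safe_filename` are hypothetical names, and the HTTPS-only rule and the permitted character set are assumptions.

```python
import re
from urllib.parse import urlparse

# Download host named in the analysis above.
ALLOWED_DOWNLOAD_HOSTS = {"cdn-mineru.openxlab.org.cn"}


def is_allowed_download_url(url):
    """Accept only HTTPS URLs whose hostname is exactly an allowlisted host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_DOWNLOAD_HOSTS


def is_safe_filename(name):
    """Reject names with path separators, '..', or a leading dot (assumed policy)."""
    if ".." in name:
        return False
    return re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", name) is not None
```

Comparing the parsed hostname, rather than searching the URL string for the domain, avoids being fooled by look-alike URLs such as `https://cdn-mineru.openxlab.org.cn.example.com/`.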
Capability Assessment
Purpose & Capability
The skill's name/description (PDF → Markdown using MinerU) matches the included scripts and docs: they call MinerU API endpoints, upload files to presigned OSS URLs, poll results and download a ZIP with parsed Markdown. One inconsistency: the registry metadata at the top states "Required env vars: none", but the SKILL.md and all scripts clearly require an API token (MINERU_TOKEN or MINERU_API_KEY). This is likely an authoring/metadata omission rather than malicious behavior, but users should be aware the token is required.
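The "poll results" step mentioned above follows a common pattern that can be sketched generically in Python. The sketch makes no claim about the MinerU API itself: `fetch_status` is a caller-supplied function, and the `state` field with values `"done"` and `"failed"` is a hypothetical response shape chosen for illustration.

```python
import time


def poll_until_done(fetch_status, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until it reports completion, failure, or timeout.

    fetch_status must return a dict; the 'state' and 'error' keys are
    assumed names, not documented MinerU fields.
    """
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        state = status.get("state")
        if state == "done":
            return status
        if state == "failed":
            raise RuntimeError(f"parsing failed: {status.get('error', 'unknown error')}")
        if attempt < max_attempts:
            sleep(interval_s)  # wait before the next check
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

Bounding the number of attempts keeps a stuck task from blocking the caller indefinitely.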
Instruction Scope
The runtime instructions and scripts operate within stated scope: reading a local PDF path (when using local flow), validating/sanitizing inputs, calling MinerU API endpoints under MINERU_BASE_URL, uploading to presigned OSS URLs and downloading results from the official CDN host. Scripts include input sanitization, ZIP validation and directory traversal checks. They do not attempt to read unrelated system files or send data to unexpected external endpoints. Minor tooling note: scripts optionally pipe responses to `python3 -m json.tool` for pretty-printing but SKILL.md does not list python3 as a recommended/required tool.
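The ZIP validation and directory traversal checks mentioned above can be illustrated with Python's standard `zipfile` module. This is a sketch of the general technique, not the skill's shell implementation (which, per the analysis, uses `unzip -t`); `safe_extract` is a hypothetical helper name.

```python
import os
import zipfile


def safe_extract(zip_source, dest_dir):
    """Verify archive integrity, reject escaping paths, then extract.

    zip_source may be a file path or a binary file-like object.
    Returns the list of member names that were extracted.
    """
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_source) as archive:
        # testzip() returns the first corrupt member name, or None if all pass.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt archive member: {bad_member}")
        for name in archive.namelist():
            target = os.path.realpath(os.path.join(dest_root, name))
            # A member is unsafe if it would resolve outside dest_root.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(dest_root)
        return archive.namelist()
```

Checking every member before extracting anything means a single bad entry aborts the whole operation rather than leaving a partially extracted directory.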
Install Mechanism
There is no install spec; this is an instruction-only skill with included shell scripts. Nothing in the bundle downloads arbitrary code at install time. Risk is low from the install mechanism itself. However, running the provided scripts will execute code included in the repo, so users should review them before executing.
Credentials
The scripts require a single service credential (MINERU_TOKEN or MINERU_API_KEY) and optionally MINERU_BASE_URL. That is proportional for a MinerU API integration. The only notable mismatch is registry metadata claiming no required env vars while SKILL.md and scripts require the token—this should be corrected. No unrelated secrets or broad cloud credentials (AWS, GCP, etc.) are requested.
Persistence & Privilege
The skill does not request permanent or always-on privileges, does not alter other skills or system-wide configs, and is user-invocable only. Default autonomous invocation is allowed (the platform default), but the skill itself does not request elevated persistence.
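How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install mineru-pdf-extractor
  3. Once installed, invoke the Skill by name or trigger it with /mineru-pdf-extractor
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.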
Version History
v1.0.5
No file changes were detected in this version.
- Version metadata updated without modifications to any skill files.
- No user-facing features, documentation, or code have changed.
v1.0.4
- Added security functions to the files in the docs folder.
v1.0.3
No functional changes in this release.
- Documentation metadata in SKILL.md was updated for improved formatting and clarity.
- Added author, version, requirements, and optional fields in YAML format.
- Linked and described both English and Chinese documentation files.
- No changes to code or features.
v1.0.2
- Added Chinese documentation files: SKILL_zh.md and two detailed workflow guides under docs/ for both online and local document parsing.
- Updated requirements to mention optional jq dependency for enhanced JSON parsing and security.
- No changes to main logic or features; update focuses on documentation and usability for Chinese-speaking users.
v1.0.1
- Added homepage and source repository links to metadata for easier access to the MinerU website and GitHub source.
- Clarified that this is a community skill and not an official MinerU product.
- No behavioral changes or new features; documentation improvements only.
v1.0.0
Initial release of mineru-pdf-extractor.
- Extract PDF content to Markdown using MinerU API, with support for formulas, tables, and OCR.
- Provides scripts and documentation for both local file and online URL parsing methods.
- Local parsing has a 4-step process; online parsing is a 2-step process.
- Output includes Markdown files, extracted images, and structured JSON data.
- Requires curl, unzip, and a MinerU API token set as an environment variable.
- Detailed usage guides and batch processing examples included.
Metadata
Slug: mineru-pdf-extractor
Version: 1.0.5
License:
Total installs: 2
Current installs: 2
Version count: 6
FAQ

What is MinerU PDF Extractor?

MinerU PDF Extractor extracts PDF content to Markdown using the MinerU API. It supports formulas, tables, and OCR, and provides both local file and online URL parsing methods. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 940 total downloads to date.

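How do I install MinerU PDF Extractor?

Run the command "/install mineru-pdf-extractor" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the scripts require a MinerU API token (MINERU_TOKEN or MINERU_API_KEY) at run time.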

Is MinerU PDF Extractor free?

Yes. MinerU PDF Extractor is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does MinerU PDF Extractor support?

MinerU PDF Extractor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops MinerU PDF Extractor?

It is developed and maintained by A-I-R (@a-i-r); the current version is v1.0.5.
