mineru-extract

mineru document extractor

Author: MinerU-Extract · GitHub · v0.1.29 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 3200
Favorites: 6
Current installs: 6
Versions: 38
Install in OpenClaw
/install mineru-document-extractor
Description
MinerU document extraction — convert PDFs, scanned documents, images, Word (DOC/DOCX), PowerPoint (PPT/PPTX), and web pages into clean Markdown, HTML, LaTeX,...
Security recommendations
This skill is coherent: it simply instructs the agent to use the mineru-open-api CLI, which sends documents to mineru.net for server-side extraction. Before installing, consider:
  1. Privacy: any document you submit (including sensitive content) will be uploaded to mineru.net; confirm the service's privacy policy and whether that is acceptable.
  2. Token storage: if you use MINERU_TOKEN, it may be stored in ~/.mineru/config.yaml or as an env var; protect that token.
  3. Package provenance: verify that the mineru-open-api npm/GitHub project is the legitimate upstream (check repository, maintainers, and package integrity) before installing in production.
  4. Network fetching: the CLI can crawl arbitrary URLs; if you allow autonomous agent runs, restrict network access or avoid running crawl on untrusted inputs to reduce SSRF or accidental exfiltration to internal endpoints.
  5. Sandbox installation: if you have doubts, test the CLI in an isolated environment (container or VM) and inspect what (if any) files/configs are created.
If you want a local-only option (no external upload), look for tools that run extraction locally rather than calling a remote API.
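As a rough illustration of the token-storage point, the sketch below checks whether a token file is accessible to other users on a POSIX system. The path ~/.mineru/config.yaml comes from the advice above; the helper name `is_private_file` is hypothetical, and the check is only one possible reading of "protect that token", not something the skill or the CLI is known to do itself.

```python
import stat
from pathlib import Path


def is_private_file(path: Path) -> bool:
    """Return True if no group/other permission bits are set on `path` (POSIX only)."""
    mode = path.stat().st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Config path as mentioned in the advice above.
    config = Path.home() / ".mineru" / "config.yaml"
    if not config.exists():
        print(f"{config} not found; nothing to check")
    elif is_private_file(config):
        print(f"{config} is only accessible to its owner")
    else:
        print(f"{config} is accessible to group/others; consider: chmod 600 {config}")
```

On Windows the permission bits reported by `stat` do not carry the same meaning, so this check would need a different approach there.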
Functional analysis
Type: OpenClaw Skill
Name: mineru-document-extractor
Version: 0.1.29
The skill is a legitimate wrapper for the MinerU document extraction CLI (mineru-open-api), used for converting PDFs and other documents into Markdown or HTML. It explicitly discloses in its metadata that data is transmitted to the mineru.net API for processing. The skill correctly restricts the agent's capabilities using the 'allowed-tools' field to only the specific 'mineru-open-api' binary, and the instructions in SKILL.md are well-aligned with the tool's stated purpose without any evidence of malicious intent or prompt injection.
Capability assessment
Purpose & Capability
Name/description match the runtime instructions and declared dependencies: the skill is an instruction-only wrapper around the mineru-open-api CLI. The declared required binary (mineru-open-api) and optional MINERU_TOKEN env/config are appropriate for a cloud-based document extraction tool.
Instruction Scope
SKILL.md only instructs the agent to run the mineru-open-api CLI (flash-extract, extract, crawl, auth). It references the token, optional config (~/.mineru/config.yaml), and remote API host (mineru.net). This stays within the stated purpose. Note: the crawl command fetches arbitrary HTTP/HTTPS URLs (so a running agent could request internal or external endpoints if invoked), and all document data is sent to mineru.net for server-side processing per the metadata.
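One way to act on the crawl caveat is to screen URLs before an agent is allowed to pass them to the crawl command. The following is a minimal sketch under stated assumptions: `is_public_http_url` is a hypothetical helper, not part of mineru-open-api, and it only inspects the URL text. It does not resolve DNS, so a public-looking hostname that resolves to an internal address would still pass; network-level restrictions remain the stronger control.

```python
import ipaddress
from urllib.parse import urlsplit


def is_public_http_url(url: str) -> bool:
    """Reject non-HTTP(S) URLs and URLs whose host is localhost or a non-global literal IP."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL (e.g. a broken IPv6 literal)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Host is a DNS name rather than an IP literal; it is NOT resolved here.
        return True
    # Loopback, private, link-local, and reserved ranges are not global.
    return addr.is_global


if __name__ == "__main__":
    for candidate in ("https://example.com/doc.pdf", "http://169.254.169.254/"):
        print(candidate, "->", is_public_http_url(candidate))
```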
Install Mechanism
Install options are standard package installs (npm package mineru-open-api and a Go install from github.com/opendatalab). No downloads from random shorteners or personal IP addresses are specified. As with any third-party npm/go package, verify the package source/repo and integrity before installing into sensitive environments.
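A generic way to check integrity is to compare a downloaded artifact's SHA-256 digest with a value published by the upstream project. The sketch below assumes such a digest is available, which this listing does not confirm; the helper names are illustrative. npm and Go also have their own integrity mechanisms, which this does not replace.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A caller would obtain `expected_hex` from a trusted channel (for example, a release page served over HTTPS) and refuse to install when `matches_expected` returns False.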
Credentials
The skill does not require unrelated credentials. MINERU_TOKEN is optional and justified for higher-capability 'extract' and 'crawl' modes; config path (~/.mineru/config.yaml) is consistent with the CLI's auth behavior. No unrelated secrets or system credentials are requested.
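The relationship between the token and the CLI modes can be sketched as a small preflight check. This assumes, as the note above suggests, that extract and crawl need a token while flash-extract does not; the function name is hypothetical. Only the MINERU_TOKEN environment variable is checked, so a token stored in ~/.mineru/config.yaml would not be detected by this sketch.

```python
import os
import shutil

CLI_NAME = "mineru-open-api"            # binary named in the assessment above
TOKEN_FREE = ("flash-extract",)         # works without a token per the changelog
TOKEN_REQUIRED = ("extract", "crawl")   # assumption based on the Credentials note


def usable_subcommands(cli_found: bool, token_present: bool) -> list[str]:
    """Return the subcommands expected to be usable given the local setup."""
    if not cli_found:
        return []
    modes = list(TOKEN_FREE)
    if token_present:
        modes.extend(TOKEN_REQUIRED)
    return modes


if __name__ == "__main__":
    cli_found = shutil.which(CLI_NAME) is not None
    token_present = bool(os.environ.get("MINERU_TOKEN"))
    print(usable_subcommands(cli_found, token_present))
```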
Persistence & Privilege
The always setting is false, and the skill does not request elevated or persistent platform privileges. It does not modify other skills or agent-wide settings. Note: autonomous agent invocation (the default), combined with the ability to crawl URLs, means an agent could be used to fetch arbitrary endpoints if allowed to run without restrictions.
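How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install mineru-document-extractor
  3. Once installed, invoke the Skill by name or trigger it with /mineru-document-extractor
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output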
Version history
v0.1.29
No user-visible changes in this release — SKILL.md and included documentation remain unchanged.
v0.1.28
- Updated the skill metadata to remove mention of the mineru-open-api source code reference and clarify package installation steps.
- Adjusted installation info and metadata to align with current packaging and supported platforms.
- Removed references to the mineru-open-api CLI as the official open-source client in the privacy notice.
- No changes to the tool's usage, commands, or core functionality.
v0.1.27
mineru-document-extractor 0.1.27
- Added metadata section to SKILL.md to improve discoverability and clarify installation, privacy, and usage details.
- No changes to command syntax, features, or workflow.
- No code or functional changes detected.
v0.1.26
- MinerU flash-extract now features table recognition, formula recognition, and OCR, expanding its quick extraction capabilities.
- Updated comparison: both extraction modes (flash-extract and extract) now support tables, formulas, and OCR.
- Clarified that advanced features like VLM model selection, multi-format output, and batch processing remain exclusive to the precision extract mode.
- Adjusted usage guidelines and workflow examples to reflect the enhanced abilities of flash-extract for instant document conversion.
v0.1.25
- Documentation was significantly rewritten for clarity and conciseness.
- All usage instructions and workflows are now consistently branded as "MinerU".
- Reorganized sections and simplified tables for easier reading.
- Command/flag lists are streamlined; repetitive/advanced batch details were removed or condensed.
- More direct agent usage rules and examples are provided.
- Technical content and command references remain unchanged.
v0.1.24
- Skill name updated from "mineru" to "MinerU Document Extractor"
- Title and headings clarified for consistency and branding
- No changes to functionality or commands
- No code or logic modifications detected
v0.1.23
Version 0.1.23 of mineru-document-extractor
- No functional or documentation changes detected in this release.
- Version update with no file modifications.
v0.1.22
No functional changes; minor metadata and description update.
- Updated the skill description and metadata for clarity and completeness.
- No code or command changes.
v0.1.21
- Added tags section to metadata for improved discoverability and categorization.
- No functional changes to document extraction features or CLI usage.
- Documentation updated only; no code or behavior changes included.
v0.1.20
No functional changes detected in version 0.1.20.
- No code or documentation changes in this release.
- Skill content and capabilities remain unchanged.
v0.1.19
No changes detected in this version.
- Version 0.1.19 was released with no detected file or documentation updates.
v0.1.18
- Updated documentation headline from "Document Extraction with mineru-open-api" to "Document Extraction with mineru agent api"
- No code or functionality changes detected in this version
- All installation, usage, and feature notes remain unchanged
v0.1.17
- Added _meta.json file for metadata management.
- No changes to core functionality or documentation content.
v0.1.16
- Removed the _meta.json file from the skill package.
- No changes to functionality or user-facing documentation.
v0.1.15
No file changes detected in this release.
- No updates or changes; internal version bump only.
- Functionality and documentation remain the same as the previous version.
v0.1.14
No changes detected in this version.
- Version bumped to 0.1.14 with no file or documentation changes.
- No new features, fixes, or updates included in this release.
v0.1.13
- No file or documentation changes detected in this release.
- Version bump only; functionality remains unchanged.
v0.1.12
- Removed the file CONTRIBUTING.md from the project.
- No functional or user-facing changes.
v0.1.11
- Added CONTRIBUTING.md to provide contribution guidelines.
- Added _meta.json for additional skill metadata configuration.
v0.1.10
Fix: move config.yaml from requires to optional (flash-extract works without it)
Metadata
Slug mineru-document-extractor
Version 0.1.29
License MIT-0
Total installs 6
Current installs 6
Version count 38
FAQ

What is mineru document extractor?

MinerU document extraction: convert PDFs, scanned documents, images, Word (DOC/DOCX), PowerPoint (PPT/PPTX), and web pages into clean Markdown, HTML, LaTeX,... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 3200 total downloads to date.

How do I install mineru document extractor?

Run the command "/install mineru-document-extractor" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is mineru document extractor free?

Yes. mineru document extractor is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does mineru document extractor support?

mineru document extractor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops mineru document extractor?

It is developed and maintained by MinerU-Extract (@mineru-extract); the current version is v0.1.29.
