Doc OCR

Name: Doc OCR
Author: mzlzyca

Author: mzlzyCA · v0.4.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 201

Install in OpenClaw

/install doc-ocr

Description

OCR (Optical Character Recognition) for Word documents (.docx) containing scanned pages or image-embedded content. Uses MinerU to extract text from Word file...

Usage (SKILL.md)

Doc OCR

Use OCR to extract text from Word (.docx) files that contain scanned pages or image-embedded content, using MinerU.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# OCR extraction from .docx (requires token)
mineru-open-api extract report.docx --ocr -o ./out/

# With VLM model for better accuracy on complex image layouts
mineru-open-api extract report.docx --ocr --model vlm -o ./out/

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
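A script that wraps the CLI can check for the token before doing any work. The following is a minimal sketch, assuming the token is supplied through the MINERU_TOKEN environment variable described above. The function name require_mineru_token is hypothetical and not part of mineru-open-api, and the check does not cover a token configured through the interactive mineru-open-api auth flow, since the source does not say where that token is stored.

```shell
# Hypothetical helper (not part of mineru-open-api): succeed only if
# MINERU_TOKEN is set to a non-empty value in the environment.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    # Report the problem on stderr so it does not mix with document output.
    echo "MINERU_TOKEN is not set; create a token at https://mineru.net/apiManage/token" >&2
    return 1
  fi
  return 0
}
```

Typical use would be to chain it in front of the real call, for example: require_mineru_token && mineru-open-api extract report.docx --ocr -o ./out/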

Capabilities

Supported input: .docx (local file or URL)
OCR is only available via extract (requires token)
Use --ocr flag to enable OCR on image-embedded content
Use --model vlm for complex or mixed-content documents
Language hint with --language (default: ch, use en for English)
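The flags listed above can be combined in a small wrapper. This is only a sketch: the function name ocr_docx and the MINERU_CLI override variable are hypothetical, and it assumes --language takes its value as a separate argument, the same way --model vlm is written in the Quick Start.

```shell
# Hypothetical wrapper (not part of mineru-open-api).
#   $1  input .docx (local path or URL)
#   $2  output directory
#   $3  optional language hint, e.g. "en" (omit to keep the CLI default)
#   $4  optional model, e.g. "vlm"
ocr_docx() {
  input=$1
  outdir=$2
  language=${3:-}
  model=${4:-}

  # Build the argument list step by step; --ocr is always included.
  set -- extract "$input" --ocr
  if [ -n "$language" ]; then set -- "$@" --language "$language"; fi
  if [ -n "$model" ]; then set -- "$@" --model "$model"; fi
  set -- "$@" -o "$outdir"

  # MINERU_CLI is an assumed override for dry runs, e.g. MINERU_CLI=echo.
  "${MINERU_CLI:-mineru-open-api}" "$@"
}
```

Setting MINERU_CLI=echo prints the command that would be run instead of executing it, which is a convenient way to inspect the flags before sending a document to the service.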

Notes

OCR is NOT available in flash-extract — use extract with --ocr
If the .docx has a normal text layer, OCR is not needed — use doc-extract instead
Output goes to stdout by default; use -o <dir> to save to a file or directory
All progress/status messages go to stderr; document content goes to stdout
MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
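Because document content goes to stdout and progress messages go to stderr, the two streams can be captured separately with ordinary shell redirection. The helper below is a sketch; the name split_streams and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper: run a command, saving stdout (document content) to one
# file and stderr (progress/status messages) to another.
#   $1  file that receives stdout
#   $2  file that receives stderr
#   remaining arguments: the command to run
split_streams() {
  content_file=$1
  log_file=$2
  shift 2
  "$@" >"$content_file" 2>"$log_file"
}
```

For example: split_streams report.out ocr.log mineru-open-api extract report.docx --ocr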

Security Recommendations

This skill appears to do what it says: it runs the MinerU CLI to OCR .docx files and requires a MinerU API token. Before installing: (1) confirm you trust the npm package or GitHub repo (inspect source if you need high assurance); (2) treat MINERU_TOKEN like a secret—use a token with minimal scope and do not store it in shared places; (3) assume documents processed may be uploaded to MinerU's servers—do not OCR highly sensitive documents unless you verify local-only processing or run your own MinerU instance; (4) prefer installing from official project releases or from source if you want to audit behavior (npm installs can run scripts).

Functional Analysis

Type: OpenClaw Skill
Name: doc-ocr
Version: 0.4.0

The skill provides instructions for using the MinerU OCR service to extract text from Word documents via the 'mineru-open-api' CLI tool. It requires a legitimate API token and points to official resources from OpenDataLab (Shanghai AI Lab). No malicious code, obfuscation, or prompt injection attempts were found; the behavior is entirely consistent with the stated purpose of document OCR.

Capability Assessment

✓ Purpose & Capability

Name/description (OCR for .docx using MinerU) matches the declared requirements: a mineru-open-api binary and a MINERU_TOKEN. The install options (npm or go install for mineru-open-api) are the expected way to obtain that CLI.

ℹ Instruction Scope

SKILL.md only instructs running mineru-open-api on local files or URLs and configuring MINERU_TOKEN. It does not ask the agent to read unrelated files or environment variables. Important caveat: the docs and auth flow imply processing via MinerU's service (token management and API token creation), so document contents may be uploaded to an external service—review privacy requirements before OCRing sensitive documents.

ℹ Install Mechanism

Install spec uses npm (mineru-open-api) or go install from a GitHub path — both are reasonable for a CLI. Note that global npm installs run package scripts and that npm packages come from the public registry; if you need higher assurance, inspect the package source or install from the project repo directly.

✓ Credentials

Only MINERU_TOKEN is required and set as the primary credential, which is proportionate for a remote OCR API. Keep the token secret and limit its scope if possible.

✓ Persistence & Privilege

Skill is not always-enabled and does not request system config paths or other skills' credentials. It is user-invocable and can be autonomously called by the agent (normal behavior) but does not request elevated persistence.

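How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install doc-ocr
Once installed, invoke the Skill by name or trigger it with /doc-ocr
Provide the required inputs as described in the Skill's parameter notes to get structured output

Version History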

v0.4.0

SEO: expand description for better ClawHub vector search discovery

v0.3.0

Rollback to original version

v0.2.0

SEO optimization: expanded description with rich keywords, trigger phrases, and bilingual content for better ClawHub vector search ranking.

v1.1.0

Update to v1.1.0

v1.0.1

Fix: declare MINERU_TOKEN credential in metadata

v1.0.0

Doc OCR - use OCR to extract text from Word (.docx) files with scanned or image-embedded content usi

Metadata

Slug doc-ocr

Version 0.4.0

License MIT-0

Total installs 0

Current installs 0

Versions 6

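FAQ

What is Doc OCR?

OCR (Optical Character Recognition) for Word documents (.docx) containing scanned pages or image-embedded content. Uses MinerU to extract text from Word file... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 201 total downloads so far.

How do I install Doc OCR?

Run "/install doc-ocr" in the OpenClaw or Claude Code chat box to install the skill in one step. The skill itself needs no further setup, but running it requires the mineru-open-api CLI and a MinerU token (see Install and Authentication).

Is Doc OCR free?

Yes. Doc OCR is free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Doc OCR support?

Doc OCR is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Doc OCR?

It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.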
