Total downloads: 201
Favorites: 0
Current installs: 0
Versions: 6
Install in OpenClaw
/install doc-ocr
Description
OCR (Optical Character Recognition) for Word documents (.docx) containing scanned pages or image-embedded content. Uses MinerU to extract text from Word file...
Usage (SKILL.md)
Doc OCR
Use OCR to extract text from Word (.docx) files that contain scanned pages or image-embedded content, using MinerU.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# OCR extraction from .docx (requires token)
mineru-open-api extract report.docx --ocr -o ./out/
# With VLM model for better accuracy on complex image layouts
mineru-open-api extract report.docx --ocr --model vlm -o ./out/
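When several scanned documents need processing, the Quick Start command can be wrapped in a loop. The following is a minimal sketch, not part of the skill itself: `ocr_all_docx` is a hypothetical helper name, and the one-output-folder-per-document layout is an assumption made for illustration. Only the `extract`, `--ocr`, and `-o` options shown above are used.

```shell
# Sketch: OCR every .docx in a source directory, one output folder per file.
# ocr_all_docx is a hypothetical helper name, not part of mineru-open-api.
ocr_all_docx() {
  src_dir=$1
  out_dir=$2
  for doc in "$src_dir"/*.docx; do
    # Skip the literal pattern when the directory has no .docx files.
    [ -e "$doc" ] || continue
    name=$(basename "$doc" .docx)
    mkdir -p "$out_dir/$name" || return 1
    # Same invocation as the Quick Start, applied to each file in turn.
    mineru-open-api extract "$doc" --ocr -o "$out_dir/$name/" || return 1
  done
}
```

For example, `ocr_all_docx ./scans ./out` would write the result for `./scans/report.docx` under `./out/report/`. The loop stops at the first failing document so that errors are not silently skipped.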
Authentication
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
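In scripts or CI jobs that supply the token through the environment variable, it can help to fail early when `MINERU_TOKEN` is missing. The sketch below assumes the environment-variable route only; it does not detect a token stored by the interactive `mineru-open-api auth` setup. `require_token` is a hypothetical helper name, not a MinerU command.

```shell
# Sketch: stop before calling the CLI if MINERU_TOKEN is not exported.
# require_token is a hypothetical helper, not a mineru-open-api command.
require_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; export it or run 'mineru-open-api auth' first." >&2
    return 1
  fi
}
```

A typical use would be `require_token && mineru-open-api extract report.docx --ocr -o ./out/`, so the extraction is only attempted when a token is present.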
Capabilities
- Supported input: .docx (local file or URL)
- OCR is only available via extract (requires token)
- Use the --ocr flag to enable OCR on image-embedded content
- Use --model vlm for complex or mixed-content documents
- Language hint with --language (default: ch, use en for English)
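The options listed above can be combined in a single call. The following sketch shows one way to assemble them conditionally; `ocr_docx` and its argument order are hypothetical and chosen only for illustration, while the flags themselves (`--ocr`, `-o`, `--model`, `--language`) are the ones named in the list.

```shell
# Sketch: build an extract call from optional model and language settings.
# ocr_docx is a hypothetical wrapper; only flags from the list above are used.
# Usage: ocr_docx <input.docx or URL> <out_dir> [model] [language]
ocr_docx() {
  input=$1
  out_dir=$2
  model=${3:-}
  language=${4:-}
  # Start from the base OCR invocation, then append optional flags.
  set -- extract "$input" --ocr -o "$out_dir"
  if [ -n "$model" ]; then
    set -- "$@" --model "$model"
  fi
  if [ -n "$language" ]; then
    set -- "$@" --language "$language"
  fi
  mineru-open-api "$@"
}
```

With this wrapper, `ocr_docx report.docx ./out/ vlm en` would run the VLM model with an English language hint, and `ocr_docx report.docx ./out/` would fall back to the tool's defaults.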
Notes
- OCR is NOT available in flash-extract; use extract with --ocr
- If the .docx has a normal text layer, OCR is not needed; use doc-extract instead
- Output goes to stdout by default; use -o <dir> to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
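Because document content goes to stdout and progress messages go to stderr, the two streams can be captured separately with ordinary shell redirection. This is a minimal sketch: `ocr_to_file` is a hypothetical helper name, and the output and log file paths are placeholders chosen by the caller.

```shell
# Sketch: keep extracted content and progress messages in separate files.
# ocr_to_file is a hypothetical helper; both file paths are placeholders.
ocr_to_file() {
  input=$1
  content_file=$2
  log_file=$3
  # No -o flag here: content arrives on stdout, progress/status on stderr.
  mineru-open-api extract "$input" --ocr > "$content_file" 2> "$log_file"
}
```

For instance, `ocr_to_file report.docx report.out progress.log` would leave only the extracted document in `report.out`, which keeps the content clean when it is piped into another tool.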
Security recommendations
This skill appears to do what it says: it runs the MinerU CLI to OCR .docx files and requires a MinerU API token. Before installing: (1) confirm you trust the npm package or GitHub repo (inspect source if you need high assurance); (2) treat MINERU_TOKEN like a secret—use a token with minimal scope and do not store it in shared places; (3) assume documents processed may be uploaded to MinerU's servers—do not OCR highly sensitive documents unless you verify local-only processing or run your own MinerU instance; (4) prefer installing from official project releases or from source if you want to audit behavior (npm installs can run scripts).
Functional analysis
Type: OpenClaw Skill
Name: doc-ocr
Version: 0.4.0
The skill provides instructions for using the MinerU OCR service to extract text from Word documents via the 'mineru-open-api' CLI tool. It requires a legitimate API token and points to official resources from OpenDataLab (Shanghai AI Lab). No malicious code, obfuscation, or prompt injection attempts were found; the behavior is entirely consistent with the stated purpose of document OCR.
Capability assessment
Purpose & Capability
Name/description (OCR for .docx using MinerU) matches the declared requirements: a mineru-open-api binary and a MINERU_TOKEN. The install options (npm or go install for mineru-open-api) are the expected way to obtain that CLI.
Instruction Scope
SKILL.md only instructs running mineru-open-api on local files or URLs and configuring MINERU_TOKEN. It does not ask the agent to read unrelated files or environment variables. Important caveat: the docs and auth flow imply processing via MinerU's service (token management and API token creation), so document contents may be uploaded to an external service—review privacy requirements before OCRing sensitive documents.
Install Mechanism
Install spec uses npm (mineru-open-api) or go install from a GitHub path — both are reasonable for a CLI. Note that global npm installs run package scripts and that npm packages come from the public registry; if you need higher assurance, inspect the package source or install from the project repo directly.
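One way to inspect the package source before a global install is to download the published tarball and unpack it without installing anything. The sketch below relies on `npm pack`, which fetches a registry package as a `.tgz` file, and on the npm convention that package tarballs unpack into a `package/` folder. `fetch_package_source` is a hypothetical helper name, and the destination directory is whatever the caller chooses.

```shell
# Sketch: download and unpack a published npm package for manual review.
# fetch_package_source is a hypothetical helper name.
# Nothing is installed, so the package's install scripts are not executed.
fetch_package_source() {
  pkg=$1
  dest=$2
  mkdir -p "$dest" || return 1
  (
    cd "$dest" || exit 1
    # npm pack prints the tarball file name as its last line of output.
    tarball=$(npm pack "$pkg" | tail -n 1)
    [ -n "$tarball" ] || exit 1
    tar -xzf "$tarball" || exit 1
  )
}
```

After `fetch_package_source mineru-open-api ./audit`, the `scripts` section of `./audit/package/package.json` is a reasonable first place to look, since that is where install-time hooks are declared.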
Credentials
Only MINERU_TOKEN is required and set as the primary credential, which is proportionate for a remote OCR API. Keep the token secret and limit its scope if possible.
Persistence & Privilege
Skill is not always-enabled and does not request system config paths or other skills' credentials. It is user-invocable and can be autonomously called by the agent (normal behavior) but does not request elevated persistence.
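How to use
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install doc-ocr
- Once installation finishes, invoke the Skill by name or trigger it with /doc-ocr
- Provide the required inputs according to the Skill's parameter description to receive structured output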
Version history
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization: expanded description with rich keywords, trigger phrases, and bilingual content for better ClawHub vector search ranking.
v1.1.0
Update to v1.1.0
v1.0.1
Fix: declare MINERU_TOKEN credential in metadata
v1.0.0
Doc OCR - use OCR to extract text from Word (.docx) files with scanned or image-embedded content usi
FAQ
What is Doc OCR?
OCR (Optical Character Recognition) for Word documents (.docx) containing scanned pages or image-embedded content. Uses MinerU to extract text from Word file... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 201 cumulative downloads to date.
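How do I install Doc OCR?
Run the command /install doc-ocr in the OpenClaw or Claude Code chat box for a one-step install. The skill itself needs no further configuration, but running it requires the mineru-open-api CLI and a MinerU token, as described under Install and Authentication.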
Is Doc OCR free?
Yes. Doc OCR is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does Doc OCR support?
Doc OCR is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Doc OCR?
It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.