
HTML OCR

Name: HTML OCR
Author: mzlzyca

By mzlzyCA · GitHub · v0.4.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 173

Install in OpenClaw

/install html-ocr

Description

OCR for HTML pages containing image-embedded or scanned content. Uses MinerU to extract text from images within HTML files and web pages. Features: OCR extra...

Usage (SKILL.md)

HTML OCR

Use OCR to extract text from HTML files that contain scanned images or image-embedded content using MinerU.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# OCR extraction from local HTML file (requires token)
mineru-open-api extract page.html --ocr -o ./out/

# With VLM model for better accuracy
mineru-open-api extract page.html --ocr --model vlm -o ./out/

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
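
When the token is supplied through the environment, a small guard can catch a missing value before any file is sent for OCR. The sketch below is an illustration only: `require_mineru_token` is a hypothetical helper name, and the check covers only the `MINERU_TOKEN` environment variable, so a token saved through the interactive `auth` setup would not be seen by it.

```shell
# Hypothetical helper (not part of mineru-open-api): fail early when
# MINERU_TOKEN is unset or empty.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export it first" >&2
    return 1
  fi
}
```

It could be chained in front of the extraction command, for example: require_mineru_token && mineru-open-api extract page.html --ocr -o ./out/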

Capabilities

Supported input: local .html file
OCR requires extract with token — not available in flash-extract
Use --ocr flag to enable OCR on image-embedded content in HTML
Use --model vlm for complex or mixed-content pages
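
The flags above can be combined in a small batch wrapper. This is a minimal sketch under stated assumptions: `ocr_html_dir` and the `MINERU_BIN` override are hypothetical names introduced here, and each page gets its own output subdirectory because this document does not say how the CLI names files inside the `-o` directory.

```shell
# Hypothetical batch helper: run OCR extraction on every .html file in a directory.
# Usage: ocr_html_dir <src_dir> <out_dir> [model]
# MINERU_BIN (an assumption, not a CLI feature) overrides the binary to call.
ocr_html_dir() {
  local src_dir="$1"
  local out_dir="$2"
  local model="${3:-}"
  local f target
  for f in "$src_dir"/*.html; do
    [ -e "$f" ] || continue                      # glob matched nothing
    target="$out_dir/$(basename "$f" .html)"     # one output directory per page
    mkdir -p "$target" || return 1
    if [ -n "$model" ]; then
      "${MINERU_BIN:-mineru-open-api}" extract "$f" --ocr --model "$model" -o "$target" || return 1
    else
      "${MINERU_BIN:-mineru-open-api}" extract "$f" --ocr -o "$target" || return 1
    fi
  done
}
```

For example, ocr_html_dir ./pages ./out vlm would process every .html file under ./pages with the VLM model and stop at the first failure.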

Notes

HTML is NOT supported by flash-extract; use extract with token
If the HTML has normal text content, OCR is not needed — use html-extract instead
Output goes to stdout by default; use -o <dir> to save to a file or directory
All progress/status messages go to stderr; document content goes to stdout
MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
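
Because document content goes to stdout and progress messages go to stderr, the two streams can be captured separately with ordinary shell redirection. The helper below is a sketch: `ocr_to_file` and the `MINERU_BIN` override are hypothetical names, and only the `extract` command and `--ocr` flag come from the notes above.

```shell
# Hypothetical helper: keep extracted content and progress messages apart.
# Usage: ocr_to_file <input.html> <content_file> <log_file>
# MINERU_BIN (an assumption, not a CLI feature) overrides the binary to call.
ocr_to_file() {
  local input="$1"
  local output="$2"
  local log="$3"
  # stdout carries the document content, stderr the progress/status messages.
  "${MINERU_BIN:-mineru-open-api}" extract "$input" --ocr >"$output" 2>"$log"
}
```

For example, ocr_to_file page.html page.out page.log leaves the extracted text in page.out and the status messages in page.log.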

Security recommendations

This skill appears to do what it says: it runs the MinerU CLI and requires a MINERU_TOKEN. Before installing, consider: (1) MINERU_TOKEN is sensitive — only provide it if you trust MinerU/mineru-open-api and the organization. (2) OCRing local HTML/images will likely upload those files to MinerU servers — avoid sending sensitive documents. (3) Global npm packages and go installs execute third-party code at install time; review the mineru-open-api package source or GitHub repo (https://github.com/opendatalab/MinerU) if you need higher assurance. (4) If you decide to proceed, run the CLI in an isolated environment/container and grant the token minimal scope; revoke the token if you suspect misuse.

Functional analysis

Type: OpenClaw Skill
Name: html-ocr
Version: 0.4.0

The skill bundle provides instructions and metadata for using the MinerU OCR tool (developed by OpenDataLab) to process HTML files. It correctly identifies dependencies on the legitimate 'mineru-open-api' package and requires a standard API token (MINERU_TOKEN) for its cloud-based extraction features. No malicious code, data exfiltration, or prompt injection attempts were found in SKILL.md or _meta.json.

Capability assessment

✓ Purpose & Capability

The skill is an instruction-only wrapper around the mineru-open-api CLI and declares the exact binary and MINERU_TOKEN credential it needs. The declared dependencies (mineru-open-api via npm or Go) and the MINERU_TOKEN credential align with the stated HTML OCR purpose.

ℹ Instruction Scope

SKILL.md instructs the agent to run mineru-open-api on local HTML files and to authenticate with MINERU_TOKEN or interactive auth. This stays within the OCR purpose. Note: running the CLI will upload page content/images to MinerU's service (expected for a remote OCR API), so private/sensitive content may be transmitted externally.

✓ Install Mechanism

Install options are npm (mineru-open-api) or go install from a GitHub path. These are expected for a CLI tool and are not downloads from arbitrary URLs. Installing a global npm package or running go install executes third-party code at install time — standard but worth auditing if you don't trust the publisher.

✓ Credentials

The only required environment variable is MINERU_TOKEN (declared as primaryEnv). That is proportionate for an API-backed OCR CLI. No unrelated credentials or extra config paths are requested.

✓ Persistence & Privilege

The skill is user-invocable, not always-on, and does not request special system persistence or cross-skill configuration. It uses normal CLI invocation and environment token-based auth.

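How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install html-ocr
Once installed, invoke the Skill by name or trigger it with /html-ocr
Provide the required input according to the Skill's parameter description to get structured output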

Version history

v0.4.0

SEO: expand description for better ClawHub vector search discovery

v0.3.0

Rollback to original version

v0.2.0

SEO optimization v0.2.0

v1.0.1

Fix: declare MINERU_TOKEN credential in metadata

v1.0.0

HTML OCR - use OCR to extract text from HTML files that contain scanned images or image-embedded con

Metadata

Slug html-ocr

Version 0.4.0

License MIT-0

Total installs 0

Current installs 0

Version count 5

FAQ

What is HTML OCR?

OCR for HTML pages containing image-embedded or scanned content. Uses MinerU to extract text from images within HTML files and web pages. Features: OCR extra... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 173 total downloads to date.

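How do I install HTML OCR?

Run the command "/install html-ocr" in the OpenClaw or Claude Code chat box to install the skill in one step. Using it still requires the mineru-open-api CLI and a MINERU_TOKEN, as described under Install and Authentication.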

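Is HTML OCR free?

Yes. HTML OCR is free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does HTML OCR support?

HTML OCR is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops HTML OCR?

It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.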
