
HTML Analysis

Author: mzlzyCA · GitHub · v0.4.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 176 · Favorites: 0 · Current installs: 1 · Versions: 5
Install in OpenClaw
/install html-analysis
Description
Analyze the structure and content of HTML documents using MinerU. Returns structured Markdown with layout information, headings, and content hierarchy preser...
Usage (SKILL.md)

HTML Analysis

Analyze and extract structured content from local HTML files using MinerU. Preserves document structure as Markdown. For live web page URLs, use mineru-open-api crawl.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Analyze a local HTML file (requires token)
mineru-open-api extract page.html -o ./out/

# Analyze a remote HTML file by URL (requires token)
mineru-open-api extract https://example.com/page.html -o ./out/

# Crawl a live web page (requires token)
mineru-open-api crawl https://example.com/article -o ./out/
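
To process several local files in one pass, the extract command above can be wrapped in a loop. The following is a minimal sketch, not part of the skill: the function name extract_all_html and the one-subdirectory-per-file layout are illustrative assumptions, and it presumes a token is already configured.

```shell
# Hypothetical helper: run `mineru-open-api extract` on every .html file in a directory.
# Assumes a MinerU token is already configured (see Authentication).
extract_all_html() {
    src_dir=$1
    out_dir=$2
    for f in "$src_dir"/*.html; do
        # Skip the unexpanded pattern when the directory has no .html files.
        [ -e "$f" ] || continue
        name=$(basename "$f" .html)
        mkdir -p "$out_dir/$name"
        # Report a failure on stderr but keep going with the remaining files.
        mineru-open-api extract "$f" -o "$out_dir/$name/" || echo "failed: $f" >&2
    done
}
```

For example, `extract_all_html ./pages ./out` would write the result for ./pages/report.html under ./out/report/.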

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
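
Since every command in this skill needs a token, a script may want to fail early when none is exported. The following is a minimal sketch; the function name require_mineru_token is an illustrative assumption. It only inspects the MINERU_TOKEN environment variable, so a token stored through the interactive `mineru-open-api auth` setup would not be detected by it.

```shell
# Hypothetical pre-flight check: fail early when MINERU_TOKEN is not exported.
# This does not see tokens saved by the interactive `mineru-open-api auth` setup.
require_mineru_token() {
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; create one at https://mineru.net/apiManage/token" >&2
        return 1
    fi
}
```

A typical use would be `require_mineru_token && mineru-open-api extract page.html -o ./out/`.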

Capabilities

  • Supported input: local .html file or remote HTML URL
  • HTML input requires extract (token required) — not supported by flash-extract
  • For live web pages (rendered JS content), use mineru-open-api crawl
  • Language hint with --language (default: ch, use en for English)

Notes

  • HTML is NOT supported by flash-extract — use extract with token
  • For web page crawling, use mineru-open-api crawl <URL> instead of extract
  • Output goes to stdout by default; use -o <dir> to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
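
Because document content goes to stdout and progress messages go to stderr, the two streams can be redirected separately when -o is omitted. The following is a minimal sketch; the function name extract_to_file and the file names in the usage line are illustrative assumptions.

```shell
# Hypothetical wrapper: keep document content and progress messages apart.
# Without -o, content is written to stdout and status messages to stderr.
extract_to_file() {
    input=$1
    output_md=$2
    log_file=$3
    mineru-open-api extract "$input" >"$output_md" 2>"$log_file"
}
```

For example, `extract_to_file page.html page.md run.log` would leave the Markdown in page.md and the status messages in run.log. The function returns the exit status of the CLI call.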
Security recommendations
This skill delegates HTML analysis to the MinerU CLI and requires a MinerU API token. Before installing: verify the npm package and the GitHub repo are the legitimate MinerU project, confirm the token you create has minimal scope and no unnecessary permissions, and avoid sending locally stored files that contain secrets to the remote service unless you trust MinerU's handling and storage policies. If you need fully offline analysis, prefer a local-only tool; otherwise ensure your environment policy permits the CLI to make outgoing network requests to MinerU endpoints.
Functional analysis
Type: OpenClaw Skill
Name: html-analysis
Version: 0.4.0
The skill provides an interface for the MinerU document intelligence engine (OpenDataLab) to analyze HTML structures. It uses the 'mineru-open-api' CLI tool and requires a 'MINERU_TOKEN' to interact with the mineru.net API. The instructions in SKILL.md are consistent with the stated purpose of document analysis and crawling, and no evidence of malicious intent, data exfiltration beyond the intended API usage, or prompt injection was found.
Capability assessment
Purpose & Capability
Name/description, required binary (mineru-open-api), and required env var (MINERU_TOKEN) all align: the skill is explicitly a MinerU-backed HTML analyzer and only asks for the CLI and its token.
Instruction Scope
SKILL.md only instructs using the mineru-open-api CLI on local HTML files or URLs, how to authenticate, and where outputs go. It does not ask to read unrelated system files or exfiltrate data to unexpected endpoints.
Install Mechanism
Install options are npm (mineru-open-api) and go install from a GitHub repo—both are standard distribution methods for CLI tools and appropriate for this purpose (no arbitrary download URLs or extract-from-unknown-hosts).
Credentials
Only a single token (MINERU_TOKEN) is required and declared as the primary credential; this is expected for an API-backed CLI and is proportional to the described functionality.
Persistence & Privilege
No 'always: true'; default autonomous invocation is allowed (normal). The skill does not request system-wide config changes or access to other skills' credentials.
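How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install html-analysis
  3. Once installation finishes, call the Skill by name or trigger it with /html-analysis.
  4. Provide the inputs described in the Skill's parameter notes to get structured output.
Version history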
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization v0.2.0
v1.0.1
Fix: declare MINERU_TOKEN credential in metadata
v1.0.0
Analyze and extract structured content from HTML files using MinerU mineru-open-api
Metadata
Slug: html-analysis
Version: 0.4.0
License: MIT-0
Total installs: 1
Current installs: 1
Versions: 5
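FAQ

What is HTML Analysis?

Analyze the structure and content of HTML documents using MinerU. Returns structured Markdown with layout information, headings, and content hierarchy preser... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 176 total downloads to date.

How do I install HTML Analysis?

Run the command "/install html-analysis" in the OpenClaw or Claude Code chat box for one-step installation. The skill itself needs no extra configuration, though running it requires the mineru-open-api CLI and a MinerU token (see Install and Authentication).

Is HTML Analysis free?

Yes. HTML Analysis is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does HTML Analysis support?

HTML Analysis is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops HTML Analysis?

It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.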
