HTML Analysis

by mzlzyCA · GitHub ↗ · v0.4.0 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 176 · Stars: 0 · Active Installs: 1 · Versions: 5
Install in OpenClaw
/install html-analysis
Description
Analyze the structure and content of HTML documents using MinerU. Returns structured Markdown with layout information, headings, and content hierarchy preser...
README (SKILL.md)

HTML Analysis

Analyze and extract structured content from local HTML files using MinerU. Preserves document structure as Markdown. For live web page URLs, use mineru-open-api crawl.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Analyze a local HTML file (requires token)
mineru-open-api extract page.html -o ./out/

# Analyze a remote HTML file by URL (requires token)
mineru-open-api extract https://example.com/page.html -o ./out/

# Crawl a live web page (requires token)
mineru-open-api crawl https://example.com/article -o ./out/

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
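
Because every command above needs a token, a script can check for one before calling the CLI. The following is a minimal sketch, not part of mineru-open-api: the function name require_mineru_token is hypothetical, and it only covers the environment-variable route. A token saved through the interactive mineru-open-api auth setup would not be detected by this check.

```shell
# Hypothetical helper (not part of mineru-open-api): succeed only when
# MINERU_TOKEN is set to a non-empty value in the current environment.
require_mineru_token() {
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; export it or run: mineru-open-api auth" >&2
        return 1
    fi
}

# Example use: only call the CLI when the token check passes.
# require_mineru_token && mineru-open-api extract page.html -o ./out/
```

The warning goes to stderr so that it does not mix with any document content a later command writes to stdout.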

Capabilities

  • Supported input: local .html file or remote HTML URL
  • HTML input requires extract (token required) — not supported by flash-extract
  • For live web pages (rendered JS content), use mineru-open-api crawl
  • Language hint with --language (default: ch, use en for English)
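
The split between extract and crawl described above could be scripted roughly as follows. This is a sketch under a stated assumption: the function pick_subcommand is hypothetical, and it guesses that a URL ending in .html or .htm is a static HTML file while any other http(s) URL is a live page. The CLI itself does not define that rule, so adjust the patterns to your own inputs.

```shell
# Hypothetical helper: print the mineru-open-api subcommand for an input.
# Assumption: a URL ending in .html or .htm is a static HTML file; any
# other http(s) URL is treated as a live page that needs crawl.
pick_subcommand() {
    case "$1" in
        http://*.html|https://*.html|http://*.htm|https://*.htm)
            echo "extract" ;;
        http://*|https://*)
            echo "crawl" ;;
        *)
            # Anything that is not a URL is assumed to be a local file.
            echo "extract" ;;
    esac
}

# Example use (needs the CLI and a token):
# src="https://example.com/article"
# mineru-open-api "$(pick_subcommand "$src")" "$src" -o ./out/
```

A URL with a query string after .html falls through to crawl under these patterns, which may or may not be what you want.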

Notes

  • HTML is NOT supported by flash-extract — use extract with token
  • For web page crawling, use mineru-open-api crawl <URL> instead of extract
  • Output goes to stdout by default; use -o <dir> to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
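
Since document content goes to stdout and progress messages go to stderr, the two streams can be redirected separately when -o is omitted. A minimal sketch, assuming that behaviour holds for extract: the function name extract_to_markdown and the MINERU_BIN override variable are hypothetical and exist only to make the example easy to adapt.

```shell
# Hypothetical helper: run extract without -o so the Markdown arrives on
# stdout, and keep progress/status messages (stderr) in a separate log.
#   $1 = input HTML file or URL, $2 = Markdown output file, $3 = log file
# MINERU_BIN is an illustrative override; it defaults to mineru-open-api.
extract_to_markdown() {
    "${MINERU_BIN:-mineru-open-api}" extract "$1" >"$2" 2>"$3"
}

# Example use (needs the CLI and a token):
# extract_to_markdown page.html page.md extract.log
```

Keeping the log separate means the Markdown file contains only document content, which matters if another tool reads it directly.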
Usage Guidance
This skill delegates HTML analysis to the MinerU CLI and requires a MinerU API token. Before installing: verify the npm package and the GitHub repo are the legitimate MinerU project, confirm the token you create has minimal scope and no unnecessary permissions, and avoid sending locally stored files that contain secrets to the remote service unless you trust MinerU's handling and storage policies. If you need fully offline analysis, prefer a local-only tool; otherwise ensure your environment policy permits the CLI to make outgoing network requests to MinerU endpoints.
Capability Analysis
Type: OpenClaw Skill
Name: html-analysis
Version: 0.4.0
The skill provides an interface for the MinerU document intelligence engine (OpenDataLab) to analyze HTML structures. It utilizes the 'mineru-open-api' CLI tool and requires a 'MINERU_TOKEN' to interact with the mineru.net API. The instructions in SKILL.md are consistent with the stated purpose of document analysis and crawling, and no evidence of malicious intent, data exfiltration beyond the intended API usage, or prompt injection was found.
Capability Assessment
Purpose & Capability
Name/description, required binary (mineru-open-api), and required env var (MINERU_TOKEN) all align: the skill is explicitly a MinerU-backed HTML analyzer and only asks for the CLI and its token.
Instruction Scope
SKILL.md only instructs using the mineru-open-api CLI on local HTML files or URLs, how to authenticate, and where outputs go. It does not ask to read unrelated system files or exfiltrate data to unexpected endpoints.
Install Mechanism
Install options are npm (mineru-open-api) and go install from a GitHub repo—both are standard distribution methods for CLI tools and appropriate for this purpose (no arbitrary download URLs or extract-from-unknown-hosts).
Credentials
Only a single token (MINERU_TOKEN) is required and declared as the primary credential; this is expected for an API-backed CLI and is proportional to the described functionality.
Persistence & Privilege
The skill does not set 'always: true'; default autonomous invocation is allowed, which is normal. It does not request system-wide config changes or access to other skills' credentials.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install html-analysis
  3. After installation, invoke the skill by name or use /html-analysis
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization v0.2.0
v1.0.1
Fix: declare MINERU_TOKEN credential in metadata
v1.0.0
Analyze and extract structured content from HTML files using MinerU mineru-open-api
Metadata
Slug html-analysis
Version 0.4.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 5
Frequently Asked Questions

What is HTML Analysis?

Analyze the structure and content of HTML documents using MinerU. Returns structured Markdown with layout information, headings, and content hierarchy preser... It is an AI Agent Skill for Claude Code / OpenClaw, with 176 downloads so far.

How do I install HTML Analysis?

Run "/install html-analysis" in the OpenClaw or Claude Code chat to install the skill in one step. To run it, you also need the mineru-open-api CLI and a MinerU token (see Install and Authentication in the README).

Is HTML Analysis free?

Yes, HTML Analysis is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does HTML Analysis support?

HTML Analysis is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created HTML Analysis?

It is built and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.
