Total downloads: 167
Favorites: 0
Current installs: 0
Versions: 5
Install in OpenClaw
/install html-to-html
Description
Clean and restructure HTML documents using MinerU. Takes messy or complex HTML and produces clean, well-formatted HTML output with proper structure preserved...
Usage (SKILL.md)
HTML to HTML
Fetch a remote web page or local HTML file and convert it to clean structured HTML using MinerU. Strips noise and preserves semantic content.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# Crawl a web page and output clean HTML (requires token)
mineru-open-api crawl https://example.com/article -f html -o ./out/
# Re-extract a local HTML file to clean HTML (requires token)
mineru-open-api extract page.html -f html -o ./out/
# Batch crawl multiple URLs to HTML (requires token)
mineru-open-api crawl url1 url2 -f html -o ./pages/
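For a longer list of pages, the batch form above can be driven from a file. The following is a minimal sketch, assuming a plain-text file with one URL per line. The function name crawl_url_file and the MINERU_BIN override are hypothetical conveniences, not part of mineru-open-api, and the sketch uses only the crawl, -f html, and -o arguments shown above. Whether the CLI limits the number of URLs per call is not stated here.

```shell
# Assumed override hook: point MINERU_BIN at "echo" for a dry run.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

# Hypothetical helper: crawl every URL listed in a file (one per line).
# Usage: crawl_url_file urls.txt ./pages/
crawl_url_file() {
    url_file="$1"
    out_dir="$2"
    # Reset the function's positional parameters, then collect the URLs in them.
    set --
    while IFS= read -r url || [ -n "$url" ]; do
        case "$url" in
            ''|'#'*) continue ;;   # skip blank lines and comment lines
        esac
        set -- "$@" "$url"
    done < "$url_file"
    if [ "$#" -eq 0 ]; then
        echo "no URLs found in $url_file" >&2
        return 1
    fi
    # Same shape as the batch example: crawl url1 url2 -f html -o <dir>
    "$MINERU_BIN" crawl "$@" -f html -o "$out_dir"
}
```

Running it with MINERU_BIN=echo prints the command that would be executed, which is a cheap way to check the URL list before spending API calls.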
Authentication
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
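When the token is supplied through the environment, a script can fail early instead of waiting for the CLI to reject the request. This is a small sketch under that assumption; require_mineru_token is a hypothetical name, and the check says nothing about a token stored by the interactive auth command.

```shell
# Hypothetical guard: succeed only when MINERU_TOKEN is set and non-empty.
# It covers the environment-variable route only, not "mineru-open-api auth".
require_mineru_token() {
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; export it or run 'mineru-open-api auth'" >&2
        return 1
    fi
}
```

A typical call site would be `require_mineru_token || exit 1` near the top of a script, before any crawl or extract command.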
Capabilities
- Input: remote web page URL or local .html file
- Output: clean structured HTML (-f html)
- For remote URLs: use crawl -f html
- For local HTML files: use extract -f html
- Requires token; not available in flash-extract
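The URL-versus-file rule in the list above can be captured in a small dispatcher. This is a sketch, not part of the CLI: mineru_subcommand, clean_html, and the MINERU_BIN override are hypothetical names, and it assumes that anything not starting with http:// or https:// is a local file path.

```shell
# Assumed override hook: point MINERU_BIN at "echo" for a dry run.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

# Print "crawl" for remote URLs and "extract" for everything else.
mineru_subcommand() {
    case "$1" in
        http://*|https://*) echo "crawl" ;;
        *) echo "extract" ;;
    esac
}

# Hypothetical wrapper: clean one input (URL or local .html file) into out_dir.
# Usage: clean_html <url-or-file> <out_dir>
clean_html() {
    "$MINERU_BIN" "$(mineru_subcommand "$1")" "$1" -f html -o "$2"
}
```

For example, `clean_html https://example.com/article ./out/` would run the crawl subcommand, while `clean_html page.html ./out/` would run extract.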
Notes
- HTML output (-f html) requires token; not available in flash-extract
- crawl supports output formats: md, html, json
- extract supports output formats: md, html, latex, docx, json
- Output goes to stdout by default; use -o <dir> to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
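The per-subcommand format lists in these notes can be checked before a command is run. The following sketch encodes exactly the formats listed above; format_supported is a hypothetical helper, and the lists may change in later CLI versions.

```shell
# Hypothetical check: succeed if <subcommand> supports <format>,
# according to the format lists in the notes above.
# Usage: format_supported <subcommand> <format>
format_supported() {
    case "$1:$2" in
        crawl:md|crawl:html|crawl:json) return 0 ;;
        extract:md|extract:html|extract:latex|extract:docx|extract:json) return 0 ;;
        *) return 1 ;;
    esac
}
```

Because document content goes to stdout and status messages go to stderr, the two streams can also be separated with ordinary redirection when -o is omitted, for instance `mineru-open-api extract page.html -f html > clean.html 2> progress.log`.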
Security recommendations
This skill appears coherent: it runs the mineru-open-api CLI and needs a MINERU_TOKEN from mineru.net. Before installing, verify the npm package and GitHub repo are legitimate (check publisher, recent commits, and npm download counts). Treat MINERU_TOKEN like any API credential: only provide a token with the minimal needed scopes, avoid using it with highly sensitive local HTML unless you accept sending content to the MinerU service, and rotate/delete the token if you stop using the skill.
Functional analysis
Type: OpenClaw Skill
Name: html-to-html
Version: 0.4.0
The html-to-html skill is a legitimate wrapper for the MinerU document intelligence engine (by OpenDataLab). It facilitates cleaning and restructuring HTML via the 'mineru-open-api' CLI tool. The SKILL.md file contains standard installation instructions (npm/go) and usage examples for crawling URLs or extracting local files. It requires a MINERU_TOKEN for authentication but shows no signs of data exfiltration, malicious execution, or prompt injection.
Capability assessment
Purpose & Capability
Name/description (HTML cleanup via MinerU) align with required binary (mineru-open-api) and required env var (MINERU_TOKEN). The primary credential and declared binaries are exactly what the CLI needs to function.
Instruction Scope
SKILL.md only instructs the agent to run mineru-open-api commands against remote URLs or local HTML files, use the auth flow, and write output to stdout or files. It does not ask the agent to read unrelated system files, other credentials, or post data to unexpected endpoints beyond MinerU's API.
Install Mechanism
Installation options are standard package installs (npm package and Go install from a GitHub repo). These are expected for a CLI; no arbitrary download URLs, extract steps, or personal servers are used.
Credentials
Only MINERU_TOKEN is required and declared as the primary credential, which is proportionate for a hosted extraction/processing service. No unrelated secrets or config paths are requested.
Persistence & Privilege
Skill is not forced-always; it is user-invocable and does not request elevated persistent presence or modifications to other skills or system-wide configs.
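How to use
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install html-to-html
- Once installed, invoke the Skill by name or trigger it with /html-to-html
- Provide the inputs described in the Skill's parameter notes to get structured output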
Version history
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization v0.2.0
v1.0.1
Fix: declare MINERU_TOKEN credential in metadata
v1.0.0
HTML to HTML - fetch a remote HTML page (URL) and convert it to clean structured HTML using MinerU c
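FAQ
What is HTML to HTML?
Clean and restructure HTML documents using MinerU. Takes messy or complex HTML and produces clean, well-formatted HTML output with proper structure preserved... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 167 total downloads to date.
How do I install HTML to HTML?
Run the command "/install html-to-html" in the OpenClaw or Claude Code chat box for a one-step install. The install itself needs no extra configuration; running the skill still requires the mineru-open-api CLI and a MINERU_TOKEN.
Is HTML to HTML free?
Yes. HTML to HTML is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does HTML to HTML support?
HTML to HTML is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops HTML to HTML?
It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.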