Total downloads: 159
Favorites: 0
Current installs: 0
Versions: 5
Install in OpenClaw
/install html-to-text
Description
Convert HTML to plain readable text using MinerU. Strips HTML markup and extracts clean text content from web pages and HTML files. Features: HTML to text co...
Usage (SKILL.md)
HTML to Text
Extract plain readable text from HTML files or web pages using MinerU. MinerU outputs Markdown as the closest format to plain text.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# Extract text from a local HTML file (requires token)
mineru-open-api extract page.html -o ./out/
# Extract text from a web page (requires token)
mineru-open-api crawl https://example.com/article
# JSON output contains text fields (requires token)
mineru-open-api extract page.html -f json -o ./out/
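When the commands above are driven from a script rather than typed by hand, it can help to build the argument list in one place. The following is a minimal sketch, not part of the skill: the helper names `build_extract_argv` and `build_crawl_argv` are hypothetical, and only the subcommands and flags shown in Quick Start (`extract`, `crawl`, `-f`, `-o`) are used.

```python
from typing import List, Optional


def build_extract_argv(
    html_path: str,
    out_dir: Optional[str] = None,
    fmt: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `mineru-open-api extract` (hypothetical helper)."""
    argv = ["mineru-open-api", "extract", html_path]
    if fmt is not None:
        # -f selects the output format, e.g. "json" as in Quick Start
        argv += ["-f", fmt]
    if out_dir is not None:
        # -o saves the result instead of printing it to stdout
        argv += ["-o", out_dir]
    return argv


def build_crawl_argv(url: str) -> List[str]:
    """Build the argument list for `mineru-open-api crawl` (hypothetical helper)."""
    return ["mineru-open-api", "crawl", url]
```

Passing the resulting list to `subprocess.run` avoids shell quoting problems with file names or URLs that contain spaces or special characters.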
Authentication
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
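A script that relies on the environment-variable route can fail early when the token is missing. This is a minimal sketch under that assumption; the function name `get_mineru_token` is hypothetical, and the check covers only `MINERU_TOKEN`, not a token stored by the interactive `mineru-open-api auth` setup.

```python
import os
from typing import Mapping, Optional


def get_mineru_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the token from MINERU_TOKEN, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("MINERU_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "MINERU_TOKEN is not set; export it or run `mineru-open-api auth` first"
        )
    return token
```

Accepting the environment as a parameter keeps the check easy to test without modifying the real process environment.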
Capabilities
- Supported input: local .html file or web page URL
- HTML requires extract or crawl (token required); it is not supported by flash-extract
- MinerU does not have a -f text option; Markdown is the closest plain-text output
- For truly plain text: use extract -f json and read the text fields from the JSON output
- Language hint with --language (default: ch, use en for English)
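Reading the text fields from the JSON output could look roughly like the sketch below. The JSON schema is not documented here, so the sketch assumes only that text lives in string values under a key named "text" somewhere in the structure; the sample document and the helper name `collect_text_fields` are hypothetical, and the key name should be adjusted to whatever the real output uses.

```python
import json
from typing import Any, List


def collect_text_fields(node: Any, key: str = "text") -> List[str]:
    """Recursively collect string values stored under `key`, in document order."""
    found: List[str] = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, str):
                found.append(v)
            else:
                found.extend(collect_text_fields(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_text_fields(item, key))
    return found


# Hypothetical sample; the real MinerU JSON layout may differ.
sample = json.loads(
    '{"blocks": [{"type": "title", "text": "Hello"},'
    ' {"type": "para", "text": "World", "children": []}]}'
)
plain = "\n".join(collect_text_fields(sample))
```

Walking the structure recursively means the sketch does not depend on how deeply the text fields are nested.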
Notes
- MinerU has no -f text format; use Markdown output or -f json for text fields
- HTML is NOT supported by flash-extract
- Output goes to stdout by default; use -o <dir> to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
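Because content goes to stdout and progress messages go to stderr, a calling script can capture the two streams separately. The sketch below illustrates the pattern with Python's `subprocess` module; the helper name `run_and_split` is hypothetical, and a stand-in Python command replaces `mineru-open-api` so the example runs without the CLI or a token.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_split(argv: List[str]) -> Tuple[str, str]:
    """Run a command and return (stdout, stderr) as separate strings."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout, result.stderr


# Stand-in for the real CLI: writes "content" to stdout and "progress" to stderr.
demo = [
    sys.executable,
    "-c",
    "import sys; print('content'); print('progress', file=sys.stderr)",
]
content, status = run_and_split(demo)
```

With the real CLI, `argv` would be something like `["mineru-open-api", "extract", "page.html"]`, and `content` would then hold the extracted document while `status` holds the progress log.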
Security recommendations
This skill is coherent: it simply wraps the MinerU CLI and needs a MinerU API token. Before installing, verify the source of the mineru-open-api package (the npm package name and the GitHub repo) to ensure it is the official project, obtain your MINERU_TOKEN only from the official mineru.net site, and avoid pasting that token into untrusted places. Installing globally (-g) adds a system-wide binary; use a container or other isolated environment if you prefer isolation. If you plan to allow the agent to call this skill autonomously, be aware that it can run the mineru-open-api commands whenever invoked, so make sure you trust the agent and the token's permissions.
Functional analysis
Type: OpenClaw Skill
Name: html-to-text
Version: 0.4.0
The skill provides instructions and metadata for using the MinerU API (developed by OpenDataLab) to convert HTML content into plain text or Markdown. It utilizes the 'mineru-open-api' CLI tool and requires a 'MINERU_TOKEN' for authentication. All documented behaviors in SKILL.md and _meta.json are consistent with the stated purpose of document processing, and no malicious patterns, unauthorized data access, or harmful prompt injections were identified.
Capability assessment
Purpose & Capability
The skill is an instruction-only wrapper to run the mineru-open-api CLI to extract text from HTML/URLs. Requiring the mineru-open-api binary and MINERU_TOKEN is consistent with that purpose.
Instruction Scope
SKILL.md only instructs using mineru-open-api commands (extract, crawl, auth), creating/setting MINERU_TOKEN, and saving outputs. It does not ask the agent to read unrelated files, other env vars, or exfiltrate data to unexpected endpoints.
Install Mechanism
Installers are standard package flows: npm package and go install from a GitHub repo. No arbitrary downloads, no URL shorteners or unknown extract steps are used.
Credentials
Only MINERU_TOKEN is required and declared as the primary credential. That single token is proportional to a CLI that authenticates to MinerU's API.
Persistence & Privilege
always is false and the skill does not request system-wide changes or other skills' config. Autonomous invocation is allowed (platform default) but not excessive for this integration.
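How to use
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install html-to-text
- Once installation finishes, call the Skill by name or trigger it with /html-to-text
- Provide the required inputs according to the Skill's parameter description to get structured output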
Version history
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization v0.2.0
v1.0.1
Minor update
v1.0.0
Initial release
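FAQ
What is HTML to Text?
Convert HTML to plain readable text using MinerU. Strips HTML markup and extracts clean text content from web pages and HTML files. Features: HTML to text co... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 159 total downloads so far.
How do I install HTML to Text?
Run the command "/install html-to-text" in the OpenClaw or Claude Code chat box to install it in one step. The skill itself needs no further configuration, but using it requires the mineru-open-api CLI and a MinerU token, as described in SKILL.md.
Is HTML to Text free?
Yes. HTML to Text is free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does HTML to Text support?
HTML to Text is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed HTML to Text?
It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.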