Total downloads: 186
Favorites: 0
Current installs: 0
Versions: 6
Install in OpenClaw
/install doc-to-text
Description
Extract plain readable text from Word documents (.doc, .docx) using MinerU. Outputs Markdown (the closest plain-text format supported) for easy reading and p...
Usage (SKILL.md)
Doc To Text
Extract plain readable text from Word (.doc/.docx) documents using MinerU. MinerU outputs Markdown, which is the closest format to plain text it supports.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# Extract text from .docx to stdout (no token required)
mineru-open-api flash-extract report.docx
# Save to file
mineru-open-api flash-extract report.docx -o ./out/
# Extract .doc (requires token)
mineru-open-api extract report.doc -o ./out/
# JSON output contains plain text fields (requires token)
mineru-open-api extract report.docx -f json -o ./out/
Authentication
No token needed for flash-extract on .docx. Token required for .doc and extract:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
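The token rule above can be sketched as a small shell helper. This is a minimal, hypothetical dry-run function (pick_mineru_cmd is not part of MinerU): it only prints the command it would run, using the subcommands and the -o flag shown in Quick Start. It assumes the token is supplied through MINERU_TOKEN; a token stored by the interactive auth command would not be detected by this check.

```shell
#!/bin/sh
# Hypothetical helper (not part of mineru-open-api): print the command
# that matches the file type. Dry-run only; nothing is executed.
pick_mineru_cmd() {
  file=$1
  outdir=${2:-./out/}   # default output directory, as in Quick Start
  case "$file" in
    *.docx)
      # .docx works with flash-extract and needs no token
      echo "mineru-open-api flash-extract $file -o $outdir"
      ;;
    *.doc)
      # .doc requires extract, which needs a token
      if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "error: MINERU_TOKEN is not set (run: mineru-open-api auth)" >&2
        return 1
      fi
      echo "mineru-open-api extract $file -o $outdir"
      ;;
    *)
      echo "error: unsupported file type: $file" >&2
      return 2
      ;;
  esac
}
```

For example, pick_mineru_cmd report.docx prints the flash-extract command, while pick_mineru_cmd report.doc fails with exit status 1 unless MINERU_TOKEN is set. The match is case-sensitive, so an extension such as .DOCX falls through to the unsupported branch.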
Capabilities
- Supported input: .doc, .docx (local file or URL)
- .docx: supports flash-extract (no token, Markdown output to stdout)
- .doc: requires extract with token
- For truly plain text: use extract -f json and read the text fields from the JSON output
- Language hint with --language (default: ch, use en for English)
Notes
- MinerU does not have a -f text option; Markdown is the closest to plain text
- .doc requires extract with token; .docx works with flash-extract
- Output goes to stdout by default; use -o <dir> to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
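Because content goes to stdout and progress messages go to stderr, the two streams can be captured separately with ordinary shell redirection. The sketch below uses a hypothetical wrapper, run_split_streams, which is not part of MinerU; it works with any command and relies only on the stream behavior described in the notes above.

```shell
#!/bin/sh
# Hypothetical helper: run a command, writing stdout (document content)
# to one file and stderr (progress/status messages) to another.
run_split_streams() {
  content_file=$1
  log_file=$2
  shift 2
  # "$@" is the command to run, with its arguments passed through unchanged
  "$@" >"$content_file" 2>"$log_file"
}

# Usage (assumes mineru-open-api is installed):
# run_split_streams report.md extract.log mineru-open-api flash-extract report.docx
```

The wrapper returns the exit status of the wrapped command, so a failed extraction can still be detected by the caller.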
Security recommendations
This skill appears to do exactly what it claims: call the mineru-open-api CLI to extract text from .doc/.docx files. Before installing, decide whether you trust the MinerU project and the npm/GitHub sources used to install the CLI. MINERU_TOKEN grants the MinerU service permission to process documents — avoid putting a high-privilege secret there, and create/restrict a token with minimal scope if possible. If you are cautious, inspect the npm package or GitHub repo (github.com/opendatalab/MinerU-Ecosystem) prior to installing, run the CLI in a sandbox or container, and revoke the token if you stop using the skill.
Functional analysis
Type: OpenClaw Skill
Name: doc-to-text
Version: 0.4.0
The doc-to-text skill bundle provides instructions for an AI agent to use the legitimate 'mineru-open-api' CLI tool (developed by OpenDataLab/Shanghai AI Lab) for document processing. The SKILL.md and _meta.json files contain standard installation steps via npm or Go and usage examples for extracting text from Word documents, with no evidence of malicious intent, data exfiltration, or prompt injection.
Capability assessment
Purpose & Capability
The name/description (Word -> plain text via MinerU) matches the required binary (mineru-open-api) and the single required environment variable (MINERU_TOKEN). The MINERU_TOKEN is justified for the documented 'extract' operations; no unrelated credentials or binaries are requested.
Instruction Scope
SKILL.md only instructs running the mineru-open-api CLI (flash-extract/extract), setting MINERU_TOKEN or using interactive auth, and points to mineru.net. It does not ask the agent to read unrelated files, other env vars, or transmit data to unexpected endpoints.
Install Mechanism
Install uses standard package registries: npm package 'mineru-open-api' or 'go install' from github.com/opendatalab/... — both are expected for distributing a CLI. This is normal but requires trusting those package sources; no random downloads or archive extraction from untrusted URLs are present.
Credentials
Only MINERU_TOKEN is required and is declared as the primary credential. That aligns with the documented need for a token for 'extract'/.doc processing. No extra or unrelated secrets are requested.
Persistence & Privilege
The skill is not marked always:true, does not request system-wide config changes, and is instruction-only (no bundled code). Installing the CLI is standard behavior and there is no evidence the skill modifies other skills or global agent settings.
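How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install doc-to-text
- Once installed, invoke the Skill by name or trigger it with /doc-to-text
- Provide the inputs required by the Skill's parameter description to get structured output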
Version history
v0.4.0
SEO: expand description for better ClawHub vector search discovery
v0.3.0
Rollback to original version
v0.2.0
SEO optimization: expanded description with rich keywords, trigger phrases, and bilingual content for better ClawHub vector search ranking.
v1.1.0
Update to v1.1.0
v1.0.1
Fix: declare MINERU_TOKEN credential in metadata
v1.0.0
Doc to Text - extract plain readable text from Word (.doc/.docx) documents using MinerU. Output is M
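FAQ
What is Doc To Text?
Extract plain readable text from Word documents (.doc, .docx) using MinerU. Outputs Markdown (the closest plain-text format supported) for easy reading and p... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 186 total downloads to date.
How do I install Doc To Text?
Run the command "/install doc-to-text" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.
Is Doc To Text free?
Yes. Doc To Text is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Doc To Text support?
Doc To Text is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Doc To Text?
It is developed and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.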