bibekyess

DOCX TO HTML CONVERTER

Author: Bibek KC · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 290
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install docx-to-html
Description
Use this skill whenever the user has a DOCX file (.docx) and wants to convert, read, view, extract content from, or process it in any way — including summari...
Usage Instructions (SKILL.md)

DOCX to HTML Converter

This skill provides a straightforward method to convert Microsoft Word (.docx) documents into clean, semantic HTML, making them suitable for various web-based and AI-driven applications.

Compatibility

  • Python 3 (for the conversion wrapper)
  • Node.js with mammoth installed (core conversion engine)

To install Node.js dependencies, run once from the scripts/ directory:

npm install
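
The requirements above can be checked before the first conversion. The following is a minimal sketch, not part of the skill: the function name `check_prerequisites` is hypothetical, and it assumes that `npm install` places mammoth under `scripts/node_modules/mammoth`, which is npm's default layout.

```python
import shutil
import sys
from pathlib import Path


def check_prerequisites(scripts_dir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3 is required")
    # shutil.which searches PATH the same way a shell would
    if shutil.which("node") is None:
        problems.append("node not found on PATH; install Node.js")
    # Assumption: npm's default layout puts dependencies in node_modules/
    if not (Path(scripts_dir) / "node_modules" / "mammoth").is_dir():
        problems.append("mammoth not installed; run `npm install` in scripts/")
    return problems
```

Calling `check_prerequisites("<skill-dir>/scripts")` with the real install path returns an empty list when everything is in place.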

Use Cases

  • Browser-Based Viewing: Convert DOCX documents for display in web browsers without requiring Microsoft Word.
  • AI-Ready Content: Prepare DOCX content for LLMs for tasks like summarization, Q&A, and semantic search.
  • Web Integration: Integrate Word document content into web applications, CMS, or online editors.
  • Data Extraction: Extract structured data (tables, lists, headings) from DOCX files for automated reporting and analysis.
  • Search and Indexing: Enable full-text and vector search by converting DOCX content into easily indexable HTML.
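
As an illustration of the data extraction use case, the sketch below pulls tables out of converted HTML with Python's standard `html.parser`. It is not bundled with the skill; `TableExtractor` and `extract_tables` are hypothetical names, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags such as <p> still belongs to the open cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Each table comes back as a list of rows, so the result can be passed directly to `csv.writer` or a reporting step.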

Workflow

  1. Locate DOCX File: Identify the path to the .docx file to convert.

  2. Run Conversion Script: Execute the Python wrapper from the skill's scripts/ directory:

    python3 <skill-dir>/scripts/convert.py <input_path.docx> <output_path.html>

    Replace <skill-dir> with the actual path where this skill is installed.

  3. Verify Output: Open the generated .html file in a browser and check:

    • Headings (<h1>, <h2>, etc.) appear at the correct hierarchy levels
    • Tables render with the expected rows and columns
    • Lists appear as bullet or numbered items (not plain text)
    • Bold, italic, and inline formatting are preserved
    • Images are visible (embedded as base64 by default)
  4. Process HTML: Use the resulting HTML for further tasks like summarization, indexing, or display.
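
The verification checklist in step 3 can be partly automated as a complement to opening the file in a browser. This is a rough sketch under assumptions: `summarize_html` is a hypothetical helper, it only counts tags, and it cannot tell whether the heading hierarchy or table contents are actually correct.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count start tags and images whose src is an inline data URI."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.embedded_images = 0

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag == "img":
            src = dict(attrs).get("src") or ""
            if src.startswith("data:"):
                self.embedded_images += 1


def summarize_html(html):
    """Return counts for the structures named in the verification checklist."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    t = counter.tags
    return {
        "headings": sum(t[f"h{n}"] for n in range(1, 7)),
        "tables": t["table"],
        "list_items": t["li"],
        "bold_or_italic": t["strong"] + t["em"] + t["b"] + t["i"],
        "embedded_images": counter.embedded_images,
    }
```

A count of zero for a structure the source document is known to contain (for example, no list items for a document full of bullets) is a hint that the conversion needs a closer look.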

Bundled Resources

  • scripts/docx-converter.js: Core Node.js conversion logic using mammoth.js.
  • scripts/convert.py: Python wrapper for invoking the Node.js converter.
  • scripts/package.json: Node.js dependency manifest (includes mammoth).
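
The documented command line for scripts/convert.py can also be driven from another Python program. The sketch below shows one way to do that; `build_command` and `convert` are hypothetical names, and it assumes convert.py reports failure through a non-zero exit status, which this listing does not state.

```python
import subprocess
import sys
from pathlib import Path


def build_command(skill_dir, docx_path, html_path):
    """Build the argv list for the documented convert.py invocation."""
    script = Path(skill_dir) / "scripts" / "convert.py"
    return [sys.executable, str(script), str(docx_path), str(html_path)]


def convert(skill_dir, docx_path, html_path):
    """Run the wrapper; raises CalledProcessError on a non-zero exit status."""
    # A list of arguments (no shell=True) keeps spaces and metacharacters
    # in file names from being interpreted by a shell.
    subprocess.run(build_command(skill_dir, docx_path, html_path), check=True)
    return Path(html_path)
```

Using `sys.executable` runs the wrapper with the same interpreter as the caller, which avoids depending on a `python3` entry on PATH.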

Technical Details

The conversion leverages mammoth.js, which prioritizes semantic meaning over visual replication:

  • Semantic Conversion: Document structure maps to the corresponding HTML elements; headings become <h1>/<h2>, lists become <ul>/<ol>, etc.
  • Basic Styling: Bold, italics, and common paragraph styles are preserved.
  • Image Embedding: Images are extracted and embedded as base64 data URIs in the HTML output.
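
Because images arrive as base64 data URIs, they can be recovered from the HTML as raw bytes when a downstream step needs them as separate files. A minimal sketch, assuming each image appears as a double-quoted `src="data:<mime>;base64,<payload>"` attribute; `extract_embedded_images` is a hypothetical helper, not part of the skill.

```python
import base64
import re

# Assumption: src attributes are double-quoted and carry standard base64
DATA_URI = re.compile(
    r'src="data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<payload>[A-Za-z0-9+/=]+)"'
)


def extract_embedded_images(html):
    """Return a list of (mime_type, raw_bytes) for each base64 image source."""
    return [
        (m.group("mime"), base64.b64decode(m.group("payload")))
        for m in DATA_URI.finditer(html)
    ]
```

Stripping these payloads before sending the HTML to an LLM or an indexer can also shrink the input considerably, since base64 inflates image data by roughly a third.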

Troubleshooting

Problem | Likely Cause | Fix
node: command not found | Node.js not installed | Install Node.js (v16+)
Cannot find module 'mammoth' | npm deps missing | Run npm install in scripts/
Empty or garbled output | Corrupted or password-protected DOCX | Try re-saving the file from Microsoft Word
Missing images | Large embedded images | Check mammoth.js image size limits in docx-converter.js
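
For the "empty or garbled output" case, a quick structural check can narrow down the cause before re-saving the file. The sketch below relies on the fact that a .docx is a ZIP package; it assumes the main document part is stored at `word/document.xml`, which is the usual location but not guaranteed by the format. `diagnose_docx` is a hypothetical helper.

```python
import zipfile


def diagnose_docx(path):
    """Return a short diagnosis string for a candidate .docx file."""
    if not zipfile.is_zipfile(path):
        # Password-protected Office files are typically stored in an OLE
        # container rather than a ZIP, so they land here as well.
        return "not a ZIP archive: corrupted, encrypted, or not a .docx"
    with zipfile.ZipFile(path) as archive:
        if "word/document.xml" not in archive.namelist():
            return "ZIP archive without word/document.xml: not a Word document"
        bad_member = archive.testzip()  # first member failing its CRC, or None
    if bad_member is not None:
        return f"damaged member inside archive: {bad_member}"
    return "ok"
```

A result of "ok" means the container is intact, so the problem more likely lies in the conversion step than in the file itself.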

Limitations

  • Advanced or highly specific styling from the original DOCX may not be perfectly replicated in the HTML output.
  • Features like tracked changes, comments, or complex layout elements may be simplified or omitted.
Security Recommendations
This skill's implementation (Python wrapper + Node/mammoth) is coherent with its stated purpose, but the registry metadata omits required binaries (Node.js and Python) — treat that as a packaging oversight. Before installing or running: (1) inspect package-lock.json for unfamiliar packages (already provided here; mammoth and common deps look normal); (2) run npm install in a sandbox or isolated environment, not on a production host; (3) ensure Node.js (v16+) and Python 3 are available; (4) if you will process sensitive documents, run the conversion in a secure/local environment since the code writes files to disk and npm packages will be downloaded; and (5) if you need a higher assurance, ask the publisher for corrected metadata and a signed release or a packaged install spec.
Functional Analysis
Type: OpenClaw Skill
Name: docx-to-html
Version: 1.0.0

The OpenClaw AgentSkills bundle 'docx-to-html' is classified as benign. The skill's purpose is to convert DOCX files to HTML, which is a legitimate utility function. The `SKILL.md` provides clear instructions for the AI agent to use the skill for DOCX processing without any evidence of prompt injection aiming for malicious objectives. The core conversion logic in `scripts/convert.py` and `scripts/docx-converter.js` uses `subprocess.run` with a list of arguments, effectively mitigating shell injection risks from user-provided file paths. Dependencies listed in `scripts/package.json` and `scripts/package-lock.json` are standard for the `mammoth` library, which is a well-known DOCX parsing tool, and do not indicate any malicious intent or suspicious supply chain issues.
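
To illustrate why a list of arguments mitigates shell injection, the sketch below passes a hostile-looking file name to a child process and reads back what the child received. It is a standalone demonstration, not code from the skill; `echo_argument` is a hypothetical helper.

```python
import subprocess
import sys


def echo_argument(value):
    """Pass `value` to a child process as one argv entry and return what it saw."""
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, value],  # list form, no shell
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With a value such as `"report.docx; echo INJECTED"`, the child receives the whole string as a single argument; the `;` is never seen by a shell, so no second command runs.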
Capability Assessment
Purpose & Capability
The code and SKILL.md match the stated purpose (using mammoth.js to convert .docx to HTML). However the registry metadata declares no required binaries or env vars while the SKILL.md explicitly requires Python 3 and Node.js; that mismatch is unexpected and should be corrected.
Instruction Scope
Runtime instructions are narrowly scoped to locating a .docx file, running the provided convert.py wrapper (which calls the included Node script), and verifying the HTML output. The instructions do not request or reference unrelated system files, credentials, or external endpoints.
Install Mechanism
There is no formal install spec in the registry; instead the SKILL.md instructs running 'npm install' in scripts/. That downloads packages from the public npm registry (package-lock.json is provided). This is a common pattern but increases risk compared with an explicit reviewed install spec; the lockfile points to known packages (mammoth and dependencies) and there are no download-from-arbitrary-URL steps.
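
The lockfile inspection recommended in this review can be partly scripted. The following sketch flags entries whose tarball does not come from the public npm registry. It assumes a lockfile with a top-level `packages` map (lockfileVersion 2 or 3; version 1 lockfiles use `dependencies` instead), and `non_registry_packages` is a hypothetical helper.

```python
import json


def non_registry_packages(lock_text, registry="https://registry.npmjs.org/"):
    """List (path, resolved_url) for lockfile entries fetched from elsewhere."""
    lock = json.loads(lock_text)
    flagged = []
    for path, info in lock.get("packages", {}).items():
        resolved = info.get("resolved")
        # The root entry ("") has no "resolved" field and is skipped
        if resolved and not resolved.startswith(registry):
            flagged.append((path, resolved))
    return flagged
```

An empty result only shows that every package resolves to the public registry; it says nothing about whether those packages are trustworthy, so it supplements a manual review rather than replacing it.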
Credentials
The skill does not request environment variables or credentials and the code does not access secrets or unrelated config paths. All file I/O is limited to the user-supplied input .docx and the specified output .html.
Persistence & Privilege
The skill is not 'always' enabled and does not attempt to modify other skills or global agent settings. It runs on-demand and does not request elevated or persistent privileges.
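How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install docx-to-html
  3. Once installation finishes, invoke the Skill by name or trigger it with /docx-to-html
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.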
Version History
v1.0.0
- Initial release of docx-to-html skill.
- Converts DOCX documents to clean, semantic HTML using mammoth.js.
- Includes both a Node.js converter script and a Python wrapper for easy integration.
- Supports extracting headings, tables, lists, inline formatting, and images from DOCX files.
- Enables use-cases like browser viewing, content extraction for AI, and web integration.
- Includes troubleshooting guidance and outlines current limitations.
Metadata
Slug docx-to-html
Version 1.0.0
License
Total installs 0
Current installs 0
Version count 1
FAQ

What is DOCX TO HTML CONVERTER?

DOCX TO HTML CONVERTER is an AI Agent Skill plugin for Claude Code / OpenClaw that converts DOCX files to HTML. It has 290 total downloads to date.

How do I install DOCX TO HTML CONVERTER?

Run the command "/install docx-to-html" in the OpenClaw or Claude Code chat box. Per the SKILL.md, the skill also needs Python 3, Node.js, and a one-time npm install in its scripts/ directory.

Is DOCX TO HTML CONVERTER free?

Yes. DOCX TO HTML CONVERTER is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does DOCX TO HTML CONVERTER support?

DOCX TO HTML CONVERTER is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed DOCX TO HTML CONVERTER?

It is developed and maintained by Bibek KC (@bibekyess); the current version is v1.0.0.
