/install docx-to-html
DOCX to HTML Converter
This skill provides a straightforward method to convert Microsoft Word (.docx) documents into clean, semantic HTML, making them suitable for various web-based and AI-driven applications.
Compatibility
- Python 3 (for the conversion wrapper)
- Node.js with `mammoth` installed (core conversion engine)
To install Node.js dependencies, run once from the scripts/ directory:
npm install
Use Cases
- Browser-Based Viewing: Convert DOCX documents for display in web browsers without requiring Microsoft Word.
- AI-Ready Content: Prepare DOCX content for LLMs for tasks like summarization, Q&A, and semantic search.
- Web Integration: Integrate Word document content into web applications, CMS, or online editors.
- Data Extraction: Extract structured data (tables, lists, headings) from DOCX files for automated reporting and analysis.
- Search and Indexing: Enable full-text and vector search by converting DOCX content into easily indexable HTML.
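For the Data Extraction use case, one possible approach is to pull tables out of the converted HTML with Python's standard `html.parser`. This is a minimal sketch, not part of the skill: `TableExtractor` and `extract_tables` are hypothetical names, and the code assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html_text):
    """Return every table in html_text as a list of rows (nested tables unsupported)."""
    parser = TableExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.tables
```

Calling `extract_tables(Path("output.html").read_text(encoding="utf-8"))` on a converted file would return one list per table, which can then be written to CSV or passed to a reporting step.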
Workflow
- Locate DOCX File: Identify the path to the `.docx` file to convert.
- Run Conversion Script: Execute the Python wrapper from the skill's `scripts/` directory: `python3 <skill-dir>/scripts/convert.py <input_path.docx> <output_path.html>`. Replace `<skill-dir>` with the actual path where this skill is installed.
- Verify Output: Open the generated `.html` file in a browser and check:
  - Headings (`<h1>`, `<h2>`, etc.) appear at the correct hierarchy levels
  - Tables render with the expected rows and columns
  - Lists appear as bullet or numbered items (not plain text)
  - Bold, italic, and inline formatting are preserved
  - Images are visible (embedded as base64 by default)
- Process HTML: Use the resulting HTML for further tasks like summarization, indexing, or display.
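The conversion step above can also be driven from another Python program. The sketch below is illustrative only: `build_convert_command` and `convert` are hypothetical helpers, the wrapper is assumed to take exactly the two positional arguments shown in the workflow, and it is assumed (not confirmed by the skill) to exit with a non-zero status on failure. It uses the current interpreter (`sys.executable`) in place of a literal `python3`.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(skill_dir, input_path, output_path):
    """Build the argv list for the wrapper call described in the workflow."""
    script = Path(skill_dir) / "scripts" / "convert.py"
    return [sys.executable, str(script), str(input_path), str(output_path)]


def convert(skill_dir, input_path, output_path):
    """Run the wrapper and return the output path if an HTML file was produced."""
    if Path(input_path).suffix.lower() != ".docx":
        raise ValueError(f"expected a .docx file, got: {input_path}")
    cmd = build_convert_command(skill_dir, input_path, output_path)
    # check=True raises CalledProcessError if the wrapper exits non-zero
    # (assumption: the wrapper signals failure through its exit status).
    subprocess.run(cmd, check=True)
    out = Path(output_path)
    if not out.is_file() or out.stat().st_size == 0:
        raise RuntimeError(f"no HTML was written to {out}")
    return out
```

A call such as `convert("/path/to/skill", "report.docx", "report.html")` would then either return the path of the generated file or raise, which makes failures harder to miss in a batch job.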
Bundled Resources
- `scripts/docx-converter.js`: Core Node.js conversion logic using `mammoth.js`.
- `scripts/convert.py`: Python wrapper for invoking the Node.js converter.
- `scripts/package.json`: Node.js dependency manifest (includes `mammoth`).
Technical Details
The conversion leverages mammoth.js, which prioritizes semantic meaning over visual replication:
- Semantic Conversion: Document structure maps to proper HTML: headings become `<h1>`/`<h2>`, lists become `<ul>`/`<ol>`, etc.
- Basic Styling: Bold, italics, and common paragraph styles are preserved.
- Image Embedding: Images are extracted and embedded as base64 data URIs in the HTML output.
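Because the output is semantic HTML, the browser check from the workflow can be complemented by a quick structural count. The following is a rough sketch using only the standard library; `StructureCounter` and `summarize_structure` are hypothetical names, and the choice of `<strong>` and `<em>` as the tags for bold and italic is an assumption about the converter's default output.

```python
from collections import Counter
from html.parser import HTMLParser


class StructureCounter(HTMLParser):
    """Count the tags that matter when verifying a converted document."""

    TRACKED = {
        "h1", "h2", "h3", "h4", "h5", "h6",
        "table", "tr", "ul", "ol", "li",
        "strong", "em", "img",
    }

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.embedded_images = 0  # <img> tags whose src is a data: URI

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.counts[tag] += 1
        if tag == "img":
            src = dict(attrs).get("src") or ""
            if src.startswith("data:"):
                self.embedded_images += 1


def summarize_structure(html_text):
    """Return (tag counts, number of data-URI images) for the given HTML."""
    parser = StructureCounter()
    parser.feed(html_text)
    parser.close()
    return dict(parser.counts), parser.embedded_images
```

If a document known to contain tables or images yields no `table` or `img` entries, that is a hint to inspect the source file or the converter configuration before using the HTML downstream.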
Troubleshooting
| Problem | Likely Cause | Fix |
|---|---|---|
| `node: command not found` | Node.js not installed | Install Node.js (v16+) |
| `Cannot find module 'mammoth'` | npm deps missing | Run `npm install` in `scripts/` |
| Empty or garbled output | Corrupted or password-protected DOCX | Try re-saving the file from Microsoft Word |
| Missing images | Large embedded images | Check mammoth.js image size limits in docx-converter.js |
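The first two rows of the table can be checked before running a conversion. This is a hedged sketch: `preflight` is a hypothetical helper, and it assumes `npm install` places the package under `scripts/node_modules/mammoth`, which is npm's usual behaviour for a local install.

```python
import shutil
from pathlib import Path


def preflight(scripts_dir, node_path=None):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Allow the caller to pass a node path explicitly; otherwise search PATH.
    node = node_path if node_path is not None else shutil.which("node")
    if not node:
        problems.append("node not found on PATH: install Node.js (v16+)")
    # Assumption: dependencies were installed locally inside scripts/.
    if not (Path(scripts_dir) / "node_modules" / "mammoth").is_dir():
        problems.append("mammoth not installed: run npm install in scripts/")
    return problems
```

Printing the result of `preflight("<skill-dir>/scripts")` before a batch run surfaces both setup problems at once instead of one failure at a time.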
Limitations
- Advanced or highly specific styling from the original DOCX may not be perfectly replicated in the HTML output.
- Features like tracked changes, comments, or complex layout elements may be simplified or omitted.
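Installation
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: `/install docx-to-html`
- Once installation finishes, invoke the skill by name or trigger it with `/docx-to-html`.
- Provide the required inputs as described in the skill's parameter documentation to get structured output.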
What is DOCX TO HTML CONVERTER?
Use this skill whenever the user has a DOCX file (.docx) and wants to convert, read, view, extract content from, or process it in any way, including summari... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 290 cumulative downloads to date.
How do I install DOCX TO HTML CONVERTER?
Run the command `/install docx-to-html` in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is DOCX TO HTML CONVERTER free?
Yes. DOCX TO HTML CONVERTER is completely free (free and open source) and can be freely downloaded, installed, and used.
Which platforms does DOCX TO HTML CONVERTER support?
DOCX TO HTML CONVERTER is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed DOCX TO HTML CONVERTER?
It is developed and maintained by Bibek KC (@bibekyess); the current version is v1.0.0.