Word DOCX (OOXML) – docx-md
Overview
Three entry points: Read – output compact Markdown (default, token-efficient) or full JSON; Modify – apply AI-returned edits to the docx; Finalize – accept all revisions and remove all comments. Implemented via OOXML (ZIP + XML). No commercial Word libraries required.
Workflow
| Goal | Action |
|---|---|
| Get document for AI | Read: run read script → Markdown (default) or JSON. Markdown includes `<!-- b:N -->` blockIndex markers for edit targeting. |
| Apply AI edits to docx | Modify: run apply script with docx + edits JSON → new docx with track changes and comments. |
| Deliver final version | Finalize: run finalize script → new docx with no revisions/comments. |
LLM-oriented pipeline
- Read – Parse docx; output Markdown (default) or JSON. Markdown uses a `<!-- b:N -->` prefix per block; revisions: `{+inserted+}` `{-deleted-}`; comments: `[comment: text]`.
- Send the output + task prompt to the model; require the model to output only the edit JSON: `blockIndex`, `originalContent`, `content`, `basis`.
- Modify – Script infers the op from `blockIndex`, `originalContent`, `content`, `basis`; converts to OOXML (`w:ins` / `w:del` / comment anchors), then writes back to Word.
- Finalize – When the user confirms, run finalize to accept all revisions and remove all comments.
See references/llm-pipeline.md for the Markdown format, JSON schema, and edit format.
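As a rough illustration of the Markdown conventions listed above, the sketch below splits Read output into blocks and resolves the revision markers. The regular expressions and the helper names `split_blocks` and `accepted_text` are assumptions based only on the marker shapes quoted in this README; references/llm-pipeline.md remains the authoritative description.

```python
import re

# Assumed marker shapes, taken from the description above.
BLOCK_RE = re.compile(r"<!--\s*b:(\d+)\s*-->")
INSERT_RE = re.compile(r"\{\+(.*?)\+\}")
DELETE_RE = re.compile(r"\{-(.*?)-\}")
COMMENT_RE = re.compile(r"\[comment:\s*(.*?)\]")


def split_blocks(markdown: str) -> dict[int, str]:
    """Map each blockIndex to the text that follows its marker."""
    parts = BLOCK_RE.split(markdown)
    # With one capture group, re.split yields [preamble, idx, text, idx, text, ...].
    return {int(parts[i]): parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}


def accepted_text(block: str) -> str:
    """Text with insertions kept and deletions and comments dropped."""
    text = DELETE_RE.sub("", block)
    text = INSERT_RE.sub(r"\1", text)
    return COMMENT_RE.sub("", text).strip()


sample = (
    "<!-- b:0 --> Payment is due in {-30-}{+45+} days. [comment: confirm with legal]\n"
    "<!-- b:1 --> Governing law: Delaware.\n"
)
blocks = split_blocks(sample)
print(accepted_text(blocks[0]))
```

Keeping the block index separate from the text makes it easy to check that a model's `blockIndex` and `originalContent` refer to a block that actually exists before any edit is applied.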
1. Read
- Parse `word/document.xml` (`w:body` only) and `word/comments.xml`.
- Output Markdown (default) or JSON. Markdown is compact and token-efficient.
Script: scripts/read_docx.py
# Default: Markdown output (token-efficient)
python3 skills/docx-md/scripts/read_docx.py document.docx
python3 skills/docx-md/scripts/read_docx.py document.docx -o result.md
# JSON output (full structure)
python3 skills/docx-md/scripts/read_docx.py document.docx -f json -o result.json
Options:
- `-o`, `--output` – Output path (default: stdout)
- `-f`, `--format` – `md` (default) or `json`
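To make the "ZIP + XML" approach concrete, here is a minimal sketch of pulling paragraph text out of `word/document.xml` with only the standard library. It is not the code of read_docx.py: the function names are hypothetical, only top-level paragraphs are handled, and numbering blocks by paragraph position is an assumption about how `blockIndex` is assigned.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def read_paragraphs(docx) -> list[str]:
    """Return the text of each top-level w:p in w:body (tables are ignored)."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    body = root.find("w:body", NS)
    paragraphs = []
    for p in body.findall("w:p", NS):
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


def make_demo_docx() -> io.BytesIO:
    """Build a tiny in-memory package holding only word/document.xml (demo only)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        '<w:p><w:r><w:t>Hello</w:t></w:r>'
        '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
        "<w:p><w:r><w:t>Second block</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    buf.seek(0)
    return buf


for index, text in enumerate(read_paragraphs(make_demo_docx())):
    print(f"<!-- b:{index} --> {text}")
```

The demo package is deliberately incomplete (no content types or relationships), so it exercises the parser but would not open in Word.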
2. Modify
- Input: docx path + edit JSON `{ modifications: [{ blockIndex, originalContent, content, basis }] }` (same `blockIndex` as read output).
- Flow: Convert JSON to OOXML (`w:ins` / `w:del` / comments), then write back to Word.
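The edit payload described above could be assembled and sanity-checked along these lines. The four keys and the `modifications` wrapper come from this README; the helper names `make_edit` and `validate`, the sample values, and the reading of `basis` as a short rationale are assumptions made for illustration.

```python
import json


def make_edit(block_index: int, original: str, content: str, basis: str) -> dict:
    """One entry of the modifications list, using the four documented keys."""
    return {
        "blockIndex": block_index,
        "originalContent": original,
        "content": content,
        "basis": basis,
    }


def validate(payload: dict) -> None:
    """Raise ValueError if an entry lacks one of the four documented keys."""
    required = {"blockIndex", "originalContent", "content", "basis"}
    for i, mod in enumerate(payload.get("modifications", [])):
        missing = required - mod.keys()
        if missing:
            raise ValueError(f"modification {i} is missing {sorted(missing)}")


payload = {
    "modifications": [
        make_edit(
            0,
            "Payment is due in 30 days.",
            "Payment is due in 45 days.",
            "Aligns with the revised payment terms.",
        ),
    ]
}
validate(payload)
edits_json = json.dumps(payload, ensure_ascii=False, indent=2)
print(edits_json)
```

The resulting string can be saved as `edits.json` or piped to the apply script on stdin.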
Script: scripts/apply_edits_docx.py. Use - as edits file to read JSON from stdin.
python3 skills/docx-md/scripts/apply_edits_docx.py document.docx edits.json -o output.docx
python3 skills/docx-md/scripts/apply_edits_docx.py document.docx - -o output.docx # stdin
Options: --author (default: "Review")
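For orientation, a tracked replacement in WordprocessingML pairs a `w:del` element (deleted text in `w:delText`) with a `w:ins` element (new text in `w:t`), each carrying an id, author, and date. The sketch below builds such a paragraph with the standard library; it is not the code of apply_edits_docx.py, and the function name `tracked_replacement` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def tracked_replacement(old: str, new: str, author: str = "Review", rev_id: int = 1) -> ET.Element:
    """Build a w:p with a tracked deletion of `old` followed by a tracked insertion of `new`."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    p = ET.Element(w("p"))

    deletion = ET.SubElement(
        p, w("del"), {w("id"): str(rev_id), w("author"): author, w("date"): stamp}
    )
    del_run = ET.SubElement(deletion, w("r"))
    # Deleted text is stored in w:delText rather than w:t.
    ET.SubElement(del_run, w("delText")).text = old

    insertion = ET.SubElement(
        p, w("ins"), {w("id"): str(rev_id + 1), w("author"): author, w("date"): stamp}
    )
    ins_run = ET.SubElement(insertion, w("r"))
    ET.SubElement(ins_run, w("t")).text = new
    return p


xml = ET.tostring(tracked_replacement("30 days", "45 days"), encoding="unicode")
print(xml)
```

Run properties and `xml:space="preserve"` are omitted here for brevity; a real edit would normally copy the formatting of the run it replaces.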
3. Finalize
- Accept all revisions (flatten to final text), remove all comments. Save as new docx.
- Uses `docx-revisions` to accept revisions (preserves encoding), then removes comment markup via regex on raw bytes.
Script: scripts/finalize_docx.py
Requires: pip install docx-revisions (see requirements.txt)
python3 skills/docx-md/scripts/finalize_docx.py input.docx -o output.docx
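The comment-removal step might look something like the following sketch, which strips comment anchors from the raw bytes of `word/document.xml` without decoding them. The pattern is an assumption (it expects the self-closing form Word normally writes) and is not taken from finalize_docx.py; dropping `word/comments.xml` and its relationship entry is a separate step not shown here.

```python
import re

# Comment anchors in word/document.xml; self-closing elements are assumed.
ANCHOR_RE = re.compile(rb"<w:comment(?:RangeStart|RangeEnd|Reference)\b[^>]*/>")


def strip_comment_anchors(document_xml: bytes) -> bytes:
    """Remove comment range and reference elements, leaving all other bytes untouched."""
    return ANCHOR_RE.sub(b"", document_xml)


sample = (
    b'<w:p><w:commentRangeStart w:id="0"/>'
    b"<w:r><w:t>Hi</w:t></w:r>"
    b'<w:commentRangeEnd w:id="0"/>'
    b'<w:r><w:commentReference w:id="0"/></w:r></w:p>'
)
cleaned = strip_comment_anchors(sample)
print(cleaned.decode("utf-8"))
```

Operating on bytes means the rest of the XML is passed through exactly as stored. The run that held the comment reference is left empty rather than removed in this sketch.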
Resources
scripts/
- read_docx.py – Read: `python3 scripts/read_docx.py document.docx [-o out.md] [-f md|json]`
- apply_edits_docx.py – Modify: `python3 scripts/apply_edits_docx.py document.docx edits.json -o output.docx`
- finalize_docx.py – Finalize: `python3 scripts/finalize_docx.py input.docx -o output.docx`
references/
- ooxml.md – OOXML layout (document.xml, comments.xml, revisions, comments)
- llm-pipeline.md – Pipeline: read → Markdown/JSON → model edits → modify; defines Markdown format, JSON shape (blockIndex, originalContent, content, basis)
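- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: `/install docx-md`
- Once installed, invoke the Skill by name or trigger it with `/docx-md`.
- Supply the inputs described in the Skill's parameter documentation to get structured output.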
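What is docx-md?
Low-level docx format tool for AI document review. Three operations: (1) read docx → output compact Markdown or JSON; (2) apply edits JSON back to docx (trac... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 663 downloads to date.
How do I install docx-md?
Run the command `/install docx-md` in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.
Is docx-md free?
Yes. docx-md is completely free (free and open source) and may be freely downloaded, installed, and used.
Which platforms does docx-md support?
docx-md is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed docx-md?
It is developed and maintained by yanweiliang323868-del (@yanweiliang323868-del); the current version is v1.0.1.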