/install convert-document-to-markdown
Convert Document To Markdown
Use this skill when a user wants a supported local file converted into Markdown for later processing.
What this skill does
- Converts supported local files into Markdown: `.pdf`, `.docx`, `.pptx`, `.xlsx`, `.jpg`, `.jpeg`, `.png`, `.gif`, `.bmp`, `.txt`, `.json`, `.xml`, `.md`.
- Image handling modes are file-type dependent: `ocr`/`vl`/`none` for `.docx`, `.pptx`, `.xlsx`, and image files; `ocr`/`vl`/`vl-page`/`none` for `.pdf`.
- Only runs through Docker. Do not use local Python execution as an operational path.
- Uses a prebuilt Aliyun CR image with fixed version `0.0.1`: `convert-document-to-markdown-arm64:0.0.1` on ARM64 hosts, `convert-document-to-markdown-x64:0.0.1` on x64 hosts.
- Returns structured JSON by default so later tool calls can consume `markdown`, `logs`, and `meta`.
- Reads one-time VL configuration from OpenClaw skill config or the repository `.env` file, then forwards it into the container automatically.
- Only exposes the `file` command. URL, health, and version commands are intentionally removed to keep startup lean.
- Do not use `latest`, do not build a fallback image at runtime, and do not treat `.doc`, `.ppt`, `.xls`, audio files, or unlisted image formats as supported inputs.
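The file-type rules above can be checked before the container is started. The following is a minimal sketch, not part of the skill: the function name `allowed_modes` is hypothetical, matching extensions case-insensitively is an assumption, and returning no image modes for the plain-text formats is also an assumption, because the list above names modes only for PDF, Office, and image files.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
OFFICE_AND_IMAGE = {".docx", ".pptx", ".xlsx", ".jpg", ".jpeg", ".png", ".gif", ".bmp"}
PLAIN_TEXT = {".txt", ".json", ".xml", ".md"}


def allowed_modes(path: str) -> set:
    """Return the image handling modes documented for this file type.

    Raises ValueError for extensions the skill does not list as supported.
    """
    # Assumption: extensions are compared case-insensitively.
    ext = Path(path).suffix.lower()
    if ext == ".pdf":
        return {"ocr", "vl", "vl-page", "none"}
    if ext in OFFICE_AND_IMAGE:
        return {"ocr", "vl", "none"}
    if ext in PLAIN_TEXT:
        # Assumption: no image modes are documented for text formats.
        return set()
    raise ValueError(f"unsupported extension: {ext or '(none)'}")
```

A caller could use this to reject `.doc` or `.mp3` inputs early and to refuse `vl-page` for anything other than a PDF.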
Required workflow
- By default the scripts use `crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab`.
- Let the wrapper script resolve the host architecture and choose `convert-document-to-markdown-arm64:0.0.1` or `convert-document-to-markdown-x64:0.0.1`.
- If needed, override with `IMAGE_REGISTRY` or `IMAGE_NAME`.
- For a local file, run:
    scripts/run_docker_cli.sh file <absolute-or-relative-path> --format json
- Parse the JSON result.
- If `success` is `false`, surface `error.message` and relevant `logs`.
- If `success` is `true`, use `markdown` as the canonical output for downstream work.
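The run-then-parse steps above could be orchestrated as in the sketch below. It only wraps the Docker-based script, so the conversion itself still runs in the container. The function names are hypothetical, and the exact shape of `error` and `logs` is an assumption based on the field names mentioned above.

```python
import json
import subprocess


def run_conversion(path: str) -> str:
    """Run the wrapper script for one local file and return its stdout."""
    # An argument list (no shell=True) keeps the path as a single argument.
    completed = subprocess.run(
        ["scripts/run_docker_cli.sh", "file", path, "--format", "json"],
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout


def markdown_from_result(raw_json: str) -> str:
    """Return the `markdown` field, or raise with the reported error and logs."""
    result = json.loads(raw_json)
    if result.get("success") is not True:
        # Assumption: `error` is an object that carries a `message` key.
        message = (result.get("error") or {}).get("message", "unknown error")
        raise RuntimeError(f"conversion failed: {message}; logs: {result.get('logs')}")
    return result["markdown"]
```

A typical call would be `markdown_from_result(run_conversion("./notes.pdf"))`, which returns Markdown only when the result reports `success: true`.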
One-time VL configuration
This skill is designed so the user does not need to re-enter Vision API settings on each run.
Preferred OpenClaw configuration in ~/.openclaw/openclaw.json:
    {
      "skills": {
        "entries": {
          "convert_document_to_markdown": {
            "enabled": true,
            "apiKey": "sk-xxx",
            "env": {
              "VL_BASE_URL": "https://api.openai.com/v1",
              "VL_MODEL": "gpt-4.1-mini"
            }
          }
        }
      }
    }
This works because:
- `skillKey` is `convert_document_to_markdown`
- `primaryEnv` is `VL_API_KEY`, so `apiKey` maps to `VL_API_KEY`
- `env` can hold `VL_BASE_URL` and `VL_MODEL`
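To make that mapping concrete, the sketch below derives the three `VL_*` variables from one skill entry. It illustrates the rule described above and is not OpenClaw's actual implementation. The function name is hypothetical, and letting `apiKey` override a `VL_API_KEY` already present in `env` is an assumption.

```python
def vl_env_from_entry(entry: dict) -> dict:
    """Derive the VL_* variables from one OpenClaw skill entry."""
    env = dict(entry.get("env", {}))
    if "apiKey" in entry:
        # `primaryEnv` is VL_API_KEY, so `apiKey` is exposed under that name.
        # Assumption: `apiKey` wins if `env` also defines VL_API_KEY.
        env["VL_API_KEY"] = entry["apiKey"]
    return env


# The entry from the configuration shown above.
entry = {
    "enabled": True,
    "apiKey": "sk-xxx",
    "env": {
        "VL_BASE_URL": "https://api.openai.com/v1",
        "VL_MODEL": "gpt-4.1-mini",
    },
}
vl_env = vl_env_from_entry(entry)
```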
Repository-local runtime configuration:
- copy `.env.example` to `.env`
- fill `VL_BASE_URL`, `VL_API_KEY`, and `VL_MODEL`
- by default the scripts use `crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab`
- optionally override with `IMAGE_REGISTRY` or `IMAGE_NAME`
- use `scripts/run_docker_cli.sh`, which loads `.env`, forwards any host `VL_*` variables into `docker run`, and pulls the correct fixed-version image if missing
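The load-and-forward behaviour described in the last item can be pictured with the sketch below. It does not reproduce the wrapper script, whose contents are not shown here. The simple `KEY=VALUE` parsing, the helper names, and the rule that host variables override `.env` values are all assumptions. The only Docker feature relied on is the standard `-e KEY=VALUE` option of `docker run`.

```python
def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def docker_env_flags(env_file_text: str, host_env: dict) -> list:
    """Build `-e KEY=VALUE` flags for every VL_* variable."""
    merged = parse_env_file(env_file_text)
    # Assumption: variables set on the host win over values from .env.
    merged.update(host_env)
    flags = []
    for key in sorted(merged):
        if key.startswith("VL_"):
            flags += ["-e", f"{key}={merged[key]}"]
    return flags
```

The returned list could be spliced into a `docker run` argument list ahead of the image reference.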
Command patterns
Local file:
scripts/run_docker_cli.sh file ./notes.pdf --image-process-model ocr --format json
Parameters
- `--image-process-model ocr`: Default mode. Use Tesseract OCR for images.
- `--image-process-model vl`: Use a Vision API. Only choose this when the environment provides `VL_API_KEY` and related variables.
- `--image-process-model none`: Skip image recognition for speed.
- `--image-process-model vl-page`: PDF only. Do not use this mode for Office documents or image files.
- `--format json|markdown`: Use `json` unless the user explicitly wants raw Markdown on stdout.
- `--output <path>`: Save the Markdown to a file. Prefer this only when you invoke `docker run` directly with a writable host mount.
- `--log-file <path>`: Save detailed logs to a file. Prefer this only when you invoke `docker run` directly with a writable host mount.
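As an illustration of how these parameters combine, the sketch below assembles the argument list for the wrapper script and rejects the combinations the list rules out. The function name `build_args` is hypothetical. `--output` and `--log-file` are left out on purpose, since they are recommended only for direct `docker run` invocations.

```python
MODES = {"ocr", "vl", "none", "vl-page"}
FORMATS = {"json", "markdown"}


def build_args(path: str, mode: str = "ocr", fmt: str = "json") -> list:
    """Return the wrapper-script argument list for one local file."""
    if mode not in MODES:
        raise ValueError(f"unknown image mode: {mode}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if mode == "vl-page" and not path.lower().endswith(".pdf"):
        # vl-page is documented as PDF only.
        raise ValueError("vl-page is only valid for .pdf inputs")
    return [
        "scripts/run_docker_cli.sh", "file", path,
        "--image-process-model", mode,
        "--format", fmt,
    ]
```

With the defaults, `build_args("./notes.pdf")` yields the same arguments as the local-file command pattern shown earlier.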
Operational notes
- For very large local files, stay with the Docker CLI path; do not wrap the file content into base64 or a temporary HTTP service.
- The skill is Docker-only. Do not instruct users to run `uv`, `python`, or any other local runtime path for production use.
- The wrapper scripts choose the image by host architecture. Override with `IMAGE_ARCH` only when you have a concrete reason.
- Prefer `IMAGE_REGISTRY` plus the fixed version `0.0.1`; only use `IMAGE_NAME` when you need to pass the full image reference explicitly.
- When the user asks for VL or VL-page, first check whether `VL_BASE_URL`, `VL_API_KEY`, and `VL_MODEL` are already configured via OpenClaw skill config or `.env`.
- If the user only needs extracted Markdown and not the raw JSON wrapper, read the JSON and return the `markdown` field.
- If the user provides an unsupported extension such as `.doc`, `.ppt`, `.xls`, `.wav`, `.mp3`, `.m4a`, or `.mp4`, say the current skill does not reliably support it.
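The image selection rules in these notes can be summarised in a short sketch. It is one plausible reading, not the wrapper's actual logic. The function name, the accepted architecture spellings, and the precedence (`IMAGE_NAME` first, then `IMAGE_ARCH` over the detected machine, then `IMAGE_REGISTRY` over the default registry) are assumptions.

```python
import os
import platform

DEFAULT_REGISTRY = "crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab"
VERSION = "0.0.1"  # fixed version; `latest` is never used


def resolve_image(env: dict, machine: str) -> str:
    """Return the full image reference for the given environment and host."""
    # Assumption: IMAGE_NAME is a full reference and bypasses all other rules.
    if env.get("IMAGE_NAME"):
        return env["IMAGE_NAME"]
    arch = (env.get("IMAGE_ARCH") or machine).lower()
    # Assumption: these spellings cover the ARM64 and x64 hosts of interest.
    if arch in {"arm64", "aarch64"}:
        suffix = "arm64"
    elif arch in {"x64", "x86_64", "amd64"}:
        suffix = "x64"
    else:
        raise ValueError(f"unsupported architecture: {arch}")
    registry = env.get("IMAGE_REGISTRY") or DEFAULT_REGISTRY
    return f"{registry}/convert-document-to-markdown-{suffix}:{VERSION}"


if __name__ == "__main__":
    print(resolve_image(dict(os.environ), platform.machine()))
```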
Safety notes
- Treat file paths as untrusted input. Quote shell arguments correctly.
- Do not claim success unless the command returns `success: true`.
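When a command has to be rendered as a single shell line, for example to show it to the user, the quoting advice above can be followed with Python's standard `shlex` module. This is a minimal sketch with a hypothetical helper name; `shlex.join` quotes each argument so that a hostile file name stays one argument.

```python
import shlex


def shell_command_for(path: str) -> str:
    """Return a copy-pasteable shell line with every argument quoted."""
    return shlex.join(["scripts/run_docker_cli.sh", "file", path, "--format", "json"])


# A file name containing spaces and shell metacharacters stays one argument.
line = shell_command_for("my report; rm -rf ~.pdf")
```

Splitting `line` again with `shlex.split` returns the original five arguments, which is a quick way to confirm that nothing in the path was interpreted by the shell.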
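Installation
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: `/install convert-document-to-markdown`
- After installation, call the skill by name or trigger it with `/convert-document-to-markdown`.
- Provide the required inputs according to the skill's parameter description to get structured output.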
What is Convert Document To Markdown?
Convert supported local files into Markdown by running this repository's Dockerized file-only CLI. This skill must run through Docker with a prebuilt Aliyun... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 106 downloads to date.
How do I install Convert Document To Markdown?
Run the command `/install convert-document-to-markdown` in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is required.
Is Convert Document To Markdown free?
Yes. Convert Document To Markdown is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Convert Document To Markdown support?
Convert Document To Markdown is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Convert Document To Markdown?
It is developed and maintained by 宁伟 (@kadbbz); the current version is v1.0.0.