/install convert-document-to-markdown
Convert Document To Markdown
Use this skill when a user wants a supported local file converted into Markdown for later processing.
What this skill does
- Converts supported local files into Markdown: .pdf, .docx, .pptx, .xlsx, .jpg, .jpeg, .png, .gif, .bmp, .txt, .json, .xml, .md
- Image handling modes are file-type dependent: ocr / vl / none for .docx, .pptx, .xlsx, and image files; ocr / vl / vl-page / none for .pdf
- Only runs through Docker. Do not use local Python execution as an operational path.
- Uses a prebuilt Aliyun CR image with fixed version 0.0.1: convert-document-to-markdown-arm64:0.0.1 on ARM64 hosts, convert-document-to-markdown-x64:0.0.1 on x64 hosts
- Returns structured JSON by default so later tool calls can consume markdown, logs, and meta.
- Reads one-time VL configuration from OpenClaw skill config or the repository .env file, then forwards it into the container automatically.
- Only exposes the file command. URL, health, and version commands are intentionally removed to keep startup lean.
- Do not use the latest tag, do not build a fallback image at runtime, and do not treat .doc, .ppt, .xls, audio files, or unlisted image formats as supported inputs.
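The image selection described above can be sketched as a small shell function. This is only an illustration of the documented rules, not the wrapper script's actual code: the function name resolve_image is hypothetical, the mapping from uname -m values to arm64 / x64 is an assumption, and so is the idea that IMAGE_ARCH holds one of those two literal values.

```shell
# Hypothetical sketch of how an image reference could be resolved.
# Usage: resolve_image [machine]   (machine defaults to `uname -m`)
resolve_image() {
  # A full reference in IMAGE_NAME takes precedence over everything else.
  if [ -n "${IMAGE_NAME:-}" ]; then
    printf '%s\n' "$IMAGE_NAME"
    return 0
  fi
  # Default registry as stated in this document.
  registry=${IMAGE_REGISTRY:-crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab}
  # Assumption: IMAGE_ARCH, when set, is literally "arm64" or "x64".
  arch=${IMAGE_ARCH:-}
  if [ -z "$arch" ]; then
    # Assumption: these are the machine names the host reports.
    case "${1:-$(uname -m)}" in
      aarch64|arm64) arch=arm64 ;;
      x86_64|amd64)  arch=x64 ;;
      *) echo "unsupported host architecture" >&2; return 1 ;;
    esac
  fi
  # The version is pinned to 0.0.1; the latest tag is never used.
  printf '%s/convert-document-to-markdown-%s:0.0.1\n' "$registry" "$arch"
}
```

Calling resolve_image with no argument uses the current host; passing a machine name such as aarch64 makes the mapping easy to check by hand.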
Required workflow
- By default the scripts use crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab.
- Let the wrapper script resolve the host architecture and choose convert-document-to-markdown-arm64:0.0.1 or convert-document-to-markdown-x64:0.0.1.
- If needed, override with IMAGE_REGISTRY or IMAGE_NAME.
- For a local file, run: scripts/run_docker_cli.sh file <absolute-or-relative-path> --format json
- Parse the JSON result.
- If success is false, surface error.message and relevant logs.
- If success is true, use markdown as the canonical output for downstream work.
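The last three steps of this workflow could look roughly like the sketch below. It assumes jq is installed on the host, which this skill does not state; handle_result is a hypothetical helper name. The field names (success, markdown, error.message, logs) come from this document, but the exact shape of logs is not specified here, so it is printed as compact JSON.

```shell
# Hypothetical helper: reads the CLI's JSON result on stdin.
# Assumes jq is available on the host.
handle_result() {
  result=$(cat)
  if printf '%s' "$result" | jq -e '.success == true' >/dev/null 2>&1; then
    # Success: markdown is the canonical output for downstream work.
    printf '%s' "$result" | jq -r '.markdown'
  else
    # Failure (or unparsable output): surface the message and any logs on stderr.
    printf '%s' "$result" | jq -r '.error.message // "unknown error"' >&2
    printf '%s' "$result" | jq -c '.logs // empty' >&2
    return 1
  fi
}

# Usage (runs the Docker wrapper, so it needs Docker and the image):
#   scripts/run_docker_cli.sh file "./notes.pdf" --format json | handle_result
```

Keeping the success check in one place makes it harder to report a conversion as successful when the JSON says otherwise.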
One-time VL configuration
This skill is designed so the user does not need to re-enter Vision API settings on each run.
Preferred OpenClaw configuration in ~/.openclaw/openclaw.json:
{
  "skills": {
    "entries": {
      "convert_document_to_markdown": {
        "enabled": true,
        "apiKey": "sk-xxx",
        "env": {
          "VL_BASE_URL": "https://api.openai.com/v1",
          "VL_MODEL": "gpt-4.1-mini"
        }
      }
    }
  }
}
This works because:
- skillKey is convert_document_to_markdown
- primaryEnv is VL_API_KEY, so apiKey maps to VL_API_KEY
- env can hold VL_BASE_URL and VL_MODEL
Repository-local runtime configuration:
- copy .env.example to .env
- fill VL_BASE_URL, VL_API_KEY, and VL_MODEL
- by default the scripts use crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab
- optionally override with IMAGE_REGISTRY or IMAGE_NAME
- use scripts/run_docker_cli.sh, which loads .env, forwards any host VL_* variables into docker run, and pulls the correct fixed-version image if missing
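For orientation, a filled-in .env might look like the sketch below. The values are the placeholders from the OpenClaw example earlier in this section, not working credentials, and the real .env.example in the repository may list additional keys or comments.

```shell
# .env (sketch; placeholder values, replace with your own Vision API settings)
VL_BASE_URL=https://api.openai.com/v1
VL_API_KEY=sk-xxx
VL_MODEL=gpt-4.1-mini

# Optional overrides; leave them commented out to use the default registry
# and the image chosen for the host architecture.
# IMAGE_REGISTRY=
# IMAGE_NAME=
```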
Command patterns
Local file:
scripts/run_docker_cli.sh file ./notes.pdf --image-process-model ocr --format json
Parameters
- --image-process-model ocr: Default mode. Use Tesseract OCR for images.
- --image-process-model vl: Use a Vision API. Only choose this when the environment provides VL_API_KEY and related variables.
- --image-process-model none: Skip image recognition for speed.
- --image-process-model vl-page: PDF only. Do not use this mode for Office documents or image files.
- --format json|markdown: Use json unless the user explicitly wants raw Markdown on stdout.
- --output <path>: Save the Markdown to a file. Prefer this only when you invoke docker run directly with a writable host mount.
- --log-file <path>: Save detailed logs to a file. Prefer this only when you invoke docker run directly with a writable host mount.
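The constraints on --image-process-model can be checked before launching the container. The sketch below is a hypothetical pre-flight helper (check_image_mode is not part of the skill's scripts). It only inspects the current shell environment, so settings that live in .env or in the OpenClaw skill config would have to be loaded first, and treating .PDF like .pdf is an assumption.

```shell
# Hypothetical pre-flight check; not part of the skill's scripts.
# Usage: check_image_mode <file> <mode>
check_image_mode() {
  file=$1
  mode=$2
  case "$mode" in
    ocr|none) return 0 ;;            # no Vision API settings needed
    vl) ;;                           # allowed, but needs the VL_* settings
    vl-page)
      case "$file" in
        *.pdf|*.PDF) ;;              # vl-page is PDF only
        *) echo "vl-page is PDF only" >&2; return 1 ;;
      esac ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # vl and vl-page both require all three Vision API variables.
  for name in VL_BASE_URL VL_API_KEY VL_MODEL; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing $name" >&2
      return 1
    fi
  done
  return 0
}
```

A caller could fall back to ocr, the default mode, when the check fails for vl or vl-page.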
Operational notes
- For very large local files, stay with the Docker CLI path; do not wrap the file content into base64 or a temporary HTTP service.
- The skill is Docker-only. Do not instruct users to run uv, python, or any other local runtime path for production use.
- The wrapper scripts choose the image by host architecture. Override with IMAGE_ARCH only when you have a concrete reason.
- Prefer IMAGE_REGISTRY plus the fixed version 0.0.1; only use IMAGE_NAME when you need to pass the full image reference explicitly.
- When the user asks for VL or VL-page, first check whether VL_BASE_URL, VL_API_KEY, and VL_MODEL are already configured via OpenClaw skill config or .env.
- If the user only needs extracted Markdown and not the raw JSON wrapper, read the JSON and return the markdown field.
- If the user provides an unsupported extension such as .doc, .ppt, .xls, .wav, .mp3, .m4a, or .mp4, say the current skill does not reliably support it.
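The extension rule in the last note can be applied before any container is started. The helper below is a hypothetical sketch (is_supported_input is not part of the skill's scripts). It uses the supported-extension list from this document and assumes that matching should be case-insensitive.

```shell
# Hypothetical pre-check helper; not part of the skill's scripts.
# Returns 0 when the file extension is in the supported list, 1 otherwise.
is_supported_input() {
  # A name without any dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it.
  # Assumption: .PDF should be treated like .pdf.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf|docx|pptx|xlsx|jpg|jpeg|png|gif|bmp|txt|json|xml|md) return 0 ;;
    *) return 1 ;;
  esac
}
```

When the check fails, the agent can tell the user that the format is not reliably supported instead of attempting a conversion.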
Safety notes
- Treat file paths as untrusted input. Quote shell arguments correctly.
- Do not claim success unless the command returns success: true.
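As a sketch of what quoting correctly can mean in practice: the path is always passed as a single double-quoted argument, and a leading dash is neutralized so it cannot be read as an option. Both function names are hypothetical. Whether the CLI accepts a -- separator is not stated in this document, so the sketch prefixes ./ instead.

```shell
# Hypothetical helper: keep a user-supplied path from being parsed as an option.
normalize_path_arg() {
  case "$1" in
    -*) printf './%s\n' "$1" ;;   # "-rf x.pdf" becomes "./-rf x.pdf"
    *)  printf '%s\n' "$1" ;;
  esac
}

# Hypothetical wrapper around the documented command.
convert_to_json() {
  path=$(normalize_path_arg "$1") || return 1
  # Double quotes keep spaces, globs, and $(...) in the path from being
  # split or expanded by the shell; the path reaches the CLI as one argument.
  scripts/run_docker_cli.sh file "$path" --format json
}
```

Building the command as a string and passing it to eval or sh -c would reintroduce the injection risk, so the sketch calls the wrapper script directly.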
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install convert-document-to-markdown
- After installation, invoke the skill by name or use /convert-document-to-markdown
- Provide required inputs per the skill's parameter spec and get structured output
What is Convert Document To Markdown?
Convert supported local files into Markdown by running this repository's Dockerized file-only CLI. This skill must run through Docker with a prebuilt Aliyun... It is an AI Agent Skill for Claude Code / OpenClaw, with 106 downloads so far.
How do I install Convert Document To Markdown?
Run "/install convert-document-to-markdown" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Convert Document To Markdown free?
Yes, Convert Document To Markdown is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Convert Document To Markdown support?
Convert Document To Markdown runs wherever OpenClaw / Claude Code is available and Docker can run; prebuilt images are provided for ARM64 and x64 hosts.
Who created Convert Document To Markdown?
It is built and maintained by 宁伟 (@kadbbz); the current version is v1.0.0.