kadbbz

Convert Document To Markdown

by 宁伟 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 106 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install convert-document-to-markdown
Description
Convert supported local files into Markdown by running this repository's Dockerized file-only CLI. This skill must run through Docker with a prebuilt Aliyun...
README (SKILL.md)

Convert Document To Markdown

Use this skill when a user wants a supported local file converted into Markdown for later processing.

What this skill does

  • Converts supported local files into Markdown: .pdf, .docx, .pptx, .xlsx, .jpg, .jpeg, .png, .gif, .bmp, .txt, .json, .xml, .md
  • Image handling modes are file-type dependent: ocr / vl / none for .docx, .pptx, .xlsx, and image files; ocr / vl / vl-page / none for .pdf
  • Only runs through Docker. Do not use local Python execution as an operational path.
  • Uses a prebuilt Aliyun CR image with fixed version 0.0.1: convert-document-to-markdown-arm64:0.0.1 on ARM64 hosts, convert-document-to-markdown-x64:0.0.1 on x64 hosts
  • Returns structured JSON by default so later tool calls can consume markdown, logs, and meta.
  • Reads one-time VL configuration from OpenClaw skill config or the repository .env file, then forwards it into the container automatically.
  • Only exposes the file command. URL, health, and version commands are intentionally removed to keep startup lean.
  • Do not use latest, do not build a fallback image at runtime, and do not treat .doc, .ppt, .xls, audio files, or unlisted image formats as supported inputs.
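
The support matrix above can be checked on the caller side before any container is started. The following is a minimal sketch, not part of the skill: the function name `allowed_image_modes` and the grouping constants are hypothetical, and the empty result for text-like files only reflects that the list above names no image mode for them.

```python
from pathlib import Path

# Extensions copied from the list above; the grouping names are illustrative.
PDF_EXTS = frozenset({".pdf"})
OFFICE_EXTS = frozenset({".docx", ".pptx", ".xlsx"})
IMAGE_EXTS = frozenset({".jpg", ".jpeg", ".png", ".gif", ".bmp"})
TEXT_EXTS = frozenset({".txt", ".json", ".xml", ".md"})
SUPPORTED_EXTS = PDF_EXTS | OFFICE_EXTS | IMAGE_EXTS | TEXT_EXTS


def allowed_image_modes(path):
    """Return the image handling modes documented for this file type.

    Raises ValueError for extensions the skill does not list as supported
    (for example .doc, .ppt, .xls, or audio files).
    """
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_EXTS:
        raise ValueError(f"unsupported extension: {ext or '(none)'}")
    if ext in PDF_EXTS:
        return frozenset({"ocr", "vl", "vl-page", "none"})
    if ext in OFFICE_EXTS or ext in IMAGE_EXTS:
        return frozenset({"ocr", "vl", "none"})
    # Assumption: no image mode is documented for text-like inputs.
    return frozenset()
```

A caller could reject `legacy.doc` up front with this check instead of waiting for the container to fail.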

Required workflow

  1. By default the scripts use crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab.
  2. Let the wrapper script resolve the host architecture and choose convert-document-to-markdown-arm64:0.0.1 or convert-document-to-markdown-x64:0.0.1.
  3. If needed, override with IMAGE_REGISTRY or IMAGE_NAME.
  4. For a local file, run: scripts/run_docker_cli.sh file <absolute-or-relative-path> --format json
  5. Parse the JSON result.
  6. If success is false, surface error.message and relevant logs.
  7. If success is true, use markdown as the canonical output for downstream work.
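
Steps 5 to 7 can be sketched as a small result handler. This is an illustration under assumptions: the field names `success`, `markdown`, `logs`, and `error.message` come from this README, but the exact shape of `logs` is not documented here, and `ConversionError` and `extract_markdown` are hypothetical names.

```python
import json


class ConversionError(RuntimeError):
    """Raised when the CLI reports success: false."""


def extract_markdown(raw_stdout):
    """Parse the CLI's JSON output and return the markdown field.

    On failure, surface error.message together with whatever logs
    the result carries.
    """
    result = json.loads(raw_stdout)
    if result.get("success") is not True:
        error = result.get("error") or {}
        message = error.get("message", "unknown error")
        logs = result.get("logs")
        raise ConversionError(f"{message} (logs: {logs!r})")
    return result["markdown"]
```

Checking `success is not True` rather than plain truthiness keeps a malformed result from being reported as a success.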

One-time VL configuration

This skill is designed so the user does not need to re-enter Vision API settings on each run.

Preferred OpenClaw configuration in ~/.openclaw/openclaw.json:

{
  "skills": {
    "entries": {
      "convert_document_to_markdown": {
        "enabled": true,
        "apiKey": "sk-xxx",
        "env": {
          "VL_BASE_URL": "https://api.openai.com/v1",
          "VL_MODEL": "gpt-4.1-mini"
        }
      }
    }
  }
}

This works because:

  • skillKey is convert_document_to_markdown
  • primaryEnv is VL_API_KEY, so apiKey maps to VL_API_KEY
  • env can hold VL_BASE_URL and VL_MODEL
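
The mapping described above can be illustrated as follows. This is only a sketch of the documented behavior, not OpenClaw's actual implementation, which this README does not show; the function name `skill_env` is hypothetical.

```python
PRIMARY_ENV = "VL_API_KEY"  # primaryEnv as stated above


def skill_env(entry):
    """Build the environment variables implied by one skill config entry.

    `entry` is the object stored under skills.entries.<skillKey>.
    """
    env = dict(entry.get("env", {}))
    api_key = entry.get("apiKey")
    if api_key:
        # apiKey maps to the skill's primary environment variable.
        env[PRIMARY_ENV] = api_key
    return env
```

Applied to the example entry, this yields `VL_API_KEY`, `VL_BASE_URL`, and `VL_MODEL`, which are the three variables the wrapper script forwards into the container.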

Repository-local runtime configuration:

  • copy .env.example to .env
  • fill VL_BASE_URL, VL_API_KEY, and VL_MODEL
  • by default the scripts use crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab
  • optionally override with IMAGE_REGISTRY or IMAGE_NAME
  • use scripts/run_docker_cli.sh, which loads .env, forwards any host VL_* variables into docker run, and pulls the correct fixed-version image if missing

Command patterns

Local file:

scripts/run_docker_cli.sh file ./notes.pdf --image-process-model ocr --format json

Parameters

  • --image-process-model ocr: Default mode. Use Tesseract OCR for images.
  • --image-process-model vl: Use a Vision API. Only choose this when the environment provides VL_API_KEY and related variables.
  • --image-process-model none: Skip image recognition for speed.
  • --image-process-model vl-page: PDF only. Do not use this mode for Office documents or image files.
  • --format json|markdown: Use json unless the user explicitly wants raw Markdown on stdout.
  • --output <path>: Save the Markdown to a file. Prefer this only when you invoke docker run directly with a writable host mount.
  • --log-file <path>: Save detailed logs to a file. Prefer this only when you invoke docker run directly with a writable host mount.
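
A caller that assembles the command programmatically might build it as an argument list, which also avoids shell quoting problems with unusual file names. This is a hedged sketch: the flags are the ones listed above, while the function name `build_command` and its defaults are assumptions.

```python
import shlex

MODES = ("ocr", "vl", "vl-page", "none")
FORMATS = ("json", "markdown")


def build_command(path, mode="ocr", fmt="json", output=None, log_file=None):
    """Return the wrapper invocation as an argv list (no shell involved)."""
    if mode not in MODES:
        raise ValueError(f"unknown image mode: {mode}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    argv = ["scripts/run_docker_cli.sh", "file", path,
            "--image-process-model", mode, "--format", fmt]
    if output:
        argv += ["--output", output]
    if log_file:
        argv += ["--log-file", log_file]
    return argv
```

Passing the list to `subprocess.run(argv, capture_output=True, text=True)` hands the path over as a single argument; `shlex.join(argv)` gives a safely quoted string when the command has to be shown or logged.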

Operational notes

  • For very large local files, stay with the Docker CLI path; do not wrap the file content into base64 or a temporary HTTP service.
  • The skill is Docker-only. Do not instruct users to run uv, python, or any other local runtime path for production use.
  • The wrapper scripts choose the image by host architecture. Override with IMAGE_ARCH only when you have a concrete reason.
  • Prefer IMAGE_REGISTRY plus the fixed version 0.0.1; only use IMAGE_NAME when you need to pass the full image reference explicitly.
  • When the user asks for vl or vl-page, first check whether VL_BASE_URL, VL_API_KEY, and VL_MODEL are already configured via OpenClaw skill config or .env.
  • If the user only needs extracted Markdown and not the raw JSON wrapper, read the JSON and return the markdown field.
  • If the user provides an unsupported extension such as .doc, .ppt, .xls, .wav, .mp3, .m4a, or .mp4, say the current skill does not reliably support it.
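
The image selection rules in these notes can be pictured roughly as below. The wrapper script itself is not shown in this README, so several details are assumptions: the precedence of IMAGE_NAME over IMAGE_ARCH and IMAGE_REGISTRY, the accepted IMAGE_ARCH values (`arm64`, `x64`), the machine-name mapping, and joining registry and image name with `/`.

```python
import platform

DEFAULT_REGISTRY = "crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab"
VERSION = "0.0.1"  # fixed version; never "latest"


def arch_from_machine(machine):
    """Map a platform.machine() value to the image suffix (assumed mapping)."""
    name = machine.lower()
    if name in ("arm64", "aarch64"):
        return "arm64"
    if name in ("x86_64", "amd64"):
        return "x64"
    raise ValueError(f"no prebuilt image for architecture: {machine}")


def image_reference(env, machine=None):
    """Resolve the full image reference from overrides and host architecture."""
    if env.get("IMAGE_NAME"):
        return env["IMAGE_NAME"]  # full reference passed explicitly
    arch = env.get("IMAGE_ARCH") or arch_from_machine(machine or platform.machine())
    registry = env.get("IMAGE_REGISTRY") or DEFAULT_REGISTRY
    return f"{registry}/convert-document-to-markdown-{arch}:{VERSION}"
```

Calling `image_reference(os.environ)` on an ARM64 host with no overrides would, under these assumptions, produce the default registry path followed by `convert-document-to-markdown-arm64:0.0.1`.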

Safety notes

  • Treat file paths as untrusted input. Quote shell arguments correctly.
  • Do not claim success unless the command returns success: true.
Usage Guidance
This skill does what it says (it runs a Docker image to convert files to Markdown), but you must trust the image. Before installing or running:
  1. Confirm you trust crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com/huozige_lab and the image maintainers, or prefer images from a well-known registry.
  2. Do not put sensitive secrets (API keys) in a repo .env or host env unless you want them forwarded into the container; only set VL_API_KEY when you need Vision API processing.
  3. If possible, pull and inspect or rebuild the image locally (or require image digests) so you know what code will run.
  4. Consider running the container in a restricted environment (network-disabled or sandbox) if you must process sensitive files.
  5. Note the documentation vs. script mismatch around reading ~/.openclaw/openclaw.json: the wrapper script only loads a local .env; ensure your platform handles any OpenClaw config securely.
Capability Analysis
Type: OpenClaw Skill
Name: convert-document-to-markdown
Version: 1.0.0

The skill executes opaque Docker images from a personal Aliyun registry (crpi-4auaoyyj6r36p6lb.cn-hangzhou.personal.cr.aliyuncs.com) and forwards sensitive API keys (VL_API_KEY) to them via scripts/run_docker_cli.sh. While the script limits local file access by using read-only mounts for the input directory, the reliance on unverified third-party binary artifacts to process potentially sensitive documents constitutes a supply-chain risk. There is no explicit evidence of malicious intent, but the execution of remote, non-transparent code with credentials warrants caution.
Capability Assessment
Purpose & Capability
The skill's name and description align with the provided script and SKILL.md: it converts local files to Markdown via Docker. However, the metadata declares a primary credential (VL_API_KEY) while 'Required env vars' is empty and the SKILL.md treats VL as optional (only needed for 'vl' modes). That mismatch is worth noting but can be explained by optional VL-based image processing.
Instruction Scope
The runtime instructions stay within the stated purpose: run the container with the target file mounted read-only and return JSON/markdown. SKILL.md says it will read one-time OpenClaw skill config (~/.openclaw/openclaw.json) or repo .env, but the included wrapper script only loads a local .env and forwards VL_* host env vars — it does not itself read ~/.openclaw/openclaw.json. This is an implementation/documentation mismatch.
Install Mechanism
There is no install spec, but the runtime script will docker pull and run an image from a personal Aliyun CR registry (crpi-...personal.cr.aliyuncs.com). Pulling and executing an opaque third-party image is high-risk: the container can execute arbitrary code and generate network traffic, and the script does not verify image provenance or digest. Using Docker here is expected for a containerized CLI, but the image source being a personal/unknown registry increases risk.
Credentials
The script forwards multiple VL_* environment variables (VL_BASE_URL, VL_API_KEY, VL_MODEL, and others) or an env-file into the container if present. Forwarding vision API credentials into the container is functionally justified only when the user requests 'vl' or 'vl-page' processing; it is unnecessary for default OCR mode. The metadata/primaryEnv setting also implies VL_API_KEY is the skill's main credential even though SKILL.md treats it as optional — an inconsistency that could lead to unneeded exposure of secrets.
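
One way a caller could narrow this exposure is to pass VL_* variables along only for the modes that need them. The sketch below is not what the bundled wrapper does (it forwards any VL_* variables that are present); the function name `vl_env_for_mode` and the error wording are hypothetical.

```python
VL_MODES = frozenset({"vl", "vl-page"})
REQUIRED_VL_VARS = ("VL_BASE_URL", "VL_API_KEY", "VL_MODEL")


def vl_env_for_mode(host_env, mode):
    """Return the VL_* variables to forward for the requested image mode.

    For ocr and none, nothing is forwarded. For vl and vl-page, all three
    required settings must be present.
    """
    if mode not in VL_MODES:
        return {}
    missing = [name for name in REQUIRED_VL_VARS if not host_env.get(name)]
    if missing:
        raise ValueError("missing VL settings: " + ", ".join(missing))
    return {key: value for key, value in host_env.items() if key.startswith("VL_")}
```

With this filter, a default OCR run never sees the API key, while a vl run fails early if any of the three settings is absent.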
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide settings. The wrapper is instruction-only and does not install persistent agents. No elevated platform privileges are requested.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install convert-document-to-markdown
  3. After installation, invoke the skill by name or use /convert-document-to-markdown
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: convert local files to Markdown using Dockerized CLI with fixed-version Aliyun CR images.
  • Supports converting `.pdf`, `.docx`, `.pptx`, `.xlsx`, common image formats, and plain/text files to Markdown.
  • Requires Docker; does not support local Python or alternative runtimes.
  • Selects prebuilt Aliyun CR image (`arm64` or `x64`) based on host architecture.
  • Returns structured JSON with markdown output, logs, and metadata for downstream processing.
  • Only the `file` command is exposed; URL, health, and version commands are not included.
  • Requires configuration of VL API credentials via OpenClaw config or `.env`.
  • Does not support legacy Office formats (`.doc`, `.ppt`, `.xls`) or audio/video files.
Metadata
Slug convert-document-to-markdown
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Convert Document To Markdown?

Convert supported local files into Markdown by running this repository's Dockerized file-only CLI. This skill must run through Docker with a prebuilt Aliyun... It is an AI Agent Skill for Claude Code / OpenClaw, with 106 downloads so far.

How do I install Convert Document To Markdown?

Run "/install convert-document-to-markdown" in the OpenClaw or Claude Code chat to install it in one step. Running the skill also requires Docker on the host.

Is Convert Document To Markdown free?

Yes, Convert Document To Markdown is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Convert Document To Markdown support?

Convert Document To Markdown is cross-platform and runs anywhere OpenClaw / Claude Code and Docker are available; prebuilt images are provided for ARM64 and x64 hosts.

Who created Convert Document To Markdown?

It is built and maintained by 宁伟 (@kadbbz); the current version is v1.0.0.
