sqlskills

Markitdown File Converter

by SQLSkills · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads 86
Stars 1
Active Installs 0
Versions 1
Install in OpenClaw
/install markitdown-file-converter
Description
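Convert PDF, Word (docx/doc), Excel (xlsx/xls), PPT (pptx/ppt), image, and other files to Markdown or JSON in one step. Three built-in engines: pandoc (strongest for DOCX tables, emoji, and formulas), markitdown (Microsoft open source; Excel, PPT, and image OCR), mammoth...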
Usage Guidance
This skill mostly does what it claims (local conversion + OCR), but by default it sends images/documents to a remote 'PaddleOCR Cloud' endpoint, because the code ships with a non-empty default API URL and token. Before installing or running:
  1. Do not run it on sensitive documents without first verifying or disabling the cloud OCR: set the environment variables PADDLEOCR_DOC_PARSING_API_URL="" and/or PADDLEOCR_ACCESS_TOKEN="" to disable the cloud path, or remove/patch scripts/ocr/paddleocr.py so that is_configured() returns False unless you explicitly configure it.
  2. If you must use cloud OCR, replace the default endpoint/token with a known, trusted service and a token you control, and inspect network traffic to confirm the destination.
  3. Consider running conversions with only the local pix2tex / RapidOCR engines (they are present), or audit/modify the code so that it never calls external endpoints automatically.
  4. If unsure, run the skill in an isolated environment (offline or sandboxed) and review/grep the repository for other hard-coded endpoints or secrets before use.
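The "review/grep the repository for other hard-coded endpoints" step can be approximated with a short script. This is a minimal sketch, not part of the skill: the function name `find_hardcoded_urls`, the regular expression, and the list of file suffixes are all assumptions chosen for illustration, and a URL assembled at runtime from string fragments would not be caught.

```python
import re
from pathlib import Path

# Matches http(s) URL literals; stops at whitespace, quotes, and brackets.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def find_hardcoded_urls(root, suffixes=(".py", ".md", ".json", ".toml", ".yaml", ".yml")):
    """Return (relative path, line number, URL) for each URL literal under root.

    Only files whose suffix is in `suffixes` are read; this list is an
    assumption and should be widened for a thorough audit.
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in URL_PATTERN.finditer(line):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, lineno, match.group(0)))
    return hits
```

Calling `find_hardcoded_urls("path/to/skill")` on an unpacked copy of the skill returns every URL literal with its file and line, which gives a starting list of hosts to check against what the documentation declares.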
Capability Analysis
Type: OpenClaw Skill
Name: markitdown-file-converter
Version: 1.0.0
The skill bundle implements a document converter with high-risk capabilities, including automated system-level software installation and data transmission to external endpoints. It uses subprocess calls to execute 'pip install', 'winget', and 'apt-get' for dependency management (scripts/utils/deps.py, scripts/backends/pandoc.py), and it sends document data to a hardcoded third-party API (https://c474r929pea0qa6c.aistudio-app.com/layout-parsing) for OCR processing (scripts/ocr/paddleocr.py). While these behaviors are documented as features, the combination of automated environment modification and document exfiltration to a specific external service meets the threshold for suspicious activity.
Capability Tags
requires-oauth-token
Capability Assessment
Purpose & Capability
The skill is a document-to-Markdown/JSON converter and its files and CLI match that purpose. However, the code includes a 'PaddleOCR Cloud' integration with a default API URL and hard-coded access token so the skill will call an external cloud API by default. Requiring a remote OCR service is not inherently wrong, but the registry metadata declared no required env vars/credentials and the README implied cloud OCR is used only if configured — the code contradicts that by enabling the cloud path via non-empty defaults.
Instruction Scope
SKILL.md describes local installs and optional 'PaddleOCR Cloud' usage 'if configured'. But the runtime instructions and code will attempt to call PaddleOCR Cloud automatically because defaults are present. The skill will read images/files and POST them to an external HTTP endpoint (scripts/ocr/paddleocr.py -> httpx.post). Sending potentially sensitive document contents to a third-party endpoint is not clearly documented in SKILL.md as enabled by default, so this behavior expands the instruction scope unexpectedly.
Install Mechanism
There is no platform install spec in registry metadata (instruction-driven). The scripts run pip installs at runtime (subprocess pip install) and pandoc download logic uses GitHub releases or winget/brew/apt — these are expected for this functionality. No obfuscated installers or unusual download hosts are present except the single hard-coded PaddleOCR API endpoint for runtime calls (not an installer).
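To see what a runtime installer of this kind would change before letting it run, the dependencies can be checked without installing anything. The sketch below is illustrative only: `missing_packages` and `pip_install_command` are hypothetical helpers, not functions from scripts/utils/deps.py, and the command is built but never executed.

```python
import importlib.util
import sys


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported here.

    Only top-level names are supported; find_spec() on a dotted name
    would import the parent package as a side effect.
    """
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def pip_install_command(package):
    """Build (but do not run) the pip command a runtime installer would use."""
    return [sys.executable, "-m", "pip", "install", package]
```

Printing `pip_install_command(name)` for each entry in `missing_packages([...])` shows what would be installed into the current environment, so the decision to run it stays with the user.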
Credentials
Registry metadata lists no required environment variables or credentials, but the code reads PADDLEOCR_DOC_PARSING_API_URL and PADDLEOCR_ACCESS_TOKEN (in scripts/ocr/paddleocr.py). Worse, these have non-empty default values in the code, so the cloud OCR path is considered 'configured' even if the user sets nothing. A document conversion skill should not transmit document contents to a remote service without explicit, declared credentials or opt-in.
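The difference between "configured by default" and "opt-in" comes down to the fallback value passed when the environment variable is read. The following is a hypothetical reconstruction of that pattern, not the actual contents of scripts/ocr/paddleocr.py: the two function names, the placeholder URL, and the placeholder token are assumptions; only the two environment variable names come from the review above.

```python
import os

# Placeholders standing in for the hard-coded defaults described above.
_DEFAULT_URL = "https://example.invalid/layout-parsing"
_DEFAULT_TOKEN = "placeholder-token"


def is_configured_with_defaults(env=None):
    """Pattern described in the review: non-empty fallbacks mean 'configured'."""
    env = os.environ if env is None else env
    url = env.get("PADDLEOCR_DOC_PARSING_API_URL", _DEFAULT_URL)
    token = env.get("PADDLEOCR_ACCESS_TOKEN", _DEFAULT_TOKEN)
    return bool(url) and bool(token)


def is_configured_opt_in(env=None):
    """Safer variant: empty fallbacks, so the user must supply both values."""
    env = os.environ if env is None else env
    url = env.get("PADDLEOCR_DOC_PARSING_API_URL", "")
    token = env.get("PADDLEOCR_ACCESS_TOKEN", "")
    return bool(url) and bool(token)
```

Under this assumed pattern, an unset variable falls back to the default and the cloud path counts as configured, while a variable explicitly set to the empty string is returned as-is and disables it. That is consistent with the guidance to set either variable to "" as a workaround.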
Persistence & Privilege
The skill does not request permanent inclusion (always=false), does not modify other skills, and does not persist credentials or change global agent configuration. It runs installs in the current environment, which is expected for a utility script.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install markitdown-file-converter
  3. After installation, invoke the skill by name or use /markitdown-file-converter
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
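markitdown-file-converter 1.0.0
- Initial release: converts PDF, Word, Excel, PPT, images, and other common file types to Markdown or JSON in one step.
- Integrates three engines (pandoc, markitdown, mammoth) and can install and detect dependencies automatically, with no manual configuration.
- Supports converting math formulas to LaTeX, converting tables to standard Markdown tables, automatic image extraction with OCR for text and formulas, and automatic decoding of base64 images.
- Provides batch conversion, JSON output structured by heading, timeout control, detailed progress feedback, and other enhancements.
- Supports one-step conversion from the command line and batch processing of directories, and automatically selects the best backend for each file type.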
Metadata
Slug markitdown-file-converter
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Markitdown File Converter?

Convert PDF, Word (docx/doc), Excel (xlsx/xls), PPT (pptx/ppt), image, and other files to Markdown or JSON in one step. Three built-in engines: pandoc (strongest for DOCX tables, emoji, and formulas), markitdown (Microsoft open source; Excel, PPT, and image OCR), mammoth... It is an AI Agent Skill for Claude Code / OpenClaw, with 86 downloads so far.

How do I install Markitdown File Converter?

Run "/install markitdown-file-converter" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Markitdown File Converter free?

Yes, Markitdown File Converter is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Markitdown File Converter support?

Markitdown File Converter is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Markitdown File Converter?

It is built and maintained by SQLSkills (@sqlskills); the current version is v1.0.0.
