MiniMax PDF OCR

Name: MiniMax PDF OCR
Author: chongjie-ran

Version: v1.0.0 · License: MIT-0

cross-platform ⚠ suspicious

Total downloads: 308

Install in OpenClaw

/install minimax-pdf-ocr

Description

Recognizes text in PDFs and images using the MiniMax Vision API.

Safe usage advice

This skill's code does what its name says: it converts PDF pages to images, uploads those images to the MiniMax Vision API to get OCR results, then writes a Markdown file. Before installing or using it, consider:

1) Privacy: images (full page content) are sent to https://api.minimax.chat. Do not process sensitive or confidential documents unless you trust that service and its privacy policy.
2) Credentials: the code requires MINIMAX_API_KEY (set in the environment). The registry metadata incorrectly stated that no env vars are needed, so verify you are comfortable providing that API key.
3) System dependency: pdftoppm (poppler) must be installed. SKILL.md mentions it, but the registry metadata omitted it.
4) Inconsistencies: SKILL.md recommends npm packages (openai, pdf2image) that are not used by the shipped code. This suggests sloppy packaging; prefer to inspect and run the script in a sandbox first.
5) Safety checks: check the API endpoint and the publisher before using the skill with real secrets, and test on non-sensitive sample documents.

If you want to proceed, run it locally in an isolated environment and verify network endpoints and outputs yourself. If you require higher assurance, ask the publisher to correct the metadata and provide provenance/hosting information.
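The advice to check the API endpoint before sending real documents can be made mechanical. The following is a minimal sketch, not part of the shipped skill: `ALLOWED_HOSTS` and `isAllowedEndpoint` are hypothetical names, and the only host listed is the one the review above names.

```javascript
// Hypothetical allow-list: hosts you are willing to send page images to.
const ALLOWED_HOSTS = new Set(["api.minimax.chat"]);

// Return true only for HTTPS URLs whose hostname is exactly on the allow-list.
function isAllowedEndpoint(urlString, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(urlString);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "https:" && allowedHosts.has(url.hostname);
}
```

Comparing the parsed hostname exactly, rather than searching the URL string for a substring, means a look-alike host such as `api.minimax.chat.evil.example` is rejected.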

Functional analysis

Type: OpenClaw Skill
Name: minimax-pdf-ocr
Version: 1.0.0

The skill contains a potential shell injection vulnerability in `pdf-ocr-minimax.js`: it calls `child_process.spawn` with `shell: true` on unsanitized file paths (`pdfPath`). There is also a discrepancy between the documentation in `SKILL.md`, which instructs users to install unused dependencies (`openai`, `pdf2image`), and the actual implementation, which uses native `fetch` and system calls. While the script correctly targets the legitimate MiniMax API endpoint (`api.minimax.chat`), the insecure execution pattern poses a risk.
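For illustration, one common way to avoid this class of shell injection is to pass the path as a separate argument with the shell disabled. This is a sketch under assumptions, not the skill's actual code: `buildPdftoppmArgs` and `convertPdfToPng` are hypothetical helper names, and the 150 DPI default is arbitrary.

```javascript
const { spawn } = require("node:child_process");
const path = require("node:path");

// Build the argument vector for pdftoppm. Each value is its own array
// element, so the PDF path is never interpreted by a shell.
function buildPdftoppmArgs(pdfPath, outputPrefix, dpi = 150) {
  if (typeof pdfPath !== "string" || pdfPath.length === 0) {
    throw new TypeError("pdfPath must be a non-empty string");
  }
  // Absolute paths also keep a file name starting with "-" from being
  // read as a command-line option.
  return ["-png", "-r", String(dpi), path.resolve(pdfPath), path.resolve(outputPrefix)];
}

// Run pdftoppm without a shell; resolves when it exits with code 0.
function convertPdfToPng(pdfPath, outputPrefix) {
  return new Promise((resolve, reject) => {
    const child = spawn("pdftoppm", buildPdftoppmArgs(pdfPath, outputPrefix), {
      shell: false,
      stdio: ["ignore", "ignore", "inherit"],
    });
    child.on("error", reject); // e.g. pdftoppm is not installed
    child.on("close", (code) =>
      code === 0 ? resolve() : reject(new Error(`pdftoppm exited with code ${code}`))
    );
  });
}
```

With `shell: false`, a file name such as `a; rm -rf x.pdf` reaches pdftoppm as one literal argument instead of being split into commands.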

Capability assessment

ℹ Purpose & Capability

The code and SKILL.md implement PDF→PNG conversion (pdftoppm/poppler) and send images to a MiniMax Vision API for OCR — this aligns with the skill name/description. However, the registry metadata (which claimed no required env vars or binaries) is inconsistent with the SKILL.md and code that require an API key (MINIMAX_API_KEY) and rely on a system binary (pdftoppm).
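Because the registry metadata does not declare these requirements, a user may want to check them before running the script. The sketch below is an assumption about how such a preflight check could look; `findMissingRequirements` and `binaryOnPath` are hypothetical names and are not provided by the skill.

```javascript
const { spawnSync } = require("node:child_process");

// Return a list of human-readable problems; an empty list means ready.
// `hasBinary` is injected so the check can be exercised without poppler.
function findMissingRequirements(env, hasBinary) {
  const problems = [];
  if (!env.MINIMAX_API_KEY || env.MINIMAX_API_KEY.trim() === "") {
    problems.push("MINIMAX_API_KEY is not set");
  }
  if (!hasBinary("pdftoppm")) {
    problems.push("pdftoppm not found on PATH (install poppler)");
  }
  return problems;
}

// Probe a binary by trying to start it without a shell. Only a spawn
// failure (such as ENOENT) counts as "missing"; the exit code is ignored.
function binaryOnPath(name) {
  const result = spawnSync(name, ["-v"], { stdio: "ignore" });
  return !result.error;
}
```

A caller could run `findMissingRequirements(process.env, binaryOnPath)` and refuse to continue if the returned list is non-empty.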

ℹ Instruction Scope

Runtime instructions are focused: convert PDF to images, base64-encode images, and POST them to https://api.minimax.chat/v1/text/chatcompletion_v2 for OCR, then save Markdown. The instructions do send image data (embedded as data URLs) to an external API — expected for an OCR skill but important for privacy. SKILL.md also instructs installing npm packages (openai, pdf2image) that the shipped code does not use; this is inconsistent but not directly harmful.
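The flow described above (base64-encode a page image, embed it as a data URL, POST it) might look roughly like the following. Only the endpoint URL comes from the review; the OpenAI-style message shape, the `Bearer` authorization header, the prompt text, and the model placeholder are assumptions that should be checked against MiniMax's own API documentation. The helper names are hypothetical.

```javascript
const fs = require("node:fs");

// Endpoint as quoted in the review above.
const ENDPOINT = "https://api.minimax.chat/v1/text/chatcompletion_v2";

// Encode raw PNG bytes as a data URL.
function toDataUrl(pngBytes) {
  return `data:image/png;base64,${Buffer.from(pngBytes).toString("base64")}`;
}

// ASSUMPTION: request shape, auth header, and model name are illustrative
// placeholders, not confirmed details of the MiniMax API.
function buildOcrRequest(pngBytes, apiKey, model = "MODEL_NAME_PLACEHOLDER") {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Transcribe all text in this image as Markdown." },
            { type: "image_url", image_url: { url: toDataUrl(pngBytes) } },
          ],
        },
      ],
    }),
  };
}

// Send one page image and return the parsed JSON response.
async function ocrPage(pngPath, apiKey) {
  const response = await fetch(ENDPOINT, buildOcrRequest(fs.readFileSync(pngPath), apiKey));
  if (!response.ok) throw new Error(`OCR request failed: HTTP ${response.status}`);
  return response.json();
}
```

Note that the request body contains the entire page image, which is why the privacy caveat applies to every page processed.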

✓ Install Mechanism

No install spec (instruction-only) lowers risk. The only non-JS install guidance is to install poppler (provides pdftoppm) via brew — a standard system package. There are no remote download/extract steps or obscure URLs in the install path.

ℹ Credentials

The code requires a single credential (MINIMAX_API_KEY) and optionally OUTPUT_DIR — proportional for a remote OCR API. However, the registry metadata incorrectly lists no required env vars; this discrepancy between declared requirements and actual code is a red flag (could be sloppy packaging or mis-declared permissions). No other credentials are requested.

✓ Persistence & Privilege

The skill does not request persistent/always-on privileges and does not modify other skills or system-wide configs. It runs as a user-invoked Node script and only accesses the files you provide plus the environment API key.

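How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install minimax-pdf-ocr
Once installed, invoke the skill by name or trigger it with /minimax-pdf-ocr
Provide the inputs described in the skill's parameter documentation to get structured output.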

Version history

v1.0.0

MiniMax PDF OCR 1.0.0: Initial Release
- Recognizes text from PDFs and images using the MiniMax Vision API, supporting Chinese and English.
- Converts PDF files to images (using poppler) for OCR processing.
- Outputs recognition results as Markdown files with preserved formatting and structure.
- Provides both command-line interface and JavaScript API usage.
- Supports configurable output directories and environment-based API key management.

Metadata

Slug: minimax-pdf-ocr

Version: 1.0.0

License: MIT-0

Total installs: 1

Current installs: 1

Number of versions: 1

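FAQ

What is MiniMax PDF OCR?

MiniMax PDF OCR recognizes text in PDFs and images using the MiniMax Vision API. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 308 total downloads to date.

How do I install MiniMax PDF OCR?

Run the command "/install minimax-pdf-ocr" in the OpenClaw or Claude Code chat box to install it in one step. Note that, according to the security review on this page, the shipped script also requires the MINIMAX_API_KEY environment variable and the pdftoppm (poppler) binary.

Is MiniMax PDF OCR free?

Yes. The skill itself is free under the MIT-0 license and can be freely downloaded, installed, and used. It calls the MiniMax API, for which you must supply your own API key.

Which platforms does MiniMax PDF OCR support?

MiniMax PDF OCR is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed, provided pdftoppm (poppler) is available.

Who developed MiniMax PDF OCR?

It is developed and maintained by chongjie-ran (@chongjie-ran). The current version is v1.0.0.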