
Image To Markdown

Name: Image To Markdown
Author: tanis90

Version: v1.0.0 · License: MIT-0

cross-platform ✓ Security check passed

Total downloads: 136

Install in OpenClaw

/install image-to-markdown

Description

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima...

Usage (SKILL.md)

Image to Markdown - OCR Extract Text from Images

Extract text from images to Markdown using MinerU Open API. No API key required.

Quick Start

# Extract text from a local image
mineru-open-api flash-extract screenshot.png

# Extract text from an image URL (no download needed)
mineru-open-api flash-extract https://example.com/image.png

# Save to file
mineru-open-api flash-extract photo.jpg -o ./output/

# Specify language for better accuracy
mineru-open-api flash-extract scan.jpg --language en
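
The commands above combine in a predictable way, so a small wrapper can make the optional flags explicit. The following is a minimal sketch, not part of the skill: the function name ocr_image and the DRY_RUN variable are hypothetical, and only the -o and --language flags shown above are used.

```shell
# Hypothetical wrapper around the Quick Start commands.
# Usage: ocr_image <file-or-url> [language] [output-dir]
# Only -o and --language are taken from the documentation above.
ocr_image() {
  if [ "$#" -lt 1 ]; then
    echo "usage: ocr_image <file-or-url> [language] [output-dir]" >&2
    return 2
  fi
  oi_src=$1
  oi_lang=${2:-}
  oi_out=${3:-}

  # Build the argument list step by step so empty options are skipped.
  set -- mineru-open-api flash-extract "$oi_src"
  if [ -n "$oi_lang" ]; then set -- "$@" --language "$oi_lang"; fi
  if [ -n "$oi_out" ]; then set -- "$@" -o "$oi_out"; fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it (nothing is uploaded).
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With DRY_RUN=1 set, the function prints the command instead of running it, which helps confirm what would be sent before any image leaves the machine.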

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

OCR text extraction from PNG, JPG, JPEG, WebP, BMP, TIFF
Supports both local files and URLs directly
Language hint with --language (default: ch, use en for English)
No API key, no signup, no authentication
Max 10MB per image
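
The format and size limits above can be checked locally before a file is uploaded. This is a minimal sketch for local files only (not URLs), assuming that "10MB" means 10 × 1024 × 1024 bytes and that the format can be judged from the file extension; check_image is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical pre-flight check; not part of mineru-open-api.
# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
check_image() {
  ci_file=$1
  ci_max=$((10 * 1024 * 1024))

  # Compare the extension case-insensitively against the listed formats.
  ci_lower=$(printf '%s' "$ci_file" | tr '[:upper:]' '[:lower:]')
  case "$ci_lower" in
    *.png|*.jpg|*.jpeg|*.webp|*.bmp|*.tiff) ;;
    *) echo "unsupported format: $ci_file" >&2; return 1 ;;
  esac

  if [ ! -f "$ci_file" ]; then
    echo "not a regular file: $ci_file" >&2
    return 1
  fi

  # wc -c may pad its output with spaces on some systems, so strip them.
  ci_size=$(wc -c < "$ci_file" | tr -d '[:space:]')
  if [ "$ci_size" -gt "$ci_max" ]; then
    echo "file exceeds 10MB: $ci_file ($ci_size bytes)" >&2
    return 1
  fi
}
```

A caller could run check_image photo.jpg && mineru-open-api flash-extract photo.jpg so that unsupported or oversized files are rejected before any upload is attempted.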

When to Use

User asks to "read", "extract", or "OCR" an image
User shares a screenshot and asks what it says
User wants text from a photo of a document or whiteboard
User needs image content converted to Markdown

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Privacy

flash-extract uploads the image to MinerU's cloud API for processing and returns the result. No account or API key is required.
Images are processed in real-time and are not stored after extraction.
For details, see https://mineru.net

Notes

Output is Markdown text extracted via OCR
For higher precision or batch processing, use mineru-open-api extract (requires auth via mineru-open-api auth)
If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
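
Because everything here depends on the CLI being on PATH, checking for it up front gives a clearer error than a failed command. This is a minimal sketch; require_cli is a hypothetical helper name, and its message only repeats the download URL from the note above.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "$1 not found on PATH; see https://mineru.net/ecosystem?tab=cli" >&2
  return 1
}
```

For example, require_cli mineru-open-api && mineru-open-api flash-extract screenshot.png runs the extraction only when the binary is actually available.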

Security recommendations

This skill appears to do what it claims (OCR -> Markdown), but it depends on a third-party CLI (mineru-open-api) that will read images you give it and upload them to MinerU's cloud. Before installing or using it:

1) Do not send sensitive or private images until you verify MinerU's privacy/storage policy and the trustworthiness of the npm/go package or the downloadable binary.
2) Vet the package source: check the npm package owner, the GitHub repo (opendatalab/MinerU-Ecosystem), package contents, and recent releases for suspicious code.
3) Prefer testing with non-sensitive images first.
4) If you require guaranteed local-only OCR, use a well-known local OCR tool instead.
5) Note that the SKILL.md's claim that images are not stored is unverifiable from the skill alone; treat it as a claim, not a guarantee.

Functional analysis

Type: OpenClaw Skill
Name: image-to-markdown
Version: 1.0.0

The skill provides OCR capabilities by wrapping the 'mineru-open-api' CLI to convert images to Markdown. While it uploads image data to a third-party cloud API (mineru.net) for processing, this behavior is explicitly disclosed in the documentation (SKILL.md) and is necessary for the stated functionality. No evidence of malicious intent, data exfiltration beyond the intended OCR process, or prompt injection was found.

Capability assessment

✓ Purpose & Capability

Name/description match the runtime instructions: the SKILL.md tells the agent to run mineru-open-api flash-extract on local files or URLs. Required binary (mineru-open-api) and install options (npm/uv/go) are proportionate to an OCR/CLI wrapper skill.

ℹ Instruction Scope

Instructions are narrowly scoped to running mineru-open-api for OCR. They explicitly allow uploading images (local file or URL) to MinerU's cloud API. The doc asserts 'no account / no API key' and 'images are not stored after extraction' — those are privacy-relevant claims the agent will follow but cannot verify. Also the skill requires you to pass image paths/URLs, which means the binary will read local files and send them to a remote endpoint.

ℹ Install Mechanism

Install options are via npm, uv, or go install (public package names / GitHub path are provided). These are standard but install arbitrary third‑party code on the host. The SKILL.md also directs users to mineru.net for a manual download if installs fail — fetching a binary from an external site has higher risk and should be verified.

✓ Credentials

No environment variables, credentials, or config paths are requested. The skill does not ask for unrelated secrets or system access beyond what a CLI OCR tool needs.

✓ Persistence & Privilege

always is false and the skill is user-invocable with normal autonomous invocation allowed. The skill does not request permanent presence or modify other skills/configurations.

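How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install image-to-markdown
Once installation finishes, invoke the Skill by name or trigger it with /image-to-markdown
Provide the inputs described in the Skill's parameter notes to get structured output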

Version history

v1.0.0

- Initial release of Image to Markdown skill.
- Extracts text from images (PNG, JPG, WebP, BMP, TIFF) to Markdown using MinerU Open API.
- Supports both local image files and direct URLs.
- No API key, signup, or authentication required.
- Allows language hints for improved OCR accuracy.
- Designed for reading and converting text from screenshots, scanned pages, documents, and more.

Metadata

Slug image-to-markdown

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Version count 1

FAQ