
Image To Markdown

Name: Image To Markdown
Author: tanis90

by tanis90 · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 136

Install in OpenClaw

/install image-to-markdown

Description

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima...

README (SKILL.md)

Image to Markdown - OCR Extract Text from Images

Extract text from images to Markdown using MinerU Open API. No API key required.

Quick Start

# Extract text from a local image
mineru-open-api flash-extract screenshot.png

# Extract text from an image URL (no download needed)
mineru-open-api flash-extract https://example.com/image.png

# Save to file
mineru-open-api flash-extract photo.jpg -o ./output/

# Specify language for better accuracy
mineru-open-api flash-extract scan.jpg --language en

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

OCR text extraction from PNG, JPG, JPEG, WebP, BMP, TIFF
Supports both local files and URLs directly
Language hint with --language (default: ch, use en for English)
No API key, no signup, no authentication
Max 10MB per image
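The limits above can be checked locally before anything is uploaded. The following is a minimal sketch, not part of the skill: `preflight` and `MAX_BYTES` are hypothetical names, and it assumes "10MB" means 10 * 1024 * 1024 bytes, which the source does not specify.

```shell
# Hypothetical helper: check an input against the listed limits before
# passing it to mineru-open-api flash-extract.
# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
MAX_BYTES=$((10 * 1024 * 1024))

preflight() (
  input=$1
  case $input in
    http://*|https://*)
      # URLs are passed straight through; their size cannot be checked locally.
      echo "url"
      exit 0 ;;
  esac

  if [ ! -f "$input" ]; then
    echo "not a file: $input" >&2
    exit 1
  fi

  # Compare the extension case-insensitively against the supported formats.
  ext=$(printf '%s' "${input##*.}" | tr '[:upper:]' '[:lower:]')
  case $ext in
    png|jpg|jpeg|webp|bmp|tiff) ;;
    *) echo "unsupported format: $ext" >&2; exit 1 ;;
  esac

  # wc -c reports the size in bytes; tr strips padding some platforms add.
  size=$(wc -c < "$input" | tr -d ' ')
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "file too large: $size bytes" >&2
    exit 1
  fi
  echo "file"
)
```

A call such as `preflight scan.jpg && mineru-open-api flash-extract scan.jpg --language en` would then run the extraction only when the local checks pass.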

When to Use

User asks to "read", "extract", or "OCR" an image
User shares a screenshot and asks what it says
User wants text from a photo of a document or whiteboard
User needs image content converted to Markdown

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Privacy

flash-extract uploads the image to MinerU's cloud API for processing and returns the result. No account or API key is required.
Images are processed in real-time and are not stored after extraction.
For details, see https://mineru.net
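Because every call uploads the image, one option is to require an explicit opt-in before the CLI runs. The sketch below is illustrative only: `guarded_extract`, `MINERU_UPLOAD_OK`, and `MINERU_CLI` are hypothetical names introduced here and are not features of mineru-open-api.

```shell
# Hypothetical opt-in guard around the upload step.
# MINERU_UPLOAD_OK and MINERU_CLI are illustrative variable names,
# not options of the mineru-open-api CLI.
guarded_extract() (
  if [ "${MINERU_UPLOAD_OK:-0}" != "1" ]; then
    echo "refusing to upload: set MINERU_UPLOAD_OK=1 to confirm" >&2
    exit 2
  fi
  # MINERU_CLI can name another command (for example echo) for a dry run.
  "${MINERU_CLI:-mineru-open-api}" flash-extract "$@"
)
```

With `MINERU_CLI=echo` and `MINERU_UPLOAD_OK=1` set, the function prints the command it would have run instead of uploading anything, which is a cheap way to review arguments first.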

Notes

Output is Markdown text extracted via OCR
For higher precision or batch processing, use mineru-open-api extract (requires auth via mineru-open-api auth)
If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli

Usage Guidance

This skill appears to do what it claims (OCR -> Markdown), but it depends on a third-party CLI (mineru-open-api) that will read images you give it and upload them to MinerU's cloud. Before installing or using it:

1) Do not send sensitive or private images until you verify MinerU's privacy/storage policy and the trustworthiness of the npm/go package or the downloadable binary.
2) Vet the package source: check the npm package owner, the GitHub repo (opendatalab/MinerU-Ecosystem), package contents, and recent releases for suspicious code.
3) Prefer testing with non-sensitive images first.
4) If you require guaranteed local-only OCR, use a well-known local OCR tool instead.
5) Note that the SKILL.md's claim that images are not stored is unverifiable from the skill alone; treat it as a claim, not a guarantee.
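As a small first step when vetting the CLI, it helps to confirm which binary would actually run. A minimal sketch, assuming a POSIX shell; `locate_cli` is a hypothetical helper name:

```shell
# Report where a command resolves on PATH, so you can see which binary
# (npm, uv, go, or manual download) would actually be executed.
locate_cli() (
  name=${1:-mineru-open-api}
  if path=$(command -v "$name" 2>/dev/null); then
    echo "$name -> $path"
  else
    echo "$name not found on PATH" >&2
    exit 1
  fi
)
```

Running `locate_cli` with no argument checks for mineru-open-api; the reported path indicates which install method put it there.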

Capability Analysis

Type: OpenClaw Skill
Name: image-to-markdown
Version: 1.0.0

The skill provides OCR capabilities by wrapping the 'mineru-open-api' CLI to convert images to Markdown. While it uploads image data to a third-party cloud API (mineru.net) for processing, this behavior is explicitly disclosed in the documentation (SKILL.md) and is necessary for the stated functionality. No evidence of malicious intent, data exfiltration beyond the intended OCR process, or prompt injection was found.

Capability Assessment

✓ Purpose & Capability

Name/description match the runtime instructions: the SKILL.md tells the agent to run mineru-open-api flash-extract on local files or URLs. Required binary (mineru-open-api) and install options (npm/uv/go) are proportionate to an OCR/CLI wrapper skill.

ℹ Instruction Scope

Instructions are narrowly scoped to running mineru-open-api for OCR. They explicitly allow uploading images (local file or URL) to MinerU's cloud API. The doc asserts 'no account / no API key' and 'images are not stored after extraction' — those are privacy-relevant claims the agent will follow but cannot verify. Also the skill requires you to pass image paths/URLs, which means the binary will read local files and send them to a remote endpoint.

ℹ Install Mechanism

Install options are via npm, uv, or go install (public package names / GitHub path are provided). These are standard but install arbitrary third‑party code on the host. The SKILL.md also directs users to mineru.net for a manual download if installs fail — fetching a binary from an external site has higher risk and should be verified.
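One common way to verify a manually downloaded binary is to compare its SHA-256 digest with a value obtained from a trusted channel. The sketch below assumes you have such a digest; the source does not say whether MinerU publishes one. `sha256_of` and `verify_sha256` are hypothetical helper names.

```shell
# Compare a file's SHA-256 digest with an expected value.
# Assumption: a trusted digest is available; the source does not say
# whether MinerU publishes one.
sha256_of() (
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  else
    # macOS ships shasum instead of sha256sum.
    shasum -a 256 "$1" | cut -d ' ' -f 1
  fi
)

verify_sha256() (
  actual=$(sha256_of "$1")
  # Normalize the expected digest to lowercase before comparing.
  expected=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  if [ "$actual" = "$expected" ]; then
    echo "checksum ok"
  else
    echo "checksum mismatch: got $actual" >&2
    exit 1
  fi
)
```

The function is called as `verify_sha256 <downloaded-file> <expected-digest>`, and the binary should only be made executable if it reports a match.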

✓ Credentials

No environment variables, credentials, or config paths are requested. The skill does not ask for unrelated secrets or system access beyond what a CLI OCR tool needs.

✓ Persistence & Privilege

The skill's always setting is false, and it is user-invocable with normal autonomous invocation allowed. The skill does not request permanent presence or modify other skills/configurations.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install image-to-markdown
After installation, invoke the skill by name or use /image-to-markdown
Provide an image path or URL; the skill returns the extracted text as Markdown

Version History

v1.0.0

- Initial release of Image to Markdown skill.
- Extracts text from images (PNG, JPG, WebP, BMP, TIFF) to Markdown using MinerU Open API.
- Supports both local image files and direct URLs.
- No API key, signup, or authentication required.
- Allows language hints for improved OCR accuracy.
- Designed for reading and converting text from screenshots, scanned pages, documents, and more.

Metadata

Slug image-to-markdown

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Image To Markdown?

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima... It is an AI Agent Skill for Claude Code / OpenClaw, with 136 downloads so far.

How do I install Image To Markdown?

Run "/install image-to-markdown" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Image To Markdown free?

Yes, Image To Markdown is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Image To Markdown support?

Image To Markdown is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Image To Markdown?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.0.
