Img2md

Name: Img2md
Author: tanis90

by tanis90 · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

156 Downloads

Install in OpenClaw

/install img2md

Description

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima...

README (SKILL.md)

# Img2MD - Quick Image OCR to Markdown

Extract text from images to Markdown using MinerU Open API. No API key required.

## Quick Start

```bash
# Extract text from a local image
mineru-open-api flash-extract screenshot.png

# Extract text from an image URL
mineru-open-api flash-extract https://example.com/image.png

# Write the result to an output directory
mineru-open-api flash-extract photo.jpg -o ./output/

# Hint that the text is in English
mineru-open-api flash-extract scan.jpg --language en
```
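The flags shown in the Quick Start can be combined behind a small wrapper. The sketch below is illustrative only: `img2md` and the `DRY_RUN` variable are hypothetical names that are not part of `mineru-open-api`, and it assumes `--language` and `-o` may be passed together in one call. With `DRY_RUN=1` it prints the command instead of running it, which lets you review what would be uploaded.

```shell
# Hypothetical wrapper; not part of mineru-open-api.
# Usage: img2md <image-or-url> [language] [output-dir]
# Assumption: --language and -o can be combined in a single invocation.
img2md() {
  img2md_input=$1
  img2md_lang=${2:-}
  img2md_outdir=${3:-}

  # Build the argument list from the flags documented in the Quick Start.
  set -- mineru-open-api flash-extract "$img2md_input"
  if [ -n "$img2md_lang" ]; then set -- "$@" --language "$img2md_lang"; fi
  if [ -n "$img2md_outdir" ]; then set -- "$@" -o "$img2md_outdir"; fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command without executing it (nothing is uploaded).
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 img2md scan.jpg en ./output/` prints the full command line without contacting the MinerU service.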

## Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

## Capabilities

- OCR text extraction from PNG, JPG, JPEG, WebP, BMP, TIFF
- Supports both local files and URLs directly
- Language hint with `--language` (default: `ch`, use `en` for English)
- No API key, no signup, no authentication
- Max 10MB per image
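
The format and size limits above can be checked locally before anything is uploaded. The following is a minimal sketch under stated assumptions: the helper names (`is_supported_image`, `within_size_limit`, `preflight`) are hypothetical and not part of `mineru-open-api`, "10MB" is read as 10 × 1024 × 1024 bytes, and only the extensions listed above are accepted. It applies to local files only, not URLs.

```shell
# Hypothetical pre-flight helpers; they are not part of mineru-open-api.

# Succeeds if the file name ends in one of the extensions listed above.
is_supported_image() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.png|*.jpg|*.jpeg|*.webp|*.bmp|*.tiff) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the file exists and is at most 10 MB.
# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
within_size_limit() {
  [ -f "$1" ] || return 1
  size_bytes=$(wc -c < "$1" | tr -d '[:space:]')
  [ "$size_bytes" -le $((10 * 1024 * 1024)) ]
}

# Runs both checks and reports the first failure on stderr.
preflight() {
  is_supported_image "$1" || { echo "unsupported extension: $1" >&2; return 1; }
  within_size_limit "$1" || { echo "missing or larger than 10 MB: $1" >&2; return 1; }
}
```

A typical call would be `preflight photo.jpg && mineru-open-api flash-extract photo.jpg`, so the upload only happens when both checks pass.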

## When to Use

- User asks to "read", "extract", or "OCR" an image
- User shares a screenshot and asks what it says
- User wants text from a photo of a document or whiteboard
- User needs image content converted to Markdown

## CLI Reference

Run `mineru-open-api flash-extract --help` for all available options.

## Data Privacy

- `flash-extract` uploads the image to MinerU's cloud API for processing and returns the result. No account or API key is required.
- Images are processed in real-time and are not stored after extraction.
- For details, see https://mineru.net

## Notes

- Output is Markdown text extracted via OCR
- For higher precision or batch processing, use `mineru-open-api extract` (requires auth via `mineru-open-api auth`)
- If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli

Usage Guidance

This skill is internally consistent: it runs a third-party CLI (mineru-open-api) to OCR images and uploads images to MinerU's cloud for processing. Before installing or using it, consider: (1) Privacy: images (including screenshots or photos with sensitive content) will be transmitted to an external service; avoid sending sensitive images unless you trust MinerU's policy. (2) Trust in the CLI package: review the npm package and the GitHub repo (the go install target), or otherwise inspect the installer before installing, to ensure it is legitimate. (3) Runtime autonomy: the skill can be invoked by the agent by default; if you want to prevent unexpected uploads, restrict agent autonomy or only invoke the skill manually. (4) Credentials: for batch or higher-precision workflows the SKILL.md mentions that auth is available; treat any credentials you supply to that CLI as sensitive. If you want more assurance, request the upstream package source code or a checksum for the distributed binary before installing.
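
If the publisher does provide a checksum, the comparison could look roughly like the sketch below. `verify_sha256` is a hypothetical helper, the expected digest has to come from the upstream source (this page does not state that one is published), and the sketch assumes either `sha256sum` or `shasum` is available on the system.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with a published value.
# Usage: verify_sha256 <file> <expected-lowercase-hex-digest>
verify_sha256() {
  vs_file=$1
  vs_expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    vs_actual=$(sha256sum "$vs_file" | awk '{print $1}')
  else
    # Fallback for systems that ship shasum instead (for example macOS).
    vs_actual=$(shasum -a 256 "$vs_file" | awk '{print $1}')
  fi
  # Succeeds only when the computed digest matches the expected one exactly.
  [ "$vs_actual" = "$vs_expected" ]
}
```

The function returns a non-zero status on any mismatch, so it can gate the install step, for example `verify_sha256 ./downloaded-binary "$EXPECTED_DIGEST" && echo "checksum OK"`, where both the file name and `EXPECTED_DIGEST` are placeholders.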

Capability Analysis

Type: OpenClaw Skill
Name: img2md
Version: 1.0.0

The img2md skill provides OCR functionality by wrapping the 'mineru-open-api' CLI tool to convert images to Markdown. While it uploads image data to a third-party cloud service (mineru.net) for processing, this behavior is explicitly documented in SKILL.md and is necessary for the tool's stated purpose. There are no signs of malicious intent, such as credential theft, unauthorized execution, or prompt injection.

Capability Assessment

✓ Purpose & Capability

The name/description (image → Markdown OCR) matches the declared binary dependency (mineru-open-api) and the SKILL.md commands (mineru-open-api flash-extract). No unrelated credentials, tools, or config paths are requested.

ℹ Instruction Scope

SKILL.md only instructs using the mineru-open-api CLI on local files or URLs and to return OCR output in the user's language. It explicitly states images are uploaded to MinerU's cloud for processing, which is consistent with the stated purpose but does mean user images are transmitted off-host.

ℹ Install Mechanism

Installation options are npm/uv/go installs of a mineru-open-api CLI or manual download from mineru.net. These are common distribution channels; no obscure shorteners or raw binary downloads are used. Installing will place a third-party CLI on the system and allow execution of that binary—verify trust in the package/source before installing.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The SKILL.md mentions optional auth for advanced usage but does not require secrets for the basic flash-extract flow, which is proportionate to its function.

✓ Persistence & Privilege

The always flag is set to false, and there is no attempt to modify system-wide or agent-wide configuration. The skill does require installing a CLI binary, but its metadata does not demand persistent elevated privileges.

How to Use

1. Make sure OpenClaw is installed (local or Docker)
2. Run the install command in chat: /install img2md
3. After installation, invoke the skill by name or use /img2md
4. Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

- Initial release of img2md.
- Extracts text from images (PNG, JPG, WebP, BMP, TIFF) to Markdown using OCR.
- Supports both local image files and image URLs.
- No API key or authentication required; images up to 10MB supported.
- Includes command-line usage examples and installation options (npm, uv, go).
- Applies user's language automatically in output.

Metadata

Slug img2md

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Img2md?

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima... It is an AI Agent Skill for Claude Code / OpenClaw, with 156 downloads so far.

How do I install Img2md?

Run "/install img2md" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Img2md free?

Yes, Img2md is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Img2md support?

Img2md is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Img2md?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.0.
