Img2md

Name: Img2md
Author: tanis90

By tanis90 · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 156

Install in OpenClaw

/install img2md

Description

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima...

Usage (SKILL.md)

# Img2MD - Quick Image OCR to Markdown

Extract text from images to Markdown using MinerU Open API. No API key required.

## Quick Start

```shell
# Extract text from a local image
mineru-open-api flash-extract screenshot.png

# Extract text from an image URL
mineru-open-api flash-extract https://example.com/image.png

# Write the result to ./output/
mineru-open-api flash-extract photo.jpg -o ./output/

# Pass a language hint (English)
mineru-open-api flash-extract scan.jpg --language en
```

## Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

## Capabilities

- OCR text extraction from PNG, JPG, JPEG, WebP, BMP, TIFF
- Supports both local files and URLs directly
- Language hint with `--language` (default: `ch`, use `en` for English)
- No API key, no signup, no authentication
- Max 10MB per image
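
The format list and size limit above can be checked locally before anything is uploaded. The following is a minimal sketch, not part of `mineru-open-api`: the function name `img2md_precheck` is hypothetical, and it assumes that "10MB" means 10 * 1024 * 1024 bytes and that the format is judged by file extension alone.

```shell
# Hypothetical pre-flight check (not provided by mineru-open-api).
# Assumption: the documented "10MB" limit means 10 * 1024 * 1024 bytes.
MAX_BYTES=$((10 * 1024 * 1024))

img2md_precheck() {
  local file="$1"
  local ext size

  # Lower-case the extension so that "B.JPG" is treated like "b.jpg".
  ext="$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')"

  # Accept only the formats listed under Capabilities.
  case "$ext" in
    png|jpg|jpeg|webp|bmp|tiff) ;;
    *) echo "unsupported extension: $ext" >&2; return 1 ;;
  esac

  # The path must point at an existing regular file.
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # Reject files above the assumed size limit.
  size="$(wc -c < "$file" | tr -d ' ')"
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "file too large: $size bytes" >&2
    return 1
  fi
  return 0
}
```

A caller could chain it in front of the documented command, for example `img2md_precheck photo.jpg && mineru-open-api flash-extract photo.jpg`. The check applies to local files only; URLs would need a different test.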

## When to Use

- User asks to "read", "extract", or "OCR" an image
- User shares a screenshot and asks what it says
- User wants text from a photo of a document or whiteboard
- User needs image content converted to Markdown

## CLI Reference

Run `mineru-open-api flash-extract --help` for all available options.

## Data Privacy

- `flash-extract` uploads the image to MinerU's cloud API for processing and returns the result. No account or API key is required.
- Images are processed in real-time and are not stored after extraction.
- For details, see https://mineru.net
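
Because every `flash-extract` call sends the image off-host, one option is to ask for explicit confirmation first. The sketch below is illustrative only: `confirm_upload` is a hypothetical helper, not a feature of the CLI, and it treats anything other than an explicit yes (including no input at all) as a refusal.

```shell
# Hypothetical helper (not provided by mineru-open-api): ask before uploading.
confirm_upload() {
  local target="$1" answer

  # Prompt on stderr so that stdout stays clean for the OCR result.
  printf 'Upload %s to the MinerU cloud API? [y/N] ' "$target" >&2

  # No input (for example, a non-interactive run) counts as "no".
  read -r answer || return 1

  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires the CLI to be installed):
#   confirm_upload screenshot.png && mineru-open-api flash-extract screenshot.png
```

Defaulting to "no" keeps an unattended run from uploading anything by accident.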

## Notes

- Output is Markdown text extracted via OCR
- For higher precision or batch processing, use `mineru-open-api extract` (requires auth via `mineru-open-api auth`)
- If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
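
A script that depends on the CLI can fail early with a pointer to the download page when the binary is missing. This is a small sketch under the assumption that the installed executable is named `mineru-open-api`, as in the commands above; `require_cli` is a hypothetical name.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cli() {
  local cli="${1:-mineru-open-api}"

  if command -v "$cli" >/dev/null 2>&1; then
    return 0
  fi

  # Point at the manual download page mentioned in the notes above.
  echo "$cli not found; see https://mineru.net/ecosystem?tab=cli" >&2
  return 1
}

# Usage:
#   require_cli && mineru-open-api flash-extract --help
```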

Security Recommendations

This skill is internally consistent: it runs a third-party CLI (mineru-open-api) to OCR images and uploads images to MinerU's cloud for processing. Before installing or using it, consider: (1) Privacy: images (including screenshots or photos with sensitive content) will be transmitted to an external service; avoid sending sensitive images unless you trust MinerU's policy. (2) Trust in the CLI package: review the npm package and the GitHub repo (the go install target), or manually inspect the installer before installing, to ensure it is legitimate. (3) Runtime autonomy: the skill can be invoked by the agent by default; if you want to prevent unexpected uploads, restrict agent autonomy or only invoke the skill manually. (4) Credentials: for batch or higher-precision workflows, the SKILL.md mentions that auth is available; treat any credentials you supply to that CLI as sensitive. If you want more assurance, request the upstream package source code or a checksum for the distributed binary before installing.
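
If a checksum for the distributed binary is available, it can be compared against the downloaded file before installing. The sketch below assumes a SHA-256 digest and the presence of either `sha256sum` or `shasum`; the source does not say that MinerU publishes checksums, and `verify_sha256` is a hypothetical helper.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Assumption: the publisher supplies a SHA-256 checksum (not confirmed by the source).
verify_sha256() {
  local file="$1" expected="$2" actual

  if command -v sha256sum >/dev/null 2>&1; then
    actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  else
    actual="$(shasum -a 256 "$file" 2>/dev/null | awk '{print $1}')"
  fi

  # An empty digest (missing file or tool failure) never counts as a match.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage, with a placeholder file name and digest:
#   verify_sha256 ./downloaded-binary "<expected sha256>" || echo "checksum mismatch" >&2
```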

Functional Analysis

Type: OpenClaw Skill
Name: img2md
Version: 1.0.0

The img2md skill provides OCR functionality by wrapping the 'mineru-open-api' CLI tool to convert images to Markdown. While it uploads image data to a third-party cloud service (mineru.net) for processing, this behavior is explicitly documented in SKILL.md and is necessary for the tool's stated purpose. There are no signs of malicious intent, such as credential theft, unauthorized execution, or prompt injection.

Capability Assessment

✓ Purpose & Capability

The name/description (image → Markdown OCR) matches the declared binary dependency (mineru-open-api) and the SKILL.md commands (mineru-open-api flash-extract). No unrelated credentials, tools, or config paths are requested.

ℹ Instruction Scope

SKILL.md only instructs using the mineru-open-api CLI on local files or URLs and to return OCR output in the user's language. It explicitly states images are uploaded to MinerU's cloud for processing, which is consistent with the stated purpose but does mean user images are transmitted off-host.

ℹ Install Mechanism

Installation options are npm/uv/go installs of a mineru-open-api CLI or manual download from mineru.net. These are common distribution channels; no obscure shorteners or raw binary downloads are used. Installing will place a third-party CLI on the system and allow execution of that binary—verify trust in the package/source before installing.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The SKILL.md mentions optional auth for advanced usage but does not require secrets for the basic flash-extract flow, which is proportionate to its function.

✓ Persistence & Privilege

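The `always` flag is false and there is no attempt to modify system-wide or agent-wide config. The skill does require installing a CLI binary but does not demand persistent elevated privileges in its metadata.

How to Use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install img2md
3. Once installed, call the Skill by name or trigger it with /img2md.
4. Provide the required inputs according to the Skill's parameter description to get structured output.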

Version History

v1.0.0

- Initial release of img2md.
- Extracts text from images (PNG, JPG, WebP, BMP, TIFF) to Markdown using OCR.
- Supports both local image files and image URLs.
- No API key or authentication required; images up to 10MB supported.
- Includes command-line usage examples and installation options (npm, uv, go).
- Applies the user's language automatically in output.

Metadata

Slug: img2md

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Version count: 1

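FAQ

What is Img2md?

Image to Markdown - extract text from images (PNG, JPG, WebP) to Markdown with OCR. Use when reading text from screenshots, photos, scanned pages, or any ima... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 156 total downloads to date.

How do I install Img2md?

Run the command "/install img2md" in the OpenClaw or Claude Code chat box for one-step installation. The skill itself needs no further configuration, though the `mineru-open-api` CLI it wraps must be installed on the system.

Is Img2md free?

Yes. Img2md is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Img2md support?

Img2md is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Img2md?

It is developed and maintained by tanis90 (@tanis90); the current version is v1.0.0.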