multi-image-to-text
by AsianGiantDuck · GitHub · v1.0.0 · MIT-0
Downloads: 125
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install test20260402
Description
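Batch-recognizes the text in images and outputs it as a structured document, segmented by image. Use it when the user needs to extract text from multiple images, organize the text content of images, or convert image text into an editable document.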
Usage Guidance
This skill appears coherent and low-risk: it does OCR on images via the agent's read_image tool and outputs Markdown. Before installing or using it, confirm how your platform implements the read_image tool (where images are sent/processed and whether they are retained), avoid uploading highly sensitive images unless you trust the platform's data handling, and be aware that the skill can create a .md file in the agent's working directory when asked. If you need stronger guarantees about deletion or local-only processing, verify those properties with the platform or prefer local/offline OCR tools.
Capability Analysis
Type: OpenClaw Skill
Name: test20260402
Version: 1.0.0
The skill bundle is a standard image-to-text (OCR) utility designed to batch process images and generate structured Markdown documents. It uses the expected 'read_image' tool and follows logical steps for data processing and formatting without any signs of data exfiltration, malicious execution, or harmful prompt injection (SKILL.md and references/output-format.md).
Capability Assessment
Purpose & Capability
Name/description (batch image→text extraction) align with the SKILL.md: it instructs the agent to accept images, call a read_image OCR tool per image, and produce structured Markdown. No unrelated binaries, env vars, or config paths are requested.
Instruction Scope
Instructions are narrowly scoped to: receive images, call read_image for each image, format and output a Markdown document, and optionally write a .md file under ./ with a timestamped name. The doc claims images are used only in the current session and not stored — this is a behavioral claim but cannot be enforced by the instruction file alone; actual persistence/processing depends on the platform/tool implementing read_image and the agent runtime.
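The optional file output described above can be pictured with a minimal Python sketch. The skill itself ships no code, so everything here is an assumption for illustration: the helper names `timestamped_md_path` and `save_markdown`, the `ocr-output` prefix, and the timestamp format are hypothetical, and the actual naming scheme is decided by the skill's instructions and the agent runtime.

```python
from datetime import datetime
from pathlib import Path


def timestamped_md_path(directory=".", prefix="ocr-output", now=None):
    """Build a path such as ./ocr-output-20260402-153000.md.

    The prefix and timestamp layout are hypothetical, not taken from the skill.
    """
    now = now or datetime.now()
    return Path(directory) / f"{prefix}-{now.strftime('%Y%m%d-%H%M%S')}.md"


def save_markdown(markdown, directory=".", now=None):
    """Write the Markdown text to a timestamped file and return its path."""
    path = timestamped_md_path(directory, now=now)
    path.write_text(markdown, encoding="utf-8")
    return path
```

Writing only under the working directory, with a name derived from the current time, keeps each run's output separate and avoids overwriting an earlier file.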
Install Mechanism
Instruction-only skill with no install spec and no code files. No downloads or package installs are performed by the skill itself.
Credentials
The skill declares no environment variables or credentials. Its functionality (OCR on user-supplied images) does not require additional secrets, so requested access is proportionate.
Persistence & Privilege
always is false, the skill is user-invocable, and it does not request modification of other skills or global agent settings. It may write a .md file to the local working directory when asked, which is consistent with its stated output options.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install test20260402
- After installation, invoke the skill by name or use /test20260402
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of image-text-extractor skill:
- Supports batch OCR recognition of text from multiple uploaded images (PNG, JPG, JPEG, GIF, WebP).
- Maintains original text structure (paragraphs, titles, lists) for each image.
- Outputs results in a structured Markdown document with clear image-by-image segmentation.
- Handles recognition failures gracefully, continues processing other images.
- Designed with privacy protection; images are used only in-session and not stored.
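The batch behaviour listed in these release notes (one section per image, failures reported without stopping the run) could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the skill's implementation: the skill is instruction-only, `read_image` is passed in as a caller-supplied callable with a hypothetical `path -> str` signature, and the heading layout is invented here because references/output-format.md is not reproduced on this page.

```python
from pathlib import Path

# File types named in the release notes.
SUPPORTED = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def images_to_markdown(image_paths, read_image):
    """Run OCR per image and return one Markdown document, one section per image.

    `read_image` is a hypothetical stand-in for the platform's OCR tool.
    """
    sections = ["# Extracted Text"]
    for index, image_path in enumerate(image_paths, start=1):
        name = Path(image_path).name
        sections.append(f"## Image {index}: {name}")
        if Path(image_path).suffix.lower() not in SUPPORTED:
            sections.append("_Skipped: unsupported file type._")
            continue
        try:
            text = read_image(image_path).strip()
        except Exception as exc:  # one bad image must not stop the batch
            sections.append(f"_Recognition failed: {exc}_")
            continue
        sections.append(text or "_No text detected._")
    return "\n\n".join(sections) + "\n"
```

Recording a failure inside that image's own section keeps the numbering aligned with the upload order, so the reader can see which image needs to be retried.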
Frequently Asked Questions
What is multi-image-to-text?
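multi-image-to-text batch-recognizes the text in images and outputs it as a structured document, segmented by image. Use it when you need to extract text from multiple images, organize the text content of images, or convert image text into an editable document. It is an AI Agent Skill for Claude Code / OpenClaw, with 125 downloads so far.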
How do I install multi-image-to-text?
Run "/install test20260402" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is multi-image-to-text free?
Yes, multi-image-to-text is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does multi-image-to-text support?
multi-image-to-text is cross-platform: it runs anywhere OpenClaw / Claude Code is available.
Who created multi-image-to-text?
It is built and maintained by AsianGiantDuck (@asiangiantduck); the current version is v1.0.0.