image-reader
/install image-reader
Image Reader Skill
Image recognition and understanding tool that leverages Doubao multimodal models to analyze image content.
Features
- Text Extraction (OCR): Extract text from images, suitable for documents, screenshots, posters, menus, etc.
- Image Description: Generate detailed descriptions of images, suitable for photos, illustrations, memes, UI screens, etc.
- General Analysis: Automatically choose the best analysis strategy based on the image type.
API Configuration
| Item | Value |
|---|---|
| API Endpoint | https://ark.cn-beijing.volces.com/api/coding/v3 |
| Model | doubao-seed-2.0-pro |
| Authentication | API Key (configured in config.yaml) |
Usage
Command Line
# General analysis
python image_reader.py /path/to/image.png
# Extract text (OCR)
python image_reader.py /path/to/image.png -p "Extract all text from the image"
# Describe the image
python image_reader.py /path/to/image.png -p "Describe this image in detail"
OpenClaw Skill Invocation
Once installed, you can invoke it using natural language:
Analyze this image
Extract the text from the image
Describe this screenshot
Output
- Text-heavy images: Returns all extracted text, preserving original formatting.
- Non-text images: Returns a detailed scene description, including objects, people, colors, style, etc.
- Mixed content: Provides both text extraction and a visual description.
Technical Details
- Uses an OpenAI-compatible API to call Doubao multimodal models
- Images are sent as base64-encoded data
- The system prompt adapts to the image type to select the most appropriate analysis strategy
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install image-reader - After installation, invoke the skill by name or use
/image-reader - Provide required inputs per the skill's parameter spec and get structured output
What is image-reader?
Image recognition and understanding tool. Uses a multimodal model (e.g. doubao-seed-2.0-pro, kimi-k2.5) to analyze image content and supports OCR text extrac... It is an AI Agent Skill for Claude Code / OpenClaw, with 335 downloads so far.
How do I install image-reader?
Run "/install image-reader" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is image-reader free?
Yes, image-reader is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does image-reader support?
image-reader is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).
Who created image-reader?
It is built and maintained by simonjoe246 (@simonjoe246); the current version is v1.0.0.