imageReader

by zqy15306762317 · GitHub ↗ · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
Downloads: 93 · Stars: 1 · Active Installs: 0 · Versions: 2
Install in OpenClaw
/install chat-image-reader
Description
Reads and analyzes images from messages across 10+ chat platforms using platform-specific APIs and unified image processing.
README (SKILL.md)

Chat Image Reader

Universal image reading and analysis skill for multiple chat platforms.

Supported Platforms

| Platform | Channel | Image Source | Download Method |
|---|---|---|---|
| Feishu (飞书) | feishu | Message image_key | API + Tenant Token |
| DingTalk (钉钉) | dingtalk | Message downloadCode | API + Access Token |
| WeChat (微信) | wechat | Media media_id | API + Access Token |
| Discord | discord | Attachment URL | Direct download |
| Telegram | telegram | file_id | Bot API |
| WhatsApp | whatsapp | Media URL | Direct download |
| Signal | signal | Attachment | Direct download |
| Slack | slack | File URL | Direct download |
| iMessage | imessage | Attachment path | Local file |
| LINE | line | Message content | API |

Required Credentials

This skill requires API credentials to access chat platform messages. Configure at least one platform to use the skill.

Platform Credentials

| Platform | Required Environment Variables | Notes |
|---|---|---|
| Feishu (飞书) | FEISHU_APP_ID, FEISHU_APP_SECRET | Required scopes: im:message:readonly, im:resource |
| DingTalk (钉钉) | DINGTALK_APP_KEY, DINGTALK_APP_SECRET | Required permissions: IMessage, Chat |
| WeChat (企业微信) | WECHAT_CORP_ID, WECHAT_CORP_SECRET | Required permissions: media_get |
| Telegram | TELEGRAM_BOT_TOKEN | From @BotFather |
| Discord | DISCORD_BOT_TOKEN | From Discord Developer Portal |
| WhatsApp | WHATSAPP_TOKEN | From Meta Business API |

Configuration Example

# Feishu (飞书) - Required for Feishu image reading
export FEISHU_APP_ID="cli_xxx"
export FEISHU_APP_SECRET="xxx"

# DingTalk (钉钉) - Required for DingTalk image reading
export DINGTALK_APP_KEY="dingxxx"
export DINGTALK_APP_SECRET="xxx"

# WeChat (企业微信) - Required for WeChat image reading
export WECHAT_CORP_ID="xxx"
export WECHAT_CORP_SECRET="xxx"

# Telegram - Required for Telegram image reading
export TELEGRAM_BOT_TOKEN="123456:ABC-xxx"
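
Before attempting a download, it can help to check which platforms actually have their variables set. The following is a minimal sketch under that assumption; `REQUIRED_ENV` and `configured_platforms` are hypothetical names, not part of the skill, and the variable names are taken from the credentials table above.

```python
import os

# Environment variables per platform, copied from the credentials table.
REQUIRED_ENV = {
    "feishu": ("FEISHU_APP_ID", "FEISHU_APP_SECRET"),
    "dingtalk": ("DINGTALK_APP_KEY", "DINGTALK_APP_SECRET"),
    "wechat": ("WECHAT_CORP_ID", "WECHAT_CORP_SECRET"),
    "telegram": ("TELEGRAM_BOT_TOKEN",),
    "discord": ("DISCORD_BOT_TOKEN",),
    "whatsapp": ("WHATSAPP_TOKEN",),
}


def configured_platforms(env=None):
    """Return the platforms whose required variables are all set and non-empty."""
    env = os.environ if env is None else env
    return sorted(
        name
        for name, keys in REQUIRED_ENV.items()
        if all(env.get(key) for key in keys)
    )
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to test; a platform with only one of its two variables set is treated as not configured.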

Security Notes

  • Credentials are only used to download images from respective platforms
  • No data is sent to external servers except the official platform APIs
  • Images are stored temporarily in local temp directory
  • Credentials should be configured via environment variables, not hardcoded

Workflow

Step 1: Detect Platform from Context

Check inbound_meta.channel or inbound_meta.provider to determine the chat platform:

{
  "channel": "feishu"   // one of: "feishu", "discord", "telegram", "whatsapp", ...
}
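
The detection step can be sketched as a small lookup. This is an illustration only: it assumes `inbound_meta` arrives as a dictionary, that `channel` takes precedence over `provider`, and that values match the channel names in the Supported Platforms table; `detect_platform` and `SUPPORTED_CHANNELS` are hypothetical names.

```python
# Channel identifiers from the Supported Platforms table.
SUPPORTED_CHANNELS = {
    "feishu", "dingtalk", "wechat", "discord", "telegram",
    "whatsapp", "signal", "slack", "imessage", "line",
}


def detect_platform(inbound_meta):
    """Return the normalized channel name, or None if missing or unsupported."""
    for key in ("channel", "provider"):
        value = inbound_meta.get(key)
        if isinstance(value, str):
            normalized = value.strip().lower()
            if normalized in SUPPORTED_CHANNELS:
                return normalized
    return None
```

Returning `None` rather than raising lets the caller fall through to the local-path or fallback handling described later.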

Step 2: Get Image by Platform

Feishu (飞书)

  1. Get message_id and image_key from message
  2. Get tenant access token via API
  3. Download from: GET /open-apis/im/v1/messages/{message_id}/resources/{image_key}?type=image
  4. Save to temp file

DingTalk (钉钉)

  1. Get downloadCode from message content
  2. Get access token via API (appKey + appSecret)
  3. Download from: GET /v1.0/robot/messageFiles/download?downloadCode={code}
  4. Save to temp file

WeChat (微信/企业微信)

  1. Get media_id from message
  2. Get access token via API
  3. Download from: GET https://qyapi.weixin.qq.com/cgi-bin/media/get?access_token={token}&media_id={id}
  4. Save to temp file

Discord

  1. Get attachment URL from message (Discord provides direct CDN URLs)
  2. Download directly from URL
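  3. No authentication header is needed: the attachment URL can be fetched as-is (current Discord attachment URLs are signed and expire, so download promptly and keep the query string intact)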

Telegram

  1. Get file_id from message
  2. Call Bot API: GET https://api.telegram.org/bot{token}/getFile?file_id={file_id}
  3. Download from: https://api.telegram.org/file/bot{token}/{file_path}

WhatsApp / Signal / Slack

  1. Get media URL from message
  2. Download directly (usually with token in header)
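
For the "token in header" case, a request could be prepared roughly as follows. This is a sketch under assumptions: it uses the `Bearer` scheme, which each platform's documentation should confirm, and `build_media_request` is a hypothetical helper. It only builds the request; sending it is left to the caller.

```python
import urllib.request


def build_media_request(media_url, token):
    """Build a GET request carrying the token in a Bearer Authorization header."""
    # Assumption: refuse plain HTTP so the token is never sent unencrypted.
    if not media_url.startswith("https://"):
        raise ValueError("refusing to send a token over a non-HTTPS URL")
    return urllib.request.Request(
        media_url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

The resulting object can be passed to `urllib.request.urlopen` when the download is actually performed.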

Local File Path

If user provides a local path directly:

  1. Verify file exists
  2. Use directly without download

Step 3: Analyze Image

Use the image tool with appropriate prompt:

image(
  image: "<local_path_or_url>",
  prompt: "Describe the image content, including key information such as text, charts, and data"
)

Platform-Specific Implementation

Feishu Implementation

# Inputs: $messageId and $outputPath are supplied by the caller
$appId = $env:FEISHU_APP_ID
$appSecret = $env:FEISHU_APP_SECRET

# 1. Get tenant token (the endpoint expects a JSON body, so set the content type)
$body = @{ app_id = $appId; app_secret = $appSecret } | ConvertTo-Json
$token = (Invoke-RestMethod -Uri "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal" -Method Post -ContentType "application/json; charset=utf-8" -Body $body).tenant_access_token

# 2. Get message to find image_key
$message = Invoke-RestMethod -Uri "https://open.feishu.cn/open-apis/im/v1/messages/$messageId" -Headers @{ Authorization = "Bearer $token" }
$imageKey = $message.data.items[0].body.content | ConvertFrom-Json | Select-Object -ExpandProperty image_key

# 3. Download image (type=image selects the image resource; braces keep "?" out of the variable name)
Invoke-WebRequest -Uri "https://open.feishu.cn/open-apis/im/v1/messages/$messageId/resources/${imageKey}?type=image" -Headers @{ Authorization = "Bearer $token" } -OutFile $outputPath

DingTalk Implementation

# 1. Get access token
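$body = @{ appKey = $appKey; appSecret = $appSecret } | ConvertTo-Json
# The endpoint expects a JSON body, so set the content type explicitly
$token = (Invoke-RestMethod -Uri "https://api.dingtalk.com/v1.0/oauth2/accessToken" -Method Post -ContentType "application/json" -Body $body).accessToken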

# 2. Download image using downloadCode
# For robot messages:
Invoke-WebRequest -Uri "https://api.dingtalk.com/v1.0/robot/messageFiles/download?downloadCode=$downloadCode" -Headers @{ "x-acs-dingtalk-access-token" = $token } -OutFile $outputPath

# For user messages (via stream):
# Use conversation message download API

WeChat (企业微信) Implementation

# 1. Get access token
$token = (Invoke-RestMethod -Uri "https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$corpSecret").access_token

# 2. Download media
Invoke-WebRequest -Uri "https://qyapi.weixin.qq.com/cgi-bin/media/get?access_token=$token&media_id=$mediaId" -OutFile $outputPath

Discord Implementation

# Discord attachments are direct URLs
$attachmentUrl = $message.attachments[0].url
Invoke-WebRequest -Uri $attachmentUrl -OutFile $outputPath

Telegram Implementation

# 1. Get file path from file_id
$fileInfo = Invoke-RestMethod -Uri "https://api.telegram.org/bot$token/getFile?file_id=$fileId"
$filePath = $fileInfo.result.file_path

# 2. Download file
Invoke-WebRequest -Uri "https://api.telegram.org/file/bot$token/$filePath" -OutFile $outputPath

Reply Context Handling

When user replies to an image message:

  1. Check reply_to_id in inbound metadata
  2. Fetch the original message using platform-specific API
  3. Extract image from the original message
  4. Proceed with download and analysis
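
The reply-handling steps can be sketched with the platform-specific parts injected as callables. Everything named here is hypothetical: `fetch_message` stands in for the platform API call, `extract_image` for whatever pulls an image reference out of a message, and only the `reply_to_id` field comes from the text above.

```python
def find_image_source(message, inbound_meta, fetch_message, extract_image):
    """Return an image reference from the message, else from the replied-to message."""
    image = extract_image(message)
    if image is not None:
        return image  # the current message already carries an image

    reply_to_id = inbound_meta.get("reply_to_id")
    if reply_to_id is None:
        return None  # not a reply, so there is nothing else to inspect

    original = fetch_message(reply_to_id)  # platform-specific lookup
    return extract_image(original)
```

Keeping the fetch and extract steps as parameters means the same control flow works for every platform, and it can be tested without network access.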

Common Analysis Prompts

Document/Screenshot

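Recognize all text in the image, preserving the original formatting and hierarchy.

Chart/Table

Analyze the chart or table in the image and extract all data and labels.

UI Screenshot

Describe key information such as the interface layout, function buttons, and current state.

Stock/Finance Chart

Identify financial information such as the stock ticker, price, candlestick (K-line) pattern, and trading volume.

Job Posting

Extract the job posting details, including key content such as job title, responsibilities, requirements, and salary.

General Image

Describe the image content, including key information such as main elements, text, and data.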
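
One way to wire these prompts into code is a lookup keyed by image category that falls back to the general prompt. The sketch below is illustrative: the category keys and `select_prompt` are hypothetical names, and the strings are English renderings of the prompts listed above.

```python
PROMPTS = {
    "document": "Recognize all text in the image, preserving the original formatting and hierarchy.",
    "chart": "Analyze the chart or table in the image and extract all data and labels.",
    "ui": "Describe key information such as the interface layout, function buttons, and current state.",
    "finance": "Identify financial information such as the stock ticker, price, candlestick (K-line) pattern, and trading volume.",
    "job": "Extract the job posting details, including key content such as job title, responsibilities, requirements, and salary.",
    "general": "Describe the image content, including key information such as main elements, text, and data.",
}


def select_prompt(kind):
    """Return the prompt for a category, defaulting to the general prompt."""
    return PROMPTS.get(kind, PROMPTS["general"])
```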

Error Handling

| Error | Cause | Solution |
|---|---|---|
| Image not found | No image in message | Ask user to resend or provide path |
| Download failed | Permission/Network issue | Check permissions, retry |
| API error | Token expired/invalid | Refresh token and retry |
| Analysis failed | Image unclear/unsupported | Try alternative prompt |
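
The "refresh token and retry" remedy can be sketched as a small wrapper. This is a sketch under assumptions: `TokenExpiredError` is a hypothetical exception that the caller's download function would raise on an expired or invalid token, and `fetch_token` is whatever obtains a fresh token for the platform.

```python
class TokenExpiredError(Exception):
    """Raised by the caller's download function when the API rejects the token."""


def download_with_refresh(download, fetch_token, max_attempts=2):
    """Call download(token); on TokenExpiredError fetch a new token and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    token = fetch_token()
    for attempt in range(1, max_attempts + 1):
        try:
            return download(token)
        except TokenExpiredError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            token = fetch_token()
```

Capping the attempts avoids looping forever when the credentials themselves are wrong rather than merely expired.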

Fallback Strategy

When automatic image retrieval fails:

  1. Ask user to provide path: "Please provide the local path to the image"
  2. Ask user to describe: "Please describe the image or paste the text it contains"
  3. Request resend: "Please resend the image"

Temp File Management

  • Save images to $workspace/temp_images/ or system temp
  • Clean up after analysis (optional, for disk space)
  • Use unique filenames with timestamp: img_{channel}_{timestamp}.jpg
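
The naming and cleanup rules above could look roughly like this. It is a sketch with assumptions: a millisecond timestamp, a `temp_images` folder under the system temp directory when no base directory is given, and stripping of non-alphanumeric characters from the channel name; `temp_image_path` and `cleanup` are hypothetical helpers.

```python
import tempfile
import time
from pathlib import Path


def temp_image_path(channel, base_dir=None, now=None):
    """Return a path of the form img_{channel}_{timestamp}.jpg, creating the folder."""
    base = Path(base_dir) if base_dir else Path(tempfile.gettempdir()) / "temp_images"
    base.mkdir(parents=True, exist_ok=True)
    # Keep only letters and digits so the channel cannot inject path separators.
    safe_channel = "".join(c for c in channel if c.isalnum()) or "unknown"
    timestamp = int((time.time() if now is None else now) * 1000)
    return base / f"img_{safe_channel}_{timestamp}.jpg"


def cleanup(path):
    """Delete the temp image; do nothing if it is already gone (Python 3.8+)."""
    Path(path).unlink(missing_ok=True)
```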

Configuration Required

Feishu (飞书)

  • FEISHU_APP_ID / channels.feishu.appId
  • FEISHU_APP_SECRET / channels.feishu.appSecret
  • Required scope: im:message:readonly, im:resource

DingTalk (钉钉)

  • DINGTALK_APP_KEY / channels.dingtalk.appKey
  • DINGTALK_APP_SECRET / channels.dingtalk.appSecret
  • Required permissions: IMessage, Chat

WeChat (企业微信)

  • WECHAT_CORP_ID / channels.wechat.corpId
  • WECHAT_CORP_SECRET / channels.wechat.corpSecret
  • Required permissions: media_get

Telegram

  • Bot token configured in OpenClaw

Discord

  • Bot token configured in OpenClaw

Notes

  • Most platforms provide images in JPG/PNG format
  • Large images may need resizing before analysis
  • Consider rate limits for API calls
  • Always validate user has permission to access the image
Usage Guidance
Don't install blindly. Key points to consider before proceeding:

  • Inconsistency: The registry metadata claims no required env vars, but SKILL.md and the included Python script require many sensitive platform tokens. Ask the author to reconcile the manifest and provide a clear, minimal list of required env vars.
  • Sensitive credentials: This skill needs tokens that can read chat messages; only provide tokens with the minimum scopes (read-only, narrow workspace/chat scopes) and prefer test accounts or sandbox credentials first. Rotate tokens after testing.
  • Data flow: Confirm where images are sent for analysis. The SKILL.md's image(...) call may forward image data to the model provider or third-party services; demand explicit data-handling/retention details and logs of network destinations.
  • Audit the code: The included download_image.py appears to only call official platform APIs and write files to temp directories, but verify there are no hidden endpoints, telemetry, or additional network calls in the full script. Run it in an isolated environment (sandbox/VM) and inspect network traffic while testing.
  • Origin and maintenance: The skill owner is unknown and there's no homepage; prefer skills from known authors or ask for contact/maintainer info and a verifiable repository.
  • Least privilege & testing: If you must try it, limit scope (enable only one platform at a time), use scoped/test tokens, and monitor activity. If the author can't clarify the manifest mismatches and data handling, treat this as untrusted.
Capability Assessment
Purpose & Capability
The skill's behavior (download images from many chat platforms) is coherent with its name and description. However, the registry-level metadata claims no required environment variables or primary credential, while SKILL.md and the included Python downloader clearly require multiple sensitive platform credentials (FEISHU, DINGTALK, WECHAT, TELEGRAM, DISCORD, WhatsApp, Slack, LINE, etc.). This inconsistency between declared registry requirements and the actual runtime needs is a red flag (could be sloppy packaging or an attempt to hide sensitive requirements).
Instruction Scope
SKILL.md and scripts limit actions to detecting platform, calling official platform APIs, saving downloaded images to a temp file, and invoking an image-analysis tool. There are no obvious instructions to read unrelated files or exfiltrate data to arbitrary endpoints. However: (1) SKILL.md asserts 'No data is sent to external servers except the official platform APIs' but the skill references optional external OCR/API keys (Baidu/Dashscope) and the downstream 'image(...)' tool — the actual destination of image analysis depends on the runtime image tool and could involve sending image data to an external model or service. (2) The skill instructs accessing inbound_meta and message contents (expected) but that grants broad read access to chat content.
Install Mechanism
This is instruction-only with a bundled Python script and no install spec that downloads arbitrary archives. No third-party install URLs, package installs, or extract-from-URL steps are present. The included script will be available on disk when the skill is installed; it uses the requests library but no additional package installation is specified in an install step.
Credentials
Functionally, the skill legitimately needs platform-specific tokens to download messages/images. But the manifest is inconsistent: top-level registry metadata lists no env vars, SKILL.md metadata lists a subset (python3 + FEISHU/DISCORD/TELEGRAM), and the Python script expects many more environment variables (WECHAT, DINGTALK, SLACK, WHATSAPP tokens, etc.) and/or tokens passed as CLI args. Asking for many platform credentials is proportionate to a multi-platform downloader only if declared transparently; the current mismatch (and the requirement for multiple powerful tokens that can read message content) increases risk if users supply full-privilege credentials.
Persistence & Privilege
The skill does not request always:true and does not claim to modify other skills or global configs. disable-model-invocation is false (normal). Note: autonomous invocation combined with broad platform credentials would enlarge the blast radius, but autonomous invocation alone is the platform default and not a standalone concern.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install chat-image-reader
  3. After installation, invoke the skill by name or use /chat-image-reader
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Added credential requirements, permissions, and configuration section.

  • Introduced a new "Required Credentials" section detailing necessary environment variables for each supported chat platform.
  • Provided explicit configuration examples for setting up environment variables.
  • Documented security notes regarding credential handling and image storage.
  • Stated required runtime dependencies and permissions in metadata.
  • No core logic or API changes; documentation changes only.
v1.0.0
  • Initial release: First universal image reader for 10+ chat platforms (Feishu, DingTalk, WeChat, Discord, Telegram, WhatsApp, Slack, Signal, LINE, iMessage).
  • Auto platform detection with unified API and platform-specific download logic.
  • Comprehensive API reference included with documented error codes, rate limits, and fallback strategies if image retrieval fails.
  • Supports image analysis with common prompts for text, charts, UI, and finance.
  • Handles reply context and temp file management, with clear configuration steps for each platform.
  • Fills a unique gap on ClawHub: no similar cross-platform chat image analysis skill exists.
Metadata
Slug chat-image-reader
Version 1.0.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is imageReader?

Reads and analyzes images from messages across 10+ chat platforms using platform-specific APIs and unified image processing. It is an AI Agent Skill for Claude Code / OpenClaw, with 93 downloads so far.

How do I install imageReader?

Run "/install chat-image-reader" in the OpenClaw or Claude Code chat to install it in one step. Before first use, configure credentials for at least one platform as described under Required Credentials.

Is imageReader free?

Yes, imageReader is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does imageReader support?

imageReader runs anywhere OpenClaw / Claude Code is available. It reads images from Feishu, DingTalk, WeChat, Discord, Telegram, WhatsApp, Signal, Slack, iMessage, and LINE.

Who created imageReader?

It is built and maintained by zqy15306762317 (@zqy15306762317); the current version is v1.0.1.
