/install claw-text-and-pics
claw-text-and-pics
Extract text and images from documents via Mistral OCR
Give your OpenClaw agent the ability to read scanned documents, PDFs, and images — extracting clean Markdown text and cropping out embedded images. Powered by Mistral's OCR API.
When to use
- Extract text from scanned documents, invoices, receipts, contracts
- Pull embedded images from PDFs or scans
- Convert handwritten notes or photos to searchable text
- Send extracted images directly to Telegram
Usage
# Extract text only
python3 ocr.py --input scan.jpg
# Extract text from PDF (3 pages)
python3 ocr.py --input document.pdf --pages 3
# Extract embedded images
python3 ocr.py --input scan.jpg --extract-images --output-dir ./images/
# Extract images and send to Telegram
python3 ocr.py --input scan.jpg --extract-images --send --target 123456789
# Works with URLs too
python3 ocr.py --input https://example.com/document.pdf
Output
- stdout: Extracted text as Markdown
- Files: Cropped images saved to --output-dir (only with --extract-images)
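Because the extracted Markdown goes to stdout, another script can capture it directly. The following is a minimal sketch, assuming ocr.py sits in the working directory and reports failure with a non-zero exit code (the source does not document exit codes). `build_command` and `run_ocr` are hypothetical helper names; only flags documented on this page are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(source, extract_images=False, output_dir=None, pages=None):
    """Assemble an ocr.py invocation from the flags documented on this page."""
    cmd = [sys.executable, "ocr.py", "--input", str(source)]
    if pages is not None:
        cmd += ["--pages", str(pages)]
    if extract_images:
        cmd.append("--extract-images")
        # --output-dir only matters together with --extract-images.
        if output_dir is not None:
            cmd += ["--output-dir", str(output_dir)]
    return cmd


def run_ocr(source, markdown_path, **options):
    """Run ocr.py and save its stdout (the extracted Markdown) to a file."""
    result = subprocess.run(
        build_command(source, **options),
        capture_output=True,
        text=True,
        check=True,  # assumption: ocr.py exits non-zero on failure
    )
    Path(markdown_path).write_text(result.stdout, encoding="utf-8")
    return result.stdout
```

For example, `run_ocr("scan.jpg", "scan.md", extract_images=True, output_dir="./images/")` would write the text to scan.md while the cropped images land in ./images/.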
Configuration
Set in ~/.openclaw/.env or as environment variables:
| Variable | Required | Description |
|---|---|---|
| MISTRAL_API_KEY | Yes | Your Mistral API key |
| TELEGRAM_BOT_TOKEN | Only for --send | Your Telegram bot token |
| TELEGRAM_CHAT_ID | Optional | Default chat ID (overridable with --target) |
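The requirements in this table can be checked before calling the skill. This is a rough sketch rather than the skill's own validation logic: `missing_variables` is a hypothetical helper, and it assumes that --send needs a chat ID from either --target or TELEGRAM_CHAT_ID.

```python
def missing_variables(env, send=False, target=None):
    """Return the names of variables the table requires that are unset or empty."""
    missing = []
    if not env.get("MISTRAL_API_KEY"):
        missing.append("MISTRAL_API_KEY")
    if send:
        if not env.get("TELEGRAM_BOT_TOKEN"):
            missing.append("TELEGRAM_BOT_TOKEN")
        # Assumption: with --send, a chat ID must come from --target
        # or from TELEGRAM_CHAT_ID.
        if not (target or env.get("TELEGRAM_CHAT_ID")):
            missing.append("TELEGRAM_CHAT_ID")
    return missing
```

Passing `os.environ` as `env` checks the current process; an empty list means every variable needed for the chosen mode is present.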
Environment Variables
MISTRAL_API_KEY=required # Mistral API key — get one at console.mistral.ai
TELEGRAM_BOT_TOKEN=optional # Required only when using --send
TELEGRAM_CHAT_ID=optional # Default target chat ID (overridable with --target)
This skill reads ~/.openclaw/.env as a fallback for credentials.
Ensure the file has restricted permissions: chmod 600 ~/.openclaw/.env
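The fallback behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the skill's actual loader: `load_env_fallback` is a hypothetical name, the file is assumed to hold plain KEY=VALUE lines (inline comments are not stripped), and variables already set in the environment are assumed to take precedence.

```python
import stat
import sys
from pathlib import Path


def load_env_fallback(path, environ):
    """Fill in KEY=VALUE pairs from a .env file without overriding existing values."""
    path = Path(path).expanduser()
    if not path.is_file():
        return environ
    # Warn when group or others have any access (anything beyond mode 600).
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        print(f"warning: {path} has mode {mode:o}; consider chmod 600", file=sys.stderr)
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: values may be wrapped in single or double quotes.
        environ.setdefault(key.strip(), value.strip().strip("'\""))
    return environ
```

Calling `load_env_fallback("~/.openclaw/.env", os.environ)` would populate only the variables that are not already set. The permission check is meaningful on POSIX systems only.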
Requirements
- Python 3.11+
- Mistral API key (console.mistral.ai)
- Optional (only for --extract-images): pip install pillow
Parameters
| Parameter | Required | Description |
|---|---|---|
| --input | Yes | Local path or URL to image/PDF |
| --extract-images | No | Crop and save embedded images |
| --output-dir | No | Output directory (default: ./extracted-images) |
| --send | No | Send extracted images via Telegram |
| --target | No | Telegram chat ID (or TELEGRAM_CHAT_ID env var) |
| --pages | No | Number of PDF pages to process |
| --debug | No | Print raw API response |
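To make the table concrete, here is a hypothetical argparse definition that mirrors it. It is not the actual ocr.py source: the argument types (an integer for --pages, boolean switches for --extract-images, --send, and --debug) and the `resolve_target` helper are assumptions inferred from the descriptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the parameter table; not the real ocr.py."""
    parser = argparse.ArgumentParser(prog="ocr.py")
    parser.add_argument("--input", required=True, help="Local path or URL to image/PDF")
    parser.add_argument("--extract-images", action="store_true", help="Crop and save embedded images")
    parser.add_argument("--output-dir", default="./extracted-images", help="Output directory")
    parser.add_argument("--send", action="store_true", help="Send extracted images via Telegram")
    parser.add_argument("--target", default=None, help="Telegram chat ID")
    parser.add_argument("--pages", type=int, default=None, help="Number of PDF pages to process")
    parser.add_argument("--debug", action="store_true", help="Print raw API response")
    return parser


def resolve_target(args, environ):
    """Prefer --target; otherwise fall back to TELEGRAM_CHAT_ID, as the table describes."""
    return args.target or environ.get("TELEGRAM_CHAT_ID")
```

With this sketch, `build_parser().parse_args(["--input", "scan.jpg", "--extract-images"])` leaves `output_dir` at its documented default of ./extracted-images.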
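- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install claw-text-and-pics
- Once installation finishes, invoke the Skill by name or trigger it with /claw-text-and-pics
- Provide the required inputs as described in the Skill's parameters to get structured output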
What is claw-text-and-pics?
Extract text and embedded images from scanned documents, PDFs, and photos via Mistral OCR API. Use when reading receipts, invoices, contracts, handwritten no... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 97 downloads to date.
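How do I install claw-text-and-pics?
Run the command "/install claw-text-and-pics" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration; running the skill still requires MISTRAL_API_KEY (see Configuration).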
Is claw-text-and-pics free?
Yes. claw-text-and-pics is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does claw-text-and-pics support?
claw-text-and-pics is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed claw-text-and-pics?
It is developed and maintained by photon78 (@photon78). The current version is v1.0.1.