claw-text-and-pics
Extract text and images from documents via Mistral OCR
Give your OpenClaw agent the ability to read scanned documents, PDFs, and images — extracting clean Markdown text and cropping out embedded images. Powered by Mistral's OCR API.
When to use
- Extract text from scanned documents, invoices, receipts, contracts
- Pull embedded images from PDFs or scans
- Convert handwritten notes or photos to searchable text
- Send extracted images directly to Telegram
Usage
# Extract text only
python3 ocr.py --input scan.jpg
# Extract text from PDF (3 pages)
python3 ocr.py --input document.pdf --pages 3
# Extract embedded images
python3 ocr.py --input scan.jpg --extract-images --output-dir ./images/
# Extract images and send to Telegram
python3 ocr.py --input scan.jpg --extract-images --send --target 123456789
# Works with URLs too
python3 ocr.py --input https://example.com/document.pdf
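If you want to call the script from your own Python code rather than from a shell, the commands above can be assembled programmatically. The following is a minimal sketch, not part of the skill: `build_ocr_command` and `run_ocr` are hypothetical helper names, and it assumes `ocr.py` is in the current working directory and accepts exactly the flags documented here.

```python
import subprocess


def build_ocr_command(input_path, *, pages=None, extract_images=False,
                      output_dir=None, send=False, target=None,
                      script="ocr.py"):
    """Build the argv list for one ocr.py run (hypothetical helper).

    Only flags documented in this README are emitted.
    """
    cmd = ["python3", script, "--input", str(input_path)]
    if pages is not None:
        cmd += ["--pages", str(pages)]
    if extract_images:
        cmd.append("--extract-images")
        if output_dir is not None:
            cmd += ["--output-dir", str(output_dir)]
    if send:
        cmd.append("--send")
        if target is not None:
            cmd += ["--target", str(target)]
    return cmd


def run_ocr(input_path, **options):
    """Run ocr.py and return the Markdown it prints to stdout."""
    result = subprocess.run(
        build_ocr_command(input_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, `run_ocr("document.pdf", pages=3)` would run the same command as the PDF example above and return the extracted Markdown as a string.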
Output
- stdout: Extracted text as Markdown
- Files: Cropped images saved to --output-dir (only with --extract-images)
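To illustrate where the two outputs could come from, here is a rough sketch of turning an OCR response into Markdown text and image crop boxes. It assumes the response is a dict shaped like the one in Mistral's public OCR documentation (a `pages` list whose items carry `index`, `markdown`, and `images` with corner coordinates). The function names are hypothetical, and the way `ocr.py` actually processes the response is not shown in this README.

```python
def response_to_markdown(ocr_response):
    """Join per-page Markdown in page order (assumed response shape)."""
    pages = sorted(ocr_response.get("pages", []),
                   key=lambda page: page.get("index", 0))
    chunks = [page.get("markdown", "").strip() for page in pages]
    # Skip pages that produced no text so the output has no blank gaps.
    return "\n\n".join(chunk for chunk in chunks if chunk)


def image_boxes(ocr_response):
    """Yield (page_index, image_id, box) for every embedded image.

    box is (left, top, right, bottom), the order Pillow's Image.crop expects.
    The coordinate field names are assumptions based on Mistral's OCR docs.
    """
    pages = sorted(ocr_response.get("pages", []),
                   key=lambda page: page.get("index", 0))
    for page in pages:
        for image in page.get("images", []):
            box = (
                image["top_left_x"],
                image["top_left_y"],
                image["bottom_right_x"],
                image["bottom_right_y"],
            )
            yield page.get("index", 0), image["id"], box
```

With --debug you can print the raw API response and check whether these field names match what the API actually returns.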
Configuration
Set in ~/.openclaw/.env or as environment variables:
| Variable | Required | Description |
|---|---|---|
| MISTRAL_API_KEY | Yes | Your Mistral API key |
| TELEGRAM_BOT_TOKEN | Only for --send | Your Telegram bot token |
| TELEGRAM_CHAT_ID | Optional | Default chat ID (overridable with --target) |
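The Telegram rules in the table (token needed only with --send, --target overriding TELEGRAM_CHAT_ID) can be expressed as a small resolver. This is a sketch of that logic only. `resolve_telegram_settings` is a hypothetical name, and the error messages are illustrative, not the skill's actual output.

```python
import os


def resolve_telegram_settings(send, target=None, env=None):
    """Return (bot_token, chat_id) when sending, or None when not sending."""
    env = os.environ if env is None else env
    if not send:
        # Without --send, no Telegram settings are needed at all.
        return None
    token = env.get("TELEGRAM_BOT_TOKEN")
    if not token:
        raise ValueError("TELEGRAM_BOT_TOKEN is required when --send is used")
    # An explicit --target wins over the TELEGRAM_CHAT_ID default.
    chat_id = target or env.get("TELEGRAM_CHAT_ID")
    if not chat_id:
        raise ValueError("no chat ID: pass --target or set TELEGRAM_CHAT_ID")
    return token, str(chat_id)
```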
Environment Variables
MISTRAL_API_KEY=required # Mistral API key — get one at console.mistral.ai
TELEGRAM_BOT_TOKEN=optional # Required only when using --send
TELEGRAM_CHAT_ID=optional # Default target chat ID (overridable with --target)
This skill reads ~/.openclaw/.env as a fallback for credentials.
Ensure the file has restricted permissions: chmod 600 ~/.openclaw/.env
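As an illustration of the fallback behavior described above, the sketch below reads KEY=VALUE pairs from the .env file and lets real environment variables take precedence. It is an assumption about how such a fallback is typically implemented, not a copy of the skill's code. `parse_env_text` and `load_credentials` are hypothetical names, and the permission warning simply mirrors the chmod 600 advice.

```python
import os
import stat
import warnings
from pathlib import Path

KEYS = ("MISTRAL_API_KEY", "TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment that follows whitespace, then any quotes.
        value = value.split(" #", 1)[0].strip().strip("\"'")
        values[key.strip()] = value
    return values


def load_credentials(env_file=None, environ=None):
    """Environment variables win; the .env file is only a fallback."""
    environ = os.environ if environ is None else environ
    path = Path(env_file) if env_file else Path.home() / ".openclaw" / ".env"
    file_values = {}
    if path.is_file():
        # Warn if group or others have any access (i.e. mode is not 600).
        if stat.S_IMODE(path.stat().st_mode) & 0o077:
            warnings.warn(f"{path} is accessible to other users; run chmod 600")
        file_values = parse_env_text(path.read_text(encoding="utf-8"))
    merged = {}
    for key in KEYS:
        value = environ.get(key) or file_values.get(key)
        if value:
            merged[key] = value
    return merged
```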
Requirements
- Python 3.11+
- Mistral API key (console.mistral.ai)
- Optional (only for --extract-images): pip install pillow
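A quick way to verify these requirements before a run is a small preflight check. The sketch below is illustrative only. The function names are hypothetical, and it assumes Pillow is importable under its usual package name `PIL`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)


def python_ok(version_info=sys.version_info):
    """True when the interpreter meets the Python 3.11+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def pillow_available():
    """True when Pillow can be imported (needed only for --extract-images)."""
    return importlib.util.find_spec("PIL") is not None


def check_requirements(extract_images=False):
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if not python_ok():
        problems.append("Python 3.11+ is required")
    if extract_images and not pillow_available():
        problems.append("--extract-images needs Pillow: pip install pillow")
    return problems
```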
Parameters
| Parameter | Required | Description |
|---|---|---|
| --input | Yes | Local path or URL to image/PDF |
| --extract-images | No | Crop and save embedded images |
| --output-dir | No | Output directory (default: ./extracted-images) |
| --send | No | Send extracted images via Telegram |
| --target | No | Telegram chat ID (or TELEGRAM_CHAT_ID env var) |
| --pages | No | Number of PDF pages to process |
| --debug | No | Print raw API response |
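For reference, the parameter table maps naturally onto an `argparse` definition. The sketch below mirrors the documented flags and the documented default for --output-dir. It is an assumption about how such a CLI could be declared, and the real `ocr.py` may define its arguments differently.

```python
import argparse


def build_parser():
    """Argument parser mirroring the documented parameters (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="ocr.py",
        description="Extract text and images from documents via Mistral OCR",
    )
    parser.add_argument("--input", required=True,
                        help="Local path or URL to image/PDF")
    parser.add_argument("--extract-images", action="store_true",
                        help="Crop and save embedded images")
    parser.add_argument("--output-dir", default="./extracted-images",
                        help="Output directory for cropped images")
    parser.add_argument("--send", action="store_true",
                        help="Send extracted images via Telegram")
    parser.add_argument("--target",
                        help="Telegram chat ID (or TELEGRAM_CHAT_ID env var)")
    parser.add_argument("--pages", type=int,
                        help="Number of PDF pages to process")
    parser.add_argument("--debug", action="store_true",
                        help="Print raw API response")
    return parser
```

Parsing `["--input", "document.pdf", "--pages", "3"]` with this parser yields `args.pages == 3` and leaves `args.output_dir` at its default.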
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install claw-text-and-pics
- After installation, invoke the skill by name or use /claw-text-and-pics
- Provide required inputs per the skill's parameter spec and get structured output
What is claw-text-and-pics?
Extract text and embedded images from scanned documents, PDFs, and photos via Mistral OCR API. Use when reading receipts, invoices, contracts, handwritten no... It is an AI Agent Skill for Claude Code / OpenClaw, with 97 downloads so far.
How do I install claw-text-and-pics?
Run "/install claw-text-and-pics" in the OpenClaw or Claude Code chat to install it in one step. Before the first run, set MISTRAL_API_KEY as described under Configuration.
Is claw-text-and-pics free?
Yes, claw-text-and-pics is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does claw-text-and-pics support?
claw-text-and-pics is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created claw-text-and-pics?
It is built and maintained by photon78 (@photon78); the current version is v1.0.1.