← 返回 Skills 市场
openclawzhangchong

free-ocr-zc

作者 张翀 · GitHub ↗ · v1.0.3 · MIT-0
cross-platform ✓ 安全检测通过
38
总下载
0
收藏
0
当前安装
4
版本数
在 OpenClaw 中安装
/install free-ocr-zc
功能描述
Extract text from images via OpenRouter API using Baidu Qianfan OCR model, supporting URLs and local files with customizable prompts.
使用说明 (SKILL.md)

OpenRouter OCR Skill

Overview

This skill provides OCR (Optical Character Recognition) functionality using models available via OpenRouter. It uses the OpenAI Python library to communicate with OpenRouter's API, specifically designed for models like Baidu's Qianfan OCR.

Quick Start

When you need to extract text from an image:

  1. Ensure prerequisites:

    • Python 3.x installed
    • Required packages: openai, requests (install via pip install openai requests)
    • Place your OpenRouter API key in the file: C:\Users\Administrator\.openclaw\secrets\openrouter.env (format: OPENROUTER_API_KEY=your_key_here)
  2. Call the OCR script with an image URL or local file path:

    python ocr.py \x3Cimage_input> [prompt]
    
    • image_input: Either a URL or a local file path to the image
    • prompt: Optional text prompt for the OCR (default: "OCR提取图片所有文字")
  3. Get result: The script prints the extracted text to stdout.

Usage Examples

Basic Usage with Default Prompt

python ocr.py "https://example.com/image.jpg"

Custom Prompt

python ocr.py "https://example.com/image.jpg" "请识别图片中的所有文字"

Local Image File

python ocr.py "C:\path	o\image.jpg"

How It Works

The skill uses the OpenAI client configured with:

  • Base URL: https://openrouter.ai/api/v1
  • Model: baidu/qianfan-ocr-fast:free (configurable via environment variable)
  • API Key: Read from OPENROUTER_API_KEY environment variable

It sends a multimodal request containing:

  1. A text prompt (default: "OCR提取图片所有文字")
  2. The image (encoded as base64 if local, or passed directly if URL)

The model returns the extracted text which is printed to console.

Environment Variables

  • OPENROUTER_API_KEY: Required - Your OpenRouter API key
  • OCR_MODEL: Optional - Model to use (default: baidu/qianfan-ocr-fast:free)
  • OCR_BASE_URL: Optional - OpenRouter base URL (default: https://openrouter.ai/api/v1)

Installation

  1. Create the skill directory: mkdir -p skills/openrouter-ocr
  2. Save the ocr.py script in this directory
  3. Install dependencies: pip install openai requests
  4. Set your OpenRouter API key:
    setx OPENROUTER_API_KEY "your_api_key_here"
    
    (Restart terminal after setting)

Notes

  • The skill works with both HTTP/HTTPS URLs and local file paths
  • For local files, the image is read and base64-encoded before sending
  • Error handling includes network issues, invalid API keys, and model errors
  • The default model is Baidu's Qianfan OCR fast version (free tier)
  • You can change the model by setting the OCR_MODEL environment variable
  • Response time depends on image size and model speed

Troubleshooting

  • API Key Error: Ensure OPENROUTER_API_KEY is set correctly
  • Module Not Found: Install required packages with pip install openai requests
  • Image Access: Verify the image URL is accessible or local path exists
  • Model Not Available: Check if the specified model is available on OpenRouter

Example Output

✅ OCR 识别结果:
------------------------------------------------------------
这是识别出的文本内容
...
------------------------------------------------------------

Security Note

Never commit your API key to version control. Keep it secure in environment variables.

安全使用建议
Before installing, make sure you are comfortable sending OCR images to OpenRouter, use a dedicated or limited API key where possible, and install the Python dependencies in an isolated environment. If you only want text extraction, prefer or request an OCR-only entry point that does not perform the extra image-description step.
能力标签
requires-sensitive-credentials
能力评估
Purpose & Capability
The OCR purpose matches the code and documentation, although the primary ocr.py script also performs an extra image-description request before OCR.
Instruction Scope
Instructions are user-directed and focused on OCR usage; no goal override, prompt-injection behavior, or hidden autonomous workflow is evident.
Install Mechanism
There is no formal install spec, while SKILL.md instructs users to manually install unpinned Python packages with pip; this is common for a Python wrapper but less reproducible.
Credentials
Reading a user-selected local image and sending it to OpenRouter is proportionate for OCR, but users should avoid images they do not want processed by an external service.
Persistence & Privilege
No background persistence is shown, but the scripts read an OpenRouter API key from a local secrets file or environment variable, which is sensitive credential handling.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install free-ocr-zc
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /free-ocr-zc 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.3
- No code or documentation changes in this release. - Version incremented to 1.0.3 for tracking purposes only. ## 技能功能 1. **图片描述**:先使用AI模型详细描述图片内容(物体、场景、颜色等) 2. **文字识别(OCR)**:再提取图片中的文字内容 3. **双重验证**:即使图片中没有文字,也能得到图片描述,避免返回空结果 ## 文件位置 - 技能说明:`C:\Users\Administrator\.openclaw\workspace\skills\openrouter-ocr\SKILL.md` - 主程序:`C:\Users\Administrator\.openclaw\workspace\skills\openrouter-ocr\ocr.py` - API密钥存储:`C:\Users\Administrator\.openclaw\secrets\openrouter.env`(已预配置您的密钥) ## 使用方法 ```bash python ocr.py <图片URL或本地路径> [可选的OCR提示词] ``` ### 示例 ```bash # 使用默认提示词 python ocr.py "https://live.staticflickr.com/3851/14825276609_098cac593d_b.jpg" # 自定义OCR提示词 python ocr.py "https://example.com/image.jpg" "请识别图片中的所有文字" ``` ## 特点 - ✅ 从 `secrets/openrouter.env` 文件读取API密钥,避免环境变量泄露 - ✅ 支持HTTP/HTTPS URL和本地文件路径 - ✅ 已修复Windows控制台编码问题(解决了emoji和特殊字符显示问题) - ✅ 默认使用 Baidu Qianfan OCR fast 模型(免费层级) - ✅ 可通过环境变量 `OCR_MODEL` 自定义模型 ## 测试结果 使用您提供的海豚图片测试: - **图片描述**:正确描述了两只海豚在海水中嬉戏的场景 - **OCR识别**:提取到了文字 "跃出一跃" 技能已就绪,您可以直接使用。如需修改模型或其他配置,请编辑技能目录下的文件。
v1.0.2
- Added new file: ocr_final.py. - No changes to documentation or existing files except for the addition of this script. - Implements additional or updated OCR functionality in ocr_final.py.
v1.0.1
- Added new file: ocr_fixed.py - No changes to documentation or core functionality apart from the new script addition.
v1.0.0
- Initial release of the OpenRouter OCR skill. - Provides OCR functionality using OpenRouter-supported models (default: Baidu Qianfan OCR). - Accepts image input via URL or local file path; outputs extracted text to stdout. - Uses environment variables for API key and model configuration. - Offers error handling for network, authentication, and model selection issues. - Includes a detailed `SKILL.md` with setup, usage, and troubleshooting instructions.
元数据
Slug free-ocr-zc
版本 1.0.3
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 4
常见问题

free-ocr-zc 是什么?

Extract text from images via OpenRouter API using Baidu Qianfan OCR model, supporting URLs and local files with customizable prompts. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 38 次。

如何安装 free-ocr-zc?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install free-ocr-zc」即可一键安装,无需额外配置。

free-ocr-zc 是免费的吗?

是的,free-ocr-zc 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

free-ocr-zc 支持哪些平台?

free-ocr-zc 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 free-ocr-zc?

由 张翀(@openclawzhangchong)开发并维护,当前版本 v1.0.3。

💬 留言讨论