Vision Recognition Ocr

by wangziiiiii · GitHub ↗ · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
Downloads: 943 · Stars: 1 · Active Installs: 8 · Versions: 2
Install in OpenClaw
/install vision-recognition-ocr
Description
Vehicle/animal/plant recognition plus OCR for screenshots, photos, invoices, and tables. Use when users ask to identify a car model (识别车型), recognize what is in an image (看图识别), extract text (提取文字), or run OCR. Supports local path, URL, and...
Usage Guidance
This package sends whatever image you provide to Baidu's cloud OCR and image-classification endpoints, and it requires Baidu credentials. Before installing: (1) be aware that the registry metadata omits the required env vars; supply BAIDU_BCE_BEARER_TOKEN or an API key plus secret as documented in SKILL.md; (2) do not send sensitive images (personal documents, IDs, private photos) unless you trust Baidu and your account; (3) consider creating a limited, monitored Baidu account and API keys for this skill, and rotate the keys if needed; (4) for extra caution, review the included scripts locally (they are short and readable) and run them in an isolated environment; (5) if the missing metadata concerns you, contact the skill publisher or hold off on installing until the metadata matches the implementation.
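One way to carry out the local review suggested in point (4) is to scan the bundled scripts for hard-coded URLs and flag any host other than Baidu's. The following is a minimal sketch, not part of the skill: the function name `unexpected_hosts` is hypothetical, and the only allowed host assumed here is aip.baidubce.com, the endpoint named in this listing.

```python
import re
from urllib.parse import urlparse

# Host named in this listing as the official Baidu endpoint.
ALLOWED_HOSTS = {"aip.baidubce.com"}

# Rough pattern for URLs embedded in source text; stops at quotes,
# whitespace, angle brackets, and closing parentheses.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def unexpected_hosts(source_text, allowed=ALLOWED_HOSTS):
    """Return the set of URL hosts in source_text that are not allowed."""
    hosts = set()
    for match in URL_PATTERN.findall(source_text):
        host = urlparse(match).hostname
        if host and host not in allowed:
            hosts.add(host)
    return hosts
```

To audit the package, read each script as text and pass it to `unexpected_hosts`; an empty set for every file means no other hard-coded hosts were found. This only catches literal URLs, so it complements reading the scripts rather than replacing it.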
Capability Analysis
Type: OpenClaw Skill
Name: vision-recognition-ocr
Version: 1.0.1
The skill bundle is a legitimate integration for Baidu's Vision and OCR APIs, providing tools for image classification (animals, cars, plants) and text extraction. The scripts (e.g., `_baidu_image_classify.py`, `ocr_general_basic.py`) correctly handle authentication via environment variables and interact with official Baidu endpoints (aip.baidubce.com). No evidence of malicious behavior, data exfiltration, or prompt injection was found.
Capability Assessment
Purpose & Capability
The name and description (vision recognition + OCR) match the code and SKILL.md: the scripts call Baidu image-classify and OCR endpoints and accept images as a local path, URL, or base64 string. However, the registry metadata lists no required environment variables or credentials, while the implementation clearly expects Baidu API credentials. The metadata and the actual capability are therefore inconsistent.
Instruction Scope
SKILL.md instructions and the Python scripts are scoped to classification and OCR tasks. They accept image_path/url/base64 and build requests to Baidu APIs; they do not attempt to read unrelated system files or call unexpected external endpoints beyond Baidu.
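As an illustration of the input handling described above, the sketch below turns exactly one of the three image sources into request form fields. It is not the skill's actual code: the function name `build_image_payload` is hypothetical, and the field names `image` (base64 data) and `url` are assumptions about Baidu's API, not details confirmed by this listing.

```python
import base64
from pathlib import Path
from typing import Optional


def build_image_payload(
    image_path: Optional[str] = None,
    url: Optional[str] = None,
    image_base64: Optional[str] = None,
) -> dict:
    """Build form fields from exactly one image source.

    The "image" and "url" field names are assumed, not taken from the skill.
    """
    provided = [v for v in (image_path, url, image_base64) if v]
    if len(provided) != 1:
        raise ValueError("provide exactly one of image_path, url, image_base64")
    if url:
        # Remote image: pass the URL through unchanged.
        return {"url": url}
    if image_path:
        # Local file: read the bytes and base64-encode them.
        raw = Path(image_path).read_bytes()
        return {"image": base64.b64encode(raw).decode("ascii")}
    # Caller already supplied base64 text.
    return {"image": image_base64}
```

Rejecting calls with zero or several sources keeps it unambiguous which image actually leaves the machine, which matters because every accepted image is sent to a third-party service.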
Install Mechanism
No external install or remote downloads are used; the package contains local Python scripts. No extract-from-URL or third-party install steps are present. Scripts use the requests library (runtime dependency), which is normal.
Credentials
The code and SKILL.md require Baidu credentials (BAIDU_BCE_BEARER_TOKEN / BAIDU_API_KEY / BAIDU_VISION_API_KEY + secrets). Those credentials are proportionate to the declared purpose (accessing Baidu APIs), but the skill registry metadata incorrectly lists no required env vars or primary credential. This mismatch is a practical risk: you might install the skill without realizing you must supply secrets. Also, BAIDU_API_KEY is used in multiple fallback roles, which could be confusing and lead to accidental credential exposure.
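A preflight check can make the missing-credential case fail loudly before any image is sent. The sketch below is an assumption-laden illustration, not the skill's implementation: the function name, the lookup order, and the secret variable names (`BAIDU_VISION_SECRET_KEY`, `BAIDU_SECRET_KEY`) are hypothetical, since the listing only says "+ secrets". Only BAIDU_BCE_BEARER_TOKEN, BAIDU_API_KEY, and BAIDU_VISION_API_KEY come from the text above.

```python
import os
from typing import Mapping, Optional, Tuple

# Each key is paired with one secret only. Secret names are hypothetical.
KEY_SECRET_PAIRS = (
    ("BAIDU_VISION_API_KEY", "BAIDU_VISION_SECRET_KEY"),
    ("BAIDU_API_KEY", "BAIDU_SECRET_KEY"),
)


def resolve_baidu_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, ...]:
    """Return ("bearer", token) or ("api_key", key, secret); raise if none set."""
    env = os.environ if env is None else env
    token = env.get("BAIDU_BCE_BEARER_TOKEN", "").strip()
    if token:
        return ("bearer", token)
    for key_name, secret_name in KEY_SECRET_PAIRS:
        key = env.get(key_name, "").strip()
        secret = env.get(secret_name, "").strip()
        if key and secret:
            return ("api_key", key, secret)
    raise RuntimeError(
        "No Baidu credentials found: set BAIDU_BCE_BEARER_TOKEN "
        "or an API key together with its secret"
    )
```

Pairing each key with a single secret, rather than letting one variable serve several fallback roles, is a deliberate choice in this sketch to avoid the ambiguity noted above.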
Persistence & Privilege
Skill is not always-enabled; it does not request elevated system privileges and does not modify other skills or global agent settings. Autonomous invocation is allowed (platform default) but is not by itself a new risk here.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install vision-recognition-ocr
  3. After installation, invoke the skill by name or use /vision-recognition-ocr
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Sync latest local fixes and docs
v1.0.0
Launch public skill with clearer landing copy
Metadata
Slug: vision-recognition-ocr
Version: 1.0.1
License: MIT-0
All-time Installs: 9
Active Installs: 8
Total Versions: 2
Frequently Asked Questions

What is Vision Recognition Ocr?

Vehicle/animal/plant recognition plus OCR for screenshots, photos, invoices, and tables. Use when users ask to identify a car model (识别车型), recognize what is in an image (看图识别), extract text (提取文字), or run OCR. Supports local path, URL, and... It is an AI Agent Skill for Claude Code / OpenClaw, with 943 downloads so far.

How do I install Vision Recognition Ocr?

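Run "/install vision-recognition-ocr" in the OpenClaw or Claude Code chat to install it. The skill also needs Baidu credentials supplied as environment variables (BAIDU_BCE_BEARER_TOKEN, or an API key plus secret), as documented in SKILL.md.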

Is Vision Recognition Ocr free?

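Yes, the skill itself is free and licensed under MIT-0; you can download, install, and use it at no cost. It calls Baidu's cloud APIs with credentials you supply, so use of those APIs is governed by your Baidu account.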

Which platforms does Vision Recognition Ocr support?

Vision Recognition Ocr is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Vision Recognition Ocr?

It is built and maintained by wangziiiiii (@wangziiiiii); the current version is v1.0.1.
