ldt5200-sys

Tiexue Vision

by LDT5200-sys · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious
Downloads: 82 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install tiexue-vision
Description
Recognizes text (Chinese/English), objects, and scenes in images from chat, documents, or local files, with optional translation and auto-saving results.
Usage Guidance
This skill mostly does what it says, but consider the following before installing:
- Privacy: Recognized English text is sent to translate.googleapis.com, so sensitive text will leave your machine. If you need fully local processing, disable translation or provide a local translation alternative.
- Missing model: config.json references ./models/yolov5s.onnx but no model file is included. Ask the author how the model is provided, or supply your own local model before running.
- Excessive dependencies: package.json/lock include heavy native ML packages (tfjs-node, onnxruntime-node) and node-pre-gyp, which can download prebuilt binaries or trigger native builds during npm install. Expect large downloads and potential network activity during installation; consider installing in an isolated environment (container/VM) and review package-lock.json thoroughly.
- Inconsistency: SKILL.md recommends installing system tesseract, but the code uses tesseract.js. Confirm whether a system tesseract binary is actually required in your environment.
Recommendations: review/verify the model file and package-lock, run installation in an isolated sandbox, or request an explicit install script and explanation from the author (why tfjs is included, where the ONNX model is hosted). If you are processing sensitive images/text, disable the public-translate call or confirm a privacy-safe translation provider.
Capability Analysis
Type: OpenClaw Skill
Name: tiexue-vision
Version: 1.0.0
The skill provides OCR and object detection capabilities using Tesseract and ONNX, aligning with the documentation in SKILL.md. It includes a translation feature that sends extracted text to a public Google Translate endpoint (translate.googleapis.com), which is a disclosed and legitimate function. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found across the codebase or configuration files.
Capability Assessment
Purpose & Capability
Name/description (image OCR + object/scene detection + optional translation) align with the code. However, there are mismatches: SKILL.md suggests installing a system tesseract binary and points to an OCR executable path in config.json, but the code uses tesseract.js (a JS worker). The skill claims fully local operation except translation, yet package.json/lock include heavy ML/native deps (e.g., @tensorflow/tfjs-node, onnxruntime-node) that are disproportionate for the simple ONNX inference shown in index.js. A required YOLO model path (./models/yolov5s.onnx) is referenced in config.json/index.js but no model file is included in the package manifest.
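The missing-model concern can be checked before the skill is first invoked. The following is a minimal sketch, not the skill's own code: the `modelPath` key and the `checkModel` helper are hypothetical, since the listing does not show which key in config.json holds the path.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical config shape: `modelPath` is an assumed key name; the real
// key used by config.json is not shown in this listing.
function checkModel(config, baseDir, exists = fs.existsSync) {
  // Resolve the (possibly relative) model path against the skill directory.
  const resolved = path.resolve(baseDir, config.modelPath);
  // `exists` is injectable so the check can be exercised without touching disk.
  return { resolved, present: exists(resolved) };
}
```

Calling `checkModel({ modelPath: "./models/yolov5s.onnx" }, skillDir)` and refusing to start when `present` is false would surface the gap at startup rather than partway through inference.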
Instruction Scope
SKILL.md and index.js instructions are mostly consistent: they read images, run OCR and object detection, write back to Feishu or create a .txt. The code only reads local config.json and the provided image, so there is no broad filesystem scraping. Important scope note: the code transmits recognized English text to the public Google Translate endpoint (translate.googleapis.com), an external network call that sends user data off-host. That is disclosed in SKILL.md ("does not depend on external cloud services except for translation"), but users should be aware recognized text will be sent to Google.
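One way to keep recognized text on the host by default is to gate the translation step behind an explicit opt-in. This is a sketch under stated assumptions: the `translationEnabled` setting and the `translationPayload` helper are hypothetical and are not part of the skill's documented configuration.

```javascript
// Decide what, if anything, may be handed to a remote translation call.
// Returns the text to send, or null when nothing should leave the machine.
// `settings.translationEnabled` is an assumed, hypothetical option.
function translationPayload(recognizedText, settings) {
  // Default to local-only: translation must be switched on explicitly.
  if (!settings || settings.translationEnabled !== true) return null;

  const trimmed = recognizedText.trim();
  if (trimmed === "") return null;

  // Only text containing Latin letters is a candidate, since the skill
  // describes translating English into Chinese.
  if (!/[A-Za-z]/.test(trimmed)) return null;

  return trimmed;
}
```

The caller would skip the network request entirely whenever the function returns null, so the off-host transfer happens only after a deliberate opt-in.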
Install Mechanism
No explicit install spec in the registry entry, but package.json lists heavy native dependencies (onnxruntime-node, @tensorflow/tfjs-node). These packages commonly trigger native builds or prebuilt-binary downloads (node-pre-gyp). Because an install mechanism isn't specified, it's unclear how the runtime environment will install dependencies; npm install could download/compile native binaries and contact third-party package servers. The package includes a large package-lock.json which pulls many transitive packages not strictly required by the index.js logic (e.g., tfjs). This is disproportionate and increases attack surface.
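To see which dependencies can run code at install time, the lockfile can be inspected before running npm install. The sketch below assumes a package-lock.json with lockfileVersion 2 or 3, where each entry in the `packages` map may carry a `hasInstallScript` flag; the helper name is illustrative.

```javascript
// List lockfile entries that declare install-time scripts.
// Expects an already-parsed package-lock.json (lockfileVersion 2 or 3).
function packagesWithInstallScripts(lock) {
  const packages = lock.packages || {};
  return Object.entries(packages)
    // npm sets `hasInstallScript: true` on packages with install hooks.
    .filter(([, meta]) => meta && meta.hasInstallScript === true)
    .map(([location]) => location)
    .sort();
}
```

A typical call would be `packagesWithInstallScripts(JSON.parse(fs.readFileSync("package-lock.json", "utf8")))`. Entries that appear in the result are the ones worth reviewing first, or installing with `npm install --ignore-scripts` inside a sandbox.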
Credentials
The skill declares no required environment variables or credentials. The code reads a local config.json and uses a 'feishu' client object passed in context for Feishu integration (so it relies on host-provided client rather than asking for tokens). config.json contains fields for a translation apiKey but the code uses the public Google Translate endpoint without using that key. No other unrelated credentials are requested.
Persistence & Privilege
The skill is not always-enabled and does not request elevated or persistent platform privileges. It does not modify other skills or system settings. It reads and writes local files only in the image's directory (writes image.txt) and updates Feishu content when invoked with Feishu context.
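The file-writing behavior described above (a same-name .txt file next to the image) amounts to a small path derivation. The following sketch illustrates that convention; `resultPathFor` is a hypothetical helper, not a function taken from index.js.

```javascript
const path = require("node:path");

// Derive the result file path: same directory and base name as the image,
// with the extension replaced by .txt.
function resultPathFor(imagePath) {
  const { dir, name } = path.parse(imagePath);
  return path.join(dir, `${name}.txt`);
}
```

Because the output path is derived only from the input path, writes stay inside the image's own directory, which matches the limited filesystem footprint noted in the assessment.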
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install tiexue-vision
  3. After installation, invoke the skill by name or use /tiexue-vision
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
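vision - all-in-one image recognition, 1.0.0 released
- Recognizes text, objects, and scenes in images from chats, documents, and local folders
- Text recognition supports Chinese and English; English is automatically translated to Chinese
- Results can be written back to the original chat/document or saved as a `.txt` file with the same name
- Provides a command-line mode for image recognition
- Runs fully locally; does not depend on external cloud services except for translation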
Metadata
Slug tiexue-vision
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Tiexue Vision?

Recognizes text (Chinese/English), objects, and scenes in images from chat, documents, or local files, with optional translation and auto-saving results. It is an AI Agent Skill for Claude Code / OpenClaw, with 82 downloads so far.

How do I install Tiexue Vision?

Run "/install tiexue-vision" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Tiexue Vision free?

Yes, Tiexue Vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Tiexue Vision support?

Tiexue Vision is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Tiexue Vision?

It is built and maintained by LDT5200-sys (@ldt5200-sys); the current version is v1.0.0.
