Tiexue Vision
Author: LDT5200-sys · v1.0.0 · MIT-0
Total downloads: 82
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install tiexue-vision
Description
Recognizes text (Chinese/English), objects, and scenes in images from chat, documents, or local files, with optional translation and auto-saving results.
Security recommendations
This skill mostly does what it says, but consider the following before installing:
- Privacy: Recognized English text is sent to translate.googleapis.com, so sensitive text will leave your machine. If you need fully local processing, disable translation or provide a local translation alternative.
- Missing model: config.json references ./models/yolov5s.onnx but no model file is included. Ask the author how the model is provided, or supply your own local model before running.
- Excessive dependencies: package.json and package-lock.json include heavy native ML packages (tfjs-node, onnxruntime-node) and node-pre-gyp, which can download prebuilt binaries or trigger native builds during npm install. Expect large downloads and network activity during installation; consider installing in an isolated environment (container/VM) and reviewing package-lock.json thoroughly.
- Inconsistency: SKILL.md recommends installing system tesseract, but the code uses tesseract.js. Confirm whether a system tesseract binary is actually required in your environment.
Recommendations: review/verify the model file and package-lock, run installation in an isolated sandbox, or request an explicit install script and explanation from the author (why tfjs is included, where the ONNX model is hosted). If you are processing sensitive images/text, disable the public-translate call or confirm a privacy-safe translation provider.
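The missing-model concern can be checked before the skill ever runs. The following is a minimal sketch, assuming the model path has already been read from config.json and is passed in as a string; the helper name checkModelFile and its return shape are hypothetical and not part of the skill.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical preflight helper (not from the skill's code):
// resolve a model path relative to the skill directory and
// report whether a regular file actually exists there.
function checkModelFile(skillDir, modelPath) {
  const resolved = path.resolve(skillDir, modelPath);
  let exists = false;
  try {
    exists = fs.statSync(resolved).isFile();
  } catch {
    // statSync throws when the path is missing; treat that as "not present".
    exists = false;
  }
  return { resolved, exists };
}
```

A caller could refuse to start object detection when exists is false, rather than failing later inside the inference call.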
Functional analysis
Type: OpenClaw Skill
Name: tiexue-vision
Version: 1.0.0
The skill provides OCR and object detection capabilities using Tesseract and ONNX, which broadly matches the documentation in SKILL.md. It includes a translation feature that sends extracted text to a public Google Translate endpoint (translate.googleapis.com), which is a disclosed and legitimate function. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found across the codebase or configuration files.
Capability assessment
Purpose & Capability
Name/description (image OCR + object/scene detection + optional translation) align with the code. However, there are mismatches: SKILL.md suggests installing a system tesseract binary and points to an OCR executable path in config.json, but the code uses tesseract.js (a JS worker). The skill claims fully local operation except translation, yet package.json and package-lock.json include heavy ML/native deps (e.g., @tensorflow/tfjs-node, onnxruntime-node) that are disproportionate for the simple ONNX inference shown in index.js. A required YOLO model path (./models/yolov5s.onnx) is referenced in config.json/index.js but no model file is included in the package manifest.
Instruction Scope
SKILL.md and index.js instructions are mostly consistent: they read images, run OCR and object detection, and write results back to Feishu or create a .txt file. The code only reads local config.json and the provided image, so there is no broad filesystem scraping. Important scope note: the code transmits recognized English text to the public Google Translate endpoint (translate.googleapis.com), an external network call that sends user data off-host. That is disclosed in SKILL.md (which states that, apart from translation, the skill does not depend on external cloud services), but users should be aware that recognized text will be sent to Google.
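One way to keep recognized text on the host is to gate the translation step behind an explicit opt-in. This is a minimal sketch under stated assumptions: translateIfAllowed, the allowTranslation flag, and the injected translate callback are all hypothetical names, and nothing here reflects how index.js is actually structured.

```javascript
// Hypothetical gate (not from the skill's code): text is only handed
// to the translator when the caller has explicitly opted in.
// `translate` is an injected function, e.g. a local translator or a
// wrapper around a remote service; it may return a string or a Promise.
function translateIfAllowed(text, allowTranslation, translate) {
  if (!allowTranslation) {
    // Stay fully local: return the OCR text unchanged.
    return text;
  }
  return translate(text);
}
```

Because the translator is injected, the same call site works with a privacy-safe local provider; callers using an asynchronous translator can simply await the return value.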
Install Mechanism
No explicit install spec in the registry entry, but package.json lists heavy native dependencies (onnxruntime-node, @tensorflow/tfjs-node). These packages commonly trigger native builds or prebuilt-binary downloads (node-pre-gyp). Because an install mechanism isn't specified, it's unclear how the runtime environment will install dependencies; npm install could download/compile native binaries and contact third-party package servers. The package includes a large package-lock.json which pulls many transitive packages not strictly required by the index.js logic (e.g., tfjs). This is disproportionate and increases attack surface.
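Before running npm install, the manifest can be scanned for the native packages named in this review. The sketch below assumes a parsed package.json object is passed in; the function name findNativeDeps and the watch list are illustrative, with the package names taken only from the text above.

```javascript
// Package names mentioned in this review; extend the list as needed.
const NATIVE_PACKAGES = [
  "@tensorflow/tfjs-node",
  "onnxruntime-node",
  "node-pre-gyp",
];

// Hypothetical helper: return the watched packages that appear in the
// dependencies or devDependencies of a parsed package.json object.
function findNativeDeps(pkg) {
  const declared = {
    ...(pkg.dependencies || {}),
    ...(pkg.devDependencies || {}),
  };
  return NATIVE_PACKAGES.filter((name) =>
    Object.prototype.hasOwnProperty.call(declared, name)
  );
}
```

This only inspects direct dependencies; transitive packages would still need a review of package-lock.json itself.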
Credentials
The skill declares no required environment variables or credentials. The code reads a local config.json and uses a 'feishu' client object passed in context for Feishu integration (so it relies on host-provided client rather than asking for tokens). config.json contains fields for a translation apiKey but the code uses the public Google Translate endpoint without using that key. No other unrelated credentials are requested.
Persistence & Privilege
The skill is not always-enabled and does not request elevated or persistent platform privileges. It does not modify other skills or system settings. It reads and writes local files only in the image's directory (writes image.txt) and updates Feishu content when invoked with Feishu context.
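The file-writing behavior described here (a .txt file next to the image, sharing its base name) can be sketched as follows. This assumes the image extension is replaced rather than appended to; resultPathFor is a hypothetical name, and the skill's real naming logic may differ.

```javascript
const path = require("node:path");

// Hypothetical helper: derive the result file path for an image by
// keeping its directory and base name and swapping the extension
// for ".txt".
function resultPathFor(imagePath) {
  const { dir, name } = path.parse(imagePath);
  return path.join(dir, `${name}.txt`);
}
```

Keeping the output in the image's own directory matches the narrow write scope noted above, since no other location is touched.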
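How to use
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install tiexue-vision
- After installation, call the skill by name or trigger it with /tiexue-vision
- Provide the required inputs according to the skill's parameter description to get structured output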
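Version history
v1.0.0
vision - all-in-one image recognition, 1.0.0 release
- Recognizes text, objects, and scenes in images from chats, documents, and local folders
- Text recognition supports Chinese and English; English text is automatically translated into Chinese
- Results can be written back to the original chat/document or saved to a `.txt` file with the same name
- Provides a command-line mode for image recognition
- Runs fully locally, with no dependence on external cloud services apart from translation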
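FAQ
What is Tiexue Vision?
Recognizes text (Chinese/English), objects, and scenes in images from chat, documents, or local files, with optional translation and auto-saving results. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 82 total downloads to date.
How do I install Tiexue Vision?
Run the command "/install tiexue-vision" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is Tiexue Vision free?
Yes. Tiexue Vision is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.
Which platforms does Tiexue Vision support?
Tiexue Vision is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Tiexue Vision?
It is developed and maintained by LDT5200-sys (@ldt5200-sys); the current version is v1.0.0.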