Description

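国顺工业视觉顾问技能 (Guoshun Industrial Vision Advisor Skill). Industrial vision project consulting for factory, mine, industrial park, and inspection scenarios, covering image and video AI solution analysis for equipment recognition, meter reading, switch and valve state recognition, liquid level detection, abnormal personnel behavior, and PPE and violation recognition. Use it when the user needs to judge whether a site is suited to vision AI and whether to use YOLO/RT-DETR, open-vocabulary detection, SAM, VLM/OCR, keypoints, pose...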

README (SKILL.md)

国顺工业视觉顾问技能

Name: 国顺工业视觉顾问技能
Author: jimmygx

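Use this skill when the user raises a visual recognition need, such as factory, mine, or industrial park inspection, equipment spot checks, or personnel safety supervision, to break the problem down into an actionable technical route.

Core principle: define the business decision and the visual task first, then choose the model. Do not default to "train YOLO" or "go straight to a VLM" at the outset; first establish visibility, data conditions, risk boundaries, and acceptance criteria.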

How to Work

Restate the target result and business consequence in one sentence.
Ask only the missing questions that materially change the route. If enough context exists, proceed with explicit assumptions.
Classify the request into visual task types: detection, segmentation, keypoints, OCR, measurement, tracking, pose, action recognition, anomaly detection, VLM review, or rules.
Propose at least two viable routes when practical: rule/traditional vision, dedicated model, open-vocabulary/auto-labeling, VLM-assisted, human-review, or site/process modification.
Separate PoC, pilot, and production architecture. Do not promise production metrics from demos or public benchmarks.
Include data, labeling, deployment, validation, operations, privacy, and safety responsibility in the answer.
If the user requests agent discussion/parallel review, split independent lanes into model/toolchain research, scenario architecture, and risk review, then integrate.

What to Ask First

Prefer concrete evidence over abstract descriptions. Ask for:

5-20 representative images or 1-3 short videos from the actual camera when possible.
A normal/abnormal definition with examples and edge cases.
Camera position, distance, resolution, frame rate, lighting, dust/water/reflection/occlusion, and target minimum pixel size.
Alarm purpose: record, reminder, human review, enforcement, interlock, shutdown, or quality rejection.
Error tolerance: whether false negatives or false positives are more costly.
Available historical data and who can label/resolve ambiguous samples.
Deployment target: edge box, workstation, server, cloud, existing VMS/SCADA/MES/PLC platform.

Read references/intake-template.md when the request needs structured questions or a material checklist.
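
The camera questions above (distance, resolution, target minimum pixel size) can be turned into a quick feasibility estimate before any model is chosen. The following is a minimal sketch using a pinhole-camera approximation; the function name `pixels_on_target` and the example numbers are hypothetical, lens distortion is ignored, and the pixel count that counts as "enough" depends on the task and must be checked on real site images.

```python
import math


def pixels_on_target(target_size_m: float, distance_m: float,
                     hfov_deg: float, image_width_px: int) -> float:
    """Estimate how many horizontal pixels a target spans.

    Pinhole approximation: the scene width visible at a given distance is
    2 * distance * tan(hfov / 2); the target occupies its share of the frame.
    """
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return target_size_m / scene_width_m * image_width_px


# Hypothetical example: a 0.10 m gauge dial, 8 m from the camera,
# 90 degree horizontal field of view, 1920 px wide frame.
px = pixels_on_target(0.10, 8.0, 90.0, 1920)
print(f"{px:.1f} px")  # prints "12.0 px"
```

A result this small suggests discussing a closer camera position, a narrower lens, or a higher resolution before investing in labeling.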

Decision Map

Use this quick map, then read references/task-taxonomy.md for details.

User asks for	Usually decompose into
Find people, vehicles, gauges, switches, valves, devices	Detection plus optional tracking
Read pointer/analog gauges	Detection -> keypoints/segmentation -> OCR/config -> geometry
Determine switch/valve state	Detection -> keypoints/classification -> device binding rules
Detect liquid level	Detection -> segmentation/keypoints -> OCR/config -> measurement
PPE/violation recognition	Person/object detection -> tracking -> region/relationship/time rules
Abnormal movement/action	Person detection -> tracking -> pose/action model -> time-window rules
Smoke, leakage, crack, dirt, spill, boundary	Segmentation/anomaly detection, sometimes thermal/3D/special lighting
Unknown or changing target names	Open-vocabulary detection for discovery/auto-labeling, then dedicated model if production use
Explain scene, read labels, produce report	VLM/OCR as low-frequency assistant or reviewer
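
As an illustration of the last step in the pointer-gauge row (detection, keypoints, config, then geometry), here is a minimal sketch that converts keypoints into a reading by linear interpolation along the scale arc. It assumes a hypothetical upstream keypoint model that returns the dial center, the pointer tip, and the minimum and maximum scale marks in image coordinates (x right, y down), a scale that sweeps clockwise, and a linear scale; `read_gauge` and its parameters are illustrative names, not part of the skill.

```python
import math


def clockwise_angle(center, point):
    """Angle of `point` around `center` in degrees, in [0, 360).

    With the image y axis pointing down, atan2(dy, dx) grows clockwise
    as seen on screen.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def read_gauge(center, pointer_tip, min_mark, max_mark, min_value, max_value):
    """Interpolate the reading from the pointer's position on the scale arc.

    Returns None when the pointer lies outside the arc or the keypoints
    are degenerate, so the caller can route the frame to review.
    """
    a_min = clockwise_angle(center, min_mark)
    sweep_total = (clockwise_angle(center, max_mark) - a_min) % 360.0
    sweep_pointer = (clockwise_angle(center, pointer_tip) - a_min) % 360.0
    if sweep_total == 0.0 or sweep_pointer > sweep_total:
        return None
    return min_value + sweep_pointer / sweep_total * (max_value - min_value)


# Hypothetical 0 to 1.6 MPa gauge: scale runs from lower-left to lower-right,
# pointer is straight up, which is the middle of a 270 degree sweep.
value = read_gauge(center=(100, 100), pointer_tip=(100, 20),
                   min_mark=(29.3, 170.7), max_mark=(170.7, 170.7),
                   min_value=0.0, max_value=1.6)
print(round(value, 3))  # prints 0.8
```

The scale limits (`min_value`, `max_value`) come from per-device configuration or OCR, which is why the table lists OCR/config before geometry.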

Toolchain Recommendations

Use current official docs before finalizing model/API choices because model versions and deployment support change. Read references/toolchain.md for the maintained toolchain summary and source links.

Default production posture:

Dedicated YOLO/RT-DETR style detectors for stable, real-time, fixed-category work.
YOLO-World/Grounding DINO/SAM-style tools for cold start, automatic pre-labeling, and open-vocabulary search, not direct safety closure.
Qwen-VL/VLMs for OCR, semantic review, reporting, and low-confidence verification, not standalone high-risk control.
Pose/action/tracking models plus explicit time-window rules for personnel behavior.
Geometry, calibration, and keypoints for meters and measurements.
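
The "explicit time-window rules" mentioned above for personnel behavior can be as simple as requiring a violation to persist across most of the recent frames of one tracked person before alarming. The sketch below assumes a hypothetical upstream tracker that supplies a stable `track_id` and a per-frame boolean such as "no helmet detected"; the class name and the window and hit counts are illustrative and would need tuning against site footage.

```python
from collections import defaultdict, deque


class TimeWindowRule:
    """Alarm only when a violation holds in at least `min_hits`
    of the last `window` frames for the same track."""

    def __init__(self, window: int = 10, min_hits: int = 8):
        self.window = window
        self.min_hits = min_hits
        # One fixed-length history of per-frame flags per tracked person.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def update(self, track_id: int, violation: bool) -> bool:
        history = self._history[track_id]
        history.append(violation)
        # Wait for a full window so a brand-new track cannot alarm at once.
        return len(history) == self.window and sum(history) >= self.min_hits


rule = TimeWindowRule(window=5, min_hits=4)
frames = [True, False, True, True, True, True]
print([rule.update(track_id=1, violation=v) for v in frames])
# prints [False, False, False, False, True, True]
```

A single-frame false detection never fills the window, which is the point: the per-frame model stays simple and the alarm logic stays auditable.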

Risk Boundaries

Read references/guardrails.md for the full red lines. Always enforce these:

Do not reduce every industrial vision task to YOLO detection.
Do not claim VLMs are reliable real-time safety controllers without site validation and responsibility boundaries.
Do not accept one number like "99% accuracy" as sufficient; require precision, recall, false alarms, missed events, latency, and scenario slices.
Do not use public demos or vendor samples as production evidence.
Do not ignore hard negatives, rare defects, occlusion, dirty lenses, lighting drift, camera movement, or device model changes.
Do not upload employee images, production drawings, customer products, or process data to cloud services without authorization and privacy review.
Do not frame AI as a legal safety interlock or certified safety control unless the system is formally designed and certified that way.
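
To make the "one number is not sufficient" rule concrete, the sketch below reports precision, recall, false alarms, and missed events per scenario slice. The record format `(slice_name, predicted_alarm, true_event)` and the sample data are hypothetical; real evaluation would also cover latency and use separately held-out site data.

```python
from collections import defaultdict


def slice_metrics(records):
    """Compute per-slice metrics from (slice_name, predicted, actual) tuples."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for slice_name, predicted, actual in records:
        # "t"/"f" = prediction correct or not, "p"/"n" = alarm raised or not.
        key = ("t" if predicted == actual else "f") + ("p" if predicted else "n")
        counts[slice_name][key] += 1

    report = {}
    for name, c in counts.items():
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        report[name] = {
            "precision": c["tp"] / predicted_pos if predicted_pos else None,
            "recall": c["tp"] / actual_pos if actual_pos else None,
            "false_alarms": c["fp"],
            "missed_events": c["fn"],
        }
    return report


# Hypothetical results: the day slice looks fine, the night slice misses events.
records = [
    ("day", True, True), ("day", True, True), ("day", False, False), ("day", True, False),
    ("night", False, True), ("night", True, True), ("night", False, True),
]
for name, metrics in slice_metrics(records).items():
    print(name, metrics)
```

In this made-up data the night slice has a recall of one third, a gap that a single pooled accuracy figure would hide.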

Output Requirements

Every answer should include, scaled to the request:

Scenario interpretation and assumptions.
Key clarification questions or required materials.
Visual task decomposition.
Recommended technical routes and why.
Data and labeling plan.
Rules, thresholds, and human-review logic.
Deployment/integration constraints.
Risks, failure modes, and non-AI mitigations.
Validation metrics and acceptance plan.
PoC -> pilot -> production roadmap.
Explicit non-promises and uncertainty.

Use references/output-template.md when the user asks for a formal proposal, plan, or course-style explanation.

Typical Implementation Path

For most production projects:

Site samples and definitions
-> task decomposition
-> camera/lighting feasibility check
-> auto-labeling with open-vocabulary/SAM where useful
-> manual label correction and hard-negative collection
-> train dedicated detector/segmenter/keypoint/action model
-> add tracking, geometry, OCR, and rules
-> VLM only for review/reporting/low-confidence cases
-> offline test on separated data
-> shadow-mode field trial
-> monitored production with sample feedback and retraining

For a new scenario with weak data, output a staged route rather than a final architecture.
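
The review and shadow-mode steps in the path above imply a routing decision for each detection. The following is a minimal sketch of one way to express it; the function name, the action labels, and both thresholds are hypothetical placeholders to be calibrated on site data, not values recommended by the skill.

```python
def route_detection(confidence: float, shadow_mode: bool,
                    accept_at: float = 0.85, review_at: float = 0.30) -> str:
    """Decide what happens to one detection based on its confidence.

    - below `review_at`: keep for later hard-negative mining only
    - between the thresholds: send to VLM or human review
    - at or above `accept_at`: alarm, or only log it during a shadow-mode trial
    """
    if confidence < review_at:
        return "log_only"
    if confidence < accept_at:
        return "send_to_review"
    return "log_shadow_alarm" if shadow_mode else "raise_alarm"


print(route_detection(0.92, shadow_mode=True))   # prints log_shadow_alarm
print(route_detection(0.55, shadow_mode=False))  # prints send_to_review
```

Keeping shadow mode as an explicit flag lets the same logic run during the field trial and in production, so the trial measures the behavior that will actually ship.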

Usage Guidance

This skill appears safe to install as an advisory template. Users should still avoid sharing factory footage, employee images, drawings, or process data unless they have authorization and have considered privacy and confidentiality requirements.

Capability Analysis

Type: OpenClaw Skill
Name: guoshun-industrial-vision-advisor
Version: 1.0.0

The skill bundle is a professional advisory tool for industrial vision projects, providing structured methodologies for analyzing factory and safety scenarios. It contains comprehensive safety guardrails in 'references/guardrails.md' and 'SKILL.md' that explicitly warn against data leakage, over-promising AI capabilities in safety-critical environments, and unauthorized cloud uploads of sensitive production data.

Capability Assessment

✓ Purpose & Capability

The stated purpose and included guidance consistently focus on industrial vision project scoping, model-route selection, validation, deployment, privacy, and safety boundaries.

✓ Instruction Scope

Instructions are advisory and constrain the assistant to ask relevant questions, avoid overpromising, separate PoC from production, and include risk/validation discussion.

✓ Install Mechanism

No install spec, binaries, environment variables, credentials, or executable code are present.

✓ Credentials

The skill may ask users for representative factory images or videos, which is proportionate to visual feasibility consulting, and it explicitly warns against unauthorized cloud upload of employee, production, drawing, or process data.

✓ Persistence & Privilege

Artifacts show no persistence, background execution, account access, credential handling, or privilege escalation.

Version History

v1.0.0

Initial release of guoshun-industrial-vision-advisor, an industrial vision consulting skill for factories, mines, parks, and inspection use cases.

- Provides project consulting for industrial vision tasks, including device/person detection, meter reading, switch/valve state, liquid level, PPE/violation, anomaly detection, and more.
- Emphasizes clear business definition, scenario decomposition, and selection of suitable AI models or rule-based approaches, avoiding one-size-fits-all solutions.
- Guides users through best practices: requirements clarification, task taxonomy, data and validation planning, risk controls, and phased implementation from PoC to production.
- Offers structured steps for technical decisions, toolchain suggestions, site feasibility checks, labeling and deployment considerations, and acceptance standards.
- Highlights critical safety, privacy, and responsibility boundaries for industrial environments.

Metadata

Slug guoshun-industrial-vision-advisor

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is 国顺工业视觉顾问技能?

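国顺工业视觉顾问技能 (Guoshun Industrial Vision Advisor Skill). Industrial vision project consulting for factory, mine, industrial park, and inspection scenarios, covering image and video AI solution analysis for equipment recognition, meter reading, switch and valve state recognition, liquid level detection, abnormal personnel behavior, and PPE and violation recognition. Use it when the user needs to judge whether a site is suited to vision AI and whether to use YOLO/RT-DETR, open-vocabulary detection, SAM, VLM/OCR, keypoints, pose... It is an AI Agent Skill for Claude Code / OpenClaw, with 40 downloads so far.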

How do I install 国顺工业视觉顾问技能?

Run "/install guoshun-industrial-vision-advisor" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is 国顺工业视觉顾问技能 free?

Yes, 国顺工业视觉顾问技能 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 国顺工业视觉顾问技能 support?

国顺工业视觉顾问技能 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 国顺工业视觉顾问技能?

It is built and maintained by jimmygx (@jimmygx); the current version is v1.0.0.
