Total downloads: 213
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install vlm-grounding
Description
Use GLM-4.7V's multimodal grounding capability to detect and locate objects/text in images. Activate when user asks to find, locate, detect, or ground specif...
Safe-use recommendations
Treat this skill as potentially unsafe until you have verified the following:
1) Who published it, and do you trust that owner?
2) Inspect the bundled ssssss.json log; remove it, or understand why session/tool content and example authorization headers are included.
3) Confirm whether the skill actually needs to read /root/.openclaw/agents/main/agent/models.json or call internal IPs; if so, restrict it to an isolated environment and ensure no sensitive networks or configs are exposed.
4) Watch for prompt-injection patterns in SKILL.md (base64/unicode control chars); ask the author to remove hidden or encoded content and to explicitly declare any needed config paths or credentials.
If you cannot validate these points, run the skill only in a sandboxed agent or decline to install.
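The check for base64/unicode control characters can be partly automated. The following is a minimal sketch, not the marketplace's own scanner: the function name `find_hidden_content` and the 40-character threshold are assumptions chosen for illustration, and the base64 heuristic will also flag long hashes or paths, so every hit needs manual review.

```python
import re
import unicodedata

# Heuristic (assumed threshold): runs of 40+ base64-alphabet characters.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def find_hidden_content(text):
    """Return (kind, position, detail) tuples for suspicious spans in text."""
    findings = []
    for match in BASE64_RUN.finditer(text):
        # Keep only a short prefix so the report stays readable.
        findings.append(("base64-like", match.start(), match.group()[:20]))
    for index, char in enumerate(text):
        category = unicodedata.category(char)
        # Cf = format characters (zero-width, bidi overrides);
        # Cc = control characters, ignoring ordinary whitespace.
        if category == "Cf" or (category == "Cc" and char not in "\n\r\t"):
            findings.append(("control-char", index, f"U+{ord(char):04X}"))
    return findings


# Example: a zero-width space hidden between two words.
print(find_hidden_content("hello\u200bworld"))
```

In practice the text would be read from SKILL.md; an empty result does not prove the file is safe, it only means these two patterns were not found.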
Functional analysis
Type: OpenClaw Skill
Name: vlm-grounding
Version: 1.0.0
The skill bundle provides instructions and code snippets for performing image grounding (object detection) using a GLM-4.7V model. It describes a standard workflow of calling an internal API (172.20.112.202), parsing bounding box coordinates from the response, and visualizing them on an image. No indicators of data exfiltration, malicious execution, or prompt injection were found in SKILL.md or _meta.json.
Capability assessment
Purpose & Capability
SKILL.md describes a reasonable grounding workflow (call model, parse boxes, draw visualizations). However the doc references a system config path (/root/.openclaw/agents/main/agent/models.json) and internal hosts (e.g., 172.20.112.202) without declaring that it needs access to those configs or network endpoints — this is an unexplained dependency on internal configuration.
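One way to surface undeclared internal endpoints such as 172.20.112.202 is to scan the skill text for private IPv4 addresses. This is a rough sketch using Python's standard `ipaddress` module; the helper name `find_private_hosts` is hypothetical, and it does not cover hostnames or IPv6.

```python
import ipaddress
import re

# Candidate dotted quads; validity is checked by ipaddress below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_private_hosts(text):
    """Return the sorted, de-duplicated private IPv4 addresses found in text."""
    hits = set()
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a valid address, e.g. an octet above 255
        if address.is_private:
            hits.add(candidate)
    return sorted(hits)


print(find_private_hosts("POST http://172.20.112.202/v1 then ping 8.8.8.8"))
# prints ['172.20.112.202']
```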
Instruction Scope
Instructions tell the agent to contact an HTTP model API and to set NO_PROXY to bypass proxying (which affects network routing). They also include guidance that could cause the agent to read or use system-local config to locate model endpoints. The SKILL.md itself contains prompt-like material and the package contains a large session log (ssssss.json) with system/tool lists; combined with detected base64/unicode-control patterns, this raises concern about embedded prompt-injection or unintended privileged instructions.
Install Mechanism
There is no install spec and no code files to be installed; this reduces disk-write risk. The skill is instruction-only, which is lower risk than an install that fetches and executes arbitrary archives.
Credentials
The manifest declares no env vars or credentials, but the instructions tell users to set NO_PROXY and point to internal hosts and a root-owned models.json path. That implies the skill expects access to internal network and possibly system config; those capabilities are not declared. The included session log also exposes an 'authorization' header (Bearer idonthaveakey in the sample)—an unexpected token-like artifact that could confuse or be misused.
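To check a bundled session log for token-like artifacts before installing, the JSON can be walked recursively for keys named `authorization`. The sketch below assumes nothing about the real layout of ssssss.json; the `sample` structure is invented for illustration, and only the header value comes from the review above.

```python
import json


def find_authorization_values(node, path="$"):
    """Recursively collect (json_path, value) for keys named 'authorization'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key.lower() == "authorization":
                found.append((child, value))
            found.extend(find_authorization_values(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_authorization_values(item, f"{path}[{index}]"))
    return found


# Hypothetical log shape; the actual file may be organized differently.
sample = json.loads(
    '{"messages": [{"headers": {"Authorization": "Bearer idonthaveakey"}}]}'
)
print(find_authorization_values(sample))
# prints [('$.messages[0].headers.Authorization', 'Bearer idonthaveakey')]
```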
Persistence & Privilege
The skill is not marked always:true and does not request persistent privileges. It appears user-invocable only, which is appropriate for this type of helper.
How to use
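- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install vlm-grounding
- After installation, call the Skill by name or use /vlm-grounding to trigger it.
- Provide the required inputs according to the Skill's parameter description to get structured output.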
Version history
v1.0.0
- Initial release of multimodal grounding skill using GLM-4.7V.
- Supports detecting and locating objects, text, and UI elements in images with bounding box outputs.
- Provides end-to-end workflow: model API call, bounding box parsing, and visualization on images.
- Includes robust parsing for various bracket styles and auto-renormalization of coordinates.
- Triggers automatically on grounding-related user requests in both English and Chinese.
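The bracket-tolerant parsing and coordinate renormalization mentioned in the release notes could look roughly like the sketch below. It is not the skill's actual code: the accepted bracket styles, the function name `parse_boxes`, and the assumption that the model returns coordinates on a 0 to 1000 scale that must be mapped to pixels are all illustrative guesses.

```python
import re

NUM = r"(-?\d+(?:\.\d+)?)"
SEP = r"(?:\s*,\s*|\s+)"  # comma or whitespace between numbers
# Four numbers wrapped in [], (), {} or <> (assumed bracket styles).
BOX_PATTERN = re.compile(r"[\[({<]\s*" + SEP.join([NUM] * 4) + r"\s*[\])}>]")


def parse_boxes(text, width, height, scale=1000.0):
    """Extract boxes from text and map them to pixel coordinates.

    Assumes (x1, y1, x2, y2) order on a 0..scale grid.
    """
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        x1, y1, x2, y2 = (float(value) for value in match.groups())
        left, right = sorted((x1, x2))  # tolerate swapped corners
        top, bottom = sorted((y1, y2))

        def to_pixels(value, size):
            # Rescale to pixels and clamp to the image bounds.
            return min(max(round(value * size / scale), 0), size)

        boxes.append((
            to_pixels(left, width), to_pixels(top, height),
            to_pixels(right, width), to_pixels(bottom, height),
        ))
    return boxes


reply = "cat at [100, 200, 500, 800] and dog at (0 0 250 250)"
print(parse_boxes(reply, width=640, height=480))
# prints [(64, 96, 320, 384), (0, 0, 160, 120)]
```

The resulting pixel tuples can be passed to any drawing library for the visualization step.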
Metadata
FAQ
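What is vlm-grounding?
Use GLM-4.7V's multimodal grounding capability to detect and locate objects/text in images. Activate when user asks to find, locate, detect, or ground specif... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 213 downloads to date.
How do I install vlm-grounding?
Run the command "/install vlm-grounding" in the OpenClaw or Claude Code chat box for a one-click install; no additional configuration is required.
Is vlm-grounding free?
Yes. vlm-grounding is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does vlm-grounding support?
vlm-grounding is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed vlm-grounding?
It is developed and maintained by Ji Qi (@qijimrc); the current version is v1.0.0.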