
Vision Sandbox

Name: Vision Sandbox
Author: johanesalxd

By Jo Alex · GitHub · v1.1.0

cross-platform ✓ Security check passed

Total downloads: 6564
Current installs: 35
Versions: 2

Install in OpenClaw

/install vision-sandbox

Description

Agentic Vision via Gemini's native Code Execution sandbox. Use for spatial grounding, visual math, and UI auditing.

Safe usage advice

Install only if you are comfortable sending the chosen images, screenshots, prompts, and resulting analysis to Google Gemini under your API key. Avoid submitting secrets, credentials, private customer data, or confidential screenshots unless that is allowed by your data-handling policy, and use a constrained or monitored Gemini key where possible.

Functional analysis

Type: OpenClaw Skill
Name: vision-sandbox
Version: 1.1.0

The skill is designed to leverage Google Gemini's vision capabilities, including its native code execution sandbox. The core logic in `scripts/vision_executor.py` reads an image, sends it to the Gemini API along with a prompt, and enables code execution *within Google's remote sandbox environment*. The local script does not execute arbitrary code received from the model; it only prints the sandbox code and its output. File operations are limited to reading the user-provided input image and writing output images generated by the Gemini model. There is no evidence of data exfiltration, malicious local execution, persistence mechanisms, or prompt injection attempts against the OpenClaw agent in any of the analyzed files.
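The flow described above can be sketched roughly as follows. This is not the contents of `scripts/vision_executor.py`; it is a minimal illustration that assumes the `google-genai` Python SDK, a `GEMINI_API_KEY` in the environment, and a placeholder model name. The helper names `render_parts` and `analyze_image` are hypothetical. The point to notice is that returned code is only formatted as text and is never passed to `exec` or a subprocess.

```python
import mimetypes
import os
from pathlib import Path
from types import SimpleNamespace


def render_parts(parts):
    """Format response parts as text. Model-generated code is printed, never run locally."""
    lines = []
    for part in parts:
        text = getattr(part, "text", None)
        code = getattr(part, "executable_code", None)
        result = getattr(part, "code_execution_result", None)
        if text:
            lines.append(f"[text] {text}")
        if code is not None:
            lines.append(f"[sandbox code]\n{code.code}")
        if result is not None:
            lines.append(f"[sandbox output]\n{result.output}")
    return lines


def analyze_image(image_path, prompt, model="gemini-2.5-flash"):
    """Send one image plus a prompt to Gemini with hosted code execution enabled.

    Assumptions: google-genai is installed, GEMINI_API_KEY is set,
    and the model name is only a placeholder.
    """
    # Imported lazily so render_parts can be used without the SDK installed.
    from google import genai
    from google.genai import types

    path = Path(image_path)
    mime_type = mimetypes.guess_type(path.name)[0] or "image/png"
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_bytes(data=path.read_bytes(), mime_type=mime_type),
            prompt,
        ],
        config=types.GenerateContentConfig(
            # Code runs in Google's hosted sandbox, not on this machine.
            tools=[types.Tool(code_execution=types.ToolCodeExecution())]
        ),
    )
    return render_parts(response.candidates[0].content.parts)


# Offline demo with stand-in objects (no network call is made here).
demo_parts = [
    SimpleNamespace(text="Counting buttons.", executable_code=None, code_execution_result=None),
    SimpleNamespace(text=None, executable_code=SimpleNamespace(code="print(2 + 2)"), code_execution_result=None),
    SimpleNamespace(text=None, executable_code=None, code_execution_result=SimpleNamespace(output="4\n")),
]
print("\n".join(render_parts(demo_parts)))
```

Calling `analyze_image("screenshot.png", "How many buttons are visible?")` would perform the real request; the file name and prompt here are only examples.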

Capability assessment

ℹ Purpose & Capability

The capability matches the stated purpose: the local script reads a user-specified image, sends it with a prompt to Gemini, enables Gemini's hosted code execution tool, prints returned code/results, and saves any returned inline images. This is sensitive but coherent and disclosed as a vision-sandbox workflow.

ℹ Instruction Scope

The instructions focus on visual grounding, visual math, and UI auditing. The README includes an example where another coding agent uses the visual result to update CSS, but that is a user-directed follow-on workflow rather than hidden mutation by this skill.

ℹ Install Mechanism

Installation uses ClawHub or local Python packaging with uv and google-genai. No automatic persistence or privileged installer behavior is shown, though the dependency is version-ranged rather than pinned.
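Because the dependency is declared as a version range, the version that actually resolves can differ from one install to the next. One way to make that visible is to record the installed version at startup. The sketch below uses only the standard library; the function name `installed_version` is illustrative, and the distribution name `google-genai` is taken from the text above.

```python
from importlib import metadata


def installed_version(package="google-genai"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("google-genai is not installed in this environment")
else:
    print(f"google-genai resolved to version {version}")
```

Logging this value alongside results makes it easier to tell whether a behavior change came from the skill or from a newer SDK release.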

ℹ Credentials

A GEMINI_API_KEY and network submission of selected images/prompts are proportionate for a Gemini vision integration, but users should treat screenshots and prompts as data sent to an external provider.
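A small pre-flight check can fail fast when the key is missing, before any image is read or sent. This is a sketch under the assumption that the key is supplied through the `GEMINI_API_KEY` environment variable named above; the helper `require_gemini_key` is hypothetical and not part of the skill.

```python
import os


def require_gemini_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before running the skill."
        )
    return key
```

Accepting the environment as a parameter keeps the check easy to test and avoids storing the key anywhere other than the process environment.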

✓ Persistence & Privilege

No background service, local execution of model-generated code, credential storage, privilege escalation, or broad filesystem access is shown. The only local write found is saving model-returned inline media as sandbox_output_*.png in the current directory.
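The local write described above might look roughly like this. It is a sketch, not the skill's actual code: the source only gives the `sandbox_output_*.png` pattern, so the zero-based counter is an assumption, as is the helper name `save_inline_images`. Parts are assumed to expose `inline_data.data` as bytes, which is how the `google-genai` SDK represents inline media.

```python
from pathlib import Path


def save_inline_images(parts, out_dir="."):
    """Write each inline image part to sandbox_output_<n>.png and return the paths."""
    out_dir = Path(out_dir)
    saved = []
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is None or not inline.data:
            continue
        # Fixed name pattern: nothing from the model response influences the path.
        target = out_dir / f"sandbox_output_{len(saved)}.png"
        target.write_bytes(inline.data)
        saved.append(target)
    return saved
```

Deriving the file name from a local counter rather than from anything in the response keeps writes confined to the chosen directory.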

How to use

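1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install vision-sandbox
3. Once installation finishes, call the Skill by name or trigger it with /vision-sandbox.
4. Provide the inputs listed in the Skill's parameter description to get structured output.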

Version history

v1.1.0

Migrate to standard OpenClaw tool configuration

v1.0.0

Initial public release

Metadata

Slug vision-sandbox

Version 1.1.0

License not specified

Total installs 247

Current installs 35

Versions released 2

FAQ

What is Vision Sandbox?

Agentic Vision via Gemini's native Code Execution sandbox. Use for spatial grounding, visual math, and UI auditing. It is an AI agent Skill plugin for Claude Code / OpenClaw, with 6564 total downloads to date.

How do I install Vision Sandbox?

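Run the command "/install vision-sandbox" in the OpenClaw or Claude Code chat box for a one-step install. The install itself needs no extra configuration, but the skill requires a GEMINI_API_KEY to call Gemini.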

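Is Vision Sandbox free?

Yes. Vision Sandbox is completely free (free and open source) to download, install, and use.

Which platforms does Vision Sandbox support?

Vision Sandbox is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Vision Sandbox?

It is developed and maintained by Jo Alex (@johanesalxd). The current version is v1.1.0.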