vision-skill

Name: vision-skill
Author: lgwanai

By lgwanai · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 375

Install in OpenClaw

/install vision-skill

Description

Use this skill for computer vision tasks including image recognition (OCR, object detection) and image generation (text-to-image, image-to-image). Supports a...

Security recommendations

Key points before installing:
- Do NOT trust the registry metadata that says 'no env vars': this skill requires your Tencent COS keys and a Doubao/Volcengine API key. Only provide those secrets if you intend the skill to upload images to your COS bucket and call the Doubao API.
- Verify the API endpoint: the client uses https://ark.cn-beijing.volces.com/api/v3, which does not match the README link to console.volcengine.com. Confirm this hostname is legitimate for your provider, or replace it with an official endpoint from your Doubao/Volcengine account.
- Use least-privilege credentials: create a COS bucket and keys scoped to that bucket (and consider short-lived tokens if possible) rather than reusing broad permanent keys.
- Inspect and run the code in an isolated environment first (e.g., a throwaway VM or container). The scripts write to a local .tasks directory and .tasks/worker.log, spawn background worker processes, and upload local files to COS. Confirm that this behavior is acceptable.
- If you will upload sensitive images, set the COS bucket permissions appropriately (private by default) and review how temporary URLs are generated and used.
- If anything looks off (metadata mismatch, unusual base_url, or unexpected network endpoints), ask the publisher for clarification or consider alternative, better-audited tools.
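The endpoint check recommended above can be sketched as a small allowlist test. This is only an illustration: `is_trusted_endpoint` is a hypothetical helper that is not part of the skill, and the allowlist must contain only hostnames you have confirmed yourself with your provider.

```python
from urllib.parse import urlsplit


def is_trusted_endpoint(base_url, trusted_hosts):
    """Return True only for https URLs whose host is exactly on the allowlist.

    `trusted_hosts` is supplied by the caller; this helper is hypothetical
    and is not taken from the skill's code.
    """
    parts = urlsplit(base_url)
    # Exact host match, so look-alike hosts such as
    # "trusted.example.attacker.net" are rejected.
    return parts.scheme == "https" and parts.hostname in trusted_hosts


if __name__ == "__main__":
    # Assumption: you have verified this hostname in your own account docs.
    confirmed = {"ark.cn-beijing.volces.com"}
    print(is_trusted_endpoint("https://ark.cn-beijing.volces.com/api/v3", confirmed))
```

Matching on the exact hostname rather than on a substring is deliberate: a substring test would accept any domain that merely contains the trusted name.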

Functional analysis

Type: OpenClaw Skill
Name: vision-skill
Version: 1.0.0

The vision-skill bundle is a legitimate tool for integrating Doubao AI vision and image generation models with Tencent Cloud COS storage. It implements an asynchronous task architecture using a background worker (worker.py) and a local task tracking system (.tasks/ directory). The code follows standard practices for API integration, including environment variable configuration for secrets and retry logic for network calls. No evidence of malicious intent, data exfiltration, or unauthorized execution was found; the use of subprocess in vision_cli.py is limited to hardcoded process management and task execution.
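The analysis mentions retry logic for network calls without showing it. A generic sketch of that pattern, retrying with exponential backoff, might look like the following. The function name, parameters, and defaults are assumptions chosen for illustration and do not reflect the skill's actual code.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=0.5,
                    retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep):
    """Call func(), retrying on transient errors with exponential backoff.

    All names and defaults here are hypothetical. `sleep` is injectable so
    the backoff schedule can be checked without actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Only the exception types listed in `retry_on` are retried, so programming errors such as `ValueError` fail immediately instead of being masked by repeated attempts.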

Capability assessment

ℹ Purpose & Capability

The name and description cover vision recognition and image generation, and the code implements Tencent COS uploads and calls to a Doubao (Volcengine) API, so the capabilities align with the stated purpose. However, the registry metadata lists no required env vars, while the SKILL.md, README, and code require COS_* and DOUBAO_* credentials. The metadata is therefore inconsistent with the actual requirements.

✓ Instruction Scope

SKILL.md and CLI instruct uploading local images to COS, calling Doubao endpoints, storing async task files under a local .tasks/ directory, and optionally downloading generated images — the instructions and included code stay within that scope and do not attempt to read unrelated system files or credentials beyond those needed for COS/Doubao.

ℹ Install Mechanism

This is labelled as instruction-only in the registry, but the package includes Python source and a requirements.txt (requests, python-dotenv, cos-python-sdk-v5). There is no download-from-URL step or opaque installer; installing means pip-installing the listed dependencies and running the bundled scripts. The discrepancy between 'no install spec' and the presence of code is noteworthy but not inherently malicious.

⚠ Credentials

The code requires Tencent COS credentials (COS_SECRET_ID, COS_SECRET_KEY, COS_BUCKET_NAME, COS_REGION) and DOUBAO_API_KEY (plus optional fallback model vars). Those credentials are appropriate for the described cloud storage and model API usage, but the registry metadata incorrectly declares 'Required env vars: none', which is a meaningful mismatch. The COS client also uses permanent keys (Token=None), so users should understand that they are providing long-lived access keys rather than short-lived tokens.
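Because the registry metadata does not declare these variables, it can help to check for them explicitly before running anything. The sketch below uses the variable names listed in this review; the helper `missing_env_vars` is hypothetical and not part of the skill.

```python
import os

# Variable names as listed in the review above.
REQUIRED_VARS = (
    "COS_SECRET_ID",
    "COS_SECRET_KEY",
    "COS_BUCKET_NAME",
    "COS_REGION",
    "DOUBAO_API_KEY",
)


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank.

    Hypothetical helper; pass a dict as `environ` to check a config
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

The check reports only variable names, never their values, so its output is safe to paste into a bug report.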

✓ Persistence & Privilege

The skill does not request always:true or global agent privileges. It writes task state and logs under a local .tasks/ directory and spawns worker processes when a task is submitted — expected for an async CLI-style skill. It does not modify other skills' configs or system-wide settings.
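To make the file-based task tracking concrete, here is a minimal sketch of how task state could be kept as JSON files in a directory such as .tasks/. The file layout, field names, status values, and function names are all assumptions for illustration; the skill's actual on-disk format is not described in this listing.

```python
import json
import uuid
from pathlib import Path


def _write(path, record):
    # Write to a temp file and rename, so a reader never sees a
    # half-written task file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(record), encoding="utf-8")
    tmp.replace(path)


def create_task(tasks_dir, payload):
    """Create a pending task record and return its id (hypothetical schema)."""
    tasks_dir = Path(tasks_dir)
    tasks_dir.mkdir(parents=True, exist_ok=True)
    task_id = uuid.uuid4().hex
    record = {"id": task_id, "status": "pending",
              "payload": payload, "result": None}
    _write(tasks_dir / f"{task_id}.json", record)
    return task_id


def read_task(tasks_dir, task_id):
    """Load a task record from disk."""
    path = Path(tasks_dir) / f"{task_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def update_task(tasks_dir, task_id, **changes):
    """Merge changes into a task record, as a worker process might do."""
    record = read_task(tasks_dir, task_id)
    record.update(changes)
    _write(Path(tasks_dir) / f"{task_id}.json", record)
    return record
```

In a design like this, the submitting process calls `create_task` and returns immediately, while a separate worker later calls `update_task` with the final status and result. Anything stored in the payload is written to disk in plain text, which is one reason to review what such a directory contains.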

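How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install vision-skill
Once installed, invoke the Skill by name or trigger it with /vision-skill
Provide the required inputs as described in the Skill's parameter documentation to get structured output.

Version history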

v1.0.0

Initial release of vision-skill, providing end-to-end computer vision and image generation capabilities.
- Supports image recognition (OCR, object detection, content description, Q&A) and flexible image generation (text-to-image, image-to-image, sequential images).
- Integrates with Tencent Cloud COS for image storage and uses Doubao AI models for processing.
- CLI interface via `vision_cli.py` with options for batch tasks, style/format presets, quality modes, and retries.
- All tasks execute asynchronously, with options to wait for completion and save outputs.
- Comprehensive environment variable setup and task management through a local `.tasks/` directory.

Metadata

Slug vision-skill

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Versions 1

FAQ

What is vision-skill?

Use this skill for computer vision tasks including image recognition (OCR, object detection) and image generation (text-to-image, image-to-image). Supports a... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 375 total downloads to date.

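How do I install vision-skill?

Run the command "/install vision-skill" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra setup, but per the security review on this page, the skill requires the COS_* and DOUBAO_API_KEY environment variables before it can run.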

Is vision-skill free?

Yes. vision-skill is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

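Which platforms does vision-skill support?

vision-skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed vision-skill?

It is developed and maintained by lgwanai (@lgwanai); the current version is v1.0.0.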