18072937735

Large Model Visual Question Answering Skill | 大模型视觉问答技能

Author: smyx-skills · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 74
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install smyx-visual-qa-analysis
Description
Conducts open-ended Q&A on image content based on computer vision and large language models, supporting any questions to receive natural language responses....
Security recommendations
This skill will upload images and metadata to external API endpoints (the code points to lifeemergence/open-api hosts by default) and will read/write config files and may create a local SQLite under the workspace. Before installing or running:
  1. Confirm the remote API host and privacy policy; sensitive images should not be uploaded unless you trust the service.
  2. Inspect or sandbox the skill (run in an isolated environment/container); it can create files under OPENCLAW_WORKSPACE and save attachments.
  3. Review workspace config files for secrets that the skill might read; avoid placing sensitive API keys or tokens in shared configs.
  4. If you need strictly local-only VQA, do not use this skill.
If anything is unclear, ask the publisher which endpoints receive images and what data is stored remotely vs locally.
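The advice to review workspace config files for secrets can be partly automated. The following is a minimal sketch, not part of the skill: the function name `find_secret_like_keys`, the key-name pattern, and the set of file suffixes are all assumptions chosen for illustration, and the scan only flags key names, so its results still need a manual read.

```python
import re
from pathlib import Path

# Key names that often hold credentials. This pattern is an assumption
# made for illustration; extend it to match your own naming conventions.
SECRET_KEY_PATTERN = re.compile(
    r"^\s*[\"']?([\w.-]*(?:api[_-]?key|token|secret|password|open[_-]?id)[\w.-]*)[\"']?\s*[:=]",
    re.IGNORECASE,
)

# File types treated as config files (also an assumption).
CONFIG_SUFFIXES = {".yaml", ".yml", ".json", ".ini"}


def find_secret_like_keys(workspace: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line number, key name) for each secret-looking key."""
    hits = []
    for path in sorted(workspace.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in CONFIG_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = SECRET_KEY_PATTERN.match(line)
            if match:
                hits.append((path, lineno, match.group(1)))
    return hits
```

Calling `find_secret_like_keys(Path(workspace_dir))` on the directory that OPENCLAW_WORKSPACE points to returns the locations to inspect before the skill is allowed to run there. The function reports key names only and never prints the values.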
Functional analysis
Type: OpenClaw Skill
Name: smyx-visual-qa-analysis
Version: 1.0.0
The skill implements a complex integration with a third-party cloud service (lifeemergence.com) and contains highly controlling prompt instructions in SKILL.md that explicitly forbid the AI agent from accessing its own local memory or LanceDB, forcing it to rely solely on the remote API for history. The common library (smyx_common) includes logic in util.py and dao.py to automatically register/login users using identifiers like phone numbers, subsequently storing session tokens in a local SQLite database (smyx-common-claw.db). Additionally, AgentSkill in smyx_common/scripts/skill.py uses subprocess.run to recursively invoke the 'openclaw' agent, which is a high-privilege capability. While these features support the stated VQA and health analysis purposes, the aggressive override of agent memory and the automated credential management/persistence are high-risk behaviors.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
The skill's name/description claim a Visual Question Answering feature, which matches the scripts that call an external VQA API. However the repository includes substantial unrelated functionality (face_analysis, pet-health references, TCM face-diagnosis code and a large common library). That broad code surface and references to multiple analysis endpoints are disproportionate to a single VQA skill and increase the attack/abuse surface (e.g., extra API endpoints, DB code).
Instruction Scope
SKILL.md explicitly forbids reading local memory and requires retrieving an open-id from config files; the runtime scripts do read and write local files (validate and read image files, save attachments, and BaseEnum will create/read config.yaml). The scripts call remote APIs and will upload image data. The 'do not use local memory' requirement in documentation is not strongly enforced by the code, producing a mismatch between instructions and actual behavior.
Install Mechanism
There is no install spec (instruction-only), which is lower-risk from an installer standpoint. However the bundle includes a large requirements.txt in smyx_common and face_analysis, implying many external Python packages would be needed to run; there is no automated, vetted install path provided.
Credentials
Registry metadata declares no required environment variables or credentials, but the code reads environment variables (OPENCLAW_SENDER_OPEN_ID, OPENCLAW_WORKSPACE, FEISHU_OPEN_ID) and expects to fetch api-key/open-id from local config files under the skill or workspace. The SKILL.md forces an 'open-id' and instructs checking skill/workspace config files for api-key, yet these env/config accessors were not declared in metadata. This mismatch creates potential for unexpected access to workspace config or secrets.
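One way to check a mismatch like this yourself is to list the environment variables a Python source file reads and compare them with what the metadata declares. The sketch below is an illustration under stated assumptions: `env_vars_read` is a hypothetical helper, it needs Python 3.9 or later (for `ast.unparse`), and it only recognizes literal names passed to `os.getenv`, `os.environ.get`, or `os.environ[...]`, so aliased imports or computed names will be missed.

```python
import ast


def env_vars_read(source: str) -> set[str]:
    """Collect literal names passed to os.getenv, os.environ.get, or os.environ[...]."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        target = None
        if isinstance(node, ast.Call) and node.args:
            # Render the callee as text, e.g. "os.getenv" or "os.environ.get".
            if ast.unparse(node.func) in {"os.getenv", "os.environ.get"}:
                target = node.args[0]
        elif isinstance(node, ast.Subscript) and ast.unparse(node.value) == "os.environ":
            target = node.slice
        # Keep only string literals; variables and expressions are ignored.
        if isinstance(target, ast.Constant) and isinstance(target.value, str):
            names.add(target.value)
    return names
```

Running this over each `.py` file in the bundle and subtracting the declared set (empty, according to the registry metadata) gives the list of undeclared variables to ask the publisher about.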
Persistence & Privilege
The skill will save uploaded attachments to a local attachments directory and the shared common modules include a DAO that writes/reads a local SQLite under a workspace 'data' directory. The skill does not set always:true, but it will create or modify files in the workspace if executed.
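If the skill has already been run, the local SQLite file can be inspected without modifying it. The following is a minimal sketch using Python's standard `sqlite3` module; `list_tables` is a hypothetical helper, and the location of the database file (for example, smyx-common-claw.db under the workspace 'data' directory) is an assumption to verify on your own system.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> dict[str, int]:
    """Return {table name: row count} for a SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse writes, so inspection cannot alter the file.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in tables:
            quoted = name.replace('"', '""')  # escape quotes in identifiers
            counts[name] = conn.execute(
                f'SELECT COUNT(*) FROM "{quoted}"'
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

The table names and row counts show whether session tokens or other records have been persisted. Opening the file read-only is a deliberate choice so that the audit itself leaves no trace in the database.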
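How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install smyx-visual-qa-analysis
  3. Once installation completes, invoke the Skill by name or trigger it with /smyx-visual-qa-analysis
  4. Provide the required inputs according to the Skill's parameter documentation to receive structured output.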
Version history
v1.0.0
Initial release of Visual Q&A Analysis skill:
  - Enables open-ended natural language Q&A for images using computer vision and large language models.
  - Supports image understanding, scene description, detail identification, and knowledge reasoning from user questions and images.
  - Strictly enforces cloud-based history retrieval; never reads or summarizes from local memory or long-term storage.
  - Requires secure open-id acquisition via config file or user prompt before any operation.
  - Provides clear operational instructions, output formatting, and usage constraints for reliability and privacy.
Metadata
Slug: smyx-visual-qa-analysis
Version: 1.0.0
License: MIT-0
Total installs: 1
Current installs: 1
Version count: 1
FAQ

What is Large Model Visual Question Answering Skill | 大模型视觉问答技能?

Conducts open-ended Q&A on image content based on computer vision and large language models, supporting any questions to receive natural language responses.... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 74 total downloads to date.

How do I install Large Model Visual Question Answering Skill | 大模型视觉问答技能?

Run the command "/install smyx-visual-qa-analysis" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Large Model Visual Question Answering Skill | 大模型视觉问答技能 free?

Yes. Large Model Visual Question Answering Skill | 大模型视觉问答技能 is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Large Model Visual Question Answering Skill | 大模型视觉问答技能 support?

Large Model Visual Question Answering Skill | 大模型视觉问答技能 is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Large Model Visual Question Answering Skill | 大模型视觉问答技能?

It is developed and maintained by smyx-skills (@18072937735); the current version is v1.0.0.
