
uni-vision-engine

Name: uni-vision-engine
Author: jiahuamld

by jiahuamld · v1.0.2 · MIT-0

cross-platform ⚠ suspicious

430 Downloads

Install in OpenClaw

/install uni-vision-engine

Description

Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service. Features native OpenClaw image interception, a...

Usage Guidance

Before installing or enabling this skill, consider the following:

- Privacy: The SKILL.md tells the agent to automatically 'intercept' image payloads from the chat. Make sure you want the agent to extract and upload user-provided images (including any sensitive content) to a model service, and confirm the user consent flow.
- Embedded credential: The script contains a hard-coded session token. Treat this as a secret, since it may grant access to the Jimeng service. Either remove it or replace it with a configuration mechanism, and do not publish secrets.
- Runtime install: The script runs 'npm install' via shell at execution time. If you prefer deterministic installs, pre-install dependencies in a controlled environment rather than allowing runtime network installs.
- Local service trust: The script talks to localhost:5100 (jimeng-api Docker). Verify that this local service is under your control and does not forward images or logs externally. If you don't run a local jimeng-api, the script will fail, or its behavior may reveal that the embedded session token is intended for a non-local endpoint (investigate before proceeding).
- Principle of least privilege: If you cannot audit the Docker image/service and you do not trust the embedded token, do not enable the automatic interception behavior; require explicit user consent or manual upload instead.

If you want help: I can suggest concrete edits to the SKILL.md and scripts to remove the hard-coded token, replace the runtime npm install with an explicit installation step, and make image interception explicit and consent-driven.
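As one way to act on the local service trust point, the sketch below rejects any service URL that is not a loopback address before a request is sent. The function name `isLoopbackUrl` is hypothetical and not part of the skill. It only checks the URL the script is configured with; it cannot tell you whether the jimeng-api container forwards data onward.

```javascript
// Sketch (hypothetical helper, not from the skill): accept only loopback
// endpoints for the jimeng-api base URL.
function isLoopbackUrl(urlString) {
  let url;
  try {
    url = new URL(urlString);
  } catch {
    return false; // not a parseable absolute URL
  }
  const host = url.hostname.toLowerCase();
  // IPv6 hostnames are serialized with brackets by the URL parser.
  if (host === "localhost" || host === "[::1]" || host === "::1") return true;
  // Any address in 127.0.0.0/8 is loopback.
  return /^127(\.\d{1,3}){3}$/.test(host);
}
```

A caller could refuse to upload anything when `isLoopbackUrl("http://localhost:5100")` is false for the configured endpoint.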

Capability Analysis

Type: OpenClaw Skill
Name: uni-vision-engine
Version: 1.0.2

The skill contains a hardcoded session token in `scripts/generate.js` and uses `execSync` to dynamically install the `form-data` NPM package at runtime, which is a risky practice. While the behavior aligns with the stated purpose of interfacing with a local Docker-based video generation API (localhost:5100), the inclusion of hardcoded credentials and the use of shell execution for dependency management are significant security vulnerabilities.

Capability Assessment

ℹ Purpose & Capability

The skill's name and description match the included code: it expects a local jimeng-api (localhost:5100) and performs image→video/text→video requests. However, the bundled script contains a hard-coded session token (a cached credential) and a default model/port; the SKILL.md says a valid sessionid is required, but the code will silently use the embedded session if you don't supply one. Embedding a credential inside the code is unexpected given the manifest lists no required credentials and is disproportionate to the stated install-free instruction-only approach.

⚠ Instruction Scope

SKILL.md explicitly instructs the agent to 'MUST' intercept image payloads from chat context (base64 or cache path), save them locally (/tmp/target.jpg), and submit them automatically. That gives the agent broad discretion to read chat internals and extract binary content. While this is necessary for image-to-video functionality, the instructions are imperative and broad (could be applied to any image in chat) and therefore increase risk of unintended data access/exfiltration. The instructions also require monitoring Docker logs to retrieve results — no explicit safeguards or user consent checks are specified.
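A consent check of the kind this assessment finds missing could look roughly like the sketch below. The attachment shape (`id`, `kind`, `path`) and the function name `selectConsentedImages` are assumptions made for illustration; they are not OpenClaw's real message schema or anything defined by the skill.

```javascript
// Sketch with a hypothetical attachment shape { id, kind, path }.
// Only images the user explicitly approved are returned for upload.
function selectConsentedImages(attachments, consentedIds) {
  const allowed = new Set(consentedIds);
  return attachments.filter(
    (item) => item.kind === "image" && allowed.has(item.id)
  );
}
```

With this shape, an image that appears in chat but was never approved is simply never passed to the upload step, instead of being intercepted by default.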

⚠ Install Mechanism

There is no declared install spec, but scripts/generate.js performs a dynamic runtime 'npm install form-data --no-save' via child_process.execSync when run. This is a network operation that installs code from the npm registry at execution time and writes to disk (node_modules). Dynamic, implicit installs via execSync are higher-risk than a declared install step and are surprising for an 'instruction-only' skill.
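One lower-risk alternative to the runtime install is to check for the dependency and stop with a clear message when it is absent. The sketch below assumes the package would be installed ahead of time in a controlled step; `assertInstalled` is a hypothetical helper name, not code from the skill.

```javascript
// Sketch (hypothetical helper): fail fast when a dependency is missing
// instead of shelling out to `npm install` at execution time.
function assertInstalled(name) {
  try {
    // require.resolve only locates the module; it does not load or install it.
    return require.resolve(name);
  } catch (err) {
    if (err && err.code === "MODULE_NOT_FOUND") {
      throw new Error(
        `Missing dependency "${name}". Install it in advance, ` +
          `for example with "npm install ${name}", then run again.`
      );
    }
    throw err;
  }
}
```

Calling `assertInstalled("form-data")` at startup keeps the script free of network access and shell execution, and leaves the install step visible to whoever deploys the skill.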

⚠ Credentials

The skill declares no required environment variables or credentials, but the script contains a hard-coded sessionToken string (b79fcda2...). Embedded credentials are effectively secret material and are disproportionate/unexpected. The script also accepts a --session override, but shipping a usable session inside the code can be abused or leak access. The code sends user-provided images to localhost:5100 only (no obvious external endpoints), but you should verify the local jimeng-api doesn't forward data externally.
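A minimal sketch of resolving the session without an embedded fallback is shown below. The `--session` flag comes from the assessment above; the environment variable name `JIMENG_SESSION_ID` and the function name `resolveSession` are assumptions introduced here, not names the skill defines.

```javascript
// Sketch (hypothetical helper): take the session id only from explicit
// sources and refuse to run when none is supplied.
// JIMENG_SESSION_ID is an assumed variable name.
function resolveSession(argv, env) {
  const flagIndex = argv.indexOf("--session");
  const candidate = flagIndex !== -1 ? argv[flagIndex + 1] : undefined;
  // Ignore a following flag such as "--model" mistaken for the value.
  const fromFlag =
    candidate && !candidate.startsWith("--") ? candidate : undefined;
  const session = fromFlag || env.JIMENG_SESSION_ID;
  if (!session) {
    throw new Error(
      "No session id supplied: pass --session <id> or set JIMENG_SESSION_ID"
    );
  }
  return session;
}
```

In a script this would be called as `resolveSession(process.argv.slice(2), process.env)`, so a missing credential produces an error rather than silent use of a shipped token.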

ℹ Persistence & Privilege

The skill does not request 'always: true' and does not modify other skills or system-wide configuration. It will, however, create node_modules at runtime when dynamically installing 'form-data' and writes temp image files to /tmp. Those behaviors are relatively limited but worth noting as they create on-disk artifacts and perform network installs the first time they're invoked.
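The on-disk artifact from temp images could be limited with a private, per-run directory that is removed afterwards, instead of a fixed path such as /tmp/target.jpg. The sketch below is an illustration under that assumption; `withTempImage` is a hypothetical name, and the callback must be synchronous because cleanup runs as soon as it returns.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Sketch (hypothetical helper): write image bytes to a unique temp
// directory, hand the path to a synchronous callback, then delete it.
function withTempImage(bytes, fn) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "uni-vision-"));
  const file = path.join(dir, "target.jpg");
  try {
    // Owner-only permissions so other local users cannot read the image.
    fs.writeFileSync(file, bytes, { mode: 0o600 });
    return fn(file);
  } finally {
    // Runs whether fn returned or threw.
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

An asynchronous upload would need an `async` variant that awaits the callback before removing the directory.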

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install uni-vision-engine
After installation, invoke the skill by name or use /uni-vision-engine
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.2

- Updated documentation and description for clearer English instructions and broader accessibility.
- Removed the UI server component (`ui/server.js`), signaling a shift to fully headless, automation-focused operation.
- Skill no longer references integrated web UI; focus is now on native chat-based and script-driven image/video generation.
- Clarified workflow for automated image interception, strict file handling, and moderation/error responses.
- CLI instructions and usage examples updated for consistency with the latest workflow.

v1.0.1

Uni Vision Engine v1.0.1 Changelog

- Added ui/server.js, introducing a web-based UI server component to the project.
- Enhanced description: now highlights a built-in visual web upload tool and native image extraction from chat, supporting direct image-to-video generation from chat interfaces.
- Updated SKILL.md with clearer best practice guidance for handling image messages and using filesystem-based uploads for high-quality video generation.
- Added detailed instructions for dealing with domestic content review mechanisms and error codes.

v1.0.0

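Uni Vision Engine 1.0.0

- Initial release: automatically generates watermark-free, high-quality videos (text-to-video, image-to-video) through a local jimeng-api.
- Integrated automatic credit checking, so a video is generated only once the account is confirmed to have enough credits.
- Writes each completed task to the history automatically, with all tasks managed visually in a local HTML console.
- Supports full control of the video generation flow via sessionid.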

Metadata

Slug uni-vision-engine

Version 1.0.2

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 3

Frequently Asked Questions

What is uni-vision-engine?

Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service. Features native OpenClaw image interception, a... It is an AI Agent Skill for Claude Code / OpenClaw, with 430 downloads so far.

How do I install uni-vision-engine?

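Run "/install uni-vision-engine" in the OpenClaw or Claude Code chat to install it in one step. The skill itself expects a local jimeng-api Docker service on localhost:5100 and a valid sessionid, which are set up separately.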

Is uni-vision-engine free?

Yes, uni-vision-engine is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does uni-vision-engine support?

uni-vision-engine is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created uni-vision-engine?

It is built and maintained by jiahuamld (@jiahuamld); the current version is v1.0.2.
