Downloads: 430
Stars: 0
Active Installs: 1
Versions: 3
Install in OpenClaw
/install uni-vision-engine
Description
Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service. Features native OpenClaw image interception, a...
Usage Guidance
Before installing or enabling this skill, consider the following:
- Privacy: The SKILL.md tells the agent to automatically 'intercept' image payloads from the chat; ensure you want the agent to extract and upload user-provided images (including any sensitive content) to a model service. Confirm user consent flows.
- Embedded credential: The script contains a hard-coded session token. Treat this as a secret — it may grant access to the Jimeng service. Either remove it or replace it with a configuration mechanism (and do not publish secrets).
- Runtime install: The script runs 'npm install' via shell at execution time. If you prefer deterministic installs, pre-install dependencies in a controlled environment rather than allowing runtime network installs.
- Local service trust: The script talks to localhost:5100 (jimeng-api Docker). Verify that this local service is under your control and does not forward images or logs externally. If you don't run a local jimeng-api, the script will fail; the embedded session token may also indicate that requests are ultimately meant for a non-local endpoint, so investigate before use.
- Principle of least privilege: If you cannot audit the docker image/service and you do not trust the embedded token, do not enable the automatic interception behavior; require explicit user consent or manual upload instead.
Suggested remediation: edit the SKILL.md and scripts to remove the hard-coded token, replace the runtime npm install with an explicit installation step, and make image interception explicit and consent-driven.
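As a small illustration of the local service trust point, the sketch below checks that a configured endpoint resolves to a loopback host before anything is sent to it. The function name `isLoopbackUrl` is hypothetical and not part of the skill. The check only covers where the script connects; it cannot show whether the jimeng-api container forwards data onward.

```javascript
// Hypothetical helper: returns true only for loopback hosts.
// It does not prove that the service behind the port keeps data local.
function isLoopbackUrl(rawUrl) {
  const { hostname } = new URL(rawUrl);
  return (
    hostname === 'localhost' ||
    hostname === '127.0.0.1' ||
    hostname === '[::1]'
  );
}

// The port matches the one named in the report (localhost:5100).
console.log(isLoopbackUrl('http://localhost:5100'));
console.log(isLoopbackUrl('https://example.com/upload'));
```

A caller could refuse to upload images whenever this check returns false.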
Capability Analysis
Type: OpenClaw Skill
Name: uni-vision-engine
Version: 1.0.2
The skill contains a hardcoded session token in `scripts/generate.js` and uses `execSync` to dynamically install the `form-data` NPM package at runtime, which is a risky practice. While the behavior aligns with the stated purpose of interfacing with a local Docker-based video generation API (localhost:5100), the inclusion of hardcoded credentials and the use of shell execution for dependency management are significant security vulnerabilities.
Capability Assessment
Purpose & Capability
The skill's name and description match the included code: it expects a local jimeng-api (localhost:5100) and performs image→video/text→video requests. However, the bundled script contains a hard-coded session token (a cached credential) and a default model/port; the SKILL.md says a valid sessionid is required, but the code will silently use the embedded session if you don't supply one. Embedding a credential inside the code is unexpected given the manifest lists no required credentials and is disproportionate to the stated install-free instruction-only approach.
Instruction Scope
SKILL.md states that the agent 'MUST' intercept image payloads from chat context (base64 or cache path), save them locally (/tmp/target.jpg), and submit them automatically. That gives the agent broad discretion to read chat internals and extract binary content. While this is necessary for image-to-video functionality, the instructions are imperative and broad (they could be applied to any image in chat) and therefore increase the risk of unintended data access or exfiltration. The instructions also require monitoring Docker logs to retrieve results, and no explicit safeguards or user consent checks are specified.
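One way to narrow that scope is to upload only images the user has explicitly approved. The following is a minimal sketch under assumed data shapes: the `id` and `path` fields and the `selectImagesForUpload` function are hypothetical and do not come from the skill or from OpenClaw.

```javascript
// Hypothetical consent gate: only images whose ids the user approved pass through.
function selectImagesForUpload(images, consentedIds) {
  const allowed = new Set(consentedIds);
  return images.filter((img) => allowed.has(img.id));
}

// Illustrative data; real chat payloads will look different.
const chatImages = [
  { id: 'img-1', path: '/tmp/a.jpg' },
  { id: 'img-2', path: '/tmp/b.jpg' },
];

const approved = selectImagesForUpload(chatImages, ['img-2']);
console.log(approved.map((img) => img.id)); // prints [ 'img-2' ]
```

With an empty consent list nothing is selected, so the default is to upload no images.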
Install Mechanism
There is no declared install spec, but scripts/generate.js performs a dynamic runtime 'npm install form-data --no-save' via child_process.execSync when run. This is a network operation that installs code from the npm registry at execution time and writes to disk (node_modules). Dynamic, implicit installs via execSync are higher-risk than a declared install step and are surprising for an 'instruction-only' skill.
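A declared install step could be paired with a start-up check that fails with a clear message instead of installing anything. The sketch below assumes a CommonJS script; `isInstalled` and `requireDeclared` are hypothetical names, not functions from `scripts/generate.js`.

```javascript
// Sketch: verify a dependency is already present rather than installing it at run time.
function isInstalled(moduleName) {
  try {
    require.resolve(moduleName);
    return true;
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return false;
    throw err; // unexpected failure: surface it unchanged
  }
}

// Loads the module, or stops with instructions for a manual, auditable install.
function requireDeclared(moduleName) {
  if (!isInstalled(moduleName)) {
    throw new Error(
      `Missing dependency "${moduleName}". Install it beforehand, ` +
        `for example with "npm install ${moduleName}".`
    );
  }
  return require(moduleName);
}

console.log(isInstalled('fs'));
```

In this arrangement the script would call `requireDeclared('form-data')` where it currently shells out to npm, and the network install would happen only when the user runs it deliberately.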
Credentials
The skill declares no required environment variables or credentials, but the script contains a hard-coded sessionToken string (b79fcda2...). Embedded credentials are effectively secret material and are disproportionate/unexpected. The script also accepts a --session override, but shipping a usable session inside the code can be abused or leak access. The code sends user-provided images to localhost:5100 only (no obvious external endpoints), but you should verify the local jimeng-api doesn't forward data externally.
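A configuration mechanism that avoids the embedded fallback could look roughly like this. The `--session` flag is the override the report mentions; the environment variable name `JIMENG_SESSION_ID` and the function `resolveSession` are assumptions made for illustration.

```javascript
// Sketch: take the session id from --session or an environment variable,
// and refuse to continue when neither is supplied (no built-in default).
function resolveSession(argv, env) {
  const flagIndex = argv.indexOf('--session');
  const fromFlag = flagIndex !== -1 ? argv[flagIndex + 1] : undefined;
  const session = fromFlag || env.JIMENG_SESSION_ID; // assumed variable name
  if (!session) {
    throw new Error(
      'No session id supplied: pass --session <id> or set JIMENG_SESSION_ID.'
    );
  }
  return session;
}

// Example with explicit inputs; a real script would pass
// process.argv.slice(2) and process.env instead.
console.log(resolveSession(['--session', 'example-id'], {}));
```

Failing loudly when no credential is provided removes the silent use of a shipped token and makes the requirement visible to the user.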
Persistence & Privilege
The skill does not request 'always: true' and does not modify other skills or system-wide configuration. It will, however, create node_modules at runtime when dynamically installing 'form-data' and writes temp image files to /tmp. Those behaviors are relatively limited but worth noting as they create on-disk artifacts and perform network installs the first time they're invoked.
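The on-disk artifacts from temporary images can be limited by writing each image into a private, uniquely named directory and removing it afterwards. This is a sketch only: `withTempImage` is a hypothetical helper, it handles synchronous callbacks only, and it needs Node.js 14.14 or later for `fs.rmSync`.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Writes the image into a fresh temp directory, runs the handler with the
// file path, and always deletes the directory afterwards.
function withTempImage(imageBuffer, handler) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'uni-vision-'));
  const file = path.join(dir, 'target.jpg');
  try {
    fs.writeFileSync(file, imageBuffer, { mode: 0o600 }); // owner-only access
    return handler(file);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Illustrative use with placeholder bytes instead of a real image.
const size = withTempImage(Buffer.from('placeholder'), (file) => fs.statSync(file).size);
console.log(size);
```

Compared with a fixed path such as /tmp/target.jpg, a unique directory avoids collisions between concurrent runs and leaves nothing behind once the handler returns.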
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install uni-vision-engine
- After installation, invoke the skill by name or use /uni-vision-engine
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.2
- Updated documentation and description for clearer English instructions and broader accessibility.
- Removed the UI server component (`ui/server.js`), signaling a shift to fully headless, automation-focused operation.
- Skill no longer references integrated web UI; focus is now on native chat-based and script-driven image/video generation.
- Clarified workflow for automated image interception, strict file handling, and moderation/error responses.
- CLI instructions and usage examples updated for consistency with the latest workflow.
v1.0.1
Uni Vision Engine v1.0.1 Changelog
- Added ui/server.js, introducing a web-based UI server component to the project.
- Enhanced description: now highlights a built-in visual web upload tool and native image extraction from chat, supporting direct image-to-video generation from chat interfaces.
- Updated SKILL.md with clearer best practice guidance for handling image messages and using filesystem-based uploads for high-quality video generation.
- Added detailed instructions for dealing with domestic content review mechanisms and error codes.
v1.0.0
Uni Vision Engine 1.0.0
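- Initial release: supports automatic generation of watermark-free, high-quality video (text-to-video, image-to-video) through a local jimeng-api.
- Integrated automatic credit check to ensure the account has sufficient credits before a video is generated.
- On completion, tasks are written to the history log automatically, and all tasks can be managed visually in a local HTML console.
- Supports full control of the video generation flow via sessionid.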
Frequently Asked Questions
What is uni-vision-engine?
Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service. Features native OpenClaw image interception, a... It is an AI Agent Skill for Claude Code / OpenClaw, with 430 downloads so far.
How do I install uni-vision-engine?
Run "/install uni-vision-engine" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is uni-vision-engine free?
Yes, uni-vision-engine is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does uni-vision-engine support?
uni-vision-engine is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created uni-vision-engine?
It is built and maintained by jiahuamld (@jiahuamld); the current version is v1.0.2.