jaredforreal

GLM-V-Prompt-Gen

by Jared Wen · GitHub ↗ · v1.0.3 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 427 · Stars: 1 · Active Installs: 1 · Versions: 4
Install in OpenClaw
/install glmv-prompt-gen
Description
Analyze images/videos and generate professional prompts for text-to-image and text-to-video AI tools (Midjourney, Stable Diffusion, DALL-E, Sora, Runway, Kli...
README (SKILL.md)

GLM-V Prompt Generation Skill

Analyze reference images or videos and generate professional prompts for AI image/video generation tools.

When to Use

  • Generate prompts for text-to-image tools (Midjourney, Stable Diffusion, DALL-E, etc.)
  • Generate prompts for text-to-video tools (Sora, Runway, Kling, Pika, etc.)
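  • User mentions "生成prompt" (generate prompt), "文生图prompt" (text-to-image prompt), "文生视频prompt" (text-to-video prompt), "prompt工程" (prompt engineering), "参考图生成prompt" (generate prompt from a reference image), or "generate prompt"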
  • User provides an image/video and wants to recreate or remix it
  • Extract prompt ideas from reference visual content

Supported Input Types

| Type | Formats | Max Size | Max Count | Base64 |
| --- | --- | --- | --- | --- |
| Image | jpg, png, jpeg | 5MB / 6000×6000px | 50 | ✅ |
| Video | mp4, mkv, mov | 200MB | | ❌ (URL only) |

⚠️ Images and videos cannot be used in the same request.

⚠️ Videos only support URLs. Local paths and base64 are NOT supported.
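
The constraints above can be checked before any request is made. The following is a minimal sketch, not the logic of scripts/prompt_gen.py: the names `check_inputs`, `is_url`, and `extension` are hypothetical, and the size limits are left out because checking them requires reading the files.

```python
from pathlib import Path
from urllib.parse import urlparse

# Formats and the image count limit are taken from the table above.
# The function names are hypothetical, not part of prompt_gen.py.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTS = {".mp4", ".mkv", ".mov"}
MAX_IMAGES = 50


def is_url(value: str) -> bool:
    """True for http(s) URLs, False for local paths and anything else."""
    return urlparse(value).scheme in ("http", "https")


def extension(value: str) -> str:
    """Lower-cased file extension of a local path or URL (query string ignored)."""
    path = urlparse(value).path if is_url(value) else value
    return Path(path).suffix.lower()


def check_inputs(images: list[str], videos: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if images and videos:
        problems.append("images and videos cannot be mixed in one request")
    if not images and not videos:
        problems.append("provide at least one image or video")
    if len(images) > MAX_IMAGES:
        problems.append(f"too many images: {len(images)} > {MAX_IMAGES}")
    for item in images:
        if extension(item) not in IMAGE_EXTS:
            problems.append(f"unsupported image format: {item}")
    for item in videos:
        if not is_url(item):
            problems.append(f"video must be a URL: {item}")
        if extension(item) not in VIDEO_EXTS:
            problems.append(f"unsupported video format: {item}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.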

📋 Output Display Rules (MANDATORY)

After running the script, you must display the full prompt output exactly as returned. Do not summarize, truncate, or only say "prompt generated". Users need the complete prompt (especially the English prompt) for direct copy/paste.

  • Show the full output: content analysis + prompt + prompt breakdown
  • In auto mode, show both text-to-image and text-to-video prompts
  • English prompts are core output and must be shown completely
  • If output was saved (-o), provide the file path and show file content

Output Modes

| Mode | Description |
| --- | --- |
| image | Generate prompts for text-to-image tools (default) |
| video | Generate prompts for text-to-video tools |
| auto | Generate prompts for both image and video |

Resource Links

| Resource | Link |
| --- | --- |
| Get API Key | https://bigmodel.cn/usercenter/proj-mgmt/apikeys |
| API Docs | Chat Completions |

Prerequisites

API Key Setup (Required)

This script reads the key from the ZHIPU_API_KEY environment variable, which is shared with other Zhipu skills.

Get Key: Visit the Zhipu Open Platform API Keys page (https://bigmodel.cn/usercenter/proj-mgmt/apikeys) to create or copy your key.

Setup options (choose one):

  1. OpenClaw config (recommended): Set in openclaw.json under skills.entries.glmv-prompt-gen.env:

    "glmv-prompt-gen": { "enabled": true, "env": { "ZHIPU_API_KEY": "your-api-key" } }
    
  2. Shell environment variable: Add to ~/.zshrc:

    export ZHIPU_API_KEY="your-api-key"
    

💡 If you already configured another Zhipu skill (for example zhipu-tools or glmv-caption), it shares the same ZHIPU_API_KEY, so no extra setup is needed.
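
As an illustration of the environment-variable lookup described above, here is a minimal sketch. The function name `get_api_key` and the wording of the error message are assumptions; the actual script may handle a missing key differently.

```python
import os


def get_api_key(env=None) -> str:
    """Return the Zhipu API key from the environment, or raise a helpful error.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY is not set. Add it to openclaw.json "
            "or export it in your shell."
        )
    return key
```

Failing early with a message that names the variable matches the guidance in Error Handling: the user is pointed at the configuration step instead of seeing an authentication error later.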

How to Use

Image → Text-to-Image Prompt

python scripts/prompt_gen.py --images "https://example.com/photo.jpg"
python scripts/prompt_gen.py --images /path/to/photo.png
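
The two commands above differ only in where the image comes from. Since local images are supported through base64 while URLs are passed along as-is, the distinction might be handled roughly as follows. This is a sketch under assumptions: `to_image_ref` is a hypothetical helper, and whether the API expects raw base64 or a data URL is not stated in this document.

```python
import base64
from pathlib import Path
from urllib.parse import urlparse


def to_image_ref(source: str) -> str:
    """Return an http(s) URL unchanged, or the base64 text of a local file."""
    if urlparse(source).scheme in ("http", "https"):
        return source
    # Local path: read the bytes and encode them (raw base64 is an assumption).
    data = Path(source).read_bytes()
    return base64.b64encode(data).decode("ascii")
```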

Image → Text-to-Video Prompt

python scripts/prompt_gen.py --images "https://example.com/scene.jpg" --mode video

Image → Both (Image + Video Prompts)

python scripts/prompt_gen.py --images "https://example.com/photo.jpg" --mode auto

Video → Text-to-Video Prompt

python scripts/prompt_gen.py --videos "https://example.com/clip.mp4" --mode video

Save Result to File

python scripts/prompt_gen.py --images photo.jpg --mode image -o prompt.md

Custom Model

python scripts/prompt_gen.py --images photo.jpg --model glm-4.6v-flash

Output Example (image mode)

### Content Analysis
A cyberpunk cityscape at night, with dense skyscrapers, glowing neon signs, and rain-wet streets reflecting colorful light.

### Prompt
Cyberpunk cityscape at night, towering skyscrapers with glowing neon signs,
rain-wet streets reflecting colorful lights, flying cars in the distance,
volumetric fog, dramatic lighting, ultra detailed, 8K, cinematic composition

### Prompt Breakdown
- **Subject**: Futuristic skyline with skyscrapers and neon lights
- **Style**: Cyberpunk, sci-fi
- **Color**: Cool/warm contrast with blue-purple dominance and neon accents
- **Lighting**: Neon glow, wet-surface reflections, volumetric fog
- **Composition**: Wide-angle perspective with layered depth
- **Mood**: Mysterious, futuristic, high-tech
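
If the output needs to be post-processed (for example, to copy only the Prompt section), the `###` headings in the example above can be used as delimiters. The sketch below assumes the output always follows that heading layout, which may not hold in every mode; `split_sections` is a hypothetical helper, not part of the skill.

```python
import re


def split_sections(markdown: str) -> dict[str, str]:
    """Map each '### Heading' to the text that follows it, in document order."""
    sections = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^###\s+(.*\S)\s*$", line)
        if match:
            # A new heading starts a new section.
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    # Join the collected lines and trim surrounding blank lines.
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

For the example above, `split_sections(output)["Prompt"]` would give the three-line English prompt without the analysis or breakdown.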

CLI Reference

python scripts/prompt_gen.py (--images IMG [IMG...] | --videos VID [VID...]) [OPTIONS]

| Parameter | Required | Description |
| --- | --- | --- |
| --images, -i | One of | Image paths or URLs (jpg/png/jpeg, base64 OK) |
| --videos, -v | One of | Video URLs (mp4/mkv/mov, URL only) |
| --mode, -m | No | Output mode: image (default), video, or auto |
| --model | No | Model name (default: glm-4.6v) |
| --temperature, -t | No | Sampling temperature 0-1 (default: 0.6) |
| --max-tokens | No | Max output tokens (default: 2048) |
| --thinking | No | Enable thinking/reasoning mode |
| --stream | No | Enable streaming output |
| --output, -o | No | Save result to file |
| --pretty | No | Pretty-print JSON error output |
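
For readers who want to see how the options above fit together, this is one way such an interface could be declared with argparse. It mirrors the flags and defaults listed in the reference but is only a sketch; the real parser in scripts/prompt_gen.py may be structured differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="prompt_gen.py")

    # Exactly one of --images / --videos must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", "-i", nargs="+", metavar="IMG")
    source.add_argument("--videos", "-v", nargs="+", metavar="VID")

    parser.add_argument("--mode", "-m", choices=["image", "video", "auto"],
                        default="image")
    parser.add_argument("--model", default="glm-4.6v")
    parser.add_argument("--temperature", "-t", type=float, default=0.6)
    parser.add_argument("--max-tokens", type=int, default=2048)
    parser.add_argument("--thinking", action="store_true")
    parser.add_argument("--stream", action="store_true")
    parser.add_argument("--output", "-o")
    parser.add_argument("--pretty", action="store_true")
    return parser
```

The mutually exclusive group expresses the rule that images and videos cannot be combined in one request, so the conflict is rejected at the command line.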

Error Handling

API key not configured: → Guide user to configure ZHIPU_API_KEY

Authentication failed (401/403): → API key invalid/expired → check it on the Zhipu API Keys page

Rate limit (429): → Quota exhausted → wait and retry

Content filtered: → warning field present → content blocked by safety review

Timeout: → Video processing may take time → increase timeout or use smaller files
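
The cases above can be summarized as a small lookup from failure signal to user guidance. The sketch below is illustrative only: `explain_error` and `backoff_delays` are hypothetical names, and the exponential backoff schedule is an assumption, since this document only says "wait and retry".

```python
from typing import Optional


def explain_error(status: Optional[int] = None, timed_out: bool = False,
                  has_warning: bool = False) -> str:
    """Translate a failure signal into the guidance listed above."""
    if timed_out:
        return "Timeout: increase the timeout or use a smaller file."
    if status in (401, 403):
        return "Authentication failed: check that ZHIPU_API_KEY is valid and not expired."
    if status == 429:
        return "Rate limit: quota exhausted, wait and retry."
    if has_warning:
        return "Content filtered: blocked by safety review."
    return "Unexpected error: inspect the raw response."


def backoff_delays(attempts: int, base: float = 1.0) -> list[float]:
    """Wait times in seconds for retrying after a 429 (assumed schedule)."""
    return [base * 2 ** i for i in range(attempts)]
```

A caller could sleep for each value from `backoff_delays(3)` in turn before giving up on a rate-limited request.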

Usage Guidance
This skill appears to do what it says: it reads and encodes local images, or accepts URLs, and sends them to Zhipu/BigModel to generate prompts. Before installing, consider:
  1. Privacy: any local image you provide will be uploaded to an external service, so avoid sensitive images.
  2. Credential reuse: the ZHIPU_API_KEY will be used by this and other Zhipu skills. Store it securely, rotate it if needed, and avoid sharing a production or privileged key.
  3. Trust in the provider: verify you are comfortable sending your data to bigmodel.cn / open.bigmodel.cn, and review their privacy policy and terms.
  4. Higher assurance: review the full script locally (it uses only requests and explicit API calls) or run it in an isolated environment.
If anything about credential sharing or remote uploads is unacceptable, do not install or use the skill.
Capability Analysis
Type: OpenClaw Skill
Name: glmv-prompt-gen
Version: 1.0.3
The skill is a legitimate tool designed to generate AI image and video prompts by analyzing visual content via the Zhipu GLM-V API. The Python script `scripts/prompt_gen.py` correctly implements communication with the official Zhipu endpoint (open.bigmodel.cn) and handles local file reading for images without any signs of data exfiltration, obfuscation, or unauthorized execution. The instructions in `SKILL.md` are consistent with the tool's stated purpose and do not contain any malicious prompt injection attempts.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
Name/description (prompt generation from visual inputs) aligns with the code and runtime instructions. The script sends image/video content to BigModel (open.bigmodel.cn) and constructs prompts — requiring a ZHIPU_API_KEY is expected for this integration.
Instruction Scope
Instructions and the script legitimately read local images (or accept URLs/base64), encode/send them to the Zhipu API, and require the full prompt output to be displayed. This is within scope, but the skill will upload user-provided images/videos (including local files) to an external service; users should be aware of privacy implications. The SKILL.md also states the API key is shared with other Zhipu skills (operational note about credential reuse).
Install Mechanism
No install spec; instruction-only with a single Python script. No downloads from untrusted URLs or archive extraction are present. The script uses the requests library (normal for network calls).
Credentials
Only a single required environment variable (ZHIPU_API_KEY) is requested, which matches the external BigModel/Zhipu API usage. The policy note that this key is reused across other Zhipu skills is expected but worth user awareness.
Persistence & Privilege
The `always` flag is false and the skill is user-invocable. It does not request permanent platform-wide privileges or modify other skills' configs. Autonomous invocation is left enabled (the default), which is not a concern on its own because no other red flags are present.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install glmv-prompt-gen
  3. After installation, invoke the skill by name or use /glmv-prompt-gen
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.3
  • Updated dependency metadata: the "bins" requirement for Python was removed under "openclaw" in skill metadata.
  • No changes to functionality, usage instructions, or code behavior.
v1.0.2
  • No user-visible changes in this version.
  • Version 1.0.2 released with no file changes detected.
v1.0.1
  • Added OpenClaw-specific metadata, including environment and binary requirements, emoji, and homepage link.
  • Specified ZHIPU_API_KEY as the primary required environment variable.
  • No functional changes to core skill; documentation and metadata improvements only.
  • No code or logic changes detected in this release.
v1.0.0
Initial release of GLM-V Prompt Generation Skill.
  • Analyze images or videos and generate professional prompts for AI-powered text-to-image and text-to-video tools (supports Midjourney, Stable Diffusion, DALL-E, Sora, Runway, Kling, Pika).
  • Supports multiple input and output modes, including image-only, video-only, or both (auto).
  • Provides detailed output: content analysis, full English prompt, and prompt breakdown for direct use or further editing.
  • Comprehensive usage and setup documentation, including API key configuration and error handling guidance.
  • CLI options for advanced control: output file saving, custom models, streaming, and more.
Metadata
Slug glmv-prompt-gen
Version 1.0.3
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 4
Frequently Asked Questions

What is GLM-V-Prompt-Gen?

Analyze images/videos and generate professional prompts for text-to-image and text-to-video AI tools (Midjourney, Stable Diffusion, DALL-E, Sora, Runway, Kli... It is an AI Agent Skill for Claude Code / OpenClaw, with 427 downloads so far.

How do I install GLM-V-Prompt-Gen?

Run "/install glmv-prompt-gen" in the OpenClaw or Claude Code chat to install it in one step. Before first use, configure the ZHIPU_API_KEY environment variable as described under Prerequisites.

Is GLM-V-Prompt-Gen free?

Yes, GLM-V-Prompt-Gen is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does GLM-V-Prompt-Gen support?

GLM-V-Prompt-Gen is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created GLM-V-Prompt-Gen?

It is built and maintained by Jared Wen (@jaredforreal); the current version is v1.0.3.
