
Image2Prompt

Author: Zhang-Shubo · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 3352
Favorites: 5
Current installs: 11
Versions: 1
Install in OpenClaw
/install image2prompt
Description
Analyze images and generate detailed prompts for image generation. Supports portrait, landscape, product, animal, illustration categories with structured or natural output.
Usage Instructions (SKILL.md)

Image to Prompt

Analyze images and generate detailed, reproduction-quality prompts for AI image generation.

Workflow

Step 1: Category Detection

First, classify the image into one of these categories:

  • portrait — People as main subject (photos, artwork, digital art)
  • landscape — Natural scenery, cityscapes, architecture, outdoor environments
  • product — Commercial product photos, merchandise
  • animal — Animals as main subject
  • illustration — Diagrams, infographics, UI mockups, technical drawings
  • other — Images that don't fit above categories

Step 2: Category-Specific Analysis

Generate a detailed prompt based on the detected category.
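
The two steps can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: `normalize_category` and `build_analysis_request` are hypothetical helper names, the follow-up wording is invented for the example, and the sketch assumes the model's classification reply arrives as free text that mentions one of the six category names.

```python
import re

# The six categories listed in Step 1.
CATEGORIES = ("portrait", "landscape", "product", "animal", "illustration", "other")


def normalize_category(model_reply: str) -> str:
    """Return the first known category named in the reply, else 'other'."""
    for word in re.findall(r"[a-z]+", model_reply.lower()):
        if word in CATEGORIES:
            return word
    return "other"


def build_analysis_request(category: str) -> str:
    """Step 2: phrase a category-specific follow-up request (illustrative wording)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return f"The image category is '{category}'. Generate a detailed prompt for reproducing it."


print(build_analysis_request(normalize_category("Category: Portrait")))
```

Falling back to "other" keeps the workflow moving when the reply names no known category, which mirrors the catch-all category in the list above.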

Usage

Basic Analysis

# Analyze an image (auto-detect category)
openclaw message send --image /path/to/image.jpg "Analyze this image and generate a detailed prompt for reproduction"
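
When the same call is scripted, building the argument list in code avoids shell-quoting mistakes with paths that contain spaces. The sketch below assumes the `openclaw` CLI accepts exactly the arguments shown above; `build_send_command` is a hypothetical helper, and the command is only printed, not executed.

```python
import shlex
from typing import List


def build_send_command(image_path: str, instruction: str) -> List[str]:
    """Build the argv list for the `openclaw message send` call shown above."""
    return ["openclaw", "message", "send", "--image", image_path, instruction]


cmd = build_send_command(
    "/path/to/image.jpg",
    "Analyze this image and generate a detailed prompt for reproduction",
)
# Print a shell-safe rendering; to execute, pass `cmd` to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```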

Specify Output Format

Natural Language (default):

Analyze this image and write a detailed, flowing prompt description (600-1000 words for portraits, 400-600 for others).
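
Because the word-count targets are only requested from the model and not enforced, a caller may want to check the returned text. A minimal sketch, assuming words are counted by whitespace splitting; `within_word_range` is a hypothetical helper name.

```python
# Word-count targets from the request above: (minimum, maximum) words.
WORD_RANGES = {"portrait": (600, 1000)}
DEFAULT_RANGE = (400, 600)  # applies to every non-portrait category


def within_word_range(prompt: str, category: str) -> bool:
    """Check whether a generated prompt meets the target length for its category."""
    low, high = WORD_RANGES.get(category, DEFAULT_RANGE)
    return low <= len(prompt.split()) <= high


print(within_word_range("word " * 700, "portrait"))   # 700 words against 600-1000
print(within_word_range("word " * 700, "landscape"))  # 700 words against 400-600
```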

Structured JSON:

Analyze this image and output a structured JSON description with all visual elements categorized.

With Dimensions Extraction

Request dimension highlights to get tagged phrases for each visual aspect:

Analyze this image with dimension extraction. Tag phrases for: backgrounds, objects, characters, styles, actions, colors, moods, lighting, compositions, themes.
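
The request text can also be generated from the list of dimensions. The sketch below reproduces the sentence above from the ten documented names; requesting only a subset is an assumption about how the model will respond, not documented behavior, and `dimension_request` is a hypothetical helper.

```python
from typing import Iterable

# The ten dimension names from the request above.
DIMENSIONS = (
    "backgrounds", "objects", "characters", "styles", "actions",
    "colors", "moods", "lighting", "compositions", "themes",
)


def dimension_request(selected: Iterable[str] = DIMENSIONS) -> str:
    """Build the dimension-extraction request, rejecting unknown dimension names."""
    chosen = list(selected)
    unknown = [name for name in chosen if name not in DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    return (
        "Analyze this image with dimension extraction. Tag phrases for: "
        + ", ".join(chosen)
        + "."
    )


print(dimension_request(["colors", "moods", "lighting"]))
```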

Category-Specific Elements

Portrait Analysis Covers:

  • Model/Style: Photography type, quality level, visual style
  • Subject: Gender, age, ethnicity, skin tone, body type
  • Facial Features: Eyes, lips, face shape, expression
  • Hair: Color, length, style, part
  • Pose: Body position, orientation, leg/hand positions, gaze
  • Clothing: Type, color, pattern, fit, material, style
  • Accessories: Jewelry, bags, hats, etc.
  • Environment: Location, ground, background, atmosphere
  • Lighting: Type, time of day, shadows, contrast, color temperature
  • Camera: Angle, height, shot type, lens, depth of field, perspective
  • Technical: Realism, post-processing, resolution

Landscape Analysis Covers:

  • Terrain and water features
  • Sky and atmospheric elements
  • Foreground/background composition
  • Natural lighting and atmosphere
  • Color palette and photography style

Product Analysis Covers:

  • Product features and materials
  • Design elements and shape
  • Staging and background
  • Studio lighting setup
  • Commercial photography style

Animal Analysis Covers:

  • Species identification and markings
  • Pose and behavior
  • Expression and character
  • Habitat and setting
  • Wildlife/pet photography style

Illustration Analysis Covers:

  • Diagram type (flowchart, infographic, UI, etc.)
  • Visual elements (icons, shapes, connectors)
  • Layout and hierarchy
  • Design style (flat, isometric, etc.)
  • Color scheme and meaning
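
The per-category lists above map naturally onto a lookup table. The following sketch turns a detected category into an analysis instruction; the checklist items are copied from the lists above (portrait uses only the top-level labels), while `checklist_instruction` and the fallback sentence for "other" are illustrative assumptions.

```python
CHECKLISTS = {
    "portrait": [
        "model/style", "subject", "facial features", "hair", "pose", "clothing",
        "accessories", "environment", "lighting", "camera", "technical",
    ],
    "landscape": [
        "terrain and water features",
        "sky and atmospheric elements",
        "foreground/background composition",
        "natural lighting and atmosphere",
        "color palette and photography style",
    ],
    "product": [
        "product features and materials",
        "design elements and shape",
        "staging and background",
        "studio lighting setup",
        "commercial photography style",
    ],
    "animal": [
        "species identification and markings",
        "pose and behavior",
        "expression and character",
        "habitat and setting",
        "wildlife/pet photography style",
    ],
    "illustration": [
        "diagram type (flowchart, infographic, UI, etc.)",
        "visual elements (icons, shapes, connectors)",
        "layout and hierarchy",
        "design style (flat, isometric, etc.)",
        "color scheme and meaning",
    ],
}


def checklist_instruction(category: str) -> str:
    """Build an analysis instruction listing the elements to cover for a category."""
    items = CHECKLISTS.get(category)
    if items is None:  # "other" or anything unrecognized has no checklist
        return "Describe all visible elements in detail."
    lines = [f"Analyze this {category} image and cover:"]
    lines += [f"- {item}" for item in items]
    return "\n".join(lines)


print(checklist_instruction("product"))
```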

Output Examples

Natural Language Output (Portrait)

{
  "prompt": "A stunning photorealistic portrait of a young woman in her mid-20s with fair porcelain skin and warm pink undertones. She has striking emerald green almond-shaped eyes with long dark lashes, full rose-colored lips curved in a subtle confident smile, and an oval face with high cheekbones..."
}

Structured Output (Portrait)

{
  "structured": {
    "model": "photorealistic",
    "quality": "ultra high",
    "style": "cinematic natural light photography",
    "subject": {
      "identity": "young beautiful woman",
      "gender": "female",
      "age": "mid 20s",
      "ethnicity": "European",
      "skin_tone": "fair porcelain with pink undertones",
      "body_type": "slim athletic",
      "facial_features": {
        "eyes": "emerald green, almond-shaped, intense gaze",
        "lips": "full, rose pink, subtle smile",
        "face_shape": "oval with high cheekbones",
        "expression": "confident and serene"
      },
      "hair": {
        "color": "warm honey blonde",
        "length": "long",
        "style": "soft waves",
        "part": "center"
      }
    },
    "pose": {
      "position": "standing",
      "body_orientation": "three-quarter turn to camera",
      "legs": "weight on right leg, relaxed stance",
      "hands": {
        "right_hand": "resting on hip",
        "left_hand": "hanging naturally at side"
      },
      "gaze": "direct eye contact with camera"
    },
    "clothing": {
      "type": "flowing maxi dress",
      "color": "dusty rose",
      "pattern": "solid",
      "details": "V-neckline, cinched waist, silk material",
      "style": "romantic feminine"
    },
    "accessories": ["delicate gold necklace", "small hoop earrings"],
    "environment": {
      "location": "outdoor garden",
      "ground": "cobblestone path",
      "background": "blooming roses, soft bokeh",
      "atmosphere": "dreamy and romantic"
    },
    "lighting": {
      "type": "natural sunlight",
      "time": "golden hour",
      "shadow_quality": "soft diffused shadows",
      "contrast": "medium",
      "color_temperature": "warm"
    },
    "camera": {
      "angle": "slightly below eye level",
      "camera_height": "chest height",
      "shot_type": "medium shot",
      "lens": "85mm",
      "depth_of_field": "shallow",
      "perspective": "slight compression, flattering"
    },
    "mood": "romantic, confident, ethereal",
    "realism": "highly photorealistic",
    "post_processing": "soft color grading, subtle glow",
    "resolution": "8k"
  }
}
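
Structured output is useful when individual fields need to be read or recombined in code. As one possible use, the sketch below flattens a structured result into a comma-separated prompt string by walking nested dictionaries and lists in order. `flatten_values` and `to_prompt` are hypothetical helpers, and the sample is a shortened excerpt of the example above.

```python
import json
from typing import Any, Iterator


def flatten_values(node: Any) -> Iterator[str]:
    """Yield every leaf value of a nested dict/list structure, in document order."""
    if isinstance(node, dict):
        for value in node.values():
            yield from flatten_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from flatten_values(item)
    else:
        yield str(node)


def to_prompt(structured: dict) -> str:
    """Join all leaf values into one comma-separated prompt string."""
    return ", ".join(flatten_values(structured))


sample = json.loads("""
{
  "structured": {
    "model": "photorealistic",
    "subject": {"hair": {"color": "warm honey blonde", "length": "long"}},
    "accessories": ["delicate gold necklace", "small hoop earrings"],
    "resolution": "8k"
  }
}
""")
print(to_prompt(sample["structured"]))
```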

With Dimensions

{
  "prompt": "...",
  "dimensions": {
    "backgrounds": ["outdoor garden", "blooming roses", "soft bokeh"],
    "objects": ["delicate gold necklace", "small hoop earrings"],
    "characters": ["young beautiful woman", "mid 20s", "European"],
    "styles": ["photorealistic", "cinematic natural light photography"],
    "actions": ["standing", "three-quarter turn", "direct eye contact"],
    "colors": ["dusty rose", "honey blonde", "emerald green"],
    "moods": ["romantic", "confident", "ethereal", "dreamy"],
    "lighting": ["golden hour", "natural sunlight", "soft diffused shadows"],
    "compositions": ["medium shot", "85mm", "shallow depth of field"],
    "themes": ["romantic feminine", "portrait photography"]
  }
}
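
For a prompt database, the tagged phrases can be indexed so that images are retrievable by dimension and phrase. A minimal in-memory sketch, assuming each image's `dimensions` object has the shape shown above; the image IDs and `build_tag_index` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def build_tag_index(
    records: Dict[str, Dict[str, List[str]]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map (dimension, lowercased phrase) to the set of image IDs tagged with it."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for image_id, dimensions in records.items():
        for dimension, phrases in dimensions.items():
            for phrase in phrases:
                index[(dimension, phrase.lower())].add(image_id)
    return index


records = {
    "img-001": {"moods": ["romantic", "dreamy"], "lighting": ["golden hour"]},
    "img-002": {"moods": ["Romantic"], "lighting": ["studio softbox"]},
}
index = build_tag_index(records)
print(sorted(index[("moods", "romantic")]))
```

Lowercasing the phrase makes lookups case-insensitive; a real database would likely also need deduplication of near-synonyms, which this sketch does not attempt.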

Tips for Best Results

  1. High-resolution images produce more detailed prompts
  2. Clear, well-lit images yield better category detection
  3. Request structured output when you need programmatic access to individual elements
  4. Use dimensions extraction when building prompt databases or training data
  5. Specify word count expectations for natural language output if needed

Integration

This skill works with any vision-capable model. For best results, use:

  • GPT-4 Vision
  • Claude 3 (Opus/Sonnet)
  • Gemini Pro Vision
Security Recommendations
This skill analyzes images and is permitted to extract detailed demographic and biometric attributes (age, gender, ethnicity, skin tone, facial features, body type). Before installing or using it:

  1. Do not submit images of private people without their explicit consent. The skill as written provides no privacy or consent safeguards.
  2. Expect that images and extracted descriptions will be sent to the OpenAI API (you supply OPENAI_API_KEY); treat this as data exfiltration to that service.
  3. If you plan to use it on photos of identifiable people, remove or disable the demographic/identity extraction to reduce privacy risk.
  4. Ask the skill author for (or require) explicit limits: a) do not attempt to identify real people; b) avoid inferring protected attributes (ethnicity, religion, etc.); c) log minimal data and avoid long reproductions that could expose a person's identity.
  5. If you need formal assurance, request that the SKILL.md be updated to include privacy/consent rules and a clear statement about where image data is sent. The absence of code reduces supply-chain risk, but the behavioral scope (sensitive inferences) is the core concern.

Providing those mitigations would move this toward benign; absent them, treat the skill as suspicious.
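
One way to act on the recommendation to remove demographic/identity extraction is to strip those fields from the structured output before it is stored or logged. The sketch below is a post-processing filter only: it does not stop the model from inferring the attributes or the image from being sent to the API. The key names follow the structured portrait example in SKILL.md, and `redact_subject` is a hypothetical helper.

```python
import copy

# Subject fields that describe demographic or biometric attributes
# (key names taken from the structured portrait example).
SENSITIVE_SUBJECT_KEYS = {
    "identity", "gender", "age", "ethnicity",
    "skin_tone", "body_type", "facial_features",
}


def redact_subject(result: dict) -> dict:
    """Return a copy of the result with sensitive subject fields removed."""
    cleaned = copy.deepcopy(result)  # leave the caller's data untouched
    subject = cleaned.get("structured", {}).get("subject")
    if isinstance(subject, dict):
        for key in SENSITIVE_SUBJECT_KEYS:
            subject.pop(key, None)
    return cleaned


sample = {
    "structured": {
        "subject": {
            "ethnicity": "European",
            "age": "mid 20s",
            "hair": {"color": "warm honey blonde"},
        },
        "mood": "romantic, confident, ethereal",
    }
}
print(redact_subject(sample))
```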
Functional Analysis
Type: OpenClaw Skill
Name: image2prompt
Version: 1.0.0

The OpenClaw AgentSkills skill bundle `image2prompt` is benign. The `SKILL.md` file provides instructions for an AI agent to analyze images and generate detailed prompts for AI image generation, which aligns with its stated purpose. There is no evidence of data exfiltration, malicious execution, persistence, obfuscation, or prompt injection attempts to manipulate the agent into unauthorized actions. The usage examples demonstrate standard interaction with the `openclaw` CLI tool, and the agent's instructions are solely focused on image analysis and prompt generation.
Capability Assessment
Purpose & Capability
The name/description (generate prompts from images) matches the SKILL.md and the single declared credential (OPENAI_API_KEY) — this is plausible because image analysis/generation workflows commonly call an LLM/vision API. Requiring an openclaw client (optional) is reasonable for an OpenClaw-based CLI workflow.
Instruction Scope
The instructions explicitly direct the agent to infer and produce fine-grained, potentially sensitive biometric and demographic attributes (ethnicity, age, gender, skin tone, body type, detailed facial features). There is no guidance about consent, permissible uses, or limits on identifying or describing private individuals. That expands the scope beyond innocuous 'visual description' into high-risk inference (privacy/biometric profiling).
Install Mechanism
Instruction-only skill with no install spec and no code files — lowest installation risk. Nothing is downloaded or written to disk by the skill itself.
Credentials
Only OPENAI_API_KEY is declared as the primary credential, which is proportionate for calling an external vision/LLM API. However, the skill's behavior (sending images + sensitive inferences) means that providing that key will transmit potentially private image data to whatever OpenAI endpoint the agent uses — users should be aware of that exfiltration surface.
Persistence & Privilege
always:false and user-invocable:true — the skill is not force-included and does not request elevated persistence. It can be invoked by the agent, which is the platform default and acceptable here.
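How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install image2prompt
  3. Once installation completes, invoke the skill by name or trigger it with /image2prompt.
  4. Provide the required inputs according to the skill's parameter description to get structured output.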
Version History
v1.0.0

  • Initial release of the image2prompt skill.
  • Analyze images and auto-detect category: portrait, landscape, product, animal, illustration, or other.
  • Generate detailed natural-language or structured JSON prompts for AI image generation based on the detected category.
  • Optional "dimension extraction": get tagged visual aspects (backgrounds, objects, styles, etc.) for more granular prompt composition.
  • Extensive category-specific analysis, covering detailed elements for portraits, landscapes, products, animals, and illustrations.
  • Supports flexible output formats (flowing prompt, structured object, with or without dimension tags).

For more about image2prompt, see https://image2prompt.art
Metadata
Slug: image2prompt
Version: 1.0.0
License: (none listed)
Total installs: 11
Current installs: 11
Versions: 1
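FAQ

What is Image2Prompt?

Image2Prompt analyzes images and generates detailed prompts for image generation. It supports portrait, landscape, product, animal, and illustration categories with structured or natural output. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 3352 total downloads to date.

How do I install Image2Prompt?

Run the command "/install image2prompt" in the OpenClaw or Claude Code chat box for a one-click install; no additional configuration is required.

Is Image2Prompt free?

Yes. Image2Prompt is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Image2Prompt support?

Image2Prompt is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Image2Prompt?

It is developed and maintained by Zhang-Shubo (@zhang-shubo). The current version is v1.0.0.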
