Description

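An AI-powered image generation and editing tool for producing high-quality product images. Use it when the user asks to generate images, create images, edit photos, do text-to-image or image-to-image generation, change a background, change a style, replace an object in an image, composite a product into a scene, swap models, or create any kind of AI-generated visual content; AI drawing, image generation, text-to-image, image-to-i...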

README (SKILL.md)

AI Image Generation

Name: Multimodal Generate Image
Author: linkfox-ai

This skill guides you on how to generate and edit images using the AI image generation service, helping users create high-quality product images, modify existing images, and perform creative visual transformations.

Core Concepts

The AI Image Generation tool produces new images based on a text prompt and optional reference images. It supports a wide range of use cases:

Text-to-image: Generate a brand-new image purely from a text description.
Image-to-image: Provide one or more reference images and a prompt to generate a new image that preserves elements from the references.
Image editing: Modify specific elements, colors, backgrounds, or styles in an existing image.
Product compositing: Place a product from one image into a scene from another image.
Model swapping: Replace the model or mannequin in a product photo.

Reference images are strongly recommended when the user wants the output to closely resemble an existing product or scene. Up to 3 reference image URLs can be provided, separated by commas.

Parameter Guide

Parameter	Required	Description	Default
prompt	Yes	Text description of the desired image. Supports text-to-image, image-to-image, editing, model swapping, and more. Max 1000 characters.	--
referenceImageUrl	No	URL(s) of reference image(s). Separate multiple URLs with commas. Up to 3 images supported. Max 1000 characters.	--
aspectRatio	No	Aspect ratio of the output image.	1:1

Supported Aspect Ratios

Value	Description
1:1	Square (default)
3:4	Portrait
4:3	Landscape
9:16	Vertical fullscreen
16:9	Horizontal fullscreen
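When a user supplies pixel dimensions rather than a ratio, one option is to map them to the nearest supported value before sending the request. The sketch below is illustrative only: `closest_aspect_ratio` is a hypothetical helper, not part of the skill, and it assumes that comparing ratios on a logarithmic scale is an acceptable notion of "nearest".

```python
import math

# The five ratios listed in the table above.
SUPPORTED_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio nearest to width:height (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio: str) -> float:
        w, h = ratio.split(":")
        # Log scale treats 2:1 and 1:2 as equally far from 1:1.
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_RATIOS, key=distance)


print(closest_aspect_ratio(1920, 1080))
```

Whichever ratio is chosen, it is still worth telling the user which one was used so they can ask for a different one.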

Prompt Writing Tips

Be specific and descriptive: Clearly describe the subject, scene, lighting, style, and mood you want.
Reference images by number: When using reference images, refer to them as "image 1", "image 2", etc., in the order they appear in referenceImageUrl.
State the operation explicitly: Use clear action verbs like "replace", "change", "put", "combine", "generate".
Keep within 1000 characters: Prompts have a maximum length of 1000 characters.
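The "image 1", "image 2" convention is easy to get wrong when the number of reference images changes. As a rough illustration, the following sketch checks that every numbered reference in a prompt points at an image that was actually supplied. `check_image_references` is a hypothetical helper, and the regular expression assumes prompts use the English phrase "image N".

```python
import re


def check_image_references(prompt: str, reference_count: int) -> list:
    """List numbered image references that have no matching reference image."""
    numbers = {int(n) for n in re.findall(r"\bimage\s+(\d+)\b", prompt, flags=re.IGNORECASE)}
    problems = []
    for number in sorted(numbers):
        # Valid references run from 1 to reference_count, in URL order.
        if number < 1 or number > reference_count:
            problems.append(
                f"prompt mentions image {number}, "
                f"but {reference_count} reference image(s) were provided"
            )
    return problems


prompt = "Place the product from image 2 onto the marble countertop in image 1"
print(check_image_references(prompt, reference_count=1))
```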

Prompt Examples by Scenario

Object replacement:

Replace the vase on the table in image 1 with a potted plant

Background color change:

Change the background color of image 1 to pure white

Product compositing:

Place the product from image 2 onto the marble countertop in image 1

Style transfer:

Transform image 1 into the artistic style shown in image 2

Text-to-image (no reference):

A professional product photo of a sleek black wireless headphone on a gradient blue background, studio lighting, 8K quality

Model swapping:

Replace the model in image 1 with a different model while keeping the same clothing and pose

API Usage

This tool calls the LinkFox tool gateway API. See references/api.md for calling conventions, request parameters, and response structure. You can also execute scripts/multimodal_generate_image.py directly to run image generation.
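As a rough sketch of what preparing a call might involve, the code below assembles the three documented parameters into a dictionary and reads the API key from the LINKFOXAGENT_API_KEY environment variable named in this skill's review notes. The endpoint path, the authentication header, and the response format are defined in references/api.md, which is not reproduced here, so the sketch deliberately stops short of sending anything. `build_payload` and `load_api_key` are hypothetical names.

```python
import os


def build_payload(prompt, reference_urls=(), aspect_ratio="1:1"):
    """Assemble the documented parameters into a request body (sketch only)."""
    payload = {"prompt": prompt, "aspectRatio": aspect_ratio}
    if reference_urls:
        # Multiple reference images are passed as one comma-separated string.
        payload["referenceImageUrl"] = ",".join(reference_urls)
    return payload


def load_api_key(env=None):
    """Read the API key from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    key = env.get("LINKFOXAGENT_API_KEY", "").strip()
    if not key:
        raise RuntimeError("LINKFOXAGENT_API_KEY is not set")
    return key


print(build_payload(
    "Change the background color of image 1 to pure white",
    ["https://example.com/product.jpg"],  # placeholder URL
))
```

How the key and payload are actually transmitted should be taken from references/api.md or from scripts/multimodal_generate_image.py rather than from this sketch.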

Display Rules

Show the generated image: When the response contains image content in the text field, display it directly to the user using markdown image syntax.
Status reporting: Check the status and finished fields. If image generation is still in progress, inform the user and advise waiting.
Prompt transparency: Briefly describe what prompt and parameters were sent so the user understands what was requested.
Aspect ratio confirmation: If the user does not specify dimensions, use the default 1:1 ratio but mention it so they can request a different ratio if needed.
Reference image guidance: If the user wants a result close to an existing image but did not provide a reference URL, proactively suggest they provide one for better fidelity.
Error handling: When generation fails, explain the issue based on the response status field and suggest adjustments (e.g., simplify the prompt, check reference image URLs, try a different aspect ratio).
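The display rules above could be sketched roughly as follows. The response shape is an assumption: only the field names `text`, `status`, and `finished` come from this document, and the code guesses that `finished` is a boolean and that `text` holds either an image URL or ready-made markdown. Check references/api.md for the real structure. `render_result` is a hypothetical name.

```python
def render_result(response: dict) -> str:
    """Turn a response dictionary into a message for the user (sketch only)."""
    status = response.get("status", "unknown")
    if not response.get("finished", False):
        return f"Image generation is still in progress (status: {status}). Please wait."
    text = (response.get("text") or "").strip()
    if text.startswith(("http://", "https://")):
        # Assumed case: a bare image URL, wrapped in markdown image syntax.
        return f"![Generated image]({text})"
    if text:
        # Assumed case: content that is already displayable as-is.
        return text
    return (
        f"Generation failed (status: {status}). Try simplifying the prompt, "
        "checking the reference image URLs, or using a different aspect ratio."
    )


print(render_result({"finished": True, "status": "ok", "text": "https://example.com/out.png"}))
```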

Important Limitations

Reference image limit: A maximum of 3 reference image URLs can be provided per request.
Prompt length: The prompt must not exceed 1000 characters.
URL validity: Reference image URLs must be publicly accessible. Private or expired URLs will cause failures.
Aspect ratio options: Only 1:1, 3:4, 4:3, 9:16, and 16:9 are supported.
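These limits can be checked locally before a request is made. The following is a minimal sketch under stated assumptions: `validate_request` is a hypothetical helper, and the URL check only confirms that each value is an absolute http(s) URL. It cannot tell whether a URL is publicly accessible or has expired.

```python
from urllib.parse import urlparse

MAX_PROMPT_CHARS = 1000
MAX_REFERENCE_IMAGES = 3
SUPPORTED_RATIOS = {"1:1", "3:4", "4:3", "9:16", "16:9"}


def validate_request(prompt, reference_urls=(), aspect_ratio="1:1"):
    """Return a list of limit violations; an empty list means none were found."""
    errors = []
    if len(prompt) > MAX_PROMPT_CHARS:
        errors.append(f"prompt has {len(prompt)} characters (limit {MAX_PROMPT_CHARS})")
    if len(reference_urls) > MAX_REFERENCE_IMAGES:
        errors.append(f"{len(reference_urls)} reference images (limit {MAX_REFERENCE_IMAGES})")
    for url in reference_urls:
        parsed = urlparse(url)
        # Syntax check only; reachability is not tested here.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append(f"not an absolute http(s) URL: {url}")
    if aspect_ratio not in SUPPORTED_RATIOS:
        errors.append(f"unsupported aspect ratio: {aspect_ratio}")
    return errors


print(validate_request("A studio photo of a black headphone", aspect_ratio="2:1"))
```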

User Expression & Scenario Quick Reference

Applicable -- Requests involving image generation or editing:

User Says	Scenario
"Generate an image", "Create a picture"	Text-to-image generation
"Edit this photo", "Modify the image"	Image editing
"Change the background", "Make it white background"	Background replacement
"Put the product on this scene"	Product compositing
"Make it look like this style"	Style transfer
"Swap the model", "Change the person"	Model swapping
"Create a product photo"	Product image generation
"Make a vertical/landscape version"	Aspect ratio adjustment

Not applicable -- Needs beyond image generation:

Image analysis or recognition (reading text from images, identifying objects)
Video generation or editing
Image file format conversion
Batch processing of hundreds of images
Image hosting or storage

Feedback:

Auto-detect and report feedback via the Feedback API when any of the following apply:

The functionality or purpose described in this skill does not match actual behavior
The skill's results do not match the user's intent
The user expresses dissatisfaction or praise about this skill
Anything you believe could be improved

Call the feedback API as specified in references/api.md. Do not interrupt the user's flow.

For more high-quality, professional cross-border e-commerce skills, see LinkFox Skills.

Usage Guidance

Before installing, note that the included script and API docs send prompts and any reference image URLs you provide to https://tool-gateway.linkfox.com (and feedback to https://skill-api.linkfox.com). The skill requires an API key (LINKFOXAGENT_API_KEY) even though the registry metadata does not declare it; ask the publisher to correct the manifest. Consider these steps:

(1) Verify the publisher and the LinkFox endpoints and confirm they are legitimate.
(2) Do not send sensitive or private images or PII until you trust the service and have read its privacy/retention policy.
(3) Require the manifest be updated to declare LINKFOXAGENT_API_KEY as a required credential so the permission/consent model is clear.
(4) If you cannot verify the service or the publisher, avoid providing credentials or installing the skill.

I have medium confidence in this assessment because the code is straightforward and matches the SKILL.md, but the metadata omission and external data flow create a meaningful concern.

Capability Analysis

Type: OpenClaw Skill
Name: linkfox-multimodal-generate-image
Version: 1.0.0

The skill bundle is a legitimate tool for AI image generation and editing using the LinkFox API. The Python script (scripts/multimodal_generate_image.py) implements standard API interaction patterns, including environment variable-based authentication and basic input validation, while the documentation (SKILL.md and references/api.md) clearly defines the tool's scope and usage without any indicators of malicious intent or prompt-injection attacks.

Capability Assessment

ℹ Purpose & Capability

Name, description, SKILL.md, references/api.md, and the Python script all describe an image generation/editing skill that calls a LinkFox multimodal API, which matches the stated purpose. However, the registry metadata declares no required environment variables or primary credential, while both references/api.md and scripts/multimodal_generate_image.py require an API key via the environment variable LINKFOXAGENT_API_KEY. This metadata omission is an inconsistency: the skill will need credentials even though none are declared.

✓ Instruction Scope

SKILL.md and the script limit actions to constructing a prompt, optionally accepting up to 3 reference image URLs, calling the LinkFox API, and displaying results. There are no instructions to read unrelated local files, exfiltrate arbitrary system data, or run other system commands. The skill does suggest proactively asking for reference URLs and posts feedback to a separate feedback endpoint; both are within the skill's functional scope but imply outbound transmission of user-provided content.

✓ Install Mechanism

No install spec or external downloads are present; the skill is instruction/script-only and does not write or execute arbitrary fetched code. This is a lower-risk install model.

⚠ Credentials

The code requires a single environment variable LINKFOXAGENT_API_KEY to authenticate to the external API, which is proportionate to the task. However, the registry metadata incorrectly lists 'Required env vars: none' and 'Primary credential: none' despite the script and API docs requiring the API key. That mismatch is an integrity issue and could mislead users about what credentials will be needed and when data is sent externally.

✓ Persistence & Privilege

The skill does not request persistent privileges (always:false) and does not modify other skills or system settings. It runs as a normal, user-invoked skill and does not request elevated system presence.

Version History

v1.0.0

Initial release

Metadata

Slug linkfox-multimodal-generate-image

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Multimodal Generate Image?

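An AI-powered image generation and editing tool for producing high-quality product images. Use it when the user asks to generate images, create images, edit photos, do text-to-image or image-to-image generation, change a background, change a style, replace an object in an image, composite a product into a scene, swap models, or create any kind of AI-generated visual content; AI drawing, image generation, text-to-image, image-to-i... It is an AI Agent Skill for Claude Code / OpenClaw, with 112 downloads so far.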

How do I install Multimodal Generate Image?

Run "/install linkfox-multimodal-generate-image" in the OpenClaw or Claude Code chat to install it in one step. To call the API, the skill also needs the LINKFOXAGENT_API_KEY environment variable to be set.

Is Multimodal Generate Image free?

Yes, Multimodal Generate Image is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Multimodal Generate Image support?

Multimodal Generate Image is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Multimodal Generate Image?

It is built and maintained by linkfox-ai (@linkfox-ai); the current version is v1.0.0.
