Text-to-Carousel
Generate professional carousel images from text content using the Gemini image generation API.
Requirements
- Gemini API key with billing enabled (check TOOLS.md or ask user)
- Model: gemini-3-pro-image-preview (REQUIRED for correct Chinese/CJK text rendering)
- VPN: May need US VPN if Gemini returns location errors
Workflow
1. Gather Input
Determine carousel content from one of:
- Direct text/bullet points from user
- Article URL (fetch and extract key points)
- WordPress post (fetch via API)
- User-provided topic (generate content)
Collect:
- Brand info: name, colors, style (check TOOLS.md for known brands)
- Product image: URL or path (for CTA slide)
- Slide count: default 6 slides
- Size: default 1024x1024
- Language: detect from content
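The inputs listed above could be held in one small container before any prompts are written. The sketch below is a minimal illustration only: the `CarouselRequest` class and its field names are hypothetical and are not part of the skill or of scripts/generate_carousel.py. Only the defaults (6 slides, 1024x1024) come from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CarouselRequest:
    """Hypothetical container for the inputs gathered in step 1."""
    content: str                          # raw text, bullet points, or extracted key points
    brand_name: str
    brand_colors: List[str] = field(default_factory=list)  # e.g. hex codes
    style: str = ""                       # free-form style description
    product_image: Optional[str] = None   # URL or path, used on the CTA slide
    slide_count: int = 6                  # default from the list above
    size: str = "1024x1024"               # default from the list above
    language: Optional[str] = None        # None means "detect from content"


# Placeholder values for illustration only.
request = CarouselRequest(content="Key points go here", brand_name="ExampleBrand")
print(request.slide_count, request.size)
```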
2. Plan Slide Structure
For health/product carousels, use this proven 6-slide structure:
| # | Type | Purpose |
|---|---|---|
| 1 | Cover | Hook + brand + topic |
| 2 | Problem | Why reader should care |
| 3 | Solution | How product/topic solves it |
| 4 | Details | Key features, data, ingredients |
| 5 | Social Proof | Testimonials, results, evidence |
| 6 | CTA | Product image + buy/contact |
For other structures, see references/prompt-patterns.md.
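As a rough sketch, the 6-slide structure in the table can be written down as data so that each slide carries its type, purpose, and "NN / TT" number. `SLIDE_PLAN` and `numbered_plan` are illustrative names, not something the skill defines.

```python
# The six slide types and purposes from the table above.
SLIDE_PLAN = [
    ("Cover", "Hook + brand + topic"),
    ("Problem", "Why reader should care"),
    ("Solution", "How product/topic solves it"),
    ("Details", "Key features, data, ingredients"),
    ("Social Proof", "Testimonials, results, evidence"),
    ("CTA", "Product image + buy/contact"),
]


def numbered_plan(plan):
    """Attach a zero-padded 'NN / TT' label to every slide in the plan."""
    total = len(plan)
    return [
        {"label": f"{index:02d} / {total:02d}", "type": slide_type, "purpose": purpose}
        for index, (slide_type, purpose) in enumerate(plan, start=1)
    ]


for slide in numbered_plan(SLIDE_PLAN):
    print(slide["label"], slide["type"])
```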
3. Write Prompts
For each slide, write a Gemini prompt following these rules:
Design prompt structure:
Create a [SIZE] [STYLE_PRESET] Instagram slide for [BRAND].
LAYOUT:
- Background: [COLORS/GRADIENT]
- [ELEMENT DESCRIPTIONS WITH EXACT TEXT]
- "[SLIDE_NUM] / [TOTAL]" bottom right
CRITICAL: All Chinese/CJK text must be exactly as written above.
Key rules:
- Specify EXACT text to render — quote every Chinese character
- Include slide number (e.g., "01 / 06")
- Reference brand name and consistent color palette
- For CTA slide with product image: attach the image via inlineData in the API call
- For style presets and templates, read references/prompt-patterns.md
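One way to apply the prompt structure and key rules consistently is to fill the template from a function. This is a minimal sketch: `build_prompt` and its parameters are hypothetical, and the brand, colors, and headline in the usage lines are placeholders rather than values from the skill.

```python
def build_prompt(size, style_preset, brand, background, elements, slide_num, total):
    """Fill the design prompt structure shown above for one slide.

    `elements` is a list of element descriptions, each quoting its exact text.
    """
    lines = [
        f"Create a {size} {style_preset} Instagram slide for {brand}.",
        "LAYOUT:",
        f"- Background: {background}",
    ]
    lines += [f"- {element}" for element in elements]
    # Slide number in the "01 / 06" style, bottom right.
    lines.append(f'- "{slide_num:02d} / {total:02d}" bottom right')
    lines.append("CRITICAL: All Chinese/CJK text must be exactly as written above.")
    return "\n".join(lines)


# Placeholder inputs for illustration only.
prompt = build_prompt(
    size="1024x1024",
    style_preset="minimal",
    brand="ExampleBrand",
    background="soft gradient from #0A3D62 to #3C6382",
    elements=['Headline, large bold text: "三个睡眠误区"'],
    slide_num=1,
    total=6,
)
print(prompt)
```

Passing the same brand name and hex codes on every call keeps the palette consistent across slides.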
4. Generate Images
Use scripts/generate_carousel.py or call the Gemini API directly:
import base64
import json
import urllib.request

API_KEY = "..."  # from TOOLS.md
MODEL = "gemini-3-pro-image-preview"  # REQUIRED for CJK text
url = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent?key={API_KEY}"

prompt = "..."  # slide prompt written in step 3
parts = [{"text": prompt}]
# For CTA slide with product image (base64_image is the base64-encoded file as a str):
# parts.insert(0, {"inlineData": {"mimeType": "image/jpeg", "data": base64_image}})

payload = {
    "contents": [{"parts": parts}],
    "generationConfig": {"responseModalities": ["image", "text"]},
}
data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
# Close the connection once the JSON body has been read.
with urllib.request.urlopen(req, timeout=180) as resp:
    result = json.loads(resp.read())
Add a 5-second delay between slides to avoid rate limits.
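The request above stops at the parsed JSON, so a batch run still needs to pull the image out of the response and pause between slides. The sketch below shows one possible shape for that loop. It assumes the image comes back as base64 in `inlineData` parts under `candidates[].content.parts[]`, mirroring the request format. The function names, the `slides` output directory, and the file naming are illustrative and are not taken from scripts/generate_carousel.py.

```python
import base64
import json
import mimetypes
import time
import urllib.request
from pathlib import Path

MODEL = "gemini-3-pro-image-preview"
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={key}"


def extract_images(result):
    """Return (mime_type, raw_bytes) for every inlineData part in a response dict.

    Assumes the response layout described in the lead-in.
    """
    images = []
    for candidate in result.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                images.append((inline.get("mimeType", ""), base64.b64decode(inline["data"])))
    return images


def generate_slide(prompt, api_key, model=MODEL, timeout=180):
    """Send one text prompt and return the parsed JSON response."""
    payload = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["image", "text"]},
    }
    req = urllib.request.Request(
        ENDPOINT.format(model=model, key=api_key),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


def generate_carousel(prompts, api_key, out_dir="slides", delay_seconds=5):
    """Generate slides in order, sleeping between requests to avoid rate limits."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, prompt in enumerate(prompts, start=1):
        if index > 1:
            time.sleep(delay_seconds)
        images = extract_images(generate_slide(prompt, api_key))
        if not images:
            raise RuntimeError(f"slide {index}: response contained no image data")
        mime_type, raw = images[0]
        extension = mimetypes.guess_extension(mime_type) or ".bin"
        path = out / f"slide_{index:02d}{extension}"
        path.write_bytes(raw)
        paths.append(path)
    return paths
```

Calling `generate_carousel(prompts, API_KEY)` with the six prompts from step 3 would write one file per slide. Nothing runs at import time, so the helpers can be tried against a saved response first.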
5. Verify Output
After generation, verify each slide with a vision model:
- Chinese/CJK text accuracy (character-level check)
- Design consistency across slides
- Product image visibility on CTA slide
- Brand elements present (logo, colors, slide numbers)
If text is garbled, regenerate that slide. The Pro model rarely fails on Chinese text, but verify anyway.
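The character-level check can be partly automated once a vision model has transcribed the slide. The sketch below assumes that transcription is already available as a plain string; how it is obtained is left out. `find_text_mismatches` is a hypothetical helper, and the sample strings are placeholders.

```python
import unicodedata


def normalize(text):
    """Apply NFKC normalization and drop whitespace so layout differences are ignored."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in text if not ch.isspace())


def find_text_mismatches(expected_strings, transcribed_text):
    """Return every expected string that does not appear verbatim in the transcription."""
    haystack = normalize(transcribed_text)
    return [s for s in expected_strings if normalize(s) not in haystack]


expected = ["三个睡眠误区", "01 / 06"]
good = find_text_mismatches(expected, "三个睡眠误区\n01 / 06")
bad = find_text_mismatches(expected, "三个睡眼误区\n01 / 06")  # one wrong character
print(good, bad)
```

A non-empty result marks the slide for regeneration. Design consistency and product image visibility still need a visual check.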
Model Selection Guide
| Model | Chinese Text | Design Quality | Speed | Use When |
|---|---|---|---|---|
| gemini-3-pro-image-preview | ✅ Perfect | ✅ High | Slower | Default choice; CJK content |
| gemini-2.5-flash-image | ❌ Garbled | ✅ High | Fast | English-only content |
| gemini-3.1-flash-image-preview | ⚠️ Untested | ✅ High | Fast | Try for English content |
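The table's guidance could be encoded as a small helper that falls back to the faster model only when the text contains no CJK characters. This is a sketch, not the skill's own logic: `contains_cjk` and `choose_model` are hypothetical, and the Unicode ranges cover only the main CJK ideograph, kana, and Hangul blocks.

```python
PRO_MODEL = "gemini-3-pro-image-preview"
FLASH_MODEL = "gemini-2.5-flash-image"


def contains_cjk(text):
    """Return True if any character falls in a common CJK, kana, or Hangul block."""
    for ch in text:
        code = ord(ch)
        if (
            0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
            or 0x3400 <= code <= 0x4DBF   # CJK Unified Ideographs Extension A
            or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
            or 0xAC00 <= code <= 0xD7AF   # Hangul Syllables
        ):
            return True
    return False


def choose_model(slide_text):
    """Pick the Pro model for CJK text, the faster model for text without CJK."""
    return PRO_MODEL if contains_cjk(slide_text) else FLASH_MODEL


print(choose_model("三个睡眠误区"))
print(choose_model("Three sleep myths"))
```

Since the table lists the Pro model as the default choice, returning it unconditionally is the safer option when the language is uncertain.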
Common Issues
| Problem | Solution |
|---|---|
| 429 quota exceeded | Check billing is linked to correct GCP project |
| Location not supported | Use US VPN |
| Chinese text garbled | Switch to gemini-3-pro-image-preview |
| Product image not matching | Attach actual product image via inlineData |
| Inconsistent design across slides | Include brand color hex codes and style description in every prompt |
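For the two API-level rows in the table, a thin wrapper can turn an HTTP failure into the matching hint. The sketch below makes one loud assumption: that a location error can be recognized by the words "location" and "not supported" in the error body. The exact wording returned by the API is not confirmed here. `diagnose` and `open_with_hint` are hypothetical names.

```python
import urllib.error
import urllib.request


def diagnose(status_code, error_body):
    """Map an HTTP error to a hint from the table above (substring match is an assumption)."""
    body = error_body.lower()
    if status_code == 429:
        return "Check billing is linked to the correct GCP project"
    if "location" in body and "not supported" in body:
        return "Use a US VPN"
    return "No known fix; inspect the full error body"


def open_with_hint(req, timeout=180):
    """Call urlopen and re-raise HTTP errors with a troubleshooting hint attached."""
    try:
        return urllib.request.urlopen(req, timeout=timeout)
    except urllib.error.HTTPError as err:
        body = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"HTTP {err.code}: {diagnose(err.code, body)}") from err
```

The remaining rows (garbled text, product image mismatch, inconsistent design) are output-quality issues and are caught at the verification step, not by HTTP status.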
File Structure
text-to-carousel/
├── SKILL.md                    # This file
├── scripts/
│   └── generate_carousel.py    # Batch generation script (config-driven)
└── references/
    └── prompt-patterns.md      # Design presets, slide templates, tips
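- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install text-to-carousel
- Once installed, invoke the Skill by name or trigger it with /text-to-carousel
- Provide the required inputs according to the Skill's parameter description to get structured output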
What is Text-to-Carousel?
Generate professional social media carousel images (Instagram, LinkedIn, TikTok, Xiaohongshu) from text content, articles, or URLs. Use when asked to create... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 121 downloads to date.
How do I install Text-to-Carousel?
Run the command "/install text-to-carousel" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.
Is Text-to-Carousel free?
Yes. Text-to-Carousel is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Text-to-Carousel support?
Text-to-Carousel is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Text-to-Carousel?
It is developed and maintained by jiangyisheng9-bot (@jiangyisheng9-bot); the current version is v1.0.0.