Description

Generate AI images and videos for free through Google Gemini and Google Flow using browser automation, with no API fees. Use it when you need to create visual media (cover images, thumbnails...

Usage instructions (SKILL.md)

Google Free Media Generator

Name: Google Free Media Skill
Author: pbseiya

A skill for generating AI images and videos for free through Google Gemini and Google Flow using browser automation.

🎯 When to use this skill

Use it when the user wants to:

Generate AI images for covers, thumbnails, or banners
Generate videos from text or images (text-to-video, image-to-video)
Save on API costs (0 baht vs. 1-3 baht per image through a regular API)
Produce media in bulk without worrying about cost

⚠️ Limitations to be aware of

Limited free quota: Gemini ~100 images/day, Flow ~50 credits/day (subject to change)
Slower than the API: a browser must be opened and the UI must load (5-10x slower)
Risk of UI changes: Google changes buttons and their positions often → the skill may need updating
Terms of Service: automation may violate the ToS of Google's free tier

📋 Workflow

1. Check quota before starting

node scripts/quota_manager.mjs check

Shows how much quota remains
Warns when the quota is nearly exhausted

2. Generate an image (Gemini)

node scripts/generate_image.mjs --prompt "image description" --output /path/to/output.jpg

How it works:

Open a browser at gemini.google.com
Log in (if not already logged in)
Click the image generation button (Image generation)
Submit the enhanced prompt
Wait for generation and fetch the full-resolution image (=s0 trick)
Save it to a file

3. Generate a video (Google Flow)

node scripts/generate_video.mjs --prompt "video description" --output /path/to/output.mp4

How it works:

Open a browser at labs.google/flow
Select a mode (Text-to-Video or Image-to-Video)
Submit the prompt or upload an image
Wait for generation
Download the video

🔧 Scripts

generate_image.mjs

Generates images through Google Gemini

Arguments:

--prompt: image description (required)
--output: output file path (required)
--style: image style (optional: realistic, artistic, minimalist)
--enhance: let the AI enhance the prompt automatically (default: true)

generate_video.mjs

Generates videos through Google Flow (Veo 3.1)

Arguments:

--prompt: video description (required)
--output: output file path (required)
--mode: generation mode (text-to-video, image-to-video)
--image: source image path (for image-to-video)
--duration: video length (5-10 seconds)
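
The argument rules listed above could be checked before a browser is ever opened. The following is a minimal sketch, not the skill's actual code: `validateVideoArgs` is a hypothetical helper, and treating text-to-video as the default mode when --mode is omitted is an assumption.

```javascript
// Hypothetical helper: validates options for generate_video.mjs as documented above.
const VIDEO_MODES = ["text-to-video", "image-to-video"];

function validateVideoArgs(args) {
  const errors = [];
  if (!args.prompt) errors.push("--prompt is required");
  if (!args.output) errors.push("--output is required");

  // Assumption: text-to-video is used when --mode is omitted.
  const mode = args.mode ?? "text-to-video";
  if (!VIDEO_MODES.includes(mode)) {
    errors.push(`--mode must be one of: ${VIDEO_MODES.join(", ")}`);
  }
  if (mode === "image-to-video" && !args.image) {
    errors.push("--image is required for image-to-video");
  }

  // The documented range is 5-10 seconds.
  if (args.duration !== undefined) {
    const seconds = Number(args.duration);
    if (!Number.isFinite(seconds) || seconds < 5 || seconds > 10) {
      errors.push("--duration must be between 5 and 10 seconds");
    }
  }
  return errors; // an empty array means the arguments look valid
}
```

Returning a list of messages rather than throwing on the first problem lets the script report every issue in one run.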

quota_manager.mjs

Manages and tracks usage quota

Commands:

check: check remaining quota
reset: reset the counters (start of a new day)
log: view the usage log

Config File: configs/quota.json

{
  "dailyLimits": {
    "images": 100,
    "videoCredits": 50
  },
  "currentUsage": {
    "images": 0,
    "videoCredits": 0
  },
  "lastReset": "2026-03-02T00:00:00+07:00"
}
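
As an illustration of how a config of this shape might be used, the sketch below computes the remaining quota and zeroes the counters once a new day starts. The function names are hypothetical and not taken from quota_manager.mjs, and the fixed +07:00 offset is an assumption based on the `lastReset` value shown above.

```javascript
// Sketch only: these names are hypothetical, not the skill's actual API.

// Remaining quota per category, never below zero.
function remainingQuota(config) {
  const remaining = {};
  for (const [key, limit] of Object.entries(config.dailyLimits)) {
    const used = config.currentUsage[key] ?? 0;
    remaining[key] = Math.max(0, limit - used);
  }
  return remaining;
}

// Calendar date (YYYY-MM-DD) of an instant at a fixed UTC offset in minutes.
// Assumption: +07:00 (420 minutes), matching the offset in lastReset.
function localDate(instant, offsetMinutes = 420) {
  const shifted = new Date(instant.getTime() + offsetMinutes * 60_000);
  return shifted.toISOString().slice(0, 10);
}

// Returns a copy with usage zeroed if `now` falls on a later local day
// than lastReset; otherwise returns the config unchanged.
function resetIfNewDay(config, now = new Date()) {
  if (localDate(now) <= localDate(new Date(config.lastReset))) return config;
  const currentUsage = Object.fromEntries(
    Object.keys(config.currentUsage).map((key) => [key, 0])
  );
  return { ...config, currentUsage, lastReset: now.toISOString() };
}
```

Both functions leave the input object untouched, so the caller decides when to write the result back to configs/quota.json.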

💡 Key techniques

1. Fetch full-resolution images

Images on Gemini are displayed at 1024px, but the full-resolution version (1408x768) can be fetched by changing the URL:

From: https://.../image=s1024
To: https://.../image=s0
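
The URL change can be expressed as a single string replacement. This is a minimal sketch with a hypothetical function name; it assumes the size parameter is the final segment of the URL, as in the example above, and leaves any other URL unchanged.

```javascript
// Rewrites a trailing size parameter such as "=s1024" to "=s0".
// Assumption: the size parameter is the last segment of the URL.
function toFullResolutionUrl(url) {
  return url.replace(/=s\d+$/, "=s0");
}
```

URLs that carry extra suffixes after the size value would need a broader pattern, so it is worth checking real image URLs before relying on this.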

2. Session Persistence

Log in once and keep the cookies for reuse
No need to log in again every time an image is generated
Use Puppeteer/Playwright session storage
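
One way to keep a login across runs with Puppeteer is to point the browser at a persistent profile directory. The sketch below only builds the launch options: `userDataDir` and `headless` are Puppeteer launch options, while `buildLaunchOptions` and the directory name are hypothetical.

```javascript
// Builds launch options that keep the browser profile (cookies included) on disk.
// The default directory name is a placeholder, not a path used by the skill.
function buildLaunchOptions(profileDir = "./.browser-profile") {
  return {
    headless: false, // the troubleshooting notes advise against headless mode for login
    userDataDir: profileDir, // the profile in this directory is reused on the next launch
  };
}

// Usage, assuming Puppeteer is installed (npm install puppeteer):
//   import puppeteer from "puppeteer";
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Because the profile directory holds session cookies, it should stay out of version control and off shared machines.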

3. Prompt Enhancement

Before sending a prompt to Gemini, enhance it to include:

Lighting (soft lighting, dramatic lighting, golden hour)
Composition (rule of thirds, centered, wide angle)
Style (photorealistic, cinematic, minimalist, vibrant)
Quality keywords (4K, ultra detailed, professional)

Example:

Input: "รูปแมวใส่แว่น" ("a picture of a cat wearing glasses")
Enhanced: "A photorealistic portrait of a cute cat wearing round glasses, 
soft studio lighting, centered composition, professional photography, 
4K ultra detailed, warm tones"
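
A simple form of this enhancement is to append one keyword from each category to the subject. The sketch below is an illustration under stated assumptions: `enhancePrompt` and its default values are hypothetical, and it does not translate a Thai input into English the way the example above does.

```javascript
// Hypothetical defaults, one per category from the list above.
const DEFAULT_ENHANCEMENTS = {
  style: "photorealistic",
  lighting: "soft lighting",
  composition: "centered composition",
  quality: "4K, ultra detailed",
};

// Joins the subject with style, lighting, composition, and quality keywords.
// Any category can be overridden; an empty string drops that category.
function enhancePrompt(subject, overrides = {}) {
  const e = { ...DEFAULT_ENHANCEMENTS, ...overrides };
  return [subject.trim(), e.style, e.lighting, e.composition, e.quality]
    .filter(Boolean)
    .join(", ");
}
```

For example, `enhancePrompt("a cat wearing round glasses", { lighting: "golden hour" })` swaps only the lighting keyword and keeps the other defaults.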

📁 Storage Organization

Generated files are stored at:

/mnt/storage/ada_projects/ai_media/
├── images/YYYY-MM/
├── videos/YYYY-MM/
└── metadata.json
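
An output path that follows this layout could be derived from the media type and the current date. This is a minimal sketch: `mediaPath` is a hypothetical helper, and taking the month in UTC is an assumption.

```javascript
const MEDIA_ROOT = "/mnt/storage/ada_projects/ai_media";

// kind is "images" or "videos"; the month folder follows the YYYY-MM layout above.
// Assumption: the month is taken in UTC.
function mediaPath(kind, filename, date = new Date()) {
  if (kind !== "images" && kind !== "videos") {
    throw new Error(`unknown media kind: ${kind}`);
  }
  const month = date.toISOString().slice(0, 7); // "YYYY-MM"
  return `${MEDIA_ROOT}/${kind}/${month}/${filename}`;
}
```

The returned string can be passed directly as the --output argument of either generator script.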

🔄 Fallback Strategy

If Google is unavailable, these alternatives can be used:

Bing Image Creator (free)
Leonardo.ai (free tier)
Stable Diffusion Online

🚨 Troubleshooting

Cannot log in

Make sure the browser is not running in headless mode
On a VPS, set up Xvfb as a virtual display
Try clearing cookies and logging in again

UI changed / button missing

Update the selectors in the scripts
Check whether Google has moved the feature

Quota exhausted

Wait until the next day (resets at 00:00)
Use the fallback services instead

📝 Usage examples

# Generate a cover image for a post
node scripts/generate_image.mjs \
  --prompt "AI workflow diagram, futuristic style, blue and purple gradient" \
  --output /mnt/storage/ada_projects/ai_media/images/2026-03/cover_001.jpg \
  --style artistic

# Generate a video from text
node scripts/generate_video.mjs \
  --prompt "Ocean waves at sunset, cinematic slow motion" \
  --output /mnt/storage/ada_projects/ai_media/videos/2026-03/sunset.mp4 \
  --duration 8

# Check quota
node scripts/quota_manager.mjs check

Safety recommendations

This skill is coherent with what it says: it's a local browser-automation skeleton for using Google web UIs to produce images/videos. Before installing or running it:

1) Be aware that this method may violate Google's Terms of Service; use it at your own risk.
2) Inspect and understand the Puppeteer/Playwright code you will add; the included scripts are placeholders/skeletons and expect you to implement selectors and login handling.
3) Do not store or expose Google session cookies on shared or public machines; prefer a dedicated account with limited permissions.
4) Install Puppeteer from the official npm registry and review its version/lockfile; avoid running unreviewed install scripts as root.
5) Run in an isolated environment (VM/container) if you are concerned about account/session exposure.
6) If you want the agent to run this autonomously, explicitly weigh the risk that the agent could open a logged-in browser session and act without further prompts.

If any of the above is unacceptable, do not run the scripts, or ask the author for clarification first.

Functional analysis

Type: OpenClaw Skill
Name: google-free-media-skill
Version: 1.0.0

The skill bundle provides a framework for automating image and video generation via Google Gemini and Google Flow to avoid API fees. The provided JavaScript files (generate_image.mjs, generate_video.mjs) are currently 'skeleton' implementations that simulate the workflow and manage local usage quotas via quota_manager.mjs. There is no evidence of data exfiltration, credential theft, or malicious execution; the scripts use standard file system operations and hardcoded shell checks for browser availability.

Capability assessment

✓ Purpose & Capability

The name/description (use browser automation to generate images/videos via Google Gemini/Flow) matches the included files and SKILL.md. The repository contains image/video generator scripts and a local quota manager consistent with the stated purpose. No unrelated credentials, cloud services, or unrelated binaries are requested.

ℹ Instruction Scope

SKILL.md and the scripts instruct the agent/user to open a browser, log into Google, and drive the web UI via Puppeteer/Playwright (skeleton pseudo-code present). The scripts read/write only local paths under the skill (outputs, configs, logs). They reference session persistence (cookies/session storage) which is logical for this use case but is a potential privacy/security consideration (storing session cookies could expose account access if handled insecurely). The instructions do not attempt to read unrelated system files or environment variables.

ℹ Install Mechanism

There is no install spec; this is instruction-plus-scripts. README instructs 'npm install puppeteer' which is expected for browser automation, but the skill does not automate or pin dependency installation. This is lower-risk than remote download/extract, but the user must manually install and trust the npm package used (puppeteer).

✓ Credentials

The skill requests no environment variables, no credentials, and no config paths outside its own repo. The main sensitive operation is requiring you to use your Google account via a browser login; that is proportional to the stated purpose but raises the usual account/session safety considerations (don't use high-privilege accounts; be careful where session cookies are stored).

✓ Persistence & Privilege

always:false and no modifications to other skills or global agent config. The skill stores per-repo quota/config/log files under configs/ and writes outputs under the specified output path. Autonomous invocation is allowed by default (not unusual) but does not introduce extra privileges here.

Version history

v1.0.0

Google Free Media Generator skill v1.0.0

- Initial release: generate AI images (via Gemini) and videos (via Google Flow) for free using browser automation, with no API fees required.
- Manage daily quota usage automatically for both image and video generation.
- Supports full-resolution image downloads and session persistence for faster logins.
- Includes fallback options to Bing, Leonardo, and Stable Diffusion if Google is unavailable.
- Comprehensive CLI scripts provided for image, video, and quota management.

Metadata

Slug google-free-media-skill

Version 1.0.0

License (not specified)

Total installs 1

Current installs 1

Number of versions 1

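Frequently asked questions

What is Google Free Media Skill?

Generate AI images and videos for free through Google Gemini and Google Flow using browser automation, with no API fees. Use it when you need to create visual media (cover images, thumbnails... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 264 cumulative downloads to date.

How do I install Google Free Media Skill?

Run the command "/install google-free-media-skill" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Google Free Media Skill free?

Yes, Google Free Media Skill is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Google Free Media Skill support?

Google Free Media Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Google Free Media Skill?

Developed and maintained by pbseiya (@pbseiya); the current version is v1.0.0.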
