
Gemini Image Gen + Watermark Removal

Name: Gemini Image Gen + Watermark Removal
Author: imkingjh999

by imkingjh999 · v1.1.0 · MIT-0

cross-platform ✓ Security Clean

237 Downloads

Install in OpenClaw

/install gemini-image-gen-watermark-removal

Description

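Generate images in the Google Gemini web app and remove their watermarks. The skill drives the browser through the OpenClaw Browser Tool to generate and download images, then removes the watermark with GeminiWatermarkTool. Use cases: Google image generation / Gemini image generation / Google Gemini images / watermark removal / Gemini watermark removal.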

README (SKILL.md)

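Google Gemini Image Generation

Uses the OpenClaw Browser Tool to drive a browser that is already signed in to a Google account, generating and downloading images in the Gemini web app.

Prerequisites

The browser is signed in to a Google account
The OpenClaw Browser Tool is available (check that openclaw browser status reports a healthy state)
The profile is set to user, so the tool connects to the Chrome instance that is already open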

执行流程

1. Open the Gemini page

browser(action="open", profile="user", url="https://gemini.google.com")

You can also open an existing conversation link directly to reuse its images:

browser(action="open", profile="user", url="https://gemini.google.com/app/<conversation ID>")

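2. Click 「制作图片」 (Create image)

Take a snapshot to find the button's ref, then click it:

browser(action="snapshot", profile="user", compact=true)
// Find the ref of the 「制作图片」 (Create image) button, then click it
browser(action="act", profile="user", request={"kind": "click", "ref": "<ref>"})

A new conversation first shows a style picker (单色 (monochrome), 色块 (color block), 跑跑, and so on). You can ignore it and type the prompt straight into the input box.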

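3. Enter the prompt and send it

browser(action="act", profile="user", request={"kind": "type", "ref": "<textarea ref>", "text": "your prompt"})
browser(action="act", profile="user", request={"kind": "press", "key": "Enter"})

⚠️ Prompt rules:

Avoid verb keywords such as "唱" (sing) and "弹奏" (play an instrument); otherwise Gemini may trigger music generation instead of image generation
Use a purely visual description instead, for example "wearing a microphone headset" rather than "singing with a microphone"
When the image needs text, state it explicitly in the prompt, for example: The text "畢士傾訴" appears on a banner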
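
The prompt rules above can be checked before sending. The following is a minimal sketch, not part of the skill itself: find_risky_terms and RISKY_TERMS are hypothetical names, and the term list is illustrative ("唱" and "弹" come from this README, "singing" is inferred from its English example). Plain substring matching can produce false positives, so treat a hit as a hint to rephrase rather than a hard error.

```python
# Illustrative list only: "唱" and "弹" (which also covers "弹奏") are the
# terms named in this README; "singing" is inferred from its English example.
RISKY_TERMS = ("唱", "弹", "singing")


def find_risky_terms(prompt: str) -> list[str]:
    """Return the risky terms found in the prompt, in RISKY_TERMS order."""
    lowered = prompt.lower()
    return [term for term in RISKY_TERMS if term in lowered]
```

For example, find_risky_terms("A girl singing with a microphone") flags "singing", while the visual rewrite "A girl wearing a microphone headset" returns an empty list.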

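4. Wait for the image to be generated

⚠️ Important: do not use act(kind="wait")!

act(kind="wait") has no real "wait for the page to change" mechanism at the CDP level. It only waits for a WebSocket response; if none arrives within 8 seconds it times out and leaves the entire browser tool session unusable.

The correct approach: wait with exec sleep, then take a snapshot

exec: sleep 20 && echo "done"
// After exec finishes
browser(action="snapshot", profile="user", compact=true)

Generation is complete when the page shows buttons such as 「下载完整尺寸的图片」 (Download full size image), 「复制图片」 (Copy image), and 「分享图片」 (Share image).

If the snapshot shows that generation is still in progress (a "Creating your image..." button is present), sleep for another round.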
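
The sleep-then-snapshot loop described in this step can be sketched as a small helper. This is an illustration under stated assumptions, not the skill's actual implementation: wait_for_image is a hypothetical name, take_snapshot stands in for whatever returns the snapshot text from the browser tool, the default of 6 rounds is arbitrary, and matching the button label as a plain substring assumes the label appears verbatim in the snapshot.

```python
import time
from typing import Callable, Optional

# Button label quoted in this README. Finding it as a plain substring of the
# snapshot text is an assumption about the snapshot format.
DONE_MARKER = "下载完整尺寸的图片"


def wait_for_image(
    take_snapshot: Callable[[], str],
    interval_s: float = 20.0,
    max_rounds: int = 6,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Sleep, snapshot, and repeat until the download button appears.

    Returns the snapshot text containing DONE_MARKER, or None if the marker
    never shows up within max_rounds.
    """
    for _ in range(max_rounds):
        sleep(interval_s)  # plain sleep, never act(kind="wait")
        snapshot = take_snapshot()
        if DONE_MARKER in snapshot:
            return snapshot
    return None
```

Passing sleep as a parameter keeps the loop testable without real delays. Bounding the number of rounds avoids polling forever if generation fails.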

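5. Download the image

Click the 「下载完整尺寸的图片」 (Download full size image) button:

browser(action="act", profile="user", request={"kind": "click", "ref": "<download button ref>"})

Wait for the download to finish, then check the downloads directory:

sleep 5 && ls -lt ~/Downloads/Gemini_Generated_Image* | head -3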
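
When the next step needs the file path programmatically, the same "newest matching file" lookup can be sketched in Python. newest_gemini_download is a hypothetical helper; the Gemini_Generated_Image* pattern is taken from the command above, and the directory is whatever your browser downloads to.

```python
from pathlib import Path
from typing import Optional


def newest_gemini_download(
    downloads_dir: Path, pattern: str = "Gemini_Generated_Image*"
) -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in downloads_dir.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Mirrors `ls -lt | head -1`: pick the entry with the latest mtime.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

A typical call would be newest_gemini_download(Path.home() / "Downloads"). A None result means the download has not landed yet or the file name differs from the pattern.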

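6. Remove the watermark

Images generated by Gemini carry a watermark; use GeminiWatermarkTool to remove it.

Install (macOS / Linux):

brew install allenk/tap/gwt

Or download the binary from GitHub Releases.

Known working path (if brew is unavailable):

~/.claude/skills/gwt/bin/GeminiWatermarkTool

Usage:

gwt --force -i <input image> -o <output image>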

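7. Send to Feishu (optional)

Use the send-feishu-image skill:

import os
import sys
# sys.path entries are not tilde-expanded, so resolve "~" explicitly
sys.path.insert(0, os.path.expanduser("~/.openclaw/workspace/skills/send-feishu-image"))
from send_feishu_image import send_image
result = send_image(
    image_path="/path/to/output.png",
    # Recipient ID from the original example; replace it with your own recipient
    user_id="ou_7abe0c2af8a0f7b5b1c1171bcd8707d8",
    caption="Image caption"
)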

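Known issues

Problem	Solution
`act(kind="wait")` times out and leaves the browser tool unusable	Never use `act(kind="wait")`; poll with `exec sleep` + `snapshot` instead
snapshot times out	Restart the Gateway (menu bar: OpenClaw → Restart)
Tab not found	Run `browser(action="snapshot")` to check the current page state
Music generation was triggered	Remove words such as "唱" (sing) and "弹" (play) from the prompt and use a purely visual description
Image takes a long time to generate	The Gemini model is slow; sleep 20-25 seconds, then snapshot again
gwt installation fails (GitHub unreachable)	Check whether `~/.claude/skills/gwt/bin/GeminiWatermarkTool` already exists
New file not found after download	The file name may vary; use `ls -lt` to sort by time and find the newest file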

When finished

Close tabs you no longer need: browser(action="close", targetId="<ID>")

Usage Guidance

This skill appears to do what it says, but consider these points before installing or using it:
1) The skill drives a logged-in browser profile, so it acts with your Google session's privileges. Only use it with accounts you trust for automation.
2) Removing watermarks may violate service terms or copyright; make sure you have the right to alter the images.
3) The README recommends installing GeminiWatermarkTool via brew or a GitHub release; verify the source and signatures before running downloaded binaries.
4) The optional send-feishu-image step transmits images to an external service (Feishu); double-check the recipients and that you trust the target workspace.
5) As a precaution, test in a throwaway or isolated account or environment first, and inspect any binaries you install.
If you need a higher-assurance review, ask for the exact GeminiWatermarkTool release links and checksums, and check the reputation of the brew tap and the GitHub repo.

Capability Assessment

✓ Purpose & Capability

Name/description (Gemini image generation + watermark removal) matches the SKILL.md: it uses the OpenClaw Browser Tool to drive a logged-in Google browser and then calls an external GeminiWatermarkTool binary. The skill does not request unrelated environment variables or binaries.

ℹ Instruction Scope

Instructions are explicit about browser actions, shell commands (sleep, ls), and calling a local binary. They require operating on an already logged-in browser profile (profile="user"), and optionally reference a local send-feishu-image skill to transmit images. These are within the declared purpose but mean the agent will act with the privileges of the logged-in browser and can send generated images externally if the optional step is used.

ℹ Install Mechanism

This is an instruction-only skill (no install spec). The SKILL.md recommends installing GeminiWatermarkTool via brew or GitHub Releases. Both are reasonable but downloading binaries from releases has inherent supply-chain risk — the skill itself does not perform the download.

ℹ Credentials

The skill declares no env vars or credentials, which is proportional. However it expects access to a logged-in Google browser profile and local filesystem paths (~/Downloads, ~/.claude/skills/gwt, ~/.openclaw/workspace/skills/send-feishu-image). Those accesses are consistent with the stated tasks but are sensitive (can access user account state and local files).

✓ Persistence & Privilege

always is false, no install step writes to disk as part of the skill bundle, and the skill does not request elevated or persistent platform privileges. Autonomous invocation is allowed but is the platform default and not itself suspicious here.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gemini-image-gen-watermark-removal
After installation, invoke the skill by name or use /gemini-image-gen-watermark-removal
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.1.0

Fix: replace act(wait) with exec sleep + snapshot polling to avoid CDP timeout. Add send-feishu-image integration. Add gwt fallback path. Add style selection notes.

v0.4.0

Fix: revert to OpenClaw browser tool API instead of Chrome DevTools MCP

v0.3.0

Migrate from Browser Relay CDP to Chrome DevTools MCP tools

v0.2.0

Rename to gemini-image-gen-watermark-removal; add watermark removal to description

Metadata

Slug gemini-image-gen-watermark-removal

Version 1.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is Gemini Image Gen + Watermark Removal?

Gemini Image Gen + Watermark Removal generates images in the Google Gemini web app and removes their watermarks. It drives the browser through the OpenClaw Browser Tool to generate and download images, then removes the watermark with GeminiWatermarkTool. It is an AI Agent Skill for Claude Code / OpenClaw, with 237 downloads so far.

How do I install Gemini Image Gen + Watermark Removal?

Run "/install gemini-image-gen-watermark-removal" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gemini Image Gen + Watermark Removal free?

Yes, Gemini Image Gen + Watermark Removal is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Gemini Image Gen + Watermark Removal support?

Gemini Image Gen + Watermark Removal is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gemini Image Gen + Watermark Removal?

It is built and maintained by imkingjh999 (@imkingjh999); the current version is v1.1.0.
