Gemini Image Gen + Watermark Removal

Author: imkingjh999 · v1.1.0 · MIT-0
cross-platform · ✓ Security check passed
Total downloads: 237 · Favorites: 0 · Current installs: 0 · Versions: 4
Install in OpenClaw
/install gemini-image-gen-watermark-removal
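Description
Generate images in the Google Gemini web app and remove their watermarks. The skill drives the browser through the OpenClaw Browser Tool to generate and download an image, then removes the watermark with GeminiWatermarkTool. Trigger keywords: 谷歌生图 / Gemini 生图 / Google Gemini 图片 / 去水印 / 浮水印 / Gemini watermark removal.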
Usage (SKILL.md)

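Google Gemini Image Generation

Use the OpenClaw Browser Tool to control a browser that is already signed in to a Google account, then generate and download images in the Gemini web app.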

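Prerequisites

  • The browser is signed in to a Google account
  • The OpenClaw Browser Tool is available (check that openclaw browser status reports a healthy state)
  • The profile is set to user so that the tool connects to the Chrome instance that is already open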

Workflow

1. Open the Gemini page

browser(action="open", profile="user", url="https://gemini.google.com")

You can also open an existing conversation link directly to reuse its images:

browser(action="open", profile="user", url="https://gemini.google.com/app/<conversation-ID>")

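2. Click 「制作图片」 (Create image)

Take a snapshot to find the button's ref, then click it:

browser(action="snapshot", profile="user", compact=true)
// Find the ref of the 「制作图片」 button in the snapshot, then click it
browser(action="act", profile="user", request={"kind": "click", "ref": "<ref>"})

A new conversation first shows a style picker (单色, 色块, 跑跑, and so on). You can ignore it and type the prompt straight into the input box.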
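
Looking up the ref by hand can be scripted. The following is a minimal sketch, assuming the compact snapshot is plain text in which each element line carries its label plus a token such as [ref=e12]. That line layout and the find_ref helper are hypothetical, not a documented OpenClaw format, so adjust the pattern to whatever your snapshot actually prints.

```python
import re
from typing import Optional

# Assumption: each snapshot line looks roughly like
#   - button "制作图片" [ref=e12]
# This layout is illustrative only.
REF_PATTERN = re.compile(r"\[ref=([^\]]+)\]")


def find_ref(snapshot_text: str, label: str) -> Optional[str]:
    """Return the ref of the first snapshot line that mentions label, or None."""
    for line in snapshot_text.splitlines():
        if label in line:
            match = REF_PATTERN.search(line)
            if match:
                return match.group(1)
    return None
```

The returned value would then go into the "ref" field of the click request.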

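3. Enter the prompt and send it

browser(action="act", profile="user", request={"kind": "type", "ref": "<textarea ref>", "text": "your prompt"})
browser(action="act", profile="user", request={"kind": "press", "key": "Enter"})

⚠️ Prompt rules

  • Avoid verbs such as "唱" (sing) or "弹奏" (play an instrument); otherwise Gemini may trigger music generation instead of image generation
  • Use purely visual descriptions instead, e.g. "wearing a microphone headset" rather than "singing with a microphone"
  • When the image needs text, state it directly in the prompt, e.g. The text "畢士傾訴" appears on a banner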
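
The first rule can be checked before a prompt is sent. The sketch below is illustrative: the Chinese triggers come from the rules above, while the English word list and the find_music_triggers helper are assumptions added for the example and are not an exhaustive list of what Gemini reacts to.

```python
import re
from typing import List

# Triggers named in the prompt rules ("弹" also covers "弹奏").
CJK_TRIGGERS = ["唱", "弹"]
# Assumed English equivalents; whole words only, so "using" is not flagged.
EN_TRIGGERS = re.compile(r"\b(sing|sings|singing|sang)\b", re.IGNORECASE)


def find_music_triggers(prompt: str) -> List[str]:
    """Return the trigger words found in prompt; an empty list means none were found."""
    hits = [word for word in CJK_TRIGGERS if word in prompt]
    hits += [m.group(0).lower() for m in EN_TRIGGERS.finditer(prompt)]
    return hits
```

A non-empty result is only a hint to rephrase the prompt visually. Substring matching on single Chinese characters can also flag unrelated words.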

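4. Wait for the image to generate

⚠️ Key point: do not use act(kind="wait")

act(kind="wait") has no real "wait for the page to change" mechanism at the CDP level. It only waits for a WebSocket response; if none arrives within 8 seconds it times out, and the timeout leaves the entire browser tool session unusable.

Correct approach: wait with exec sleep, then take a snapshot

exec: sleep 20 && echo "done"
// After exec finishes
browser(action="snapshot", profile="user", compact=true)

Completion indicator: buttons such as 「下载完整尺寸的图片」 (Download full-size image), 「复制图片」 (Copy image), and 「分享图片」 (Share image) appear on the page.

If the snapshot shows that generation is still in progress (a "Creating your image..." button is present), sleep for another round.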
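
The sleep-then-snapshot loop can be sketched as follows. This is a minimal illustration, not the skill's implementation: take_snapshot is a hypothetical callable standing in for the browser(action="snapshot", ...) call, and the max_rounds limit is an assumption added so the loop cannot run forever.

```python
import time
from typing import Callable

# Button labels that signal a finished image, taken from the step above.
DONE_MARKERS = ("下载完整尺寸的图片", "复制图片", "分享图片")


def wait_for_image(
    take_snapshot: Callable[[], str],
    interval: float = 20.0,
    max_rounds: int = 6,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Sleep, snapshot, and repeat until a completion button appears.

    Returns True once any done marker is seen, False after max_rounds attempts.
    """
    for _ in range(max_rounds):
        sleep(interval)
        snapshot = take_snapshot()
        if any(marker in snapshot for marker in DONE_MARKERS):
            return True
    return False
```

Sleeping before each snapshot mirrors the exec sleep pattern and keeps every browser call short, which is the point of avoiding act(kind="wait").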

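5. Download the image

Click the 「下载完整尺寸的图片」 (Download full-size image) button:

browser(action="act", profile="user", request={"kind": "click", "ref": "<download-button ref>"})

Wait for the download to finish, then check the downloads directory:

sleep 5 && ls -lt ~/Downloads/Gemini_Generated_Image* | head -3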
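
If the next step runs from a script, the newest download can be picked programmatically instead of by reading the ls output. A small sketch, assuming downloads land in ~/Downloads with the Gemini_Generated_Image prefix used in the command above; the latest_gemini_download helper is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_gemini_download(download_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the most recently modified Gemini image in download_dir, or None."""
    if download_dir is None:
        download_dir = Path.home() / "Downloads"
    candidates = list(download_dir.glob("Gemini_Generated_Image*"))
    if not candidates:
        return None
    # Same ordering as `ls -lt`: newest modification time first.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```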

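6. Remove the watermark

Images generated by Gemini carry a watermark. Use GeminiWatermarkTool to remove it.

Install (macOS / Linux):

brew install allenk/tap/gwt

Or download the binary from GitHub Releases.

Known working path (if brew is unavailable):

~/.claude/skills/gwt/bin/GeminiWatermarkTool

Usage:

gwt --force -i <input-image> -o <output-image>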
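
The install fallback and the command above can be combined in one place. This is a sketch under two assumptions: the binary at the fallback path accepts the same --force, -i, and -o flags as gwt, and the helper names (resolve_gwt, build_gwt_command, remove_watermark) are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List

# Fallback location listed above for when brew is unavailable.
FALLBACK_BINARY = Path.home() / ".claude/skills/gwt/bin/GeminiWatermarkTool"


def resolve_gwt() -> str:
    """Prefer gwt on PATH, then the fallback binary; raise if neither exists."""
    found = shutil.which("gwt")
    if found:
        return found
    if FALLBACK_BINARY.is_file():
        return str(FALLBACK_BINARY)
    raise FileNotFoundError("gwt not found on PATH or at the fallback path")


def build_gwt_command(binary: str, src: Path, dst: Path) -> List[str]:
    """Build the argument list for: gwt --force -i <input> -o <output>."""
    return [binary, "--force", "-i", str(src), "-o", str(dst)]


def remove_watermark(src: Path, dst: Path) -> None:
    """Run the tool and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_gwt_command(resolve_gwt(), src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.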

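7. Send to Feishu (optional)

Use the send-feishu-image skill:

import os
import sys

# sys.path does not expand "~", so resolve the home directory explicitly
sys.path.insert(0, os.path.expanduser("~/.openclaw/workspace/skills/send-feishu-image"))
from send_feishu_image import send_image

result = send_image(
    image_path="/path/to/output.png",
    user_id="ou_7abe0c2af8a0f7b5b1c1171bcd8707d8",  # replace with the recipient's Feishu user ID
    caption="Image caption"
)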

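Known issues

Problem | Solution
act(kind="wait") times out and the browser tool becomes unusable | Never use act(kind="wait"); poll with exec sleep + snapshot instead
snapshot times out | Restart the Gateway (menu bar: OpenClaw → Restart)
Tab not found | Run browser(action="snapshot") to check the current page state
Music generation was triggered | Remove words such as "唱" and "弹" from the prompt and use purely visual descriptions
Image takes a long time to generate | The Gemini model is slow; sleep 20-25 seconds, then snapshot again
gwt installation fails (GitHub unreachable) | Check whether ~/.claude/skills/gwt/bin/GeminiWatermarkTool already exists
New file not found after download | The file name varies; use ls -lt to sort by time and find the newest file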

After finishing

  • Close tabs you no longer need: browser(action="close", targetId="<ID>")
Security recommendations
This skill appears to do what it says, but consider these points before installing or using it:
  1. The skill drives a logged-in browser profile, so it acts with your Google session's privileges. Only use it with accounts you trust for automation.
  2. Removing watermarks may violate service terms or copyright; ensure you have the right to alter the images.
  3. GeminiWatermarkTool is recommended to be installed via brew or a direct GitHub release; verify the source and signatures before running downloaded binaries.
  4. The optional send-feishu-image step transmits images to an external service (Feishu); double-check recipients and that you trust the target workspace.
  5. As a precaution, test in a throwaway or isolated account or environment first, and inspect any binaries you install.
If you need a higher-assurance review, ask for the exact GeminiWatermarkTool release links and checksums, and verify the reputations of the brew tap and the GitHub repo.
Capability assessment
Purpose & Capability
Name/description (Gemini image generation + watermark removal) matches the SKILL.md: it uses the OpenClaw Browser Tool to drive a logged-in Google browser and then calls an external GeminiWatermarkTool binary. The skill does not request unrelated environment variables or binaries.
Instruction Scope
Instructions are explicit about browser actions, shell commands (sleep, ls), and calling a local binary. They require operating on an already logged-in browser profile (profile="user"), and optionally reference a local send-feishu-image skill to transmit images. These are within the declared purpose but mean the agent will act with the privileges of the logged-in browser and can send generated images externally if the optional step is used.
Install Mechanism
This is an instruction-only skill (no install spec). The SKILL.md recommends installing GeminiWatermarkTool via brew or GitHub Releases. Both are reasonable but downloading binaries from releases has inherent supply-chain risk — the skill itself does not perform the download.
Credentials
The skill declares no env vars or credentials, which is proportional. However it expects access to a logged-in Google browser profile and local filesystem paths (~/Downloads, ~/.claude/skills/gwt, ~/.openclaw/workspace/skills/send-feishu-image). Those accesses are consistent with the stated tasks but are sensitive (can access user account state and local files).
Persistence & Privilege
always is false, no install step writes to disk as part of the skill bundle, and the skill does not request elevated or persistent platform privileges. Autonomous invocation is allowed but is the platform default and not itself suspicious here.
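How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install gemini-image-gen-watermark-removal
  3. After installation, invoke the Skill by name or trigger it with /gemini-image-gen-watermark-removal
  4. Provide the inputs described in the Skill's parameter notes to get structured output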
Version history
v1.1.0
Fix: replace act(wait) with exec sleep + snapshot polling to avoid CDP timeout. Add send-feishu-image integration. Add gwt fallback path. Add style selection notes.
v0.4.0
Fix: revert to OpenClaw browser tool API instead of Chrome DevTools MCP
v0.3.0
Migrate from Browser Relay CDP to Chrome DevTools MCP tools
v0.2.0
Rename to gemini-image-gen-watermark-removal; add watermark removal to description
Metadata
Slug: gemini-image-gen-watermark-removal
Version: 1.1.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions released: 4
FAQ

What is Gemini Image Gen + Watermark Removal?

It generates images in the Google Gemini web app and removes their watermarks: the OpenClaw Browser Tool drives the browser to generate and download an image, and GeminiWatermarkTool removes the watermark. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 237 times so far.

How do I install Gemini Image Gen + Watermark Removal?

Run the command "/install gemini-image-gen-watermark-removal" in the OpenClaw or Claude Code chat box. No extra configuration is needed.

Is Gemini Image Gen + Watermark Removal free?

Yes. Gemini Image Gen + Watermark Removal is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Gemini Image Gen + Watermark Removal support?

Gemini Image Gen + Watermark Removal is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Gemini Image Gen + Watermark Removal?

It is developed and maintained by imkingjh999 (@imkingjh999). The current version is v1.1.0.
