Description

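Performs Q&A and image generation through the official Gemini website (gemini.google.com). Triggered when the user says "问问Gemini / 让Gemini回答 / 去Gemini问" (ask Gemini / have Gemini answer / go ask Gemini), or when keywords such as "生图 / 画图 / 绘图 / nano banana / nanobanana / 生成图片" (image generation and drawing terms) appear. Defaults to the strongest available model (preferring Gemini 3.1 Pro), ...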

README (SKILL.md)

Gemini Web Ops

Name: Gemini Skill
Author: wjz-p

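Core Rules

Use the OpenClaw built-in browser with profile="openclaw".
When image-generation keywords appear (e.g., 生图, 绘图, 画一张, nano banana), prefer the headless-browser flow.
Text Q&A tasks (e.g., "问问Gemini xxx") go through the Gemini text-prompt path.
Default model: the strongest model in the available list, preferring Gemini 3.1 Pro.
After starting image generation, first tell the user "正在绘图中" ("drawing in progress"), then return the image once it is finished.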

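Task Routing

Text Q&A triggers: 问问Gemini (ask Gemini), 让Gemini回答 (have Gemini answer), 去Gemini问 (go ask Gemini).
Image-generation triggers: 生图 (generate an image), 画 (draw), 绘图 (illustrate), 海报 (poster), nano banana, nanobanana, image generation.
If the request is ambiguous, first confirm whether the user wants a text answer or an image.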
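
The routing rule above can be sketched as a small keyword matcher. This is an illustrative sketch, not the skill's actual implementation: the names `Route` and `route_request` are hypothetical, and two behaviors are assumptions the source does not state, namely that image keywords win when both kinds of trigger appear, and that a request matching neither list counts as ambiguous.

```python
from enum import Enum
from typing import Sequence


class Route(Enum):
    TEXT_QA = "text_qa"
    IMAGE = "image"
    CLARIFY = "clarify"  # ask the user: text answer or image?


# Trigger words copied from the routing lists above.
TEXT_TRIGGERS = ("问问Gemini", "让Gemini回答", "去Gemini问")
IMAGE_TRIGGERS = ("生图", "画", "绘图", "海报",
                  "nano banana", "nanobanana", "image generation")


def _contains_any(text: str, triggers: Sequence[str]) -> bool:
    # Case-insensitive substring match so "Nano Banana" also counts.
    lowered = text.lower()
    return any(trigger.lower() in lowered for trigger in triggers)


def route_request(text: str) -> Route:
    # Assumption: image keywords take precedence over text triggers.
    if _contains_any(text, IMAGE_TRIGGERS):
        return Route.IMAGE
    if _contains_any(text, TEXT_TRIGGERS):
        return Route.TEXT_QA
    # Assumption: no trigger at all is treated as an ambiguous request.
    return Route.CLARIFY
```

Plain substring matching keeps the sketch short, but a single-character trigger such as 画 will also match inside unrelated words, so a real router would likely need tighter matching.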

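Standard Execution Flow

A. Text Q&A

Open https://gemini.google.com.
Verify the login state (avatar and input box are visible).
Select the strongest available model (preferring Gemini 3.1 Pro).
Enter the user's question verbatim and send it.
Wait for the complete output, then return a condensed answer (with key points from the original text when necessary).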
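
The model-selection step in this flow can be sketched as a walk down a preference list. This is a minimal sketch under stated assumptions: only "Gemini 3.1 Pro" comes from the rules in this document, the lower-ranked entries are placeholders, and `pick_model` is a hypothetical helper name.

```python
from typing import Optional, Sequence, Tuple

# Only the first entry is named in this document; the others are
# placeholders standing in for whatever weaker models the UI offers.
PREFERENCE = (
    "Gemini 3.1 Pro",
    "placeholder-second-choice",
    "placeholder-third-choice",
)


def pick_model(
    available: Sequence[str],
    preference: Sequence[str] = PREFERENCE,
) -> Tuple[Optional[str], bool]:
    """Return (model, degraded).

    degraded is True when the top preference was not available, which is
    the case where the user should be told about the downgrade.
    """
    for rank, name in enumerate(preference):
        if name in available:
            return name, rank > 0
    # Nothing from the preference list is offered at all.
    return None, True
```

Returning the `degraded` flag alongside the model name lets the caller decide how to word the notice to the user.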

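B. Image Generation Flow

Open the Gemini page and confirm login.
Select the strongest available model (preferring Gemini 3.1 Pro).
Enter the user's prompt verbatim.
Enable the image-generation capability (if the UI has a "生成图片/图片" (Generate image / Image) toggle).
Immediately after sending, notify the user that drawing is in progress.
Once results appear:
- Prefer the "下载原图" (Download original) button to get the original image.
- If there is no download button or the download fails, right-click the image and save it (usually a standard-definition copy).
Return the image to the user; if there are several, return all of them in order.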
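
The retrieval steps above (original download first, save-as second, every image returned in order) can be sketched as follows. The callables `download_original` and `save_as` are hypothetical stand-ins for the real browser actions, which this document does not specify; the sketch assumes a missing download button is signalled by `None` and a failed download by an exception.

```python
from typing import Callable, List, Optional, Sequence


def collect_images(
    image_ids: Sequence[str],
    download_original: Callable[[str], Optional[bytes]],
    save_as: Callable[[str], bytes],
) -> List[bytes]:
    """Fetch every generated image in order, preferring the original file."""
    collected: List[bytes] = []
    for image_id in image_ids:
        try:
            # None is this sketch's signal for "no download button".
            data = download_original(image_id)
        except Exception:
            # The download failed; fall through to the save-as path.
            data = None
        if data is None:
            # Right-click save-as, usually a standard-definition copy.
            data = save_as(image_id)
        collected.append(data)
    return collected
```

Because the loop appends exactly one entry per image, the returned list keeps the on-page order even when individual images take different paths.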

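Failure Fallbacks

Element lookup fails: refresh the page and retry once.
Model unavailable: fall back to the next-best Gemini model and tell the user.
Generation times out: report "仍在生成中" ("still generating") and wait one more time; if it times out again, ask the user to try a shorter prompt.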
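
Two of these fallbacks, the single retry after a refresh and the two-window timeout, can be sketched as plain control flow. This is an illustrative sketch: the function names are hypothetical, the callables stand in for real browser actions, and it assumes a lookup failure surfaces as `LookupError` and a timeout as a `None` result.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_once_after_refresh(
    action: Callable[[], T],
    refresh: Callable[[], None],
) -> T:
    """Run action; if the element is not found, refresh and retry once."""
    try:
        return action()
    except LookupError:
        refresh()
        # A second failure propagates to the caller unchanged.
        return action()


def wait_for_generation(
    wait: Callable[[], Optional[str]],
    notify: Callable[[str], None],
) -> Optional[str]:
    """wait() blocks for one timeout window; None means it timed out."""
    result = wait()
    if result is not None:
        return result
    notify("Still generating")
    result = wait()
    if result is None:
        notify("Please try a shorter prompt")
    return result
```

Both helpers cap the extra work at exactly one additional attempt, matching the "retry once" and "wait one more time" wording above.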

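Low-Token-First Strategy

Prefer the shortcut selectors in scripts/gemini_ui_shortcuts.js.
Run batched actions with evaluate first, then use snapshot as a targeted fallback.
Avoid frequent full snapshots.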
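
The evaluate-first, snapshot-as-fallback ordering can be sketched like this. It does not reflect the actual contents of scripts/gemini_ui_shortcuts.js, which this document does not show: `evaluate_batch` and `snapshot_lookup` are hypothetical callables that each map element names to a locator, or to `None` when an element cannot be resolved.

```python
from typing import Callable, Dict, Iterable, List, Optional

Lookup = Callable[[List[str]], Dict[str, Optional[str]]]


def locate_elements(
    names: Iterable[str],
    evaluate_batch: Lookup,
    snapshot_lookup: Lookup,
) -> Dict[str, Optional[str]]:
    """Resolve all elements cheaply first; snapshot only for the leftovers."""
    wanted = list(names)
    # One batched evaluate call covers every requested element.
    found = dict(evaluate_batch(wanted))
    missing = [name for name in wanted if found.get(name) is None]
    if missing:
        # At most one snapshot, scoped to what the batch could not resolve.
        found.update(snapshot_lookup(missing))
    return {name: found.get(name) for name in wanted}
```

When the batch resolves everything, the snapshot callable is never invoked, which is where the token saving comes from.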

References

Detailed execution and fallbacks: references/gemini-flow.md
Keywords and routing: references/intent-routing.md

Usage Guidance

This skill is internally consistent and appears to do only web-automation of the Gemini UI using the OpenClaw browser profile. Before installing, ensure you understand that: (1) it will use the specified browser profile and any Gemini login in that profile (cookies/account), (2) it may download images to local storage to return them to you, and (3) automated UI scripts interact with page DOM selectors that may break if the site changes. If you prefer extra isolation, test with a secondary Gemini account or confirm the 'openclaw' profile is limited to non-sensitive sessions. If you need the skill to access other services or credentials, require justification before granting them.

Capability Analysis

Type: OpenClaw Skill
Name: gemini-skill
Version: 0.1.0

The skill bundle is a legitimate automation tool for interacting with the Gemini web interface (gemini.google.com) for text queries and image generation. The JavaScript helper (scripts/gemini_ui_shortcuts.js) contains standard DOM manipulation logic for UI automation, and the instructions in SKILL.md and the reference files are consistent with the stated purpose without any signs of data exfiltration, malicious execution, or harmful prompt injection.

Capability Assessment

✓ Purpose & Capability

Name and description match the actual behavior: open the Gemini web UI, choose a model, send prompts, and download/return generated images. No unrelated cloud credentials, binaries, or external services are requested.

✓ Instruction Scope

SKILL.md confines operations to the Gemini website via the built-in browser profile (checking login, selecting model, entering text, toggling image generation, downloading results). It does instruct downloading images to local disk so they can be returned to the user, which is consistent with its purpose and is the main file I/O implied.

✓ Install Mechanism

There is no install spec; the skill is instruction-only plus one small page-injection helper script (DOM selectors, click/fill utilities). No external downloads or package installs are requested.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. Its requirement to use the 'openclaw' browser profile is proportional to automating a logged-in web session (it relies on browser cookies/auth present in that profile).

✓ Persistence & Privilege

The skill sets always:false and uses the default invocation model. It does not request permanent platform-level presence, nor does it modify other skills or system-wide settings.

Version History

v0.1.0

Initial release of gemini-skill with core web-based Gemini Q&A and image generation support.
- Supports both text Q&A and image generation via the Gemini website using the strongest available model (default: Gemini 3.1 Pro).
- Detects trigger keywords for task routing: separate flows for "问问Gemini" (text Q&A) and "生图/画图/nano banana" (image generation).
- Provides user feedback during image generation and returns output images or answers accordingly.
- Includes fallback logic for element detection, model availability, and timeout handling.
- Optimized for low token consumption with shortcut scripts and streamlined UI actions.

Metadata

Slug: gemini-skill
Version: 0.1.0
License: MIT-0
All-time Installs: 1
Active Installs: 1
Total Versions: 1

Frequently Asked Questions

What is Gemini Skill?

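Performs Q&A and image generation through the official Gemini website (gemini.google.com). Triggered when the user says "问问Gemini / 让Gemini回答 / 去Gemini问" (ask Gemini / have Gemini answer / go ask Gemini), or when keywords such as "生图 / 画图 / 绘图 / nano banana / nanobanana / 生成图片" (image generation and drawing terms) appear. Defaults to the strongest available model (preferring Gemini 3.1 Pro), ... It is an AI agent skill for Claude Code / OpenClaw, with 712 downloads so far.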

How do I install Gemini Skill?

Run "/install gemini-skill" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gemini Skill free?

Yes, Gemini Skill is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Gemini Skill support?

Gemini Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gemini Skill?

It is built and maintained by WJZ-P (@wjz-p); the current version is v0.1.0.
