← 返回 Skills 市场
eccstartup

Gemini Browser

作者 Yushan Ren · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
272
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install gemini-browser
功能描述
Query Google Gemini via browser automation using OpenClaw's Browser Relay. Use when you need to ask Gemini questions and get AI responses. Requires OpenClaw...
使用说明 (SKILL.md)

Gemini Browser Skill

Query Google Gemini (gemini.google.com) via OpenClaw Browser Relay and extract responses.

⚠️ Security Notice: This skill operates on your real Chrome browser with your logged-in Google session via CDP (Chrome DevTools Protocol). The agent will have access to anything visible in the attached tab. Only attach tabs you explicitly intend for the agent to control. See Security Considerations.

Prerequisites

  • OpenClaw installed and running (this skill uses OpenClaw's browser command)
  • OpenClaw Browser Relay Chrome extension installed and configured
    • Extension binds to loopback 127.0.0.1:18792 by default
    • Gateway auth token must be configured in extension options
  • Google account logged in within Chrome (Gemini requires authentication)
  • Use profile=chrome to relay through your existing Chrome (not the isolated profile=openclaw-managed)

Quick Start

# 1. Open Gemini in Chrome
open -a "Google Chrome" "https://gemini.google.com"

# 2. Manually click the Browser Relay extension icon on the Gemini tab to attach
#    (the badge will show "ON" when attached)

# 3. Verify relay is connected
browser action=status profile=chrome
# Should show cdpReady: true

# 4. List tabs
browser action=tabs profile=chrome
# Note the targetId for the Gemini tab

Input Method

Gemini uses a Quill rich-text editor (contenteditable div), not a standard \x3Ctextarea>. You must inject text via JavaScript:

browser action=act profile=chrome targetId=\x3Cid> request={
  "kind": "evaluate",
  "fn": "(() => { const editor = document.querySelector('div.ql-editor[contenteditable=\"true\"]'); if (!editor) return 'editor not found'; editor.focus(); editor.innerHTML = '\x3Cp>YOUR_QUERY_HERE\x3C/p>'; editor.dispatchEvent(new Event('input', { bubbles: true })); return 'ok'; })()"
}

Then submit:

browser action=act profile=chrome targetId=\x3Cid> request={"kind":"press","key":"Enter"}

Complete Workflow

1. Prepare

Open Gemini in Chrome and manually attach the Browser Relay extension to the tab.

open -a "Google Chrome" "https://gemini.google.com"
# Then click the Browser Relay extension icon on the Gemini tab

2. Get Tab ID

browser action=tabs profile=chrome

Find the Gemini tab entry and note its targetId.

3. Input Query

browser action=act profile=chrome targetId=\x3Cid> request={
  "kind": "evaluate",
  "fn": "(() => { const editor = document.querySelector('div.ql-editor[contenteditable=\"true\"]'); if (!editor) return 'editor not found'; editor.focus(); editor.innerHTML = '\x3Cp>What is quantum computing?\x3C/p>'; editor.dispatchEvent(new Event('input', { bubbles: true })); return 'ok'; })()"
}

4. Submit

browser action=act profile=chrome targetId=\x3Cid> request={"kind":"press","key":"Enter"}

5. Wait for Response

Gemini may take 10–60 seconds. Poll for completion by checking if the stop button has disappeared:

browser action=act profile=chrome targetId=\x3Cid> request={
  "kind": "evaluate",
  "fn": "(() => { const stop = document.querySelector('button[aria-label*=\"Stop\"]'); return stop ? 'generating' : 'done'; })()"
}

6. Extract Response

Option A — Clipboard (recommended, preserves Markdown formatting):

# Take a snapshot and find the Copy button
browser action=snapshot profile=chrome targetId=\x3Cid>

# Click the Copy button by its ref from the snapshot
browser action=act profile=chrome targetId=\x3Cid> request={"kind":"click","ref":"\x3Ccopy_button_ref>"}

# Read from clipboard
pbpaste

Option B — DOM extraction (fallback):

browser action=act profile=chrome targetId=\x3Cid> request={
  "kind": "evaluate",
  "fn": "(() => { const msgs = document.querySelectorAll('.model-response-text'); if (msgs.length === 0) return 'no response found'; return msgs[msgs.length - 1].innerText; })()"
}

New Chat

For unrelated queries, start a fresh chat to avoid context pollution:

browser action=navigate profile=chrome targetId=\x3Cid> targetUrl="https://gemini.google.com"

Response Completion Signals

The response is complete when:

  • The stop button disappears
  • A copy button appears below the response
  • Suggested follow-up chips appear

Security Considerations

⚠️ Important: Understand these risks before using this skill.

  1. Session access: profile=chrome uses your real Chrome with all logged-in sessions. The agent can see and interact with anything in the attached tab, including your Google account context.
  2. JavaScript evaluation: The evaluate action runs arbitrary JavaScript in the page context. This skill limits it to DOM manipulation for the input field, but the mechanism itself is powerful.
  3. Manual attachment required: The Browser Relay extension must be manually clicked by you to attach — the agent cannot auto-attach to arbitrary tabs. Only attach the specific Gemini tab.
  4. Loopback only: The relay binds to 127.0.0.1 and requires an auth token, preventing remote access.
  5. Recommendation: Use a separate Chrome profile dedicated to AI automation, logged into a non-primary Google account, to limit exposure.

Troubleshooting

Problem Solution
cdpReady: false Click the Browser Relay extension icon on the Gemini tab to re-attach
Tab not found Run browser action=tabs profile=chrome to refresh tab list
Editor not found Page may not be fully loaded; wait and retry. Gemini may have changed DOM — check for div.ql-editor
Copy button not found Response may still be generating; poll stop button status first
Login wall Ensure Chrome is logged into a Google account
Context overflow Navigate to gemini.google.com for a fresh chat
安全使用建议
This skill appears to do what it says: it automates Gemini by attaching to a real Chrome tab via OpenClaw Browser Relay. Important cautions before you install/use it: - Only attach the extension to a tab you explicitly want the agent to control — the agent can read and interact with anything visible in that tab (including your account context and page data). - Use a dedicated Chrome profile (or a secondary Google account) for automation to limit exposure, as the author recommends. - The evaluate action runs arbitrary JavaScript in-page. That capability is required for Quill-based input, but it also means a malicious or misused agent could extract other page content. Rely on the manual-click attachment safeguard. - The SKILL.md contains macOS-specific commands (open -a, pbpaste). If you’re on Windows/Linux, these commands will fail; the skill metadata does not declare an OS restriction or required binaries. Ensure you adapt the clipboard/read commands or only use on macOS. - Verify you trust the OpenClaw Browser Relay extension and the environment (the extension’s auth token and loopback binding are the gatekeepers for local access). If you accept these risks and follow the mitigations (manual attach, separate profile/account), the skill is internally coherent. If you need higher assurance, ask the author for explicit OS compatibility notes and for the minimal set of commands/tools required (e.g., pbpaste alternative for other OSes).
功能分析
Type: OpenClaw Skill Name: gemini-browser Version: 1.0.0 The skill utilizes high-risk capabilities including arbitrary JavaScript execution (`evaluate`) within the user's primary, authenticated Chrome session and the use of the `pbpaste` shell command to read the system clipboard. While these behaviors are aligned with the stated purpose of automating Google Gemini and are accompanied by security warnings in SKILL.md and README.md, they grant the agent broad access to sensitive session data and potentially unrelated clipboard content, fitting the criteria for risky capabilities without clear malicious intent.
能力评估
Purpose & Capability
The skill's name/description match the actions in SKILL.md: it uses OpenClaw Browser Relay to control a real Chrome tab and interact with gemini.google.com. Required pieces (Browser Relay, OpenClaw, a logged-in Chrome profile) are appropriate. Minor inconsistency: the instructions use macOS-specific commands (open -a, pbpaste) and assume a local pbpaste tool, but the skill metadata does not declare an OS restriction or required binaries.
Instruction Scope
The SKILL.md stays within the stated purpose (open Gemini, inject text into the Quill editor, submit, and extract response). However, it requires executing arbitrary JavaScript in the page context via the evaluate action — a necessary capability for this automation but high-privilege: anything visible in the attached tab can be read or manipulated. The doc repeatedly warns about this and requires a manual extension click to attach, which is a mitigating control.
Install Mechanism
Instruction-only skill with no install spec or external downloads. This is low risk from an installation standpoint.
Credentials
The skill does not request environment variables, credentials, or config paths. That is proportionate for a browser-automation skill that operates through the user's existing browser session. Note: it does rely on local clipboard access (pbpaste) and OpenClaw being installed, which are not declared as required binaries.
Persistence & Privilege
always is false and the skill is user-invocable; it does not request persistent or privileged presence. Autonomous invocation (disable-model-invocation=false) is the platform default and not a standalone concern here. The skill does not attempt to modify other skills or system-wide settings.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install gemini-browser
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /gemini-browser 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Initial release: Gemini browser automation skill with security guide
元数据
Slug gemini-browser
版本 1.0.0
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 1
常见问题

Gemini Browser 是什么?

Query Google Gemini via browser automation using OpenClaw's Browser Relay. Use when you need to ask Gemini questions and get AI responses. Requires OpenClaw... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 272 次。

如何安装 Gemini Browser?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install gemini-browser」即可一键安装,无需额外配置。

Gemini Browser 是免费的吗?

是的,Gemini Browser 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

Gemini Browser 支持哪些平台?

Gemini Browser 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 Gemini Browser?

由 Yushan Ren(@eccstartup)开发并维护,当前版本 v1.0.0。

💬 留言讨论