clarezoe

Browser Driver

Author: clarezoe · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 38
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install browser-driver
Description
Attach to the user's OWN already-logged-in system browser over the Chrome DevTools Protocol (CDP) with Playwright, so automation reuses their existing sessio...
Usage instructions (SKILL.md)

Browser Driver (system browser over CDP)

Drive the user's own, already-logged-in browser via CDP + Playwright. The user watches every step; you screenshot after each mutating action so both you and the user can audit.

Why the system browser, not a fresh Playwright Chromium

A fresh Playwright profile has no sessions — the user would re-log-in with 2FA from scratch. The user's own Chromium browser already holds the session, so there is zero login friction. You attach to it over a debug port; you do not replace it.

When to use this — and when NOT to (avoid tool conflict)

This skill exists for exactly one thing the other browser tools cannot do: attach to the user's real browser and its existing login. Reach for it only when all of these hold:

  • The task needs the user's existing session / login (would otherwise require logging in, often with 2FA), AND
  • A fresh or headless browser profile would land on a login screen, AND
  • The user wants it done in their own browser, typically watching live.

Signals: "do it in my browser", "use my login / my session", "I'm already signed in", "don't make me log in again", a console/dashboard behind SSO+2FA.

Do NOT use this skill — defer to the agent's built-in browser tooling — for:

  • ordinary browsing, fetching, or scraping of public or unauthenticated pages,
  • headless end-to-end tests or CI automation,
  • anything where a clean throwaway profile is fine (no real login needed),
  • when the agent already has a browser tool / Playwright MCP / chrome-devtools MCP / computer-use that satisfies the task without the user's live session.

If the built-in tooling can do it, use that and stop — do not load this skill's reference files. Only load references/* once you have actually committed to driving the system browser, so an irrelevant browser task costs ~0 extra tokens (just this short description).

Scope boundary (read first)

Only ever drive the user's own browser, own sessions, own accounts, with their knowledge and present consent. Never use this to bypass authentication, defeat a security control, or reach an account that is not theirs. Identity walls (Touch ID, security key, liveness/QR) are handed back to the user — never attempt to defeat or proxy them.

Workflow

  1. Check where the agent runs vs where the browser is. If you (the agent) run on the same machine as the user's logged-in browser, proceed normally. If you run on a remote host (VPS) and the browser is on the user's local machine, you must bridge the CDP port over an SSH reverse tunnel first — references/remote-over-tunnel.md. If the only browser is a headless one on the VPS, there is no existing user session to reuse and this skill does not apply.
  2. Pick the browser + port. Any Chromium-based browser works (Chrome, Edge, Brave). Choose a non-default debug port (e.g. 9223) to avoid clashes.
  3. Launch with the debug port so tabs/session restore — references/launch-and-drive.md.
  4. Probe the port until it answers, then attach.
  5. Drive one step at a time with short, one-shot scripts that connect fresh, act, screenshot, and detach — references/launch-and-drive.md.
  6. Read the screenshot after every mutating step. That is how you "see" the page and how the user audits.
  7. Handle selectors for modern SPAs (shadow DOM, overlays, display-vs-internal names) — references/selectors-and-handoffs.md.
  8. Hand control to the user at identity walls, then poll for completion — references/selectors-and-handoffs.md.
  9. Capture one-time secrets straight to a file, never to chat — references/selectors-and-handoffs.md.
  10. Clean up: restart the browser normally so no open debug port lingers — references/selectors-and-handoffs.md.
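
Step 4 ("probe the port until it answers") can be sketched as below. This is a minimal illustration, not the contents of references/launch-and-drive.md: the helper names `pollUntil` and `probeCdp`, the injected `FetchLike` type, and the retry defaults are all assumptions made for the example. It relies on the `/json/version` HTTP endpoint that Chromium-based browsers serve on the debug port, which reports a `webSocketDebuggerUrl` once the browser is ready.

```typescript
// Minimal shape of the fetch-like function this sketch needs (an assumption,
// chosen so the helper can be exercised without a real browser).
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Call `check` until it yields a non-null value or the attempts run out.
async function pollUntil<T>(
  check: () => Promise<T | null>,
  attempts: number,
  delayMs: number,
): Promise<T | null> {
  for (let i = 0; i < attempts; i++) {
    // A thrown error (e.g. connection refused) counts as "not ready yet".
    const result = await check().catch(() => null);
    if (result !== null) return result;
    if (i < attempts - 1) await sleep(delayMs);
  }
  return null;
}

// Probe the loopback debug port; resolve with the WebSocket endpoint once
// the browser answers, or null if it never does.
async function probeCdp(
  port: number,
  fetchImpl: FetchLike,
  attempts = 20,
  delayMs = 500,
): Promise<string | null> {
  return pollUntil(async () => {
    const res = await fetchImpl(`http://127.0.0.1:${port}/json/version`);
    if (!res.ok) return null;
    const info = (await res.json()) as { webSocketDebuggerUrl?: string };
    return info.webSocketDebuggerUrl ?? null;
  }, attempts, delayMs);
}
```

On a Node version with a global `fetch`, a call might look like `await probeCdp(9223, fetch)`. The same `pollUntil` helper could plausibly serve step 8 as well, with a `check` that reads the page text and returns null while the identity wall is still showing.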

Core rules

  • One-shot scripts, not a long-lived process. A long-running Playwright process dies with the shell. Connect fresh per step with connectOverCDP, act, then browser.close() (this only detaches — the user's browser stays open).
  • Screenshot every mutating step and Read the PNG before the next action. No blind clicking.
  • Locators over querySelectorAll. Playwright locators pierce shadow DOM; raw document.querySelectorAll inside page.evaluate does not.
  • DOM-click through overlays. When something "intercepts pointer events", locator.evaluate(el => el.click()) fires a DOM click and skips hit-testing; force: true is often not enough.
  • Stop at identity walls. Click up to the wall, tell the user exactly what to do, then poll page text or a source-of-truth API until it clears.
  • Secrets shown once go to a file, never to chat. Regex the value out of the page text into a temp file, store it in the user's password manager, then shred the temp file.
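
The one-shot and overlay rules above might be sketched as follows. This is an illustration under stated assumptions, not the skill's own script: `runStep`, `clickThroughOverlay`, and the `*Like` interfaces are hypothetical names, and the interfaces describe only the handful of members the sketch touches so that it can be read and exercised without Playwright installed. Playwright's own `Browser`, `Page`, and `Locator` objects are expected to fit these shapes.

```typescript
// Structural stand-ins for the few Playwright members used here (assumptions).
interface LocatorLike {
  click(options?: { timeout?: number }): Promise<void>;
  evaluate(fn: (el: any) => void): Promise<unknown>;
}
interface PageLike {
  screenshot(options: { path: string }): Promise<unknown>;
}
interface BrowserLike<P extends PageLike> {
  contexts(): Array<{ pages(): P[] }>;
  close(): Promise<void>;
}

// One step: connect fresh, act on the first open tab, screenshot, detach.
async function runStep<P extends PageLike>(
  connect: (endpoint: string) => Promise<BrowserLike<P>>,
  endpoint: string,
  screenshotPath: string,
  action: (page: P) => Promise<void>,
): Promise<string> {
  const browser = await connect(endpoint);
  try {
    const page = browser.contexts()[0]?.pages()[0];
    if (!page) throw new Error("no open tab found in the attached browser");
    await action(page);
    // The PNG is what the agent reads before deciding on the next action.
    await page.screenshot({ path: screenshotPath });
    return screenshotPath;
  } finally {
    // Per the rule above, close() on a CDP-attached browser only detaches.
    await browser.close();
  }
}

// Try a normal pointer click first; if it fails (for example because an
// overlay intercepts pointer events), fall back to a DOM click.
async function clickThroughOverlay(
  target: LocatorLike,
  timeoutMs = 2000,
): Promise<"pointer" | "dom"> {
  try {
    await target.click({ timeout: timeoutMs });
    return "pointer";
  } catch {
    await target.evaluate((el) => el.click());
    return "dom";
  }
}
```

With Playwright installed, `connect` would typically be `(url) => chromium.connectOverCDP(url)` and the action would pass a `page.locator(...)` result to `clickThroughOverlay`. Picking the first tab of the first context is a simplification; a real step would probably select the tab by URL.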

Reference files (load on demand)

  • references/launch-and-drive.md — launch-with-debug-port sequence (and why open --args is unreliable right after quit), the probe, and the per-step one-shot script pattern with screenshot auditing.
  • references/selectors-and-handoffs.md — SPA selector gotchas (shadow DOM, overlays, display vs internal names), identity-wall handoff + polling, one-time-secret capture, cleanup, and a stdout-corruption gotcha.
  • references/remote-over-tunnel.md — when the agent is remote (VPS) and the browser is on the user's local machine: bridge the CDP port over a loopback-only SSH reverse tunnel; security rules for unauthenticated CDP; when this skill does not apply (headless VPS, no user session).
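
As a rough illustration of the loopback-only reverse tunnel mentioned in the last item, the sketch below builds the argument list for an `ssh` invocation run on the user's local machine. It is an assumption about what such a command could look like, not the contents of references/remote-over-tunnel.md; the function name `reverseTunnelArgs` and the host `agent@vps.example` are hypothetical.

```typescript
// Build the arguments for `ssh`, to be run on the user's LOCAL machine.
//   -N  open the tunnel only, run no remote command
//   -R  reverse forward: connections to 127.0.0.1:<port> on the remote host
//       are carried back to 127.0.0.1:<port> on this machine
// Binding both ends to 127.0.0.1 keeps the unauthenticated CDP port off
// public interfaces.
function reverseTunnelArgs(remote: string, port: number): string[] {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${port}`);
  }
  const loopback = "127.0.0.1";
  return [
    "-N",
    "-o", "ExitOnForwardFailure=yes", // fail fast if the remote port is taken
    "-R", `${loopback}:${port}:${loopback}:${port}`,
    remote,
  ];
}
```

For example, `reverseTunnelArgs("agent@vps.example", 9223)` yields the arguments for `ssh -N -o ExitOnForwardFailure=yes -R 127.0.0.1:9223:127.0.0.1:9223 agent@vps.example`. Stopping that `ssh` process tears the tunnel down.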
Safe usage advice
Install only if you intentionally want an agent to control your already-signed-in local browser. Keep the CDP port bound to localhost, watch the session while it runs, close the tunnel and restart the browser normally afterward, and avoid temporary plaintext secret files unless there is no safer password-manager path.
Capability tags
crypto
Capability assessment
Purpose & Capability
The skill is explicitly for attaching to the user's own already-logged-in Chromium browser over CDP, including dashboards behind 2FA, screenshot-audited actions, identity-wall handoff, and one-time secret capture. These are sensitive capabilities but coherent with the stated purpose.
Instruction Scope
The instructions repeatedly limit use to the user's own browser, own sessions, own accounts, present consent, and cases where a fresh browser is insufficient. They also say not to use it for ordinary browsing or public scraping.
Install Mechanism
The package contains Markdown guidance and a small YAML agent prompt only; no executable installers, hooks, hidden scripts, or automatic startup behavior were present.
Credentials
Opening a CDP port and optionally tunneling it is powerful, but the remote-tunnel reference warns that CDP has no authentication, requires loopback binding, forbids public forwarding, and instructs teardown after use.
Persistence & Privilege
The skill uses a detached browser launch for the task and writes one-time secrets to a temporary local file before password-manager storage and deletion. This is disclosed and purpose-aligned, but users should treat both the debug port and any temporary secret file as sensitive.
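
The temporary-file handling described above could look roughly like this. It is a sketch under assumptions, not the skill's actual code: the function names, the file name `secret.txt`, and the example pattern are hypothetical, and overwriting before deletion is only a best-effort stand-in for a real shred (journaling file systems and SSDs may keep old blocks). File modes are mostly ignored on Windows.

```typescript
import { existsSync, mkdtempSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Pull the first capture group out of the page text and write it to a
// private temp file. Only the path and length are returned, never the
// value, so the secret does not leak into logs or chat.
function captureSecretToFile(
  pageText: string,
  pattern: RegExp,
): { path: string; length: number } {
  const value = pattern.exec(pageText)?.[1];
  if (!value) throw new Error("secret pattern did not match the page text");
  const dir = mkdtempSync(join(tmpdir(), "one-time-secret-"));
  const path = join(dir, "secret.txt");
  writeFileSync(path, value, { mode: 0o600 }); // owner read/write only
  return { path, length: value.length };
}

// Best-effort removal once the value is in the password manager:
// overwrite the contents with zeros, then delete the temp directory.
function discardSecretFile(path: string): boolean {
  const size = statSync(path).size;
  writeFileSync(path, Buffer.alloc(size));
  rmSync(dirname(path), { recursive: true, force: true });
  return !existsSync(path);
}
```

A caller would pass the page's visible text and a pattern such as `/API key: (\S+)/`, report only the returned path to the user, and call `discardSecretFile` after the value has been stored.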
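How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install browser-driver
  3. Once installed, invoke the Skill by name or trigger it with /browser-driver
  4. Provide the required inputs per the Skill's parameter documentation to receive structured output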
Version history
v1.0.0
Initial release: drive the user's own logged-in system browser over CDP with Playwright (reuse session, no 2FA re-login). Per-step scripts, shadow-DOM/overlay handling, identity-wall handoff, one-time-secret capture, remote VPS+local browser SSH tunnel, cleanup.
Metadata
Slug: browser-driver
Version: 1.0.0
License: MIT-0
Total installs: 1
Current installs: 1
Versions released: 1
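FAQ

What is Browser Driver?

Attach to the user's OWN already-logged-in system browser over the Chrome DevTools Protocol (CDP) with Playwright, so automation reuses their existing sessio... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 38 total downloads so far.

How do I install Browser Driver?

Run the command /install browser-driver in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Browser Driver free?

Yes. Browser Driver is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Browser Driver support?

Browser Driver is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Browser Driver?

It is developed and maintained by clarezoe (@clarezoe); the current version is v1.0.0.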
