Midscene Automations Skills for Browser with Bridge

Name: Midscene Automations Skills for Browser with Bridge
Author: Leyang (@quanru) · v1.0.3

cross-platform ⚠ suspicious

Total downloads: 910 · Current installs: 3 · Versions: 3

Install in OpenClaw

/install midscene-computer-chrome-bridge

Description

Vision-driven browser automation using Midscene Bridge mode. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with...

Security recommendations

This skill will drive your real Chrome and send screenshots and interactions to external model endpoints. Before installing or using it:

1) Ask the publisher for a source repository or homepage and a clear provenance for the Midscene Chrome extension and the @midscene/web package. Do not proceed if there is no trusted source.
2) Recognize that model API keys configured for this skill will allow third-party services to receive screenshots. Do NOT use it on accounts/sites containing sensitive information (banking, healthcare, private messages, MFA tokens) unless you fully trust the endpoint.
3) Verify the npm package @midscene/web@1 (check the code, release signatures, and publisher) or request a pinned tarball/sha to avoid unexpected remote code execution via npx.
4) Prefer running in an isolated environment: a disposable browser profile or VM, and limit the scope of pages the skill may access.
5) Require explicit user confirmation before interacting with any sensitive page and consider using a private/local model endpoint (set MIDSCENE_MODEL_BASE_URL to a trusted internal host) so screenshots are not sent to third parties.
6) Ask the author to update registry metadata to declare required env vars and runtime binaries (npx/node), and to publish source/homepage; until then treat this skill as untrusted.

If you cannot verify these items, do not install or use it on sensitive systems.
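As one way to act on the advice about pointing MIDSCENE_MODEL_BASE_URL at a trusted host, the sketch below checks a base URL against an allowlist before any automation starts. The function name `is_trusted_endpoint` and the hosts in `TRUSTED_HOSTS` are hypothetical placeholders, not part of the skill or of Midscene; only the environment variable name comes from the listing.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: replace with hosts you actually control.
TRUSTED_HOSTS = {"localhost", "127.0.0.1", "models.internal.example"}


def is_trusted_endpoint(base_url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only if base_url is http(s) and its host is allowlisted."""
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname  # already lowercased by urlparse; None if absent
    return host is not None and host in trusted_hosts
```

A caller might read `os.environ.get("MIDSCENE_MODEL_BASE_URL", "")` and refuse to run the skill when the check returns False. An allowlist is used here rather than a blocklist so that an unset or unexpected endpoint fails closed.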

Functional analysis

Type: OpenClaw Skill
Name: midscene-computer-chrome-bridge
Version: 1.0.3

The skill bundle provides a legitimate interface for Midscene.js, a vision-driven browser automation tool. It uses the official `@midscene/web` npm package to interact with a user's Chrome browser via a dedicated extension. The instructions in `SKILL.md` are well-structured, focusing on operational reliability (synchronous execution, result reporting) and standard AI model configuration (Gemini, Qwen, Doubao). No evidence of data exfiltration, malicious command execution, or deceptive prompt injection was found.

Capability assessment

⚠ Purpose & Capability

The SKILL.md describes vision-driven automation of the user's real Chrome via a Midscene Chrome extension and requires model credentials to run. However, the registry metadata lists no required environment variables or binaries, and there is no source/homepage. The SKILL.md implicitly depends on node/npx and a remote @midscene/web package (via npx), neither of which is declared. The required capabilities (access to browser state and external model endpoints) are plausible for the stated purpose, but the missing declarations and absent source/homepage are inconsistent with those requirements and concerning.

⚠ Instruction Scope

The instructions explicitly tell the agent to connect to the user's real Chrome, preserve cookies/sessions, take screenshots, read the saved image files, and send high-level prompts to the Midscene tool. That means screenshots (and therefore potentially passwords, 2FA, private messages, bank details, etc.) will be seen by downstream model endpoints. The SKILL.md also tells the agent not to verify extension status and to connect directly, which removes a safety/check step. The agent is given broad discretion to interact with any visible element and to scrape data, which is consistent with the stated purpose but increases privacy risk.

ℹ Install Mechanism

There is no declared install spec (instruction-only), which lowers static install risk. However the runtime commands use 'npx @midscene/web@1', meaning npx will download and execute code from the npm registry at runtime. That is an implicit install/execute step not represented in the registry metadata and carries risk if the package or its release source is untrusted.
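One way to reduce the runtime-download risk described above is to compare a locally obtained package tarball against a digest the publisher has pinned. The sketch below assumes you already have the tarball bytes and a publisher-supplied integrity string in the `sha512-<base64>` form that npm records in lockfiles; the function names are hypothetical, and no real digest for @midscene/web is given here.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Compute an SRI-style integrity string: 'sha512-' + base64(SHA-512 digest)."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_pinned(data: bytes, expected_integrity: str) -> bool:
    """Return True if the tarball bytes hash to the pinned integrity string."""
    return sri_sha512(data) == expected_integrity.strip()
```

In use, the bytes would come from reading the downloaded `.tgz` file in binary mode, and the run would be aborted if `matches_pinned` returns False. This only shows that the bytes match the pin; it says nothing about whether the pinned release itself is trustworthy.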

⚠ Credentials

SKILL.md requires multiple environment variables (MIDSCENE_MODEL_API_KEY, MIDSCENE_MODEL_NAME, MIDSCENE_MODEL_BASE_URL, MIDSCENE_MODEL_FAMILY, etc.) for external model providers (Google, OpenRouter, Aliyun, Doubao examples). None of these required env vars appear in the registry metadata. These credentials would allow external services to receive screenshots and page content, which is a high-sensitivity capability. Requesting model API keys is consistent with the skill's function, but their absence from the declared metadata is an inconsistency that increases risk.
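Because the registry metadata does not declare these variables, a small preflight check can make the requirement explicit before the skill runs. The sketch below uses only the four variable names quoted in the assessment; the source says "etc.", so the list may be incomplete, and `missing_env_vars` is a hypothetical helper rather than anything shipped with the skill.

```python
import os
from typing import List, Mapping

# Variable names as quoted in the assessment; the real list may be longer.
REQUIRED_VARS = (
    "MIDSCENE_MODEL_API_KEY",
    "MIDSCENE_MODEL_NAME",
    "MIDSCENE_MODEL_BASE_URL",
    "MIDSCENE_MODEL_FAMILY",
)


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise without touching real credentials.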

✓ Persistence & Privilege

The skill is not always-enabled and does not declare config paths or persistent system-wide changes. Autonomous model invocation is allowed (platform default) but not combined with 'always: true'. The skill does instruct storing a .env in the working directory (local only), which is normal for credentials but should be treated carefully.

How to use

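1) Make sure OpenClaw is installed (locally or via Docker).
2) Enter the install command in the dialog: /install midscene-computer-chrome-bridge
3) After installation, invoke the skill by name or trigger it with /midscene-computer-chrome-bridge
4) Provide the required inputs according to the skill's parameter description to get structured output.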

Version history

v1.0.3

**Adds strict result reporting requirements and updates model guidance.**
- Now mandates a clear summary of results after every automation task, including data found, actions completed, screenshots taken, and relevant findings.
- Updates environment variable examples and recommended models (adds Qwen 3.5, Doubao Seed 2.0 Lite, and related provider instructions).
- Clarifies workflow: do not disconnect from Bridge mode unless the user's entire task is fully complete; keep sessions available for continued use.
- Revises best practices and workflow pattern for improved clarity and efficiency.

v1.0.1

- Updated documentation to emphasize "vision-driven" automation: all actions operate from screenshots, no DOM/selector logic required.
- Expanded model setup instructions: now includes required environment variables and detailed configuration examples for Gemini, Qwen3-VL, Doubao, and others.
- Changed command usage examples to prefer high-level, natural language `act` commands for batching multi-step UI operations, improving efficiency.
- Clarified best practices: batch related actions into single `act` commands, summarize report files for the user, and never run commands in background.
- Updated troubleshooting guides for connection and screenshot issues; added direct Chrome Extension store link.
- Minor formatting and clarity improvements throughout for easier onboarding.

v1.0.0

Chrome Bridge Automation skill initial release:
- Enables AI-powered automation of the user's real Chrome browser via the Midscene Extension in Bridge mode.
- Supports browsing, navigation, interaction with authenticated pages, scraping, form filling, UI testing, and screenshot capture.
- Follows strict command formats and workflow patterns for reliable operation.
- Preserves browser context, sessions, and cookies for seamless web automation.
- Includes troubleshooting guidance and best practices for stable multi-step workflows.

Metadata

Slug midscene-computer-chrome-bridge

Version 1.0.3

License: none listed

Total installs 3

Current installs 3

Versions 3

FAQ

What is Midscene Automations Skills for Browser with Bridge?

Vision-driven browser automation using Midscene Bridge mode. Operates entirely from screenshots, with no DOM or accessibility labels required. Can interact with... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 910 total downloads to date.

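How do I install Midscene Automations Skills for Browser with Bridge?

Run the command "/install midscene-computer-chrome-bridge" in the OpenClaw or Claude Code dialog to install it in one step. Installation itself needs no extra configuration, but according to the capability assessment, running the skill requires model environment variables such as MIDSCENE_MODEL_API_KEY.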

Is Midscene Automations Skills for Browser with Bridge free?

Yes. Midscene Automations Skills for Browser with Bridge is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Midscene Automations Skills for Browser with Bridge support?

Midscene Automations Skills for Browser with Bridge is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Midscene Automations Skills for Browser with Bridge?

It is developed and maintained by Leyang (@quanru); the current version is v1.0.3.