quanru

Midscene Automations Skills for Computer

Author: Leyang · GitHub · v1.0.3
cross-platform ⚠ suspicious
Total downloads: 2646
Favorites: 3
Current installs: 10
Versions: 4
Install in OpenClaw
/install midscene-computer-automation
Description
Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screensh...
Security Recommendations
This skill will take screenshots and control your mouse/keyboard while it runs, and it requires remote-model API keys (MIDSCENE_MODEL_API_KEY and related vars). However, the registry metadata does not list those env vars, and the skill has no declared source or homepage; that mismatch is a red flag. Before installing or using it:
  1. Verify the package source (official midscenejs site or GitHub repo and publisher identity); request the skill's code or homepage and confirm the exact npx package and version it runs.
  2. Use dedicated, limited-scope API keys (not your primary cloud account keys), and consider creating test-only model accounts.
  3. Run the tool in a contained environment or VM and avoid exposing sensitive apps/documents while testing.
  4. Ask the publisher why registry metadata omits required env vars and request a full manifest (package.json, exact npx package SHA).
  5. Be prepared to rotate/revoke any model API keys after testing.
If you cannot verify the source or provenance of the npx package, treat the skill as risky and avoid providing high-privilege credentials or running it on machines with sensitive data.
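One way to act on recommendations 2 and 3 above is to launch the tool with a reduced environment, so that unrelated credentials in the parent shell are not inherited. The Python sketch below is only an illustration under stated assumptions: `scoped_env`, the `MIDSCENE_` prefix filter, and the pass-through list are hypothetical choices, not part of the skill or of Midscene.

```python
from typing import Dict, Mapping

# Non-secret variables a child process commonly needs.
# This list is an assumption for illustration; adjust it per platform.
PASSTHROUGH = ("PATH", "HOME", "USERPROFILE", "SYSTEMROOT", "TEMP", "TMP")


def scoped_env(parent: Mapping[str, str], prefix: str = "MIDSCENE_") -> Dict[str, str]:
    """Return a copy of `parent` limited to pass-through names and `prefix` names."""
    return {
        name: value
        for name, value in parent.items()
        if name in PASSTHROUGH or name.startswith(prefix)
    }
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when starting the `npx @midscene/computer` command, which keeps other secrets in the parent shell out of the child process. On Linux desktops, display-related variables would likely need to be added to the pass-through list.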
Functional Analysis
Type: OpenClaw Skill
Name: midscene-computer-automation
Version: 1.0.3
The skill bundle provides a legitimate integration for Midscene, a vision-driven desktop automation tool. It utilizes the official `@midscene/computer` npm package to perform screen-based actions (clicking, typing, etc.) via natural language prompts. The instructions in SKILL.md are focused on operational reliability, such as synchronous execution, health checks, and reporting results, and do not contain any evidence of malicious intent, data exfiltration, or unauthorized persistence.
Capability Assessment
Purpose & Capability
The SKILL.md describes a vision-driven desktop automation tool that uses remote LLM/vision models and a CLI (npx @midscene/computer). The environment variables it documents (MIDSCENE_MODEL_API_KEY, MIDSCENE_MODEL_NAME, MIDSCENE_MODEL_BASE_URL, MIDSCENE_MODEL_FAMILY, etc.) are coherent with that purpose. However, the registry metadata claims no required env vars or primary credential, which is inconsistent with the runtime instructions.
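Because the registry metadata does not declare these variables, a caller may want to confirm they are present before invoking the skill. The following Python snippet is a minimal sketch: the function name `missing_midscene_vars` is hypothetical, and treating all four named variables as required is an assumption (the listing's "etc." suggests there may be others).

```python
import os
from typing import List, Mapping, Optional

# Names taken from the listing above; treating all four as required is an assumption.
DOCUMENTED_VARS = (
    "MIDSCENE_MODEL_API_KEY",
    "MIDSCENE_MODEL_NAME",
    "MIDSCENE_MODEL_BASE_URL",
    "MIDSCENE_MODEL_FAMILY",
)


def missing_midscene_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the documented variable names that are unset or blank in `env`."""
    source = os.environ if env is None else env
    return [name for name in DOCUMENTED_VARS if not source.get(name, "").strip()]
```

Calling `missing_midscene_vars()` with no argument inspects the current process environment; an empty list means every documented variable has a non-blank value.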
Instruction Scope
Instructions are explicit: run synchronous npx commands, take screenshots, read the saved image files, use act to perform UI interactions, and summarize results. All actions described are within the scope of a desktop automation tool. Note: the workflow inherently captures screen contents and controls input devices — high-sensitivity operations but expected for this skill's purpose.
Install Mechanism
This is an instruction-only skill with no install spec and no code files present, so nothing is written to disk by the skill itself. The runtime depends on running an external package via npx (@midscene/computer), which will fetch code at runtime — that is normal but means runtime code provenance matters.
Credentials
The SKILL.md requires multiple environment variables including API keys and base URLs for third-party model providers. That is proportionate to using remote models, but it contradicts the registry's 'no required env vars' fields. Requesting API keys for cloud models is legitimate here, but these are sensitive credentials; the skill asks for them without declaring them in metadata, and the source/homepage is missing, which raises concern about where those credentials are used and who can access them.
Persistence & Privilege
No persistent installation, no always: true flag, and the skill is user-invocable only. The skill will run commands that give it active control of the desktop while invoked, but it does not request elevated persistent platform privileges in the metadata.
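How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install midscene-computer-automation
  3. Once installed, invoke the Skill by name or trigger it with /midscene-computer-automation
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.
Version History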
v1.0.3
- Enforces a new requirement to proactively report automation results after task completion, summarizing actions, findings, screenshots, and files.
- Adds an explicit "Report results" step to the workflow and documents it as best practice.
- Updates example environment variable setup with new supported models (Qwen 3.5, Doubao Seed 2.0 Lite).
- Minor naming update: skill is now "desktop-computer-automation".
- Refines and clarifies documentation, especially around closing response loops and model selection.
v1.0.2
- Updated description and best practices to clarify that automation is fully vision-driven (no DOM or accessibility labels required).
- Overhauled prerequisites: now requires users to explicitly set both API key and model information for visual AI models, with detailed examples for Gemini, Qwen3-VL, and Doubao.
- All commands now require the explicit package version: instructions use `npx @midscene/computer@1` for clarity and version consistency.
- Simplified command set: removed step-wise CLI actions (like single clicks/keyboard events); instead, guides users to use the higher-level `act` command for all desktop automation interactions.
- Expanded health check and workflow guidance for higher reliability, including explicit health check steps after connect and troubleshooting tips for multi-display setups.
- Emphasized batching related UI operations into a single `act` command for better reliability and faster execution.
- Explained recommended environment variable setup, PATH configuration for macOS, and the importance of visual confirmation before automating app windows.
v1.0.1
- Expanded support from only macOS to include Windows and Linux desktops.
- Updated documentation to reference cross-platform usage, commands, and workflows.
- Revised usage examples and troubleshooting to address platform-specific steps for all major operating systems.
- Broadened trigger and action descriptions for compatibility with different desktop environments.
v1.0.0
Initial release of Android Device Automation skill: control Android devices via Midscene and ADB using natural language.
- Enables AI-driven automation (tap, swipe, input, launch apps, screenshots, etc.) on Android devices.
- Provides strict workflow and rules for interacting with the Midscene CLI and Bash tool.
- Requires .env file with API key; includes troubleshooting steps for common ADB/device issues.
- Supports both single-step and multi-step (transient) UI interactions.
- Includes best practices for screenshot frequency and precise UI targeting.
Metadata
Slug: midscene-computer-automation
Version: 1.0.3
License: not listed
Total installs: 11
Current installs: 10
Versions: 4
FAQ

What is Midscene Automations Skills for Computer?

Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screensh... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 2646 total downloads to date.

How do I install Midscene Automations Skills for Computer?

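Run the command "/install midscene-computer-automation" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the skill's SKILL.md requires model environment variables (MIDSCENE_MODEL_API_KEY and related vars) at runtime.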

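Is Midscene Automations Skills for Computer free?

Yes. Midscene Automations Skills for Computer is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Midscene Automations Skills for Computer support?

Midscene Automations Skills for Computer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Midscene Automations Skills for Computer?

It is developed and maintained by Leyang (@quanru). The current version is v1.0.3.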
