Total downloads: 1740
Favorites: 0
Current installs: 8
Versions: 3
Install in OpenClaw
/install midscene-android-automation
Description
Vision-driven Android device automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...
Security recommendations
What to consider before installing/using this skill:
- Metadata mismatch: The registry claims no required binaries or environment variables, but the SKILL.md requires Node (npx), ADB, and multiple model API keys/BASE_URLs. Ask the publisher to correct the metadata before trusting the skill.
- Sensitive data exposure: The workflow captures screenshots of your Android device and (by design) sends them to a model endpoint or Midscene service configured by MIDSCENE_MODEL_BASE_URL. Those screenshots can contain passwords, 2FA codes, messages, or other sensitive data. Only use with providers and endpoints whose privacy/security policies you trust.
- Dynamic code execution: npx will fetch and run @midscene/android from npm at runtime. If you want to proceed, inspect the package source (or run in an isolated environment) to verify behavior.
- Secrets handling: The skill suggests storing API keys in a .env file which Midscene will load. Ensure your .env contains only the intended keys and is not shared. Prefer provider-scoped API keys with minimal privileges and short lifetimes when possible.
- Test safely: If you must use the skill, test on an emulator or a disposable device to avoid leaking personal data. Monitor network traffic and limit which model endpoints you configure.
- Ask for provenance: There is no homepage or source listed. Prefer skills with a verifiable publisher, source repository, and documentation. If you cannot verify origin, exercise caution.
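The "test safely" advice above can be partly automated. The following is a minimal sketch, not part of the skill, that parses the text printed by `adb devices` and reports whether every attached device is an emulator. It relies on the convention that Android emulator serials start with `emulator-`; the function names are hypothetical, and emulators attached over TCP/IP would be treated as real devices, which errs on the side of caution.

```python
def parse_adb_devices(output: str) -> dict:
    """Map each serial in `adb devices` output to its state (e.g. 'device', 'offline')."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the "List of devices attached" header, and adb daemon notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def only_emulators(output: str) -> bool:
    """True if at least one device is attached and every serial looks like an emulator."""
    devices = parse_adb_devices(output)
    return bool(devices) and all(serial.startswith("emulator-") for serial in devices)
```

One way to use it is to pass in `subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout` and stop before running the skill if `only_emulators` returns `False`.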
Functional analysis
Type: OpenClaw Skill
Name: midscene-android-automation
Version: 1.0.2
The skill facilitates Android automation by executing shell commands via `npx @midscene/android@1`, which downloads and runs remote code at runtime. It requires users to provide sensitive LLM API keys through environment variables and grants the agent broad UI control over connected devices. While these capabilities are necessary for the stated purpose of vision-driven automation, the reliance on `Bash` and remote package execution is a high-risk attack surface (SKILL.md).
Capability assessment
Purpose & Capability
The SKILL.md describes vision-driven Android automation via Midscene and ADB, which is internally coherent for the stated purpose. However, the registry metadata claims no required binaries or env vars, while the instructions clearly require Node (npx @midscene/android@1), ADB usage (adb shell ...), and model credentials. The omitted declarations in the metadata are a mismatch that reduces transparency and is unexpected for this capability.
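Because the registry metadata does not declare these prerequisites, a user may want to check them by hand before invoking the skill. The sketch below is one possible preflight check, assuming the binaries `npx` and `adb` and the `MIDSCENE_MODEL_*` variable names mentioned in this page's analysis. The SKILL.md list ends in "etc.", so the variable list may be incomplete, and `preflight` is a hypothetical helper, not something the skill provides.

```python
import os
import shutil

# Assumed from this page's analysis; the real SKILL.md may require more than these.
REQUIRED_BINARIES = ("npx", "adb")
REQUIRED_ENV_VARS = (
    "MIDSCENE_MODEL_API_KEY",
    "MIDSCENE_MODEL_NAME",
    "MIDSCENE_MODEL_BASE_URL",
    "MIDSCENE_MODEL_FAMILY",
)


def preflight(env=None, which=shutil.which):
    """Return (missing_binaries, missing_env_vars); never returns or prints secret values."""
    env = os.environ if env is None else env
    # shutil.which returns None when the executable is not found on PATH.
    missing_bins = [name for name in REQUIRED_BINARIES if which(name) is None]
    # Treat unset and empty variables the same way.
    missing_vars = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    return missing_bins, missing_vars
```

Calling `preflight()` with no arguments inspects the current process environment; the `env` and `which` parameters exist only so the check can be exercised without touching the real system.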
Instruction Scope
Instructions direct the agent to run npx commands, take screenshots, read saved image files, and supply model configuration (MIDSCENE_MODEL_*) including a BASE_URL. That implies screenshots and device UI content will be sent to remote model endpoints or Midscene services. Exfiltration of potentially sensitive screen contents to external providers is not called out in the registry metadata and is material to risk. The instructions also advise using ADB (powerful device control), which is consistent with purpose but increases the threat surface.
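Since screenshots go to whatever endpoint the BASE_URL names, one mitigation is to validate that URL against an allowlist before starting a run. This is a minimal sketch under stated assumptions: `endpoint_is_trusted` and the example host are hypothetical, the policy (HTTPS for remote hosts, plain HTTP allowed only for loopback) is one reasonable choice rather than anything the skill enforces, and the check says nothing about how the provider handles the data once received.

```python
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def endpoint_is_trusted(base_url: str, trusted_hosts: set) -> bool:
    """Accept loopback endpoints, or https endpoints whose host is exactly allowlisted."""
    parsed = urlparse(base_url)
    host = (parsed.hostname or "").lower()
    if host in LOOPBACK_HOSTS:
        # A local model server never sends screenshots off the machine.
        return parsed.scheme in ("http", "https")
    # Exact match only: "models.example.com.evil.test" must not pass.
    return parsed.scheme == "https" and host in trusted_hosts
```

For example, `endpoint_is_trusted(os.environ["MIDSCENE_MODEL_BASE_URL"], {"models.example.com"})` would reject a mistyped or substituted endpoint before any screenshot leaves the device.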
Install Mechanism
There is no install spec in the registry (instruction-only), which is lower friction. However the runtime uses npx to fetch @midscene/android at invocation time — this will download and run code from npm dynamically. The metadata did not list Node/npm as a required binary. Dynamically pulling code at runtime is normal for npx but worth noting because it executes third-party code on demand.
Credentials
The SKILL.md requires multiple environment variables (MIDSCENE_MODEL_API_KEY, MIDSCENE_MODEL_NAME, MIDSCENE_MODEL_BASE_URL, MIDSCENE_MODEL_FAMILY, etc.) and suggests provider-specific keys (Google, Alibaba, OpenRouter, Doubao). These are appropriate for remote-model driven automation, but the skill registry declared 'none' for required env vars/primary credential. In addition, placing keys in a .env file (as recommended) means the tool will read local secret files; that access is not declared in metadata and could expose unrelated secrets if present.
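To reduce the chance of unrelated secrets sitting in the same .env file, the file can be audited before use. The sketch below lists key names that do not start with an expected prefix. The `MIDSCENE_` prefix is an assumption based on the variable names quoted above; provider-specific keys may use other names, so the prefix would need adjusting. `unexpected_env_keys` is a hypothetical helper, and the parsing is deliberately simple rather than a full dotenv implementation.

```python
def unexpected_env_keys(dotenv_text: str, allowed_prefix: str = "MIDSCENE_") -> list:
    """List key names in .env-style text that do not start with allowed_prefix.

    Only names are returned, so the result is safe to log.
    """
    unexpected = []
    for raw in dotenv_text.splitlines():
        line = raw.strip()
        # Ignore blanks, comments, and anything that is not a KEY=VALUE pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key = line.split("=", 1)[0].strip()
        if key and not key.startswith(allowed_prefix):
            unexpected.append(key)
    return unexpected
```

Running it on the contents of the .env file and moving any reported keys elsewhere keeps the file limited to what the tool is expected to read.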
Persistence & Privilege
The skill is instruction-only, has no install spec, always:false, and does not request to modify other skills or system-wide settings. It does require ADB access at runtime but does not request forced persistent inclusion or elevated platform privileges.
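How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install midscene-android-automation
- After installation, invoke the Skill by name or trigger it with /midscene-android-automation
- Provide the required inputs according to the Skill's parameter description to get structured output.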
Version history
v1.0.2
- Added rule: always summarize and report automation results to the user after completing tasks; never end silently.
- Updated model configuration section with new examples for Qwen 3.5 and Doubao Seed 2.0 Lite, replacing older Qwen3-VL and Doubao Seed 1.6 references.
- Enhanced workflow pattern and best practices to mandate proactive user reporting, specifying what information must be included in the summary.
- Minor clarifications and improvements to existing documentation for prerequisites and expected output.
v1.0.1
**Major update: Moves to vision-first, prompt-driven UI automation and enhances setup instructions.**
- Introduced vision-based automation with no reliance on DOM or accessibility labels; all UI interaction is screenshot-driven.
- Expanded and clarified environment variable setup for multiple AI model providers (Doubao, Gemini, Qwen3-VL, Zhipu, etc.).
- Simplified the recommended workflow: focus actions into high-level `act` prompts instead of step-by-step commands.
- Added best practices for speed and reliability, such as pre-launching target apps via ADB and batching actions in single prompts.
- Updated troubleshooting and usage documentation for clarity, reflecting current CLI command usage and model setup conventions.
v1.0.0
Initial release of Android Device Automation skill using Midscene.
- Enables automation of Android devices via natural language commands using Midscene and adb.
- Supports taps, swipes, text input, app launches, screenshots, and more.
- Provides strict workflow and CLI usage guidelines to ensure reliable operation.
- Emphasizes best practices for screenshot-driven automation.
- Includes troubleshooting section for common device and connectivity issues.
Metadata
FAQ
What is Midscene Automations Skills for Android?
It is an AI Agent Skill plugin for Claude Code / OpenClaw that provides vision-driven Android device automation using Midscene. It has been downloaded 1740 times to date.
How do I install Midscene Automations Skills for Android?
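Run the command "/install midscene-android-automation" in the OpenClaw or Claude Code chat box to install it in one step. The install command itself needs no extra configuration, but the security analysis on this page notes that the skill requires Node (npx), ADB, and model API keys at runtime.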
Is Midscene Automations Skills for Android free?
Yes. Midscene Automations Skills for Android is completely free (free and open source) to download, install, and use.
Which platforms does Midscene Automations Skills for Android support?
Midscene Automations Skills for Android is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Midscene Automations Skills for Android?
It is developed and maintained by Leyang (@quanru). The current version is v1.0.2.