
adb-phone-control

Author: txmonkey · GitHub · v1.0.1 · MIT-0
cross-platform · ✓ Security check passed
Total downloads: 143 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install adb-phone-control
Description
Use when the user asks to control, operate, or automate an Android phone via ADB — tapping, swiping, typing, launching apps, or any UI interaction on a conne...
Usage (SKILL.md)

ADB Phone Control

Control Android devices through ADB with a structured observe-locate-act-verify loop.

Requirements

  • adb — Android Debug Bridge, must be in PATH
  • python3 — Required for app_explorer.py
  • ADB_OUTPUT_DIR (optional env var) — Directory for saving screenshots and UI dumps; defaults to current working directory

Permissions Used

This skill executes the following on the connected Android device:

  • adb shell input — tap, swipe, text input
  • adb shell uiautomator dump — UI hierarchy extraction
  • adb shell screencap — screen capture
  • adb shell am broadcast — ADBKeyboard IME input (for CJK text)
  • adb shell service call clipboard — clipboard-based text input fallback

Prerequisites

Before any operation, verify device connection:

adb devices

If no device found, instruct the user to:

  1. Connect via USB and enable USB Debugging
  2. Or connect wirelessly: adb connect <ip>:5555
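The connection check above can also be done programmatically. The following is a minimal sketch, assuming the usual `adb devices` text layout (a header line followed by tab-separated serial and state); the function names `parse_adb_devices` and `ready_devices` are hypothetical and not part of this skill.

```python
from typing import Dict, List


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each device serial to its state, given the text printed by `adb devices`."""
    devices: Dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and adb daemon start-up notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def ready_devices(output: str) -> List[str]:
    """Return serials in the `device` state, i.e. connected and authorized."""
    return [serial for serial, state in parse_adb_devices(output).items() if state == "device"]


sample = "List of devices attached\nemulator-5554\tdevice\n0123456789ABCDEF\tunauthorized\n"
print(ready_devices(sample))
```

In practice the input would come from something like `subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout`. An empty result means the user should follow one of the two connection steps above.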

Core Principle

NEVER guess coordinates from screenshots. ALWAYS use UI hierarchy as the primary locator.

Screenshots are for human-readable context and visual verification. UI dumps give exact pixel bounds.

Operation Loop

Every interaction follows this cycle:

┌─────────────────────────────────────────┐
│  1. OBSERVE  — dump UI + screenshot     │
│  2. LOCATE   — find element by text/id  │
│  3. ACT      — tap / swipe / type       │
│  4. VERIFY   — screenshot + dump again  │
│  5. REPEAT   — next action or done      │
└─────────────────────────────────────────┘

Do NOT skip the VERIFY step. UI transitions may take time; always confirm before proceeding.
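The cycle can be sketched as a small driver that refuses to continue when verification fails. This is an illustration only: `run_loop` and its callback parameters are hypothetical names, and the real helpers would be plugged in as `observe`, `act`, and `verify`.

```python
from typing import Callable, List


def run_loop(
    actions: List[str],
    observe: Callable[[], str],
    act: Callable[[str, str], None],
    verify: Callable[[str, str], bool],
) -> List[str]:
    """Run each action through observe -> act -> verify; stop at the first failed check."""
    completed: List[str] = []
    for action in actions:
        before = observe()          # 1. OBSERVE the current state
        act(action, before)         # 2-3. LOCATE the target in that state and ACT
        after = observe()           # 4. VERIFY by observing again
        if not verify(before, after):
            break                   # never chain further actions on an unverified state
        completed.append(action)    # 5. REPEAT with the next action
    return completed
```

Returning the list of completed actions makes it clear where a sequence stopped, so the caller can re-observe and decide how to recover.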

Helper Functions

Source the helper script before starting any operation session:

source "$(dirname "${BASH_SOURCE[0]:-$0}")/adb-helpers.sh" 2>/dev/null || source ./adb-helpers.sh

Available Functions

| Function | Usage | Description |
|---|---|---|
| adb_dump | adb_dump | Dump UI hierarchy to /tmp/ui_dump.xml |
| adb_screenshot | adb_screenshot | Capture screen to /tmp/adb_screen.png |
| adb_observe | adb_observe | Dump UI + screenshot in one call |
| adb_tap_text | adb_tap_text "Submit" | Find element by text, tap center |
| adb_tap_id | adb_tap_id "btn_send" | Find element by resource-id, tap center |
| adb_tap_xy | adb_tap_xy 540 1200 | Tap exact coordinates |
| adb_swipe | adb_swipe x1 y1 x2 y2 [ms] | Swipe between points (default 300ms) |
| adb_input_text | adb_input_text "hello" | Type text (supports spaces and CJK) |
| adb_key | adb_key <keycode> | Send keyevent (BACK, HOME, ENTER, etc.) |
| adb_hide_keyboard | adb_hide_keyboard | Press BACK to dismiss keyboard |
| adb_scroll_down | adb_scroll_down | Swipe up to scroll content down |
| adb_scroll_up | adb_scroll_up | Swipe down to scroll content up |
| adb_long_press | adb_long_press x y [ms] | Long press at coordinates (default 1000ms) |
| adb_wait | adb_wait [seconds] | Sleep before next action (default 1s) |
| adb_screen_size | adb_screen_size | Get device screen resolution |
| adb_launch_app | adb_launch_app <package> | Launch app by package name |
| adb_find_package | adb_find_package <keyword> | Search installed packages by keyword |
| adb_bounds_center | adb_bounds_center "bounds_string" | Parse "[x1,y1][x2,y2]" → center x y |

Element Lookup Details

adb_tap_text and adb_tap_id work by:

  1. Running adb_dump to get fresh UI hierarchy
  2. Parsing the XML for matching text= or resource-id= attributes
  3. Extracting the bounds="[x1,y1][x2,y2]" attribute
  4. Computing center point: ((x1+x2)/2, (y1+y2)/2)
  5. Executing adb shell input tap <cx> <cy>

If multiple matches are found, the function taps the first match and prints a warning. If no match is found, the function prints an error — fall back to adb_screenshot + Read tool for visual inspection.
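Steps 2 to 4 of the lookup can be sketched in Python as follows. This is not the helper script's actual implementation (that is a Bash script not shown here); `bounds_center` and `find_centers` are hypothetical names, and the sketch assumes exact attribute matching on `<node>` elements of the dump.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Bounds look like "[x1,y1][x2,y2]"; off-screen elements may have negative values.
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def bounds_center(bounds: str) -> Tuple[int, int]:
    """Parse "[x1,y1][x2,y2]" and return the integer center point."""
    match = BOUNDS_RE.fullmatch(bounds.strip())
    if match is None:
        raise ValueError(f"unrecognized bounds: {bounds!r}")
    x1, y1, x2, y2 = (int(g) for g in match.groups())
    return (x1 + x2) // 2, (y1 + y2) // 2


def find_centers(xml_text: str, attr: str, value: str) -> List[Tuple[int, int]]:
    """Return centers of all nodes whose `attr` equals `value`, in document order."""
    root = ET.fromstring(xml_text)
    return [
        bounds_center(node.get("bounds", ""))
        for node in root.iter("node")
        if node.get(attr) == value
    ]


SAMPLE_DUMP = """<hierarchy rotation="0">
  <node text="" resource-id="" bounds="[0,0][1080,2340]">
    <node text="Submit" resource-id="com.example:id/btn_send" bounds="[100,200][300,400]" />
  </node>
</hierarchy>"""

print(find_centers(SAMPLE_DUMP, "text", "Submit"))
```

Returning every match rather than only the first lets the caller reproduce the behavior described above: tap the first entry, warn when the list has more than one, and report an error when it is empty.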

Standard Operating Procedure

Phase 1: Setup

# Source helpers
source "$(dirname "${BASH_SOURCE[0]:-$0}")/adb-helpers.sh" 2>/dev/null || source ./adb-helpers.sh

# Verify connection
adb devices

# Get screen resolution (important for swipe calculations)
adb_screen_size

Phase 2: Navigate & Operate

For each interaction step:

# 1. Observe current state
adb_observe
# Then read /tmp/adb_screen.png with the Read tool to see the screen

# 2. Locate and act (prefer text/id over raw coordinates)
adb_tap_text "Create"
# or: adb_tap_id "iv_send"
# or as last resort: adb_tap_xy 540 2009

# 3. Wait for transition
adb_wait 2

# 4. Verify result
adb_screenshot
# Then read /tmp/adb_screen.png to confirm the action worked

Phase 3: Text Input

# Tap the input field first
adb_tap_text "Search..."
adb_wait 1

# Type text
adb_input_text "Hello World"

# Hide keyboard before tapping other elements
adb_hide_keyboard
adb_wait 1

# Now safe to tap other buttons
adb_tap_text "Send"

Critical Rules

1. UI Dump First, Screenshot Second

  • uiautomator dump gives exact bounds, element states (enabled/focused/clickable), text content, and resource IDs
  • Screenshots only for: visual verification, understanding layout context, or when UI dump fails (e.g., animations, WebView content)
  • When UI dump returns elements with NAF="true" (Not Accessibility Friendly: an enabled, clickable control with no text or content description), use screenshot + coordinates as fallback

2. Keyboard Awareness

  • Always hide keyboard before tapping non-input elements. The keyboard shifts the layout, making UI dump bounds stale.
  • After typing, call adb_hide_keyboard then adb_dump before tapping anything else.
  • If uiautomator dump returns ERROR: could not get idle state, the keyboard animation may still be running — wait 1s and retry.
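The wait-and-retry advice for the idle-state error can be sketched as a small wrapper. `dump_with_retry` is a hypothetical name, and `run_dump` stands for whatever callable returns the text output of `uiautomator dump`; the retry count of 3 is an assumption, while the 1s delay comes from the rule above.

```python
import time
from typing import Callable

IDLE_ERROR = "ERROR: could not get idle state"


def dump_with_retry(
    run_dump: Callable[[], str],
    retries: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `run_dump` until its output no longer contains the idle-state error."""
    for attempt in range(retries + 1):
        output = run_dump()
        if IDLE_ERROR not in output:
            return output
        if attempt < retries:
            sleep(delay)  # give the keyboard animation time to finish
    raise RuntimeError(f"UI never became idle after {retries} retries")
```

The `sleep` parameter is injectable so the function can be exercised without real delays.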

3. Wait Strategy

  • After tap: wait 1s before next dump/screenshot
  • After launching app: wait 2-3s
  • After page navigation: wait 2s
  • After typing: wait 0.5s
  • If UI hasn't changed after action: wait longer, up to 5s, then re-check
  • Never blindly chain actions without verification
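The "wait longer, up to 5s, then re-check" rule amounts to polling the UI dump until it differs from the pre-action dump. A minimal sketch follows, with the hypothetical name `wait_for_change` and a 1s polling interval chosen as an assumption.

```python
import time
from typing import Callable, Optional


def wait_for_change(
    dump: Callable[[], str],
    baseline: str,
    timeout: float = 5.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll `dump` until it differs from `baseline`; return the new dump, or None on timeout."""
    deadline = clock() + timeout
    while True:
        current = dump()
        if current != baseline:
            return current
        if clock() >= deadline:
            return None  # UI did not change: treat the action as failed
        sleep(interval)
```

A `None` result is the signal to switch to the failure-handling steps instead of sending the next action.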

4. Chinese / CJK Text Input

adb shell input text does not support CJK characters natively. The helper adb_input_text handles this by:

  • Using adb shell am broadcast with ADBKeyboard if available
  • Falling back to clipboard-based input: copy to clipboard via adb shell service call clipboard, then paste

If ADB Keyboard IME is installed (com.android.adbkeyboard), enable it:

adb shell ime set com.android.adbkeyboard/.AdbIME
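The choice between the two input paths can be sketched as below. This is an assumption about how such a helper might work, not the actual `adb_input_text` code: `build_input_command` is a hypothetical name, the `ADB_INPUT_TEXT` broadcast action is the one commonly documented for the ADBKeyboard IME, and the clipboard fallback is left to the caller.

```python
import shlex
from typing import List


def build_input_command(text: str, adbkeyboard_enabled: bool) -> List[str]:
    """Build an adb argv list for typing `text` on the device."""
    if text.isascii():
        # `input text` treats %s as a space; quote the rest for the device shell.
        encoded = shlex.quote(text.replace(" ", "%s"))
        return ["adb", "shell", "input", "text", encoded]
    if adbkeyboard_enabled:
        # Assumed ADBKeyboard broadcast; requires the IME to be the active keyboard.
        return ["adb", "shell", "am", "broadcast", "-a", "ADB_INPUT_TEXT",
                "--es", "msg", shlex.quote(text)]
    raise ValueError("non-ASCII text needs ADBKeyboard or a clipboard fallback")


print(build_input_command("Hello World", adbkeyboard_enabled=False))
```

Quoting matters because `adb shell` joins its arguments into a single command line that the device shell parses again.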

5. Coordinate System

  • All coordinates are in physical pixels matching the device resolution
  • adb shell wm size returns the canonical resolution (e.g., 1080x2340)
  • Screenshot pixel dimensions may differ from device resolution — never estimate coordinates from screenshot pixel positions
  • Always derive coordinates from uiautomator dump bounds
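For swipes, which have no element bounds to rely on, coordinates can be derived from the `wm size` output rather than from a screenshot. A minimal sketch, assuming the usual "Physical size: WxH" line (plus an optional "Override size" line); `parse_wm_size` and `vertical_swipe` are hypothetical names, and the 70% to 30% span in the example is an arbitrary choice.

```python
import re
from typing import Dict, Tuple

SIZE_RE = re.compile(r"^(Physical|Override) size:\s*(\d+)x(\d+)\s*$", re.MULTILINE)


def parse_wm_size(output: str) -> Dict[str, Tuple[int, int]]:
    """Parse `adb shell wm size` output into {"physical": (w, h), "override": (w, h)}."""
    return {kind.lower(): (int(w), int(h)) for kind, w, h in SIZE_RE.findall(output)}


def vertical_swipe(size: Tuple[int, int], start_pct: int, end_pct: int) -> Tuple[int, int, int, int]:
    """Return (x1, y1, x2, y2) for a centered vertical swipe between two height percentages."""
    width, height = size
    x = width // 2
    return x, height * start_pct // 100, x, height * end_pct // 100


size = parse_wm_size("Physical size: 1080x2340\n")["physical"]
print(vertical_swipe(size, 70, 30))  # swipe up, which scrolls content down
```

Integer percentages keep the arithmetic exact, so the resulting coordinates are stable across runs.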

6. Handling Failures

If an action doesn't produce the expected result:

  1. Re-dump UI hierarchy — the element may have moved or state changed
  2. Take a screenshot — visual context may reveal popups, loading states, or errors
  3. Check if the element is enabled="true" and clickable="true" before tapping
  4. If element is not found by text, try partial match or search by resource-id
  5. If the app is in a WebView, UI dump may not capture web elements — use screenshot + coordinate estimation as fallback
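Checks 3 and 4 can be combined into one filter over the dump: match by partial text or resource-id, and keep only nodes that are enabled and clickable. The sketch below uses the hypothetical name `tappable_matches` and assumes case-insensitive substring matching.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def tappable_matches(xml_text: str, query: str) -> List[Dict[str, str]]:
    """Return attributes of enabled, clickable nodes whose text or resource-id contains `query`."""
    needle = query.lower()
    matches: List[Dict[str, str]] = []
    for node in ET.fromstring(xml_text).iter("node"):
        haystacks = (node.get("text", ""), node.get("resource-id", ""))
        if not any(needle in h.lower() for h in haystacks):
            continue
        if node.get("enabled") == "true" and node.get("clickable") == "true":
            matches.append(dict(node.attrib))
    return matches


SAMPLE = """<hierarchy>
  <node text="Send message" resource-id="" enabled="true" clickable="true" bounds="[0,0][100,50]" />
  <node text="Send later" resource-id="" enabled="false" clickable="true" bounds="[0,60][100,110]" />
  <node text="" resource-id="com.example:id/btn_send" enabled="true" clickable="true" bounds="[0,120][100,170]" />
</hierarchy>"""

print([m["bounds"] for m in tappable_matches(SAMPLE, "send")])
```

A node that matches the query but is filtered out here (such as the disabled "Send later" entry) usually means the UI is waiting on some precondition, for example an empty input field.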

7. App Launch

Prefer adb_find_package + adb_launch_app over monkey command:

# Find the app
adb_find_package "wechat"
# Launch it
adb_launch_app "com.tencent.mm"
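How `adb_find_package` searches is not shown in this document. One plausible approach, sketched here under that assumption, is to filter the `package:<name>` lines printed by `adb shell pm list packages`; `filter_packages` is a hypothetical name.

```python
from typing import List

PREFIX = "package:"


def filter_packages(pm_output: str, keyword: str) -> List[str]:
    """Return sorted package names from `pm list packages` output that contain `keyword`."""
    names = [
        line.strip()[len(PREFIX):]
        for line in pm_output.splitlines()
        if line.strip().startswith(PREFIX)
    ]
    needle = keyword.lower()
    return sorted(name for name in names if needle in name.lower())


sample_output = "package:com.tencent.mm\npackage:com.android.settings\npackage:com.tencent.mobileqq\n"
print(filter_packages(sample_output, "tencent"))
```

A plain substring filter only matches text that appears in the package name itself, so the keyword may need to be the vendor or package fragment rather than the app's display name.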

Limitations

  • uiautomator dump doesn't work during animations — wait for idle state
  • WebView/Flutter/game content may not appear in UI hierarchy — use screenshot-based approach
  • Some custom views may have empty text and no resource-id — use bounds + screenshot cross-reference
  • Cap each task at roughly 100 actions to avoid infinite loops
Security Recommendations
This skill appears to do what it claims (ADB-based UI automation). However:

  1. Automated exploration will click/tap elements and can perform destructive or privacy-sensitive actions on the device (send messages, change settings, make purchases). Only run it on a test device or after explicitly reviewing which package and actions will be targeted; prefer manual observe-first runs.
  2. Review the included scripts (adb-helpers.sh and app_explorer.py) before use to confirm no unexpected commands for your environment.
  3. Keep ADB enabled only when needed and disconnect when done.
  4. If you intend to let an agent run this autonomously, restrict it to devices you control and consider disabling autonomous invocation until you’ve tested behavior interactively.
Functional Analysis
Type: OpenClaw Skill
Name: adb-phone-control
Version: 1.0.1

The adb-phone-control skill bundle provides a legitimate set of tools for automating and exploring Android devices via the Android Debug Bridge (ADB). It includes a bash helper script (adb-helpers.sh) for UI interactions like tapping elements by text/ID and a Python script (app_explorer.py) for recursively crawling an app's UI to generate a navigation tree. The code uses standard ADB commands for screen capture, UI dumping, and input events, with no evidence of data exfiltration, malicious persistence, or prompt injection.
Capability Assessment
Purpose & Capability
Name/description (ADB control) matches the code and SKILL.md: both require adb and python3 and implement UI dump, screenshot, tap/swipe/input, and an app explorer. No unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md and helper scripts instruct the agent to dump UI, pull screenshots/dumps to local path, and run automated taps/swipes/inputs. This is expected for device automation, but the app_explorer recursively clicks elements which can trigger side effects (sending messages, purchases, or destructive actions). The instructions also tell the agent to 'read' screenshots for verification — that will expose device screen content to whatever model/tool is used to view images.
Install Mechanism
Instruction-only install (no download/extract). Two local code files are included and executed by sourcing/running them. No external installers or remote downloads are used by the skill itself.
Credentials
No secret environment variables or credentials are requested; ADB_OUTPUT_DIR is optional and reasonable. The skill does not ask for cloud keys or unrelated secrets.
Persistence & Privilege
always is false and the skill does not request persistent system-wide privileges. It operates only by invoking adb commands against a connected device and writing output into a local output directory.
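How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install adb-phone-control
  3. Once installed, invoke the Skill by name or trigger it with /adb-phone-control
  4. Provide the inputs described in the Skill's parameter documentation to get structured output
Version History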
v1.0.1
Changelog for adb-phone-control v1.0.1

  • Added explicit requirements section: now documents dependencies on `adb`, `python3`, and optional `ADB_OUTPUT_DIR` environment variable.
  • Detailed permission usage for device operations via ADB, covering input events, UI dump, screencap, IME broadcast, and clipboard.
  • Minor documentation edits: new "Requirements" and "Permissions Used" sections, clarifications for user setup.
  • No code or functional changes; this is a documentation enhancement for improved clarity and onboarding.
v1.0.0
Initial release of ADB Phone Control.

  • Provides a structured approach for automating Android devices via ADB using an observe-locate-act-verify loop.
  • Emphasizes using UI hierarchy (via `uiautomator dump`) for locating elements; screenshots are only for verification or fallback.
  • Includes a comprehensive Bash helper toolset for tapping by text/id, swiping, typing (including CJK characters), scrolling, launching apps, and more.
  • Documents critical rules for reliable automation: waiting strategies, keyboard management, safe use of coordinates, and robust error handling.
  • Details standard operating procedures for setup, navigation, text input, and failure recovery.
Metadata
Slug: adb-phone-control
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
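FAQ

What is adb-phone-control?

Use when the user asks to control, operate, or automate an Android phone via ADB — tapping, swiping, typing, launching apps, or any UI interaction on a conne... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 143 total downloads to date.

How do I install adb-phone-control?

Run the command "/install adb-phone-control" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is adb-phone-control free?

Yes. adb-phone-control is completely free and released under the MIT-0 license; you may download, install, and use it freely.

Which platforms does adb-phone-control support?

adb-phone-control is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed adb-phone-control?

It is developed and maintained by txmonkey (@txmonkey); the current version is v1.0.1.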
