
Desktop Control Custom

Author: alexbingquanxu-cpu · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 124
Favorites: 0
Current installs: 2
Versions: 1
Install in OpenClaw
/install desktop-control-custom
Description
Advanced desktop automation with mouse, keyboard, and screen control
Usage instructions (SKILL.md)

# Desktop Control Skill

The most advanced desktop automation skill for OpenClaw. Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.

## 🎯 Features

### Mouse Control

- **Absolute positioning** - Move to exact coordinates
- **Relative movement** - Move from current position
- **Smooth movement** - Natural, human-like mouse paths
- **Click types** - Left, right, middle, double, triple clicks
- **Drag & drop** - Drag from point A to point B
- **Scroll** - Vertical and horizontal scrolling
- **Position tracking** - Get current mouse coordinates

### Keyboard Control

- **Text typing** - Fast, accurate text input
- **Hotkeys** - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- **Special keys** - Enter, Tab, Escape, Arrow keys, F-keys
- **Key combinations** - Multi-key press combinations
- **Hold & release** - Manual key state control
- **Typing speed** - Configurable WPM (instant to human-like)

### Screen Operations

- **Screenshot** - Capture entire screen or regions
- **Image recognition** - Find elements on screen (via OpenCV)
- **Color detection** - Get pixel colors at coordinates
- **Multi-monitor** - Support for multiple displays

### Window Management

- **Window list** - Get all open windows
- **Activate window** - Bring window to front
- **Window info** - Get position, size, title
- **Minimize/Maximize** - Control window states

### Safety Features

- **Failsafe** - Move mouse to corner to abort
- **Pause control** - Emergency stop mechanism
- **Approval mode** - Require confirmation for actions
- **Bounds checking** - Prevent out-of-screen operations
- **Logging** - Track all automation actions

---

## 🚀 Quick Start

### Installation

First, install required dependencies:

```bash
pip install pyautogui pillow opencv-python pygetwindow
```

### Basic Usage

```python
from skills.desktop_control import DesktopController

# Initialize controller
dc = DesktopController(failsafe=True)

# Mouse operations
dc.move_mouse(500, 300)  # Move to coordinates
dc.click()  # Left click at current position
dc.click(100, 200, button="right")  # Right click at position

# Keyboard operations
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c")  # Copy
dc.press("enter")

# Screen operations
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```

---

## 📋 Complete API Reference

### Mouse Functions

#### `move_mouse(x, y, duration=0, smooth=True)`
Move mouse to absolute screen coordinates.

**Parameters:**
- `x` (int): X coordinate (pixels from left)
- `y` (int): Y coordinate (pixels from top)
- `duration` (float): Movement time in seconds (0 = instant, 0.5 = smooth)
- `smooth` (bool): Use bezier curve for natural movement

**Example:**
```python
# Instant movement
dc.move_mouse(1000, 500)

# Smooth 1-second movement
dc.move_mouse(1000, 500, duration=1.0)
```
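
The `smooth` parameter is described as following a bezier curve, but the skill's own path code is not shown here. As a rough illustration only, the sketch below samples a quadratic Bezier curve between two points. The function name `quadratic_bezier_path` and the choice of a single control point are assumptions made for this example, not part of the skill's API.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier_path(
    start: Point, control: Point, end: Point, steps: int = 20
) -> List[Tuple[int, int]]:
    """Return steps + 1 integer points along a quadratic Bezier curve.

    Hypothetical helper: the real skill may build its paths differently.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    points = []
    for i in range(steps + 1):
        t = i / steps
        # B(t) = (1 - t)^2 * P0 + 2 * (1 - t) * t * P1 + t^2 * P2
        a = (1 - t) ** 2
        b = 2 * (1 - t) * t
        c = t ** 2
        x = a * start[0] + b * control[0] + c * end[0]
        y = a * start[1] + b * control[1] + c * end[1]
        points.append((round(x), round(y)))
    return points
```

Moving the cursor through these points in order, with a short delay between them, would give a curved path instead of a straight line. Placing the control point off the straight line between start and end determines how pronounced the curve is.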

#### `move_relative(x_offset, y_offset, duration=0)`
Move mouse relative to current position.

**Parameters:**
- `x_offset` (int): Pixels to move horizontally (positive = right)
- `y_offset` (int): Pixels to move vertically (positive = down)
- `duration` (float): Movement time in seconds

**Example:**
```python
# Move 100px right, 50px down
dc.move_relative(100, 50, duration=0.3)
```

#### `click(x=None, y=None, button='left', clicks=1, interval=0.1)`
Perform mouse click.

**Parameters:**
- `x, y` (int, optional): Coordinates to click (None = current position)
- `button` (str): 'left', 'right', 'middle'
- `clicks` (int): Number of clicks (1 = single, 2 = double)
- `interval` (float): Delay between multiple clicks

**Example:**
```python
# Simple left click
dc.click()

# Double-click at specific position
dc.click(500, 300, clicks=2)

# Right-click
dc.click(button='right')
```

#### `drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')`
Drag and drop operation.

**Parameters:**
- `start_x, start_y` (int): Starting coordinates
- `end_x, end_y` (int): Ending coordinates
- `duration` (float): Drag duration
- `button` (str): Mouse button to use

**Example:**
```python
# Drag file from desktop to folder
dc.drag(100, 100, 500, 500, duration=1.0)
```

#### `scroll(clicks, direction='vertical', x=None, y=None)`
Scroll mouse wheel.

**Parameters:**
- `clicks` (int): Scroll amount (positive = up/left, negative = down/right)
- `direction` (str): 'vertical' or 'horizontal'
- `x, y` (int, optional): Position to scroll at

**Example:**
```python
# Scroll down 5 clicks
dc.scroll(-5)

# Scroll up 10 clicks
dc.scroll(10)

# Horizontal scroll
dc.scroll(5, direction='horizontal')
```

#### `get_mouse_position()`
Get current mouse coordinates.

**Returns:** `(x, y)` tuple

**Example:**
```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```

---

### Keyboard Functions

#### `type_text(text, interval=0, wpm=None)`
Type text with configurable speed.

**Parameters:**
- `text` (str): Text to type
- `interval` (float): Delay between keystrokes (0 = instant)
- `wpm` (int, optional): Words per minute (overrides interval)

**Example:**
```python
# Instant typing
dc.type_text("Hello World")

# Human-like typing at 60 WPM
dc.type_text("Hello World", wpm=60)

# Slow typing with 0.1s between keys
dc.type_text("Hello World", interval=0.1)
```
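
How `wpm` is turned into a per-keystroke delay is not spelled out above. One plausible reading, sketched below, uses the common typing-test convention of five characters per word. The helper names `wpm_to_interval` and `resolve_interval` and the five-character assumption are illustrative, not taken from the skill's code.

```python
from typing import Optional


def wpm_to_interval(wpm: float, chars_per_word: int = 5) -> float:
    """Convert words per minute to seconds per keystroke.

    Assumes the conventional 5 characters per word (an assumption here).
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60.0 / (wpm * chars_per_word)


def resolve_interval(interval: float = 0.0, wpm: Optional[float] = None) -> float:
    """Mirror the documented rule that wpm, when given, overrides interval."""
    if wpm is not None:
        return wpm_to_interval(wpm)
    return interval
```

Under this assumption, 60 WPM works out to 300 keystrokes per minute, or 0.2 seconds between keys.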

#### `press(key, presses=1, interval=0.1)`
Press and release a key.

**Parameters:**
- `key` (str): Key name (see Key Names section)
- `presses` (int): Number of times to press
- `interval` (float): Delay between presses

**Example:**
```python
# Press Enter
dc.press('enter')

# Press Space 3 times
dc.press('space', presses=3)

# Press Down arrow
dc.press('down')
```

#### `hotkey(*keys, interval=0.05)`
Execute keyboard shortcut.

**Parameters:**
- `*keys` (str): Keys to press together
- `interval` (float): Delay between key presses

**Example:**
```python
# Copy (Ctrl+C)
dc.hotkey('ctrl', 'c')

# Paste (Ctrl+V)
dc.hotkey('ctrl', 'v')

# Open Run dialog (Win+R)
dc.hotkey('win', 'r')

# Save (Ctrl+S)
dc.hotkey('ctrl', 's')

# Select All (Ctrl+A)
dc.hotkey('ctrl', 'a')
```

#### `key_down(key)` / `key_up(key)`
Manually control key state.

**Example:**
```python
# Hold Shift
dc.key_down('shift')
dc.type_text("hello")  # Types "HELLO"
dc.key_up('shift')

# Hold Ctrl and click (for multi-select)
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```

---

### Screen Functions

#### `screenshot(region=None, filename=None)`
Capture screen or region.

**Parameters:**
- `region` (tuple, optional): (left, top, width, height) for partial capture
- `filename` (str, optional): Path to save image

**Returns:** PIL Image object

**Example:**
```python
# Full screen
img = dc.screenshot()

# Save to file
dc.screenshot(filename="screenshot.png")

# Capture specific region
img = dc.screenshot(region=(100, 100, 500, 300))
```
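
The `region` tuple uses a width/height layout, while Pillow's `Image.crop` expects a `(left, upper, right, lower)` box, so the two are easy to mix up when post-processing a capture. The small converter below is a sketch for that case. `region_to_box` is a hypothetical helper name, not a function provided by the skill.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # (left, top, width, height)
Box = Tuple[int, int, int, int]     # (left, upper, right, lower)


def region_to_box(region: Region) -> Box:
    """Convert a (left, top, width, height) region to a crop-style box."""
    left, top, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (left, top, left + width, top + height)
```

For the region in the example above, `(100, 100, 500, 300)`, the equivalent box is `(100, 100, 600, 400)`.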

#### `get_pixel_color(x, y)`
Get color of pixel at coordinates.

**Returns:** RGB tuple `(r, g, b)`

**Example:**
```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```

#### `find_on_screen(image_path, confidence=0.8)`
Find image on screen (requires OpenCV).

**Parameters:**
- `image_path` (str): Path to template image
- `confidence` (float): Match threshold (0-1)

**Returns:** `(x, y, width, height)` or None

**Example:**
```python
# Find button on screen
location = dc.find_on_screen("button.png")
if location:
    x, y, w, h = location
    # Click center of found image
    dc.click(x + w//2, y + h//2)
```

#### `get_screen_size()`
Get screen resolution.

**Returns:** `(width, height)` tuple

**Example:**
```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```

---

### Window Functions

#### `get_all_windows()`
List all open windows.

**Returns:** List of window titles

**Example:**
```python
windows = dc.get_all_windows()
for title in windows:
    print(f"Window: {title}")
```

#### `activate_window(title_substring)`
Bring window to front by title.

**Parameters:**
- `title_substring` (str): Part of window title to match

**Example:**
```python
# Activate Chrome
dc.activate_window("Chrome")

# Activate VS Code
dc.activate_window("Visual Studio Code")
```
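
Matching by substring means a short fragment such as "Chrome" may match several windows. The documentation does not say which one wins or whether matching is case-sensitive. The sketch below shows one possible policy (case-insensitive, first match in list order) applied to the titles returned by `get_all_windows()`. Both the policy and the helper name `find_window_title` are assumptions for illustration.

```python
from typing import Iterable, Optional


def find_window_title(titles: Iterable[str], title_substring: str) -> Optional[str]:
    """Return the first title containing the substring, ignoring case.

    Hypothetical policy: the skill's real matching rules may differ.
    """
    needle = title_substring.lower()
    for title in titles:
        if needle in title.lower():
            return title
    return None
```

Checking the match first, and passing the full title on, makes it clearer which window will be activated when several titles share a fragment.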

#### `get_active_window()`
Get currently focused window.

**Returns:** Window title (str)

**Example:**
```python
active = dc.get_active_window()
print(f"Active window: {active}")
```

---

### Clipboard Functions

#### `copy_to_clipboard(text)`
Copy text to clipboard.

**Example:**
```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```

#### `get_from_clipboard()`
Get text from clipboard.

**Returns:** str

**Example:**
```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```

---

## ⌨️ Key Names Reference

### Alphabet Keys
`'a'` through `'z'`

### Number Keys
`'0'` through `'9'`

### Function Keys
`'f1'` through `'f24'`

### Special Keys
- `'enter'` / `'return'`
- `'esc'` / `'escape'`
- `'space'` / `'spacebar'`
- `'tab'`
- `'backspace'`
- `'delete'` / `'del'`
- `'insert'`
- `'home'`
- `'end'`
- `'pageup'` / `'pgup'`
- `'pagedown'` / `'pgdn'`

### Arrow Keys
- `'up'` / `'down'` / `'left'` / `'right'`

### Modifier Keys
- `'ctrl'` / `'control'`
- `'shift'`
- `'alt'`
- `'win'` / `'winleft'` / `'winright'`
- `'cmd'` / `'command'` (Mac)

### Lock Keys
- `'capslock'`
- `'numlock'`
- `'scrolllock'`

### Punctuation
- `'.'` / `','` / `'?'` / `'!'` / `';'` / `':'`
- `'['` / `']'` / `'{'` / `'}'`
- `'('` / `')'`
- `'+'` / `'-'` / `'*'` / `'/'` / `'='`

---

## 🛡️ Safety Features

### Failsafe Mode

Move mouse to **any corner** of the screen to abort all automation.

```python
# Enable failsafe (enabled by default)
dc = DesktopController(failsafe=True)
```

### Pause Control

```python
# Pause all automation for 2 seconds
dc.pause(2.0)

# Check if automation is safe to proceed
if dc.is_safe():
    dc.click(500, 500)
```

### Approval Mode

Require user confirmation before actions:

```python
dc = DesktopController(require_approval=True)

# This will ask for confirmation
dc.click(500, 500)  # Prompt: "Allow click at (500, 500)? [y/n]"
```
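
To make the approval flow concrete, here is a minimal sketch of how a confirmation gate of this kind could work. It is not the skill's implementation: `approved` and `guarded_click` are hypothetical names, and the prompt function is passed in as a parameter so the gate can be exercised without a real console.

```python
from typing import Callable


def approved(description: str, ask: Callable[[str], str] = input) -> bool:
    """Ask the user to confirm an action; only 'y' or 'yes' counts as approval."""
    answer = ask(f"Allow {description}? [y/n] ")
    return answer.strip().lower() in ("y", "yes")


def guarded_click(
    x: int,
    y: int,
    click: Callable[[int, int], None],
    ask: Callable[[str], str] = input,
) -> bool:
    """Run click(x, y) only after approval; return whether it ran."""
    if not approved(f"click at ({x}, {y})", ask):
        return False
    click(x, y)
    return True
```

Treating anything other than an explicit yes as a refusal is the conservative choice here: an empty answer or a typo results in no action.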

---

## 🎨 Advanced Examples

### Example 1: Automated Form Filling

```python
dc = DesktopController()

# Click name field
dc.click(300, 200)
dc.type_text("John Doe", wpm=80)

# Tab to next field
dc.press('tab')
dc.type_text("[email protected]", wpm=80)

# Tab to password
dc.press('tab')
dc.type_text("SecurePassword123", wpm=60)

# Submit form
dc.press('enter')
```

### Example 2: Screenshot Region and Save

```python
# Capture specific area
region = (100, 100, 800, 600)  # left, top, width, height
img = dc.screenshot(region=region)

# Save with timestamp
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
img.save(f"capture_{timestamp}.png")
```

### Example 3: Multi-File Selection

```python
# Hold Ctrl and click multiple files
dc.key_down('ctrl')
dc.click(100, 200)  # First file
dc.click(100, 250)  # Second file
dc.click(100, 300)  # Third file
dc.key_up('ctrl')

# Copy selected files
dc.hotkey('ctrl', 'c')
```

### Example 4: Window Automation

```python
import time

# Activate Calculator
dc.activate_window("Calculator")
time.sleep(0.5)  # Give the window time to come to the foreground

# Type calculation
dc.type_text("5+3=", interval=0.2)
time.sleep(0.5)  # Wait for the result to render

# Take screenshot of result
dc.screenshot(filename="calculation_result.png")
```

### Example 5: Drag & Drop File

```python
# Drag file from source to destination
dc.drag(
    start_x=200, start_y=300,  # File location
    end_x=800, end_y=500,       # Folder location
    duration=1.0                 # Smooth 1-second drag
)
```

---

## ⚡ Performance Tips

1. **Use instant movements** for speed: `duration=0`
2. **Batch operations** instead of individual calls
3. **Cache screen positions** instead of recalculating
4. **Disable failsafe** for maximum performance (use with caution)
5. **Use hotkeys** instead of menu navigation
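
As an illustration of tip 3, the sketch below wraps an image lookup in a small cache so that repeated lookups of the same template skip the expensive screen search. `PositionCache` is a hypothetical helper, not part of the skill. The `locate` callable is assumed to behave like `find_on_screen`, returning `(x, y, width, height)` or `None`.

```python
from typing import Callable, Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


class PositionCache:
    """Remember where templates were found so later lookups are free."""

    def __init__(self, locate: Callable[[str], Optional[Box]]):
        self._locate = locate
        self._cache: Dict[str, Box] = {}

    def get(self, image_path: str) -> Optional[Box]:
        if image_path in self._cache:
            return self._cache[image_path]
        box = self._locate(image_path)
        if box is not None:
            # Only successful lookups are cached, so misses are retried.
            self._cache[image_path] = box
        return box

    def invalidate(self, image_path: Optional[str] = None) -> None:
        """Forget one entry, or everything when no path is given."""
        if image_path is None:
            self._cache.clear()
        else:
            self._cache.pop(image_path, None)
```

A cache like this would typically be created with `PositionCache(dc.find_on_screen)`. Cached positions go stale when a window moves or the layout changes, so call `invalidate()` after any step that rearranges the screen.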

---

## ⚠️ Important Notes

- **Screen coordinates** start at (0, 0) in top-left corner
- **Multi-monitor setups** may have negative coordinates for secondary displays
- **Windows DPI scaling** may affect coordinate accuracy
- **Failsafe corners** are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)
- **Some applications** may block simulated input (games, secure apps)
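
The corner and coordinate rules above can be expressed as a few small checks. This is a sketch for a single display with its origin at (0, 0). The function names are made up for the example, and secondary monitors with negative coordinates would need the bounds test adjusted.

```python
from typing import List, Tuple


def failsafe_corners(width: int, height: int) -> List[Tuple[int, int]]:
    """Return the four failsafe corner pixels for a width x height screen."""
    return [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]


def is_failsafe_corner(x: int, y: int, width: int, height: int) -> bool:
    """True when (x, y) is exactly one of the four corner pixels."""
    return (x, y) in failsafe_corners(width, height)


def in_bounds(x: int, y: int, width: int, height: int) -> bool:
    """True when (x, y) lies on the primary screen (single-display assumption)."""
    return 0 <= x < width and 0 <= y < height
```

On a 1920x1080 screen the bottom-right corner is (1919, 1079), not (1920, 1080), because coordinates start at zero.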

---

## 🔧 Troubleshooting

### Mouse not moving to correct position
- Check DPI scaling settings
- Verify screen resolution matches expectations
- Use `get_screen_size()` to confirm dimensions
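
When DPI scaling is the cause, coordinates measured in one unit (for example from a screenshot at physical resolution) are being used in another (logical, scaled pixels). The conversion is a simple multiplication, sketched below under the assumption of one uniform scale factor such as 1.5 for a 150% display setting. The helper names are hypothetical, and whether a conversion is needed at all depends on how the Python process is configured for DPI awareness.

```python
from typing import Tuple


def logical_to_physical(x: int, y: int, scale: float) -> Tuple[int, int]:
    """Scale logical coordinates up to physical pixels."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return (round(x * scale), round(y * scale))


def physical_to_logical(x: int, y: int, scale: float) -> Tuple[int, int]:
    """Scale physical pixel coordinates down to logical coordinates."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return (round(x / scale), round(y / scale))
```

A quick check is to compare the size reported by `get_screen_size()` with the resolution shown in the operating system's display settings. If the two differ, their ratio is the scale factor to use.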

### Keyboard input not working
- Ensure target application has focus
- Some apps require admin privileges
- Try increasing `interval` for reliability

### Failsafe triggering accidentally
- Increase screen border tolerance
- Move mouse away from corners during normal use
- Disable if needed: `DesktopController(failsafe=False)`

### Permission errors
- Run Python with administrator privileges for some operations
- Some secure applications block automation

---

## 📦 Dependencies

- **PyAutoGUI** - Core automation engine
- **Pillow** - Image processing
- **OpenCV** (optional) - Image recognition
- **PyGetWindow** - Window management

Install all:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```

---

**Built for OpenClaw** - The ultimate desktop automation companion 🦞
Safe usage recommendations
Plain-language checklist before installing:
- Function matches description: This skill will control your mouse/keyboard, take screenshots, read/write the clipboard, and can run planned/automated tasks. That's expected for a desktop-automation skill.
- Metadata and packaging: The registry metadata (owner/slug) and the included _meta.json/module names differ. Confirm which package/module path you should import (e.g., 'desktop_control' vs 'skills.desktop_control') and that the publisher is who you expect.
- Privacy & safety: The AI agent can capture screenshots and clipboard contents. Don't run it while sensitive documents, password managers, banking sessions, or other private material are visible.
- Failsafe & approval: Use failsafe=True and consider require_approval=True while testing so you can abort (move mouse to corner). Test demos in a disposable VM or a non-critical account first.
- Installation: SKILL.md recommends pip installing third-party packages. Review those packages and install them in a virtualenv. The skill does not pull code from remote URLs at install time.
- Autonomous use: If you plan to let the agent run autonomously, restrict what tasks it can run and monitor its action history/snapshots. Don't give it long-running, unsupervised permissions on a machine with sensitive data.
- If unsure: Run the demo scripts first to verify behavior, or inspect the remaining truncated code paths for any network calls before granting broad usage.
Functional analysis
Type: OpenClaw Skill
Name: desktop-control-custom
Version: 1.0.0

The skill provides extensive desktop automation capabilities, including full mouse/keyboard control, screen capture, and clipboard access via 'pyautogui' and 'pygetwindow'. While these features are aligned with the stated purpose of desktop control, they represent high-risk capabilities, particularly the 'ai_agent.py' module, which can autonomously launch applications by simulating the Windows Run dialog ('Win+R') and typing commands. Although the code includes safety mechanisms like a failsafe and an optional manual approval mode, the broad level of control over the host system and the potential for the AI agent to be manipulated into executing arbitrary commands via prompt injection warrant a suspicious classification.
Capability assessment
Purpose & Capability
The code and SKILL.md implement desktop automation (pyautogui, screenshots, window management, clipboard) and an autonomous AI agent for planning—this matches the stated purpose. Minor inconsistencies: registry metadata (slug/ownerId) does not match the included _meta.json, and the package/module import paths (skills.desktop_control vs desktop_control) differ from the registry slug 'desktop-control-custom'. These are likely packaging/metadata issues but worth confirming before installation.
Instruction Scope
Runtime instructions focus on installing desktop automation dependencies and using the DesktopController/AIDesktopAgent APIs. The skill instructs the agent to capture screenshots, read/modify the clipboard, list and activate windows, and control mouse/keyboard — all expected for this purpose. There are no instructions to read unrelated system credentials or to transmit data to external endpoints.
Install Mechanism
No automated install spec is included in the registry (files are shipped with the skill). SKILL.md recommends installing dependencies with pip (pyautogui, pillow, opencv-python, pygetwindow, etc.). There are no downloads from obscure URLs or extract steps in the registry metadata; this is low-risk, but note the user must run pip installs themselves.
Credentials
The skill declares no required environment variables, credentials, or config paths. The functionality (desktop control, screenshots, clipboard) does not require external secrets. This is proportional to the described features.
Persistence & Privilege
always:false (normal). The skill includes an autonomous AI agent (ai_agent.py) capable of planning and executing sequences of desktop actions and capturing screenshots. Autonomous invocation is the platform default; combined with the agent's capabilities this raises expected privacy/impact considerations (it can act on the user's desktop and capture screen contents). This is coherent with the skill's purpose but worth caution.
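How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install desktop-control-custom
  3. Once installed, invoke the Skill by name or trigger it with /desktop-control-custom
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output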
Version history
v1.0.0
Desktop-control-custom 1.0.0
- Initial release with advanced desktop automation features for OpenClaw.
- Provides pixel-perfect mouse control, including absolute/relative moves, smooth paths, multiple click types, drag & drop, scroll, and failsafe options.
- Comprehensive keyboard automation: fast typing, hotkey support, key combinations, manual key control, and adjustable typing speed.
- Screen capture utilities, pixel color detection, image recognition (OpenCV), and multi-monitor support.
- Window management: list, activate, get info, minimize/maximize windows.
- Safety mechanisms: emergency stop, approval mode, bounds checking, and detailed action logging.
- Includes detailed API documentation and quick start instructions in SKILL.md.
Metadata
Slug desktop-control-custom
Version 1.0.0
License MIT-0
Total installs 2
Current installs 2
Historical versions 1
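FAQ

What is Desktop Control Custom?

Advanced desktop automation with mouse, keyboard, and screen control. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 124 total downloads to date.

How do I install Desktop Control Custom?

Run the command "/install desktop-control-custom" in the OpenClaw or Claude Code chat box to install it in one step. No further configuration is needed beyond the pip dependencies listed in SKILL.md, which you install yourself.

Is Desktop Control Custom free?

Yes. Desktop Control Custom is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Desktop Control Custom support?

Desktop Control Custom is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Desktop Control Custom?

It is developed and maintained by alexbingquanxu-cpu (@alexbingquanxu-cpu). The current version is v1.0.0.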
