# Desktop Control Skill

A desktop automation skill for SkillBoss. It provides pixel-level mouse control, fast keyboard input, screen capture, window management, and clipboard operations.

## 🎯 Features

### Mouse Control
- ✅ Absolute positioning - Move to exact coordinates
- ✅ Relative movement - Move from current position
- ✅ Smooth movement - Natural, human-like mouse paths
- ✅ Click types - Left, right, middle, double, triple clicks
- ✅ Drag & drop - Drag from point A to point B
- ✅ Scroll - Vertical and horizontal scrolling
- ✅ Position tracking - Get current mouse coordinates

### Keyboard Control
- ✅ Text typing - Fast, accurate text input
- ✅ Hotkeys - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- ✅ Special keys - Enter, Tab, Escape, Arrow keys, F-keys
- ✅ Key combinations - Multi-key press combinations
- ✅ Hold & release - Manual key state control
- ✅ Typing speed - Configurable WPM (instant to human-like)

### Screen Operations
- ✅ Screenshot - Capture entire screen or regions
- ✅ Image recognition - Find elements on screen (via OpenCV)
- ✅ Color detection - Get pixel colors at coordinates
- ✅ Multi-monitor - Support for multiple displays

### Window Management
- ✅ Window list - Get all open windows
- ✅ Activate window - Bring window to front
- ✅ Window info - Get position, size, title
- ✅ Minimize/Maximize - Control window states

### Safety Features
- ✅ Failsafe - Move mouse to corner to abort
- ✅ Pause control - Emergency stop mechanism
- ✅ Approval mode - Require confirmation for actions
- ✅ Bounds checking - Prevent out-of-screen operations
- ✅ Logging - Track all automation actions

---

## 🚀 Quick Start

### Installation

First, install the required dependencies:

```bash
pip install pyautogui pillow opencv-python pygetwindow
```

### Basic Usage\r
\r
```python\r
from skills.desktop_control import DesktopController\r
\r
# Initialize controller\r
dc = DesktopController(failsafe=True)\r
\r
# Mouse operations\r
dc.move_mouse(500, 300) # Move to coordinates\r
dc.click() # Left click at current position\r
dc.click(100, 200, button="right") # Right click at position\r
\r
# Keyboard operations\r
dc.type_text("Hello from SkillBoss!")\r
dc.hotkey("ctrl", "c") # Copy\r
dc.press("enter")\r
\r
# Screen operations\r
screenshot = dc.screenshot()\r
position = dc.get_mouse_position()\r
```\r
\r
---\r
\r
## 📋 Complete API Reference\r
\r
### Mouse Functions\r
\r
#### `move_mouse(x, y, duration=0, smooth=True)`\r
Move mouse to absolute screen coordinates.\r
\r
**Parameters:**\r
- `x` (int): X coordinate (pixels from left)\r
- `y` (int): Y coordinate (pixels from top)\r
- `duration` (float): Movement time in seconds (0 = instant, 0.5 = smooth)\r
- `smooth` (bool): Use bezier curve for natural movement\r
\r
**Example:**\r
```python\r
# Instant movement\r
dc.move_mouse(1000, 500)\r
\r
# Smooth 1-second movement\r
dc.move_mouse(1000, 500, duration=1.0)\r
```\r
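
The `smooth` flag above mentions a bezier curve, but how the skill builds its path is not documented here. The following is only a minimal sketch of the general idea, assuming a quadratic Bezier with a single control point. `bezier_path` and its `control` and `steps` parameters are hypothetical names, not part of the skill's API.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_path(start: Point, end: Point, control: Point, steps: int = 20) -> List[Point]:
    """Return steps + 1 points along a quadratic Bezier curve from start to end."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    points = []
    for i in range(steps + 1):
        t = i / steps
        # B(t) = (1 - t)^2 * P0 + 2 * (1 - t) * t * P1 + t^2 * P2
        x = (1 - t) ** 2 * start[0] + 2 * (1 - t) * t * control[0] + t ** 2 * end[0]
        y = (1 - t) ** 2 * start[1] + 2 * (1 - t) * t * control[1] + t ** 2 * end[1]
        points.append((x, y))
    return points


# Hypothetical control point; pulling it off the straight line bends the path.
path = bezier_path((0, 0), (1000, 500), control=(300, 600), steps=4)
print(path[0], path[-1])
```

Moving the pointer through each point in turn, with a short delay between them, would approximate a curved, human-like path.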
\r
#### `move_relative(x_offset, y_offset, duration=0)`\r
Move mouse relative to current position.\r
\r
**Parameters:**\r
- `x_offset` (int): Pixels to move horizontally (positive = right)\r
- `y_offset` (int): Pixels to move vertically (positive = down)\r
- `duration` (float): Movement time in seconds\r
\r
**Example:**\r
```python\r
# Move 100px right, 50px down\r
dc.move_relative(100, 50, duration=0.3)\r
```\r
\r
#### `click(x=None, y=None, button='left', clicks=1, interval=0.1)`\r
Perform mouse click.\r
\r
**Parameters:**\r
- `x, y` (int, optional): Coordinates to click (None = current position)\r
- `button` (str): 'left', 'right', 'middle'\r
- `clicks` (int): Number of clicks (1 = single, 2 = double)\r
- `interval` (float): Delay between multiple clicks\r
\r
**Example:**\r
```python\r
# Simple left click\r
dc.click()\r
\r
# Double-click at specific position\r
dc.click(500, 300, clicks=2)\r
\r
# Right-click\r
dc.click(button='right')\r
```\r
\r
#### `drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')`\r
Drag and drop operation.\r
\r
**Parameters:**\r
- `start_x, start_y` (int): Starting coordinates\r
- `end_x, end_y` (int): Ending coordinates\r
- `duration` (float): Drag duration\r
- `button` (str): Mouse button to use\r
\r
**Example:**\r
```python\r
# Drag file from desktop to folder\r
dc.drag(100, 100, 500, 500, duration=1.0)\r
```\r
\r
#### `scroll(clicks, direction='vertical', x=None, y=None)`\r
Scroll mouse wheel.\r
\r
**Parameters:**\r
- `clicks` (int): Scroll amount (positive = up/left, negative = down/right)\r
- `direction` (str): 'vertical' or 'horizontal'\r
- `x, y` (int, optional): Position to scroll at\r
\r
**Example:**\r
```python\r
# Scroll down 5 clicks\r
dc.scroll(-5)\r
\r
# Scroll up 10 clicks\r
dc.scroll(10)\r
\r
# Horizontal scroll\r
dc.scroll(5, direction='horizontal')\r
```\r
\r
#### `get_mouse_position()`\r
Get current mouse coordinates.\r
\r
**Returns:** `(x, y)` tuple\r
\r
**Example:**\r
```python\r
x, y = dc.get_mouse_position()\r
print(f"Mouse is at: {x}, {y}")\r
```\r
\r
---\r
\r
### Keyboard Functions\r
\r
#### `type_text(text, interval=0, wpm=None)`\r
Type text with configurable speed.\r
\r
**Parameters:**\r
- `text` (str): Text to type\r
- `interval` (float): Delay between keystrokes (0 = instant)\r
- `wpm` (int, optional): Words per minute (overrides interval)\r
\r
**Example:**\r
```python\r
# Instant typing\r
dc.type_text("Hello World")\r
\r
# Human-like typing at 60 WPM\r
dc.type_text("Hello World", wpm=60)\r
\r
# Slow typing with 0.1s between keys\r
dc.type_text("Hello World", interval=0.1)\r
```\r
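
The relationship between `wpm` and `interval` can be illustrated with a small conversion. This is a sketch under the common typing-test convention of five characters per word; whether the skill uses the same convention is an assumption, and `wpm_to_interval` is a hypothetical helper rather than part of the API.

```python
def wpm_to_interval(wpm: float, chars_per_word: int = 5) -> float:
    """Convert words per minute into a per-keystroke delay in seconds."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    # wpm * chars_per_word = characters per minute
    return 60.0 / (wpm * chars_per_word)


for wpm in (40, 60, 80):
    print(f"{wpm} WPM -> {wpm_to_interval(wpm):.3f} s per keystroke")
```

Under that convention, 60 WPM corresponds to roughly 0.2 seconds between keystrokes.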
\r
#### `press(key, presses=1, interval=0.1)`\r
Press and release a key.\r
\r
**Parameters:**\r
- `key` (str): Key name (see Key Names section)\r
- `presses` (int): Number of times to press\r
- `interval` (float): Delay between presses\r
\r
**Example:**\r
```python\r
# Press Enter\r
dc.press('enter')\r
\r
# Press Space 3 times\r
dc.press('space', presses=3)\r
\r
# Press Down arrow\r
dc.press('down')\r
```\r
\r
#### `hotkey(*keys, interval=0.05)`\r
Execute keyboard shortcut.\r
\r
**Parameters:**\r
- `*keys` (str): Keys to press together\r
- `interval` (float): Delay between key presses\r
\r
**Example:**\r
```python\r
# Copy (Ctrl+C)\r
dc.hotkey('ctrl', 'c')\r
\r
# Paste (Ctrl+V)\r
dc.hotkey('ctrl', 'v')\r
\r
# Open Run dialog (Win+R)\r
dc.hotkey('win', 'r')\r
\r
# Save (Ctrl+S)\r
dc.hotkey('ctrl', 's')\r
\r
# Select All (Ctrl+A)\r
dc.hotkey('ctrl', 'a')\r
```\r
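
PyAutoGUI's own `hotkey()` presses the keys in the order given and releases them in reverse order. Assuming this skill follows the same convention, the event sequence for a shortcut can be sketched as below; `hotkey_events` is a hypothetical helper used only to show the ordering.

```python
from typing import List, Tuple


def hotkey_events(*keys: str) -> List[Tuple[str, str]]:
    """Return the key-down and key-up events for a shortcut, in order."""
    presses = [("down", key) for key in keys]
    # Release in reverse so modifiers stay held until the last key is up.
    releases = [("up", key) for key in reversed(keys)]
    return presses + releases


for event in hotkey_events("ctrl", "shift", "s"):
    print(event)
```

This ordering is why modifiers should come first in the argument list: `hotkey('ctrl', 'c')` holds Ctrl while C is pressed.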
\r
#### `key_down(key)` / `key_up(key)`\r
Manually control key state.\r
\r
**Example:**\r
```python\r
# Hold Shift\r
dc.key_down('shift')\r
dc.type_text("hello") # Types "HELLO"\r
dc.key_up('shift')\r
\r
# Hold Ctrl and click (for multi-select)\r
dc.key_down('ctrl')\r
dc.click(100, 100)\r
dc.click(200, 100)\r
dc.key_up('ctrl')\r
```\r
\r
---\r
\r
### Screen Functions\r
\r
#### `screenshot(region=None, filename=None)`\r
Capture screen or region.\r
\r
**Parameters:**\r
- `region` (tuple, optional): (left, top, width, height) for partial capture\r
- `filename` (str, optional): Path to save image\r
\r
**Returns:** PIL Image object\r
\r
**Example:**\r
```python\r
# Full screen\r
img = dc.screenshot()\r
\r
# Save to file\r
dc.screenshot(filename="screenshot.png")\r
\r
# Capture specific region\r
img = dc.screenshot(region=(100, 100, 500, 300))\r
```\r
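
The `region` tuple is `(left, top, width, height)`, whereas Pillow's `Image.crop()` expects `(left, upper, right, lower)`. When cropping a full screenshot afterwards, a small conversion avoids mixing the two up. `region_to_box` is a hypothetical helper, not part of the skill.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # left, top, width, height
Box = Tuple[int, int, int, int]     # left, upper, right, lower


def region_to_box(region: Region) -> Box:
    """Convert a (left, top, width, height) region into a Pillow crop box."""
    left, top, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (left, top, left + width, top + height)


print(region_to_box((100, 100, 500, 300)))
```

The resulting box could then be passed to `img.crop(...)` on the PIL Image returned by `screenshot()`.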
\r
#### `get_pixel_color(x, y)`\r
Get color of pixel at coordinates.\r
\r
**Returns:** RGB tuple `(r, g, b)`\r
\r
**Example:**\r
```python\r
r, g, b = dc.get_pixel_color(500, 300)\r
print(f"Color at (500, 300): RGB({r}, {g}, {b})")\r
```\r
\r
#### `find_on_screen(image_path, confidence=0.8)`\r
Find image on screen (requires OpenCV).\r
\r
**Parameters:**\r
- `image_path` (str): Path to template image\r
- `confidence` (float): Match threshold (0-1)\r
\r
**Returns:** `(x, y, width, height)` or None\r
\r
**Example:**\r
```python\r
# Find button on screen\r
location = dc.find_on_screen("button.png")\r
if location:
    x, y, w, h = location
    # Click center of found image
    dc.click(x + w//2, y + h//2)
```\r
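
Because `find_on_screen` returns `None` when nothing matches, scripts often need to poll until an element appears. The sketch below shows one way to do that with a timeout. `wait_for`, `timeout`, and `poll` are hypothetical names, and the finder is passed in as a callable so the example runs without a display.

```python
import time
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def wait_for(finder: Callable[[], Optional[Box]],
             timeout: float = 5.0,
             poll: float = 0.25) -> Optional[Box]:
    """Call finder repeatedly until it returns a box or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        box = finder()
        if box is not None:
            return box
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Stand-in finder for demonstration: misses twice, then "finds" a box.
attempts = iter([None, None, (40, 60, 120, 30)])
found = wait_for(lambda: next(attempts), timeout=2.0, poll=0.01)
print(found)
```

With the skill, the finder would presumably be something like `lambda: dc.find_on_screen("button.png")`.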
\r
#### `get_screen_size()`\r
Get screen resolution.\r
\r
**Returns:** `(width, height)` tuple\r
\r
**Example:**\r
```python\r
width, height = dc.get_screen_size()\r
print(f"Screen: {width}x{height}")\r
```\r
\r
---\r
\r
### Window Functions\r
\r
#### `get_all_windows()`\r
List all open windows.\r
\r
**Returns:** List of window titles\r
\r
**Example:**\r
```python\r
windows = dc.get_all_windows()\r
for title in windows:
    print(f"Window: {title}")
```\r
\r
#### `activate_window(title_substring)`\r
Bring window to front by title.\r
\r
**Parameters:**\r
- `title_substring` (str): Part of window title to match\r
\r
**Example:**\r
```python\r
# Activate Chrome\r
dc.activate_window("Chrome")\r
\r
# Activate VS Code\r
dc.activate_window("Visual Studio Code")\r
```\r
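
Matching by `title_substring` can be pictured as a scan over the titles returned by `get_all_windows()`. The sketch below assumes case-insensitive matching and that the first match wins; neither behavior is confirmed by this document, and `match_window` is a hypothetical helper.

```python
from typing import List, Optional


def match_window(titles: List[str], title_substring: str) -> Optional[str]:
    """Return the first title containing the substring, ignoring case."""
    needle = title_substring.casefold()
    for title in titles:
        if needle in title.casefold():
            return title
    return None


titles = [
    "Inbox - Mail",
    "README.md - Visual Studio Code",
    "New Tab - Google Chrome",
]
print(match_window(titles, "chrome"))
```

A short substring such as "Code" may match more than one window, so a more specific fragment of the title is safer.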
\r
#### `get_active_window()`\r
Get currently focused window.\r
\r
**Returns:** Window title (str)\r
\r
**Example:**\r
```python\r
active = dc.get_active_window()\r
print(f"Active window: {active}")\r
```\r
\r
---\r
\r
### Clipboard Functions\r
\r
#### `copy_to_clipboard(text)`\r
Copy text to clipboard.\r
\r
**Example:**\r
```python\r
dc.copy_to_clipboard("Hello from SkillBoss!")\r
```\r
\r
#### `get_from_clipboard()`\r
Get text from clipboard.\r
\r
**Returns:** str\r
\r
**Example:**\r
```python\r
text = dc.get_from_clipboard()\r
print(f"Clipboard: {text}")\r
```\r
\r
---\r
\r
## ⌨️ Key Names Reference\r
\r
### Alphabet Keys\r
`'a'` through `'z'`\r
\r
### Number Keys\r
`'0'` through `'9'`\r
\r
### Function Keys\r
`'f1'` through `'f24'`\r
\r
### Special Keys\r
- `'enter'` / `'return'`\r
- `'esc'` / `'escape'`\r
- `'space'` / `'spacebar'`\r
- `'tab'`\r
- `'backspace'`\r
- `'delete'` / `'del'`\r
- `'insert'`\r
- `'home'`\r
- `'end'`\r
- `'pageup'` / `'pgup'`\r
- `'pagedown'` / `'pgdn'`\r
\r
### Arrow Keys\r
- `'up'` / `'down'` / `'left'` / `'right'`\r
\r
### Modifier Keys\r
- `'ctrl'` / `'control'`\r
- `'shift'`\r
- `'alt'`\r
- `'win'` / `'winleft'` / `'winright'`\r
- `'cmd'` / `'command'` (Mac)\r
\r
### Lock Keys\r
- `'capslock'`\r
- `'numlock'`\r
- `'scrolllock'`\r
\r
### Punctuation\r
- `'.'` / `','` / `'?'` / `'!'` / `';'` / `':'`\r
- `'['` / `']'` / `'{'` / `'}'`\r
- `'('` / `')'`\r
- `'+'` / `'-'` / `'*'` / `'/'` / `'='`\r
\r
---\r
\r
## 🛡️ Safety Features\r
\r
### Failsafe Mode\r
\r
Move mouse to **any corner** of the screen to abort all automation.\r
\r
```python\r
# Enable failsafe (enabled by default)\r
dc = DesktopController(failsafe=True)\r
```\r
\r
### Pause Control\r
\r
```python\r
# Pause all automation for 2 seconds\r
dc.pause(2.0)\r
\r
# Check if automation is safe to proceed\r
if dc.is_safe():
    dc.click(500, 500)
```\r
\r
### Approval Mode\r
\r
Require user confirmation before actions:\r
\r
```python\r
dc = DesktopController(require_approval=True)\r
\r
# This will ask for confirmation\r
dc.click(500, 500) # Prompt: "Allow click at (500, 500)? [y/n]"\r
```\r
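
The approval flow above can be thought of as a gate in front of each action. The following is a minimal sketch of that pattern, not the skill's actual implementation; `approve`, `guarded`, and the `ask` parameter are hypothetical names. Passing the prompt function in makes the gate testable without a terminal.

```python
from typing import Callable, List


def approve(description: str, ask: Callable[[str], str] = input) -> bool:
    """Ask for confirmation and return True only for an explicit yes."""
    answer = ask(f"Allow {description}? [y/n] ")
    return answer.strip().lower() in ("y", "yes")


def guarded(action: Callable[[], None], description: str,
            ask: Callable[[str], str] = input) -> bool:
    """Run action only if approved; report whether it ran."""
    if not approve(description, ask):
        return False
    action()
    return True


# Demonstration with canned answers instead of real keyboard input.
log: List[str] = []
ran = guarded(lambda: log.append("click"), "click at (500, 500)", ask=lambda prompt: "y")
skipped = guarded(lambda: log.append("click"), "click at (10, 10)", ask=lambda prompt: "n")
print(ran, skipped, log)
```

Treating anything other than an explicit "y" or "yes" as a refusal keeps the default on the safe side.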
\r
---\r
\r
## 🎨 Advanced Examples\r
\r
### Example 1: Automated Form Filling\r
\r
```python\r
dc = DesktopController()\r
\r
# Click name field\r
dc.click(300, 200)\r
dc.type_text("John Doe", wpm=80)\r
\r
# Tab to next field\r
dc.press('tab')\r
dc.type_text("john.doe@example.com", wpm=80)  # Placeholder email address
\r
# Tab to password\r
dc.press('tab')\r
dc.type_text("SecurePassword123", wpm=60)\r
\r
# Submit form\r
dc.press('enter')\r
```\r
\r
### Example 2: Screenshot Region and Save\r
\r
```python\r
# Capture specific area\r
region = (100, 100, 800, 600) # left, top, width, height\r
img = dc.screenshot(region=region)\r
\r
# Save with timestamp\r
import datetime\r
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")\r
img.save(f"capture_{timestamp}.png")\r
```\r
\r
### Example 3: Multi-File Selection\r
\r
```python\r
# Hold Ctrl and click multiple files\r
dc.key_down('ctrl')\r
dc.click(100, 200) # First file\r
dc.click(100, 250) # Second file\r
dc.click(100, 300) # Third file\r
dc.key_up('ctrl')\r
\r
# Copy selected files\r
dc.hotkey('ctrl', 'c')\r
```\r
\r
### Example 4: Window Automation\r
\r
```python\r
import time

# Activate Calculator
dc.activate_window("Calculator")\r
time.sleep(0.5)\r
\r
# Type calculation\r
dc.type_text("5+3=", interval=0.2)\r
time.sleep(0.5)\r
\r
# Take screenshot of result\r
dc.screenshot(filename="calculation_result.png")\r
```\r
\r
### Example 5: Drag & Drop File\r
\r
```python\r
# Drag file from source to destination\r
dc.drag(
    start_x=200, start_y=300,  # File location
    end_x=800, end_y=500,      # Folder location
    duration=1.0               # Smooth 1-second drag
)
```\r
\r
---\r
\r
## ⚡ Performance Tips\r
\r
1. **Use instant movements** for speed: `duration=0`\r
2. **Batch operations** instead of individual calls\r
3. **Cache screen positions** instead of recalculating\r
4. **Disable failsafe** for maximum performance (use with caution)\r
5. **Use hotkeys** instead of menu navigation\r
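
Tip 3 can be illustrated with a small cache around an image lookup. This is a sketch only: `PositionCache` is a hypothetical class, and the locate function is injected so the example runs without a screen. With the skill, it would presumably wrap `dc.find_on_screen`.

```python
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


class PositionCache:
    """Remember where each template image was found to avoid repeat searches."""

    def __init__(self, locate: Callable[[str], Optional[Box]]) -> None:
        self._locate = locate
        self._boxes: Dict[str, Box] = {}

    def get(self, image_path: str) -> Optional[Box]:
        if image_path not in self._boxes:
            box = self._locate(image_path)
            if box is None:
                return None  # Misses are not cached; the element may appear later.
            self._boxes[image_path] = box
        return self._boxes[image_path]

    def invalidate(self, image_path: str) -> None:
        """Forget a cached position, for example after the window moves."""
        self._boxes.pop(image_path, None)


# Stand-in locate function that records how often it is called.
calls: List[str] = []


def fake_locate(path: str) -> Optional[Box]:
    calls.append(path)
    return (10, 20, 30, 40)


cache = PositionCache(fake_locate)
cache.get("button.png")
cache.get("button.png")
print(len(calls))
```

Cached positions go stale when windows move or resize, so the cache should be invalidated after any action that changes the layout.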
\r
---\r
\r
## ⚠️ Important Notes\r
\r
- **Screen coordinates** start at (0, 0) in top-left corner\r
- **Multi-monitor setups** may have negative coordinates for secondary displays\r
- **Windows DPI scaling** may affect coordinate accuracy\r
- **Failsafe corners** are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)\r
- **Some applications** may block simulated input (games, secure apps)\r
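
The coordinate rules above can be turned into two small checks. This is a sketch for a single primary display with its origin at (0, 0); it does not handle secondary monitors with negative coordinates. `clamp_to_screen` and `is_failsafe_corner` are hypothetical helpers, not part of the skill's API.

```python
from typing import Tuple


def clamp_to_screen(x: int, y: int, width: int, height: int) -> Tuple[int, int]:
    """Clamp a point into the valid range 0..width-1 and 0..height-1."""
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))


def is_failsafe_corner(x: int, y: int, width: int, height: int) -> bool:
    """Return True if the point is one of the four failsafe corners listed above."""
    corners = {(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)}
    return (x, y) in corners


print(clamp_to_screen(2500, 300, 1920, 1080))
print(is_failsafe_corner(0, 1079, 1920, 1080))
```

Note that clamping a point that is out of bounds on both axes lands exactly on a corner, which would trigger the failsafe, so it is worth checking the clamped result before moving there.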
\r
---\r
\r
## 🔧 Troubleshooting\r
\r
### Mouse not moving to correct position\r
- Check DPI scaling settings\r
- Verify screen resolution matches expectations\r
- Use `get_screen_size()` to confirm dimensions\r
\r
### Keyboard input not working\r
- Ensure target application has focus\r
- Some apps require admin privileges\r
- Try increasing `interval` for reliability\r
\r
### Failsafe triggering accidentally\r
- Increase screen border tolerance\r
- Move mouse away from corners during normal use\r
- Disable if needed: `DesktopController(failsafe=False)`\r
\r
### Permission errors\r
- Run Python with administrator privileges for some operations\r
- Some secure applications block automation\r
\r
---\r
\r
## 📦 Dependencies\r
\r
- **PyAutoGUI** - Core automation engine\r
- **Pillow** - Image processing\r
- **OpenCV** (optional) - Image recognition\r
- **PyGetWindow** - Window management\r
\r
Install all:\r
```bash\r
pip install pyautogui pillow opencv-python pygetwindow\r
```\r
\r
---\r
\r
**Built for SkillBoss** - The ultimate desktop automation companion\r

- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: `/install abe-desktop-control`
- After installation, invoke the skill by name or use `/abe-desktop-control`
- Provide required inputs per the skill's parameter spec and get structured output

**What is desktop-control?**
Advanced desktop automation with mouse, keyboard, and screen control. It is an AI Agent Skill for Claude Code / OpenClaw, with 79 downloads so far.

**How do I install desktop-control?**
Run `/install abe-desktop-control` in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.

**Is desktop-control free?**
Yes, desktop-control is completely free, licensed under MIT-0. You can download, install, and use it at no cost.

**Which platforms does desktop-control support?**
desktop-control is cross-platform and runs anywhere OpenClaw / Claude Code is available.

**Who created desktop-control?**
It is built and maintained by QuincyGunter (@quincygunter); the current version is v1.0.0.