breckengan

Control

by Breckengan · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
Downloads: 1438
Stars: 0
Active Installs: 9
Versions: 1
Install in OpenClaw
/install control
Description
Advanced desktop automation with mouse, keyboard, and screen control
README (SKILL.md)

# Desktop Control Skill

The most advanced desktop automation skill for OpenClaw. Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.

## 🎯 Features

### Mouse Control

- **Absolute positioning** - Move to exact coordinates
- **Relative movement** - Move from current position
- **Smooth movement** - Natural, human-like mouse paths
- **Click types** - Left, right, middle, double, triple clicks
- **Drag & drop** - Drag from point A to point B
- **Scroll** - Vertical and horizontal scrolling
- **Position tracking** - Get current mouse coordinates

### Keyboard Control

- **Text typing** - Fast, accurate text input
- **Hotkeys** - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- **Special keys** - Enter, Tab, Escape, Arrow keys, F-keys
- **Key combinations** - Multi-key press combinations
- **Hold & release** - Manual key state control
- **Typing speed** - Configurable WPM (instant to human-like)

### Screen Operations

- **Screenshot** - Capture entire screen or regions
- **Image recognition** - Find elements on screen (via OpenCV)
- **Color detection** - Get pixel colors at coordinates
- **Multi-monitor** - Support for multiple displays

### Window Management

- **Window list** - Get all open windows
- **Activate window** - Bring window to front
- **Window info** - Get position, size, title
- **Minimize/Maximize** - Control window states

### Safety Features

- **Failsafe** - Move mouse to corner to abort
- **Pause control** - Emergency stop mechanism
- **Approval mode** - Require confirmation for actions
- **Bounds checking** - Prevent out-of-screen operations
- **Logging** - Track all automation actions

---

## 🚀 Quick Start

### Installation

First, install required dependencies:

```bash
pip install pyautogui pillow opencv-python pygetwindow
```

### Basic Usage

```python
from skills.desktop_control import DesktopController

# Initialize controller
dc = DesktopController(failsafe=True)

# Mouse operations
dc.move_mouse(500, 300)  # Move to coordinates
dc.click()  # Left click at current position
dc.click(100, 200, button="right")  # Right click at position

# Keyboard operations
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c")  # Copy
dc.press("enter")

# Screen operations
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```

---

## 📋 Complete API Reference

### Mouse Functions

#### `move_mouse(x, y, duration=0, smooth=True)`
Move mouse to absolute screen coordinates.

**Parameters:**
- `x` (int): X coordinate (pixels from left)
- `y` (int): Y coordinate (pixels from top)
- `duration` (float): Movement time in seconds (0 = instant, 0.5 = smooth)
- `smooth` (bool): Use bezier curve for natural movement

**Example:**
```python
# Instant movement
dc.move_mouse(1000, 500)

# Smooth 1-second movement
dc.move_mouse(1000, 500, duration=1.0)
```
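The `smooth` flag is described as using a bezier curve, but the README does not show how the path is built. As a rough illustration only, the sketch below samples a quadratic Bezier curve between two points. `bezier_path` and its `control` argument are hypothetical names chosen here, not part of the skill's API, and the skill's actual implementation may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_path(start: Point, end: Point, control: Point, steps: int = 20) -> List[Tuple[int, int]]:
    """Sample a quadratic Bezier curve from start to end, bending toward control.

    Hypothetical helper for illustration; not taken from the skill's source.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    path = []
    for i in range(steps + 1):
        t = i / steps
        # Quadratic Bezier: (1-t)^2 * P0 + 2(1-t)t * P1 + t^2 * P2
        x = (1 - t) ** 2 * start[0] + 2 * (1 - t) * t * control[0] + t ** 2 * end[0]
        y = (1 - t) ** 2 * start[1] + 2 * (1 - t) * t * control[1] + t ** 2 * end[1]
        path.append((round(x), round(y)))
    return path


# Curved path from the top-left corner toward (1000, 500)
points = bezier_path((0, 0), (1000, 500), control=(300, 450))
```

The curve always starts and ends exactly on the requested points; only the intermediate samples depend on where the control point is placed.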

#### `move_relative(x_offset, y_offset, duration=0)`
Move mouse relative to current position.

**Parameters:**
- `x_offset` (int): Pixels to move horizontally (positive = right)
- `y_offset` (int): Pixels to move vertically (positive = down)
- `duration` (float): Movement time in seconds

**Example:**
```python
# Move 100px right, 50px down
dc.move_relative(100, 50, duration=0.3)
```

#### `click(x=None, y=None, button='left', clicks=1, interval=0.1)`
Perform mouse click.

**Parameters:**
- `x, y` (int, optional): Coordinates to click (None = current position)
- `button` (str): 'left', 'right', 'middle'
- `clicks` (int): Number of clicks (1 = single, 2 = double)
- `interval` (float): Delay between multiple clicks

**Example:**
```python
# Simple left click
dc.click()

# Double-click at specific position
dc.click(500, 300, clicks=2)

# Right-click
dc.click(button='right')
```

#### `drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')`
Drag and drop operation.

**Parameters:**
- `start_x, start_y` (int): Starting coordinates
- `end_x, end_y` (int): Ending coordinates
- `duration` (float): Drag duration
- `button` (str): Mouse button to use

**Example:**
```python
# Drag file from desktop to folder
dc.drag(100, 100, 500, 500, duration=1.0)
```

#### `scroll(clicks, direction='vertical', x=None, y=None)`
Scroll mouse wheel.

**Parameters:**
- `clicks` (int): Scroll amount (positive = up/left, negative = down/right)
- `direction` (str): 'vertical' or 'horizontal'
- `x, y` (int, optional): Position to scroll at

**Example:**
```python
# Scroll down 5 clicks
dc.scroll(-5)

# Scroll up 10 clicks
dc.scroll(10)

# Horizontal scroll
dc.scroll(5, direction='horizontal')
```

#### `get_mouse_position()`
Get current mouse coordinates.

**Returns:** `(x, y)` tuple

**Example:**
```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```

---

### Keyboard Functions

#### `type_text(text, interval=0, wpm=None)`
Type text with configurable speed.

**Parameters:**
- `text` (str): Text to type
- `interval` (float): Delay between keystrokes (0 = instant)
- `wpm` (int, optional): Words per minute (overrides interval)

**Example:**
```python
# Instant typing
dc.type_text("Hello World")

# Human-like typing at 60 WPM
dc.type_text("Hello World", wpm=60)

# Slow typing with 0.1s between keys
dc.type_text("Hello World", interval=0.1)
```
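The README does not say how `wpm` is converted into a per-key delay. A common typing-speed convention counts one word as five characters; under that assumption (which the source does not confirm), the conversion might look like the sketch below. `wpm_to_interval` is a hypothetical helper, not a function exposed by the skill.

```python
def wpm_to_interval(wpm: float, chars_per_word: int = 5) -> float:
    """Convert words per minute into seconds between keystrokes.

    Assumes the usual 5-characters-per-word convention; the skill's own
    conversion is not documented and may differ.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60.0 / (wpm * chars_per_word)


# 60 WPM is 300 characters per minute, i.e. 0.2 seconds per key
interval = wpm_to_interval(60)
```

Under this convention, the `wpm=60` call above would be roughly equivalent to passing `interval=0.2`.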

#### `press(key, presses=1, interval=0.1)`
Press and release a key.

**Parameters:**
- `key` (str): Key name (see Key Names section)
- `presses` (int): Number of times to press
- `interval` (float): Delay between presses

**Example:**
```python
# Press Enter
dc.press('enter')

# Press Space 3 times
dc.press('space', presses=3)

# Press Down arrow
dc.press('down')
```

#### `hotkey(*keys, interval=0.05)`
Execute keyboard shortcut.

**Parameters:**
- `*keys` (str): Keys to press together
- `interval` (float): Delay between key presses

**Example:**
```python
# Copy (Ctrl+C)
dc.hotkey('ctrl', 'c')

# Paste (Ctrl+V)
dc.hotkey('ctrl', 'v')

# Open Run dialog (Win+R)
dc.hotkey('win', 'r')

# Save (Ctrl+S)
dc.hotkey('ctrl', 's')

# Select All (Ctrl+A)
dc.hotkey('ctrl', 'a')
```

#### `key_down(key)` / `key_up(key)`
Manually control key state.

**Example:**
```python
# Hold Shift
dc.key_down('shift')
dc.type_text("hello")  # Types "HELLO"
dc.key_up('shift')

# Hold Ctrl and click (for multi-select)
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```

---

### Screen Functions

#### `screenshot(region=None, filename=None)`
Capture screen or region.

**Parameters:**
- `region` (tuple, optional): (left, top, width, height) for partial capture
- `filename` (str, optional): Path to save image

**Returns:** PIL Image object

**Example:**
```python
# Full screen
img = dc.screenshot()

# Save to file
dc.screenshot(filename="screenshot.png")

# Capture specific region
img = dc.screenshot(region=(100, 100, 500, 300))
```
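Note that `region` is expressed as (left, top, width, height), whereas Pillow's `Image.crop` expects a (left, upper, right, lower) box. If you later crop the returned image yourself, a small conversion like the one sketched below may help; `region_to_box` is a hypothetical helper introduced here, not part of the skill.

```python
from typing import Tuple


def region_to_box(region: Tuple[int, int, int, int]) -> Tuple[int, int, int, int]:
    """Convert (left, top, width, height) into a (left, upper, right, lower) box."""
    left, top, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (left, top, left + width, top + height)


# The region used in the example above, expressed as a crop box
box = region_to_box((100, 100, 500, 300))
```

The resulting tuple can be passed to `Image.crop` on a full-screen capture to obtain the same area.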

#### `get_pixel_color(x, y)`
Get color of pixel at coordinates.

**Returns:** RGB tuple `(r, g, b)`

**Example:**
```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```

#### `find_on_screen(image_path, confidence=0.8)`
Find image on screen (requires OpenCV).

**Parameters:**
- `image_path` (str): Path to template image
- `confidence` (float): Match threshold (0-1)

**Returns:** `(x, y, width, height)` or None

**Example:**
```python
# Find button on screen
location = dc.find_on_screen("button.png")
if location:
    x, y, w, h = location
    # Click center of found image
    dc.click(x + w//2, y + h//2)
```

#### `get_screen_size()`
Get screen resolution.

**Returns:** `(width, height)` tuple

**Example:**
```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```
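Because `get_screen_size()` returns `(width, height)`, valid coordinates on a single display run from 0 to `width - 1` and from 0 to `height - 1`. The README lists bounds checking as a feature without showing it; the sketch below is one possible way to clamp a target point. `clamp_to_screen` is a hypothetical name, and the sketch assumes one display with its origin at (0, 0), so it does not handle negative coordinates on secondary monitors.

```python
from typing import Tuple


def clamp_to_screen(x: int, y: int, width: int, height: int) -> Tuple[int, int]:
    """Clamp (x, y) into the rectangle [0, width-1] x [0, height-1].

    Illustrative only: assumes a single display whose origin is (0, 0).
    """
    if width <= 0 or height <= 0:
        raise ValueError("screen dimensions must be positive")
    clamped_x = min(max(x, 0), width - 1)
    clamped_y = min(max(y, 0), height - 1)
    return (clamped_x, clamped_y)


# A point past the right edge and above the top edge of a 1920x1080 screen
safe_point = clamp_to_screen(2500, -10, 1920, 1080)
```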

---

### Window Functions

#### `get_all_windows()`
List all open windows.

**Returns:** List of window titles

**Example:**
```python
windows = dc.get_all_windows()
for title in windows:
    print(f"Window: {title}")
```

#### `activate_window(title_substring)`
Bring window to front by title.

**Parameters:**
- `title_substring` (str): Part of window title to match

**Example:**
```python
# Activate Chrome
dc.activate_window("Chrome")

# Activate VS Code
dc.activate_window("Visual Studio Code")
```
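The README does not state whether the title match is case-sensitive or what happens when several windows match. To preview which windows a given substring would hit, you could filter the list returned by `get_all_windows()` yourself. The sketch below assumes case-insensitive matching, which is an assumption made here and not documented behaviour; `find_matching_titles` is a hypothetical helper.

```python
from typing import List


def find_matching_titles(titles: List[str], title_substring: str) -> List[str]:
    """Return every title that contains title_substring, ignoring case."""
    needle = title_substring.casefold()
    return [title for title in titles if needle in title.casefold()]


# Sample titles standing in for the output of get_all_windows()
sample_titles = ["README.md - Visual Studio Code", "Google Chrome", "Calculator"]
matches = find_matching_titles(sample_titles, "chrome")
```

Checking the matches first can help avoid activating the wrong window when a short substring is ambiguous.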

#### `get_active_window()`
Get currently focused window.

**Returns:** Window title (str)

**Example:**
```python
active = dc.get_active_window()
print(f"Active window: {active}")
```

---

### Clipboard Functions

#### `copy_to_clipboard(text)`
Copy text to clipboard.

**Example:**
```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```

#### `get_from_clipboard()`
Get text from clipboard.

**Returns:** str

**Example:**
```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```

---

## ⌨️ Key Names Reference

### Alphabet Keys
`'a'` through `'z'`

### Number Keys
`'0'` through `'9'`

### Function Keys
`'f1'` through `'f24'`

### Special Keys
- `'enter'` / `'return'`
- `'esc'` / `'escape'`
- `'space'` / `'spacebar'`
- `'tab'`
- `'backspace'`
- `'delete'` / `'del'`
- `'insert'`
- `'home'`
- `'end'`
- `'pageup'` / `'pgup'`
- `'pagedown'` / `'pgdn'`

### Arrow Keys
- `'up'` / `'down'` / `'left'` / `'right'`

### Modifier Keys
- `'ctrl'` / `'control'`
- `'shift'`
- `'alt'`
- `'win'` / `'winleft'` / `'winright'`
- `'cmd'` / `'command'` (Mac)

### Lock Keys
- `'capslock'`
- `'numlock'`
- `'scrolllock'`

### Punctuation
- `'.'` / `','` / `'?'` / `'!'` / `';'` / `':'`
- `'['` / `']'` / `'{'` / `'}'`
- `'('` / `')'`
- `'+'` / `'-'` / `'*'` / `'/'` / `'='`

---

## 🛡️ Safety Features

### Failsafe Mode

Move mouse to **any corner** of the screen to abort all automation.

```python
# Enable failsafe (enabled by default)
dc = DesktopController(failsafe=True)
```
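The corner coordinates are listed under Important Notes as (0, 0), (width-1, 0), (0, height-1), and (width-1, height-1). The sketch below shows one way such a check could be expressed. `is_failsafe_corner` is a hypothetical name, and the exact-pixel comparison is an assumption; the skill may apply a tolerance around each corner.

```python
def is_failsafe_corner(x: int, y: int, width: int, height: int) -> bool:
    """Return True when (x, y) is exactly one of the four screen corners."""
    corners = {
        (0, 0),
        (width - 1, 0),
        (0, height - 1),
        (width - 1, height - 1),
    }
    return (x, y) in corners


# On a 1920x1080 screen the bottom-right corner is (1919, 1079)
at_corner = is_failsafe_corner(1919, 1079, 1920, 1080)
```

Keeping scripted targets away from these four points avoids tripping the failsafe during normal operation.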

### Pause Control

```python
# Pause all automation for 2 seconds
dc.pause(2.0)

# Check if automation is safe to proceed
if dc.is_safe():
    dc.click(500, 500)
```

### Approval Mode

Require user confirmation before actions:

```python
dc = DesktopController(require_approval=True)

# This will ask for confirmation
dc.click(500, 500)  # Prompt: "Allow click at (500, 500)? [y/n]"
```
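How the confirmation is collected is not shown in the README. As a minimal sketch of the idea, assuming a console prompt like the one quoted above, an approval gate could look like this. `approved` and its `ask` parameter are hypothetical names introduced for illustration; `ask` is injectable so the gate can be exercised without real keyboard input.

```python
from typing import Callable


def approved(description: str, ask: Callable[[str], str] = input) -> bool:
    """Ask the user to confirm an action; only 'y' or 'yes' counts as approval.

    Hypothetical helper: the skill's real approval flow is not documented.
    """
    answer = ask(f"Allow {description}? [y/n] ")
    return answer.strip().lower() in ("y", "yes")


# Simulated answers in place of real console input
allowed = approved("click at (500, 500)", ask=lambda prompt: "y")
denied = approved("click at (500, 500)", ask=lambda prompt: "n")
```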

---

## 🎨 Advanced Examples

### Example 1: Automated Form Filling

```python
dc = DesktopController()

# Click name field
dc.click(300, 200)
dc.type_text("John Doe", wpm=80)

# Tab to next field
dc.press('tab')
dc.type_text("[email protected]", wpm=80)

# Tab to password
dc.press('tab')
dc.type_text("SecurePassword123", wpm=60)

# Submit form
dc.press('enter')
```

### Example 2: Screenshot Region and Save

```python
import datetime

# Capture specific area
region = (100, 100, 800, 600)  # left, top, width, height
img = dc.screenshot(region=region)

# Save with timestamp
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
img.save(f"capture_{timestamp}.png")
```

### Example 3: Multi-File Selection

```python
# Hold Ctrl and click multiple files
dc.key_down('ctrl')
dc.click(100, 200)  # First file
dc.click(100, 250)  # Second file
dc.click(100, 300)  # Third file
dc.key_up('ctrl')

# Copy selected files
dc.hotkey('ctrl', 'c')
```

### Example 4: Window Automation

```python
import time

# Activate Calculator
dc.activate_window("Calculator")
time.sleep(0.5)  # Give the window time to come to the front

# Type calculation
dc.type_text("5+3=", interval=0.2)
time.sleep(0.5)  # Wait for the result to render

# Take screenshot of result
dc.screenshot(filename="calculation_result.png")
```

### Example 5: Drag & Drop File

```python
# Drag file from source to destination
dc.drag(
    start_x=200, start_y=300,  # File location
    end_x=800, end_y=500,       # Folder location
    duration=1.0                 # Smooth 1-second drag
)
```

---

## ⚡ Performance Tips

1. **Use instant movements** for speed: `duration=0`
2. **Batch operations** instead of individual calls
3. **Cache screen positions** instead of recalculating
4. **Disable failsafe** for maximum performance (use with caution)
5. **Use hotkeys** instead of menu navigation

---

## ⚠️ Important Notes

- **Screen coordinates** start at (0, 0) in top-left corner
- **Multi-monitor setups** may have negative coordinates for secondary displays
- **Windows DPI scaling** may affect coordinate accuracy
- **Failsafe corners** are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)
- **Some applications** may block simulated input (games, secure apps)

---

## 🔧 Troubleshooting

### Mouse not moving to correct position
- Check DPI scaling settings
- Verify screen resolution matches expectations
- Use `get_screen_size()` to confirm dimensions

### Keyboard input not working
- Ensure target application has focus
- Some apps require admin privileges
- Try increasing `interval` for reliability

### Failsafe triggering accidentally
- Increase screen border tolerance
- Move mouse away from corners during normal use
- Disable if needed: `DesktopController(failsafe=False)`

### Permission errors
- Run Python with administrator privileges for some operations
- Some secure applications block automation

---

## 📦 Dependencies

- **PyAutoGUI** - Core automation engine
- **Pillow** - Image processing
- **OpenCV** (optional) - Image recognition
- **PyGetWindow** - Window management

Install all:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```

---

**Built for OpenClaw** - The ultimate desktop automation companion 🦞
Usage Guidance
This package appears to be a legitimate desktop automation skill that implements the features described, but there are a few red flags to consider before installing and running it:

- Metadata mismatch: the registry owner/slug differ from the _meta.json inside the package. Confirm the author's identity and source before trusting the package.
- Manual dependency installation: the skill expects pyautogui, opencv-python, pygetwindow, pyperclip, etc. Install these in a contained environment (virtualenv or VM) and inspect for platform-specific permission prompts (accessibility/access to control input on macOS, for example).
- Powerful local capabilities: the skill can move your mouse, type into any active window, take screenshots, and read/modify clipboard contents. Only run it when you trust the code and on a non-sensitive session (or in a VM) and enable failsafe/require_approval.
- Autonomy & LLM integration: the ai_agent component can execute multi-step tasks autonomously. Avoid providing an LLM client or credentials until you've reviewed any code paths that might send screenshots or other data to external services.
- Safety tips: enable failsafe (DesktopController(failsafe=True)), prefer require_approval=True during testing, run demos step-by-step rather than 'run all', and test inside a disposable VM.

If you plan to allow autonomous execution, review the full ai_agent.py to ensure it does not attempt to auto-detect or transmit credentials/data. If you want higher confidence: ask the publisher to reconcile the metadata, provide a provenance/source URL (repo or homepage), or provide a signed release. If you provide the truncated parts of ai_agent.py and __init__.py for review, I can re-evaluate any hidden behaviors.
Capability Analysis
Type: OpenClaw Skill
Name: control
Version: 1.0.0

The skill bundle provides a comprehensive desktop automation framework using standard libraries like PyAutoGUI, PyGetWindow, and Pyperclip. It includes a low-level controller and a high-level AI agent capable of planning and executing tasks such as drawing in Paint, typing in Notepad, and managing windows. The code is well-documented, follows its stated purpose, and incorporates safety features like a failsafe mechanism (mouse-to-corner abort) and an optional manual approval mode for actions. No evidence of malicious intent, data exfiltration, or unauthorized persistence was found across the implementation files (__init__.py, ai_agent.py) or documentation.
Capability Assessment
Purpose & Capability
The name, description, SKILL.md, and code implement desktop automation (mouse, keyboard, screenshots, window management, clipboard). That functionality is coherent with the stated purpose. However there are metadata inconsistencies: the registry metadata (ownerId/slug) differs from _meta.json (different ownerId and slug 'desktop-control'), and the published skill's slug 'control' differs from package files. The skill contains code files (Python) but no install spec in the registry metadata — the SKILL.md instructs pip-install of several dependencies (pyautogui, opencv, pygetwindow, pyperclip), which is necessary but not declared in the registry. These mismatches could be harmless packaging issues but warrant scrutiny.
Instruction Scope
Runtime instructions and the code stay within desktop automation scope: move mouse, type text, take screenshots, manipulate clipboard, list windows, and run demo flows. The ai_agent component can plan and autonomously execute multi-step workflows (open apps, type, click, take screenshots). This is expected for this skill but grants broad local control (including the ability to type into any focused window, open apps, read/modify clipboard, and capture screen contents). The SKILL.md demos explicitly ask user to run sequences that will control the desktop — that's expected but high-impact.
Install Mechanism
No install spec is provided in the registry (lowest install risk), but the package includes Python code that imports pyautogui and the documentation instructs pip install of pyautogui, pillow, opencv-python, pygetwindow, pyperclip. Because the registry doesn't declare or bundle dependencies, users must install them manually. opencv-python and pyautogui may require native dependencies/permissions on some OSes. No remote downloads or URL-based installers were found.
Credentials
The skill declares no required environment variables or credentials, which is consistent with a purely local desktop-control tool. One uncertainty: ai_agent.py accepts an optional llm_client and comments that it will 'try to auto-detect' an LLM client — the visible code sets llm_client from an argument but truncated portions could attempt to discover/configure a model client or use system credentials. No explicit environment-variable reads or network endpoints are present in the visible code, but the potential LLM integration is a place to check for undeclared credential usage.
Persistence & Privilege
The skill does not request always:true, and does not declare any persistent system-wide modifications. It can be invoked by the model autonomously (disable-model-invocation is false), which is the platform default; combined with its ability to act on the user's desktop this increases potential impact but is expected for an automation skill. The code does not appear to modify other skills' configs.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install control
  3. After installation, invoke the skill by name or use /control
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Desktop Control Skill: powerful desktop automation for mouse, keyboard, and screen.
- Provides pixel-perfect mouse control, including clicking, dragging, scrolling, and human-like movement.
- Offers fast, accurate keyboard input, hotkeys, special key support, and configurable typing speed.
- Enables screen capture, image recognition, color detection, and multi-monitor compatibility.
- Includes window management: activate, move/resize, minimize/maximize, and list windows.
- Focuses on safety with failsafes, emergency stop, approval mode, bounds checking, and action logging.
- Bundled with installation instructions, code examples, and detailed API documentation.
Metadata
Slug control
Version 1.0.0
License
All-time Installs 10
Active Installs 9
Total Versions 1
Frequently Asked Questions

What is Control?

Advanced desktop automation with mouse, keyboard, and screen control. It is an AI Agent Skill for Claude Code / OpenClaw, with 1438 downloads so far.

How do I install Control?

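Run "/install control" in the OpenClaw or Claude Code chat to install the skill. The Python dependencies named in the README (pyautogui, pillow, opencv-python, pygetwindow) are not declared in the registry, so they must be installed manually with pip.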

Is Control free?

Yes, Control is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Control support?

Control is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Control?

It is built and maintained by Breckengan (@breckengan); the current version is v1.0.0.
