Description

Automate desktop tasks locally with mouse, keyboard, window control, OCR, and image recognition using Python on Windows/macOS/Linux.

README (SKILL.md)

Desktop Automation Skill v2.0

Name: desktop-automation-100per100-local
Author: jordaneparis

Complete desktop automation for Windows/macOS/Linux. Zero-error edition.

⚠️ Privacy & Security

CRITICAL: This skill captures ALL keyboard and mouse events.

NEVER record while entering passwords, credit cards, or secrets
Recorded macros are stored as JSON in recorded_macro/ directory
Always use dry_run=true to test before actual execution
Store macros in secure locations only
Keep safe mode enabled (it is on by default)

🎯 What It Does

Automate desktop interactions without APIs:

✅ Click, type, drag, scroll
✅ Capture screenshots
✅ Recognize images (OpenCV template matching)
✅ Extract text (Tesseract OCR)
✅ Record and replay macros
✅ Find windows by title
✅ Clipboard operations
✅ Safe mode with dry_run for testing

🔐 Safety Features (Built-In)

1. Safe Mode (Default: ON)

Blocks dangerous actions when enabled:

type, press_key, click, drag are monitored
Parameters are scanned for dangerous patterns such as "rm ", "del ", "C:\Windows\", "/etc/", and "sudo" (the trailing spaces in "rm " and "del " are part of the patterns)
Blocked actions are logged
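
The scan described above can be pictured as a plain substring check. The following is a minimal sketch only, not the skill's actual code: the names `DANGEROUS_PATTERNS`, `MONITORED_ACTIONS`, and `find_dangerous_pattern` are hypothetical, and the pattern list contains only the examples named in this README.

```python
# Hypothetical sketch of a pattern-based safe-mode check (not the skill's real code).
DANGEROUS_PATTERNS = ["rm ", "del ", "C:\\Windows\\", "/etc/", "sudo"]
MONITORED_ACTIONS = {"type", "press_key", "click", "drag"}


def find_dangerous_pattern(action, params):
    """Return the first dangerous pattern found in the params, or None."""
    if action not in MONITORED_ACTIONS:
        return None
    for value in params.values():
        text = str(value).lower()
        for pattern in DANGEROUS_PATTERNS:
            if pattern.lower() in text:
                return pattern
    return None


blocked = find_dangerous_pattern("type", {"text": "sudo rm -rf /tmp/x"})
allowed = find_dangerous_pattern("type", {"text": "Hello World"})
false_positive = find_dangerous_pattern("type", {"text": "form data"})
print(repr(blocked), repr(allowed), repr(false_positive))
```

In this sketch, harmless text such as "form data" is also flagged because it contains "rm ". A substring check of this kind is a convenience filter, not a security boundary.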

2. Dry-Run Mode

All actions support dry_run=true:

Action is logged but NOT executed
Use for testing before running real automation
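
As an illustration of the dry-run idea, here is a small sketch in which an action is logged but its executor is never called when `dry_run` is true. The function `run_action`, the `executor` callback, and the `"dry_run"` / `"success"` status strings are assumptions made for this example; the skill's real return values may differ.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dry_run_sketch")


def run_action(name, params, executor):
    """Log the action; call the executor only when dry_run is false."""
    params = dict(params)  # copy so the caller's dict is not modified
    dry_run = bool(params.pop("dry_run", False))
    if dry_run:
        log.info("[DRY RUN] %s %s", name, params)
        return {"status": "dry_run", **params}
    executor(**params)
    log.info("%s %s", name, params)
    return {"status": "success", **params}


calls = []  # stands in for real mouse input
result = run_action(
    "click",
    {"x": 100, "y": 100, "dry_run": True},
    lambda x, y: calls.append((x, y)),
)
print(result, calls)
```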

3. Audit Logging

Every action logged to ~/.openclaw/skills/desktop-automation-logs/automation_YYYY-MM-DD.log

4. Thread Safety

All modules use locks to prevent race conditions.

📦 Installation

1. Extract Files

Place desktop-automation-ultra-local/ in:

Windows: C:\Users\<User>\.openclaw\workspace\skills\
Linux/macOS: ~/.openclaw/workspace/skills/

2. Install Dependencies

pip install -r requirements.txt

3. Optional: Tesseract for OCR

For find_text_on_screen functionality:

Windows: Download installer from https://github.com/UB-Mannheim/tesseract/wiki
Linux: sudo apt install tesseract-ocr
macOS: brew install tesseract

4. Restart OpenClaw

openclaw gateway restart

🚀 Quick Start

Basic Click

action: click
params:
  x: 100
  y: 100
  dry_run: true  # Test first!

Type Text

action: type
params:
  text: "Hello World"
  interval: 0.05  # Delay between keys
  dry_run: false

Find Image

action: find_image
params:
  template_path: "templates/button.png"
  confidence: 0.95

Extract Text (OCR)

action: read_text_ocr
params:
  lang: "fra"  # French

📖 Core Actions

Mouse & Keyboard

Action	Parameters	Returns
`click`	`x`, `y`, `button="left"`, `dry_run`	`{status, x, y}`
`type`	`text`, `interval=0.05`, `dry_run`	`{status, text}`
`press_key`	`key`, `dry_run`	`{status, key}`
`move_mouse`	`x`, `y`, `duration=0.5`, `dry_run`	`{status, x, y}`
`scroll`	`amount=5`, `dry_run`	`{status, amount}`
`drag`	`start_x`, `start_y`, `end_x`, `end_y`, `duration=0.5`, `dry_run`	`{status}`
`copy_to_clipboard`	`text`, `dry_run`	`{status}`
`paste_from_clipboard`	`dry_run`	`{status, length}`

Screenshots & Windows

Action	Parameters	Returns
`screenshot`	`path="~/Desktop/screenshot.png"`, `dry_run`	`{status, path}`
`get_active_window`	`dry_run`	`{status, title, x, y, width, height}`
`list_windows`	`dry_run`	`{status, windows[], count}`
`activate_window`	`title_substring`, `dry_run`	`{status, title}`

Image Recognition (requires OpenCV)

Action	Parameters	Returns
`find_image`	`template_path`, `confidence=0.9`, `dry_run`	`{status, x, y, confidence}`
`find_image_multiscale`	`template_path`, `confidence`, `scale_factors`, `dry_run`	`{status, x, y, confidence, scale}`
`wait_for_image`	`template_path`, `timeout=30.0`, `interval=0.5`, `confidence=0.9`, `dry_run`	`{status, x, y, confidence}`

OCR / Text Recognition (requires Tesseract)

Action	Parameters	Returns
`find_text_on_screen`	`text`, `lang="fra"`, `dry_run`	`{status, locations[], count}`
`find_all_text_on_screen`	`text`, `lang="fra"`, `dry_run`	`{status, data[], count}`
`read_text_ocr`	`lang="fra"`, `dry_run`	`{status, text, length}`
`read_text_region`	`x`, `y`, `width`, `height`, `lang="fra"`, `dry_run`	`{status, text, length}`
`extract_screen_data`	`region={}`, `output_format="json"`, `lang="fra"`, `dry_run`	`{status, data[], count}`

Macros

Action	Parameters	Returns
`play_macro`	`macro_path`, `speed=1.0`, `dry_run`	`{status, executed, total, errors[]}`
`stop_macro`	—	`{status}`
`play_macro_with_subroutines`	`macro_path`, `speed=1.0`, `sub_macros_dir`, `dry_run`	`{status, executed, total, errors[]}`

Safety Management

Action	Parameters	Returns
`set_safe_mode`	`enabled=true`	`{status, safe_mode}`
`get_safety_status`	—	`{status, safe_mode_enabled, dangerous_patterns, dangerous_actions[]}`

📝 Macro Format

Recorded macros are JSON with this structure:

{
  "events": [
    {
      "action": "click",
      "params": {"x": 100, "y": 50},
      "wait": 500
    },
    {
      "action": "type",
      "params": {"text": "Hello"},
      "wait": 200
    },
    {
      "action": "press_key",
      "params": {"key": "return"},
      "wait": 100
    }
  ]
}

action — action name
params — action parameters
wait — milliseconds to wait before next action
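
Before playing a macro it can help to check that every event carries the three documented keys. The sketch below is one possible way to do that with the standard library. `load_macro` and `total_wait_seconds` are hypothetical helpers, and the idea that `speed` divides the recorded waits is an assumption, not documented behavior.

```python
import json

REQUIRED_KEYS = {"action", "params", "wait"}


def load_macro(text):
    """Parse macro JSON and check that each event has the documented keys."""
    events = json.loads(text)["events"]
    for index, event in enumerate(events):
        missing = REQUIRED_KEYS - event.keys()
        if missing:
            raise ValueError(f"event {index} is missing {sorted(missing)}")
    return events


def total_wait_seconds(events, speed=1.0):
    # Assumption: speed divides the recorded waits (2.0 = twice as fast).
    return sum(event["wait"] for event in events) / 1000.0 / speed


sample = """
{"events": [
  {"action": "click", "params": {"x": 100, "y": 50}, "wait": 500},
  {"action": "type", "params": {"text": "Hello"}, "wait": 200},
  {"action": "press_key", "params": {"key": "return"}, "wait": 100}
]}
"""
events = load_macro(sample)
print(len(events), total_wait_seconds(events))
```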

🔧 Advanced: Mouse Move Debouncing

To avoid recording hundreds of move_mouse events during a smooth drag, the recorder uses debouncing:

When you move the mouse, events are suppressed during movement
After you stop moving for N seconds (default: 1 sec), the final position is recorded
This reduces macro size dramatically while preserving intended end positions
Configurable via GUI: set debounce time (0.1–10 seconds)

Example:

Fast horizontal line → 1 move_mouse event (end coordinates)
Slow, stop-and-go → multiple move_mouse events (one per "stop")
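
The debouncing rule can be sketched as a filter over timestamped pointer samples. This is a simplified offline model, not the recorder's implementation: `debounce_moves` is a hypothetical name, samples are `(timestamp_seconds, x, y)` tuples, and the last sample is assumed to be followed by a pause long enough to be recorded.

```python
def debounce_moves(samples, debounce=1.0):
    """Collapse (timestamp, x, y) samples into one event per pause.

    A position is kept only when the pointer then stays idle for at
    least `debounce` seconds, or when it is the last sample.
    """
    events = []
    for i, (t, x, y) in enumerate(samples):
        is_last = i == len(samples) - 1
        if is_last or samples[i + 1][0] - t >= debounce:
            events.append({"action": "move_mouse", "params": {"x": x, "y": y}})
    return events


# Fast horizontal line: samples 10 ms apart, so only the end point survives.
fast = [(i * 0.01, i * 10, 200) for i in range(50)]

# Stop-and-go: three pauses longer than the 1 s debounce window.
stop_and_go = [
    (0.0, 0, 0), (0.1, 50, 0),
    (1.5, 60, 0), (1.6, 100, 0),
    (3.0, 110, 0), (3.1, 150, 0),
]

print(len(debounce_moves(fast)), len(debounce_moves(stop_and_go)))
```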

🧪 Testing

Run the unit test suite:

python scripts/test_automation.py

Output:

test_dry_run_click ... ok
test_get_active_window ... ok
test_safe_mode_blocks_dangerous ... ok
...
Ran 13 tests
OK

📊 Logging

All actions logged to: ~/.openclaw/skills/desktop-automation-logs/automation_YYYY-MM-DD.log

Example:

[2026-03-15 10:23:45] [INFO] ActionManager: ActionManager initialized with safe_mode=True
[2026-03-15 10:23:46] [INFO] ActionManager: Clicked at (100, 50) with left button
[2026-03-15 10:23:47] [INFO] ActionManager: Typed: Hello World
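
For scripts that need to locate or parse these logs, the path and line layout shown above can be reproduced as follows. This is a sketch based only on the example lines in this section; `log_path_for` and `format_entry` are hypothetical helpers, not part of the skill.

```python
import os
from datetime import datetime
from pathlib import Path

DEFAULT_LOG_DIR = "~/.openclaw/skills/desktop-automation-logs"


def log_path_for(day, log_dir=DEFAULT_LOG_DIR):
    """Build the date-stamped log file path for a given day."""
    return Path(os.path.expanduser(log_dir)) / f"automation_{day:%Y-%m-%d}.log"


def format_entry(when, level, component, message):
    """Format one line in the layout shown in the example above."""
    return f"[{when:%Y-%m-%d %H:%M:%S}] [{level}] {component}: {message}"


stamp = datetime(2026, 3, 15, 10, 23, 46)
print(log_path_for(stamp).name)
print(format_entry(stamp, "INFO", "ActionManager",
                   "Clicked at (100, 50) with left button"))
```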

⚙️ Configuration

Environment Variables

# Override log directory
export AUTOMATION_LOG_DIR=~/my_logs

# Disable safe mode globally (NOT recommended)
export AUTOMATION_SAFE_MODE=false
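
A sketch of how a Python process might read these two variables is shown below. The function `load_settings` is hypothetical, and the set of strings treated as "off" is an assumption; check the skill's source for the values it actually accepts.

```python
import os
from pathlib import Path

DEFAULT_LOG_DIR = "~/.openclaw/skills/desktop-automation-logs"


def load_settings(env=None):
    """Read the two documented variables, falling back to the defaults."""
    env = os.environ if env is None else env
    log_dir = env.get("AUTOMATION_LOG_DIR", DEFAULT_LOG_DIR)
    safe_raw = env.get("AUTOMATION_SAFE_MODE", "true")
    return {
        "log_dir": Path(log_dir).expanduser(),
        # Assumption: only an explicit "false", "0", or "no" disables safe mode.
        "safe_mode": safe_raw.strip().lower() not in {"false", "0", "no"},
    }


print(load_settings({"AUTOMATION_SAFE_MODE": "false"})["safe_mode"])
print(load_settings({})["safe_mode"])
```

Treating any unrecognized value as "safe mode on" keeps a typo in the variable from silently disabling the protection.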

🐛 Troubleshooting

"pyautogui failsafe triggered"

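This message means the mouse pointer reached a corner of the screen, which PyAutoGUI treats as an emergency stop. Move the pointer away from the corner and run the action again. To abort a running automation on purpose, move the mouse to a corner of the screen.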

OCR returns empty text

Ensure Tesseract is installed correctly
Check image quality (high contrast helps)
Try read_text_ocr instead of find_text_on_screen

Image recognition not finding template

Ensure template image exists and is correct format (PNG, JPG)
Try lower confidence threshold (e.g., 0.85 instead of 0.95)
Use find_image_multiscale to detect at different scales

Actions blocked by safe mode

This is intentional. To run dangerous actions:

action: set_safe_mode
params:
  enabled: false

Then execute your action. Re-enable safe mode immediately after:

action: set_safe_mode
params:
  enabled: true
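
When the disable/re-enable sequence is driven from a script, a context manager helps ensure safe mode is turned back on even if the action in between fails. The sketch below assumes a hypothetical `send_action(action, params)` callable that dispatches an action to the skill; no such function is documented here.

```python
from contextlib import contextmanager


@contextmanager
def safe_mode_disabled(send_action):
    """Turn safe mode off for the duration of the block, then back on."""
    send_action("set_safe_mode", {"enabled": False})
    try:
        yield
    finally:
        # Runs even if the body raises, so safe mode is always restored.
        send_action("set_safe_mode", {"enabled": True})


sent = []  # records what would have been dispatched


def fake_send(action, params):
    sent.append((action, params["enabled"]))


try:
    with safe_mode_disabled(fake_send):
        raise RuntimeError("action failed")
except RuntimeError:
    pass

print(sent)
```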

📄 License

MIT License. See LICENSE file.

📚 Files Structure

desktop-automation-ultra-local/
├── SKILL.md                          (This file)
├── requirements.txt                  (Python dependencies)
├── lib/
│   ├── actions.py                   (Core click/type/drag actions)
│   ├── image_recognition.py         (OpenCV template matching)
│   ├── ocr_engine.py                (Tesseract OCR)
│   ├── macro_player.py              (Record/playback macros)
│   ├── safety_manager.py            (Safe mode, blocking)
│   └── utils.py                     (Logging, helpers)
├── scripts/
│   └── test_automation.py           (Unit tests)
└── recorded_macro/                  (Output: saved macros)

✅ Validation Checklist

All modules have proper error handling
Thread safety implemented (locks)
Safe mode enabled by default
Dry-run mode on all actions
Comprehensive logging
Unit tests (13 tests)
UTF-8 encoding for all text
No hardcoded paths (uses expanduser)
Graceful fallbacks for missing dependencies
Documentation complete

Status: PRODUCTION READY ✅

Last updated: 2026-03-15
Version: 2.0.0

Usage Guidance

This skill appears to do what it claims (local desktop automation) but is inherently high-risk because it records and replays all keyboard and mouse activity. Before installing:

- Don't record or store any sensitive input (passwords, credit cards, authentication tokens). Recorded macros and logs are stored locally and can contain raw keystrokes and window titles.
- Treat recorded_macro/ and ~/.openclaw/skills/desktop-automation-logs/ as sensitive data stores; restrict filesystem permissions and consider encrypting backups. The skill supports AES-protected macros, but the presence of that feature means users might be tempted to embed secrets; avoid doing so.
- Do not enable autonomous or unattended execution of macros you did not author, and review every macro file carefully before playback. A macro that types key sequences can open terminals or web browsers and perform destructive or exfiltrative actions even though the skill has no network calls itself.
- The 'safe mode' is pattern-based and easily bypassed (GUI-driven actions, obfuscated typing, or multi-step sequences). Do not rely on it as a security boundary.
- Prefer testing with dry_run=true and manual supervision. Consider running the skill in a restricted, sandboxed account or VM if you need to run untrusted macros.
- Audit the full source before use (especially files omitted/truncated in the provided dump). If you plan to allow other users or the agent to invoke the skill autonomously, consider disabling autonomous invocation or adding stronger authorization controls.

Confidence is medium because some files were truncated in the provided file dump; a complete line-by-line review of all source files would increase confidence and could surface any hidden network calls or persistence mechanisms not visible in the excerpts.
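
One way to act on the advice about filesystem permissions is sketched below. It is an illustration only: `restrict_to_owner` is a hypothetical helper, it creates the directory if it does not exist, and on Windows `os.chmod` only toggles the read-only flag, so ACLs would be needed there instead.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(directory):
    """Make a directory tree readable and writable by its owner only (POSIX)."""
    root = Path(directory).expanduser()
    root.mkdir(parents=True, exist_ok=True)
    os.chmod(root, stat.S_IRWXU)  # 0o700 on the top-level directory
    for path in root.rglob("*"):
        # 0o700 for subdirectories, 0o600 for files
        mode = stat.S_IRWXU if path.is_dir() else stat.S_IRUSR | stat.S_IWUSR
        os.chmod(path, mode)
    return root


if __name__ == "__main__":
    # Paths taken from this README; adjust recorded_macro/ to where the skill is installed.
    restrict_to_owner("~/.openclaw/skills/desktop-automation-logs")
```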

Capability Analysis

Type: OpenClaw Skill
Name: desktop-automation-100per100-local
Version: 2.0.1

This skill bundle provides powerful desktop automation capabilities, including mouse/keyboard control, OCR, and clipboard access, which are inherently high-risk. It includes a macro recorder (scripts/record_macro.py) that captures all keystrokes, posing a significant privacy risk if used on sensitive data. While the bundle includes multiple safety modules (lib/safety.py, lib/safety_manager.py) and a 'Safe Mode' designed to block dangerous strings like 'sudo' or 'rm', these protections can be programmatically disabled via the 'set_safe_mode' action. The presence of high-privilege desktop access combined with the ability to bypass safety constraints warrants a suspicious classification, although no clear evidence of intentional malice or data exfiltration was found.

Capability Assessment

✓ Purpose & Capability

Name/description, declared requirements, and included code match: this is a local desktop automation/macro recorder + player using PyAutoGUI/OpenCV/pytesseract. Required packages and runtime behaviors align with the stated functionality.

⚠ Instruction Scope

SKILL.md and the code explicitly capture ALL keyboard and mouse events, record window titles, save macros to disk, and can replay arbitrary sequences of input. Although the skill provides dry_run and 'safe mode' pattern checks, those checks are pattern-based and superficial (string pattern matching like 'rm ' or 'sudo') and can be bypassed by GUI-driven flows (e.g., launching a terminal and sending keystrokes to execute arbitrary commands). Recorded macros and audit logs may contain sensitive data (passwords, credit cards, window titles). The skill does not require or declare any environment credentials, but its ability to synthesize input lets it interact with networked apps to exfiltrate data (even though the skill itself has no network code).

✓ Install Mechanism

No automatic installer is provided (instruction-only install). The README/SKILL.md instructs the user to place the folder and pip install requirements — this is typical and low-risk compared to remote-download installers. The repository/package URLs referenced are GitHub-like; there are no opaque download URLs or archive extraction steps in the install instructions.

✓ Credentials

The skill requests no environment variables or external credentials and lists only local Python dependencies appropriate to desktop automation, OCR, and image recognition. That is proportionate to the stated purpose.

ℹ Persistence & Privilege

The skill writes audit logs to ~/.openclaw/skills/desktop-automation-logs/ and saves recorded macros to a recorded_macro/ directory; it can also create encrypted macro files. It does not declare always:true and does not modify other skills. Persisting logs and macros in user home is expected for this type of tool, but these files can contain sensitive data and should be protected (permissions/encryption).

Version History

v2.0.1

SKILL TOTALLY RENAMED FOR RESEARCH IMPROVEMENT: https://clawhub.ai/JordaneParis/desktop-automation-ultra (awaiting merge capability)

**Desktop Automation Skill v2.0.1: Major Safety & Architecture Overhaul**

- Completely refactored core for zero-error operation and full thread safety; all logic is modularized in new `lib/` package.
- Introduced robust safety features: default safe mode, powerful dangerous action/pattern detection, and dry-run support for all actions.
- All actions now support `dry_run` parameter for logging/test runs without side effects.
- Built-in audit logging: every action writes detailed logs to a date-stamped file in `~/.openclaw/skills/desktop-automation-logs/`.
- Macro system fully reworked: robust recording with mouse debouncing and secure, well-structured JSON storage; macro playback supports dry-run and subroutine macros.
- Significantly improved documentation: precise action tables, safety notes, configuration, advanced macro recording, test suite, setup, and troubleshooting.
- All dependencies and core actions now unit-tested (see `scripts/test_automation.py`).

v2.0.0

- Added a new "Security & Privacy" section with explicit warnings about keyboard recording in macro mode.
- Informs users that recorded macros may capture sensitive data and advises on safe usage and storage.
- Clarifies there is no network access or credential storage unless entered by the user during a macro.
- Provides additional best practices for testing, verifying coordinates, and permissions.
- No changes to functionality or dependencies; documentation update only.

v1.0.0

First Push

Metadata

Slug desktop-automation-100per100-local

Version 2.0.1

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 3

Frequently Asked Questions

What is desktop-automation-100per100-local?

Automate desktop tasks locally with mouse, keyboard, window control, OCR, and image recognition using Python on Windows/macOS/Linux. It is an AI Agent Skill for Claude Code / OpenClaw, with 424 downloads so far.

How do I install desktop-automation-100per100-local?

Run "/install desktop-automation-100per100-local" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is desktop-automation-100per100-local free?

Yes, desktop-automation-100per100-local is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does desktop-automation-100per100-local support?

desktop-automation-100per100-local is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created desktop-automation-100per100-local?

It is built and maintained by JordaneParis (@jordaneparis); the current version is v2.0.1.
