Downloads: 95 · Stars: 0 · Active Installs: 1 · Versions: 1
Install in OpenClaw
/install desktop-agent
Description
Control desktop apps via mouse and keyboard, capture screenshots, teach AI tasks by demonstration, and automate workflows with saved reusable tasks.
README (SKILL.md)
Desktop AI Agent Skill
Use this skill when the user wants to:
- Control desktop applications
- Teach the AI to perform tasks
- Have the AI learn from demonstrations
- Automate desktop workflows
Commands
Take Screenshot
from desktop_agent import get_agent
agent = get_agent()
agent.capture_to_file("screenshot.png")
Get Mouse Position
from desktop_agent import get_agent
agent = get_agent()
pos = agent.get_mouse_position() # Returns (x, y)
Mouse Control
from desktop_agent import get_agent
agent = get_agent()
x, y = 100, 200 # Example screen coordinates
agent.move_to(x, y) # Move mouse
agent.click(x, y) # Click
agent.double_click(x, y) # Double click
agent.right_click(x, y) # Right click
agent.drag_to(x, y) # Drag
Keyboard Control
from desktop_agent import get_agent
agent = get_agent()
agent.type("Hello world") # Type text
agent.press("enter") # Press key
agent.hotkey("ctrl", "c") # Ctrl+C
Teaching Mode
from desktop_agent import get_agent
from desktop_agent.teacher import TaskTeacher
agent = get_agent()
teacher = TaskTeacher(agent)
# Start teaching
teacher.start_teaching("my_task")
# Record actions
teacher.record_click() # Records current mouse position
teacher.record_click(100, 200) # Records specific position
teacher.record_type("text")
teacher.record_press("enter")
teacher.record_hotkey("ctrl", "s")
teacher.record_wait(2) # Wait 2 seconds
# Show steps
teacher.show_steps()
# Save or cancel
teacher.finish_teaching() # Save the recorded task
# teacher.cancel_teaching() # Or call this instead to discard the recording
Running Learned Tasks
from desktop_agent import get_agent
agent = get_agent()
tasks = agent.list_tasks() # List all tasks
agent.execute_task("task_name") # Run a task
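A minimal sketch of guarding a task run so that a mistyped name fails loudly. It assumes `list_tasks()` returns an iterable of task names, which this page does not document. `run_known_task` is a hypothetical helper, not part of the skill, and it accepts any object that exposes `list_tasks()` and `execute_task()`.

```python
def run_known_task(agent, name):
    """Run `name` only if the agent reports it as a saved task.

    Assumption: agent.list_tasks() yields task names as strings.
    Adjust the membership check if the real return type differs.
    """
    known = list(agent.list_tasks())
    if name not in known:
        raise KeyError(f"Unknown task {name!r}; available: {known}")
    return agent.execute_task(name)
```

With the real skill this would be called as `run_known_task(get_agent(), "my_task")`.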
Files
- desktop_agent/__init__.py - Core agent
- desktop_agent/teacher.py - Teaching system
- learned_tasks/ - Saved task definitions
Notes
- AI can see screen and control mouse/keyboard
- User can teach by demonstration
- Tasks are saved as JSON and reusable
- Use with caution - can control any application
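The on-disk task schema is not shown on this page, so the following is only a sketch of what a JSON-backed task store could look like. The field names (`name`, `steps`, `action`, and so on) and the helpers `save_task` and `load_task` are hypothetical; the example steps mirror the recording calls from Teaching Mode.

```python
import json
from pathlib import Path


def save_task(directory, name, steps):
    """Write a task as <directory>/<name>.json and return the file path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    payload = {"name": name, "steps": steps}  # hypothetical layout
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def load_task(path):
    """Read a task file back into a plain dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical step records, one per recorded action.
example_steps = [
    {"action": "click", "x": 100, "y": 200},
    {"action": "type", "text": "text"},
    {"action": "press", "key": "enter"},
    {"action": "hotkey", "keys": ["ctrl", "s"]},
    {"action": "wait", "seconds": 2},
]
```

Because saved tasks are plain text in a layout like this, they can be opened and reviewed before being replayed.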
Usage Guidance
This skill's functionality (screen capture, OCR, mouse/keyboard automation, saving and replaying tasks) matches its description, but there are a few red flags you should address before installing:
1) The package includes code that depends on many native Python libraries (pyautogui, mss, numpy, pillow, opencv, easyocr) but provides no install instructions. Install these in a controlled environment (virtualenv/container) and verify versions.
2) The core initializer in __init__.py appears truncated/buggy (a reference like 'workspace = works' in the shipped file), so the skill may fail or behave unexpectedly. Ask the author for the complete file and a clear get_agent(workspace) behavior.
3) The skill can see your screen and control your mouse/keyboard and will save tasks to disk. Only use it if you trust the source, and run it with caution (isolated environment, limited agent autonomy).
4) There is no network exfiltration code visible, but review the complete files and confirm there are no hidden network calls in the missing/truncated section.
Recommended actions: run the code locally in an isolated VM or container, inspect the full __init__.py (and any truncated content), require explicit user invocation (disable autonomous invocation if possible), and get a dependency/install manifest from the publisher before use.
Capability Analysis
Type: OpenClaw Skill
Name: desktop-agent
Version: 1.0.0
The skill provides extensive desktop control capabilities, including screen capture, OCR (via easyocr), and full mouse/keyboard manipulation (via pyautogui). While these features are aligned with the stated purpose of a desktop automation agent, they represent high-risk behaviors that allow an AI to interact with any application or sensitive data on the host system. Furthermore, desktop_agent/__init__.py contains a hardcoded Windows path to a specific user's directory (C:\Users\funky\.openclaw\workspace). Although no evidence of intentional malice or data exfiltration was found, the broad permissions and potential for misuse via prompt injection justify a suspicious classification.
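One way a hardcoded user directory like the one described above could be avoided is to derive the workspace from the current user's home directory, with an optional override. This is a sketch, not the skill's actual code: `resolve_workspace` and the `OPENCLAW_WORKSPACE` variable name are hypothetical.

```python
import os
from pathlib import Path


def resolve_workspace(env=None):
    """Return the workspace directory without assuming a specific user.

    `env` defaults to os.environ; passing a dict makes the function easy
    to test. OPENCLAW_WORKSPACE is a hypothetical override variable.
    """
    env = os.environ if env is None else env
    override = env.get("OPENCLAW_WORKSPACE")
    if override:
        return Path(override).expanduser()
    # Fall back to <home>/.openclaw/workspace for whoever runs the code.
    return Path.home() / ".openclaw" / "workspace"
```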
Capability Assessment
Purpose & Capability
Name, description, SKILL.md, and included Python modules align: the code provides screenshot capture, OCR, image/template matching, mouse/keyboard control, and a teach/save task system — all expected for a 'Desktop Agent'.
Instruction Scope
SKILL.md explicitly instructs using functions that capture the screen and control mouse/keyboard; these are within the declared purpose. The instructions cause the agent to create and write task JSON files to a local learned_tasks directory — this is expected but is powerful (it can record and replay arbitrary interactions).
Install Mechanism
There is no install spec but the code imports many heavy native Python packages (pyautogui, mss, numpy, pillow/PIL, opencv/cv2, easyocr). The skill provides no guidance for installing these dependencies; that mismatch (no install instructions + heavy deps) is a packaging/operational gap and increases friction/risk for users.
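Since no install manifest ships with the skill, a quick pre-flight check can report which of the packages named above are not importable. The sketch below uses only the standard library; `missing_dependencies` is a hypothetical helper, the mapping pairs each import name with its usual pip package name, and no version constraints are implied because the skill does not state any.

```python
import importlib.util

# Import name -> usual pip package name for the libraries named above.
REQUIRED = {
    "pyautogui": "pyautogui",
    "mss": "mss",
    "numpy": "numpy",
    "PIL": "pillow",
    "cv2": "opencv-python",
    "easyocr": "easyocr",
}


def missing_dependencies(required=REQUIRED):
    """Return pip names for every top-level module that cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

Running this inside the virtualenv or container intended for the skill shows what still needs installing before `get_agent()` is called.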
Credentials
The skill requests no environment variables, credentials, or external config paths. It operates locally (screen, mouse/keyboard, local filesystem) which is proportional to its stated function.
Persistence & Privilege
always:false and no system-level modifications are declared. However, because the skill enables direct desktop control and the platform allows autonomous invocation by default, granting it to an agent gives it the ability to perform potentially dangerous UI actions if invoked autonomously — consider limiting autonomous use or granting only when explicitly invoked.
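If the platform cannot disable autonomous invocation, a thin wrapper that asks for approval before each action is one possible mitigation. The sketch below is an assumption-laden illustration: `ConfirmingAgent` is a hypothetical class, not part of the skill or of OpenClaw, and it wraps any object by intercepting its method calls.

```python
class ConfirmingAgent:
    """Proxy that calls `confirm(description)` before every method call.

    `agent` is any object (for example the one returned by get_agent());
    `confirm` is a callable returning True to allow the action.
    """

    def __init__(self, agent, confirm):
        self._agent = agent
        self._confirm = confirm

    def __getattr__(self, name):
        # Only reached for attributes not defined on the proxy itself.
        target = getattr(self._agent, name)
        if not callable(target):
            return target

        def guarded(*args, **kwargs):
            description = f"{name}{args}"
            if not self._confirm(description):
                raise PermissionError(f"Action declined: {description}")
            return target(*args, **kwargs)

        return guarded
```

A console prompt such as `lambda d: input(f"Allow {d}? [y/N] ").strip().lower() == "y"` could serve as the `confirm` callable; every declined action raises `PermissionError` instead of touching the desktop.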
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install desktop-agent
- After installation, invoke the skill by name or use /desktop-agent
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Desktop AI Agent Skill:
- Enables controlling desktop applications via Python commands (mouse, keyboard, screenshots).
- Supports "Teaching Mode" for creating reusable automation tasks by demonstration.
- Allows running and managing learned tasks.
- Includes core agent, teaching system, and task storage.
- Use with caution: full control over screen, mouse, and keyboard.
Frequently Asked Questions
What is Desktop Agent?
Desktop Agent is an AI agent skill for Claude Code / OpenClaw that controls desktop apps via mouse and keyboard, captures screenshots, learns tasks by demonstration, and automates workflows with saved reusable tasks. It has 95 downloads so far.
How do I install Desktop Agent?
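Run "/install desktop-agent" in the OpenClaw or Claude Code chat to install the skill in one step. Note that, per the Install Mechanism assessment, the skill ships no install spec for its Python dependencies (pyautogui, mss, numpy, pillow, opencv, easyocr), so these may need to be installed separately.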
Is Desktop Agent free?
Yes, Desktop Agent is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Desktop Agent support?
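Desktop Agent is listed as cross-platform and runs wherever OpenClaw / Claude Code is available. However, the Capability Analysis reports a hardcoded Windows path in desktop_agent/__init__.py, so behavior on other operating systems may differ.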
Who created Desktop Agent?
It is built and maintained by Fuzzyb33s (@fuzzyb33s); the current version is v1.0.0.