Virtual Desktop — Universal Browser Execution
by Wesley Armando · v1.0.7 · MIT-0
403 downloads · 1 star · 0 active installs · 8 versions
Install in OpenClaw
/install virtual-desktop
Description
Full Computer Use for OpenClaw via kasmweb/chrome Docker sidecar. Navigate any website, click, type, fill forms, extract data, upload files, screenshot on an...
Usage Guidance
This skill appears to be internally consistent with its stated purpose, but it materially increases the risk that an agent (or an attacker who controls the agent) can act as logged-in users and access arbitrary sites using persisted sessions. Before installing:
- Do not run on a production or multi-tenant host; deploy in an isolated VM/VPS or sandbox.
- Review and set a strong VNC_PW in .env before starting; avoid the default 'CHANGE_ME_NOW'.
- Avoid exposing ports 6901 (noVNC) and 9222 (Chrome CDP) to the public internet; restrict access via firewall/VPC or use SSH/VPN tunneling.
- Understand optional API keys: CAPSOLVER (paid CAPTCHA solver) and BROWSERBASE (residential proxy) can enable bypassing anti-bot protections—only add them if you trust the provider and need the capability.
- Confirm how Telegram notifications are delivered; if you do not want Telegram notifications, remove TELEGRAM_BOT_TOKEN and verify behavior.
- Audit browser_control.py and SKILL.md yourself (or have a developer review) for any hardcoded endpoints or unexpected network calls; the code does call api.capsolver.com and api.anthropic.com as advertised.
- If you must use it, limit the agent's permissions, rotate or revoke sessions regularly, and consider making browser profiles ephemeral instead of permanent to reduce long-term exposure.
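The VNC_PW check in the list above can be automated before the sidecar is started. The following is a minimal sketch, not part of the skill: `parse_env`, `check_vnc_password`, and `WEAK_DEFAULTS` are hypothetical names, the 16-character minimum is an arbitrary threshold chosen for illustration, and the parser assumes a simple KEY=VALUE .env layout.

```python
# Sketch only: these helpers are hypothetical and not shipped with virtual-desktop.
# Default values that this listing and its version history mention for VNC_PW.
WEAK_DEFAULTS = {"CHANGE_ME_NOW", "ChangeMe2024!"}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes, if any.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_vnc_password(env_text, min_length=16):
    """Return a list of problems with VNC_PW; an empty list means none found."""
    password = parse_env(env_text).get("VNC_PW")
    if password is None:
        return ["VNC_PW is not set"]
    problems = []
    if password in WEAK_DEFAULTS:
        problems.append("VNC_PW is a published default value")
    if len(password) < min_length:  # min_length is an assumed threshold
        problems.append(f"VNC_PW is shorter than {min_length} characters")
    return problems
```

Calling `check_vnc_password(open(".env").read())` and refusing to start the container when the returned list is non-empty is one way to apply it.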
Capability Analysis
Type: OpenClaw Skill
Name: virtual-desktop
Version: 1.0.7
The 'virtual-desktop' skill bundle provides extensive browser automation capabilities by deploying a kasmweb/chrome Docker sidecar. It performs high-risk setup actions, including programmatically modifying the host's docker-compose.yml and .env files, opening port 6901 for remote VNC access, and using docker exec to install dependencies. While these actions are aligned with the stated goal of enabling 'Computer Use' for the agent, the broad permissions, modification of system-level configurations, and exposure of a remote desktop port constitute a significant security risk. No clear evidence of intentional malice was found, and the documentation includes appropriate security warnings regarding the VNC password and firewalling.
Capability Assessment
Purpose & Capability
The name/description (persistent authenticated browser via a kasmweb/chrome Docker sidecar) matches the declared binaries (docker, python3, openclaw), the required env vars (VNC_PW, BROWSER_CDP_URL), and the code (browser_control.py) which implements CDP/Playwright, CAPTCHA solving, and Claude Vision. Optional API keys (CapSolver, Browserbase, Anthropic) are appropriate to the advertised features.
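To confirm that the required BROWSER_CDP_URL actually points at a Chrome DevTools endpoint, one option is to query Chrome's standard `/json/version` HTTP route. This is a hedged sketch rather than the skill's own code: `version_endpoint` and `fetch_browser_version` are hypothetical helpers, and it assumes BROWSER_CDP_URL is an http(s) base URL such as `http://localhost:9222`, which the listing does not state.

```python
# Sketch only: hypothetical helpers, not taken from browser_control.py.
import json
import urllib.request


def version_endpoint(cdp_url):
    """Build the /json/version URL for an http(s) CDP base URL."""
    return cdp_url.rstrip("/") + "/json/version"


def fetch_browser_version(cdp_url, timeout=5.0):
    """Fetch and decode the JSON document Chrome serves at /json/version.

    Raises urllib.error.URLError if the endpoint is unreachable.
    """
    with urllib.request.urlopen(version_endpoint(cdp_url), timeout=timeout) as response:
        return json.load(response)
```

If the endpoint answers from a machine outside the host, port 9222 is reachable more widely than the usage guidance recommends.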
Instruction Scope
Runtime instructions automatically edit docker-compose.yml, .env and openclaw.json, map ports 6901 and 9222, create a persistent Docker volume for browser profiles, and direct the agent to use stored authenticated sessions indefinitely. These actions are coherent with the feature set but significantly broaden scope: they persist cookies/sessions that allow the agent to act as logged-in users, expose remote VNC/CDP endpoints if host port mappings are used, and default to a weak VNC_PW value ('CHANGE_ME_NOW') unless the principal changes it.
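Whether ports 6901 and 9222 end up exposed depends on how the host port mappings are written: Docker publishes a `HOST:CONTAINER` mapping on all interfaces unless a host IP is given. The sketch below flags such entries. It is illustrative only: `exposed_mappings` is a hypothetical helper, it takes the `ports:` entries as already-extracted strings in the short Compose syntax, and it does not handle port ranges or the long mapping syntax.

```python
# Sketch only: exposed_mappings is a hypothetical helper, not part of the skill.
SENSITIVE_PORTS = {"6901", "9222"}  # noVNC and Chrome CDP, per this listing
LOOPBACK = {"127.0.0.1", "[::1]"}


def exposed_mappings(port_entries):
    """Return entries that publish a sensitive container port beyond loopback."""
    exposed = []
    for entry in port_entries:
        spec = entry.split("/")[0]      # drop an optional "/tcp" or "/udp" suffix
        parts = spec.rsplit(":", 2)     # [ip, host, container], [host, container], or [container]
        container_port = parts[-1]
        host_ip = parts[0] if len(parts) == 3 else None  # None: all interfaces
        if container_port in SENSITIVE_PORTS and host_ip not in LOOPBACK:
            exposed.append(entry)
    return exposed
```

For example, `exposed_mappings(["6901:6901", "127.0.0.1:9222:9222"])` returns `["6901:6901"]`; rewriting that entry as `127.0.0.1:6901:6901` and reaching it through an SSH or VPN tunnel matches the guidance above.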
Install Mechanism
No external install script or remote download is used; the skill consists of instructions that edit configuration files plus a bundled local Python control script. That lowers supply-chain risk compared with arbitrary remote downloads.
Credentials
Required env vars (VNC_PW, BROWSER_CDP_URL) are proportional. Optional keys (CAPSOLVER_API_KEY, BROWSERBASE_API_KEY, ANTHROPIC_API_KEY, TELEGRAM_BOT_TOKEN) map to advertised capabilities. Minor inconsistency: SKILL.md metadata claims Telegram uses the existing agent channel 'no separate token required' while TELEGRAM_BOT_TOKEN is listed as optional—clarify how Telegram notifications are delivered. Also be aware CapSolver and Browserbase imply costs and privacy tradeoffs.
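A quick way to see which capabilities a given environment would switch on is to compare it against the variable names listed above. This is a minimal sketch under that assumption; `audit_env` is a hypothetical helper and the skill may validate its environment differently.

```python
# Sketch only: audit_env is hypothetical; variable names are those in this listing.
REQUIRED = ("VNC_PW", "BROWSER_CDP_URL")
OPTIONAL = (
    "CAPSOLVER_API_KEY",
    "BROWSERBASE_API_KEY",
    "ANTHROPIC_API_KEY",
    "TELEGRAM_BOT_TOKEN",
)


def audit_env(env):
    """Report missing required variables and optional ones that are set.

    Empty strings count as unset.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    enabled = [name for name in OPTIONAL if env.get(name)]
    return {"missing_required": missing, "optional_enabled": enabled}
```

Passing `os.environ` (after `import os`) gives a report for the current process; any name under `optional_enabled` that was not added deliberately is worth removing.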
Persistence & Privilege
The skill requests persistent presence of a browser profile (Docker volume) and modifies OpenClaw configuration (openclaw.json) to enable a browser profile. It does not set 'always: true'. Modifying agent/platform config and creating long-lived authenticated browser sessions increases privilege and long-term access surface but is consistent with the skill's functionality.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install virtual-desktop
- After installation, invoke the skill by name or use /virtual-desktop
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.7
No user-visible changes detected in this release.
- Version bump only; no updates to files or functionality.
v1.0.6
No visible code or content changes in this release; version number only incremented.
- No files were changed between v1.0.5 and v1.0.6.
- Functionality and documentation remain identical to the previous release.
v1.0.5
No changes detected in this release.
- Version number updated.
- No files or documentation were changed.
v1.0.4
- Updated description and instructions to clarify that agent access to authenticated platforms is enabled after a one-time manual login via noVNC.
- Expanded metadata: added a note on Telegram notifications and made the TELEGRAM_BOT_TOKEN environment variable optional.
- Adjusted required_paths to remove some unused files, simplifying workspace requirements.
- No code changes included in this version; documentation and metadata improvements only.
v1.0.3
No visible file changes detected. Metadata/environment variable requirements have been updated.
- Added "env" and "env_optional" fields to metadata to clarify required and optional environment variables.
- Default VNC_PW in installation instructions changed from "ChangeMe2024!" to "CHANGE_ME_NOW".
- No code or logic changes; documentation and environment setup guidance improved.
v1.0.2
Major upgrade: now uses a persistent authenticated browser via a kasmweb/chrome Docker sidecar, with full autonomy after the initial login.
- Added browser_control.py providing browser automation functionality.
- Switched to using kasmweb/chrome Docker sidecar for persistent, authenticated browser state.
- Enables login via noVNC; agent operates all platforms autonomously after setup.
- Permanent browser sessions persist across container restarts; no credential reuse required.
- Integrated CapSolver for automated CAPTCHA solving; supports Browserbase profiles for proxies and stealth.
- Enhanced logs, screenshot, and discovery recording in standardized workspace directories.
v1.0.1
- Added explicit documentation of required environment variables (PLATFORM_EMAIL and PLATFORM_PASSWORD) and notes on secure usage.
- Network request metadata now specifies operator-authorized platforms and clarifies configuration at runtime by the principal.
- Documented Telegram usage: skill sends action confirmations and screenshots via the agent’s existing Telegram channel (no extra bot needed).
- Refined metadata to clarify security practices, workspace paths, and environment variable handling.
- No code changes; documentation and metadata improvements only.
v1.0.0
Initial release of Virtual Desktop: a Docker-native, persistent headless browser skill for OpenClaw agents.
- Provides a Playwright Chromium-based automation layer with no Xvfb, VNC, or host dependencies.
- Enables agents to autonomously navigate, interact with, and extract data from any website.
- Supports full session persistence, automated form handling, structured error/self-correction logs, and continuous UI discovery logging.
- Output includes screenshots, action logs, and browser traces for every action.
- Designed to automate tasks such as content creation, email management, data extraction, and workflow execution on any web platform.
Frequently Asked Questions
What is Virtual Desktop — Universal Browser Execution?
Full Computer Use for OpenClaw via kasmweb/chrome Docker sidecar. Navigate any website, click, type, fill forms, extract data, upload files, screenshot on an... It is an AI Agent Skill for Claude Code / OpenClaw, with 403 downloads so far.
How do I install Virtual Desktop — Universal Browser Execution?
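Run "/install virtual-desktop" in the OpenClaw or Claude Code chat to install it. Before starting the sidecar, set the required environment variables (VNC_PW and BROWSER_CDP_URL) in .env and replace the default VNC_PW value.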
Is Virtual Desktop — Universal Browser Execution free?
Yes, the skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. The optional CapSolver and Browserbase integrations are third-party services that imply their own costs.
Which platforms does Virtual Desktop — Universal Browser Execution support?
Virtual Desktop — Universal Browser Execution is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Virtual Desktop — Universal Browser Execution?
It is built and maintained by Wesley Armando (@georges91560); the current version is v1.0.7.