OpenBrowser
by softpudding · GitHub · v0.1.0 · MIT-0
345 Downloads · 0 Stars · 2 Active Installs · 3 Versions
Install in OpenClaw
/install open-browser
Description
Automate complex multi-step browser tasks by visually interacting with pages using screenshots for clicks, typing, scrolling, and verification.
Usage Guidance
This skill appears to implement a local visual browser-automation agent, which fits its description, but there are notable practical and security issues to consider before installing:
- Metadata mismatch: The registry lists no required binaries or env vars, but SKILL.md requires Python 3.10+, Node.js 18+, Chrome, a DashScope LLM API key, and a browser UUID. Treat the SKILL.md as authoritative and ensure you meet those prerequisites.
- Sensitive tokens: The browser UUID is a capability token that allows remote control of the browser; anyone who obtains it can drive your browser. Only paste/store it on machines and UIs you trust. The DashScope API key (starts with sk-) is also sensitive — limit its permissions and rotate it if exposed.
- Third-party code: Setup requires cloning and building code from github.com/softpudding/OpenBrowser. Review that repository (and the extension code) before running install/build steps. Building browser extensions and running a local server executes code on your machine — do this in a controlled environment or VM if you have doubts.
- Network exposure: The server binds to localhost in the docs (http://127.0.0.1:8765). Confirm the server does not bind to 0.0.0.0 or get exposed to untrusted networks. If you must run it, keep it firewalled to localhost only.
- Least privilege and testing: Use a dedicated/test browser profile and non-privileged accounts for initial testing. Avoid using a browser where you are logged into important accounts. Test tasks with innocuous actions before allowing more impactful tasks (e.g., posting, starring, form submissions).
- Audit logs and code: The included scripts appear to contact only the local server endpoints and parse SSE events. Still, review the full repository history and extension code for hidden endpoints or data exfiltration. If you cannot audit, consider not installing or running the service.
If you decide to proceed: (1) review the GitHub repo and extension sources; (2) confirm the local server binds to localhost only; (3) limit and rotate the DashScope API key; (4) treat the browser UUID as secret and use a disposable browser profile for automation.
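Step (2) above can be partly automated. The following is a minimal sketch, assuming you can read the bind host from the server's configuration or startup log; the helper name `is_loopback_bind` is hypothetical and is not part of the OpenBrowser code.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True only if `host` is a loopback-only bind address."""
    if host == "localhost":
        return True
    try:
        # 127.0.0.0/8 and ::1 are loopback; 0.0.0.0 and :: are not.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname): not provably local.
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "0.0.0.0", "::1"):
        print(candidate, is_loopback_bind(candidate))
```

This checks only the configured address. It does not replace inspecting the listening sockets on the machine or the firewall rules.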
Capability Analysis
Type: OpenClaw Skill
Name: open-browser
Version: 0.1.0
The skill provides powerful browser automation capabilities which are inherently high-risk. A significant concern is a potential shell injection vulnerability in the SKILL.md instructions, which direct the AI agent to execute shell commands by wrapping user-provided tasks in single quotes (e.g., `send_task.py 'TASK'`). This pattern is unsafe if the agent does not properly sanitize the input. Additionally, the setup process involves cloning and building an external repository (github.com/softpudding/OpenBrowser), posing a supply chain risk. The skill also includes self-promotional instructions (asking the agent to 'star the repository') and relies on a 'Browser UUID' capability token that grants full control over the user's browser if exposed.
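To illustrate the quoting concern, here is a hedged sketch comparing the single-quote wrapping pattern with two safer alternatives. The function names are hypothetical, and the real `send_task.py` command line may take additional arguments; only the quoting behavior is shown.

```python
import shlex
import sys


def naive_shell_command(task: str, script: str = "send_task.py") -> str:
    """Unsafe pattern: wrap the task in single quotes and hand it to a shell."""
    return f"{script} '{task}'"


def quoted_shell_command(task: str, script: str = "send_task.py") -> str:
    """Safer when a shell string is unavoidable: quote every argument."""
    return f"{shlex.quote(script)} {shlex.quote(task)}"


def build_task_argv(task: str, script: str = "send_task.py") -> list[str]:
    """Safest: an argv list for subprocess.run(argv), with no shell involved."""
    return [sys.executable, script, task]


if __name__ == "__main__":
    # A task containing a single quote breaks out of the naive wrapping.
    task = "x'; touch /tmp/pwned; echo '"
    print(shlex.split(naive_shell_command(task)))   # task split into several words
    print(shlex.split(quoted_shell_command(task)))  # task stays one argument
    print(build_task_argv(task))
```

Passing the argv list to `subprocess.run` without `shell=True` keeps the task text as a single argument regardless of the characters it contains.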
Capability Assessment
Purpose & Capability
The name and description claim visual browser automation, which matches the included scripts and API docs. However, the registry metadata lists no required binaries or env vars, while SKILL.md requires Python 3.10+, Node.js 18+, Chrome, a DashScope LLM API key, and a browser UUID. This mismatch between declared requirements and actual instructions should be resolved before trusting the skill.
Instruction Scope
Runtime instructions direct the agent (or user) to clone a GitHub repo, build a Chrome extension, run a local server, and submit tasks that control the user's browser using a browser UUID. All of these are within the stated purpose. The SKILL.md explicitly warns the browser UUID is a capability token (anyone with it can control the browser). Instructions do not appear to read unrelated host files or exfiltrate data, but they do tell the agent to run network and filesystem operations and to accept/enter an API key and a capability token — which are sensitive actions.
Install Mechanism
No registry install spec is provided (instruction-only), but SKILL.md asks to git clone https://github.com/softpudding/OpenBrowser.git and run uv sync, npm install and build. Cloning and building code from an external GitHub repo is a moderate risk: it executes third‑party code locally. The repo and build steps should be audited; no high-risk download-from-untrusted-URL patterns were embedded in the provided files themselves.
Credentials
The skill runtime needs a DashScope API key and an OPENBROWSER_CHROME_UUID capability token, both of which are sensitive. The package metadata claimed no required env vars, so the skill's registry declaration understates the required secrets. Requesting an LLM API key and a browser capability token is proportionate to the capability, but the omission is concerning given that the browser UUID grants control of the user's browser. Both secrets should be explicitly declared and justified.
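Because both values are secrets, any wrapper script should avoid echoing them in full. The sketch below is illustrative only: `redact` and `describe_browser_token` are hypothetical helpers, and the only name taken from the skill's documentation is the `OPENBROWSER_CHROME_UUID` environment variable.

```python
import os
from collections.abc import Mapping


def redact(secret: str, keep: int = 4) -> str:
    """Mask all but the first `keep` characters of a secret for logging."""
    if len(secret) <= keep:
        return "*" * len(secret)
    return secret[:keep] + "*" * (len(secret) - keep)


def describe_browser_token(env: Mapping[str, str]) -> str:
    """Report whether the browser UUID is set without revealing it."""
    token = env.get("OPENBROWSER_CHROME_UUID", "")
    if not token:
        return "OPENBROWSER_CHROME_UUID is not set"
    return f"OPENBROWSER_CHROME_UUID={redact(token)}"


if __name__ == "__main__":
    print(describe_browser_token(os.environ))
```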
Persistence & Privilege
The skill does not request 'always: true' or other elevated platform privileges. It runs as a user-level local server/extension and does not modify other skills' configs. Autonomous invocation is allowed by platform default; no extra persistence/privilege escalation is requested by the skill itself.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install open-browser
- After installation, invoke the skill by name or use /open-browser
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Latest version with fixes and improvements
v1.0.1
- Adds support for browser UUID: All commands now require and document the use of a unique browser UUID token for security and remote control.
- Updated check_status and send_task usage: Script paths and command-line parameters now reflect the openclaw directory layout and require the --chrome-uuid argument or OPENBROWSER_CHROME_UUID environment variable.
- Expanded setup instructions: Manual steps now include copying the browser UUID from the extension and explain its security implications.
- Troubleshooting steps updated: Additional guidance for resolving invalid UUID and extension connectivity issues.
- Documentation refreshed: All examples, commands, and verification steps updated to use the new UUID-based workflow.
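The flag-or-environment behavior described in these notes could be resolved as in the sketch below. It is an assumption-laden illustration, not the scripts' actual code: the function name is hypothetical, and the precedence (flag first, then environment variable) is assumed, since the changelog does not state which wins.

```python
import argparse
import os
from collections.abc import Mapping, Sequence


def resolve_chrome_uuid(argv: Sequence[str], env: Mapping[str, str]) -> str:
    """Return the browser UUID from --chrome-uuid, else from the environment."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--chrome-uuid", default=None)
    # Ignore any other arguments; this sketch resolves only the UUID.
    args, _unknown = parser.parse_known_args(list(argv))
    value = args.chrome_uuid or env.get("OPENBROWSER_CHROME_UUID")
    if not value:
        raise ValueError(
            "browser UUID missing: pass --chrome-uuid or set OPENBROWSER_CHROME_UUID"
        )
    return value


if __name__ == "__main__":
    import sys

    try:
        resolve_chrome_uuid(sys.argv[1:], os.environ)
        print("browser UUID found")
    except ValueError as exc:
        print(exc)
```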
v1.0.0
Initial release: Visual AI browser automation with context isolation, 100% pass rate on interactive web tasks
Frequently Asked Questions
What is OpenBrowser?
Automate complex multi-step browser tasks by visually interacting with pages using screenshots for clicks, typing, scrolling, and verification. It is an AI Agent Skill for Claude Code / OpenClaw, with 345 downloads so far.
How do I install OpenBrowser?
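Run "/install open-browser" in the OpenClaw or Claude Code chat to install the skill. Note that SKILL.md lists further prerequisites before the skill can run: Python 3.10+, Node.js 18+, Chrome, a DashScope LLM API key, and a browser UUID.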
Is OpenBrowser free?
Yes, OpenBrowser is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does OpenBrowser support?
OpenBrowser is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created OpenBrowser?
It is built and maintained by softpudding (@softpudding); the current version is v0.1.0.