swaylq

Browser Automation Ultra

by Sway Liu · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
528 Downloads · 0 Stars · 3 Active Installs · 1 Version
Install in OpenClaw
/install browser-automation-ultra
Description
Zero-token browser automation via Playwright scripts with CDP lock management and human-like interaction. Use when: (1) automating any browser-based workflow...
README (SKILL.md)

Browser Automation Ultra

Explore → Record → Replay → Fix. Convert expensive browser-tool interactions into zero-token Playwright scripts that reuse OpenClaw's Chrome session (cookies/login intact).

Prerequisites

Install Playwright (once per machine):

npm install -g playwright
# or in workspace: npm init -y && npm install playwright

No browser download needed — scripts connect to OpenClaw's existing Chrome via CDP.

Architecture

Chrome user-data: ~/.openclaw/browser/openclaw/user-data
       ↕ shared cookies/login (mutually exclusive CDP)
┌──────────────┐    ┌──────────────────┐
│ browser tool │ OR │ Playwright script │
│ (explore)    │    │ (zero token)      │
└──────────────┘    └──────────────────┘
       ↕ managed by browser-lock.sh

Only one CDP client can connect at a time. browser-lock.sh handles the mutex.

Setup

  1. Copy scripts/browser-lock.sh to your workspace scripts/ directory
  2. Copy scripts/utils/human-like.js to your workspace scripts/browser/utils/
  3. chmod +x scripts/browser-lock.sh
  4. Create scripts/browser/ for your automation scripts

Core Workflow

1. Explore (browser tool, costs tokens)

Use the OpenClaw browser tool (snapshot/act) to figure out a workflow. Note selectors, page flow, key waits.

2. Record (write a Playwright script)

Convert steps into a script. Save to scripts/browser/<verb>-<target>.js. Use the template pattern:

const { chromium } = require('playwright');
const { humanDelay, humanClick, humanType, humanThink, humanBrowse } = require('./utils/human-like');

function discoverCdpUrl() {
  // Read the port from the running Chrome's --remote-debugging-port flag.
  // Falls back to 18800 when no match is found or the `ps` pipeline fails.
  try {
    const { execSync } = require('child_process');
    const ps = execSync("ps aux | grep 'remote-debugging-port' | grep -v grep", { encoding: 'utf8' });
    const match = ps.match(/remote-debugging-port=(\d+)/);
    return `http://127.0.0.1:${match ? match[1] : '18800'}`;
  } catch { return 'http://127.0.0.1:18800'; }
}

async function main() {
  const browser = await chromium.connectOverCDP(discoverCdpUrl());
  const context = browser.contexts()[0]; // reuse existing context (cookies/login)
  const page = await context.newPage();
  try {
    // automation here — use human-like functions
    await page.goto('https://example.com', { waitUntil: 'networkidle', timeout: 30000 });
    await humanBrowse(page); // simulate looking at the page
    await humanClick(page, 'button.submit');
    await humanType(page, 'input[name="title"]', 'Hello World');
  } finally {
    await page.close(); // NEVER browser.close() — kills entire Chrome
  }
}
main().then(() => process.exit(0)).catch(e => { console.error('❌', e.message); process.exit(1); });
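The template finds the port from the process list only, while the Environment Variables section lists a CDP_PORT override. A minimal sketch of one way to combine the two, assuming CDP_PORT holds a plain port number; `parseDebugPort` and `resolveCdpUrl` are hypothetical helper names, not part of the skill:

```javascript
// Hypothetical helpers (not part of the skill): choose a CDP port from an
// explicit override, else from process-list text, else the 18800 fallback.
const DEFAULT_PORT = '18800'; // same fallback the template uses

function parseDebugPort(psOutput) {
  // Matches e.g. "--remote-debugging-port=9222" in a process listing.
  const match = /remote-debugging-port=(\d+)/.exec(psOutput || '');
  return match ? match[1] : null;
}

function resolveCdpUrl(env, psOutput) {
  // Assumption: CDP_PORT, when set, is a plain decimal port number.
  const override = /^\d+$/.test(env.CDP_PORT || '') ? env.CDP_PORT : null;
  const port = override || parseDebugPort(psOutput) || DEFAULT_PORT;
  return `http://127.0.0.1:${port}`;
}
```

Keeping the parsing separate from the `execSync` call makes the port logic testable without a running Chrome; in a script it could be called as `resolveCdpUrl(process.env, psText)`.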

3. Replay (zero tokens)

./scripts/browser-lock.sh run scripts/browser/my-task.js [args]
./scripts/browser-lock.sh run --timeout 120 scripts/browser/my-task.js

4. Fix (on error)

  1. Read script error output
  2. Re-explore the failing step with browser tool (snapshot) to check current UI
  3. Update script with corrected selectors/logic
  4. Retry

Never guess fixes blindly. Always re-explore the actual page state.

browser-lock.sh

Manages CDP mutex between OpenClaw browser and Playwright scripts.

./scripts/browser-lock.sh run <script.js> [args]    # acquire → run → release (300s default)
./scripts/browser-lock.sh run --timeout 120 <script> # custom timeout
./scripts/browser-lock.sh acquire                    # manual: stop OpenClaw browser, start Chrome
./scripts/browser-lock.sh release                    # manual: kill Chrome, release lock
./scripts/browser-lock.sh status                     # show state

Lock file: /tmp/openclaw-browser.lock. Stale locks auto-recover.
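How browser-lock.sh recovers stale locks is not shown here. One common approach, sketched below in JavaScript, assumes the lock file stores the owner's PID and treats the lock as stale when that process no longer exists. `isAlive`, `acquireLock`, and `releaseLock` are hypothetical names, and the shell script may work differently:

```javascript
const fs = require('fs');

// True if a process with this PID exists (signal 0 probes without killing).
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

// Returns true if the lock was taken, false if a live owner still holds it.
function acquireLock(lockPath, ownerPid = process.pid) {
  try {
    // The 'wx' flag makes creation fail with EEXIST if the file exists.
    fs.writeFileSync(lockPath, String(ownerPid), { flag: 'wx' });
    return true;
  } catch (err) {
    if (err.code !== 'EEXIST') throw err;
  }
  const holder = Number.parseInt(fs.readFileSync(lockPath, 'utf8'), 10);
  if (Number.isInteger(holder) && holder > 0 && isAlive(holder)) return false;
  // Stale lock: the recorded owner is gone, so take it over.
  // Note: this takeover is not atomic; two racing callers could both succeed.
  fs.writeFileSync(lockPath, String(ownerPid));
  return true;
}

function releaseLock(lockPath) {
  fs.rmSync(lockPath, { force: true });
}
```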

Anti-Detection Rules (MANDATORY)

All scripts must use human-like.js. See references/anti-detection.md for the full rule set.

Summary of critical rules:

| ❌ Banned | ✅ Required |
| --- | --- |
| waitForTimeout(3000) fixed delays | humanDelay(2000, 4000) random range |
| input.fill(text) instant fill | humanType(page, sel, text) char-by-char with typos |
| element.click() teleport click | humanClick(page, sel) bezier mouse path + hover |
| Direct page operation after load | humanBrowse(page) simulate reading first |
| nativeSetter.call() DOM injection | humanType() or humanFillContentEditable() |
| Fixed cron schedule | jitterWait(1, 10) random offset |

Exception: setInputFiles() for file uploads is allowed (no human simulation possible), but add random delays before/after.
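The "random delays before/after" rule for uploads can be sketched as a small wrapper. This is an illustration only: `randomDelayMs`, `sleep`, and `withRandomPauses` are hypothetical stand-ins, and the real humanDelay in human-like.js may be implemented differently.

```javascript
// Hypothetical stand-in for a random-range delay; not the skill's humanDelay.
function randomDelayMs(min, max, rng = Math.random) {
  if (!(min >= 0) || !(max >= min)) {
    throw new RangeError('expected 0 <= min <= max');
  }
  // Uniform integer in [min, max] for integer bounds.
  return Math.floor(min + rng() * (max - min + 1));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an action that cannot be humanized (such as a file upload) with a
// random pause on each side of it. The default bounds are illustrative.
async function withRandomPauses(action, min = 500, max = 1500) {
  await sleep(randomDelayMs(min, max));
  const result = await action();
  await sleep(randomDelayMs(min, max));
  return result;
}
```

Inside a script this might be used as `await withRandomPauses(() => page.setInputFiles('input[type=file]', filePath))`, where the selector and `filePath` are placeholders.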

human-like.js API

| Function | Purpose |
| --- | --- |
| humanDelay(min, max) | Random wait (ms) |
| humanThink(min, max) | Longer pause before form fills |
| humanClick(page, sel) | Bezier mouse move → hover → click with press/release jitter |
| humanType(page, sel, text, opts) | Char-by-char typing, normal distribution speed, 3% typo rate |
| humanFillContentEditable(page, sel, text) | For contenteditable divs (line-by-line Enter + humanType) |
| humanBrowse(page, opts) | Simulate page reading (scroll + mouse wander, 2-5s) |
| humanScroll(page, opts) | Random scroll with occasional reverse |
| jitterWait(minMin, maxMin) | Random delay in minutes for cron tasks |
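The humanType entry mentions normally distributed typing speed and a 3% typo rate. As a rough sketch of what that could mean, the functions below sample a per-keystroke delay with a Box-Muller transform and make a per-character typo decision. The mean, standard deviation, and clamping bounds are illustrative assumptions, and none of these function names come from human-like.js.

```javascript
// Box-Muller transform: one standard-normal sample from two uniform samples.
function sampleNormal(rng = Math.random) {
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never receives 0
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Per-keystroke delay in ms, clamped to [min, max] to cut off outliers.
// All numeric defaults here are assumptions for illustration.
function keystrokeDelayMs(rng = Math.random, mean = 120, stdDev = 40, min = 30, max = 400) {
  const raw = mean + stdDev * sampleNormal(rng);
  return Math.round(Math.min(max, Math.max(min, raw)));
}

// Per-character typo decision; 0.03 mirrors the 3% rate in the API table.
function shouldTypo(rng = Math.random, rate = 0.03) {
  return rng() < rate;
}
```

Passing the random source as a parameter keeps the sampling deterministic under test while defaulting to `Math.random` in normal use.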

Script Naming Convention

<verb>-<target>.js, e.g. publish-deviantart.js, read-inbox.js, reply-comment.js
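A workspace could enforce this convention with a small check. The sketch below assumes lowercase names where the target may itself contain hyphens (as in read-proton-latest.js); `SCRIPT_NAME` and `isValidScriptName` are hypothetical and not part of the skill.

```javascript
// Assumption: verb is lowercase letters; target is lowercase alphanumerics
// and may contain further hyphen-separated parts.
const SCRIPT_NAME = /^[a-z]+-[a-z0-9]+(?:-[a-z0-9]+)*\.js$/;

function isValidScriptName(fileName) {
  return SCRIPT_NAME.test(fileName);
}
```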

Example Scripts

Production-tested scripts in scripts/examples/. Copy to your workspace scripts/browser/ and adapt.

| Script | Platform | Function |
| --- | --- | --- |
| publish-deviantart.js | DeviantArt | Upload image, fill title/desc/tags, submit |
| publish-xiaohongshu.js | Xiaohongshu (小红书) | Publish image note with topic tag association via recommend list |
| publish-pinterest.js | Pinterest | Create pin with title/desc, select board |
| publish-behance.js | Behance | Upload project with title/desc/tags/categories |
| read-proton-latest.js | Proton Mail | Read inbox, output JSON list of emails |
| read-xhs-comments.js | Xiaohongshu (小红书) | Read notification comments, output JSON with reply button index |
| reply-xhs-comment.js | Xiaohongshu (小红书) | Reply to a specific comment by index |

Usage pattern:

# Copy examples to workspace
cp scripts/examples/*.js scripts/browser/
cp scripts/utils/human-like.js scripts/browser/utils/

# Run
./scripts/browser-lock.sh run scripts/browser/publish-deviantart.js image.png "Title" "Description" "tag1,tag2"
./scripts/browser-lock.sh run scripts/browser/read-xhs-comments.js --limit 10
./scripts/browser-lock.sh run scripts/browser/reply-xhs-comment.js 0 "回复文字"

All example scripts already use human-like.js for anti-detection.
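The usage commands pass both a flag (`--limit 10`) and positional values (an index and a reply text). How the example scripts parse these is not shown; one way to handle that shape in a script of your own is Node's built-in `util.parseArgs`. `parseCliArgs` is a hypothetical helper:

```javascript
const { parseArgs } = require('node:util'); // available in Node 18.3+

// Hypothetical CLI shape modeled on the commands above:
//   <script> [--limit N] [positional...]
function parseCliArgs(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: { limit: { type: 'string' } },
    allowPositionals: true,
  });
  // null means the flag was not given; otherwise require a positive integer.
  const limit = values.limit === undefined ? null : Number(values.limit);
  if (limit !== null && !(Number.isInteger(limit) && limit > 0)) {
    throw new RangeError('--limit must be a positive integer');
  }
  return { limit, positionals };
}
```

In a script the call would typically be `parseCliArgs(process.argv.slice(2))`.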

Cron Integration

cd /path/to/workspace && ./scripts/browser-lock.sh run scripts/browser/task.js

Add jitterWait() at script start to randomize execution time.
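A random offset in minutes can be sketched as follows. This is a hypothetical stand-in, not the jitterWait from human-like.js: `jitterMs` computes the offset and `jitterWaitSketch` sleeps for it.

```javascript
// Hypothetical stand-in for jitterWait(minMin, maxMin); the real helper in
// human-like.js may differ. Returns a delay in milliseconds.
function jitterMs(minMinutes, maxMinutes, rng = Math.random) {
  if (!(minMinutes >= 0) || !(maxMinutes >= minMinutes)) {
    throw new RangeError('expected 0 <= minMinutes <= maxMinutes');
  }
  const minutes = minMinutes + rng() * (maxMinutes - minMinutes);
  return Math.round(minutes * 60 * 1000);
}

// Resolves after a random offset between the two bounds.
function jitterWaitSketch(minMinutes, maxMinutes) {
  const ms = jitterMs(minMinutes, maxMinutes);
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

If the `--timeout` budget of browser-lock.sh covers the script's whole run (the source does not say), a jitter of up to 10 minutes would exceed the 300s default, so the timeout may need raising accordingly.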

Troubleshooting

| Problem | Fix |
| --- | --- |
| Lock held by PID xxx | ./scripts/browser-lock.sh release |
| CDP connection timeout | Ensure acquire was called / Chrome is running |
| Login expired | Use browser tool to re-login, then run script |
| Selector not found | Re-explore with browser tool, update script |
| Script timeout | Increase with --timeout flag |

Environment Variables

| Var | Default | Description |
| --- | --- | --- |
| CDP_PORT | auto-discover | Override CDP port |
| CHROME_BIN | auto-detect | Chrome binary path |
| HEADLESS | auto | true/false to force headless |
Usage Guidance
What to consider before installing:

- This skill will (if you run its scripts) start/stop Chrome and reuse your OpenClaw browser profile (~/.openclaw/browser/openclaw/user-data). That means any accounts already logged in to that profile (Google/Adobe/ProtonMail/etc.) are available to the scripts. Treat that as equivalent to giving the scripts access to those accounts.
- Several example scripts read sensitive content (e.g., read-proton-latest.js reads email body and prints it). If the agent sends script output to external services or logs, sensitive data could be exfiltrated.
- The skill includes a lock manager (browser-lock.sh) that kills processes and writes /tmp lock/pid files. Review it carefully: running it will interrupt any existing browser sessions.
- The skill metadata did not declare the user-data/config paths or process-manipulation behavior. That mismatch is a red flag: inspect and understand the scripts before running them.
- If you want this functionality but want to reduce risk: create a dedicated browser profile (separate user-data-dir) with no sensitive logins and point the scripts to it; run Playwright and these scripts inside a disposable VM, container, or sandbox; remove or modify scripts that print sensitive data (e.g., the Proton Mail reader) or that you don't need; do not run as root.
- If you do not fully trust the publisher (source unknown), do NOT run these scripts against your real OpenClaw profile. Audit the code (especially any page.evaluate calls) and test in an isolated environment first.
Capability Analysis
Type: OpenClaw Skill
Name: browser-automation-ultra
Version: 1.0.0

The skill is classified as suspicious due to its reliance on `child_process.execSync` in JavaScript files and direct execution of user-provided scripts via `node "$@"` in `scripts/browser-lock.sh`. While these capabilities are necessary for the skill's stated purpose of browser automation (running Playwright scripts), they introduce a significant vulnerability. If an AI agent or a malicious user were to inject arbitrary commands or paths into the script arguments, it could lead to arbitrary code execution. There is no evidence of intentional malicious behavior (e.g., data exfiltration, backdoors, obfuscation for hiding harmful actions) within the provided code or instructions; the risk stems from the powerful, unconstrained execution capabilities that could be exploited.
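One way to narrow the path-injection risk described here is to check a requested script path against an allow-listed directory before running it. The sketch below is a hypothetical guard, not something the skill ships; `isAllowedScript` is an assumed name, and the check is purely lexical (it does not resolve symlinks).

```javascript
const path = require('path');

// Hypothetical guard: accept only .js files that resolve inside allowedDir.
// Lexical check only; resolving symlinks would need fs.realpathSync as well.
function isAllowedScript(allowedDir, candidate) {
  const root = path.resolve(allowedDir);
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  const escapes =
    rel === '' || // the directory itself, not a file in it
    rel === '..' ||
    rel.startsWith('..' + path.sep) ||
    path.isAbsolute(rel); // e.g. a different drive on Windows
  return !escapes && path.extname(target) === '.js';
}
```

A wrapper could call `isAllowedScript('scripts/browser', requestedPath)` and refuse to launch anything that fails the check.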
Capability Assessment
Purpose & Capability
Name/description promise (Playwright-based browser automation reusing OpenClaw's Chrome session) matches the included scripts (Playwright templates, human-like utilities, site-specific publishers). However the skill implicitly requires access to the user's OpenClaw Chrome user-data (~/.openclaw/browser/openclaw/user-data) and to kill/start local Chrome processes — these sensitive capabilities are not declared in the skill metadata (no required config paths or env vars), which is an incoherence and increases risk.
Instruction Scope
Runtime instructions and included scripts instruct the agent (and user) to stop/restart Chrome, start a standalone Chrome using the OpenClaw user-data directory, and connect over CDP; scripts use ps/curl/child_process.execSync, read DOM (including email if on Proton Mail), write screenshots to /tmp, and print extracted content to stdout. That scope goes beyond simple page navigation and includes reading potentially highly sensitive content (e.g., Proton Mail inbox) and manipulating local processes. The SKILL.md does not clearly warn about these sensitive effects.
Install Mechanism
No install spec is present (instruction-only), and Playwright install is the only recommended step (npm). No remote downloads or archive extraction are used by the skill package itself. This is a lower installation risk relative to arbitrary binary downloads.
Credentials
The skill declares no required env vars or config paths, yet the scripts implicitly depend on and operate on local resources: the OpenClaw user-data directory (~/.openclaw/browser/openclaw/user-data), /tmp lock/pid files, local Chrome binary, and optional CDP_PORT env var. Scripts access local filesystem (setInputFiles, fs.existsSync), process list, and can kill processes. Access to user browser profile effectively grants access to all logged-in accounts (cookies/sessions); this is a high-sensitivity capability that should have been declared and justified.
Persistence & Privilege
always:false (no forced permanent presence). That said, the skill's scripts stop/kill existing browser processes, start a standalone Chrome using the user's profile, and write lock/pid files under /tmp. Those runtime privileges let the skill interrupt other agent/browser activity and run arbitrary Playwright code with the user's session context; this is powerful but consistent with the skill's stated goal. There is no evidence the skill modifies other skills' configs.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install browser-automation-ultra
  3. After installation, invoke the skill by name or use /browser-automation-ultra
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: zero-token Playwright browser automation with CDP lock management, human-like anti-detection, and example scripts for DeviantArt, Pinterest, Behance, 小红书, Proton Mail
Metadata
Slug browser-automation-ultra
Version 1.0.0
License
All-time Installs 3
Active Installs 3
Total Versions 1
Frequently Asked Questions

What is Browser Automation Ultra?

Zero-token browser automation via Playwright scripts with CDP lock management and human-like interaction. Use when: (1) automating any browser-based workflow... It is an AI Agent Skill for Claude Code / OpenClaw, with 528 downloads so far.

How do I install Browser Automation Ultra?

Run "/install browser-automation-ultra" in the OpenClaw or Claude Code chat to install it in one step. The README additionally lists a one-time Playwright install (npm install -g playwright) as a prerequisite.

Is Browser Automation Ultra free?

Yes, Browser Automation Ultra is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Browser Automation Ultra support?

Browser Automation Ultra is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Browser Automation Ultra?

It is built and maintained by Sway Liu (@swaylq); the current version is v1.0.0.
