Description

Zero-token browser automation via Playwright scripts with CDP lock management and human-like interaction. Use when: (1) automating any browser-based workflow...

README (SKILL.md)

Browser Automation Ultra

Name: Browser Automation Ultra
Author: swaylq

Explore → Record → Replay → Fix. Convert expensive browser-tool interactions into zero-token Playwright scripts that reuse OpenClaw's Chrome session (cookies/login intact).

Prerequisites

Install Playwright (once per machine):

npm install -g playwright
# or in workspace: npm init -y && npm install playwright

No browser download needed — scripts connect to OpenClaw's existing Chrome via CDP.

Architecture

Chrome user-data: ~/.openclaw/browser/openclaw/user-data
       ↕ shared cookies/login (mutually exclusive CDP)
┌──────────────┐    ┌──────────────────┐
│ browser tool │ OR │ Playwright script │
│ (explore)    │    │ (zero token)      │
└──────────────┘    └──────────────────┘
       ↕ managed by browser-lock.sh

The skill treats CDP access as mutually exclusive: either the browser tool or a Playwright script drives Chrome at any one time. browser-lock.sh enforces this mutex.

Setup

1. Create scripts/browser/utils/ in your workspace (your automation scripts go in scripts/browser/)
2. Copy scripts/browser-lock.sh to your workspace scripts/ directory
3. Copy scripts/utils/human-like.js to your workspace scripts/browser/utils/
4. chmod +x scripts/browser-lock.sh

Core Workflow

1. Explore (browser tool, costs tokens)

Use the OpenClaw browser tool (snapshot/act) to figure out a workflow. Note selectors, page flow, key waits.

2. Record (write a Playwright script)

Convert steps into a script. Save to scripts/browser/<verb>-<target>.js. Use the template pattern:

const { chromium } = require('playwright');
const { humanDelay, humanClick, humanType, humanThink, humanBrowse } = require('./utils/human-like');

function discoverCdpUrl() {
  try {
    const { execSync } = require('child_process');
    const ps = execSync("ps aux | grep 'remote-debugging-port' | grep -v grep", { encoding: 'utf8' });
    const match = ps.match(/remote-debugging-port=(\d+)/);
    return `http://127.0.0.1:${match ? match[1] : '18800'}`;
  } catch { return 'http://127.0.0.1:18800'; }
}

async function main() {
  const browser = await chromium.connectOverCDP(discoverCdpUrl());
  const context = browser.contexts()[0]; // reuse existing context (cookies/login)
  const page = await context.newPage();
  try {
    // automation here — use human-like functions
    await page.goto('https://example.com', { waitUntil: 'networkidle', timeout: 30000 });
    await humanBrowse(page); // simulate looking at the page
    await humanClick(page, 'button.submit');
    await humanType(page, 'input[name="title"]', 'Hello World');
  } finally {
    await page.close(); // NEVER browser.close() — kills entire Chrome
  }
}
main().then(() => process.exit(0)).catch(e => { console.error('❌', e.message); process.exit(1); });

3. Replay (zero tokens)

./scripts/browser-lock.sh run scripts/browser/my-task.js [args]
./scripts/browser-lock.sh run --timeout 120 scripts/browser/my-task.js

4. Fix (on error)

1. Read the script's error output
2. Re-explore the failing step with the browser tool (snapshot) to check the current UI
3. Update the script with corrected selectors/logic
4. Retry

Never guess fixes blindly. Always re-explore the actual page state.
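Reading the error output is easier when the message names the step that failed. A minimal sketch of one way to do that, assuming a small wrapper around each stage of a script; `step` is a hypothetical helper and is not part of human-like.js:

```javascript
// Hypothetical helper (not part of this skill): run one stage of a script and,
// if it throws, re-throw with the stage name so the error output shows where to re-explore.
async function step(name, fn) {
  try {
    return await fn();
  } catch (err) {
    // Keep the original error as `cause` so its stack is still available.
    throw new Error(`step "${name}" failed: ${err.message}`, { cause: err });
  }
}

// Usage inside main(), with the calls from the template above:
//   await step('open page', () => page.goto('https://example.com'));
//   await step('click submit', () => humanClick(page, 'button.submit'));
```

With this pattern a failure prints something like `step "click submit" failed: ...`, which points at the exact part of the page to re-check with the browser tool.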

browser-lock.sh

Manages CDP mutex between OpenClaw browser and Playwright scripts.

./scripts/browser-lock.sh run <script.js> [args]    # acquire → run → release (300s default)
./scripts/browser-lock.sh run --timeout 120 <script> # custom timeout
./scripts/browser-lock.sh acquire                    # manual: stop OpenClaw browser, start Chrome
./scripts/browser-lock.sh release                    # manual: kill Chrome, release lock
./scripts/browser-lock.sh status                     # show state

Lock file: /tmp/openclaw-browser.lock. Stale locks auto-recover.
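How browser-lock.sh detects a stale lock is not shown here. As a rough illustration of the general technique, the sketch below assumes the lock file holds the owner's PID (the "Lock held by PID xxx" message suggests this, but it is an assumption) and treats the lock as stale when that process no longer exists. `lockState` and `pidIsAlive` are hypothetical names:

```javascript
const fs = require('fs');

// Liveness probe: signal 0 performs the existence check without sending a signal.
function pidIsAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

// Returns 'free', 'held', or 'stale'.
// ASSUMPTION: the lock file contains the owning PID as plain text.
function lockState(lockPath, isAlive = pidIsAlive) {
  let raw;
  try {
    raw = fs.readFileSync(lockPath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return 'free'; // no lock file at all
    throw err;
  }
  const pid = Number.parseInt(raw.trim(), 10);
  if (!Number.isInteger(pid) || pid <= 0) return 'stale'; // unreadable contents
  return isAlive(pid) ? 'held' : 'stale';
}

// Usage: console.log(lockState('/tmp/openclaw-browser.lock'));
```

The liveness check is injectable so the logic can be exercised without creating or killing real processes.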

Anti-Detection Rules (MANDATORY)

All scripts must use human-like.js. See references/anti-detection.md for the full rule set.

Summary of critical rules:

❌ Banned	✅ Required
`waitForTimeout(3000)` fixed delays	`humanDelay(2000, 4000)` random range
`input.fill(text)` instant fill	`humanType(page, sel, text)` char-by-char with typos
`element.click()` teleport click	`humanClick(page, sel)` bezier mouse path + hover
Direct page operation after load	`humanBrowse(page)` simulate reading first
`nativeSetter.call()` DOM injection	`humanType()` or `humanFillContentEditable()`
Fixed cron schedule	`jitterWait(1, 10)` random offset

Exception: setInputFiles() for file uploads is allowed (no human simulation possible), but add random delays before/after.
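The upload exception could look roughly like this. It is a sketch only: `uploadWithPauses` is a hypothetical wrapper, `randomDelay` stands in for `humanDelay` so the block is self-contained, and the millisecond ranges are illustrative values, not ones taken from the skill:

```javascript
// Stand-in for humanDelay from human-like.js: wait a random time in [minMs, maxMs].
function randomDelay(minMs, maxMs) {
  const ms = minMs + Math.random() * (maxMs - minMs);
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Attach a file with a random pause on each side of the instant setInputFiles call.
// `delay` is injectable, so a real script could pass humanDelay instead.
async function uploadWithPauses(page, selector, filePath, delay = randomDelay) {
  await delay(800, 2000);                       // hesitate before picking the file (illustrative range)
  await page.setInputFiles(selector, filePath); // Playwright's file-input API
  await delay(1500, 3500);                      // pause before the next action (illustrative range)
}

// Usage inside main():
//   await uploadWithPauses(page, 'input[type="file"]', 'image.png', humanDelay);
```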

human-like.js API

Function	Purpose
`humanDelay(min, max)`	Random wait (ms)
`humanThink(min, max)`	Longer pause before form fills
`humanClick(page, sel)`	Bezier mouse move → hover → click with press/release jitter
`humanType(page, sel, text, opts)`	Char-by-char typing, normal distribution speed, 3% typo rate
`humanFillContentEditable(page, sel, text)`	For contenteditable divs (line-by-line Enter + humanType)
`humanBrowse(page, opts)`	Simulate page reading (scroll + mouse wander, 2-5s)
`humanScroll(page, opts)`	Random scroll with occasional reverse
`jitterWait(minMin, maxMin)`	Random delay in minutes for cron tasks
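The table describes `humanType` as using normally distributed speed and a 3% typo rate, but the implementation is not reproduced here. The sketch below shows one way such timing could be generated, using a Box-Muller transform. The function names, the 120 ms mean, the 40 ms standard deviation, and the 30 ms floor are all assumptions for illustration; only the 3% typo rate comes from the table:

```javascript
// Box-Muller transform: draw one sample from a normal distribution.
function sampleNormal(mean, stdDev, rng = Math.random) {
  const u1 = 1 - rng(); // shift [0, 1) to (0, 1] so Math.log never receives 0
  const u2 = rng();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + stdDev * z;
}

// Build a per-character plan: how long to wait before each key, and whether
// to simulate a typo (wrong key, then backspace) before typing it.
function typingPlan(text, { meanMs = 120, stdDevMs = 40, typoRate = 0.03, rng = Math.random } = {}) {
  return [...text].map((char) => ({
    char,
    // Clamp to a floor so an unlucky sample never produces a zero or negative delay.
    delayMs: Math.max(30, Math.round(sampleNormal(meanMs, stdDevMs, rng))),
    typo: rng() < typoRate,
  }));
}

// Usage: console.log(typingPlan('Hello World'));
```

Separating the plan from the keystrokes keeps the randomness testable: passing a fixed `rng` makes the output deterministic.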

Script Naming Convention

<verb>-<target>.js, e.g. publish-deviantart.js, read-inbox.js, reply-comment.js
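If you want to enforce the convention, a check along these lines would do. The regular expression is an assumption about what counts as valid (lowercase verb, then one or more hyphen-separated lowercase segments, which also admits names like read-proton-latest.js), and `isValidScriptName` is a hypothetical helper:

```javascript
// ASSUMED pattern: lowercase verb, then one or more "-segment" parts, then ".js".
const SCRIPT_NAME = /^[a-z]+(?:-[a-z0-9]+)+\.js$/;

// True when a file name follows the verb-target naming convention.
function isValidScriptName(fileName) {
  return SCRIPT_NAME.test(fileName);
}

// Usage:
//   isValidScriptName('publish-deviantart.js')  // true
//   isValidScriptName('Publish.js')             // false
```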

Example Scripts

Production-tested scripts in scripts/examples/. Copy to your workspace scripts/browser/ and adapt.

Script	Platform	Function
`publish-deviantart.js`	DeviantArt	Upload image, fill title/desc/tags, submit
`publish-xiaohongshu.js`	小红书	Publish image note with topic tag association via recommend list
`publish-pinterest.js`	Pinterest	Create pin with title/desc, select board
`publish-behance.js`	Behance	Upload project with title/desc/tags/categories
`read-proton-latest.js`	Proton Mail	Read inbox, output JSON list of emails
`read-xhs-comments.js`	小红书	Read notification comments, output JSON with reply button index
`reply-xhs-comment.js`	小红书	Reply to a specific comment by index

Usage pattern:

# Copy examples to workspace
cp scripts/examples/*.js scripts/browser/
cp scripts/utils/human-like.js scripts/browser/utils/

# Run
./scripts/browser-lock.sh run scripts/browser/publish-deviantart.js image.png "Title" "Description" "tag1,tag2"
./scripts/browser-lock.sh run scripts/browser/read-xhs-comments.js --limit 10
./scripts/browser-lock.sh run scripts/browser/reply-xhs-comment.js 0 "回复文字"

All example scripts already use human-like.js for anti-detection.

Cron Integration

cd /path/to/workspace && ./scripts/browser-lock.sh run scripts/browser/task.js

Add jitterWait() at script start to randomize execution time.
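The real `jitterWait` lives in human-like.js and is not reproduced here. As an illustration of the idea, the sketch below assumes it picks a uniformly random delay between `minMin` and `maxMin` minutes and sleeps for that long; `jitterMs` and `jitterWaitSketch` are hypothetical names:

```javascript
// Pick a random delay, in milliseconds, between minMin and maxMin minutes.
function jitterMs(minMin, maxMin, rng = Math.random) {
  if (minMin > maxMin) throw new RangeError('minMin must not exceed maxMin');
  return Math.round((minMin + rng() * (maxMin - minMin)) * 60000);
}

// Sleep for the chosen delay; resolves with the number of milliseconds waited.
function jitterWaitSketch(minMin, maxMin) {
  const ms = jitterMs(minMin, maxMin);
  return new Promise((resolve) => setTimeout(() => resolve(ms), ms));
}

// Usage at the top of main(), so a cron job firing on the hour
// starts work somewhere between 1 and 10 minutes later:
//   await jitterWaitSketch(1, 10);
```

Because the jitter counts against the run time, the `--timeout` value passed to browser-lock.sh needs to cover the maximum jitter plus the script's own work.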

Troubleshooting

Problem	Fix
Lock held by PID xxx	`./scripts/browser-lock.sh release`
CDP connection timeout	Ensure `acquire` was called / Chrome is running
Login expired	Use browser tool to re-login, then run script
Selector not found	Re-explore with browser tool, update script
Script timeout	Increase with `--timeout` flag

Environment Variables

Var	Default	Description
`CDP_PORT`	auto-discover	Override CDP port
`CHROME_BIN`	auto-detect	Chrome binary path
`HEADLESS`	auto	`true`/`false` to force headless
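How the scripts read `CDP_PORT` is not shown here. A minimal sketch, assuming the variable simply takes precedence over the `ps`-based discovery used in the template, with 18800 as the fallback from that template. `resolveCdpUrl` is a hypothetical name, and the process listing is passed in as a string so the function stays pure:

```javascript
// Resolve the CDP endpoint: CDP_PORT wins, then a port found in the process
// listing, then the template's default of 18800.
function resolveCdpUrl(env = process.env, psOutput = '') {
  const fromEnv = Number.parseInt(env.CDP_PORT ?? '', 10);
  if (Number.isInteger(fromEnv) && fromEnv > 0 && fromEnv < 65536) {
    return `http://127.0.0.1:${fromEnv}`;
  }
  const match = psOutput.match(/remote-debugging-port=(\d+)/);
  return `http://127.0.0.1:${match ? match[1] : '18800'}`;
}

// Usage:
//   CDP_PORT=9222 ./scripts/browser-lock.sh run scripts/browser/my-task.js
//   const browser = await chromium.connectOverCDP(resolveCdpUrl());
```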

Usage Guidance

What to consider before installing:

- This skill will (if you run its scripts) start/stop Chrome and reuse your OpenClaw browser profile (~/.openclaw/browser/openclaw/user-data). That means any accounts already logged in to that profile (Google/Adobe/ProtonMail/etc.) are available to the scripts. Treat that as equivalent to giving the scripts access to those accounts.
- Several example scripts read sensitive content (e.g., read-proton-latest.js reads email body and prints it). If the agent sends script output to external services or logs, sensitive data could be exfiltrated.
- The skill includes a lock manager (browser-lock.sh) that kills processes and writes /tmp lock/pid files. Review it carefully: running it will interrupt any existing browser sessions.
- The skill metadata did not declare the user-data/config paths or process-manipulation behavior. That mismatch is a red flag: inspect and understand the scripts before running them.
- If you want this functionality but want to reduce risk:
  - create a dedicated browser profile (separate user-data-dir) with no sensitive logins and point the scripts to it;
  - run Playwright and these scripts inside a disposable VM, container, or sandbox;
  - remove or modify scripts that print sensitive data (e.g., the Proton Mail reader) or that you don't need;
  - do not run as root.
- If you do not fully trust the publisher (source unknown), do NOT run these scripts against your real OpenClaw profile. Audit the code (especially any page.evaluate calls) and test in an isolated environment first.

Capability Analysis

Type: OpenClaw Skill
Name: browser-automation-ultra
Version: 1.0.0

The skill is classified as suspicious due to its reliance on `child_process.execSync` in JavaScript files and direct execution of user-provided scripts via `node "$@"` in `scripts/browser-lock.sh`. While these capabilities are necessary for the skill's stated purpose of browser automation (running Playwright scripts), they introduce a significant vulnerability. If an AI agent or a malicious user were to inject arbitrary commands or paths into the script arguments, it could lead to arbitrary code execution. There is no evidence of intentional malicious behavior (e.g., data exfiltration, backdoors, obfuscation for hiding harmful actions) within the provided code or instructions; the risk stems from the powerful, unconstrained execution capabilities that could be exploited.

Capability Assessment

⚠ Purpose & Capability

The name and description (Playwright-based browser automation reusing OpenClaw's Chrome session) match the included scripts (Playwright templates, human-like utilities, site-specific publishers). However, the skill implicitly requires access to the user's OpenClaw Chrome user-data (~/.openclaw/browser/openclaw/user-data) and the ability to kill and start local Chrome processes. These sensitive capabilities are not declared in the skill metadata (no required config paths or env vars), an inconsistency that increases risk.

⚠ Instruction Scope

Runtime instructions and included scripts instruct the agent (and user) to stop/restart Chrome, start a standalone Chrome using the OpenClaw user-data directory, and connect over CDP; scripts use ps/curl/child_process.execSync, read DOM (including email if on Proton Mail), write screenshots to /tmp, and print extracted content to stdout. That scope goes beyond simple page navigation and includes reading potentially highly sensitive content (e.g., Proton Mail inbox) and manipulating local processes. The SKILL.md does not clearly warn about these sensitive effects.

✓ Install Mechanism

No install spec is present (instruction-only), and Playwright install is the only recommended step (npm). No remote downloads or archive extraction are used by the skill package itself. This is a lower installation risk relative to arbitrary binary downloads.

⚠ Credentials

The skill declares no required env vars or config paths, yet the scripts implicitly depend on and operate on local resources: the OpenClaw user-data directory (~/.openclaw/browser/openclaw/user-data), /tmp lock/pid files, local Chrome binary, and optional CDP_PORT env var. Scripts access local filesystem (setInputFiles, fs.existsSync), process list, and can kill processes. Access to user browser profile effectively grants access to all logged-in accounts (cookies/sessions); this is a high-sensitivity capability that should have been declared and justified.

ℹ Persistence & Privilege

always:false (no forced permanent presence). That said, the skill's scripts stop/kill existing browser processes, start a standalone Chrome using the user's profile, and write lock/pid files under /tmp. Those runtime privileges let the skill interrupt other agent/browser activity and run arbitrary Playwright code with the user's session context; this is powerful but consistent with the skill's stated goal. There is no evidence the skill modifies other skills' configs.

Version History

v1.0.0

Initial release: zero-token Playwright browser automation with CDP lock management, human-like anti-detection, and example scripts for DeviantArt, Pinterest, Behance, 小红书, Proton Mail

Metadata

Slug browser-automation-ultra

Version 1.0.0

License —

All-time Installs 3

Active Installs 3

Total Versions 1

Frequently Asked Questions

What is Browser Automation Ultra?

Zero-token browser automation via Playwright scripts with CDP lock management and human-like interaction. Use when: (1) automating any browser-based workflow... It is an AI Agent Skill for Claude Code / OpenClaw, with 528 downloads so far.

How do I install Browser Automation Ultra?

Run "/install browser-automation-ultra" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Browser Automation Ultra free?

Yes, Browser Automation Ultra is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Browser Automation Ultra support?

Browser Automation Ultra is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Browser Automation Ultra?

It is built and maintained by Sway Liu (@swaylq); the current version is v1.0.0.
