Description

Core browser automation library for OpenClaw agents. Provides reusable navigation, interaction, and capture capabilities for both Facet (Onshape learning) an...

Usage Instructions (SKILL.md)

Browser Automation Core Skill

Name: Browser Automation Core
Author: stefanferreira

Purpose

A reusable browser automation library that provides common web interaction capabilities for multiple OpenClaw agents. Designed to be extended by agent-specific skills while maintaining a single, well-tested core.

Primary Users

Facet - Onshape CAD learning automation
Ace - Competition entry and form filling
Future agents - Any web automation needs

Architecture

browser-automation-core/          # This skill
├── navigation/                   # URL loading, waiting
├── interaction/                  # Click, type, select
├── capture/                      # Screenshot, HTML capture
├── forms/                        # Form detection and filling
└── sessions/                     # Tab/window management

facets-browser-learning/          # Facet-specific extension
└── uses core + Onshape-specific logic

ace-competition-automation/       # Ace-specific extension  
└── uses core + competition-specific logic

Core Capabilities

Navigation

URL loading with timeout and retry
Wait conditions (element visible, page loaded)
History management (back, forward, refresh)
Tab/window control (open, close, switch)
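
The wait conditions listed above usually come down to polling a predicate until it holds or a timeout expires. The following is a minimal sketch of that pattern, not the skill's actual implementation; the name `wait_until` and its parameters are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 30.0,
               interval: float = 0.25) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass a predicate such as "element is visible" or "page finished loading"; a monotonic clock keeps the timeout stable if the system clock changes.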

Interaction

Element finding (CSS selectors, XPath, text)
Basic actions (click, type, clear, submit)
Mouse operations (hover, drag, scroll)
Keyboard operations (key presses, shortcuts)

Capture

Screenshots (full page, element, viewport)
HTML capture (page source, element HTML)
Text extraction (visible text, attributes)
Performance metrics (load times, resources)

Forms

Form detection (find all forms on page)
Field mapping (match fields to data)
Validation (required fields, formats)
Submission (submit buttons, AJAX handling)
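
Field mapping can be approached by normalizing form field names and comparing them against known fragments. The sketch below assumes a hand-written alias table; `ALIASES`, `normalize`, and `map_fields` are hypothetical names and are not part of the documented API.

```python
import re

# Hypothetical alias table: data key -> fragments that may appear in a field name
ALIASES = {
    "name": ("name", "fullname"),
    "email": ("email", "mail"),
    "phone": ("phone", "tel", "mobile"),
}


def normalize(field_name):
    """Lowercase the field name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", field_name.lower())


def map_fields(field_names, data):
    """Return {form field name: value} for every field that matches a data key."""
    mapping = {}
    for field in field_names:
        key = normalize(field)
        for data_key, fragments in ALIASES.items():
            if data_key in data and any(frag in key for frag in fragments):
                mapping[field] = data[data_key]
                break
    return mapping
```

Substring matching is deliberately loose (a field called "username" would also match "name"), so a real mapping step would likely need stricter rules per site.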

Sessions

Cookie management (save/load sessions)
Authentication state (login persistence)
Profile management (user agent, viewport)
Cleanup (close browsers, clear data)
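
Saving and loading sessions can be as simple as serializing the cookie list to a JSON file. This is a sketch under that assumption; the function names and the file layout are hypothetical, and the skill may store sessions differently.

```python
import json
from pathlib import Path


def save_session(path, cookies):
    """Write a list of cookie dicts to `path` as JSON, creating parent directories."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_session(path):
    """Return the saved cookie list, or an empty list if no session file exists."""
    source = Path(path)
    if not source.exists():
        return []
    return json.loads(source.read_text(encoding="utf-8"))
```

Session files contain authentication state, so they should be stored with the same care as credentials.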

Quick Start

Installation

# Install from ClawHub (when published)
npx clawhub@latest install browser-automation-core

# Or use local development version
cp -r /path/to/skill /root/.openclaw/workspace/skills/

Basic Usage

# Example: Navigate and take screenshot
from browser_core import BrowserAutomation

browser = BrowserAutomation()
browser.navigate("https://example.com")
browser.take_screenshot("example.png")
browser.close()

Agent-Specific Examples

For Facet (Onshape Learning)

from browser_core import BrowserAutomation
from facets_onshape import OnshapeAutomation

browser = BrowserAutomation()
onshape = OnshapeAutomation(browser)

# Login to Onshape
onshape.login(email="[email protected]", password="***")

# Navigate to tutorials
onshape.navigate_to_tutorial("getting-started")

# Complete tutorial steps
onshape.complete_tutorial_step(1)
onshape.take_progress_screenshot()

For Ace (Competition Entry)

from browser_core import BrowserAutomation
from ace_competition import CompetitionAutomation

browser = BrowserAutomation()
competition = CompetitionAutomation(browser)

# Navigate to competition
competition.navigate_to_competition("https://competition.example.com")

# Fill entry form
entry_data = {
    "name": "Stef Ferreira",
    "email": "[email protected]",
    "phone": "+27726386189"
}
competition.fill_entry_form(entry_data)

# Submit and capture proof
competition.submit_entry()
competition.capture_submission_proof()

Configuration

Environment Variables

# Browser settings
export BROWSER_HEADLESS="true"           # Run without display
export BROWSER_TIMEOUT="30"              # Default timeout seconds
export BROWSER_VIEWPORT="1280,720"       # Window size
export BROWSER_USER_AGENT="OpenClaw Agent" # Custom user agent

# CDP settings (OpenClaw browser)
export CDP_URL="http://localhost:18800/json"
export CDP_WEBSOCKET="ws://localhost:18800/devtools/page/..."

# Storage settings
export SCREENSHOT_DIR="/path/to/screenshots"
export SESSION_DIR="/path/to/sessions"
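
One way these variables could be read into typed settings is sketched below. The `BrowserConfig` class and `load_config` function are assumptions for illustration; only the variable names and default values come from the listing above.

```python
import os
from dataclasses import dataclass


@dataclass
class BrowserConfig:
    headless: bool
    timeout: float
    viewport: tuple
    user_agent: str


def load_config(env=None):
    """Build a BrowserConfig from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # BROWSER_VIEWPORT is "width,height", for example "1280,720"
    width, height = (int(part) for part in env.get("BROWSER_VIEWPORT", "1280,720").split(","))
    return BrowserConfig(
        headless=env.get("BROWSER_HEADLESS", "true").strip().lower() in ("1", "true", "yes"),
        timeout=float(env.get("BROWSER_TIMEOUT", "30")),
        viewport=(width, height),
        user_agent=env.get("BROWSER_USER_AGENT", "OpenClaw Agent"),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to test without touching the real environment.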

OpenClaw Integration

{
  "skills": {
    "browser-automation-core": {
      "enabled": true,
      "config": {
        "cdpUrl": "http://localhost:18800/json",
        "headless": true,
        "timeout": 30,
        "screenshotDir": "/root/.openclaw/workspace/screenshots"
      }
    }
  }
}

Implementation Details

CDP (Chrome DevTools Protocol)

This skill uses OpenClaw's built-in browser via CDP:

Connection: WebSocket to ws://localhost:18800/devtools/page/...
Commands: Standard CDP methods (Page.navigate, DOM.querySelector, etc.)
Events: Async event handling for page loads, network requests
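
CDP messages are JSON objects: a command carries an `id`, a `method`, and `params`; the matching response repeats the `id`, while events carry a `method` but no `id`. The sketch below only builds and classifies such messages and does not open a WebSocket; `CommandEncoder` and `classify` are hypothetical names.

```python
import itertools
import json


class CommandEncoder:
    """Assigns increasing ids so responses can be matched to commands."""

    def __init__(self):
        self._ids = itertools.count(1)

    def encode(self, method, params=None):
        """Serialize one CDP command as a JSON string."""
        return json.dumps({"id": next(self._ids), "method": method, "params": params or {}})


def classify(raw_message):
    """Return ("response", id) for command replies, ("event", method) otherwise."""
    message = json.loads(raw_message)
    if "id" in message:
        return "response", message["id"]
    return "event", message.get("method")
```

A client would send the encoded string over the WebSocket and use `classify` to route incoming messages either to the waiting caller or to an event handler.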

Error Handling

Retry logic: Automatic retry on network failures
Timeout management: Configurable timeouts per operation
Fallback strategies: Alternative selectors, different interaction methods
Recovery procedures: Page reload, session restore
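
The retry logic described above can be sketched as a small wrapper with exponential backoff. This illustrates the general pattern rather than the skill's own code; the `retry` function, its parameters, and the choice of exception types are assumptions.

```python
import time


def retry(operation, attempts=3, base_delay=0.5,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    The delay doubles after each failure: base_delay, 2 * base_delay, and so on.
    The last failure is re-raised so callers can apply a fallback strategy.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting.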

Performance

Connection pooling: Reuse WebSocket connections
Command batching: Batch CDP commands when possible
Caching: Cache page structure, element positions
Parallel operations: Async operations where safe

Extension Points

Creating Agent-Specific Extensions

1. Create Extension Skill

python3 /usr/lib/node_modules/openclaw/skills/skill-creator/scripts/init_skill.py facets-browser-learning

2. Import Core Library

# In extension skill
import sys
sys.path.append("/root/.openclaw/workspace/skills/browser-automation-core")
from browser_core import BrowserAutomation

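class OnshapeAutomation:
    def __init__(self, browser=None):
        # Accept a shared BrowserAutomation instance, or create one if none is given
        self.browser = browser if browser is not None else BrowserAutomation()

    def onshape_specific_method(self):
        # Use core capabilities
        self.browser.navigate("https://cad.onshape.com")
        # Add Onshape-specific logic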

3. Add Specialized Logic

Site-specific selectors (Onshape CSS classes, competition form fields)
Domain-specific workflows (tutorial navigation, competition rules)
Custom capture requirements (progress tracking, entry proof)
Error handling for specific sites

Testing Strategy

Unit Tests

# Test core functionality
cd /root/.openclaw/workspace/skills/browser-automation-core
python3 -m pytest tests/unit/

Integration Tests

# Test with actual browser
python3 tests/integration/test_navigation.py
python3 tests/integration/test_forms.py

Agent-Specific Tests

# Test Facet extension
cd /root/.openclaw/workspace/skills/facets-browser-learning
python3 tests/test_onshape_automation.py

# Test Ace extension  
cd /root/.openclaw/workspace/skills/ace-competition-automation
python3 tests/test_competition_automation.py

Common Use Cases

Use Case 1: Form Filling (Ace)

# Competition entry automation
data = {
    "full_name": "Stef Ferreira",
    "email": "[email protected]",
    "phone": "+27726386189",
    "address": "123 Street, City, South Africa"
}

browser.navigate(competition_url)
browser.fill_form("form#entry-form", data)
browser.click("button[type='submit']")
browser.wait_for_element(".success-message")
browser.take_screenshot("entry_proof.png")

Use Case 2: Tutorial Navigation (Facet)

# Onshape learning automation
browser.navigate("https://cad.onshape.com")
browser.login(credentials)  # Custom method in extension
browser.navigate("/learning/tutorials")

# Complete tutorial
tutorial_steps = browser.extract_tutorial_steps()
for step in tutorial_steps:
    browser.complete_step(step)  # Custom method
    browser.take_screenshot(f"step_{step.number}.png")
    
browser.capture_certificate()

Use Case 3: Multi-Page Workflow

# Complex automation across multiple pages
browser.open_new_tab()
browser.navigate_to_login()
browser.login(credentials)

browser.switch_to_tab(0)
browser.fill_application_form(data)

browser.switch_to_tab(1)
browser.verify_email_confirmation()

browser.capture_all_tabs_screenshots()

Error Recovery Patterns

CORS Issues (Screenshots/Evaluate Not Working)

Problem: Browser automation fails with CORS errors when taking screenshots or evaluating JavaScript.

Solution: Ensure the browser is started with the --remote-allow-origins=* flag:

# Browser startup command must include:
--remote-debugging-port=18800 --remote-allow-origins=*

Verification:

curl http://localhost:18800/json/version
# Should return browser version info as JSON, confirming the debugging endpoint is reachable

Network Issues

try:
    browser.navigate(url)
except NetworkError:  # Assumed to be an exception type provided by browser_core
    browser.reload()
    browser.wait_for_network_idle()
    browser.navigate(url)  # Retry the operation once

Element Not Found

# Try multiple selectors
selectors = [
    "button.submit",
    "input[type='submit']",
    ".submit-button",
    "//button[contains(text(), 'Submit')]"
]

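for selector in selectors:
    if browser.element_exists(selector):
        browser.click(selector)
        break
else:
    # No selector matched; fail loudly instead of continuing silently
    raise RuntimeError("Submit button not found with any known selector")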
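
The selector list above mixes CSS selectors with an XPath expression, so a fallback helper may need to tell them apart before querying. The sketch below shows one way to do that; `selector_kind` and `first_matching` are hypothetical helpers, and the prefix heuristic is an assumption rather than documented behavior.

```python
def selector_kind(selector):
    """Guess whether a selector is XPath or CSS from its leading characters."""
    # XPath expressions typically start with "/", "./" or "("; CSS selectors do not
    if selector.startswith(("/", "./", "(")):
        return "xpath"
    return "css"


def first_matching(selectors, exists):
    """Return the first selector for which `exists(selector)` is true, else None."""
    for selector in selectors:
        if exists(selector):
            return selector
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing element is an error or a cue to try a different interaction method.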

Form Validation Errors

browser.submit_form()
if browser.has_validation_errors():
    errors = browser.get_validation_errors()
    for field, message in errors:
        browser.fix_field(field, message)
    browser.submit_form()  # Retry

Performance Optimization

Batch Operations

# Instead of sequential commands
browser.click("button1")
browser.click("button2")
browser.type("input1", "text")

# Use batch commands
commands = [
    {"method": "click", "selector": "button1"},
    {"method": "click", "selector": "button2"},
    {"method": "type", "selector": "input1", "text": "text"}
]
browser.execute_batch(commands)
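
A batch of this shape could be dispatched by looking up each `method` on the browser object and passing the remaining keys as keyword arguments. The sketch below uses a stand-in `RecordingBrowser` class so it runs on its own; both names and the dispatch strategy are assumptions, and it runs commands one after another rather than pipelining them over CDP.

```python
class RecordingBrowser:
    """Stand-in for a browser object; records calls instead of driving a page."""

    def __init__(self):
        self.log = []

    def click(self, selector):
        self.log.append(("click", selector))

    def type(self, selector, text):
        self.log.append(("type", selector, text))


def execute_batch(target, commands):
    """Run each command dict against `target` in order and collect the results."""
    results = []
    for command in commands:
        args = dict(command)          # copy so the caller's dict is not modified
        method = args.pop("method")
        handler = getattr(target, method, None)
        if method.startswith("_") or not callable(handler):
            raise ValueError(f"Unsupported batch method: {method}")
        results.append(handler(**args))
    return results
```

Rejecting names that start with an underscore keeps a batch from reaching private attributes of the target object.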

Caching Strategies

# Cache page structure
if not browser.has_cached_structure(url):
    structure = browser.extract_page_structure()
    browser.cache_structure(url, structure)

# Use cached selectors
selectors = browser.get_cached_selectors(url)
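
Cached page structure goes stale when a page changes, so a cache of this kind usually needs an expiry. The following sketch adds a time-to-live to a per-URL cache; `StructureCache` and its methods are hypothetical and are not the methods shown above.

```python
import time


class StructureCache:
    """Per-URL cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}

    def put(self, url, structure):
        """Store `structure` for `url` together with its expiry time."""
        self._entries[url] = (self._clock() + self._ttl, structure)

    def get(self, url):
        """Return the cached structure, or None if it is missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, structure = entry
        if self._clock() >= expires_at:
            del self._entries[url]
            return None
        return structure
```

The clock is injectable so expiry can be tested deterministically.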

Security Considerations

Credential Management

Never hardcode credentials in scripts
Use environment variables or secure storage
Implement credential rotation
Log credential usage (without exposing values)
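
The points above can be illustrated with a helper that reads credentials from environment variables and a masking function for log output. This is a sketch only; the variable names (`ONSHAPE_EMAIL`, `ONSHAPE_PASSWORD`) and both function names are assumptions, not settings defined by this skill.

```python
import os


def get_credentials(env=None, prefix="ONSHAPE"):
    """Read <PREFIX>_EMAIL and <PREFIX>_PASSWORD from the environment."""
    env = os.environ if env is None else env
    try:
        return env[f"{prefix}_EMAIL"], env[f"{prefix}_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None


def mask(secret, visible=2):
    """Return `secret` with all but the first `visible` characters replaced by '*'."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)
```

Logging `mask(email)` records that a credential was used without writing its value to the log.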

Session Isolation

Separate browser sessions per agent
Clear cookies and storage between sessions
Use incognito/private mode when possible
Implement session timeout

Input Validation

Validate all user inputs before browser interaction
Sanitize URLs to prevent navigation to malicious sites
Limit file system access from browser
Monitor for suspicious behavior
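
URL sanitization can be sketched as a scheme check plus a host allowlist applied before any navigation. The function name and the allowlist approach are assumptions for illustration; the skill does not document how it validates URLs.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_safe_url(url, allowed_hosts):
    """Return True only for http(s) URLs whose host is an allowed host or a subdomain of one."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = (parts.hostname or "").lower()
    return any(host == allowed or host.endswith("." + allowed)
               for allowed in allowed_hosts)
```

Matching on the parsed hostname rather than on the raw string avoids accepting look-alike hosts such as `onshape.com.evil.example`.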

Maintenance

Versioning

Semantic versioning (MAJOR.MINOR.PATCH)
Backward compatibility for minor versions
Deprecation warnings for breaking changes
Migration guides between major versions

Updates

Monthly security updates
Quarterly feature updates
Annual architecture reviews
Continuous integration testing

Monitoring

Usage statistics (which agents use which features)
Error rates and common failures
Performance metrics (load times, success rates)
Agent-specific success tracking

Contributing

Adding New Features

Check if feature belongs in core or extension
Write tests for new functionality
Update documentation
Submit pull request

Reporting Issues

Include agent context (Facet, Ace, etc.)
Provide reproduction steps
Include screenshots/logs
Suggest possible solutions

Extension Development

Follow core API patterns
Reuse existing utilities when possible
Write agent-specific tests
Document extension capabilities

References

CDP Documentation

Related Skills

facets-browser-learning - Facet extension
ace-competition-automation - Ace extension
browser-testing - Testing utilities

External Resources

Version: 1.0.0
Last Updated: 2026-03-30
Maintainer: Bob (OpenClaw Agent)
License: MIT
Status: Active Development

Security Recommendations

This skill appears to be what it says: a local browser automation core that talks to an OpenClaw browser/CDP on localhost and uses the openclaw CLI in some implementations. Before installing:

1) Verify the skill's provenance, since Source/Homepage are unknown.
2) Only run it in an environment where the OpenClaw browser/CDP at localhost:18800 is trusted and not publicly reachable (the code connects to localhost and writes screenshots/temp files).
3) Note that SKILL.md documents environment variables but the registry metadata lists none; confirm that the necessary env vars (CDP URLs, screenshot/session dirs) are set correctly.
4) Remove or replace example PII/credentials in example code before reuse.
5) Review subprocess usage (calls to the 'openclaw' CLI) if your environment restricts command execution.

If you need higher assurance, run the code in a sandboxed agent or inspect/validate the Python files locally before copying them into your OpenClaw workspace.

Functional Analysis

Type: OpenClaw Skill
Name: browser-automation-core
Version: 1.0.0

The browser-automation-core bundle is a legitimate library designed to provide reusable web interaction capabilities for OpenClaw agents like Facet and Ace. It implements browser control through two methods: a wrapper for the 'openclaw' CLI (lib/browser_cli.py) and a direct Chrome DevTools Protocol (CDP) integration via WebSockets (lib/browser_core.py). The code follows standard automation patterns for navigation, form filling, and screenshot capture, and includes clear documentation and test scripts. No evidence of data exfiltration, malicious persistence, or harmful prompt injection was found; the use of subprocesses and temporary files is strictly aligned with the stated purpose of automating browser tasks.

Capability Assessment

✓ Purpose & Capability

Name/description match the provided Python code: multiple implementations use CDP WebSocket and/or the openclaw CLI to navigate pages, capture screenshots, and manage sessions. The examples (Facet, Ace) and library files implement the advertised navigation/interaction/capture features and reference the local CDP endpoint (localhost:18800), which is consistent with a reusable browser automation core.

ℹ Instruction Scope

SKILL.md instructs the agent and user to connect to a local CDP/WebSocket, copy the skill into an OpenClaw workspace path, and set local environment variables (CDP_URL, CDP_WEBSOCKET, BROWSER_*). Those steps are reasonable for this functionality. Notes: SKILL.md contains example login usage (Onshape) and example PII-like data in examples; the README suggests writing config into the agent's skills config (expected). The instructions do not direct data to third‑party endpoints beyond the local CDP, nor do they request unrelated files, but they do reference /root/.openclaw paths which assume privilege to write there.

✓ Install Mechanism

No install spec is provided (instruction-only install), so nothing is downloaded or executed during install. The skill bundle includes Python source files, which is consistent with having no installer. This is a low-risk install model, but because the source is of unknown origin, the user should verify provenance before placing files in system paths.

ℹ Credentials

SKILL.md documents environment variables (CDP endpoints, BROWSER_* settings, screenshot/session dirs) that are appropriate for a browser automation library. However, the registry metadata lists no required env vars, which is a mild mismatch. No external API keys or unrelated credentials are required. Example files show an example email/password and other example PII; these are illustrative only but should be cleaned before production use.

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills. It suggests writing screenshots, temp files, and optionally adding its config under the agent's skills config (normal). It will invoke subprocesses (openclaw CLI) and open local WebSocket connections to localhost:18800 — expected for its purpose.

Version History

v1.0.0

Initial release of browser-automation-core, a reusable library for browser automation in OpenClaw agents.
- Provides navigation, interaction, and capture tools for web automation.
- Supports reusable modules: navigation, interaction, capture, forms, sessions.
- Designed for extension by skills like Facet (Onshape learning) and Ace (competition entry).
- Offers configuration by environment variables or OpenClaw JSON config.
- Includes examples and testing strategies for both core and agent extensions.
- Integrates with OpenClaw’s built-in browser via Chrome DevTools Protocol.

Metadata

Slug browser-automation-core

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Number of versions 1

FAQ

What is Browser Automation Core?

Core browser automation library for OpenClaw agents. Provides reusable navigation, interaction, and capture capabilities for both Facet (Onshape learning) an... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 135 cumulative downloads to date.

How do I install Browser Automation Core?

Run the command "/install browser-automation-core" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Browser Automation Core free?

Yes. Browser Automation Core is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Browser Automation Core support?

Browser Automation Core is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Browser Automation Core?

It is developed and maintained by stefanferreira (@stefanferreira); the current version is v1.0.0.
