stefanferreira

Browser Automation Core

Author: stefanferreira · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 135
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install browser-automation-core
Description
Core browser automation library for OpenClaw agents. Provides reusable navigation, interaction, and capture capabilities for both Facet (Onshape learning) an...
Usage Instructions (SKILL.md)

Browser Automation Core Skill

Purpose

A reusable browser automation library that provides common web interaction capabilities for multiple OpenClaw agents. Designed to be extended by agent-specific skills while maintaining a single, well-tested core.

Primary Users

  1. Facet - Onshape CAD learning automation
  2. Ace - Competition entry and form filling
  3. Future agents - Any web automation needs

Architecture

browser-automation-core/          # This skill
├── navigation/                   # URL loading, waiting
├── interaction/                  # Click, type, select
├── capture/                      # Screenshot, HTML capture
├── forms/                        # Form detection and filling
└── sessions/                     # Tab/window management

facets-browser-learning/          # Facet-specific extension
└── uses core + Onshape-specific logic

ace-competition-automation/       # Ace-specific extension  
└── uses core + competition-specific logic

Core Capabilities

Navigation

  • URL loading with timeout and retry
  • Wait conditions (element visible, page loaded)
  • History management (back, forward, refresh)
  • Tab/window control (open, close, switch)
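A wait condition of the kind listed above usually comes down to polling a predicate until a deadline. The following is a minimal sketch using only the standard library; `wait_until` and its parameters are illustrative names, not part of the documented `browser_core` API, and the 30-second default simply mirrors the example `BROWSER_TIMEOUT` value used elsewhere in this document.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 30.0,
               interval: float = 0.25) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In practice the predicate would wrap a check such as "is this element visible"; returning a boolean instead of raising leaves the recovery decision to the caller.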

Interaction

  • Element finding (CSS selectors, XPath, text)
  • Basic actions (click, type, clear, submit)
  • Mouse operations (hover, drag, scroll)
  • Keyboard operations (key presses, shortcuts)

Capture

  • Screenshots (full page, element, viewport)
  • HTML capture (page source, element HTML)
  • Text extraction (visible text, attributes)
  • Performance metrics (load times, resources)

Forms

  • Form detection (find all forms on page)
  • Field mapping (match fields to data)
  • Validation (required fields, formats)
  • Submission (submit buttons, AJAX handling)
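Field mapping and required-field validation can be illustrated without a browser. The sketch below assumes form field names and data keys differ only in case and punctuation (for example `Full-Name` versus `full_name`); the helper names `normalize`, `map_fields`, and `missing_required` are hypothetical and not taken from the skill's code.

```python
import re
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase a field name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_fields(form_fields: List[str], data: Dict[str, str]) -> Dict[str, str]:
    """Return {form field name: value} for fields whose normalized name matches a data key."""
    by_key = {normalize(key): value for key, value in data.items()}
    mapping = {}
    for field in form_fields:
        key = normalize(field)
        if key in by_key:
            mapping[field] = by_key[key]
    return mapping


def missing_required(required: List[str], mapping: Dict[str, str]) -> List[str]:
    """List required form fields that have no non-empty value in the mapping."""
    return [field for field in required if not mapping.get(field)]
```

Checking `missing_required` before submission lets an agent stop early rather than relying on the site's own validation messages.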

Sessions

  • Cookie management (save/load sessions)
  • Authentication state (login persistence)
  • Profile management (user agent, viewport)
  • Cleanup (close browsers, clear data)
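Saving and loading a session can be as simple as serializing the cookie list to a JSON file per agent. This is a sketch under assumptions: cookies are represented as a list of plain dictionaries, and the file naming scheme (`<name>.cookies.json` inside a session directory) is invented for illustration rather than taken from the skill.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_cookies(cookies: List[Dict], session_dir: str, name: str) -> Path:
    """Write the cookie list to <session_dir>/<name>.cookies.json and return the path."""
    path = Path(session_dir) / f"{name}.cookies.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")
    return path


def load_cookies(session_dir: str, name: str) -> List[Dict]:
    """Read a previously saved cookie list, or return an empty list if none exists."""
    path = Path(session_dir) / f"{name}.cookies.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

Using one file per agent name keeps sessions separate. Cookie files contain authentication state, so the session directory should be readable only by the agent's user.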

Quick Start

Installation

# Install from ClawHub (when published)
npx clawhub@latest install browser-automation-core

# Or use local development version
cp -r /path/to/skill /root/.openclaw/workspace/skills/

Basic Usage

# Example: Navigate and take screenshot
from browser_core import BrowserAutomation

browser = BrowserAutomation()
browser.navigate("https://example.com")
browser.take_screenshot("example.png")
browser.close()

Agent-Specific Examples

For Facet (Onshape Learning)

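import os

from browser_core import BrowserAutomation
from facets_onshape import OnshapeAutomation

browser = BrowserAutomation()
onshape = OnshapeAutomation(browser)

# Log in to Onshape with credentials read from the environment
# (ONSHAPE_EMAIL and ONSHAPE_PASSWORD are placeholder variable names)
onshape.login(email=os.environ["ONSHAPE_EMAIL"], password=os.environ["ONSHAPE_PASSWORD"])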

# Navigate to tutorials
onshape.navigate_to_tutorial("getting-started")

# Complete tutorial steps
onshape.complete_tutorial_step(1)
onshape.take_progress_screenshot()

For Ace (Competition Entry)

from browser_core import BrowserAutomation
from ace_competition import CompetitionAutomation

browser = BrowserAutomation()
competition = CompetitionAutomation(browser)

# Navigate to competition
competition.navigate_to_competition("https://competition.example.com")

# Fill entry form
entry_data = {
    "name": "Stef Ferreira",
    "email": "entrant@example.com",  # placeholder address
    "phone": "+27726386189"
}
competition.fill_entry_form(entry_data)

# Submit and capture proof
competition.submit_entry()
competition.capture_submission_proof()

Configuration

Environment Variables

# Browser settings
export BROWSER_HEADLESS="true"           # Run without display
export BROWSER_TIMEOUT="30"              # Default timeout seconds
export BROWSER_VIEWPORT="1280,720"       # Window size
export BROWSER_USER_AGENT="OpenClaw Agent" # Custom user agent

# CDP settings (OpenClaw browser)
export CDP_URL="http://localhost:18800/json"
export CDP_WEBSOCKET="ws://localhost:18800/devtools/page/..."

# Storage settings
export SCREENSHOT_DIR="/path/to/screenshots"
export SESSION_DIR="/path/to/sessions"
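The variables above are all strings, so the consuming code has to parse them into typed settings. Here is one possible way to do that, offered as a sketch: the `BrowserConfig` dataclass and `load_config` function are hypothetical names, and the fallback defaults are assumptions copied from the example values shown above rather than confirmed defaults of the skill.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class BrowserConfig:
    headless: bool
    timeout: float
    viewport: Tuple[int, int]
    user_agent: str
    cdp_url: str


def load_config(env: Mapping[str, str] = os.environ) -> BrowserConfig:
    """Build a typed config from environment-style string values."""
    # BROWSER_VIEWPORT is "width,height"
    width, height = env.get("BROWSER_VIEWPORT", "1280,720").split(",")
    return BrowserConfig(
        headless=env.get("BROWSER_HEADLESS", "true").strip().lower() in ("1", "true", "yes"),
        timeout=float(env.get("BROWSER_TIMEOUT", "30")),
        viewport=(int(width), int(height)),
        user_agent=env.get("BROWSER_USER_AGENT", "OpenClaw Agent"),
        cdp_url=env.get("CDP_URL", "http://localhost:18800/json"),
    )
```

Passing the mapping as a parameter makes the parser easy to exercise with a plain dictionary instead of the real process environment.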

OpenClaw Integration

{
  "skills": {
    "browser-automation-core": {
      "enabled": true,
      "config": {
        "cdpUrl": "http://localhost:18800/json",
        "headless": true,
        "timeout": 30,
        "screenshotDir": "/root/.openclaw/workspace/screenshots"
      }
    }
  }
}

Implementation Details

CDP (Chrome DevTools Protocol)

This skill uses OpenClaw's built-in browser via CDP:

  • Connection: WebSocket to ws://localhost:18800/devtools/page/...
  • Commands: Standard CDP methods (Page.navigate, DOM.querySelector, etc.)
  • Events: Async event handling for page loads, network requests
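On the wire, each CDP command is a JSON object with a numeric `id`, a `method`, and `params`; the reply carries the same `id`, while events arrive with a `method` but no `id`. The sketch below only builds and matches such messages and leaves the WebSocket transport out; the helper names `cdp_command` and `is_reply_to` are illustrative and not part of the skill.

```python
import itertools
import json
from typing import Any, Dict, Optional

# Monotonically increasing ids let replies be matched to the commands that caused them
_ids = itertools.count(1)


def cdp_command(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one CDP command, e.g. Page.navigate, as a JSON text frame."""
    message = {"id": next(_ids), "method": method, "params": params or {}}
    return json.dumps(message)


def is_reply_to(raw: str, command_id: int) -> bool:
    """True if the incoming frame is the reply to `command_id` (events carry no id)."""
    return json.loads(raw).get("id") == command_id
```

For example, `cdp_command("Page.navigate", {"url": "https://example.com"})` yields the frame that would be sent over the page's WebSocket connection.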

Error Handling

  • Retry logic: Automatic retry on network failures
  • Timeout management: Configurable timeouts per operation
  • Fallback strategies: Alternative selectors, different interaction methods
  • Recovery procedures: Page reload, session restore
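Retry logic with configurable timing could look like the following sketch. It retries a callable on a chosen set of exceptions with exponential backoff; `with_retry`, its defaults, and the choice of `ConnectionError` and `TimeoutError` as retryable errors are assumptions for illustration, since the skill's actual exception types are not documented here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
               retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError)) -> T:
    """Call `operation`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Errors outside `retry_on` propagate immediately, which keeps programming mistakes from being masked as transient network failures.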

Performance

  • Connection pooling: Reuse WebSocket connections
  • Command batching: Batch CDP commands when possible
  • Caching: Cache page structure, element positions
  • Parallel operations: Async operations where safe

Extension Points

Creating Agent-Specific Extensions

1. Create Extension Skill

python3 /usr/lib/node_modules/openclaw/skills/skill-creator/scripts/init_skill.py facets-browser-learning

2. Import Core Library

# In extension skill
import sys
sys.path.append("/root/.openclaw/workspace/skills/browser-automation-core")
from browser_core import BrowserAutomation

class OnshapeAutomation:
    def __init__(self, browser=None):
        # Accept a shared browser (as in the Facet example) or create a new one
        self.browser = browser or BrowserAutomation()

    def onshape_specific_method(self):
        # Use core capabilities
        self.browser.navigate("https://cad.onshape.com")
        # Add Onshape-specific logic

3. Add Specialized Logic

  • Site-specific selectors (Onshape CSS classes, competition form fields)
  • Domain-specific workflows (tutorial navigation, competition rules)
  • Custom capture requirements (progress tracking, entry proof)
  • Error handling for specific sites

Testing Strategy

Unit Tests

# Test core functionality
cd /root/.openclaw/workspace/skills/browser-automation-core
python3 -m pytest tests/unit/

Integration Tests

# Test with actual browser
python3 tests/integration/test_navigation.py
python3 tests/integration/test_forms.py

Agent-Specific Tests

# Test Facet extension
cd /root/.openclaw/workspace/skills/facets-browser-learning
python3 tests/test_onshape_automation.py

# Test Ace extension  
cd /root/.openclaw/workspace/skills/ace-competition-automation
python3 tests/test_competition_automation.py

Common Use Cases

Use Case 1: Form Filling (Ace)

# Competition entry automation
from browser_core import BrowserAutomation

browser = BrowserAutomation()
competition_url = "https://competition.example.com"

data = {
    "full_name": "Stef Ferreira",
    "email": "entrant@example.com",  # placeholder address
    "phone": "+27726386189",
    "address": "123 Street, City, South Africa"
}

browser.navigate(competition_url)
browser.fill_form("form#entry-form", data)
browser.click("button[type='submit']")
browser.wait_for_element(".success-message")
browser.take_screenshot("entry_proof.png")

Use Case 2: Tutorial Navigation (Facet)

# Onshape learning automation
browser.navigate("https://cad.onshape.com")
browser.login(credentials)  # Custom method in extension
browser.navigate("/learning/tutorials")

# Complete tutorial
tutorial_steps = browser.extract_tutorial_steps()
for step in tutorial_steps:
    browser.complete_step(step)  # Custom method
    browser.take_screenshot(f"step_{step.number}.png")
    
browser.capture_certificate()

Use Case 3: Multi-Page Workflow

# Complex automation across multiple pages
browser.open_new_tab()
browser.navigate_to_login()
browser.login(credentials)

browser.switch_to_tab(0)
browser.fill_application_form(data)

browser.switch_to_tab(1)
browser.verify_email_confirmation()

browser.capture_all_tabs_screenshots()

Error Recovery Patterns

CORS Issues (Screenshots/Evaluate Not Working)

Problem: Browser automation fails with CORS errors when taking screenshots or evaluating JavaScript.

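Solution: Ensure the browser is started with the --remote-allow-origins=* flag. Recent Chrome versions reject DevTools WebSocket connections from origins this flag does not allow. Because * admits any origin, keep the debugging port reachable only from localhost: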

# Browser startup command must include:
--remote-debugging-port=18800 --remote-allow-origins=*

Verification:

curl http://localhost:18800/json/version
# Should return browser info without CORS errors
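The same check can be done from Python. The `/json/version` endpoint returns a JSON object that includes a `webSocketDebuggerUrl` field for the browser-level connection. The sketch below separates parsing from fetching so the parser can be exercised without a running browser; both function names are hypothetical, and the default base URL is the example port used in this document.

```python
import json
from typing import Optional
from urllib.request import urlopen


def browser_ws_url(version_json: str) -> Optional[str]:
    """Extract the browser-level WebSocket URL from a /json/version response body."""
    return json.loads(version_json).get("webSocketDebuggerUrl")


def fetch_browser_ws_url(base: str = "http://localhost:18800",
                         timeout: float = 5.0) -> Optional[str]:
    """Query the local debugging endpoint; raises URLError if nothing is listening."""
    with urlopen(f"{base}/json/version", timeout=timeout) as response:
        return browser_ws_url(response.read().decode("utf-8"))
```

A `None` result, or a connection error, means the browser was not started with remote debugging on that port.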

Network Issues

try:
    browser.navigate(url)
except NetworkError:  # assumed to be importable from browser_core
    browser.reload()
    browser.wait_for_network_idle()
    browser.navigate(url)  # Retry the operation once

Element Not Found

# Try multiple selectors
selectors = [
    "button.submit",
    "input[type='submit']",
    ".submit-button",
    "//button[contains(text(), 'Submit')]"
]

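for selector in selectors:
    if browser.element_exists(selector):
        browser.click(selector)
        break
else:
    # No selector matched: fail loudly instead of continuing silently
    raise RuntimeError("Submit element not found with any selector")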

Form Validation Errors

browser.submit_form()
if browser.has_validation_errors():
    errors = browser.get_validation_errors()
    for field, message in errors:
        browser.fix_field(field, message)
    browser.submit_form()  # Retry

Performance Optimization

Batch Operations

# Instead of sequential commands
browser.click("button1")
browser.click("button2")
browser.type("input1", "text")

# Use batch commands
commands = [
    {"method": "click", "selector": "button1"},
    {"method": "click", "selector": "button2"},
    {"method": "type", "selector": "input1", "text": "text"}
]
browser.execute_batch(commands)
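How `execute_batch` is implemented is not shown in this document. As a rough sketch of the idea, a batch runner can validate the whole command list first and then dispatch each entry to the matching method. The `RecordingBrowser` class below is a stand-in that only records calls, and the allowlist of methods is an assumption; this version dispatches sequentially and does not attempt real CDP-level batching.

```python
from typing import Any, Dict, List, Tuple

# Only these method names may appear in a batch (assumed allowlist)
ALLOWED_METHODS = {"click", "type"}


class RecordingBrowser:
    """Stand-in for a browser object; it only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))

    def type(self, selector: str, text: str) -> None:
        self.calls.append(("type", selector, text))


def execute_batch(browser: Any, commands: List[Dict[str, str]]) -> int:
    """Validate every command, then run them in order; returns the number executed."""
    for command in commands:
        if command.get("method") not in ALLOWED_METHODS:
            raise ValueError(f"Unsupported batch method: {command.get('method')!r}")
    for command in commands:
        # Everything except "method" is passed through as keyword arguments
        args = {key: value for key, value in command.items() if key != "method"}
        getattr(browser, command["method"])(**args)
    return len(commands)
```

Validating before dispatch means a malformed batch runs nothing, rather than stopping halfway through a form.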

Caching Strategies

# Cache page structure
if not browser.has_cached_structure(url):
    structure = browser.extract_page_structure()
    browser.cache_structure(url, structure)

# Use cached selectors
selectors = browser.get_cached_selectors(url)

Security Considerations

Credential Management

  • Never hardcode credentials in scripts
  • Use environment variables or secure storage
  • Implement credential rotation
  • Log credential usage (without exposing values)
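The first, second, and fourth points can be combined in a few lines. In this sketch, credentials are read from environment variables named `<PREFIX>_EMAIL` and `<PREFIX>_PASSWORD` (a naming scheme assumed here for illustration), and a `mask` helper produces a log-safe form of the account identifier. The password itself should never be logged, masked or not.

```python
import os
from typing import Mapping, Tuple


def mask(value: str, visible: int = 2) -> str:
    """Keep the first `visible` characters and replace the rest with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


def load_credentials(prefix: str, env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (email, password) from <PREFIX>_EMAIL and <PREFIX>_PASSWORD."""
    try:
        return env[f"{prefix}_EMAIL"], env[f"{prefix}_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable {missing}") from None
```

A caller might log `mask(email)` to record which account was used while keeping the full address and the password out of log files.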

Session Isolation

  • Separate browser sessions per agent
  • Clear cookies and storage between sessions
  • Use incognito/private mode when possible
  • Implement session timeout

Input Validation

  • Validate all user inputs before browser interaction
  • Sanitize URLs to prevent navigation to malicious sites
  • Limit file system access from browser
  • Monitor for suspicious behavior
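URL sanitization is commonly done with a scheme check plus a host allowlist. The sketch below uses only `urllib.parse` from the standard library; the function name `is_allowed_url` and the idea of a per-agent host allowlist are assumptions, not a description of the skill's code.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed_url(url: str, allowed_hosts: Iterable[str]) -> bool:
    """Allow only http(s) URLs whose host is an allowed host or a subdomain of one."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # rejects javascript:, file:, data:, and similar schemes
    host = (parts.hostname or "").lower()
    return any(host == allowed or host.endswith("." + allowed) for allowed in allowed_hosts)
```

Matching on the parsed hostname, rather than searching the raw string, avoids being fooled by lookalike hosts such as `onshape.com.evil.test`.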

Maintenance

Versioning

  • Semantic versioning (MAJOR.MINOR.PATCH)
  • Backward compatibility for minor versions
  • Deprecation warnings for breaking changes
  • Migration guides between major versions
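An extension skill can use these rules to check whether the installed core satisfies its requirement. The following sketch assumes plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`; it does not handle pre-release or build suffixes, and both function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into integers."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """Same major version, and installed minor/patch at least the required one."""
    have, need = parse_version(installed), parse_version(required)
    return have[0] == need[0] and have[1:] >= need[1:]
```

Under this rule, an extension written against core 1.0.0 accepts 1.2.0 but refuses 2.0.0, matching the promise of backward compatibility within a major version.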

Updates

  • Monthly security updates
  • Quarterly feature updates
  • Annual architecture reviews
  • Continuous integration testing

Monitoring

  • Usage statistics (which agents use which features)
  • Error rates and common failures
  • Performance metrics (load times, success rates)
  • Agent-specific success tracking

Contributing

Adding New Features

  1. Check if feature belongs in core or extension
  2. Write tests for new functionality
  3. Update documentation
  4. Submit pull request

Reporting Issues

  1. Include agent context (Facet, Ace, etc.)
  2. Provide reproduction steps
  3. Include screenshots/logs
  4. Suggest possible solutions

Extension Development

  1. Follow core API patterns
  2. Reuse existing utilities when possible
  3. Write agent-specific tests
  4. Document extension capabilities

References

CDP Documentation

Related Skills

  • facets-browser-learning - Facet extension
  • ace-competition-automation - Ace extension
  • browser-testing - Testing utilities


Version: 1.0.0
Last Updated: 2026-03-30
Maintainer: Bob (OpenClaw Agent)
License: MIT
Status: Active Development

Security Recommendations
This skill appears to be what it says: a local browser automation core that talks to an OpenClaw browser/CDP on localhost and uses the openclaw CLI in some implementations. Before installing:

  1. Verify the skill's provenance, since Source/Homepage are unknown.
  2. Only run it in an environment where the OpenClaw browser/CDP at localhost:18800 is trusted and not publicly reachable (the code connects to localhost and writes screenshots/temp files).
  3. Note that SKILL.md documents environment variables but the registry metadata lists none; confirm that the necessary env vars (CDP URLs, screenshot/session dirs) are set correctly.
  4. Remove or replace example PII/credentials in example code before reuse.
  5. Review subprocess usage (calls to the 'openclaw' CLI) if your environment restricts command execution.

If you need higher assurance, run the code in a sandboxed agent or inspect/validate the Python files locally before copying them into your OpenClaw workspace.
Functional Analysis
Type: OpenClaw Skill
Name: browser-automation-core
Version: 1.0.0

The browser-automation-core bundle is a legitimate library designed to provide reusable web interaction capabilities for OpenClaw agents like Facet and Ace. It implements browser control through two methods: a wrapper for the 'openclaw' CLI (lib/browser_cli.py) and a direct Chrome DevTools Protocol (CDP) integration via WebSockets (lib/browser_core.py). The code follows standard automation patterns for navigation, form filling, and screenshot capture, and includes clear documentation and test scripts. No evidence of data exfiltration, malicious persistence, or harmful prompt injection was found; the use of subprocesses and temporary files is strictly aligned with the stated purpose of automating browser tasks.
Capability Assessment
Purpose & Capability
Name/description match the provided Python code: multiple implementations use CDP WebSocket and/or the openclaw CLI to navigate pages, capture screenshots, and manage sessions. The examples (Facet, Ace) and library files implement the advertised navigation/interaction/capture features and reference the local CDP endpoint (localhost:18800), which is consistent with a reusable browser automation core.
Instruction Scope
SKILL.md instructs the agent and user to connect to a local CDP/WebSocket, copy the skill into an OpenClaw workspace path, and set local environment variables (CDP_URL, CDP_WEBSOCKET, BROWSER_*). Those steps are reasonable for this functionality. Notes: SKILL.md contains example login usage (Onshape) and example PII-like data in examples; the README suggests writing config into the agent's skills config (expected). The instructions do not direct data to third‑party endpoints beyond the local CDP, nor do they request unrelated files, but they do reference /root/.openclaw paths which assume privilege to write there.
Install Mechanism
No install spec is provided (instruction-only install), so nothing is downloaded or executed during install. The skill bundle includes Python source files, which is consistent with having no installer. This is a low-risk install model, but because the source is of unknown origin, the user should verify provenance before placing files in system paths.
Credentials
SKILL.md documents environment variables (CDP endpoints, BROWSER_* settings, screenshot/session dirs) that are appropriate for a browser automation library. However, the registry metadata lists no required env vars — a mild mismatch. There are no demanded external API keys or unrelated credentials. Example files show an example email/password and other example PII — these are illustrative only but should be cleaned before production.
Persistence & Privilege
The skill does not request always:true and does not modify other skills. It suggests writing screenshots, temp files, and optionally adding its config under the agent's skills config (normal). It will invoke subprocesses (openclaw CLI) and open local WebSocket connections to localhost:18800 — expected for its purpose.
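How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install browser-automation-core
  3. After installation, call the skill by name or trigger it with /browser-automation-core
  4. Provide the inputs described in the skill's parameter documentation to receive structured output.
Version History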
v1.0.0
Initial release of browser-automation-core, a reusable library for browser automation in OpenClaw agents.

  • Provides navigation, interaction, and capture tools for web automation.
  • Supports reusable modules: navigation, interaction, capture, forms, sessions.
  • Designed for extension by skills like Facet (Onshape learning) and Ace (competition entry).
  • Offers configuration by environment variables or OpenClaw JSON config.
  • Includes examples and testing strategies for both core and agent extensions.
  • Integrates with OpenClaw’s built-in browser via Chrome DevTools Protocol.
Metadata
Slug: browser-automation-core
Version: 1.0.0
License: MIT-0
Total installs: 1
Current installs: 1
Version count: 1
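Frequently Asked Questions

What is Browser Automation Core?

Core browser automation library for OpenClaw agents. Provides reusable navigation, interaction, and capture capabilities for both Facet (Onshape learning) an... It is an AI agent skill plugin for Claude Code / OpenClaw, with 135 total downloads to date.

How do I install Browser Automation Core?

Run the command "/install browser-automation-core" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Browser Automation Core free?

Yes. Browser Automation Core is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does Browser Automation Core support?

Browser Automation Core is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Browser Automation Core?

It is developed and maintained by stefanferreira (@stefanferreira); the current version is v1.0.0.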
