fuzzyb33s

Browser Automation

Author: Fuzzyb33s · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 92
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install fuzzy-browser-automation
Description
Automate any web browser task with OpenClaw's built-in Playwright browser control. Use when: (1) scraping dynamic pages, (2) filling forms and submitting, (3...
Usage (SKILL.md)

Browser Automation

Control a Chromium browser directly from OpenClaw — navigate, click, type, snapshot, screenshot, extract data. Works with both the sandboxed OpenClaw-managed browser and your logged-in user browser (with profile="user").

Browser Selection

| Target | When to Use |
| --- | --- |
| sandbox (default) | OpenClaw's clean browser: no cookies, no login state |
| host | Browser running on the host machine |
| node | Browser on a paired remote node |

| Profile | When to Use |
| --- | --- |
| (omit) | Clean OpenClaw-managed browser |
| profile="user" | Your own browser with active logins (requires you to be present) |

Core Actions

snapshot — Inspect the Page

browser(action="snapshot", target="sandbox")

Returns the full page DOM as a structured tree. Use refs="aria" for screen-reader-friendly selectors, refs="role" (default) for role+name based refs.

browser(
  action="snapshot",
  target="sandbox",
  refs="aria"
)

screenshot — Capture the Page

browser(action="screenshot", target="sandbox")

For full-page screenshots:

browser(
  action="screenshot",
  target="sandbox",
  fullPage=true
)

navigate — Open a URL

browser(action="navigate", target="sandbox", url="https://news.ycombinator.com")

act — Interact with Elements

The act action is the workhorse. It combines ref (what to target) with a request object: request.kind names the action type, and the remaining request fields carry the action details.

Click:

browser(
  action="act",
  target="sandbox",
  ref="aria:Submit",
  request={"kind": "click"}
)

Type:

browser(
  action="act",
  target="sandbox",
  ref="id:search-box",
  request={"kind": "type", "text": "openclaw browser automation"}
)

Press a key:

browser(
  action="act",
  target="sandbox",
  ref="id:search-box",
  request={"kind": "press", "key": "Enter"}
)

Hover:

browser(
  action="act",
  target="sandbox",
  ref="css:.dropdown-menu",
  request={"kind": "hover"}
)

Select from dropdown:

browser(
  action="act",
  target="sandbox",
  ref="id:country-select",
  request={"kind": "select", "values": ["South Africa"]}
)

Wait for element:

browser(
  action="act",
  target="sandbox",
  ref="aria:Loading",
  request={"kind": "wait", "timeMs": 5000}
)

Locator Reference (ref types)

| Prefix | Example | Best For |
| --- | --- | --- |
| aria: | aria:Submit | Accessible labels, buttons with text |
| id: | id:email-input | Unique element IDs |
| css: | css:.card:nth-child(2) | Complex CSS selectors |
| role: | role:button[name="Submit"] | Semantic role selectors |
| text: | text:Get Started | Visible text content |
| xpath: | xpath://button[@class="btn"] | Fallback for complex paths |

For stable refs across calls, prefer refs="aria" in snapshots — these use ARIA labels that rarely change.
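
The prefix convention in the table above can be checked before a call is sent. The following is a minimal Python sketch and not part of OpenClaw: `parse_ref` and `KNOWN_PREFIXES` are hypothetical names, and the prefix list is simply copied from the table.

```python
# Hypothetical helper: the prefixes are taken from the Locator Reference table.
KNOWN_PREFIXES = ("aria", "id", "css", "role", "text", "xpath")


def parse_ref(ref: str) -> tuple[str, str]:
    """Split a ref such as "css:.card:nth-child(2)" into (prefix, selector)."""
    # partition splits on the first colon only, so colons inside the
    # selector (pseudo-classes, URLs) stay intact.
    prefix, sep, selector = ref.partition(":")
    if not sep or prefix not in KNOWN_PREFIXES:
        raise ValueError(f"unknown ref prefix in {ref!r}")
    if not selector:
        raise ValueError(f"empty selector in {ref!r}")
    return prefix, selector
```

Rejecting an unprefixed string such as "Submit" early tends to give a clearer error than an "Element not found" result later in the flow.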

Recipes

Recipe 1: Scrape a Dynamic Page

// 1. Navigate
browser(action="navigate", target="sandbox", url="https://news.ycombinator.com/news")

// 2. Wait for content to load
browser(
  action="act",
  target="sandbox",
  loadState="networkidle",
  ref="css:.itemlist",
  request={"kind": "wait", "timeMs": 3000}
)

// 3. Snapshot to extract structured data
browser(action="snapshot", target="sandbox", refs="aria")

Recipe 2: Fill and Submit a Form

// 1. Navigate to form
browser(action="navigate", target="sandbox", url="https://example.com/contact")

// 2. Fill inputs
browser(action="act", target="sandbox", ref="id:name",    request={"kind": "fill", "text": "Alice Smith"})
browser(action="act", target="sandbox", ref="id:email",   request={"kind": "fill", "text": "alice@example.com"})
browser(action="act", target="sandbox", ref="id:message", request={"kind": "fill", "text": "Hi, I'd like to know more..."})

// 3. Click submit
browser(action="act", target="sandbox", ref="aria:Submit", request={"kind": "click"})

// 4. Wait for confirmation
browser(
  action="act",
  target="sandbox",
  ref="aria:Thank you",
  request={"kind": "wait", "timeMs": 2000}
)

Recipe 3: Login to a Service (User Browser)

// Requires you to be present at the machine; uses your actual browser session.
// "myuser" and "mypassword" are placeholders: never hard-code real credentials.
browser(action="navigate", target="host", url="https://github.com/login")

browser(action="act", target="host", ref="id:login_field", request={"kind": "fill", "text": "myuser"})
browser(action="act", target="host", ref="id:password",    request={"kind": "fill", "text": "mypassword"})
browser(action="act", target="host", ref="css:[type=submit]", request={"kind": "click"})

Recipe 4: Monitor Price / Availability

// Navigate and wait for price to update
browser(action="navigate", target="sandbox", url="https://example.com/product/123")

browser(
  action="act",
  target="sandbox",
  ref="css:.price",
  request={"kind": "wait", "timeMs": 10000}
)

// Capture screenshot
browser(action="screenshot", target="sandbox")

// Evaluate for price text
browser(
  action="act",
  target="sandbox",
  request={
    "kind": "evaluate",
    // Optional chaining returns null instead of throwing when .price is absent
    "fn": "() => document.querySelector('.price')?.innerText ?? null"
  }
)
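
To compare prices between runs, the extracted text has to become a number. Below is a minimal Python sketch, assuming the text uses a dot as the decimal separator and commas as thousands separators (for example "$1,299.00"); `parse_price` is a hypothetical helper, not an OpenClaw function.

```python
import re
from decimal import Decimal

# First run of digits, optionally with comma groups and a decimal part.
_PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: str) -> Decimal:
    """Extract the first number from price text such as "$1,299.00"."""
    match = _PRICE_RE.search(text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Decimal avoids the rounding surprises of float for currency values.
    return Decimal(match.group(0).replace(",", ""))
```

Sites that format prices as "1.299,00" would need a different pattern.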

Recipe 5: Multi-Tab Workflow

// Load the URL (per the Action Reference, open is the action that opens a new tab)
browser(action="navigate", target="sandbox", url="https://mail.google.com")

// Switch tabs
browser(action="act", target="sandbox", request={"kind": "press", "key": "Control+Tab"})

// Close current tab
browser(action="act", target="sandbox", request={"kind": "press", "key": "Control+W"})

Recipe 6: Scroll and Load Lazy Content

// Scroll by a pixel amount
browser(
  action="act",
  target="sandbox",
  request={
    "kind": "evaluate",
    "fn": "() => window.scrollBy(0, 800)"
  }
)

// Scroll to bottom (infinite scroll pages)
browser(
  action="act",
  target="sandbox",
  request={
    "kind": "evaluate",
    "fn": "() => window.scrollTo(0, document.body.scrollHeight)"
  }
)
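
On infinite-scroll pages the scroll-to-bottom call usually has to be repeated until the page stops growing. The loop can be sketched in Python as below; `scroll_and_measure` is a hypothetical callable that performs one scroll and returns the resulting page height (for instance by evaluating `document.body.scrollHeight`), since how results are returned to the caller is not specified here.

```python
def scroll_until_stable(scroll_and_measure, max_rounds=20):
    """Scroll repeatedly until the page height stops changing.

    Returns the number of scroll rounds performed. Stops after
    max_rounds even if the page is still growing.
    """
    previous = None
    for rounds in range(1, max_rounds + 1):
        height = scroll_and_measure()
        if height == previous:
            return rounds  # no new content was loaded by this scroll
        previous = height
    return max_rounds
```

The `max_rounds` cap keeps the loop from running indefinitely on feeds that never end.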

Recipe 7: Extract Table Data

browser(action="navigate", target="sandbox", url="https://example.com/sales-report")

browser(
  action="act",
  target="sandbox",
  ref="css:table",
  request={"kind": "wait", "timeMs": 2000}
)

browser(
  action="act",
  target="sandbox",
  request={
    "kind": "evaluate",
    "fn": "() => Array.from(document.querySelectorAll('table tr')).map(row => Array.from(row.querySelectorAll('td')).map(cell => cell.innerText))"
  }
)
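
The function above yields one list of cell strings per row. Because it selects only td cells, a header row made of th cells comes back as an empty list. A minimal Python sketch of turning that result into records follows; it assumes the result reaches the caller as nested lists and that the caller supplies the column names, and `rows_to_records` is a hypothetical helper.

```python
def rows_to_records(rows, columns):
    """Convert [[cell, ...], ...] into a list of dicts keyed by column name."""
    records = []
    for row in rows:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip th-only header rows and blank rows
        if len(cells) != len(columns):
            raise ValueError(f"expected {len(columns)} cells, got {len(cells)}: {cells!r}")
        records.append(dict(zip(columns, cells)))
    return records
```

Raising on a column-count mismatch surfaces layout changes (merged cells, added columns) instead of silently misaligning data.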

Recipe 8: Download a File

browser(action="navigate", target="sandbox", url="https://example.com/export.csv")

browser(
  action="act",
  target="sandbox",
  request={
    "kind": "evaluate",
    "fn": "() => { const link = document.querySelector('a[href$=\".csv\"]'); return link ? link.href : null; }"
  }
)

Action Reference

| Action | What It Does |
| --- | --- |
| snapshot | Get structured page DOM |
| screenshot | Capture page as PNG/JPEG |
| navigate | Open a URL |
| act | Click, type, press, hover, select, wait, evaluate |
| pdf | Generate PDF of the page |
| console | Read browser console logs |
| open | Open a new tab |
| close | Close current tab |

act kind Reference

| Kind | Parameters |
| --- | --- |
| click | |
| type | text |
| fill | text |
| press | key (e.g. "Enter", "Escape", "Control+Tab") |
| hover | |
| select | values (array) |
| wait | timeMs |
| evaluate | fn (JavaScript string) |
| drag | startRef, endRef |
| resize | width, height |
| close | |
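
The kind table can double as a validation schema when request objects are generated programmatically. The Python sketch below is illustrative only: `REQUIRED_PARAMS` and `build_request` are hypothetical names, and treating every listed parameter as required is an assumption, since the table does not say which are optional.

```python
# Parameters per kind, copied from the reference table.
# Assumption: each listed parameter is required.
REQUIRED_PARAMS = {
    "click": (),
    "type": ("text",),
    "fill": ("text",),
    "press": ("key",),
    "hover": (),
    "select": ("values",),
    "wait": ("timeMs",),
    "evaluate": ("fn",),
    "drag": ("startRef", "endRef"),
    "resize": ("width", "height"),
    "close": (),
}


def build_request(kind: str, **params) -> dict:
    """Build a request dict like {"kind": "type", "text": "..."} after validation."""
    if kind not in REQUIRED_PARAMS:
        raise ValueError(f"unknown kind: {kind!r}")
    missing = [name for name in REQUIRED_PARAMS[kind] if name not in params]
    if missing:
        raise ValueError(f"{kind} is missing parameters: {', '.join(missing)}")
    return {"kind": kind, **params}
```

For example, `build_request("press", key="Enter")` produces the same dict that the "Press a key" example passes as request.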

Anti-Patterns

  • Don't click before the page loads: always navigate, then wait for loadState="networkidle" or an explicit element wait
  • Don't rely on fixed-duration waits to decide the page is ready: prefer waiting for a specific element or the networkidle state
  • Don't scrape without rate limiting: pace successive requests with timeMs waits to avoid IP blocks
  • Don't use profile="user" for automated workflows: it is meant for attended use; automated flows should use the sandbox browser
  • Don't use xpath unless nothing else works: xpath selectors break easily when the page changes
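
The rate-limiting advice can also be enforced in the calling script rather than sprinkled through each recipe. Here is a minimal Python sketch of a throttle that guarantees a minimum gap between successive calls; `Throttle` is a hypothetical class, and the injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous wait() finished

    def wait(self):
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval_s - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` before each navigate keeps the request rate bounded no matter how quickly the rest of the script runs.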

Troubleshooting

| Symptom | Fix |
| --- | --- |
| "Target closed" error | Browser timed out; navigate again |
| Element not found | Page may be JS-rendered; add loadState="networkidle" or an explicit wait |
| Click missed the button | Use ref="aria:Button Text" instead of CSS; it is more robust |
| Stale element reference | Element was replaced by a DOM update; re-snapshot and retry |
| Form submits twice | Wait for navigation after submit before continuing |
| Screenshot is blank | Page still loading; add loadState="networkidle" |
| profile="user" not working | The logged-in browser must already be running; start it manually first |
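
The "re-snapshot and retry" fix for stale references follows a general retry pattern, sketched below in Python. Both callables are hypothetical stand-ins: `action` would wrap the failing browser call and `refresh` would take a fresh snapshot. How OpenClaw reports a stale reference is not documented here, so the exception types to retry on are left to the caller.

```python
def retry_with_refresh(action, refresh, attempts=3, retry_on=(Exception,)):
    """Run action(); on failure call refresh() and try again.

    Re-raises the last error once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except retry_on as error:
            last_error = error
            if attempt < attempts - 1:
                refresh()  # e.g. re-snapshot so refs point at the new DOM
    raise last_error
```

Capping the attempts matters: a selector that is genuinely wrong fails every time, and retrying forever would hide that.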

See Also

  • webhook-automation skill — combining browser-extracted data with outgoing webhooks
  • rss-aggregator skill — using browser scraping as a fallback when feeds aren't available
  • cron-scheduler skill — scheduling browser-based monitoring tasks
Security Recommendations
This skill does what it says (automates browsers) but has sensitive capabilities: it can operate on your real browser profile and run arbitrary JavaScript inside pages, which can read session cookies, saved data, and send data out. Because the skill has no homepage or known publisher, only install/use it if you trust the source. Prefer using the sandbox target rather than 'host'/'profile="user"'. If you must use host automation: (1) require explicit, local user presence and confirmation before any host-target actions, (2) avoid allowing any 'evaluate' calls that run arbitrary JS unless you inspect the function, (3) disable autonomous (background) invocation for this skill or restrict it to manual runs, (4) test all recipes on non-sensitive pages first, and (5) monitor logs for unexpected outbound requests. If you need stronger assurance, ask the publisher for provenance (homepage, source repo, or signed package) before enabling host-profile automation.
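
Recommendations (1) and (2) amount to a gate in front of risky calls. A minimal Python sketch of such a check follows. It assumes each call is represented as a dict mirroring the keyword arguments used in the examples (target, profile, request); `needs_confirmation` is a hypothetical helper, and OpenClaw's own permission model may work differently.

```python
def needs_confirmation(call: dict) -> bool:
    """Return True when a call touches the host/user browser or runs page JavaScript."""
    if call.get("target") == "host" or call.get("profile") == "user":
        return True
    request = call.get("request") or {}
    return request.get("kind") == "evaluate"
```

A wrapper could refuse to proceed, or prompt the user, whenever this returns True.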
Functional Analysis
Type: OpenClaw Skill
Name: fuzzy-browser-automation
Version: 1.0.0

The skill provides instructions for an AI agent to perform browser automation with high-risk capabilities, specifically the 'evaluate' action for executing arbitrary JavaScript and the ability to access the host machine's browser and active user profiles ('profile="user"'). While these features are documented for legitimate automation tasks in skill.md, they grant the agent access to sensitive session data and credentials. The inclusion of a login automation recipe (Recipe 3) further highlights the risk of handling sensitive information within the agent's context.
Capability Assessment
Purpose & Capability
Name and description (browser automation via Playwright-like controls) align with the provided instructions and example actions (navigate, click, type, snapshot, screenshot, evaluate). No unrelated binaries, env vars, or installs are requested — capability is coherent with the stated purpose.
Instruction Scope
Instructions explicitly support operating on the host (user) browser profile and include an 'evaluate' action that runs arbitrary JavaScript in page context. Those behaviors go beyond benign scraping: they can read session cookies, localStorage, and page DOM and can perform network requests from the page (potentially exfiltrating data). The SKILL.md does not include any constraints or safe-handling guidance about sensitive data or external transmissions.
Install Mechanism
Instruction-only skill with no install spec or downloaded code — lowest disk/write risk. There are no package installs or third-party downloads to review.
Credentials
The skill requests no environment variables or credentials, but it requests access to the host browser profile ('profile="user"' / target='host'), which implicitly grants access to sensitive data (cookies, active sessions, saved credentials) without any declared authorization mechanism. The lack of provenance (unknown source, no homepage) increases the risk because there's no clear trust anchor for granting that privileged access.
Persistence & Privilege
always is false and the skill is user-invocable (normal). The platform default allows autonomous invocation, and that combined with host-browser access and arbitrary JS execution elevates the potential impact if the skill is ever invoked without close supervision. Consider disabling autonomous invocation for this skill if you plan to allow host-profile operations.
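How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install fuzzy-browser-automation
  3. Once installation completes, invoke the skill by name or trigger it with /fuzzy-browser-automation
  4. Provide the inputs described in the skill's parameter documentation to get structured output.
Version History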
v1.0.0
Initial release providing robust browser automation capabilities:
  • Automate web browser tasks such as page navigation, clicking, typing, screenshots, PDF generation, and data extraction.
  • Supports both sandboxed and logged-in (user) browsers, with clear instructions for selecting browser targets and profiles.
  • Detailed action and locator references allow fine-grained control over web interactions (click, type, hover, select, wait, evaluate, etc.).
  • Includes practical recipes for scraping, form filling, login, multi-tab workflows, monitoring content, and file downloads.
  • Simple JSON-based API for consistent and scriptable automation flows.
Metadata
Slug: fuzzy-browser-automation
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
FAQ

What is Browser Automation?

Automate any web browser task with OpenClaw's built-in Playwright browser control. Use when: (1) scraping dynamic pages, (2) filling forms and submitting, (3... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 92 times in total.

How do I install Browser Automation?

Run the command "/install fuzzy-browser-automation" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Browser Automation free?

Yes. Browser Automation is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Browser Automation support?

Browser Automation is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Browser Automation?

It is developed and maintained by Fuzzyb33s (@fuzzyb33s); the current version is v1.0.0.
