MagicBrowse

Name: MagicBrowse
Author: xor777

Description

Browser automation fallback through the magicbrowse CLI for goal-driven launch, approved attach, observe, and act on real web pages.

Usage (SKILL.md)

Use magicbrowse to reach a target page (search, form-filling, multi-step navigation) when your own browser tooling cannot do it reliably, then hand off to magicpay for any protected step. The planner runs two LLM loops per task and is slower than direct browser control, so prefer your own tools when they suffice.

Setup Check

Run magicbrowse doctor first on a fresh install. It verifies the shared MagicPay gateway config and reachability.
If it fails, run magicbrowse init <apiKey> (sign up at https://agents.mercuryo.io/signup), or set MAGICPAY_API_KEY in the environment. Persisted config lives at ~/.magicpay/config.json, shared with the magicpay skill.
Only proceed to launch and act once doctor passes.
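
The gate above can be sketched as a small shell function. This is a minimal sketch, not part of the CLI: the function name ensure_magicbrowse_ready is hypothetical, and it assumes that magicbrowse doctor signals failure with a non-zero exit status, which this document does not state.

```shell
# Hypothetical helper: run the doctor check and refuse to continue if it fails.
# Assumption: `magicbrowse doctor` exits non-zero when the check does not pass.
ensure_magicbrowse_ready() {
  if magicbrowse doctor; then
    return 0
  fi
  echo "doctor failed: run 'magicbrowse init <apiKey>' or set MAGICPAY_API_KEY, then retry" >&2
  return 1
}
```

A caller would run ensure_magicbrowse_ready once on a fresh install and only continue to launch when it returns 0.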

Hard Rules

Consequential actions require approval. magicbrowse may navigate, inspect, draft, and prepare. It must stop and ask before submitting a form, posting or sending content, accepting terms, changing account data or settings, booking, buying, ordering, deleting or modifying remote data, or otherwise committing an irreversible or account-affecting action. After approval, re-run observe and execute only the approved final action.

MagicPay boundary. Do not use act, type, fill, or select for any of the following on any page:

login or signup credentials (email, username, password, OTP),

identity-document fields (passport, ID, KYC address, DOB tied to identity),

payment-card or banking fields (PAN, CVV, expiry, IBAN, account),

any value sourced from a vault or secret store.

Stop at the form boundary and switch to the magicpay skill.

Target-ids are snapshot-scoped. Valid only for the observe snapshot that produced them. Re-run observe after any click, type, navigation, popup, or lazy-load before the next primitive — reusing an old id silently addresses a different element.

✓ observe → click 12 → observe → type 7 "hello"
✗ observe → click 12 → type 7 "hello"

One workflow per MAGICBROWSE_HOME. The current-session pointer at $MAGICBROWSE_HOME/current-session.json (default ~/.magicbrowse/) is a singleton. Concurrent workflows on the same home overwrite each other. Set a distinct MAGICBROWSE_HOME per workflow for parallel use.
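
One way to give each parallel workflow its own home is sketched below. The helper names new_workflow_home and with_workflow_home are hypothetical, and placing the directories under the system temp directory is an assumption; any distinct path per workflow would serve.

```shell
# Hypothetical helper: print a fresh, private directory to use as MAGICBROWSE_HOME.
new_workflow_home() {
  mktemp -d "${TMPDIR:-/tmp}/magicbrowse-home.XXXXXX"
}

# Run one whole workflow (a function or script that performs launch, act, close)
# in a subshell with its own MAGICBROWSE_HOME, so its current-session.json
# cannot be overwritten by another workflow.
with_workflow_home() (
  MAGICBROWSE_HOME="$(new_workflow_home)" || exit 1
  export MAGICBROWSE_HOME
  "$@"
)
```

Wrap the entire launch-to-close sequence in a single with_workflow_home call rather than wrapping each command, because every call allocates a new home and a new session pointer.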

Fresh browser by default. Prefer an owned, fresh browser session. Use attach, --profile, or --user-data-dir only when the user explicitly approves that browser/session for the current task. Keep CDP endpoints private. Close the session before unrelated work.

Page context can leave the browser. LLM-backed act sends page state to the MagicPay gateway; --use-vision can include screenshots. Avoid private pages unless the user approves that workflow, and stop at protected forms.

Primary Workflow

Contract: launch [url] → act … act → close. Sequential act calls in one session preserve page state and planner memory.

magicbrowse launch <url>: start an owned Chrome session pre-placed at the entry URL. --headful opts out of headless. To attach to an existing CDP browser instead, first get explicit user approval for that endpoint/session: magicbrowse attach <cdp-url-or-ws-endpoint> (positional, not a --cdp-url flag).
magicbrowse act "<goal>": natural-language browser step. Prompt is positional. act does not take --url; you cannot reset the page from inside act. To re-anchor, close and launch again.
Repeat act for the next strategic granule.
magicbrowse close — release the session when done.
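
The launch → act … act → close contract can be sketched as a wrapper that always releases the session. The function name run_workflow is hypothetical, and treating a non-zero exit status from act as "stop here" is an assumption; a real orchestrator would also inspect status and finalMessage between steps and would pass only goals that need no approval.

```shell
# Hypothetical wrapper around the launch -> act ... act -> close contract.
# Usage: run_workflow <url> <goal> [<goal> ...]
run_workflow() (
  url="$1"; shift
  magicbrowse launch "$url" || exit 1
  # Release the session on every exit path, including a failed goal.
  trap 'magicbrowse close' EXIT
  for goal in "$@"; do
    # Assumption: a non-zero exit status from `act` means the step did not succeed.
    magicbrowse act "$goal" || exit 1
  done
)
```

Because the body runs in a subshell, the EXIT trap fires when the workflow ends and does not leak into the calling shell.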

magicbrowse run exists in the CLI for one-shot developer use. It is not part of this skill contract — its bundled close destroys continuity. Do not use it in an orchestrated workflow.

Fallback Ladder

Try in order. Do not start at layer 4 just because primitives exist.

1. Your own browser tooling (Computer Use, native browser tools).
2. magicbrowse act "<goal>": DOM-only navigator.
3. magicbrowse act "<goal>" --use-vision: same goal, navigator with screenshots. Use only when the user is comfortable sending screenshots/page context for this workflow. Vision is a retry mode for the same task; keep the granule.
4. magicbrowse observe + primitives: click <target-id>, type <target-id> <text>, fill <target-id> <value>, select <target-id> <option-text>, press <keys>. Use only when vision-mode act cannot make progress, or when single-element precision is required. press is global, so click first if focus matters.
5. Surface failure to the user.
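
The step from DOM-only act to vision-mode act can be sketched as follows. The function name act_with_fallback and its yes/no consent argument are hypothetical, and the sketch assumes act exits non-zero when it cannot make progress, which this document does not confirm; an orchestrator that reads status and finalMessage instead would branch on those.

```shell
# Hypothetical helper covering the two act layers of the ladder.
# Usage: act_with_fallback <goal> <vision_ok: yes|no>
# Assumption: `magicbrowse act` exits non-zero when it cannot make progress.
act_with_fallback() {
  # DOM-only navigator first.
  magicbrowse act "$1" && return 0
  # Retry the same granule with screenshots only if the user has agreed to that.
  if [ "$2" = yes ]; then
    magicbrowse act "$1" --use-vision && return 0
  fi
  echo "act could not complete: $1" >&2
  return 1
}
```

A return value of 1 leaves the remaining decisions to the caller: move on to observe plus primitives, or surface the failure to the user.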

Goal Granularity

Granule = atomic strategic segment. End each act where the orchestrator needs the next strategic decision. Tactics (which form field first) live inside act; strategy (this partner is wrong, try another) lives between act calls.
Target horizon: 15-30 navigator steps per act; smaller is safer. maxSteps: 100 is a safety ceiling. The planner self-validates done=true, so longer tasks have more room for false-positive completion. Prefer smaller granules when the success criterion cannot be checked externally.
Auth walls and captcha are hard boundaries, not obstacles. A task that plans through auth ends with status: completed and a finalMessage asking for login, not failed. Plan tasks to end at auth, not through it.
Rely on session memory; do not re-narrate. Sequential act calls in one session preserve page state and planner memory. Do not write "as we already found, continue with…" into goals — if you feel the need to, the granularity is wrong.

Goal Formulation

No element indexes or selectors in goal text. Indexes renumber on every DOM scan. Describe elements semantically.
- ✗ act "click target 14"
- ✓ act "click the 'Continue' button under the price summary"
Describe the expected terminal state where it adds a checkable criterion.
- ✗ act "get to checkout"
- ✓ act "navigate to a checkout page that shows passenger fields and total fare"
Pass the starting URL to launch, not as a separate step. To switch sites mid-workflow, either close and re-launch, or describe the navigation inside the goal text.
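
The rule against element indexes in goal text lends itself to a quick pre-flight check. The sketch below is a heuristic under stated assumptions: the function name goal_is_semantic and the two patterns it looks for (a bracketed number, or the word "target" followed by a number) are illustrative and not part of the CLI.

```shell
# Hypothetical lint: succeed only if the goal text contains no element indexes
# such as "[14]" or "target 7", which renumber on every DOM scan.
goal_is_semantic() {
  if printf '%s\n' "$1" | grep -Eiq '\[[0-9]+\]|target[ -]?[0-9]+'; then
    return 1
  fi
  return 0
}
```

For example, goal_is_semantic "click target 14" returns 1, while goal_is_semantic "click the 'Continue' button under the price summary" returns 0.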

Common Mistakes

Element indexes ([14], target 7) in goal text.

magicbrowse run for orchestrated multi-step workflows.

type / fill / select / act on protected fields instead of switching to magicpay.

Letting act submit, post, book, buy, save, delete, or otherwise commit an account-affecting action without explicit approval.

Attaching to a logged-in browser or named profile without explicit approval for the current task.

Re-narrating prior act results into the next goal — sequential act calls keep state.

Starting at layer 4 (observe + primitives) without trying act.

Reusing a target-id from before a click, navigation, or popup.

Status and Errors

act returns status: completed | failed | max_steps | cancelled. completed does not always mean task success — auth walls and captcha return completed with a finalMessage asking for human action. Parse finalMessage for the actual outcome. See references/statuses.md.
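
The distinction between status and actual outcome can be sketched as a small classifier. Everything here is illustrative: the function name classify_act_result, the returned labels, and the keyword list are assumptions, and the sketch takes status and finalMessage as already-extracted arguments because the output format of act is not specified in this document.

```shell
# Hypothetical classifier for one act result.
# Arguments: $1 = status, $2 = finalMessage (extracted by the caller).
classify_act_result() {
  case "$1" in
    completed)
      # Heuristic keyword match for auth walls and captcha; not an official list.
      if printf '%s\n' "$2" | grep -Eiq 'log ?in|sign ?in|captcha|verif'; then
        echo needs_human
      else
        echo done
      fi
      ;;
    max_steps) echo split_goal ;;     # suggestion: retry with a smaller granule
    failed)    echo next_fallback ;;  # suggestion: move down the fallback ladder
    cancelled) echo stopped ;;
    *)         echo unknown ;;
  esac
}
```

A keyword match is only a first filter; references/statuses.md remains the place to check how finalMessage should be interpreted.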

References

references/commands.md — every CLI command.
references/workflow.md — worked end-to-end example.
references/guardrails.md — long-form hard rules.
references/statuses.md — outcome codes and finalMessage parsing.

Safe-Use Recommendations

Install only if you are comfortable with an external CLI controlling browser sessions for approved tasks. Prefer fresh browser sessions, do not use it for credentials or payment fields, avoid private pages unless you approve sending their context to the gateway, and approve only exact consequential actions.

Functional Analysis

Type: OpenClaw Skill
Name: magicbrowse
Version: 0.1.3

The magicbrowse skill is a browser automation tool that uses an external LLM gateway for goal-driven navigation. It contains extensive and explicit security guardrails in SKILL.md and references/guardrails.md, specifically defining a 'MagicPay Boundary' that forbids the agent from handling credentials, PII, or financial data. The instructions require explicit user approval for any consequential or irreversible actions, and the overall design focuses on safety and hand-offs to specialized secure skills for sensitive tasks.

Capability Tags

crypto, can-make-purchases, requires-oauth-token, requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The skill’s browser automation powers are broad but match its stated purpose: launching, observing, and acting on real web pages as a fallback when native browser tooling is insufficient.

ℹ Instruction Scope

The instructions include explicit approval requirements before consequential actions and forbid entering credentials, identity, banking, or vault-sourced values through MagicBrowse.

ℹ Install Mechanism

The skill installs a pinned external npm CLI package and contains no local executable code for review; this is purpose-aligned but means the installed CLI is the trusted runtime.

ℹ Credentials

The required MagicPay API key, shared config file, and external gateway use are disclosed and proportionate to LLM-backed browser automation, but users should avoid private pages unless they approve that workflow.

ℹ Persistence & Privilege

The skill persists session pointers and planner/page state across a workflow and can optionally use existing browser profiles or CDP sessions with user approval, which can inherit logged-in account authority.

Version History

v0.1.3

Release magicbrowse-v0.1.3

v0.1.2

Release magicbrowse-v0.1.2

v0.1.1

Release magicbrowse-v0.1.1

v0.1.0

Release magicbrowse-v0.1.0

Metadata

Slug magicbrowse

Version 0.1.3

License MIT-0

Total installs 0

Current installs 0

Versions released 4

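FAQ

What is MagicBrowse?

Browser automation fallback through the magicbrowse CLI for goal-driven launch, approved attach, observe, and act on real web pages. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 112 total downloads to date.

How do I install MagicBrowse?

Run the command "/install magicbrowse" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is MagicBrowse free?

Yes. MagicBrowse is completely free and is released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does MagicBrowse support?

MagicBrowse is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops MagicBrowse?

Developed and maintained by Dmitry Ukhanov (@xor777); the current version is v0.1.3.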