Description

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking...

Usage (SKILL.md)

mybrowser-skill

Name: my-browser-bot
Author: handongpu16

Platform Support

Linux x86_64: Supported
macOS: Not supported
Windows: Not supported
Other Linux architectures (ARM, etc.) are not supported.

Installation

pipx install mybrowser-skill
mybrowser-skill install   # Download Chromium
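
Before installing, it can help to confirm that the machine matches the supported platform and that pipx is available. The following is a minimal preflight sketch, not part of the tool: `is_supported_platform` and `have_cmd` are hypothetical helper names, and the check relies only on the standard `uname` and `command -v` utilities.

```shell
# Return 0 only for the one supported platform (Linux x86_64).
# Arguments default to this machine's values; pass them explicitly to test.
is_supported_platform() {
  os="${1:-$(uname -s)}"
  arch="${2:-$(uname -m)}"
  [ "$os" = "Linux" ] && [ "$arch" = "x86_64" ]
}

# Return 0 if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if ! is_supported_platform; then
  echo "unsupported platform: $(uname -s) $(uname -m)" >&2
elif ! have_cmd pipx; then
  echo "pipx not found; install pipx first" >&2
else
  echo "ready: pipx install mybrowser-skill && mybrowser-skill install"
fi
```

The sketch only reports what it finds; it does not install anything itself.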

Note:

Each command returns a snapshot of the current page after it runs, including element indices. Call the standalone mybrowser-skill browser_snapshot command only when necessary, to avoid unnecessary token consumption.

Core Workflow

Every browser automation follows this pattern:

Navigate: mybrowser-skill browser_go_to_url --url <url>
Snapshot: mybrowser-skill browser_snapshot (get indexed element refs)
Interact: Use element index to click, fill, select
Re-snapshot: After navigation or DOM changes, get fresh refs

mybrowser-skill browser_go_to_url --url https://example.com/form
mybrowser-skill browser_snapshot
# Output includes element indices: [1] input "email", [2] input "password", [3] button "Submit"

mybrowser-skill browser_input_text --index 1 --text "[email protected]"
mybrowser-skill browser_input_text --index 2 --text "password123"
mybrowser-skill browser_click_element --index 3
mybrowser-skill browser_wait --seconds 2
mybrowser-skill browser_snapshot  # Check result

Essential Commands

# Navigation
mybrowser-skill browser_go_to_url --url <url>          # Navigate to URL
mybrowser-skill browser_go_back                      # Go back
mybrowser-skill browser_wait --seconds 3             # Wait for page load (default 3s)

# Snapshot & Screenshot
mybrowser-skill browser_snapshot                     # Get page content with element indices
mybrowser-skill browser_screenshot                   # Take screenshot (returns temp file path of .webp image)
mybrowser-skill browser_screenshot --full            # Full-page screenshot (returns temp file path)
mybrowser-skill browser_screenshot --annotate        # Annotated screenshot with element labels (returns temp file path)
mybrowser-skill browser_markdownify                  # Convert page to markdown

# Click & Input (use indices from snapshot)
mybrowser-skill browser_click_element --index 1      # Click element
mybrowser-skill browser_dblclick_element --index 1   # Double-click element
mybrowser-skill browser_focus_element --index 1      # Focus element
mybrowser-skill browser_input_text --index 1 --text "hello"  # Input text into element

# Scroll
mybrowser-skill browser_scroll_down                  # Scroll down one page
mybrowser-skill browser_scroll_down --amount 300     # Scroll down 300px
mybrowser-skill browser_scroll_up                    # Scroll up one page
mybrowser-skill browser_scroll_up --amount 300       # Scroll up 300px
mybrowser-skill browser_scroll_to_text --text "Section 3"    # Scroll to text
mybrowser-skill browser_scroll_to_top                # Scroll to top
mybrowser-skill browser_scroll_to_bottom             # Scroll to bottom
mybrowser-skill browser_scroll_by --direction down --pixels 500              # Scroll page by direction
mybrowser-skill browser_scroll_by --direction right --pixels 200 --index 3   # Scroll element by direction
mybrowser-skill browser_scroll_into_view --index 5   # Scroll element into view

# Keyboard
mybrowser-skill browser_keypress --key Enter         # Press a key
mybrowser-skill browser_keyboard_op --action type --text "hello"        # Type text
mybrowser-skill browser_keyboard_op --action inserttext --text "hello"  # Insert text without key events
mybrowser-skill browser_keydown --key Shift          # Hold down a key
mybrowser-skill browser_keyup --key Shift            # Release a key

# Dropdown
mybrowser-skill browser_get_dropdown_options --index 2           # Get dropdown options
mybrowser-skill browser_select_dropdown_option --index 2 --text "Option A"  # Select option

# Checkbox
mybrowser-skill browser_check_op --index 4 --value               # Check checkbox
mybrowser-skill browser_check_op --index 4                        # Uncheck checkbox (omit --value)

# Get Information
mybrowser-skill browser_get_info --type text --index 1   # Get element text
mybrowser-skill browser_get_info --type url              # Get current URL
mybrowser-skill browser_get_info --type title            # Get page title
mybrowser-skill browser_get_info --type html --index 1   # Get element HTML
mybrowser-skill browser_get_info --type value --index 1  # Get element value
mybrowser-skill browser_get_info --type attr --index 1 --attribute href   # Get attribute
mybrowser-skill browser_get_info --type count            # Get element count
mybrowser-skill browser_get_info --type box --index 1    # Get bounding box
mybrowser-skill browser_get_info --type styles --index 1 # Get computed styles
mybrowser-skill browser_check_state --state visible --index 1    # Check visibility
mybrowser-skill browser_check_state --state enabled --index 1    # Check if enabled
mybrowser-skill browser_check_state --state checked --index 1    # Check if checked

# Find and Act (semantic locators)
mybrowser-skill browser_find_and_act --by role --value button --action click --name "Submit"
mybrowser-skill browser_find_and_act --by text --value "Sign In" --action click
mybrowser-skill browser_find_and_act --by label --value "Email" --action fill --actionValue "[email protected]"
mybrowser-skill browser_find_and_act --by placeholder --value "Search" --action type --actionValue "query"
mybrowser-skill browser_find_and_act --by testid --value "submit-btn" --action click

# Download
mybrowser-skill browser_download_file --index 5      # Download file by clicking element
mybrowser-skill browser_download_url                 # Download from URL

# Tab Management
mybrowser-skill browser_tab_open --url <url>         # Open URL in new tab
mybrowser-skill browser_tab_list                     # List open tabs
mybrowser-skill browser_tab_switch --tabId 2         # Switch to tab
mybrowser-skill browser_tab_close --tabId 2          # Close tab

# Dialog
mybrowser-skill browser_dialog --action accept       # Accept dialog
mybrowser-skill browser_dialog --action dismiss      # Dismiss dialog
mybrowser-skill browser_dialog --action accept --text "input text"  # Accept prompt with text

# Task Completion
mybrowser-skill browser_done --success --text "Task completed"      # Mark task as done
mybrowser-skill browser_done --text "Still in progress"              # Mark task as incomplete

# Help
mybrowser-skill list                                 # List all available skills
mybrowser-skill <skill_name> --help                  # Show help for a specific skill

# Skill Status 
mybrowser-skill status                               # Check status
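
The checkbox commands above distinguish "check" from "uncheck" only by whether `--value` is present, which is easy to get wrong in scripts. The sketch below wraps that convention in an explicit on/off helper. `set_checkbox`, `run`, and `DRY_RUN` are hypothetical names introduced for illustration, not part of the CLI; with `DRY_RUN=1` (the default here) each command is printed rather than executed.

```shell
# DRY_RUN=1 prints commands instead of running them (assumed default for this sketch).
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# set_checkbox INDEX on|off
# "on" passes --value; "off" omits it, as in the Checkbox commands above.
set_checkbox() {
  case "$2" in
    on)  run mybrowser-skill browser_check_op --index "$1" --value ;;
    off) run mybrowser-skill browser_check_op --index "$1" ;;
    *)   echo "usage: set_checkbox INDEX on|off" >&2; return 2 ;;
  esac
}
```

For example, `set_checkbox 4 on` corresponds to the "Check checkbox" command and `set_checkbox 4 off` to the "Uncheck checkbox" command.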

Common Patterns

Form Submission

mybrowser-skill browser_go_to_url --url https://example.com/signup
mybrowser-skill browser_snapshot
mybrowser-skill browser_input_text --index 1 --text "Jane Doe"
mybrowser-skill browser_input_text --index 2 --text "[email protected]"
mybrowser-skill browser_select_dropdown_option --index 3 --text "California"
mybrowser-skill browser_check_op --index 4 --value
mybrowser-skill browser_click_element --index 5
mybrowser-skill browser_wait --seconds 2
mybrowser-skill browser_snapshot  # Verify result

Data Extraction

mybrowser-skill browser_go_to_url --url https://example.com/products
mybrowser-skill browser_snapshot
mybrowser-skill browser_get_info --type text --index 5    # Get specific element text
mybrowser-skill browser_markdownify                        # Get full page as markdown

Infinite Scroll Pages

mybrowser-skill browser_go_to_url --url https://example.com/feed
mybrowser-skill browser_scroll_to_bottom     # Trigger lazy loading
mybrowser-skill browser_wait --seconds 2     # Wait for content
mybrowser-skill browser_snapshot             # Get updated content
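
A single scroll often loads only one more batch of items, so the pattern above may need to be repeated. The following sketch repeats the scroll-and-wait step a fixed number of times before taking one snapshot. `scroll_feed`, `run`, and `DRY_RUN` are hypothetical names, and the 2-second wait is simply the value used in the example above; with `DRY_RUN=1` (the default here) commands are printed rather than executed.

```shell
# DRY_RUN=1 prints commands instead of running them (assumed default for this sketch).
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# scroll_feed ROUNDS
# Scroll to the bottom ROUNDS times, waiting after each scroll,
# then take a single snapshot of the accumulated content.
scroll_feed() {
  rounds="$1"
  i=0
  while [ "$i" -lt "$rounds" ]; do
    run mybrowser-skill browser_scroll_to_bottom
    run mybrowser-skill browser_wait --seconds 2
    i=$((i + 1))
  done
  run mybrowser-skill browser_snapshot
}
```

Taking the snapshot once at the end, rather than after every scroll, keeps with the earlier note about avoiding unnecessary token consumption.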

Element Index Lifecycle (Important)

Element indices are invalidated when the page changes. Always re-snapshot after:

Clicking links or buttons that navigate
Form submissions
Dynamic content loading (dropdowns, modals, AJAX)

mybrowser-skill browser_click_element --index 5   # May navigate to new page
mybrowser-skill browser_snapshot                   # MUST re-snapshot
mybrowser-skill browser_click_element --index 1   # Use new indices
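
One way to avoid acting on stale indices is to route every page-changing action through a helper that always refreshes afterwards. The sketch below is one possible shape, assuming a short wait is enough for the page to settle. `act_and_refresh`, `run`, and `DRY_RUN` are hypothetical names, not part of the CLI; with `DRY_RUN=1` (the default here) commands are printed rather than executed.

```shell
# DRY_RUN=1 prints commands instead of running them (assumed default for this sketch).
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# act_and_refresh SKILL [ARGS...]
# Run one page-changing skill, wait briefly, then re-snapshot so that
# any index used afterwards comes from the fresh snapshot.
act_and_refresh() {
  run mybrowser-skill "$@" || return 1
  run mybrowser-skill browser_wait --seconds 2
  run mybrowser-skill browser_snapshot
}
```

For example, `act_and_refresh browser_click_element --index 5` performs the click, the wait, and the re-snapshot in that order.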

Security Recommendations

This skill appears to do what it says, but before installing: (1) verify the package source — check the PyPI package page or the source repository and author identity; (2) inspect the package code or review its repo for how it downloads Chromium (which host/URL and checksum); (3) install and run it in a sandboxed environment or container first, not on a sensitive machine; (4) avoid giving the tool any secrets or credentials and be cautious when automating actions on pages that contain private data; (5) prefer alternatives with clear provenance or an official homepage if you cannot verify this package. If you can provide the package's homepage or repository, I can reassess with higher confidence.

Functional Analysis

Type: OpenClaw Skill
Name: my-browser-bot
Version: 1.0.4

The skill bundle provides a comprehensive set of browser automation commands (navigation, interaction, snapshots, and downloads) for an AI agent. The instructions in SKILL.md are well-documented, align with the stated purpose of browser automation, and do not contain any evidence of prompt injection, data exfiltration, or malicious intent.

Capability Assessment

✓ Purpose & Capability

The name and description (browser automation) match the instructions: navigate pages, snapshot DOM, click/fill elements, take screenshots, and download files. Requiring an external CLI and Chromium is coherent for this purpose.

ℹ Instruction Scope

SKILL.md tells the agent to run a pipx install of 'mybrowser-skill' and then run many browser actions (navigate arbitrary URLs, click elements, download files, save screenshots to temp files). These actions are expected for a browser automation tool, but they also imply the agent will access arbitrary web content and local temp files — review how the installed tool handles data, network I/O, and filesystem writes.

⚠ Install Mechanism

The skill is instruction-only but instructs users/agents to run 'pipx install mybrowser-skill' and to run 'mybrowser-skill install' which will download Chromium. Installing an unvetted PyPI package and downloading a browser binary are moderate risks because the exact sources/URLs and package provenance are not specified.

✓ Credentials

No environment variables, credentials, or config paths are requested. The lack of required secrets is proportionate to the declared functionality.

✓ Persistence & Privilege

The skill is not set to always:true and does not request elevated platform-wide privileges. Autonomous invocation is allowed (default) but not combined with other concerning flags.

Version History

v1.0.4

- Updated installation instructions to use `pipx install mybrowser-skill` and added a step to download Chromium (`mybrowser-skill install`).
- Added `browser_scroll_to_text --text "Section 3"` command to the Scroll section of essential commands.

v1.0.3

No user-facing changes detected in this version.
- No modifications found between the previous and current version.

v1.0.2

- Package installation name updated from qqbrowser-skill to mybrowser-skill.
- Example usage and documentation now reference mybrowser-skill consistently.
- Minor documentation improvements for clarity and completeness.
- No changes in functionality or commands.

v1.0.1

- Updated install instructions: use `pip install qqbrowser-skill` (was `pip install mybrowser-skill`).
- Removed references to manual QQ Browser installation and daemon service management.
- Shortened and clarified the description for general users.
- Streamlined SKILL.md by removing redundant or obsolete setup steps.
- Added new sections for status checking and debug commands.

v1.0.0

mybrowser-skill 1.0.0 – Initial Release
- Introduces a browser automation CLI for AI agents to interact with websites programmatically.
- Supports navigation, form filling, clicking, taking screenshots, scroll, keyboard input, dropdown/checkbox handling, and information extraction via CLI commands.
- Works on Linux x86_64 (not supported on macOS/Windows/ARM).
- Persistent daemon service required for task execution.
- Includes tab management, download features, semantic element finding, and browser state checks.
- Provides installation and usage instructions in SKILL.md.

Metadata

Slug my-browser-bot

Version 1.0.4

License MIT-0

Total installs 0

Current installs 0

Number of versions 5

FAQ

What is my-browser-bot?

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 177 cumulative downloads so far.

How do I install my-browser-bot?

Run the command "/install my-browser-bot" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is my-browser-bot free?

Yes. my-browser-bot is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

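Which platforms does my-browser-bot support?

The skill can be invoked from any environment where OpenClaw / Claude Code is deployed, but the underlying mybrowser-skill CLI supports only Linux x86_64 (see Platform Support). macOS, Windows, and other Linux architectures are not supported.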

Who developed my-browser-bot?

It is developed and maintained by handongpu16 (@handongpu16); the current version is v1.0.4.
