
Browser Automation

Author: huamu668 · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 332 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install browser-automation-pin
Description
Browser automation for AI agents using PinchTab. Control Chrome programmatically for testing, scraping, and interaction. Features token-efficient text extrac...
Usage instructions (SKILL.md)

Browser Automation

Browser automation for AI agents using PinchTab — a high-performance Chrome bridge with HTTP API.

What is PinchTab?

  • Standalone HTTP server — Control Chrome via HTTP API
  • Token-efficient — 800 tokens/page with text extraction (vs 10,000+ for screenshots)
  • Multi-instance — Run multiple parallel Chrome processes with isolated profiles
  • Headless or Headed — Run without window or with visible Chrome
  • Self-contained — 12MB binary, no external dependencies
  • MCP integration — Native SMCP plugin for Claude Code

Quick Start

Installation

# macOS / Linux
curl -fsSL https://pinchtab.com/install.sh | bash

# npm
npm install -g pinchtab

# Docker
docker run -d -p 9867:9867 pinchtab/pinchtab

Start Server

# Terminal 1: Start PinchTab server
pinchtab
# Server runs on http://localhost:9867

Basic Commands

# Navigate
pinchtab nav https://pinchtab.com

# Wait 3 seconds for accessibility tree
sleep 3

# Get interactive elements
pinchtab snap -i -c

# Extract text (token-efficient)
pinchtab text

# Click element by ref
pinchtab click e5

# Fill input
pinchtab fill e3 "user@example.com"  # placeholder address

Core Concepts

Instance

A running Chrome process. Each instance has isolated state.

# Create headless instance
pinchtab instances create --mode=headless

# Create headed instance (visible window)
pinchtab instances create --mode=headed

# List instances
pinchtab instances list

# Stop instance
pinchtab instances stop <instance-id>

Profile

Browser state (cookies, history, localStorage). Log in once, stay logged in.

# Create instance with profile
pinchtab instances create --profile=work

# Profile persists across restarts

Tab

A single webpage. Each instance can have multiple tabs.

# Open new tab
pinchtab tabs open https://example.com

# List tabs
pinchtab tabs list

# Close tab
pinchtab tabs close <tab-id>

Token-Efficient Patterns

The 3-Second Wait Rule

Critical: Chrome's accessibility tree takes ~3 seconds to populate after navigation.

# ❌ Too fast - empty tree
pinchtab nav https://example.com
pinchtab snap
# Returns: {"count": 1, "nodes": [{"ref": "e0"}]}

# ✅ Wait 3 seconds
pinchtab nav https://example.com
sleep 3
pinchtab snap
# Returns: {"count": 2645, "nodes": [...]}
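
Rather than relying on a fixed sleep, a script can poll until the snapshot reports more than a handful of nodes. The following is a minimal sketch, assuming `pinchtab snap` prints JSON with a numeric `count` field as in the sample output above. The helper names `snapshot_count` and `wait_for_tree` are hypothetical and not part of PinchTab.

```shell
#!/usr/bin/env bash
# Poll until the accessibility tree looks populated instead of sleeping a fixed time.
# Assumption: `pinchtab snap` prints JSON with a numeric "count" field.

# Print the node count of the current snapshot (0 if the field is missing).
snapshot_count() {
  pinchtab snap | jq -r '.count // 0'
}

# wait_for_tree MIN_NODES MAX_TRIES DELAY_SECONDS
# Prints the node count and returns 0 once count >= MIN_NODES; returns 1 on timeout.
wait_for_tree() {
  local min_nodes=${1:-2} max_tries=${2:-10} delay=${3:-1}
  local try count
  for ((try = 1; try <= max_tries; try++)); do
    count=$(snapshot_count 2>/dev/null) || count=0
    if [[ $count =~ ^[0-9]+$ ]] && (( count >= min_nodes )); then
      echo "$count"
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

A possible call after `pinchtab nav https://example.com` is `wait_for_tree 50 10 1 || echo "tree never populated" >&2`. The threshold of 50 nodes is an arbitrary illustration and would need tuning per site.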

Optimal Extraction Pattern

# Navigate + wait + filter (14x token savings)
curl -X POST http://localhost:9867/navigate \
  -d '{"url": "https://example.com"}' && \
sleep 3 && \
curl http://localhost:9867/snapshot | \
jq '.nodes[] | select(.name | length > 15) | .name' | \
head -30

Why this works:

  1. Navigating and then waiting gives Chrome time to build the full accessibility tree
  2. The jq filter keeps only each node's name text
  3. length > 15 drops short names such as buttons and labels
  4. head -30 caps the output at 30 lines

Token comparison:

  • Exploratory approach: ~3,800 tokens
  • Pattern-driven: ~270 tokens
  • Savings: 14x
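
The 14x figure is the ratio 3,800 / 270 ≈ 14.07. To compare approaches on your own pages, a rough size check can be sketched as below. It assumes about 4 characters per token, which is a common rule of thumb and not a real tokenizer; `estimate_tokens` and `savings_ratio` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Rough token estimate for text on stdin.
# Assumption: ~4 characters per token (heuristic only).
estimate_tokens() {
  local chars
  chars=$(wc -c | tr -d '[:space:]')
  echo $(( (chars + 3) / 4 ))   # round up
}

# savings_ratio BEFORE AFTER: integer ratio of two token counts.
savings_ratio() {
  if (( $2 == 0 )); then
    echo "error: AFTER must be non-zero" >&2
    return 1
  fi
  echo $(( $1 / $2 ))
}
```

For example, `pinchtab text | estimate_tokens` gives an approximate cost for the text extraction of the current page.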

HTTP API Reference

Base URL

http://localhost:9867

Instances

# Create instance (note: despite its name, $TAB stores the instance ID returned by the server)
TAB=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"work","mode":"headless"}' | jq -r '.id')

# List instances
curl http://localhost:9867/instances

# Stop instance
curl -X POST "http://localhost:9867/instances/$TAB/stop"

Navigation

# Navigate to URL
curl -X POST "http://localhost:9867/instances/$TAB/tabs/open" \
  -d '{"url":"https://example.com"}'

# Wait for load
sleep 3

Snapshot

# Full snapshot
curl "http://localhost:9867/instances/$TAB/snapshot"

# Interactive elements only
curl "http://localhost:9867/instances/$TAB/snapshot?filter=interactive"

# With coordinates
curl "http://localhost:9867/instances/$TAB/snapshot?includeCoords=true"

Actions

# Click element
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"click","ref":"e5"}'

# Type text
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"type","ref":"e12","text":"hello"}'

# Press key
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"key","ref":"e12","key":"Enter"}'

# Scroll
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"scroll","direction":"down"}'

Extraction

# Extract text (token-efficient)
curl "http://localhost:9867/instances/$TAB/text"

# Take screenshot
curl "http://localhost:9867/instances/$TAB/screenshot" \
  --output screenshot.png

# Generate PDF
curl "http://localhost:9867/instances/$TAB/pdf" \
  --output page.pdf

# Evaluate JavaScript
curl -X POST "http://localhost:9867/instances/$TAB/evaluate" \
  -d '{"script": "document.title"}'
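
The endpoints above can be wrapped in two small helpers so that the base URL lives in one place and HTTP errors surface as non-zero exit codes. This is a sketch under assumptions: `PINCHTAB_URL`, `pt_get`, and `pt_post` are hypothetical names, and the explicit `Content-Type` header is added as a precaution rather than a documented requirement.

```shell
#!/usr/bin/env bash
# Base URL from the reference above; PINCHTAB_URL is a hypothetical override variable.
PINCHTAB_URL=${PINCHTAB_URL:-http://localhost:9867}

# pt_get PATH: GET request. -f makes curl exit non-zero on HTTP status >= 400.
pt_get() {
  curl -fsS "${PINCHTAB_URL}$1"
}

# pt_post PATH JSON_BODY: POST a JSON body.
pt_post() {
  curl -fsS -X POST -H 'Content-Type: application/json' -d "$2" "${PINCHTAB_URL}$1"
}
```

With these in place, a click becomes `pt_post "/instances/$TAB/action" '{"kind":"click","ref":"e5"}'`.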

Common Patterns

Pattern 1: Web Scraping

#!/bin/bash
# scrape-headlines.sh

URL=$1
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

# Navigate and wait
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d "{\"url\":\"$URL\"}"
sleep 3

# Extract headlines (filter by length)
curl -s "http://localhost:9867/instances/$INST/snapshot" | \
  jq '.nodes[] | select(.name | length > 20) | .name' | \
  head -20

# Cleanup
curl -s -X POST "http://localhost:9867/instances/$INST/stop"
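
The script above splices $URL directly into a JSON string, which produces invalid JSON if the URL contains a double quote or backslash, and it does not check that an argument was passed. One way to harden this, sketched with the jq dependency the script already has (`url_body` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# url_body URL: print {"url": "..."} with the URL JSON-escaped by jq.
url_body() {
  if [[ -z ${1:-} ]]; then
    echo "usage: url_body URL" >&2
    return 1
  fi
  jq -cn --arg url "$1" '{url: $url}'
}
```

The navigation call would then use `-d "$(url_body "$URL")"` in place of the hand-built string.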

Pattern 2: Form Interaction

#!/bin/bash
# fill-form.sh

INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

# Navigate to form
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d '{"url":"https://example.com/login"}'
sleep 3

# Get snapshot to find element refs
SNAPSHOT=$(curl -s "http://localhost:9867/instances/$INST/snapshot?filter=interactive")

# Extract refs by accessible name (case-insensitive; take the first match, since refs differ per snapshot)
EMAIL_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select((.name // "") | ascii_downcase | contains("email")) | .ref)')
PASS_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select((.name // "") | ascii_downcase | contains("password")) | .ref)')
SUBMIT_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select(.role == "button") | .ref)')

# Fill form
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"type\",\"ref\":\"$EMAIL_REF\",\"text\":\"user@example.com\"}"
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"type\",\"ref\":\"$PASS_REF\",\"text\":\"password123\"}"

# Submit
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"click\",\"ref\":\"$SUBMIT_REF\"}"

# Wait for navigation
sleep 3

# Verify login
curl -s "http://localhost:9867/instances/$INST/text" | jq -r '.title'

# Cleanup
curl -s -X POST "http://localhost:9867/instances/$INST/stop"
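
The form script above continues even when a ref lookup comes back empty, and it embeds the password in the script itself. A minimal sketch of two guards follows; `require_ref`, `get_password`, and the `LOGIN_PASSWORD` environment variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# require_ref LABEL VALUE: fail if a looked-up element ref is empty or the string "null".
require_ref() {
  local label=$1 value=$2
  if [[ -z $value || $value == null ]]; then
    echo "error: no element ref found for $label" >&2
    return 1
  fi
}

# get_password: read the password from the environment instead of the script body.
# LOGIN_PASSWORD is a hypothetical variable name.
get_password() {
  if [[ -z ${LOGIN_PASSWORD:-} ]]; then
    echo "error: LOGIN_PASSWORD is not set" >&2
    return 1
  fi
  printf '%s' "$LOGIN_PASSWORD"
}
```

In the script this could read `require_ref email "$EMAIL_REF" || exit 1` before the first action, and `PASSWORD=$(get_password) || exit 1` before typing the password.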

Pattern 3: Multi-Instance Parallel Processing

#!/bin/bash
# parallel-scrape.sh

URLS=("https://site1.com" "https://site2.com" "https://site3.com")
INSTANCES=()

# Create instances
for i in {0..2}; do
  INST=$(curl -s -X POST http://localhost:9867/instances \
    -d '{"mode":"headless"}' | jq -r '.id')
  INSTANCES[$i]=$INST
done

# Launch parallel jobs
for i in {0..2}; do
  (
    curl -s -X POST "http://localhost:9867/instances/${INSTANCES[$i]}/tabs/open" \
      -d "{\"url\":\"${URLS[$i]}\"}"
    sleep 3
    TITLE=$(curl -s "http://localhost:9867/instances/${INSTANCES[$i]}/text" | jq -r '.title')
    echo "Result $i: $TITLE"
    curl -s -X POST "http://localhost:9867/instances/${INSTANCES[$i]}/stop"
  ) &
done

wait
echo "All complete"
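
A bare `wait` returns success even if some background jobs failed, so the script above always prints "All complete". The sketch below waits on each job by PID and counts failures; `run_all` is a hypothetical helper, and the worker is any function or command that takes one item as its argument.

```shell
#!/usr/bin/env bash
# run_all WORKER ITEM...: run WORKER once per item in the background,
# then wait for each job individually and count the failures.
run_all() {
  local worker=$1
  shift
  local pids=() pid item failed=0
  for item in "$@"; do
    "$worker" "$item" &
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=$((failed + 1))
  done
  echo "failed: $failed"
  (( failed == 0 ))
}
```

A call such as `run_all scrape_one "${URLS[@]}"` would work with a `scrape_one` function containing the per-URL steps from the script above (that function name is also illustrative).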

Pattern 4: Visual Regression Testing

#!/bin/bash
# visual-regression.sh

URLS=("https://staging.example.com" "https://production.example.com")
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

for URL in "${URLS[@]}"; do
  curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
    -d "{\"url\":\"$URL\"}"
  sleep 3

  # Take screenshot
  FILENAME=$(echo "$URL" | sed 's/[^a-zA-Z0-9]/_/g').png
  curl -s "http://localhost:9867/instances/$INST/screenshot" \
    --output "$FILENAME"
  echo "Saved: $FILENAME"
done

curl -s -X POST "http://localhost:9867/instances/$INST/stop"
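
The script above only saves the screenshots; it does not compare them. As a crude stand-in, the two files can be checked for byte equality with `cmp`. This is a sketch, not a PinchTab feature: `same_image` is a hypothetical helper, and two renders of the same page can differ by a few bytes, so real visual regression testing usually needs an image-diff tool with a tolerance.

```shell
#!/usr/bin/env bash
# same_image FILE_A FILE_B
# Returns 0 if both files are byte-identical, 1 if they differ,
# and 2 if either file is missing or empty.
same_image() {
  if [[ ! -s $1 || ! -s $2 ]]; then
    echo "error: missing or empty screenshot" >&2
    return 2
  fi
  cmp -s "$1" "$2"
}
```

Usage might look like `same_image staging.png production.png || echo "screenshots differ"`, with the file names standing in for whatever the loop above produced.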

Pattern 5: Session Persistence

#!/bin/bash
# persistent-session.sh

# Create instance with named profile
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"myaccount","mode":"headless"}' | jq -r '.id')

# Login once
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d '{"url":"https://example.com/login"}'
sleep 3
# ... perform login ...

# Stop (cookies saved to profile)
curl -s -X POST "http://localhost:9867/instances/$INST/stop"

# Later: Resume with same profile
INST2=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"myaccount","mode":"headless"}' | jq -r '.id')

# Already logged in!
curl -s -X POST "http://localhost:9867/instances/$INST2/tabs/open" \
  -d '{"url":"https://example.com/dashboard"}'

MCP Integration

PinchTab provides an SMCP plugin for native Claude Code integration.

Setup

# Set plugin directory
export MCP_PLUGINS_DIR=/path/to/pinchtab/plugins

# Restart Claude Code to load plugin

Available Tools

Tool                      Description
pinchtab__navigate        Navigate to URL
pinchtab__snapshot        Get page structure
pinchtab__action          Click, type, press keys
pinchtab__text            Extract text content
pinchtab__screenshot      Capture screenshot
pinchtab__pdf             Generate PDF
pinchtab__evaluate        Run JavaScript
pinchtab__cookies_get     Get cookies
pinchtab__stealth_status  Check stealth mode

Usage in Claude Code

Use pinchtab to navigate to example.com and extract the main headlines.

Claude will:

  1. Call pinchtab__navigate with URL
  2. Wait 3 seconds
  3. Call pinchtab__snapshot with filter
  4. Extract headlines from result

Headless vs Headed

Aspect       Headless                Headed
Window       No visible UI           Chrome window visible
Speed        ~20% faster             Slower (rendering overhead)
Memory       ~50-80 MB               ~100-150 MB
Use Case     CI/CD, scraping, batch  Debugging, visual QA
Interaction  API only                API + manual

# Headless for production
pinchtab instances create --mode=headless

# Headed for debugging
pinchtab instances create --mode=headed

Best Practices

DO

  • ✅ Wait 3+ seconds after navigation
  • ✅ Use text extraction over screenshots (token-efficient)
  • ✅ Filter snapshots to reduce tokens
  • ✅ Use profiles for persistent sessions
  • ✅ Run headless in production
  • ✅ Clean up instances after use
  • ✅ Handle errors gracefully

DON'T

  • ❌ Skip the 3-second wait
  • ❌ Take screenshots for text extraction
  • ❌ Parse full snapshots without filtering
  • ❌ Use headed mode in CI/CD
  • ❌ Leave instances running indefinitely
  • ❌ Hardcode element refs (they change)
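
Two of the points above, cleaning up instances and handling errors gracefully, can be combined with a shell `trap` so the instance is stopped even when the script exits early. A minimal sketch, assuming the stop endpoint shown in the HTTP API reference; `PINCHTAB_URL` and `cleanup` are hypothetical names.

```shell
#!/usr/bin/env bash
# PINCHTAB_URL is a hypothetical override for the documented base URL.
PINCHTAB_URL=${PINCHTAB_URL:-http://localhost:9867}
INST=""

# Stop the instance if one was created. Safe to call more than once.
cleanup() {
  if [[ -n $INST ]]; then
    curl -s -X POST "$PINCHTAB_URL/instances/$INST/stop" >/dev/null || true
    INST=""
  fi
}

# Run cleanup on every exit path (normal exit, `exit 1`, or a failed command under `set -e`).
trap cleanup EXIT
```

After this preamble, a script assigns `INST` once the instance has been created and no longer needs an explicit stop call at the end.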

Troubleshooting

Only getting 1 node in snapshot

Cause: Accessibility tree not ready
Fix: Increase wait time to 3+ seconds

pinchtab nav https://example.com
sleep 3  # Increase if needed
pinchtab snap

Timeouts

Cause: Page too slow or Chrome overloaded
Fix: Increase sleep or use headless mode

# Increase wait
sleep 5

# Or use headless for faster rendering
pinchtab instances create --mode=headless

Element not found

Cause: Refs change between snapshots
Fix: Re-snapshot before each action

# Get fresh refs before each action
REF=$(pinchtab snap -i | jq -r '.nodes[] | select(.name == "Submit") | .ref')
pinchtab click "$REF"

Connection refused

Cause: PinchTab server not running
Fix: Start server first

pinchtab  # In separate terminal
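
When the server is started from a script, the first request can race the server's startup. A readiness check can be sketched as below. It assumes that a GET on `/instances` (the list endpoint shown in the HTTP API reference) succeeds once the server is up; `wait_for_server` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# wait_for_server [BASE_URL] [MAX_TRIES] [DELAY_SECONDS]
# Returns 0 as soon as the server answers, 1 if it never does.
wait_for_server() {
  local base=${1:-http://localhost:9867} tries=${2:-10} delay=${3:-1} i
  for ((i = 1; i <= tries; i++)); do
    if curl -fsS -o /dev/null "$base/instances" 2>/dev/null; then
      return 0
    fi
    sleep "$delay"
  done
  echo "error: PinchTab not reachable at $base" >&2
  return 1
}
```

A script could run `pinchtab &` followed by `wait_for_server || exit 1` before issuing any commands.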

Token-efficient browser automation for AI agents.

Security recommendations
This skill appears to do what it says (PinchTab-based browser automation) but exercise caution before following its install instructions. Verify the authenticity of pinchtab.com and prefer installing from your OS package manager or a verified release with checksums/signatures. Never run curl | bash on a URL you haven't audited—download the installer, inspect it, and verify signatures/checksums. If you plan to automate pages that require credentials, use isolated environments (VM/container) and avoid storing secrets in plaintext in scripts. If you need stronger assurance, request the upstream project's release checksums or use the Docker image instead of piping remote scripts directly. Finally, be aware that the skill omits declaring runtime dependencies (curl, jq, docker, npm, pinchtab CLI); ensure those tools are from trusted sources before use.
Capability analysis
Type: OpenClaw Skill
Name: browser-automation-pin
Version: 1.0.0
The skill bundle for 'PinchTab' provides browser automation capabilities for AI agents, including tools for navigation, interaction, and data extraction. It is classified as suspicious due to high-risk instructions in `skill.md`, specifically the recommendation to install the software via a pipe-to-shell command (`curl | bash`) and the inclusion of features for arbitrary JavaScript execution (`pinchtab__evaluate`) and session persistence. While these capabilities are aligned with the stated purpose of web automation, they represent a significant attack surface without explicit security constraints or input sanitization mentioned in the documentation.
Capability assessment
Purpose & Capability
The name and description match the SKILL.md content: the guide is about controlling Chrome via PinchTab. However the SKILL.md repeatedly uses command-line tools (curl, jq, npm, docker, pinchtab CLI) yet the skill metadata declares no required binaries or install steps. Not declaring these runtime dependencies is an inconsistency (informational, not necessarily malicious).
Instruction Scope
The instructions stay within the stated purpose (navigating pages, extracting text, clicking, filling forms, snapshots). A notable scope concern: the guide recommends executing remote installer scripts (curl -fsSL https://pinchtab.com/install.sh | bash) and running arbitrary evaluate JavaScript endpoints; those actions could execute arbitrary code from the PinchTab provider and should be reviewed before running. The examples include filling inputs (e.g., credentials) which is expected for form automation but could expose secrets to the pages automated—this is a normal risk for browser automation but worth calling out.
Install Mechanism
There is no install spec in the registry, but SKILL.md instructs running a remote install script (curl | bash), npm -g installs, and docker pull/run for pinchtab/pinchtab. Executing an unverified remote installer or running a downloaded binary is higher-risk than installing from a vetted package with checksums. The SKILL.md provides no checksums, signatures, or pinned versions.
Credentials
The skill declares no required environment variables or credentials, and the instructions do not attempt to read hidden environment variables or unrelated system config. The absence of requested credentials is proportionate to the described purpose. (Be aware that automated browsing can cause user-entered credentials to be submitted to remote pages—this is an application-level risk, not an incoherence in the skill manifest.)
Persistence & Privilege
The skill is instruction-only, has always=false, and requests no persistent system privileges or configuration changes in other skills. It does not attempt to modify other skills' configurations or request permanent presence.
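How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install browser-automation-pin
  3. After installation, invoke the Skill by name or trigger it with /browser-automation-pin
  4. Provide the required inputs according to the Skill's parameter description to get structured output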
Version history
v1.0.0
Initial release of the browser-automation skill, enabling programmatic Chrome browser control via PinchTab.
  • Automate browser tasks including navigation, interaction, testing, and data extraction.
  • Supports token-efficient text extraction (800 tokens/page) and interactive element identification.
  • Provides multi-instance, profile-based, headless and headed Chrome operation.
  • RESTful HTTP API for actions: navigation, snapshots, clicks, typing, scrolling, extraction, and more.
  • Includes ready-to-use bash patterns for web scraping, form filling, parallel automation, and visual testing.
Metadata
Slug: browser-automation-pin
Version: 1.0.0
License: (not listed)
Total installs: 0
Current installs: 0
Versions: 1
FAQ

What is Browser Automation?

Browser automation for AI agents using PinchTab. Control Chrome programmatically for testing, scraping, and interaction. Features token-efficient text extrac... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 332 total downloads to date.

How do I install Browser Automation?

Run the command "/install browser-automation-pin" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Browser Automation free?

Yes. Browser Automation is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Browser Automation support?

Browser Automation is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Browser Automation?

It is developed and maintained by huamu668 (@huamu668); the current version is v1.0.0.
