
Browser Automation

by huamu668 · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
Downloads: 332 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install browser-automation-pin
Description
Browser automation for AI agents using PinchTab. Control Chrome programmatically for testing, scraping, and interaction. Features token-efficient text extrac...
README (SKILL.md)

Browser Automation

Browser automation for AI agents using PinchTab — a high-performance Chrome bridge with HTTP API.

What is PinchTab?

  • Standalone HTTP server — Control Chrome via HTTP API
  • Token-efficient — 800 tokens/page with text extraction (vs 10,000+ for screenshots)
  • Multi-instance — Run multiple parallel Chrome processes with isolated profiles
  • Headless or Headed — Run without window or with visible Chrome
  • Self-contained — 12MB binary, no external dependencies
  • MCP integration — Native SMCP plugin for Claude Code

Quick Start

Installation

# macOS / Linux
curl -fsSL https://pinchtab.com/install.sh | bash

# npm
npm install -g pinchtab

# Docker
docker run -d -p 9867:9867 pinchtab/pinchtab

Start Server

# Terminal 1: Start PinchTab server
pinchtab
# Server runs on http://localhost:9867

Basic Commands

# Navigate
pinchtab nav https://pinchtab.com

# Wait 3 seconds for accessibility tree
sleep 3

# Get interactive elements
pinchtab snap -i -c

# Extract text (token-efficient)
pinchtab text

# Click element by ref
pinchtab click e5

# Fill input
pinchtab fill e3 "[email protected]"

Core Concepts

Instance

A running Chrome process. Each instance has isolated state.

# Create headless instance
pinchtab instances create --mode=headless

# Create headed instance (visible window)
pinchtab instances create --mode=headed

# List instances
pinchtab instances list

# Stop instance
pinchtab instances stop <instance-id>

Profile

Browser state (cookies, history, localStorage). Log in once, stay logged in.

# Create instance with profile
pinchtab instances create --profile=work

# Profile persists across restarts

Tab

A single webpage. Each instance can have multiple tabs.

# Open new tab
pinchtab tabs open https://example.com

# List tabs
pinchtab tabs list

# Close tab
pinchtab tabs close <tab-id>

Token-Efficient Patterns

The 3-Second Wait Rule

Critical: Chrome's accessibility tree takes ~3 seconds to populate after navigation.

# ❌ Too fast - empty tree
pinchtab nav https://example.com
pinchtab snap
# Returns: {"count": 1, "nodes": [{"ref": "e0"}]}

# ✅ Wait 3 seconds
pinchtab nav https://example.com
sleep 3
pinchtab snap
# Returns: {"count": 2645, "nodes": [...]}
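
A fixed `sleep 3` can be too short on slow pages and wasteful on fast ones. A minimal sketch of a polling alternative follows, assuming the snapshot JSON exposes a numeric `count` field as in the sample output above. `wait_for_count` and `snapshot_count` are hypothetical helper names, not part of PinchTab, and the threshold of 50 nodes is an arbitrary choice.

```shell
# wait_for_count MIN TRIES DELAY CMD...
# Runs CMD until it prints an integer >= MIN, up to TRIES attempts,
# sleeping DELAY seconds between attempts. Prints the final count.
wait_for_count() {
  local min="$1" tries="$2" delay="$3" n i=0
  shift 3
  while [ "$i" -lt "$tries" ]; do
    n="$("$@" 2>/dev/null)" || n=""
    case "$n" in
      ''|*[!0-9]*) ;;                 # not a number yet: keep polling
      *) if [ "$n" -ge "$min" ]; then
           printf '%s\n' "$n"
           return 0
         fi ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical probe: node count of the current snapshot.
snapshot_count() { pinchtab snap | jq -r '.count'; }

# Usage (requires a running PinchTab server):
#   pinchtab nav https://example.com
#   wait_for_count 50 10 1 snapshot_count || echo "tree still empty" >&2
```

Passing the probe as a command keeps the loop independent of PinchTab, so the same helper can poll the HTTP API instead of the CLI.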

Optimal Extraction Pattern

# Navigate + wait + filter (14x token savings)
curl -X POST http://localhost:9867/navigate \
  -d '{"url": "https://example.com"}' && \
sleep 3 && \
curl http://localhost:9867/snapshot | \
jq '.nodes[] | select(.name | length > 15) | .name' | \
head -30

Why this works:

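  1. Navigate + wait ensures the accessibility tree is fully populated
  2. The jq filter keeps only each node's name and drops the rest of the node data
  3. length > 15 drops short names, which are mostly buttons and labels
  4. head -30 caps the output at 30 lines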
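
The filter step above can be packaged as a small function and tried against a saved snapshot without a live browser. This is a sketch under the assumption that snapshots have the `{"nodes": [{"ref": ..., "name": ...}]}` shape suggested by the examples in this guide; `long_names` is a hypothetical helper name and the sample data is made up.

```shell
# long_names MIN [LIMIT]: read snapshot JSON on stdin and print the names
# longer than MIN characters, at most LIMIT lines (default 30).
# Nodes without a name are treated as having an empty name.
long_names() {
  local min="$1" limit="${2:-30}"
  jq -r --argjson min "$min" \
    '.nodes[] | (.name // "") | select(length > $min)' | head -n "$limit"
}

# Made-up sample standing in for a live /snapshot response.
sample='{"nodes":[{"ref":"e1","name":"OK"},{"ref":"e2","name":"Breaking: markets rally on rate cut"},{"ref":"e3"}]}'

# Usage:
#   printf '%s\n' "$sample" | long_names 15
#   curl -s http://localhost:9867/snapshot | long_names 15 30
```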

Token comparison:

  • Exploratory approach: ~3,800 tokens
  • Pattern-driven: ~270 tokens
  • Savings: 14x
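
To check the savings on your own pages, a rough estimate is enough. The sketch below uses the common heuristic of about 4 characters per token, which is an assumption and not how any particular tokenizer counts. `estimate_tokens` and `savings_ratio` are hypothetical helpers. With the figures above, 3800 / 270 is about 14.

```shell
# estimate_tokens: read text on stdin, print bytes/4 as a rough token count.
estimate_tokens() {
  local bytes
  bytes=$(( $(wc -c) + 0 ))   # arithmetic strips the padding some wc builds add
  echo $(( bytes / 4 ))
}

# savings_ratio BEFORE AFTER: integer ratio between two token counts.
savings_ratio() {
  echo $(( $1 / $2 ))
}

savings_ratio 3800 270   # prints 14

# Usage (requires a running PinchTab server):
#   pinchtab text | estimate_tokens
```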

HTTP API Reference

Base URL

http://localhost:9867
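
Every endpoint in this reference hangs off the base URL, so scripts are easier to retarget (for example at a Docker host) when the base lives in one variable. A minimal sketch; `PINCHTAB_URL` and `pt_url` are names invented here, not settings PinchTab itself reads.

```shell
# Base URL, overridable from the environment (hypothetical variable name).
PINCHTAB_URL="${PINCHTAB_URL:-http://localhost:9867}"

# pt_url INSTANCE_ID PATH: build an instance-scoped endpoint URL.
# A trailing slash on the base is dropped to avoid "//" in the result.
pt_url() {
  printf '%s/instances/%s/%s\n' "${PINCHTAB_URL%/}" "$1" "$2"
}

# Usage:
#   curl -s "$(pt_url "$INST" snapshot)"
#   curl -s "$(pt_url "$INST" text)"
```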

Instances

# Create instance (despite its name, TAB holds the instance ID used in the URLs below)
TAB=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"work","mode":"headless"}' | jq -r '.id')

# List instances
curl http://localhost:9867/instances

# Stop instance
curl -X POST "http://localhost:9867/instances/$TAB/stop"

Navigation

# Navigate to URL
curl -X POST "http://localhost:9867/instances/$TAB/tabs/open" \
  -d '{"url":"https://example.com"}'

# Wait for load
sleep 3

Snapshot

# Full snapshot
curl "http://localhost:9867/instances/$TAB/snapshot"

# Interactive elements only
curl "http://localhost:9867/instances/$TAB/snapshot?filter=interactive"

# With coordinates
curl "http://localhost:9867/instances/$TAB/snapshot?includeCoords=true"

Actions

# Click element
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"click","ref":"e5"}'

# Type text
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"type","ref":"e12","text":"hello"}'

# Press key
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"key","ref":"e12","key":"Enter"}'

# Scroll
curl -X POST "http://localhost:9867/instances/$TAB/action" \
  -d '{"kind":"scroll","direction":"down"}'
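
Hand-written JSON breaks as soon as the typed text contains a double quote or a backslash. The sketch below builds the action payloads shown above with minimal escaping. `json_escape` and `action_payload` are hypothetical helpers; the escaping covers only `\` and `"`, so input with newlines or control characters needs a real JSON encoder such as `jq -n --arg`.

```shell
# json_escape STRING: escape backslashes and double quotes for a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# action_payload KIND REF [TEXT]: print an action body in the shape used above.
action_payload() {
  local kind ref text
  kind=$(json_escape "$1")
  ref=$(json_escape "$2")
  if [ "$#" -ge 3 ]; then
    text=$(json_escape "$3")
    printf '{"kind":"%s","ref":"%s","text":"%s"}\n' "$kind" "$ref" "$text"
  else
    printf '{"kind":"%s","ref":"%s"}\n' "$kind" "$ref"
  fi
}

# Usage:
#   curl -s -X POST "http://localhost:9867/instances/$TAB/action" \
#     -d "$(action_payload click e5)"
#   curl -s -X POST "http://localhost:9867/instances/$TAB/action" \
#     -d "$(action_payload type e12 'say "hi"')"
```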

Extraction

# Extract text (token-efficient)
curl "http://localhost:9867/instances/$TAB/text"

# Take screenshot
curl "http://localhost:9867/instances/$TAB/screenshot" \
  --output screenshot.png

# Generate PDF
curl "http://localhost:9867/instances/$TAB/pdf" \
  --output page.pdf

# Evaluate JavaScript
curl -X POST "http://localhost:9867/instances/$TAB/evaluate" \
  -d '{"script": "document.title"}'

Common Patterns

Pattern 1: Web Scraping

#!/bin/bash
# scrape-headlines.sh

URL="${1:?usage: scrape-headlines.sh <url>}"
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

# Navigate and wait
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d "{\"url\":\"$URL\"}"
sleep 3

# Extract headlines (filter by length)
curl -s "http://localhost:9867/instances/$INST/snapshot" | \
  jq '.nodes[] | select(.name | length > 20) | .name' | \
  head -20

# Cleanup
curl -s -X POST "http://localhost:9867/instances/$INST/stop"

Pattern 2: Form Interaction

#!/bin/bash
# fill-form.sh

INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

# Navigate to form
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d '{"url":"https://example.com/login"}'
sleep 3

# Get snapshot to find element refs
SNAPSHOT=$(curl -s "http://localhost:9867/instances/$INST/snapshot?filter=interactive")

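# Extract refs by matching names case-insensitively (example: e5=email, e7=password, e9=submit).
# `// ""` guards against nodes without a name; first(...) keeps one ref per field,
# and the first button on the page is assumed to be the submit button.
EMAIL_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select((.name // "") | ascii_downcase | contains("email")) | .ref)')
PASS_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select((.name // "") | ascii_downcase | contains("password")) | .ref)')
SUBMIT_REF=$(echo "$SNAPSHOT" | jq -r 'first(.nodes[] | select(.role == "button") | .ref)')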

# Fill form
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"type\",\"ref\":\"$EMAIL_REF\",\"text\":\"[email protected]\"}"
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"type\",\"ref\":\"$PASS_REF\",\"text\":\"password123\"}"

# Submit
curl -s -X POST "http://localhost:9867/instances/$INST/action" \
  -d "{\"kind\":\"click\",\"ref\":\"$SUBMIT_REF\"}"

# Wait for navigation
sleep 3

# Verify login
curl -s "http://localhost:9867/instances/$INST/text" | jq -r '.title'

# Cleanup
curl -s -X POST "http://localhost:9867/instances/$INST/stop"

Pattern 3: Multi-Instance Parallel Processing

#!/bin/bash
# parallel-scrape.sh

URLS=("https://site1.com" "https://site2.com" "https://site3.com")
INSTANCES=()

# Create one instance per URL
for i in "${!URLS[@]}"; do
  INST=$(curl -s -X POST http://localhost:9867/instances \
    -d '{"mode":"headless"}' | jq -r '.id')
  INSTANCES[$i]=$INST
done

# Launch parallel jobs
for i in "${!URLS[@]}"; do
  (
    curl -s -X POST "http://localhost:9867/instances/${INSTANCES[$i]}/tabs/open" \
      -d "{\"url\":\"${URLS[$i]}\"}"
    sleep 3
    TITLE=$(curl -s "http://localhost:9867/instances/${INSTANCES[$i]}/text" | jq -r '.title')
    echo "Result $i: $TITLE"
    curl -s -X POST "http://localhost:9867/instances/${INSTANCES[$i]}/stop"
  ) &
done

wait
echo "All complete"
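
A bare `wait` discards the exit status of each background job, so a failed scrape goes unnoticed. A sketch of per-job status tracking follows; `process_url` is a hypothetical stand-in for the navigate, wait, extract, and stop steps in the script above.

```shell
# Hypothetical per-URL job; replace the body with the real curl calls.
process_url() {
  echo "processing $1"
}

# run_parallel URL...: start one background job per URL, wait for each by
# PID, and print the number of jobs that exited non-zero.
run_parallel() {
  local pids="" failed=0 url pid
  for url in "$@"; do
    process_url "$url" >/dev/null &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  echo "$failed"
}

# Usage:
#   failures=$(run_parallel https://site1.com https://site2.com)
#   [ "$failures" -eq 0 ] || echo "$failures job(s) failed" >&2
```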

Pattern 4: Visual Regression Testing

#!/bin/bash
# visual-regression.sh

URLS=("https://staging.example.com" "https://production.example.com")
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"mode":"headless"}' | jq -r '.id')

for URL in "${URLS[@]}"; do
  curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
    -d "{\"url\":\"$URL\"}"
  sleep 3

  # Take screenshot
  FILENAME=$(echo "$URL" | sed 's/[^a-zA-Z0-9]/_/g').png
  curl -s "http://localhost:9867/instances/$INST/screenshot" \
    --output "$FILENAME"
  echo "Saved: $FILENAME"
done

curl -s -X POST "http://localhost:9867/instances/$INST/stop"
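
The script above only captures screenshots; deciding whether staging and production differ is a separate step. As a crude first check, the two files can be compared byte for byte with `cmp`. This flags any difference, including harmless ones such as timestamps or antialiasing, so a perceptual image diff tool is the more realistic choice. `screenshots_match` is a hypothetical helper, and the file names in the usage comment follow the `sed` naming used above.

```shell
# screenshots_match FILE_A FILE_B: succeed only if both files exist and are
# byte-identical.
screenshots_match() {
  [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# Usage:
#   if screenshots_match https___staging_example_com.png \
#                        https___production_example_com.png; then
#     echo "no visual change"
#   else
#     echo "screenshots differ (or one is missing)" >&2
#   fi
```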

Pattern 5: Session Persistence

#!/bin/bash
# persistent-session.sh

# Create instance with named profile
INST=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"myaccount","mode":"headless"}' | jq -r '.id')

# Login once
curl -s -X POST "http://localhost:9867/instances/$INST/tabs/open" \
  -d '{"url":"https://example.com/login"}'
sleep 3
# ... perform login ...

# Stop (cookies saved to profile)
curl -s -X POST "http://localhost:9867/instances/$INST/stop"

# Later: Resume with same profile
INST2=$(curl -s -X POST http://localhost:9867/instances \
  -d '{"profile":"myaccount","mode":"headless"}' | jq -r '.id')

# Already logged in!
curl -s -X POST "http://localhost:9867/instances/$INST2/tabs/open" \
  -d '{"url":"https://example.com/dashboard"}'

MCP Integration

PinchTab provides an SMCP plugin for native Claude Code integration.

Setup

# Set plugin directory
export MCP_PLUGINS_DIR=/path/to/pinchtab/plugins

# Restart Claude Code to load plugin

Available Tools

| Tool | Description |
| --- | --- |
| pinchtab__navigate | Navigate to URL |
| pinchtab__snapshot | Get page structure |
| pinchtab__action | Click, type, press keys |
| pinchtab__text | Extract text content |
| pinchtab__screenshot | Capture screenshot |
| pinchtab__pdf | Generate PDF |
| pinchtab__evaluate | Run JavaScript |
| pinchtab__cookies_get | Get cookies |
| pinchtab__stealth_status | Check stealth mode |

Usage in Claude Code

Use pinchtab to navigate to example.com and extract the main headlines.

Claude will:

  1. Call pinchtab__navigate with URL
  2. Wait 3 seconds
  3. Call pinchtab__snapshot with filter
  4. Extract headlines from result

Headless vs Headed

| Aspect | Headless | Headed |
| --- | --- | --- |
| Window | No visible UI | Chrome window visible |
| Speed | ~20% faster | Slower (rendering overhead) |
| Memory | ~50-80 MB | ~100-150 MB |
| Use Case | CI/CD, scraping, batch | Debugging, visual QA |
| Interaction | API only | API + manual |

# Headless for production
pinchtab instances create --mode=headless

# Headed for debugging
pinchtab instances create --mode=headed

Best Practices

DO

  • ✅ Wait 3+ seconds after navigation
  • ✅ Use text extraction over screenshots (token-efficient)
  • ✅ Filter snapshots to reduce tokens
  • ✅ Use profiles for persistent sessions
  • ✅ Run headless in production
  • ✅ Clean up instances after use
  • ✅ Handle errors gracefully
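
For the last two points, a shell `trap` is one way to make sure an instance is stopped even when a step in the middle fails. A minimal sketch, assuming the `/instances/<id>/stop` endpoint from the API reference; `stop_instance` and `with_instance` are hypothetical helper names, as is the `./scrape-step.sh` script in the usage comment.

```shell
# stop_instance ID: ask PinchTab to stop the instance; errors are ignored so
# cleanup never masks the original failure.
stop_instance() {
  curl -s -X POST "http://localhost:9867/instances/$1/stop" >/dev/null || true
}

# with_instance ID CMD...: run CMD in a subshell whose EXIT trap stops the
# instance, whether CMD succeeds or fails. Returns CMD's exit status.
with_instance() {
  local inst="$1"
  shift
  (
    trap 'stop_instance "$inst"' EXIT
    "$@"
  )
}

# Usage:
#   INST=$(curl -s -X POST http://localhost:9867/instances \
#     -d '{"mode":"headless"}' | jq -r '.id')
#   with_instance "$INST" ./scrape-step.sh "$INST"
```

Running the work in a subshell scopes the trap to that one instance, so nested or repeated calls do not overwrite each other's cleanup.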

DON'T

  • ❌ Skip the 3-second wait
  • ❌ Take screenshots for text extraction
  • ❌ Parse full snapshots without filtering
  • ❌ Use headed mode in CI/CD
  • ❌ Leave instances running indefinitely
  • ❌ Hardcode element refs (they change)

Troubleshooting

Only getting 1 node in snapshot

Cause: Accessibility tree not ready
Fix: Increase wait time to 3+ seconds

pinchtab nav https://example.com
sleep 3  # Increase if needed
pinchtab snap

Timeouts

Cause: Page too slow or Chrome overloaded
Fix: Increase sleep or use headless mode

# Increase wait
sleep 5

# Or use headless for faster rendering
pinchtab instances create --mode=headless

Element not found

Cause: Refs change between snapshots
Fix: Re-snapshot before each action

# Get fresh refs before each action
REF=$(pinchtab snap -i | jq -r '.nodes[] | select(.name == "Submit") | .ref')
pinchtab click "$REF"

Connection refused

Cause: PinchTab server not running
Fix: Start server first

pinchtab  # In separate terminal
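
A script can check for the server up front and fail with a clear message instead of a string of connection errors. A sketch, assuming `GET /instances` (listed in the API reference) answers whenever the server is up; `require_server` is a hypothetical helper.

```shell
# require_server [BASE_URL]: succeed if the PinchTab server answers,
# otherwise print a hint on stderr and return 1.
require_server() {
  local base="${1:-http://localhost:9867}"
  if curl -s -o /dev/null --max-time 2 "$base/instances"; then
    return 0
  fi
  echo "PinchTab is not reachable at $base; start it with 'pinchtab' first" >&2
  return 1
}

# Usage:
#   require_server || exit 1
```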


Token-efficient browser automation for AI agents.

Usage Guidance
This skill appears to do what it says (PinchTab-based browser automation) but exercise caution before following its install instructions. Verify the authenticity of pinchtab.com and prefer installing from your OS package manager or a verified release with checksums/signatures. Never run curl | bash on a URL you haven't audited—download the installer, inspect it, and verify signatures/checksums. If you plan to automate pages that require credentials, use isolated environments (VM/container) and avoid storing secrets in plaintext in scripts. If you need stronger assurance, request the upstream project's release checksums or use the Docker image instead of piping remote scripts directly. Finally, be aware that the skill omits declaring runtime dependencies (curl, jq, docker, npm, pinchtab CLI); ensure those tools are from trusted sources before use.
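
The download, inspect, and verify steps recommended above might look like the sketch below. It assumes `sha256sum` (GNU coreutils) or `shasum` (macOS) is available. `verify_sha256` is a hypothetical helper, and the expected hash must come from a source you trust; this page does not list one.

```shell
# verify_sha256 FILE EXPECTED_HEX: succeed only if FILE's SHA-256 matches.
verify_sha256() {
  local actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$1" | cut -d' ' -f1)
  else
    actual=$(shasum -a 256 "$1" | cut -d' ' -f1)
  fi
  [ "$actual" = "$2" ]
}

# Usage (EXPECTED is a placeholder for a checksum published by the vendor):
#   curl -fsSL -o install.sh https://pinchtab.com/install.sh
#   less install.sh                      # read it before running anything
#   verify_sha256 install.sh "$EXPECTED" && bash install.sh
```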
Capability Analysis
Type: OpenClaw Skill
Name: browser-automation-pin
Version: 1.0.0

The skill bundle for 'PinchTab' provides browser automation capabilities for AI agents, including tools for navigation, interaction, and data extraction. It is classified as suspicious due to high-risk instructions in `skill.md`, specifically the recommendation to install the software via a pipe-to-shell command (`curl | bash`) and the inclusion of features for arbitrary JavaScript execution (`pinchtab__evaluate`) and session persistence. While these capabilities are aligned with the stated purpose of web automation, they represent a significant attack surface without explicit security constraints or input sanitization mentioned in the documentation.
Capability Assessment
Purpose & Capability
The name and description match the SKILL.md content: the guide is about controlling Chrome via PinchTab. However the SKILL.md repeatedly uses command-line tools (curl, jq, npm, docker, pinchtab CLI) yet the skill metadata declares no required binaries or install steps. Not declaring these runtime dependencies is an inconsistency (informational, not necessarily malicious).
Instruction Scope
The instructions stay within the stated purpose (navigating pages, extracting text, clicking, filling forms, snapshots). A notable scope concern: the guide recommends executing remote installer scripts (curl -fsSL https://pinchtab.com/install.sh | bash) and running arbitrary evaluate JavaScript endpoints; those actions could execute arbitrary code from the PinchTab provider and should be reviewed before running. The examples include filling inputs (e.g., credentials) which is expected for form automation but could expose secrets to the pages automated—this is a normal risk for browser automation but worth calling out.
Install Mechanism
There is no install spec in the registry, but SKILL.md instructs running a remote install script (curl | bash), npm -g installs, and docker pull/run for pinchtab/pinchtab. Executing an unverified remote installer or running a downloaded binary is higher-risk than installing from a vetted package with checksums. The SKILL.md provides no checksums, signatures, or pinned versions.
Credentials
The skill declares no required environment variables or credentials, and the instructions do not attempt to read hidden environment variables or unrelated system config. The absence of requested credentials is proportionate to the described purpose. (Be aware that automated browsing can cause user-entered credentials to be submitted to remote pages—this is an application-level risk, not an incoherence in the skill manifest.)
Persistence & Privilege
The skill is instruction-only, has always=false, and requests no persistent system privileges or configuration changes in other skills. It does not attempt to modify other skills' configurations or request permanent presence.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install browser-automation-pin
  3. After installation, invoke the skill by name or use /browser-automation-pin
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of browser-automation skill, enabling programmatic Chrome browser control via PinchTab.

  • Automate browser tasks including navigation, interaction, testing, and data extraction.
  • Supports token-efficient text extraction (800 tokens/page) and interactive element identification.
  • Provides multi-instance, profile-based, headless and headed Chrome operation.
  • RESTful HTTP API for actions: navigation, snapshots, clicks, typing, scrolling, extraction, and more.
  • Includes ready-to-use bash patterns for web scraping, form filling, parallel automation, and visual testing.
Metadata
Slug browser-automation-pin
Version 1.0.0
License
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Browser Automation?

Browser automation for AI agents using PinchTab. Control Chrome programmatically for testing, scraping, and interaction. Features token-efficient text extrac... It is an AI Agent Skill for Claude Code / OpenClaw, with 332 downloads so far.

How do I install Browser Automation?

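Run "/install browser-automation-pin" in the OpenClaw or Claude Code chat to install the skill in one step. The tools its examples call (the pinchtab CLI, curl, jq) are not declared as dependencies and must be installed separately.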

Is Browser Automation free?

Yes, Browser Automation is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Browser Automation support?

Browser Automation is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Browser Automation?

It is built and maintained by huamu668 (@huamu668); the current version is v1.0.0.
