Description

Onboard an agent to Bright Data. Use when a coding agent first encounters Bright Data — for live web work (search, scrape, structured data), for wiring Brigh...

README (SKILL.md)

Bright Data — Agent Onboarding

Name: Agent Onboarding
Author: meirk-brd

Bright Data gives agents reliable access to the open web: SERP results that look like a real browser, clean markdown from any URL (with CAPTCHA + JS handled), structured datasets for 40+ platforms (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, …), and a Browser API for pages that need real interaction.

This skill is the entry point. Read it once, pick a path, then hand off to the narrower skill that owns that path.

Install

One command installs the CLI and the agent skills, and walks the human through OAuth in the browser:

# macOS / Linux — fastest install
curl -fsSL https://cli.brightdata.com/install.sh | bash

# Cross-platform (or if you don't want the install script)
npm install -g @brightdata/cli

# One-off, no install
npx --yes --package @brightdata/cli brightdata \x3Ccommand>

Requires Node.js >= 20. After install, both brightdata and bdata (shorthand) are available.

Then authenticate once:

bdata login

This single command:

Opens the browser for OAuth (or use bdata login --device on headless / SSH machines)
Saves the API key locally — you never need to paste a token again
Auto-creates the required proxy zones (cli_unlocker, cli_browser)
Sets sensible default configuration

For non-interactive setups you can pass the key directly:

bdata login --api-key \x3Ckey>
# or
export BRIGHTDATA_API_KEY=\x3Ckey>

Verify the install before doing real work:

bdata version
bdata config            # confirms auth + zones
bdata zones             # should list cli_unlocker, cli_browser
bdata budget            # confirms account + balance

If any of these fail, route to Path C (auth) before continuing.

Install agent skills (optional, recommended)

The CLI ships an installer that drops Bright Data skills directly into your coding agent's skill directory:

# Interactive picker — choose skills + target agent
bdata skill add

# Install a specific skill
bdata skill add scrape
bdata skill add data-feeds
bdata skill add competitive-intel

# See everything available
bdata skill list

These are the skills you'll hand off to from the paths below (scrape, search, data-feeds, scraper-builder, brightdata-cli, bright-data-mcp, …).

Choose your path

All paths share the same install + auth above. The difference is what you do next.

Situation	Path
Need web data during this session	Path A — live CLI tools
Need to add Bright Data to app code	Path B — SDK / REST integration
Want a drop-in tool layer for an LLM agent	Path M — MCP server
Need an API key first	Path C — auth only
Don't want to install anything	Path D — REST API directly

If your task spans paths, do them in order: auth → live tools to explore → app integration once the shape is known.

Path A — Live web tools (CLI)

Use this when the agent itself needs web data right now: discovering URLs, fetching clean content, pulling structured records from a known platform, or running a quick competitive scan.

After install + login, hand off to the narrower skills:

brightdata-cli — overall command surface (scrape, search, pipelines, status, zones, budget, config)
search — discovery via bdata search (Google / Bing / Yandex SERP, structured JSON)
scrape — clean content from a known URL via bdata scrape (markdown / HTML / JSON / screenshot)
data-feeds — structured records from 40+ supported platforms via bdata pipelines \x3Ctype> (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, Google Maps, …)
competitive-intel — packaged competitor / pricing / review / hiring / SEO analyses on top of the CLI
seo-audit — sitemap-stratified live SEO audits

Default flow for live web work:

Search first when you need discovery bdata search "query" --json
Pipelines next if the target is a supported platform — you get structured JSON with no parsing bdata pipelines amazon_product "https://amazon.com/dp/..."
Scrape when you have a URL and no platform pipeline applies bdata scrape "https://example.com" -f markdown
Browser API only when the page truly needs clicks, forms, or login (see the brightdata-cli skill for bdata browser and the bright-data-best-practices browser-api reference)

When the task shifts from "fetch data now" to "wire this into an app," switch to Path B.

Path B — Integrate Bright Data into an app

Use this when you're building an application, agent, or workflow that calls Bright Data from code and needs BRIGHTDATA_API_KEY (and a zone) in .env or runtime config.

The required question on this path is:

What should Bright Data do in the product?

Use the answer to pick the API:

Job in product	API	Skill
Fetch a single page as markdown / HTML / JSON	Web Unlocker	`bright-data-best-practices` → `web-unlocker.md`
Search engine results in structured JSON	SERP API	`bright-data-best-practices` → `serp-api.md`
Structured records from supported platforms	Web Scraper API	`bright-data-best-practices` → `web-scraper-api.md`
JS-heavy / interactive pages with Playwright/Puppeteer	Browser API	`bright-data-best-practices` → `browser-api.md`
Build a custom scraper for an arbitrary site	All four, picked by site shape	`scraper-builder`

Pick a stack

Python → use the official SDK
```
pip install brightdata-sdk
```
Hand off to python-sdk-best-practices for client setup (async/sync), platform scrapers, SERP, datasets, Browser API, and error handling.
Node / TypeScript / shell / other → call the REST API directly (Path D below has the endpoints), or use the CLI as a library via npx @brightdata/cli.
LLM tool layer (Claude, ChatGPT, etc.) → use the MCP server (Path M).

Set credentials

BRIGHTDATA_API_KEY=...
BRIGHTDATA_UNLOCKER_ZONE=cli_unlocker   # created automatically by `bdata login`
BRIGHTDATA_SERP_ZONE=cli_unlocker       # or a dedicated SERP zone

If you don't have a key yet, do Path C first.

Smoke test before writing real code

Always run one real Bright Data request before scaling up integration work — catches auth, zone, and quota issues before they hide inside your app's error paths.

# Web Unlocker via REST
curl -sS https://api.brightdata.com/request \
  -H "Authorization: Bearer $BRIGHTDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "zone": "'"$BRIGHTDATA_UNLOCKER_ZONE"'",
    "format": "raw",
    "data_format": "markdown"
  }' | head -40

If this prints clean markdown, you're wired up. If not, check the zone name and key.

Path M — MCP server (LLM tool layer)

Use this when the consumer is an LLM agent that should call Bright Data as tools (e.g., Claude Code, ChatGPT desktop, custom agent loops). The MCP server exposes 60+ tools — search, scrape, structured data per platform, browser automation — over a single URL.

Connect with:

https://mcp.brightdata.com/mcp?token=YOUR_BRIGHTDATA_API_TOKEN

Optional URL parameters:

Parameter	Effect
`pro=1`	Enable all 60+ Pro tools
`groups=\x3Cname>`	Enable a tool group (`social`, `ecommerce`, `business`, `finance`, `research`, `app_stores`, `travel`, `browser`, `advanced_scraping`)
`tools=\x3Cnames>`	Enable a specific tool list, comma-separated

Hand off to the bright-data-mcp skill for tool selection, tool-group auto-enabling, and workflow patterns. That skill explicitly replaces WebFetch / WebSearch with Bright Data MCP equivalents.

Path C — Get an API key (auth only)

Use this when the human still needs to sign up, sign in, or generate a key. Skip this path if bdata config already shows an authenticated account, or if BRIGHTDATA_API_KEY is already set in the environment.

Easiest: use the CLI's OAuth flow

bdata login            # browser-based OAuth
bdata login --device   # headless / SSH (device-code flow)

This handles signup-or-signin, key generation, zone creation, and local config in one step. Prefer this over manual flows.

Manual: dashboard

If the human prefers the web UI:

Go to https://brightdata.com/cp (sign up if needed)
Create a Web Unlocker zone ("Add" → "Unlocker zone")
Copy the API key from the dashboard
Save it where the rest of the app reads secrets:

echo "BRIGHTDATA_API_KEY=..." >> .env
echo "BRIGHTDATA_UNLOCKER_ZONE=\x3Czone-name>" >> .env

Verify

bdata budget    # any successful response means the key works

If verification fails, the key is wrong, the zone is wrong, or the account has no active subscription — surface the error to the human rather than guessing.

Path D — Use Bright Data without installing anything

Use this when the environment can't run npm / curl | bash, or when you only need one or two requests and don't want the CLI / SDK. Works for both live agent work and app integration.

You still need an API key and a zone. Two ways to get them:

Human pastes it in — if a key already exists, set BRIGHTDATA_API_KEY=... and BRIGHTDATA_UNLOCKER_ZONE=... in the environment
Browser flow — do Path C; the dashboard issues both

Base URL: https://api.brightdata.com Auth header: Authorization: Bearer $BRIGHTDATA_API_KEY

Core endpoints

# Web Unlocker — clean content from any URL
POST /request
{
  "url": "https://target.com",
  "zone": "\x3Cunlocker-zone>",
  "format": "raw",
  "data_format": "markdown"   // or "html", "screenshot", "parsed_light"
}

# SERP API — structured search results
# Use the same /request endpoint with a SERP zone and a search URL,
# adding `brd_json=1` to receive parsed JSON instead of raw HTML.
POST /request
{
  "url": "https://www.google.com/search?q=web+scraping&brd_json=1",
  "zone": "\x3Cserp-zone>",
  "format": "raw"
}

# Web Scraper API — structured data for 40+ platforms (async)
POST /datasets/v3/trigger?dataset_id=\x3Cid>
[ { "url": "https://amazon.com/dp/B09V3KXJPB" } ]

# then poll
GET  /datasets/v3/snapshot/\x3Csnapshot_id>?format=json

For the full parameter surface (special headers like x-unblock-expect, async response IDs, dataset progress states, Browser API CDP commands), read the bright-data-best-practices skill — its references are the source of truth for REST-level work.

Documentation

Product docs: https://docs.brightdata.com
LLM-friendly docs index: https://docs.brightdata.com/llms.txt
Dashboard (zones, keys, billing): https://brightdata.com/cp

After onboarding — where to go next

Once the agent is set up, route the work to the narrowest skill that fits. Quick map:

User says…	Skill
"scrape this URL" / "get this page"	`scrape`
"search Google for…" / "find URLs about…"	`search`
"get Amazon / LinkedIn / Instagram / TikTok / YouTube / Reddit data"	`data-feeds`
"build a scraper for \x3Csite>"	`scraper-builder`
"analyze my competitor" / "compare pricing"	`competitive-intel`
"audit SEO" / "rank check" / "schema check"	`seo-audit`
"write Bright Data code in Python"	`python-sdk-best-practices`
"plug Bright Data into my LLM agent"	`bright-data-mcp`
"use the CLI" / "run from terminal"	`brightdata-cli`
"debug a Browser API session"	`brd-browser-debug`

When in doubt, prefer the more specific skill: data-feeds over scrape for supported platforms, scraper-builder over scrape for multi-page extraction, bright-data-mcp over brightdata-cli when the consumer is an LLM agent rather than a human at a terminal.

Usage Guidance

This SKILL.md is coherent for onboarding to Bright Data, but exercise caution before following the install steps: (1) Prefer npm (npm install -g @brightdata/cli) or npx over piping a remote install script into bash; inspect https://cli.brightdata.com/install.sh before running it. (2) Understand that bdata login stores an API key locally — use a scoped/limited key when possible and don't reuse high-privilege credentials. (3) 'bdata skill add' will install/modify skills in your agent's skill directory; review what those skill packages do before adding them. (4) If you want minimal exposure, use the auth-only (Path C) or REST-only (Path D) options and test installs in a sandbox or separate environment first.

Capability Analysis

Type: OpenClaw Skill Name: brightdata-agent-onboarding Version: 1.0.0 The skill bundle provides onboarding instructions for the Bright Data service, but it includes high-risk execution patterns such as piping a remote shell script directly into bash (`curl -fsSL https://cli.brightdata.com/install.sh | bash`) and performing global npm installations in `SKILL.md`. While these actions are plausibly necessary for the stated purpose of tool installation, they represent significant security risks (RCE) and grant the agent broad system-level permissions. No evidence of intentional malice or data exfiltration was identified.

Capability Tags

requires-oauth-tokenrequires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description match the instructions: this is an onboarding document for Bright Data, and it explains installing the CLI, authenticating, and adding Bright Data skills. The requested capabilities (CLI, API key, zones) are appropriate for that purpose.

ℹ Instruction Scope

SKILL.md tells the agent/human to install the Bright Data CLI, run bdata login (which saves an API key locally and creates zones), and to use bdata skill add to drop skills into the agent's skill directory. Those actions are within onboarding scope but do modify agent state and may install additional skill bundles.

⚠ Install Mechanism

The doc recommends piping https://cli.brightdata.com/install.sh to bash (curl | bash) — this runs a remote install script and is higher risk by design. It also suggests npm install -g @brightdata/cli (global install) or npx, which are lower-risk alternatives. The presence of a remote install script is the main install-related concern.

✓ Credentials

The only credential discussed is BRIGHTDATA_API_KEY (or OAuth via bdata login). That matches the stated purpose. The instructions advise storing the key locally or in .env for app integration — expected but worth protecting (use least-privilege keys).

ℹ Persistence & Privilege

The skill is not always:true and does not itself persist, but it instructs installing a CLI that can place additional skills into your agent's skill directory. That can expand agent privileges/behavior and should be reviewed before allowing automatic installation.

Version History

v1.0.0

Initial publish from brightdata-plugin repo

Metadata

Slug brightdata-agent-onboarding

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Agent Onboarding?

Onboard an agent to Bright Data. Use when a coding agent first encounters Bright Data — for live web work (search, scrape, structured data), for wiring Brigh... It is an AI Agent Skill for Claude Code / OpenClaw, with 47 downloads so far.

How do I install Agent Onboarding?

Run "/install brightdata-agent-onboarding" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Agent Onboarding free?

Yes, Agent Onboarding is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Agent Onboarding support?

Agent Onboarding is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Agent Onboarding?

It is built and maintained by Meir Kadosh (@meirk-brd); the current version is v1.0.0.

More Skills

Agent Onboarding