foreztgump

Indeed Brightdata

by Tee Bunsopha · GitHub ↗ · v0.1.4
cross-platform ✓ Security Clean
310 downloads · 0 stars · 1 active install · 5 versions
Install in OpenClaw
/install indeed-brightdata
Description
Search and scrape Indeed job listings and company information using Bright Data's Web Scraper API. Use when the user asks to find jobs on Indeed, search for...
README (SKILL.md)

Indeed Bright Data Skill

Search Indeed for job listings and company info via Bright Data's Web Scraper API. Designed for recruiting workflows on messaging platforms (Telegram, Signal) with smart defaults.

Prerequisites

  • BRIGHTDATA_API_KEY environment variable must be set
  • curl and jq must be available
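
A preflight check along the following lines can catch a missing prerequisite before any script runs. This is a minimal sketch: the function name `check_prereqs` is hypothetical and not part of the skill.

```sh
# Hypothetical helper (not part of the skill): report missing prerequisites.
# Returns 0 when everything is present, 1 otherwise.
check_prereqs() {
  missing=0
  if [ -z "${BRIGHTDATA_API_KEY:-}" ]; then
    echo "BRIGHTDATA_API_KEY is not set" >&2
    missing=1
  fi
  for tool in curl jq; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool not found on PATH" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_prereqs || exit 1` at the top of a wrapper script stops early with a readable message instead of a failed API call.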

Workflow Decision Tree

User wants job info?
├── Has a specific Indeed URL?
│   ├── Job URL (/viewjob?) → indeed_jobs_by_url.sh [SYNC — seconds]
│   ├── Company jobs URL (/cmp/*/jobs) → indeed_jobs_by_company.sh [ASYNC — minutes]
│   └── Company page URL (/cmp/*) → indeed_company_by_url.sh [SYNC — seconds]
├── Wants to search by keyword/location?
│   └── indeed_smart_search.sh [ASYNC — 3-8 min]
│       Agent says: "Searching now, this takes a few minutes."
│       If results < 5: auto-expands date range, do NOT ask user
│       Always pipe output through: indeed_format_results.sh --top 5
├── Wants company info?
│   ├── Has Indeed company URL → indeed_company_by_url.sh [SYNC — seconds]
│   ├── Has keyword → indeed_company_by_keyword.sh [ASYNC — minutes]
│   └── Has industry + state → indeed_company_by_industry.sh [ASYNC — minutes]
└── Check pending results? → indeed_check_pending.sh (run on heartbeat)

Always prefer sync (URL-based) scripts when the user provides a URL — they return in seconds.
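
The URL branch of the tree can be sketched as a small router. This is an illustration only: `route_indeed_url` is a hypothetical name, and the patterns assume the three URL shapes listed in the tree.

```sh
# Hypothetical router (not part of the skill): map an Indeed URL to the
# script named in the decision tree. Prints the script name, or returns 1
# when the URL matches none of the three documented shapes.
route_indeed_url() {
  case "$1" in
    */viewjob\?*)
      echo "indeed_jobs_by_url.sh" ;;        # SYNC
    */cmp/*/jobs|*/cmp/*/jobs/|*/cmp/*/jobs\?*)
      echo "indeed_jobs_by_company.sh" ;;    # ASYNC
    */cmp/?*)
      echo "indeed_company_by_url.sh" ;;     # SYNC
    *)
      return 1 ;;
  esac
}
```

The company-jobs pattern is tested before the plain company pattern because every `/cmp/*/jobs` URL also matches `/cmp/*`.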

Scripts Reference

| Script | Purpose | Mode |
| --- | --- | --- |
| indeed_smart_search.sh | Primary job search: keyword expansion, parallel queries, dedup, caching | ASYNC |
| indeed_jobs_by_url.sh | Collect job details by URL(s) | SYNC |
| indeed_jobs_by_keyword.sh | Low-level single-keyword job search (used by smart search internally) | ASYNC |
| indeed_jobs_by_company.sh | Discover jobs from company page | ASYNC |
| indeed_company_by_url.sh | Collect company info by URL | SYNC |
| indeed_company_by_keyword.sh | Discover companies by keyword | ASYNC |
| indeed_company_by_industry.sh | Discover companies by industry/state | ASYNC |
| indeed_format_results.sh | Format JSON results into summary, full, or CSV | Local |
| indeed_check_pending.sh | Check/fetch completed pending searches + auto-cleanup | Local/API |
| indeed_poll_and_fetch.sh | Poll async job and fetch results (internal) | API |
| indeed_list_datasets.sh | List available Indeed dataset IDs | API |

Quick Start

User says: "Find me cybersecurity jobs in New York"

scripts/indeed_smart_search.sh "cybersecurity" US "New York, NY" \
  | scripts/indeed_format_results.sh --type jobs --top 5

User says: "Get details on this job: https://www.indeed.com/viewjob?jk=abc123"

scripts/indeed_jobs_by_url.sh "https://www.indeed.com/viewjob?jk=abc123"

Behavior Rules (MANDATORY)

  1. NEVER return raw JSON to the user. Always pipe results through indeed_format_results.sh.
  2. NEVER ask "want me to try broader keywords?" if results < 5. The smart search expands automatically. Just tell the user: "Found only N results with recent postings, expanding search..."
  3. NEVER present results older than 30 days without noting they may be stale.
  4. When a discovery search is running, immediately acknowledge: "Searching Indeed now — this usually takes 3-5 minutes. I'll come back with results."
  5. If the user asks a follow-up while a search is pending, run indeed_check_pending.sh first before starting a new search.
  6. For Telegram: keep each message under 3500 characters. Use the ---SPLIT--- markers from indeed_format_results.sh to break across messages.
  7. Always show total result count and offer to show more: "Showing top 5 of 23 results. Want to see more, or filter by salary/location?"
  8. Default to "Last 7 days" for date filtering. If the user says "find me jobs" without a time preference, no extra flag is needed: smart search already applies this default.
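
Rule 6 can be checked mechanically before sending. The sketch below is not part of the skill: `check_chunks` is a hypothetical name, and it assumes each ---SPLIT--- marker sits alone on its own line, which this README does not confirm.

```sh
# Hypothetical check (not part of the skill): split stdin on lines that
# consist solely of ---SPLIT--- and report each chunk's length against a
# limit (default 3500). Newlines between lines of a chunk are counted.
# Returns 1 if any chunk is at or over the limit.
check_chunks() {
  awk -v limit="${1:-3500}" '
    function emit() {
      if (n > 0) {
        idx++
        verdict = (len < limit) ? "ok" : "too long"
        printf "chunk %d: %d chars, %s\n", idx, len, verdict
        if (len >= limit) bad = 1
      }
      n = 0; len = 0
    }
    BEGIN { bad = 0 }
    $0 == "---SPLIT---" { emit(); next }
    { len += length($0) + (n > 0 ? 1 : 0); n++ }
    END { emit(); exit bad }
  '
}
```

Piping formatter output through `check_chunks` prints one line per message, so an oversized chunk is visible before anything reaches Telegram.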

Smart Search (Primary Entry Point)

# Basic search (expands keywords, deduplicates, defaults to last 7 days)
scripts/indeed_smart_search.sh "cybersecurity" US "Remote"

# All-time search
scripts/indeed_smart_search.sh "nursing" US "Texas" --all-time

# Skip keyword expansion
scripts/indeed_smart_search.sh "registered nurse" US "Ohio" --no-expand

# Bypass 6-hour cache
scripts/indeed_smart_search.sh "data science" US "New York" --force

Output is {"meta": {...}, "results": [...]} with metadata including query params, keywords used, and result counts.

Result Formatting

# Telegram-friendly summary (default)
scripts/indeed_format_results.sh --type jobs --top 5 results.json

# CSV export
scripts/indeed_format_results.sh --type jobs --format csv results.json

# Companies
scripts/indeed_format_results.sh --type companies --top 5 companies.json

# Pipe from smart search
scripts/indeed_smart_search.sh "nurse" US "Ohio" | scripts/indeed_format_results.sh --top 5

Heartbeat: Checking Pending Results

scripts/indeed_check_pending.sh
# Output: {"completed":[...],"still_pending":[...],"failed":[...]}

Run this periodically. If ~/.config/indeed-brightdata/pending.json exists and is non-empty, check for completed results. Format completed results with indeed_format_results.sh and send to the user.
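
The "exists and is non-empty" test can be sketched as a guard that runs before the check script. `has_pending` is a hypothetical helper; treating an empty JSON array or object as "nothing pending" is an assumption, since the file's schema is not shown here.

```sh
# Hypothetical guard (not part of the skill): succeed only when the pending
# file exists and holds something other than whitespace or an empty JSON
# array/object. The real file schema is not documented in this README.
has_pending() {
  file="${1:-$HOME/.config/indeed-brightdata/pending.json}"
  [ -s "$file" ] || return 1
  content=$(tr -d ' \t\r\n' < "$file")
  case "$content" in
    ''|'[]'|'{}') return 1 ;;
    *)            return 0 ;;
  esac
}

# Usage on each heartbeat:
#   has_pending && scripts/indeed_check_pending.sh
```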

Exit Codes

| Code | Meaning | Agent should... |
| --- | --- | --- |
| 0 | Success: results on stdout | Format and present results |
| 1 | Error: something failed | Report the error |
| 2 | Deferred: still processing, saved to pending | Tell user "results are still processing, I'll follow up" |
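
A wrapper can turn the exit codes above into the matching agent action. This is a sketch under stated assumptions: `run_and_report` is a hypothetical name, and handling codes other than 0, 1, and 2 as errors is a choice made here, not documented behavior.

```sh
# Hypothetical wrapper (not part of the skill): run a command, then print
# to stderr the agent action the exit-code table assigns to its status.
# The command's own stdout passes through untouched.
run_and_report() {
  code=0
  "$@" || code=$?
  case "$code" in
    0) echo "format and present results" >&2 ;;
    1) echo "report the error" >&2 ;;
    2) echo "results are still processing, I'll follow up" >&2 ;;
    *) echo "unexpected exit code $code, report the error" >&2 ;;
  esac
  return "$code"
}

# Usage:
#   run_and_report scripts/indeed_smart_search.sh "nurse" US "Ohio" > results.json
```

Writing the hint to stderr keeps stdout clean, so the JSON can still be piped into indeed_format_results.sh.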

Caching

Smart search caches results for 6 hours. Identical searches (same keyword + location + country) return cached results without API calls. Use --force to bypass. Old results (>7 days) are auto-cleaned by indeed_check_pending.sh.
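
The two ideas in this paragraph, an identity key and a 6-hour lifetime, can be illustrated as follows. Both helpers are hypothetical: the skill's real key format is not shown in this README, and the lower-casing and `|` separator are assumptions made for the example.

```sh
# Hypothetical helpers (not part of the skill).

# Build a lookup key from the three fields the cache compares. Lower-casing
# and the "|" separator are assumptions, not the skill's actual key format.
cache_key() {
  printf '%s|%s|%s\n' "$1" "$2" "$3" | tr '[:upper:]' '[:lower:]'
}

# Succeed when an entry saved at epoch $1 is younger than 6 hours at epoch $2.
cache_is_fresh() {
  ttl=$((6 * 60 * 60))   # 21600 seconds
  [ $(( $2 - $1 )) -lt "$ttl" ]
}
```

For example, `cache_is_fresh "$saved_at" "$(date +%s)"` succeeds while the entry is still inside the 6-hour window; `--force` corresponds to skipping this test altogether.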

Data Storage

All persistent data is stored under ~/.config/indeed-brightdata/:

| File | Purpose | Lifecycle |
| --- | --- | --- |
| datasets.json | Bright Data dataset IDs | Created on first indeed_list_datasets.sh --save, rarely changes |
| pending.json | In-flight async snapshots | Entries added on poll timeout (exit 2) or fire-and-forget (--no-wait), removed when fetched or after 24h |
| history.json | Search cache index | Entries added per search, auto-cleaned after 7 days |
| results/*.json | Fetched result data | Written when snapshots complete, auto-cleaned after 7 days |

Auto-cleanup runs at the start of indeed_check_pending.sh. No data is sent anywhere other than the Bright Data API.
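
To see which result files are old enough to be candidates for cleanup, a read-only listing like the one below may help. It is a sketch, not the skill's cleanup logic: `list_stale_results` is a hypothetical name, and using file modification time as the age criterion is an assumption.

```sh
# Hypothetical inspection helper (not part of the skill): list result files
# last modified more than 7 full days ago. It only prints paths; nothing is
# deleted. The skill's own cleanup may use a different age criterion.
list_stale_results() {
  dir="${1:-$HOME/.config/indeed-brightdata/results}"
  [ -d "$dir" ] || return 0
  find "$dir" -type f -name '*.json' -mtime +7
}
```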

Security

All scripts source scripts/_lib.sh for shared HTTP and persistence functions. The library:

  • Makes requests to a single endpoint: https://api.brightdata.com/datasets/v3
  • Uses one credential: BRIGHTDATA_API_KEY (sent via Authorization: Bearer header)
  • Writes only to ~/.config/indeed-brightdata/ (see Data Storage above)
  • Does not read other environment variables, contact other hosts, or modify files outside its config directory
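
These claims can be spot-checked by listing every host that appears in a URL inside the scripts directory. The helper below is a hypothetical audit aid, not part of the skill, and a plain text search cannot detect URLs that are assembled at runtime.

```sh
# Hypothetical audit helper (not part of the skill): print every distinct
# host that appears in an http(s) URL inside the files under a directory.
list_url_hosts() {
  grep -rhoE 'https?://[A-Za-z0-9.-]+' "${1:-scripts}" 2>/dev/null \
    | sed -E 's#^https?://##' \
    | sort -u
}
```

Any host other than api.brightdata.com deserves a closer look, although Indeed URLs may legitimately appear as example inputs or in comments.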

For full API parameter details

See references/api-reference.md for complete endpoint documentation, response schemas, and country/domain mappings.

For keyword expansions

See references/keyword-expansions.json for the lookup table of keyword-to-job-title mappings.

Usage Guidance
This skill appears internally consistent: it asks only for BRIGHTDATA_API_KEY and uses curl/jq to call Bright Data and format results, storing data under ~/.config/indeed-brightdata/. Before installing, do the following:

  1. Inspect scripts/_lib.sh to confirm that LIB_BASE_URL and any remote endpoints are the official Bright Data API endpoints and that no unexpected third-party endpoints are contacted.
  2. Confirm you are OK with the skill writing files to ~/.config/indeed-brightdata/ and creating symlinks under your agent's skills directories (install.sh).
  3. Remember that Bright Data is a paid scraping service, so your API key may incur charges; do not use a high-privilege or shared account key unless you trust the skill.
  4. If you are uncertain about the repository origin, prefer manual review over blind install.
Capability Analysis
Type: OpenClaw Skill
Name: indeed-brightdata
Version: 0.1.4

The indeed-brightdata skill bundle is a well-structured tool for searching and scraping Indeed job listings and company information via the Bright Data Web Scraper API. The scripts (e.g., indeed_smart_search.sh, indeed_jobs_by_keyword.sh) interact exclusively with the official Bright Data API (api.brightdata.com) and store persistent data like search history and pending tasks in a dedicated local directory (~/.config/indeed-brightdata/). The SKILL.md instructions provide clear, task-oriented guidance for the AI agent without any signs of prompt injection or malicious intent. The code follows security best practices by using jq for JSON processing and avoiding dangerous shell execution patterns.
Capability Assessment
Purpose & Capability
Name/description (Indeed scraping via Bright Data) match the requested credential (BRIGHTDATA_API_KEY) and required binaries (curl, jq). The scripts implement job/company searches, async triggers, polling, result formatting and local storage under ~/.config/indeed-brightdata/, which is consistent with the declared purpose.
Instruction Scope
SKILL.md directs the agent to run local shell scripts that call a Bright Data API and format results. The scripts reference only local config paths under ~/.config/indeed-brightdata/ and Bright Data endpoints (e.g., api.brightdata.com in one script). You should inspect scripts/_lib.sh (defines LIB_BASE_URL, auth header construction, file-write locations, and helper functions) to confirm there are no unexpected external endpoints or use of unrelated environment variables.
Install Mechanism
There is no remote installer or archive download; installation is a local install.sh that creates symlinks or packages a ZIP for desktop upload. That is low-risk compared with arbitrary remote downloads. The install script does create symlinks under platform skill directories (e.g., ~/.openclaw/skills/), which is expected for skills installation.
Credentials
Only BRIGHTDATA_API_KEY is required and declared as the primary credential, which is appropriate for a Bright Data integration. I found no other required secrets or unrelated env variables in the visible scripts. The skill stores config/data under ~/.config/indeed-brightdata/ and results/*.json, which is proportionate to its function.
Persistence & Privilege
always:false and normal autonomous invocation. The skill persists data locally under ~/.config/indeed-brightdata/ (datasets.json, pending.json, history.json, results/). The installer writes symlinks into platform-specific skill directories—expected for skill installation—so verify you trust the repository before granting write access to those locations.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install indeed-brightdata
  3. After installation, invoke the skill by name or use /indeed-brightdata
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.4
Initial public release of the skill with all core features and test coverage.
  • Added scripts for searching, scraping, and formatting Indeed job and company data via Bright Data's API.
  • Supports job search by keyword/location, company lookups, batch polling, and result formatting.
  • Introduced smart search with keyword expansion, result caching, and auto date range adjustment.
  • Includes full test suite, helper scripts, and sample data fixtures.
  • Provides a detailed SKILL.md with usage instructions, workflow, behavior rules, and security guarantees.
v0.1.3
  • Added LICENSE file to clarify the project's licensing terms (MIT license).
  • No functional or API changes in this version.
v0.1.2
  • Added install.sh script for easier installation or setup.
  • No functional changes to the skill's behavior or API.
  • Documentation and usage remain unchanged.
v0.1.1
  • Added "indeed_smart_search.sh" as a primary, intelligent job search script, including keyword expansion, parallel queries, and deduplication.
  • Introduced "indeed_format_results.sh" for formatting JSON results to user-friendly summaries or CSV, ensuring no raw JSON is presented to users.
  • Added "references/keyword-expansions.json" containing keyword-to-job-title mappings for improved search results.
  • Updated workflow and mandatory behavior rules for better handling on messaging platforms (Telegram, Signal), including message length, result splitting, and defaults.
  • Removed legacy "install.sh".
  • Improved documentation to clarify sync vs async flows, result formatting, caching behavior, and fallback/auto-expansion logic.
v0.1.0
Initial public release of Indeed-BrightData skill.
  • Search and scrape Indeed job listings and company information using Bright Data's Web Scraper API.
  • Supports job search by keyword, location, or URL; company lookup by URL, keyword, or industry.
  • Provides both synchronous (fast, by URL) and asynchronous (discovery, by keyword/industry) workflows.
  • Outputs structured JSON for easy agent summarization; includes recommended formatting and result limits.
  • Requires BRIGHTDATA_API_KEY; depends on curl and jq.
  • Includes fire-and-forget mode for messaging platforms and pending result management.
  • Documentation includes a full workflow decision tree, exit codes, and script reference table.
Metadata
Slug indeed-brightdata
Version 0.1.4
License
All-time Installs 1
Active Installs 1
Total Versions 5
Frequently Asked Questions

What is Indeed Brightdata?

Search and scrape Indeed job listings and company information using Bright Data's Web Scraper API. Use when the user asks to find jobs on Indeed, search for... It is an AI Agent Skill for Claude Code / OpenClaw, with 310 downloads so far.

How do I install Indeed Brightdata?

Run "/install indeed-brightdata" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Indeed Brightdata free?

Yes, Indeed Brightdata is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Indeed Brightdata support?

Indeed Brightdata is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Indeed Brightdata?

It is built and maintained by Tee Bunsopha (@foreztgump); the current version is v0.1.4.
