Website Email Scraper (Apify)

by hundevmode · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · Security Clean · 39 downloads · 0 stars · 0 active installs · 1 version
Install in OpenClaw: /install website-email-scraper-apify
Description
Use this skill when the user needs public business emails, phone numbers, social profiles, source URLs, and crawl diagnostics from website domains or URLs th...
README (SKILL.md)

Website Email Scraper Apify Skill

Overview

This skill helps an AI agent run the Apify Website Email Scraper & Phone Finder actor for public website contact extraction from domains and URLs.

Default actor:

  • Actor ID: kWfD7C0WpHtIt8VAh
  • Actor name: x_guru/website-email-phone-finder
  • Store page: https://apify.com/x_guru/website-email-phone-finder
  • Console source: https://console.apify.com/actors/kWfD7C0WpHtIt8VAh/source

Use this skill when a user asks to:

  • scrape public business emails from website domains
  • find emails from company websites, landing pages, contact pages, or domain lists
  • enrich lead lists with emails, phones, social profile links, source URLs, and crawl diagnostics
  • process domains from Google Maps, CRMs, spreadsheets, directories, search results, Apollo-style lists, or agency prospecting workflows
  • return only websites with emails, only websites with any contact, or all scanned websites
  • control Apify spend with maxTotalChargeUsd
  • export contact rows for Sheets, Airtable, n8n, CRM, BI, CSV, JSON, or agent workflows

Quick Workflow

  1. Clarify the submitted domains or website URLs and the desired saved result count.
  2. Use resultMode: "emailsOnly" by default for email lead extraction.
  3. Use contactsOnly when phone numbers or social profiles are useful even without emails.
  4. Use allWebsites only when the user needs diagnostics for every submitted website.
  5. Keep maxPagesPerWebsite at 3 for fast runs; use 5-10 when contacts are likely on staff, team, legal, imprint, or contact pages.
  6. Set includePersonalData=false when person-like emails or personal LinkedIn profile URLs should be excluded.
  7. Set a budget guard with Apify maxTotalChargeUsd when spend matters.
  8. Run scripts/website_email_scraper_actor.py or call the Apify API directly.
  9. Return compact metrics and website contact rows. Check RUN_SUMMARY for diagnostics when counts are lower than requested.
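
As a rough sketch, steps 2 to 6 above can be expressed as a small payload builder. The function name build_actor_input and its keyword arguments are hypothetical and are not part of the bundled script; only the actor input keys (domains, maxResults, resultMode, maxPagesPerWebsite, includePersonalData) come from this README.

```python
from typing import Any


def build_actor_input(
    domains: list[str],
    max_results: int,
    need_non_email_contacts: bool = False,
    need_all_diagnostics: bool = False,
    deep_crawl: bool = False,
    exclude_personal_data: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper mirroring workflow steps 2-6 (not the bundled script)."""
    if need_all_diagnostics:
        result_mode = "allWebsites"    # step 4: diagnostics for every website
    elif need_non_email_contacts:
        result_mode = "contactsOnly"   # step 3: phones/socials are useful alone
    else:
        result_mode = "emailsOnly"     # step 2: default for email leads
    return {
        "domains": list(domains),
        "maxResults": max_results,
        "resultMode": result_mode,
        # step 5: 3 pages for fast runs; 5 is the low end of the 5-10 range
        "maxPagesPerWebsite": 5 if deep_crawl else 3,
        # step 6: drop person-like emails and personal LinkedIn URLs on request
        "includePersonalData": not exclude_personal_data,
    }
```

The budget guard from step 7 is deliberately left out of this dictionary, because maxTotalChargeUsd is a run option rather than actor input.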

Payload Rules

  • Use domains for bare domains and full website URLs.
  • urls and startUrls can be normalized into domains by the runner for agent convenience.
  • maxResults is the maximum number of saved dataset rows.
  • resultMode must be emailsOnly, contactsOnly, or allWebsites.
  • maxPagesPerWebsite must be 1-25; default is 3.
  • concurrency must be 1-500; default is 100.
  • requestTimeoutSecs must be 2-30; default is 5.
  • extractPhones, extractSocials, includePersonalData, and sameDomainOnly are booleans.
  • Do not send Google Maps search fields such as searchStringsArray, placeIds, locationQuery, or review fields to this website-only actor.
  • Pass maxTotalChargeUsd as an Apify run option, not inside actor input. The included script exposes it as --budget-usd.
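
The rules above could be checked before a run with a validator along these lines. This is a minimal sketch: validate_actor_input is a hypothetical name, the non-empty check on domains is an assumption, and the bundled script may validate differently or not at all.

```python
RESULT_MODES = {"emailsOnly", "contactsOnly", "allWebsites"}
INT_RANGES = {
    "maxPagesPerWebsite": (1, 25),
    "concurrency": (1, 500),
    "requestTimeoutSecs": (2, 30),
}
BOOL_FIELDS = ("extractPhones", "extractSocials", "includePersonalData", "sameDomainOnly")
# Fields that belong to Google Maps actors or to Apify run options, not to this input.
FORBIDDEN_FIELDS = ("searchStringsArray", "placeIds", "locationQuery", "maxTotalChargeUsd")


def validate_actor_input(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the payload passes."""
    problems = []
    domains = payload.get("domains")
    if not isinstance(domains, list) or not domains:
        # Assumption: at least one domain or URL is required.
        problems.append("domains must be a non-empty list of domains or URLs")
    if "resultMode" in payload and payload["resultMode"] not in RESULT_MODES:
        problems.append("resultMode must be emailsOnly, contactsOnly, or allWebsites")
    for name, (low, high) in INT_RANGES.items():
        if name in payload:
            value = payload[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
                problems.append(f"{name} must be an integer in {low}-{high}")
    for name in BOOL_FIELDS:
        if name in payload and not isinstance(payload[name], bool):
            problems.append(f"{name} must be a boolean")
    for name in FORBIDDEN_FIELDS:
        if name in payload:
            problems.append(f"{name} must not appear in the actor input")
    return problems
```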

Authentication

Use the Apify API token from the environment:

export APIFY_TOKEN='apify_api_xxx'

Never hardcode or print the full token in user-facing output.
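
One way to follow this rule in Python is to read the token from the environment and mask it before it reaches any log or chat output. Both helper names below are hypothetical; the sketch assumes nothing about how the bundled script handles the token.

```python
import os


def load_apify_token(env=None) -> str:
    """Read APIFY_TOKEN from the environment (or a supplied mapping) and fail fast if missing."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set")
    return token


def mask_token(token: str, visible: int = 4) -> str:
    """Keep only the last few characters so the full token never appears in output."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```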

Script Usage

The bundled script uses only the Python standard library.

Run a quick domain email scrape:

APIFY_TOKEN='apify_api_xxx' \
python3 scripts/website_email_scraper_actor.py quick-domains \
  --domains example.com apify.com \
  --max-results 50 \
  --budget-usd 1

Run with deeper contact-page discovery:

APIFY_TOKEN='apify_api_xxx' \
python3 scripts/website_email_scraper_actor.py quick-domains \
  --domains centralrestaurante.com alchemist.dk caitlinmcweeney.com \
  --max-results 100 \
  --max-pages 5 \
  --result-mode emailsOnly \
  --budget-usd 1

Run custom JSON:

APIFY_TOKEN='apify_api_xxx' \
python3 scripts/website_email_scraper_actor.py run \
  --input-file references/sample_input.json \
  --budget-usd 1
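
For agents that call the Apify API directly instead of using the script, the request might be assembled as sketched below. The run-sync-get-dataset-items endpoint, the maxTotalChargeUsd query parameter, and Bearer-token authentication are stated here as assumptions about the Apify API v2 and should be checked against the Apify documentation; build_run_request and run_actor are hypothetical names, and this is not necessarily how the bundled script makes the call.

```python
import json
import urllib.parse
import urllib.request

ACTOR_ID = "kWfD7C0WpHtIt8VAh"
API_BASE = "https://api.apify.com/v2"  # assumed base URL of the Apify API v2


def build_run_request(actor_input: dict, token: str, budget_usd=None) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the actor and returns dataset items."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    if budget_usd is not None:
        # The budget guard is a run option, so it goes in the query string, not the input body.
        url += "?" + urllib.parse.urlencode({"maxTotalChargeUsd": budget_usd})
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_input: dict, token: str, budget_usd=None, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of dataset rows."""
    request = build_run_request(actor_input, token, budget_usd)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without spending any Apify budget.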

Recommended Inputs

Public email leads only

{
  "domains": ["centralrestaurante.com", "alchemist.dk", "caitlinmcweeney.com"],
  "maxResults": 1000,
  "resultMode": "emailsOnly",
  "maxPagesPerWebsite": 3,
  "concurrency": 100,
  "requestTimeoutSecs": 5,
  "extractPhones": true,
  "extractSocials": true,
  "includePersonalData": true,
  "sameDomainOnly": true
}

Company inboxes only

{
  "domains": ["example.com", "https://example.com/contact"],
  "maxResults": 500,
  "resultMode": "emailsOnly",
  "includePersonalData": false,
  "extractPhones": true,
  "extractSocials": true
}

Contact records for every website with any public contact

{
  "domains": ["example.com", "apify.com"],
  "maxResults": 100,
  "resultMode": "contactsOnly",
  "maxPagesPerWebsite": 5
}

Output Contract

The runner returns JSON:

  • ok
  • actorId
  • fetchedAt
  • inputUsed
  • itemCount
  • rows[]

Rows are actor dataset items. Important groups:

  • Website identity: input, url, domain, status
  • Emails: emails, emailDetails.email, emailDetails.type, emailDetails.sourceUrl, emailDetails.domainMatch
  • Contacts: phones, socialLinks, facebooks, instagrams, linkedIns, twitters, youtubes, tiktoks
  • Crawl diagnostics: contactSignals, pagesFetched, fetchedUrls, httpStatusCodes, errors, durationMs

For the full contract, read references/input-output-contract.md.
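
For exports to Sheets, CSV, or a CRM, the nested rows can be flattened to one record per email. The sketch below assumes that emails and phones are lists of strings and that emailDetails is a list of objects with email, type, and sourceUrl keys; those shapes are inferred from the field names above, so confirm them in references/input-output-contract.md. flatten_rows is a hypothetical helper.

```python
def flatten_rows(rows: list) -> list:
    """Turn actor dataset rows into flat records, one per email (or one per website if none)."""
    flat = []
    for row in rows:
        # Prefer the detailed entries; fall back to the plain email list.
        details = row.get("emailDetails") or [
            {"email": email} for email in (row.get("emails") or [])
        ]
        if not details:
            # contactsOnly / allWebsites rows may carry no email at all.
            details = [{}]
        for detail in details:
            flat.append({
                "domain": row.get("domain"),
                "status": row.get("status"),
                "email": detail.get("email"),
                "type": detail.get("type"),
                "sourceUrl": detail.get("sourceUrl"),
                "phones": "; ".join(row.get("phones") or []),
            })
    return flat
```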

Agent Response Rules

  • If rows are empty, say the run succeeded but no website contact rows matched the selected mode, then suggest checking RUN_SUMMARY.
  • If fewer rows than requested are returned, explain that submitted websites had fewer public contacts, the result mode filtered rows, or budget stopped saving.
  • If emails is empty in contactsOnly or allWebsites, explain that the row was saved due to phone/social/diagnostic data.
  • Explain website email extraction as best-effort because each website controls what it publishes.
  • Use maxTotalChargeUsd for any user concerned about spend.
  • Do not promise Google Maps place discovery from this actor. Use the Google Maps Email Extractor actor when the user needs search-by-keyword/location first.
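
The first three response rules lend themselves to a small helper that turns run results into a user-facing note. This is only an illustrative sketch: summarize_run is a hypothetical name, the wording is an example, and the check assumes each row exposes an emails list.

```python
def summarize_run(rows: list, requested: int, result_mode: str) -> str:
    """Compose a short explanation of the run outcome following the response rules."""
    if not rows:
        return (
            "The run succeeded, but no website contact rows matched "
            f"resultMode '{result_mode}'. Check RUN_SUMMARY for diagnostics."
        )
    notes = [f"Returned {len(rows)} of {requested} requested rows."]
    if len(rows) < requested:
        notes.append(
            "Fewer rows can mean the websites publish fewer contacts, "
            "the result mode filtered rows, or the budget stopped saving."
        )
    if result_mode in ("contactsOnly", "allWebsites"):
        without_email = sum(1 for row in rows if not row.get("emails"))
        if without_email:
            notes.append(
                f"{without_email} row(s) have no email and were saved for "
                "phone, social, or diagnostic data."
            )
    return " ".join(notes)
```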

References

  • references/input-output-contract.md
  • references/sample_input.json
  • references/troubleshooting.md
Usage Guidance
Install only if you intend to send submitted website domains or URLs to Apify for contact extraction. Use budget limits, avoid collecting personal data unless necessary, set includePersonalData=false when business inboxes are enough, and make sure your use complies with website terms, outreach rules, privacy laws, and your organization’s retention requirements.
Capability Assessment
Purpose & Capability
The stated purpose, docs, script, and output contract consistently focus on extracting public website emails, phones, social links, source URLs, and crawl diagnostics through a named Apify actor. The capability includes optional person-like emails and personal LinkedIn URLs, which is expected for this kind of lead-enrichment tool but privacy-sensitive.
Instruction Scope
The skill gives normal payload-building and response rules for submitted domains or URLs, with limits for result mode, crawl depth, concurrency, timeouts, and same-domain crawling. The default examples enable includePersonalData=true and the docs provide limited legal/privacy guidance, so users should apply their own compliance checks.
Install Mechanism
The package declares a standard skill install path, no third-party Python dependencies, and one Python standard-library helper script. I found no post-install hooks, obfuscated commands, destructive install behavior, or hidden package setup.
Credentials
Requiring APIFY_TOKEN and sending website/domain inputs to Apify is proportionate and disclosed for an Apify actor runner. The script warns not to hardcode or print the token, uses a budget guard, and does not appear to send unrelated local data.
Persistence & Privilege
The artifacts do not create background workers, local persistence, privilege escalation, broad filesystem indexing, or credential/session harvesting. Runtime output may be stored in Apify datasets as part of the disclosed actor workflow.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install website-email-scraper-apify
  3. After installation, invoke the skill by name or use /website-email-scraper-apify
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial Website Email Scraper Apify agent skill.
Metadata
Slug: website-email-scraper-apify
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is Website Email Scraper (Apify)?

Website Email Scraper (Apify) is an AI Agent Skill for Claude Code / OpenClaw that extracts public business emails, phone numbers, social profiles, source URLs, and crawl diagnostics from website domains or URLs. It has 39 downloads so far.

How do I install Website Email Scraper (Apify)?

Run "/install website-email-scraper-apify" in the OpenClaw or Claude Code chat to install it in one step. To run scrapes, the skill also needs an Apify API token in the APIFY_TOKEN environment variable.

Is Website Email Scraper (Apify) free?

Yes, the skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. Runs of the Apify actor it calls can incur Apify charges, which you can cap with maxTotalChargeUsd.

Which platforms does Website Email Scraper (Apify) support?

Website Email Scraper (Apify) is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Website Email Scraper (Apify)?

It is built and maintained by hundevmode (@hundevmode); the current version is v1.0.0.
