Description

Conducts in-depth customer research by mining forums, generating surveys and interviews, scraping competitor reviews, and analyzing sentiment to validate mar...

README (SKILL.md)

Customer Research & Validation Skill

Name: Customer Research & Validation
Author: clawdiri-ai

Trigger conditions:

User asks to validate a product idea, persona, or market assumption
User mentions "customer research", "validate assumption", "talk to users"
User requests Reddit/forum mining, competitor analysis, or sentiment analysis
User wants to generate surveys or interview scripts
User asks about customer pain points, needs, or jobs-to-be-done

Purpose

Pre-pipeline validation for DaVinci Enterprises products. Ensures marketing strategy is built on real customer signal, not assumptions. Prevents building features nobody wants.

What It Does

Reddit/Forum Mining — Extract threads, comments, sentiment from subreddits and forums
Survey Generation — Convert research questions into structured surveys
Interview Scripts — Generate customer interview guides with probing questions
Persona Validation — Test persona assumptions against real user behavior
Competitor Review Scraping — Aggregate reviews from G2, Trustpilot, Reddit
Sentiment Analysis — Aggregate and score customer sentiment across sources

Usage

Quick Start

# Validate a product hypothesis via Reddit mining
scripts/reddit-miner.sh --subreddit "personalfinance" --query "FIRE calculator" --limit 50

# Generate a customer interview script
scripts/interview-generator.sh --persona "FIRE enthusiast" --problem "retirement planning tools"

# Scrape competitor reviews
scripts/competitor-scraper.sh --product "Personal Capital" --sources "g2,trustpilot,reddit"

Integration with Marketing Pipeline

This skill feeds into the content strategy workflow:

Discovery → Run customer research to identify pain points
Validation → Test persona assumptions against real data
Strategy → Build content pillars around validated needs
Execution → Ogilvy creates content targeting real customer language

Output format: JSON reports to data/research/ for downstream consumption.

Scripts

reddit-miner.sh

Fetch Reddit threads matching keywords, extract sentiment, output structured JSON.

Usage:

./scripts/reddit-miner.sh --subreddit SUBREDDIT --query "search terms" [--limit N] [--sentiment]

Output: data/research/reddit-{subreddit}-{timestamp}.json

interview-generator.sh

Generate customer interview script from persona + problem statement.

Usage:

./scripts/interview-generator.sh --persona "description" --problem "pain point"

Output: Markdown interview guide to stdout

competitor-scraper.sh

Aggregate reviews from multiple sources, extract themes and sentiment.

Usage:

./scripts/competitor-scraper.sh --product "Product Name" --sources "g2,trustpilot,reddit"

Output: data/research/competitor-{product}-{timestamp}.json

Output Schema

All scripts output to data/research/ with consistent JSON schema:

{
  "meta": {
    "skill": "customer-research",
    "script": "reddit-miner",
    "timestamp": "2026-03-22T00:43:00Z",
    "query": {...}
  },
  "findings": [
    {
      "source": "reddit",
      "source_id": "thread_abc123",
      "text": "I wish there was a FIRE calculator that...",
      "sentiment": 0.65,
      "themes": ["pain point", "feature request"],
      "metadata": {...}
    }
  ],
  "summary": {
    "total_sources": 47,
    "avg_sentiment": 0.42,
    "top_themes": ["complexity", "cost", "trust"],
    "key_insights": ["Users want transparency", "Price sensitivity high"]
  }
}

Dependencies

jq — JSON processing
curl — HTTP requests
Reddit API access (optional: can scrape public threads without auth)
OpenClaw LLM access for sentiment analysis

Example Workflow

Scenario: Validate demand for FIRE Sim product

Mine Reddit pain points:

./scripts/reddit-miner.sh --subreddit "financialindependence" \
  --query "retirement calculator problems" --limit 100 --sentiment

Scrape Personal Capital reviews:

./scripts/competitor-scraper.sh --product "Personal Capital" \
  --sources "g2,trustpilot,reddit"

Generate interview script:

./scripts/interview-generator.sh \
  --persona "30-40 tech worker, $200K income, aiming FIRE by 45" \
  --problem "existing retirement tools too conservative or too complex"

Analyze findings:
- Review JSON outputs in data/research/
- Identify recurring themes, pain points, language patterns
- Validate/invalidate persona assumptions
- Feed insights into content strategy
Document learnings:
- Update projects/davinci-enterprises/customer-insights.md
- Flag validated needs for product roadmap
- Inform Ogilvy content pillars with real customer language

Quality Gates

Minimum sample size: 30+ sources per research question
Sentiment confidence: Only report sentiment scores with >50 samples
Theme validation: Themes must appear in ≥3 independent sources
Source diversity: Mix Reddit, review sites, forums (not just one platform)

Anti-Patterns

❌ Don't:

Build features based on one Reddit comment
Cherry-pick data to confirm existing beliefs
Skip competitor analysis (reinventing the wheel wastes time)
Ignore negative sentiment (it's the most valuable signal)

✅ Do:

Let data challenge your assumptions
Track quotes verbatim (real customer language = gold for content)
Cross-reference findings across sources
Document what you disproved, not just what you confirmed

Integration Points

Content Strategy: Feed validated pain points to Ogilvy for pillar creation
Product Roadmap: Link research findings to JIRA/task tickets
Persona Database: Update persona definitions based on validation results
Marketing Copy: Extract customer language for landing pages, ads

Maintenance

Research data retention: 90 days (then archive to cold storage)
Re-run validation quarterly for active products
Update scripts when Reddit/review site APIs change
Log failed scrapes to logs/customer-research-errors.log

Next Steps After Running Research:

Review findings in data/research/
Update persona docs with validated/invalidated assumptions
Create content strategy tasks based on identified pain points
Schedule customer interviews if online research raises questions
Document learnings in project-specific CONTEXT.md

Usage Guidance

This package appears coherent with its stated goal (mining communities, scraping reviews, generating surveys/interviews) but exercise caution before running it: 1) Inspect setup.sh and any scripts for network calls (look for curl, requests, sockets, or any hardcoded remote endpoints) and for any calls that post data externally. 2) Search scripts for invocations of LLM/CLI tools (e.g., openclaw/chat, external API endpoints). Because the manifest does not declare credentials, verify where you would need to provide API keys (Reddit, Playwright-driven sites, or an LLM) and assess whether those keys are necessary. 3) Run the scripts in a sandbox or isolated environment (container or VM) first, with no sensitive credentials mounted, until you confirm behavior. 4) If you plan scheduled/cron runs, ensure outputs are written to a controlled directory and that retention/archival behavior (90 days) is acceptable. 5) If you want a stronger assurance, share the contents of setup.sh and the top-level network-related sections of reddit-miner.py and competitor-scraper.py for focused review—if those files include obfuscated code, undocumented endpoints, or credential exfiltration, my assessment would escalate.

Capability Analysis

Type: OpenClaw Skill Name: customer-research-dv Version: 0.1.0 The Customer Research & Validation skill bundle provides a suite of tools for market analysis, including Reddit mining, competitor review scraping, and persona validation. The implementation consists of well-documented Bash and Python scripts (e.g., `reddit-miner.py`, `competitor-scraper.py`) that utilize standard libraries like `praw`, `BeautifulSoup`, and `requests` to gather data from public sources. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found; the scripts operate within their stated scope and follow standard practices for public API interaction and web scraping.

Capability Assessment

ℹ Purpose & Capability

The scripts, docs, and CLI wrapper align with the described functionality (Reddit/forum mining, survey/interview generation, persona validation, competitor scraping). However the SKILL.md and README refer to using 'OpenClaw LLM access for sentiment analysis' while the skill declares no required credentials or primaryEnv; similarly competitor scraping mentions future browser automation (Playwright) for sites like G2/Trustpilot but no explicit runtime requirements are declared. These are minor coherence gaps (expected behavior but not fully documented in manifest).

ℹ Instruction Scope

Runtime instructions are explicit about what will be done: crawling Reddit/public JSON, scraping review sites, writing JSON/markdown to local paths (data/research/, projects/...). The SKILL.md and scripts direct the agent to gather community text and store results locally. This scope is consistent with the purpose, but the docs instruct running recurring jobs (cron) and updating project files, which broadens scope of file writes—users should review where outputs are stored and any automation hooks before use.

ℹ Install Mechanism

The registry lists no formal install spec (instruction-only), which is low risk. The repository includes setup.sh and a Python requirements.txt; running setup.sh would write to disk and install dependencies. No remote downloads from obscure URLs were listed in the manifest excerpts, but any install should be inspected before executing setup.sh or pip installs.

⚠ Credentials

The skill references use of the OpenClaw LLM for sentiment analysis and (optionally) Reddit API access, browser automation, and future APIs, but the manifest declares no required environment variables or primary credentials. That is a discrepancy: if you enable authenticated Reddit API use, or if the LLM requires an API key or a webhook, those credentials will be needed but are not declared. Treat any prompts to supply API keys or tokens as significant—only provide them if you trust the source and inspect the code paths that use them.

✓ Persistence & Privilege

The skill is not force-enabled (always: false), and it does not request elevated platform privileges in the manifest. It writes outputs to local project directories per the documentation, which is expected for a research tool. Autonomous invocation is allowed by default (normal), so consider limiting autonomous runs if you worry about automated scraping or exfiltration.

Version History

v0.1.0

Initial release

Metadata

Slug customer-research-dv

Version 0.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Customer Research & Validation?

Conducts in-depth customer research by mining forums, generating surveys and interviews, scraping competitor reviews, and analyzing sentiment to validate mar... It is an AI Agent Skill for Claude Code / OpenClaw, with 141 downloads so far.

How do I install Customer Research & Validation?

Run "/install customer-research-dv" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Customer Research & Validation free?

Yes, Customer Research & Validation is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Customer Research & Validation support?

Customer Research & Validation is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Customer Research & Validation?

It is built and maintained by RunByDaVinci (@clawdiri-ai); the current version is v0.1.0.

More Skills

Customer Research & Validation

Customer Research & Validation Skill

Purpose

What It Does

Usage

Quick Start

Integration with Marketing Pipeline

Scripts

reddit-miner.sh

interview-generator.sh

competitor-scraper.sh

Output Schema

Dependencies

Example Workflow

Quality Gates

Anti-Patterns

Integration Points

Maintenance

What is Customer Research & Validation?

How do I install Customer Research & Validation?

Is Customer Research & Validation free?

Which platforms does Customer Research & Validation support?

Who created Customer Research & Validation?

💬 Comments