功能描述

Turn raw business data (CSV, SQLite, spreadsheets, pasted tables) into clear analytical summaries, trend analysis, and actionable reports for small business...

使用说明 (SKILL.md)

Data Analysis & Reporting

Name: Data Analysis Reporting
Author: gitcanadabrett

Turn raw business data into plain-language insights, trend analysis, and actionable reports. Think sharp junior analyst, not statistics engine.

Trigger conditions

Activate this skill when the user:

Pastes or uploads tabular data (CSV, markdown table, tab-separated, pipe-delimited)
Asks to analyze, summarize, or report on business data
Asks about metrics, KPIs, trends, or performance from a dataset
Provides a SQLite database and asks questions about it
Asks for a report, executive summary, or data briefing
Asks to compare periods, segments, or actuals vs. targets

Do NOT activate when:

The user wants to build a dashboard or visualization tool (suggest BI tools)
The user needs real-time streaming analytics
The user asks for financial advice, investment recommendations, or tax guidance
The data is code/logs/system metrics (suggest observability tools instead)

Work the request in this order

Clarify the question — before touching the data, understand what the user needs to know and why. Ask up to 3 clarifying questions:
- "What decision does this analysis need to support?"
- "What time period or comparison matters most?"
- "Who is the audience for this report?" If the user provides clear context, skip to step 2.
Ingest and validate — parse the data, detect column types, run quality checks
- Auto-detect: column types (numeric, date, categorical, text)
- Flag: missing values, outliers, formatting inconsistencies, duplicate rows
- Report data quality issues before proceeding, not after
- If data quality is poor enough to undermine analysis, say so and recommend fixes
Propose an analysis plan — tell the user what you intend to analyze and why, before doing it
- Name the specific analyses (e.g., "monthly revenue trend with MoM growth rates")
- Explain what each analysis will reveal relative to their question
- Let the user adjust before you proceed
Execute the analysis — run the agreed analyses
- Summary statistics for numeric columns
- Trend identification with direction, magnitude, and acceleration
- Comparisons (period-over-period, segment, actual vs. target) as relevant
- Distribution and concentration analysis where useful
- Correlation spotting between metrics
- Cohort analysis when data supports grouping
Translate to insights — convert numbers into plain-language findings
- Lead with what matters, not what was calculated
- Rank findings by business impact, not statistical significance
- Connect each finding to the user's original question
- Flag surprising results and explain why they are surprising
Deliver the report — structured output following the default format below
- Include data quality notes inline
- Label confidence on every statistical claim
- Suggest follow-up questions the user hasn't asked
Offer next steps — what deeper analysis could be useful, what data would improve the picture

Default output structure

Use this structure unless the user clearly wants a different format:

Executive summary — 3-5 bullet points answering the user's core question in plain language. No jargon. A busy operator should be able to read this section alone and know what matters.
Data quality notes — what came in, what was cleaned, what to watch out for. Include row/column counts, date range covered, any exclusions made and why.
Key findings — the substantive analysis, organized by business relevance not by metric. Each finding should follow the pattern:
- What the data shows (the fact)
- Why it matters (the implication)
- How confident we are (the evidence quality)
Trend analysis — time-series patterns with:
- Direction and magnitude of change
- Comparison to prior period or baseline
- Acceleration or deceleration signals
- Seasonal or cyclical patterns if detectable
Comparisons — if the data supports comparison (segments, periods, targets):
- Side-by-side with explicit metrics
- Performance gaps highlighted
- Context for why gaps exist (if inferrable from data)
Watch items — things that aren't problems yet but could become problems:
- Emerging negative trends
- Metrics approaching thresholds
- Data quality issues that could mask real signals
Recommended actions — 3 concrete next steps:
- One action justified by the data right now
- One thing to monitor or investigate further
- One data improvement that would sharpen future analysis
Methodology notes — what was calculated, how, and what assumptions were made. Brief but sufficient for someone to question the analysis.

Analysis depth calibration

Match analysis depth to data quality and volume:

Data quality	Row count	Depth
Clean, complete	>1,000	Full analysis with statistical tests, confidence intervals, correlation
Clean, complete	100-1,000	Full analysis, note limited sample for statistical claims
Clean, complete	\x3C100	Summary stats and directional trends only, flag small-sample risk
Moderate gaps	Any	Analyze what's clean, quantify the gap, note impact on conclusions
Poor quality	Any	Data quality report first, limited directional analysis with heavy caveats

Do not apply sophisticated statistical methods to data that can't support them. 3 months of revenue data does not justify a seasonal decomposition.

Confidence labeling

Every analytical claim gets a confidence indicator:

High confidence — large sample, clean data, clear pattern, well-understood metric
Moderate confidence — adequate sample, minor data issues, pattern present but could shift
Low confidence — small sample, data quality concerns, pattern is directional at best
Flagged — interesting signal but insufficient evidence to draw conclusions; noted for monitoring

When confidence is low, say what additional data would raise it.

Number formatting

Currency: match the user's format or default to $X,XXX.XX
Percentages: one decimal place for rates (5.2%), whole numbers for large changes (up 23%)
Large numbers: use K/M/B shorthand with one decimal ($1.2M, 45.3K users)
Growth rates: always specify the comparison basis (MoM, QoQ, YoY) and the absolute numbers behind the percentage
Do not present a percentage without context: "Revenue grew 15% MoM ($42K to $48.3K)" not just "Revenue grew 15%"

Handling common business metrics

When the user's data contains standard business metrics, calculate them consistently:

Read references/business-metrics.md for definitions, formulas, and interpretation guidance for:

MRR / ARR and expansion/contraction/churn components
Customer churn rate (logo and revenue)
CAC, LTV, and LTV:CAC ratio
Gross and net margins
Growth rates (MoM, QoQ, YoY, CAGR)
Unit economics

Always show the formula used when presenting a calculated metric. Different businesses define "churn" differently — confirm the user's definition before calculating.

Data quality checks

Run these checks on every dataset before analysis:

Read references/data-quality-checks.md for the full checklist covering:

Completeness (missing values by column, row completeness rate)
Consistency (date format uniformity, categorical value normalization)
Validity (values within expected ranges, negative amounts where unexpected)
Uniqueness (duplicate detection, key column analysis)
Timeliness (date range coverage, gap detection)
Outlier flagging (statistical and domain-based)

Report data quality findings before analysis results. If quality issues materially affect conclusions, say so at the top of the executive summary.

Report structure templates

Read references/report-templates.md for pre-built structures for common report types:

Executive summary report (1-page, for leadership)
Detailed analysis report (full findings with methodology)
Comparison report (A vs. B with decision framework)
Trend report (time-series focused with forecasting context)
Health check report (KPI dashboard in text form)

Use the appropriate template when the user's request clearly maps to one. Default to the standard output structure when it doesn't.

Sparse-data and minimal-signal analysis

When the dataset is too small or too noisy for robust analysis:

State the limitation plainly — "This dataset has 12 rows covering 3 months. Statistical analysis is limited."
Provide what's possible — totals, simple averages, directional observations
Name what would be needed — "6+ months of data would allow trend detection; 100+ transactions would support segment analysis"
One observation worth monitoring — the single most interesting signal, clearly labeled as preliminary
Do not pad — a short, honest report is better than a long, hedged one

No-data gate

When the user asks for analysis but provides no data:

Ask what data they have available and in what format
Suggest the minimum viable dataset for the analysis they want
Offer to help them structure their data for analysis
Provide a sample template they can populate

Do not generate fictional analysis or example reports unless the user explicitly asks for a template or demo.

Multi-dataset analysis

When the user provides multiple related datasets:

Identify join keys and relationship types before merging
Report any orphaned records (rows that don't match across datasets)
Be explicit about which dataset each finding comes from
Note where merged analysis adds insight vs. where datasets should be analyzed separately

Boundaries

No financial advice. Analyze data and identify patterns. Do not recommend investments, tax strategies, or financial products.
No projections as fact. Forecasts must be labeled with assumptions, methodology, and confidence range. "If current trends continue" not "revenue will be."
Statistical confidence labeling. Small samples and high variance get explicit warnings. Do not present a 3-point trend with the same confidence as a 300-point trend.
Inference/fact separation. Data points are facts. Patterns derived from them are inferences. Recommendations are opinions. Label each.
No private database access without explicit user setup. Work on data the user provides.
PII detection and exclusion. Scan every dataset for columns containing personally identifiable information (SSN, email addresses, phone numbers, physical addresses, government IDs, dates of birth). When PII is detected:
1. Immediately flag the PII columns prominently at the top of the output, before any analysis.
2. Exclude all PII columns from analysis — do not compute statistics on, reference values from, or reproduce any PII in the report.
3. Proceed with analysis on non-PII columns only (e.g., purchase totals, visit counts, plan types).
4. Recommend the user remove PII columns before sharing data for analysis.
5. Never quote, echo, or reference specific PII values (e.g., do not include an SSN in a "data quality finding").
No audit-grade output. Reports are analytical aids, not auditable financial statements.
No data fabrication. Never generate synthetic data to fill gaps without explicit user request and clear labeling.

安全使用建议

This skill is internally consistent and low-risk as an instruction-only analyzer, but consider these practical points before enabling it: (1) Do not paste sensitive PII or secrets — although the spec requires PII detection and exclusion, pasted sensitive values could be exposed in chat history; test on synthetic or redacted data first. (2) Verify the publisher/owner if you require provenance — the registry metadata shows no homepage and an unknown source. (3) If you plan to connect real production systems or enable automation later, require explicit user-configured connectors and audit those credentials separately. (4) Because it can be invoked autonomously (platform default), enable/disable the skill according to your agent autonomy settings if you want to limit automatic activations. Overall, the skill appears coherent with its stated purpose; treat it like a documentation-driven assistant and avoid sharing unrecoverable secrets or raw PII in prompts.

功能分析

Type: OpenClaw Skill Name: data-analysis-reporting Version: 0.1.0 The data-analysis-reporting skill bundle is a well-structured tool designed for business data interpretation. It includes comprehensive instructions for data validation, business metric calculations, and structured reporting templates. Notably, it incorporates strong defensive measures, such as a mandatory PII (Personally Identifiable Information) scan in 'references/data-quality-checks.md' and 'SKILL.md' that instructs the agent to flag and exclude sensitive data like SSNs or emails from analysis. There are no indicators of data exfiltration, malicious execution, or harmful prompt injection; the skill explicitly limits its scope to user-provided data and forbids providing financial advice or fabricating results.

能力标签

cryptocan-make-purchases

能力评估

✓ Purpose & Capability

Name/description (business data -> reports) aligns with the SKILL.md and spec. The skill only claims capabilities it documents (CSV/markdown/SQLite ingestion, summary stats, trends, cohorts, templates). It does not request unrelated credentials or binaries.

✓ Instruction Scope

Runtime instructions focus on: asking clarifying questions, ingesting and validating only user-provided data, proposing a plan, running analyses, and producing reports. The SKILL.md explicitly forbids connecting to external systems without user setup and requires PII detection and exclusion. There are no instructions to read arbitrary system files or to transmit data to external endpoints.

✓ Install Mechanism

No install spec and no code files — the skill is instruction-only, which minimizes attack surface. All files present are documentation and templates; nothing would be downloaded or executed on install.

✓ Credentials

The skill declares no required environment variables, credentials, or config paths. That is proportionate for a paste-upload analysis tool that operates on user-supplied data.

✓ Persistence & Privilege

always is false and there is no indication the skill requests persistent system privileges or modifies other skills. Agent autonomous invocation is allowed by default (platform behavior) but this skill does not request extra persistent presence.

版本历史

v0.1.0

v0.1.0 — First public release. Turn raw business data into clear analytical reports with plain-language insights. Clarification-first workflow, business metrics library (MRR, churn, CAC, LTV, margins), 6-category data quality checks, 5 report templates, PII detection and exclusion protocol. 43/45 QA score across 12 test cases.

元数据

Slug data-analysis-reporting

版本 0.1.0

许可证 MIT-0

累计安装 0

当前安装数 0

历史版本数 1

常见问题