bettermen

Data Analyst Skill (数据分析师skill)

Author: bettermen · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 38
Favorites: 0
Current installs: 0
Versions: 1
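Install in OpenClaw
/install data-analyst-pipeline
Description
An automated data-analyst workflow. It covers the full data analysis pipeline: data loading, quality audit, data cleaning, exploratory data analysis (EDA), statistical modeling, and visual HTML report generation. It supports CSV, Excel, JSON, and SQLite input and has a built-in 4-layer data defense system. Trigger phrases: 分析数据 (analyze data), 数据分析 (data analysis), 帮我分析数据 (help me analyze data), 数据报告 (data report), EDA, data analysis, analyz...
Instructions (SKILL.md)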

Data Analyst (数据分析师)

AI-powered data analysis workflow. Covers the full pipeline from data ingestion to interactive HTML report generation.

When to Use

Trigger when the user asks to:

  • Analyze a dataset (CSV / Excel / JSON / SQLite)
  • Generate a data analysis report
  • Do exploratory data analysis (EDA)
  • Clean or preprocess data
  • Create data visualizations
  • Understand data distributions and relationships

Workflow Overview

The skill follows a 7-phase pipeline modeled on CRISP-DM, executed automatically:

  1. Data Loading — Auto-detect format, load into DataFrame
  2. Data Audit — 4-layer defense: health check, structure, business rules, model readiness
  3. Data Cleaning — Missing values, outliers, type conversion, dedup
  4. EDA — Distribution analysis, correlation, group aggregation
  5. Statistical Analysis — Descriptive stats, hypothesis tests, trend detection
  6. Visualization — Charts for distributions, correlations, category breakdowns
  7. Report Generation — Interactive HTML report with scorecards, charts, and insights
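
Phases 2 and 3 can be pictured with a short pandas sketch. The function names, the returned keys, and the cleaning choices (exact-duplicate removal, median fill for numeric columns) are illustrative assumptions, not the documented behavior of the skill's data_auditor or data_cleaner modules:

```python
import pandas as pd


def health_check(df):
    """Summarize basic quality signals: size, missing ratios, duplicate rows."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing_ratio": df.isna().mean().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def basic_clean(df):
    """Drop exact duplicate rows, then fill numeric gaps with the column median."""
    cleaned = df.drop_duplicates().reset_index(drop=True)
    numeric_cols = cleaned.select_dtypes(include="number").columns
    medians = cleaned[numeric_cols].median()
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(medians)
    return cleaned


# Tiny hypothetical dataset: one duplicated row and one missing value.
df = pd.DataFrame(
    {"region": ["N", "N", "S", "S"], "sales": [10.0, 10.0, None, 30.0]}
)
report = health_check(df)
df_clean = basic_clean(df)
print(report)
print(df_clean)
```

The input DataFrame is left unmodified, so the audit result and the cleaned copy can be compared side by side.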

Usage

Quick Start

To analyze a data file:

python {baseDir}/scripts/run_analysis.py <data_file> [--output report.html]

The script auto-detects the file format and runs the full pipeline.
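
How the format is detected is not described on this page. A minimal sketch, assuming detection is based on the file extension; detect_format, load_any, and the extension map are hypothetical names for illustration, not the skill's actual data_loader API:

```python
from contextlib import closing
from pathlib import Path

# Hypothetical extension map; the skill's real loader may detect formats differently.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "json",
    ".db": "sqlite",
    ".sqlite": "sqlite",
}


def detect_format(path):
    """Return a format label based on the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return EXTENSION_FORMATS[suffix]


def load_any(path, table=None):
    """Load a file into a pandas DataFrame according to its detected format."""
    import sqlite3

    import pandas as pd  # imported lazily so detect_format works without pandas

    fmt = detect_format(path)
    if fmt == "csv":
        return pd.read_csv(path)
    if fmt == "excel":
        return pd.read_excel(path)  # .xlsx files need an engine such as openpyxl
    if fmt == "json":
        return pd.read_json(path)
    # SQLite: this sketch assumes the caller names the table to read.
    if table is None:
        raise ValueError("A table name is required for SQLite files")
    with closing(sqlite3.connect(path)) as conn:
        return pd.read_sql_query(f'SELECT * FROM "{table}"', conn)
```

Keeping detection separate from loading makes it easy to reject unsupported files before any data is read.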

Module-Level Usage

Each module can be used independently:

# Assumes {baseDir}/scripts is on the import path (for example, run from that directory)
# Load data
from data_loader import load_data
df = load_data("sales.csv")

# Audit data quality
from data_auditor import audit_data
report = audit_data(df)

# Clean data
from data_cleaner import clean_data
df_clean = clean_data(df)

# Run EDA
from eda_runner import run_eda
eda_results = run_eda(df_clean)

# Generate report
from report_builder import build_report
build_report(df_clean, eda_results, "report.html")

Scripts Reference

| Script | Purpose | Input | Output |
| --- | --- | --- | --- |
| scripts/run_analysis.py | Main entry: orchestrates the full pipeline | data file path | HTML report |
| scripts/data_loader.py | Multi-format data loading | file path | pandas DataFrame |
| scripts/data_auditor.py | 4-layer quality defense | DataFrame | audit dict |
| scripts/data_cleaner.py | Data cleaning & preprocessing | DataFrame | cleaned DataFrame |
| scripts/eda_runner.py | Exploratory data analysis | DataFrame | EDA results dict |
| scripts/visualizer.py | Chart generation | DataFrame + config | saved .png charts |
| scripts/report_builder.py | HTML report generation | Data + results | HTML report |

Templates

  • templates/report.html — Jinja2 template for the final HTML report

Config

  • config/business_rules.yaml — Optional business validation rules
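
The schema of config/business_rules.yaml is not shown on this page. As a purely hypothetical illustration, rules of the kind such a file might hold can be expressed as a Python dict and checked record by record; every column name and rule key below is an assumption:

```python
# Hypothetical rule set; the real business_rules.yaml schema may differ.
RULES = {
    "price": {"min": 0},
    "quantity": {"min": 1, "max": 1000},
    "status": {"allowed": ["paid", "pending", "refunded"]},
}


def check_row(row, rules):
    """Return a list of human-readable violations for one record (a dict)."""
    violations = []
    for column, rule in rules.items():
        if column not in row:
            violations.append(f"{column}: missing column")
            continue
        value = row[column]
        if "min" in rule and value < rule["min"]:
            violations.append(f"{column}: {value} is below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            violations.append(f"{column}: {value} is above {rule['max']}")
        if "allowed" in rule and value not in rule["allowed"]:
            violations.append(f"{column}: {value!r} is not an allowed value")
    return violations


good = {"price": 9.5, "quantity": 2, "status": "paid"}
bad = {"price": -1, "quantity": 5000, "status": "lost"}
print(check_row(good, RULES))
print(check_row(bad, RULES))
```

Returning a list of messages rather than raising on the first failure lets an audit report show every violation at once.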

Dependencies

Install before first use:

pip install pandas numpy matplotlib seaborn scipy jinja2 pyyaml missingno

Notes

  • For files > 100MB, the audit module uses sampling (n=50000) to stay performant
  • Business rules in config/business_rules.yaml are optional; skip if no domain-specific rules exist
  • All charts are saved to a charts/ subdirectory in the output folder before embedding in HTML
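
The size-based sampling in the first note could look roughly like the following. This is a sketch under assumptions: it reads n as a row count, treats 100MB as 100 × 1024 × 1024 bytes, and uses a fixed seed for repeatability. maybe_sample is a hypothetical helper, not part of the skill:

```python
import pandas as pd

SIZE_THRESHOLD_BYTES = 100 * 1024 * 1024  # assumed meaning of "100MB"
SAMPLE_ROWS = 50_000


def maybe_sample(df, file_size_bytes, n=SAMPLE_ROWS, seed=0):
    """Return a random n-row sample for large files, otherwise df unchanged."""
    if file_size_bytes <= SIZE_THRESHOLD_BYTES or len(df) <= n:
        return df
    return df.sample(n=n, random_state=seed)


small = pd.DataFrame({"x": range(10)})
big = pd.DataFrame({"x": range(60_000)})
print(len(maybe_sample(small, file_size_bytes=1_000)))
print(len(maybe_sample(big, file_size_bytes=200 * 1024 * 1024)))
```

In practice the file size would come from something like os.path.getsize(path) before the audit runs.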
Security Recommendations
Install only if you are comfortable giving the skill access to the datasets you explicitly point it at. Treat generated HTML reports, charts, summary JSON, and terminal output as potentially sensitive, and avoid using untrusted datasets because report text is built from dataset-derived names and values.
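
One common mitigation for the risk described above is to escape dataset-derived text before it is placed into HTML. The sketch below uses the standard library's html.escape; whether the skill's report builder already does this is not stated here, and render_cell is a hypothetical helper:

```python
from html import escape


def render_cell(value):
    """Render one table cell with HTML special characters escaped."""
    return f"<td>{escape(str(value))}</td>"


# A value crafted to inject markup is rendered as inert text.
print(render_cell('<script>alert("x")</script>'))
```

With a Jinja2 template, enabling autoescaping on the environment serves the same purpose.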
Capability Assessment
Purpose & Capability
The files coherently implement the stated data-analysis workflow: load CSV/Excel/JSON/SQLite data, audit and clean it, run EDA, generate charts, and build an HTML report.
Instruction Scope
The trigger phrases are broad, but they are all data-analysis related; users should invoke it intentionally with explicit dataset paths.
Install Mechanism
No hidden installer, hooks, or auto-start mechanism were found. Dependencies are disclosed as normal Python packages to install manually.
Credentials
Filesystem reads and writes are proportionate to the purpose: it reads the chosen data file and writes reports, charts, and summary JSON. Outputs may reveal dataset schema, values, and analysis results.
Persistence & Privilege
No persistence, privilege escalation, credential access, network access, or destructive behavior was found. It only creates local report artifacts and chart files.
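How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install data-analyst-pipeline
  3. After installation, invoke the skill by name or trigger it with /data-analyst-pipeline
  4. Provide the required inputs as described in the skill's parameters to get structured output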
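Version History
v1.0.0
Initial release: complete data analysis pipeline. Supports CSV, Excel, JSON, and SQLite input, a 4-layer data quality audit, automatic EDA, 7 types of visualization charts, and an interactive HTML report.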
Metadata
Slug data-analyst-pipeline
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
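FAQ

What is Data Analyst Skill (数据分析师skill)?

An automated data-analyst workflow. It covers the full data analysis pipeline: data loading, quality audit, data cleaning, exploratory data analysis (EDA), statistical modeling, and visual HTML report generation. It supports CSV, Excel, JSON, and SQLite input and has a built-in 4-layer data defense system. Trigger phrases: 分析数据 (analyze data), 数据分析 (data analysis), 帮我分析数据 (help me analyze data), 数据报告 (data report), EDA, data analysis, analyz... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 38 total downloads so far.

How do I install Data Analyst Skill?

Run the command /install data-analyst-pipeline in the OpenClaw or Claude Code chat box to install it in one step. The Python packages listed under Dependencies still need to be installed with pip before first use.

Is Data Analyst Skill free?

Yes. Data Analyst Skill is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Data Analyst Skill support?

Data Analyst Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Data Analyst Skill?

It is developed and maintained by bettermen (@bettermen). The current version is v1.0.0.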
