
Data Analysis Engine

Author: 534422530 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 24 · Favorites: 0 · Current installs: 0 · Versions: 1
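Install in OpenClaw
/install laosi-data-analyzer
Description
Data analysis: loads CSV/JSON and automatically computes descriptive statistics (mean, median, standard deviation, min/max), with outlier detection, trend analysis, and local persistence of results.
Usage (SKILL.md)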

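Data Analyzer - Data Analysis Engine

Activation phrases: 分析数据 (analyze data) / data analyze / 统计 (statistics)

Features

  • CSV/JSON data loading and parsing
  • Automatic descriptive statistics: mean, median, standard deviation, min/max
  • Outlier detection (IQR/z-score)
  • Trend classification (up / down / fluctuating)
  • Results saved to a local JSON file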
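
The feature list names z-score alongside IQR, while the Python implementation in this listing only shows the IQR rule. A minimal sketch of a z-score check follows. The function name `zscore_outliers` and its `threshold` parameter are hypothetical and not part of the skill; the sketch assumes the sample standard deviation is the intended scale.

```python
import statistics
from typing import List


def zscore_outliers(values: List[float], threshold: float = 3.0) -> List[float]:
    """Return the values whose absolute z-score exceeds `threshold`.

    Hypothetical helper: not part of the skill's DataAnalyzer class.
    """
    if len(values) < 2:
        return []  # a standard deviation needs at least two points
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    if stdev == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]


readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 11.0, 10.0, 60.0]
print(zscore_outliers(readings, threshold=2.0))  # → [60.0]
```

With small samples a single extreme value inflates the standard deviation, so a threshold of 3 can be hard to reach; that is one reason the IQR rule is often preferred for short series.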

Python Implementation

import csv, json, statistics, math
from datetime import datetime
from typing import List, Dict, Any

class DataAnalyzer:
    def __init__(self):
        self.data: List[Dict[str, Any]] = []
        self.numeric_cols: List[str] = []
    
    def load_csv(self, path: str, delimiter: str = ",") -> int:
        """Load data from a CSV file; returns the number of rows."""
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f, delimiter=delimiter)
            self.data = list(reader)
        self._detect_numeric()
        return len(self.data)
    
    def load_json(self, path: str) -> int:
        """Load data from JSON (a top-level list of records, or a dict holding one)."""
        with open(path, encoding="utf-8") as f:
            raw = json.load(f)
        if isinstance(raw, list):
            self.data = raw
        elif isinstance(raw, dict):
            # Use the first list-valued field as the record list
            for v in raw.values():
                if isinstance(v, list):
                    self.data = v
                    break
        self._detect_numeric()
        return len(self.data)
    
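    def _detect_numeric(self):
        """Detect numeric columns by trying to parse the first row's values."""
        self.numeric_cols = []  # reset so repeated loads do not accumulate duplicates
        if not self.data:
            return
        for col in self.data[0]:
            try:
                float(self.data[0][col])
                self.numeric_cols.append(col)
            except (ValueError, TypeError):
                pass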
    
    def describe(self, col: str) -> dict:
        """Descriptive statistics for a numeric column."""
        if col not in self.numeric_cols:
            return {"error": f"'{col}' is not numeric"}
        # Skip missing values only; a plain truthiness test would also drop 0
        vals = [float(r[col]) for r in self.data if r.get(col) not in (None, "")]
        if not vals:
            return {"error": f"'{col}' has no values"}
        
        n = len(vals)
        mean = statistics.mean(vals)
        median = statistics.median(vals)
        stdev = statistics.stdev(vals) if n > 1 else 0
        
        # Outlier detection (IQR rule; quartiles are picked by index, without interpolation)
        sorted_vals = sorted(vals)
        q1 = sorted_vals[n // 4]
        q3 = sorted_vals[3 * n // 4]
        iqr = q3 - q1
        lower = q1 - 1.5 * iqr
        upper = q3 + 1.5 * iqr
        outliers = [v for v in vals if v < lower or v > upper]
        
        # Trend: compare the mean of the second half with the first half (5% band)
        half = n // 2
        first_half = statistics.mean(vals[:half]) if half > 0 else mean
        second_half = statistics.mean(vals[half:]) if half > 0 else mean
        if second_half > first_half * 1.05:
            trend = "up"
        elif second_half < first_half * 0.95:
            trend = "down"
        else:
            trend = "stable"
        
        return {
            "column": col,
            "count": n,
            "mean": round(mean, 2),
            "median": round(median, 2),
            "stdev": round(stdev, 2),
            "min": round(min(vals), 2),
            "max": round(max(vals), 2),
            "range": round(max(vals) - min(vals), 2),
            "q1": round(q1, 2),
            "q3": round(q3, 2),
            "iqr": round(iqr, 2),
            "outliers": len(outliers),
            "outlier_values": [round(v, 2) for v in outliers[:10]],
            "trend": trend,
        }
    
    def correlation(self, col1: str, col2: str) -> float:
        """Pearson correlation; returns None for non-numeric columns or fewer than 3 pairs."""
        if col1 not in self.numeric_cols or col2 not in self.numeric_cols:
            return None
        # Keep rows where both values are present (0 is a valid value)
        pairs = [(float(r[col1]), float(r[col2])) for r in self.data
                 if r.get(col1) not in (None, "") and r.get(col2) not in (None, "")]
        n = len(pairs)
        if n < 3:
            return None
        sum_x = sum(p[0] for p in pairs)
        sum_y = sum(p[1] for p in pairs)
        sum_xy = sum(p[0] * p[1] for p in pairs)
        sum_x2 = sum(p[0] ** 2 for p in pairs)
        sum_y2 = sum(p[1] ** 2 for p in pairs)
        num = n * sum_xy - sum_x * sum_y
        # Clamp at 0 so floating-point rounding on constant columns cannot make sqrt fail
        den = math.sqrt(max((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2), 0.0))
        return round(num / den, 3) if den else 0
    
    def report(self, output: str = None) -> dict:
        """Full analysis report; also written to `output` as JSON when a path is given."""
        report = {
            "rows": len(self.data),
            "columns": list(self.data[0].keys()) if self.data else [],
            "numeric_columns": self.numeric_cols,
            "statistics": {col: self.describe(col) for col in self.numeric_cols},
            "timestamp": datetime.now().isoformat(),
        }
        # Pairwise correlations between numeric columns
        if len(self.numeric_cols) >= 2:
            report["correlations"] = {}
            for i, c1 in enumerate(self.numeric_cols):
                for c2 in self.numeric_cols[i+1:]:
                    corr = self.correlation(c1, c2)
                    if corr is not None:
                        report["correlations"][f"{c1}_vs_{c2}"] = corr
        
        if output:
            with open(output, "w", encoding="utf-8") as f:
                json.dump(report, f, ensure_ascii=False, indent=2)
        return report

# Usage example
analyzer = DataAnalyzer()

# Sample data
sample_data = [
    {"date": "2026-05-01", "revenue": 1200, "users": 45, "conversion": 0.12},
    {"date": "2026-05-02", "revenue": 1350, "users": 52, "conversion": 0.14},
    {"date": "2026-05-03", "revenue": 1100, "users": 38, "conversion": 0.11},
    {"date": "2026-05-04", "revenue": 1600, "users": 61, "conversion": 0.13},
    {"date": "2026-05-05", "revenue": 900,  "users": 30, "conversion": 0.09},
    {"date": "2026-05-06", "revenue": 1450, "users": 55, "conversion": 0.15},
    {"date": "2026-05-07", "revenue": 1300, "users": 48, "conversion": 0.11},
]
analyzer.data = sample_data
analyzer._detect_numeric()

# Descriptive statistics
desc = analyzer.describe("revenue")
print(f"Revenue: mean={desc['mean']}, median={desc['median']}, trend={desc['trend']}")
print(f"Outliers: {desc['outliers']}")

# Correlation
corr = analyzer.correlation("revenue", "users")
print(f"Revenue-users correlation: {corr}")

# Full report
report = analyzer.report("analysis_results.json")
print(f"Analysis complete: {report['rows']} records, {len(report['statistics'])} numeric columns")
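
The `describe` method above picks Q1 and Q3 by index in the sorted list, which does not interpolate between neighbouring values. As an alternative sketch (not the skill's own method), the standard library's `statistics.quantiles` (Python 3.8+) can compute interpolated quartiles. The helper name `iqr_bounds` and the multiplier parameter `k` are hypothetical.

```python
import statistics
from typing import Dict, List


def iqr_bounds(values: List[float], k: float = 1.5) -> Dict[str, float]:
    """Compute Q1, Q3, the IQR, and the k*IQR outlier fences.

    Hypothetical helper; needs at least two data points.
    """
    # method="inclusive" treats the data as the full population and
    # interpolates linearly between neighbouring sorted values
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower": q1 - k * iqr,
        "upper": q3 + k * iqr,
    }


revenue = [1200, 1350, 1100, 1600, 900, 1450, 1300]
bounds = iqr_bounds(revenue)
outliers = [v for v in revenue if v < bounds["lower"] or v > bounds["upper"]]
print(bounds["q1"], bounds["q3"], outliers)  # → 1150.0 1400.0 []
```

On the sample revenue column the index-based approach yields Q1 = 1100 and Q3 = 1450, so the two methods give different fences; which one is appropriate depends on how the results will be compared with other tools.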

Example Output (abridged)

{
  "rows": 7,
  "columns": ["date", "revenue", "users", "conversion"],
  "statistics": {
    "revenue": {
      "mean": 1271.43,
      "median": 1300.0,
      "stdev": 230.68,
      "min": 900.0,
      "max": 1600.0,
      "trend": "up"
    }
  },
  "correlations": {
    "revenue_vs_users": 0.995,
    "revenue_vs_conversion": 0.809
  }
}
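
Because the report is plain JSON, a downstream script can read it back and filter it. The sketch below assumes only the `correlations` mapping shown in the example output; the function `strong_correlations`, its `threshold` parameter, and the stand-in data are hypothetical and written by hand for illustration.

```python
import json
import os
import tempfile
from typing import Dict, List, Tuple


def strong_correlations(report: Dict, threshold: float = 0.8) -> List[Tuple[str, float]]:
    """Return (pair_name, r) entries with |r| >= threshold, strongest first."""
    pairs = report.get("correlations", {})
    hits = [(name, r) for name, r in pairs.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


# Hand-written stand-in for a saved report (illustrative values only)
sample_report = {"correlations": {"a_vs_b": 0.91, "a_vs_c": -0.85, "b_vs_c": 0.40}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "analysis_results.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample_report, f)
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

print(strong_correlations(loaded))  # → [('a_vs_b', 0.91), ('a_vs_c', -0.85)]
```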

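Use Cases

  1. Business reporting: automated analysis of monthly/weekly operational data
  2. A/B testing: comparing key metrics between treatment and control groups
  3. Data quality: outlier detection to surface data-collection problems
  4. Trend monitoring: continuously tracking the direction of metric changes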
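
The A/B testing use case is not covered by a dedicated method in the code above. One way it could be approached with the same standard-library modules is sketched here, assuming each group is a list of per-period metric values. The function `compare_groups` is hypothetical, and the choice of Welch's t statistic is an assumption rather than something the skill specifies.

```python
import math
import statistics
from typing import Dict, List


def compare_groups(control: List[float], treatment: List[float]) -> Dict[str, float]:
    """Summarize the difference in means between two groups.

    Hypothetical helper; each group needs at least two values.
    """
    mean_c = statistics.mean(control)
    mean_t = statistics.mean(treatment)
    # Welch's t statistic uses each group's own sample variance
    se = math.sqrt(
        statistics.variance(control) / len(control)
        + statistics.variance(treatment) / len(treatment)
    )
    return {
        "control_mean": mean_c,
        "treatment_mean": mean_t,
        "relative_lift": (mean_t - mean_c) / mean_c if mean_c else 0.0,
        "welch_t": (mean_t - mean_c) / se if se else 0.0,
    }


control = [10.0, 12.0, 11.0, 9.0, 13.0]
treatment = [14.0, 15.0, 13.0, 16.0, 12.0]
result = compare_groups(control, treatment)
print(result["control_mean"], result["treatment_mean"], round(result["welch_t"], 2))
```

The t statistic alone is not a significance test; turning it into a p-value requires the t distribution, which the standard library does not provide.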

Dependencies

  • Python 3.8+
  • Standard library (csv, json, statistics, math)
Security Guidance
Before installing, users should understand that analyzing sensitive datasets may expose derived information in the agent’s context, and saving a report will leave a local JSON file behind. Use it with files you intended to analyze and choose output paths deliberately.
Capability Assessment
Purpose & Capability
The advertised purpose is data analysis, and the artifact implements matching capabilities: CSV/JSON loading, statistical summaries, outlier detection, trend checks, correlations, and optional JSON report output.
Instruction Scope
The activation phrases are broad, so the skill could be invoked for ordinary data-analysis requests, but the behavior remains aligned with those requests and does not add unrelated authority.
Install Mechanism
The package contains only a markdown skill file with embedded example Python code; there are no executable install scripts, dependency installs, background workers, or package registry dependencies.
Credentials
File access is limited to user-specified CSV/JSON input paths and an optional user-specified report path, using Python standard-library modules only.
Persistence & Privilege
The skill can persist derived analysis results to a local JSON file, including column names, statistics, outlier samples, correlations, and a timestamp; this is disclosed in the description, feature list, and example output flow.
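How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install laosi-data-analyzer
  3. Once installed, invoke the Skill by name or trigger it with /laosi-data-analyzer
  4. Provide the inputs described in the Skill's parameter documentation to get structured output
Version History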
v1.0.0
Initial release of laosi-data-analyzer.
  • Supports loading and parsing CSV/JSON data
  • Automatically computes statistical descriptions: mean, median, standard deviation, min/max
  • Includes outlier detection (IQR/z-score) and trend analysis
  • Generates correlations between numeric columns
  • Saves analysis results locally as JSON
  • Suitable for business reports, A/B testing, data quality checks, and trend monitoring
Metadata
Slug laosi-data-analyzer
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Versions 1
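FAQ

What is Data Analysis Engine?

Data analysis: loads CSV/JSON and automatically computes descriptive statistics (mean, median, standard deviation, min/max), with outlier detection, trend analysis, and local persistence of results. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 24 total downloads to date.

How do I install Data Analysis Engine?

Run the command "/install laosi-data-analyzer" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Data Analysis Engine free?

Yes. Data Analysis Engine is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Data Analysis Engine support?

Data Analysis Engine is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Data Analysis Engine?

It is developed and maintained by 534422530 (@534422530); the current version is v1.0.0.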
