scc-nyy

Data Cleaning and Statistical Analysis Skill

Author: scc-nyy · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 37 · Favorites: 8 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install data-cleaning-and-statistical-analysis-skill
Description
Provides data cleaning, quality checks, statistical test selection, analysis, and academic interpretation for quantitative behavioral and experimental datasets.
Usage Instructions (SKILL.md)

Data Cleaning and Statistical Analysis Skill

Purpose

This skill supports data cleaning, quality checking, statistical analysis, and academic interpretation of quantitative datasets. It is especially useful for experimental psychology, clinical research, behavioral science, education research, questionnaire studies, and project-based data analysis.

When to Use

Use this skill when the user needs help with:

  • Cleaning raw CSV, Excel, SPSS-exported, PsychoPy, PsychoJS, or online experiment data.
  • Checking whether behavioral data are valid or usable.
  • Identifying missing values, duplicate rows, abnormal reaction times, impossible responses, or coding problems.
  • Splitting or merging datasets.
  • Creating derived variables such as accuracy, mean reaction time, omission errors, commission errors, learning scores, block-level performance, or change scores.
  • Selecting statistical tests based on research design.
  • Running descriptive statistics, t-tests, ANOVA, repeated-measures ANOVA, mixed ANOVA, correlation, regression, chi-square tests, or nonparametric tests.
  • Explaining statistical results in academic language.

Inputs

The user may provide:

  • A dataset file such as .csv, .xlsx, .sav, or .tsv.
  • A description of the study design.
  • Variable names and coding rules.
  • Grouping information, such as patient group vs healthy control group.
  • Experimental condition labels, such as block, trial type, congruent/incongruent, target/non-target, or pre/post.
  • Required output format, such as APA style, thesis writing, tables, graphs, or plain-language explanation.

Core Workflow

1. Understand the Research Design

Before analysis, identify:

  • Whether the design is between-subjects, within-subjects, mixed, cross-sectional, longitudinal, or pre-post.
  • What the independent variables are.
  • What the dependent variables are.
  • Whether the main research question is group difference, condition difference, association, prediction, or change over time.
  • Whether the data come from behavioral tasks, questionnaires, clinical scales, or physiological measures.

2. Inspect the Dataset

Check:

  • Number of rows and columns.
  • Variable names.
  • Data types.
  • Missing values.
  • Duplicate participant IDs.
  • Unexpected category labels.
  • Range and distribution of key variables.
  • Whether trial numbers and block numbers match the intended experimental design.
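
The checks above can be sketched as a small inspection pass. This is a minimal illustration using only the Python standard library; the column names (`participant`, `block`, `trial`, `condition`, `rt`, `correct`), the inline sample data, and the `inspect` helper are all assumptions made for the example, not part of the skill. Because trial-level files repeat participant IDs by design, the sketch looks for duplicated participant/block/trial combinations instead.

```python
import csv
import io
from collections import Counter

# Hypothetical trial-level data; column names are assumptions for illustration.
RAW = """participant,block,trial,condition,rt,correct
P01,1,1,congruent,0.512,1
P01,1,2,incongruent,,0
P02,1,1,congruent,0.430,1
P02,1,1,congruent,0.430,1
P02,1,2,Incongruent,0.655,1
"""

def inspect(rows, expected_conditions):
    """Return a small data-quality summary for a list of row dicts."""
    columns = list(rows[0].keys()) if rows else []
    # Count empty cells per column.
    missing = Counter()
    for row in rows:
        for name, value in row.items():
            if value is None or value.strip() == "":
                missing[name] += 1
    # Duplicate trials: same participant, block, and trial seen more than once.
    keys = Counter((r["participant"], r["block"], r["trial"]) for r in rows)
    duplicates = sorted(k for k, n in keys.items() if n > 1)
    # Labels outside the expected coding scheme (comparison is case-sensitive).
    unexpected = sorted({r["condition"] for r in rows} - set(expected_conditions))
    return {
        "n_rows": len(rows),
        "columns": columns,
        "missing": dict(missing),
        "duplicate_trials": duplicates,
        "unexpected_labels": unexpected,
    }

rows = list(csv.DictReader(io.StringIO(RAW)))
report = inspect(rows, expected_conditions=["congruent", "incongruent"])
print(report)
```

On this sample the summary flags one missing `rt`, one duplicated trial for `P02`, and the inconsistently capitalized label `Incongruent`. A real workflow would more likely use a data-frame library, but the checks are the same.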

3. Clean the Data

Common cleaning steps include:

  • Removing practice trials when formal analysis should only include experimental trials.
  • Excluding invalid trials, such as no response, timeout, or incorrect response when reaction time analysis requires correct trials only.
  • Filtering implausible reaction times according to task-specific rules.
  • Recoding categorical variables.
  • Creating participant-level summary scores.
  • Calculating condition-level means and accuracy.
  • Checking whether each participant has enough valid trials.
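
As a rough sketch of these steps, the function below drops practice trials, keeps only correct trials within a plausible reaction-time window for the RT mean, and builds participant-level summaries. The field names, the 200–2000 ms window, and the `min_valid` threshold are illustrative assumptions; actual cut-offs are task-specific and should come from the study's own exclusion criteria. The input list is never modified, in line with preserving the original data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trials; field names and thresholds are assumptions for illustration.
TRIALS = [
    {"participant": "P01", "phase": "practice", "rt_ms": 700, "correct": 1},
    {"participant": "P01", "phase": "main", "rt_ms": 500, "correct": 1},
    {"participant": "P01", "phase": "main", "rt_ms": 90, "correct": 1},    # anticipatory
    {"participant": "P01", "phase": "main", "rt_ms": 600, "correct": 0},
    {"participant": "P01", "phase": "main", "rt_ms": None, "correct": 0},  # no response
    {"participant": "P02", "phase": "main", "rt_ms": 400, "correct": 1},
    {"participant": "P02", "phase": "main", "rt_ms": 800, "correct": 1},
]

def summarise(trials, rt_min=200, rt_max=2000, min_valid=2):
    """Participant-level accuracy and mean correct RT; the input is not mutated."""
    by_participant = defaultdict(list)
    for t in trials:
        if t["phase"] != "practice":  # formal analysis uses experimental trials only
            by_participant[t["participant"]].append(t)

    summary = {}
    for pid, main in by_participant.items():
        # RT analysis: correct trials with a response inside the plausible window.
        valid_rts = [
            t["rt_ms"]
            for t in main
            if t["correct"] == 1
            and t["rt_ms"] is not None
            and rt_min <= t["rt_ms"] <= rt_max
        ]
        summary[pid] = {
            "n_trials": len(main),
            "accuracy": mean(t["correct"] for t in main),
            "n_valid_rt": len(valid_rts),
            "mean_rt_ms": mean(valid_rts) if valid_rts else None,
            "enough_trials": len(valid_rts) >= min_valid,
        }
    return summary

summary = summarise(TRIALS)
for pid, stats in summary.items():
    print(pid, stats)
```

Accuracy is computed over all experimental trials, while the RT mean uses only the filtered subset, so the two scores can rest on different trial counts; reporting `n_valid_rt` alongside the mean makes that visible.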

4. Choose Statistical Tests

Select tests according to the design:

  • Two independent groups: independent-samples t-test or Mann-Whitney U test.
  • Two paired conditions: paired-samples t-test or Wilcoxon signed-rank test.
  • More than two repeated conditions: repeated-measures ANOVA or Friedman test.
  • Group × Condition design: mixed ANOVA or linear mixed model.
  • Association between variables: Pearson or Spearman correlation.
  • Prediction model: linear regression, logistic regression, or mixed-effects regression.
  • Categorical variables: chi-square test or Fisher's exact test.
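
The mapping above can be written down as a simple lookup. This is only a sketch: the design keys and the `candidate_tests` helper are hypothetical names introduced for illustration, and the function returns candidates without deciding between them, since that choice depends on assumption checks (for example distribution shape, sample size, or expected cell counts) that are not encoded here.

```python
# Mapping copied from the list above: design -> candidate tests.
# The dictionary keys are hypothetical labels chosen for this example.
TEST_OPTIONS = {
    "two_independent_groups": ("independent-samples t-test", "Mann-Whitney U test"),
    "two_paired_conditions": ("paired-samples t-test", "Wilcoxon signed-rank test"),
    "repeated_conditions": ("repeated-measures ANOVA", "Friedman test"),
    "group_by_condition": ("mixed ANOVA", "linear mixed model"),
    "association": ("Pearson correlation", "Spearman correlation"),
    "prediction": ("linear regression", "logistic regression", "mixed-effects regression"),
    "categorical": ("chi-square test", "Fisher's exact test"),
}

def candidate_tests(design):
    """Return the candidate tests for a design label, or raise on an unknown label."""
    try:
        return TEST_OPTIONS[design]
    except KeyError:
        known = ", ".join(sorted(TEST_OPTIONS))
        raise ValueError(f"Unknown design {design!r}; expected one of: {known}") from None

# Example: patient group vs. healthy controls across several blocks.
print(candidate_tests("group_by_condition"))
```

Raising on an unknown label, instead of silently returning a default, matches the guideline of warning when the data structure does not match the intended design.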

5. Report Results

Results should include:

  • Descriptive statistics.
  • Test statistic.
  • Degrees of freedom when applicable.
  • p value.
  • Effect size.
  • Confidence interval when appropriate.
  • Interpretation linked to the research hypothesis.
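
To make the reporting elements concrete, here is a minimal sketch for two independent groups that computes descriptive statistics, a pooled-variance (Student's) t statistic, its degrees of freedom, and Cohen's d. The data and the helper names are invented for illustration. The p value and confidence interval are deliberately left out: they require the t distribution, which the Python standard library does not provide, so in practice they would come from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

def student_t_summary(group_a, group_b):
    """Descriptives, pooled-variance t, df, and Cohen's d for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    m_a, m_b = mean(group_a), mean(group_b)
    sd_a, sd_b = stdev(group_a), stdev(group_b)  # sample SD (n - 1 denominator)
    df = n_a + n_b - 2
    pooled_sd = sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    t = (m_a - m_b) / (pooled_sd * sqrt(1 / n_a + 1 / n_b))
    d = (m_a - m_b) / pooled_sd  # Cohen's d using the pooled SD
    return {
        "mean_a": m_a, "sd_a": sd_a, "n_a": n_a,
        "mean_b": m_b, "sd_b": sd_b, "n_b": n_b,
        "t": t, "df": df, "d": d,
    }

def format_report(s):
    """Format the test statistic and effect size in an APA-like style."""
    return f"t({s['df']}) = {s['t']:.2f}, d = {s['d']:.2f}"

# Invented mean reaction times (ms) for two small groups.
patients = [520, 560, 600, 640, 680]
controls = [480, 500, 520, 540, 560]

result = student_t_summary(patients, controls)
line = format_report(result)
print(line)
```

The sketch assumes equal variances; when that assumption is doubtful, Welch's t-test is a common alternative and changes both the statistic and the degrees of freedom.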

Output Requirements

The assistant should provide:

  • A clear summary of data quality.
  • Cleaning decisions and exclusion criteria.
  • A table of key descriptive statistics when useful.
  • Recommended statistical tests with justification.
  • Interpretable results in academic language.
  • Warnings when the data structure does not match the intended design.
  • Suggestions for improving data collection or coding if problems are found.

Style Guidelines

  • Be transparent about assumptions.
  • Do not overclaim statistical significance.
  • Distinguish between descriptive trends and statistically significant findings.
  • Explain statistical concepts in accessible language when the user is a beginner.
  • Use academic wording when the user is preparing a thesis, report, or ethics application.
  • Preserve original data unless the user explicitly asks for a cleaned file.

Example User Requests

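  • "Help me check whether this CPT data is correct."
  • "Please clean the data exported from PsychoPy and calculate the accuracy and reaction time for each block."
  • "My study compares the performance of a patient group and a healthy group across four blocks. Which statistical method should I use?"
  • "Help me reorganize this data into a format that SPSS can analyze."
  • "Based on these results, help me write up the statistics in APA style."
Safe Usage Advice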
Safe to install for data cleaning and statistical analysis help. Users should avoid sharing identifiable participant, clinical, or confidential research data unless they have permission and an appropriate privacy workflow.
Capability Assessment
Purpose & Capability
The artifact coherently focuses on inspecting user-provided datasets, cleaning data, choosing statistical tests, and reporting academic interpretations.
Instruction Scope
Instructions are scoped to user-supplied research data and explicitly advise preserving original data unless the user asks for a cleaned file.
Install Mechanism
The package contains only a non-executable SKILL.md file, no dependencies, no install scripts, and clean static and SkillSpector scans.
Credentials
The skill may be used with sensitive research, behavioral, clinical, or questionnaire datasets, but that data handling is expected for the stated purpose and depends on user-provided files.
Persistence & Privilege
No persistence, privilege escalation, credential access, background execution, network behavior, broad local indexing, or destructive actions are present in the artifact.
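How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install data-cleaning-and-statistical-analysis-skill
  3. After installation, invoke the Skill by name or trigger it with /data-cleaning-and-statistical-analysis-skill
  4. Provide the required inputs according to the Skill's parameter description to get structured output.
Version History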
v1.0.0
Initial release of the Data Cleaning and Statistical Analysis Skill:

  • Supports data cleaning, validation, and quality checking for experimental, questionnaire, and behavioral datasets.
  • Guides choice and performance of statistical analyses, including t-tests, ANOVA (various types), regression, correlation, and nonparametric tests.
  • Produces academic-style interpretations, result summaries, tables, and explanations suitable for theses or research reports.
  • Handles various file formats (CSV, Excel, SPSS, etc.) and accommodates design details like grouping and experimental conditions.
  • Provides dataset inspection, cleaning decisions, test recommendations, and step-by-step statistical reasoning.
Metadata
Slug: data-cleaning-and-statistical-analysis-skill
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
FAQ

What is the Data Cleaning and Statistical Analysis Skill?

It provides data cleaning, quality checks, statistical test selection, analysis, and academic interpretation for quantitative behavioral and experimental datasets. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 37 total downloads to date.

How do I install the Data Cleaning and Statistical Analysis Skill?

Run the command "/install data-cleaning-and-statistical-analysis-skill" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is the Data Cleaning and Statistical Analysis Skill free?

Yes. The Data Cleaning and Statistical Analysis Skill is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does the Data Cleaning and Statistical Analysis Skill support?

The Data Cleaning and Statistical Analysis Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed the Data Cleaning and Statistical Analysis Skill?

It is developed and maintained by scc-nyy (@scc-nyy); the current version is v1.0.0.
