
Experiment Designer

Name: Experiment Designer
Author: alirezarezvani

By Alireza Rezvani · GitHub · v2.1.1 · MIT-0

cross-platform ✓ Security check passed

630 total downloads

Install in OpenClaw

/install experiment-designer

Description

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical s...

Usage instructions (SKILL.md)

Experiment Designer

Design, prioritize, and evaluate product experiments with clear hypotheses and defensible decisions.

When To Use

Use this skill for:

A/B and multivariate experiment planning
Hypothesis writing and success criteria definition
Sample size and minimum detectable effect planning
Experiment prioritization with ICE scoring
Reading statistical output for product decisions

Core Workflow

Write hypothesis in If/Then/Because format

If we change [intervention]
Then [metric] will change by [expected direction/magnitude]
Because [behavioral mechanism]

Define metrics before running test

Primary metric: single decision metric
Guardrail metrics: quality/risk protection
Secondary metrics: diagnostics only

Estimate sample size

Baseline conversion or baseline mean
Minimum detectable effect (MDE)
Significance level (alpha) and power

Use:

python3 scripts/sample_size_calculator.py --baseline-rate 0.12 --mde 0.02 --mde-type absolute
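For a rough cross-check of the command above, the following is a minimal sketch of the textbook two-proportion sample-size formula, using only the standard library. It is not taken from `scripts/sample_size_calculator.py`, whose exact formula is not shown here. The function name `sample_size_per_variant`, the two-sided test, equal group sizes, and the defaults of alpha 0.05 and power 0.8 are all assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-sided two-proportion z-test.

    Assumes equal group sizes and an absolute MDE (hypothetical helper,
    not the bundled script's implementation).
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of Bernoulli variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


n = sample_size_per_variant(baseline_rate=0.12, mde_abs=0.02)
print(f"per variant: {n}, total for two variants: {2 * n}")
```

If the script reports a noticeably different number for the same inputs, it may be using a pooled-variance or continuity-corrected variant of the formula, which is worth checking before launch.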

Prioritize experiments with ICE

Impact: potential upside
Confidence: evidence quality
Ease: cost/speed/complexity

ICE Score = (Impact * Confidence * Ease) / 10
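The ICE formula above can be applied to a backlog as in the sketch below. The 1 to 10 scoring scale, the function name `ice_score`, and the example experiments are illustrative assumptions; the skill does not prescribe them.

```python
def ice_score(impact, confidence, ease):
    """ICE Score = (Impact * Confidence * Ease) / 10, as defined above."""
    return impact * confidence * ease / 10


# Hypothetical backlog: (name, impact, confidence, ease), each scored 1-10.
backlog = [
    ("Rename pricing tiers", 5, 4, 9),
    ("Simplify checkout form", 8, 6, 7),
    ("Add onboarding checklist", 7, 7, 4),
]

# Highest score first.
ranked = sorted(backlog, key=lambda row: ice_score(*row[1:]), reverse=True)

for name, impact, confidence, ease in ranked:
    print(f"{ice_score(impact, confidence, ease):6.1f}  {name}")
```

Because the three factors are multiplied, a very low score on any one of them pulls the whole experiment down the list, which is usually the intended behavior for low-confidence ideas.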

Launch with stopping rules

Decide fixed sample size or fixed duration in advance
Avoid repeated peeking unless you use a method designed for it (for example, sequential testing)
Monitor guardrails continuously
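One way to fix the duration in advance is to derive it from the required sample size and expected traffic, as sketched below. The function name `planned_duration_days`, its parameters, and the choice to round up to whole weeks (so each weekday is represented equally) are assumptions for illustration, not part of the skill.

```python
import math


def planned_duration_days(n_per_variant, variants, daily_eligible_users,
                          traffic_share=1.0):
    """Days needed to reach the planned sample, rounded up to full weeks.

    Hypothetical helper: assumes steady daily traffic and that each
    eligible user is enrolled at most once.
    """
    total_needed = n_per_variant * variants
    daily_enrolled = daily_eligible_users * traffic_share
    days = math.ceil(total_needed / daily_enrolled)
    # Round up to whole weeks to avoid day-of-week imbalance.
    return math.ceil(days / 7) * 7


days = planned_duration_days(n_per_variant=4500, variants=2,
                             daily_eligible_users=1200, traffic_share=0.5)
print(f"run for {days} days, then analyze once")
```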

Interpret results

Statistical significance is not business significance
Compare point estimate + confidence interval to decision threshold
Investigate novelty effects and segment heterogeneity
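The comparison of point estimate and confidence interval against a decision threshold can be sketched as follows. This uses a normal-approximation (Wald) interval for the difference in conversion rates. The function names, the three decision labels, and the example counts are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def diff_in_proportions(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Return (point estimate, CI low, CI high) for rate_b - rate_a."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return diff, diff - z * se, diff + z * se


def decide(ci_low, ci_high, threshold):
    """Compare the whole interval, not just the point estimate, to the threshold."""
    if ci_low >= threshold:
        return "ship"
    if ci_high < threshold:
        return "do not ship"
    return "inconclusive"


# Hypothetical counts: control 600/5000, treatment 700/5000.
diff, low, high = diff_in_proportions(600, 5000, 700, 5000)
print(f"lift = {diff:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
print(decide(low, high, threshold=0.01))
```

In this example the interval excludes zero, so the result is statistically significant, yet it still straddles the 0.01 practical threshold. That is the situation the first bullet above warns about.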

Hypothesis Quality Checklist

Contains explicit intervention and audience
Specifies measurable metric change
States plausible causal reason
Includes expected minimum effect
Defines failure condition

Common Experiment Pitfalls

Underpowered tests leading to false negatives
Running too many simultaneous changes without isolation
Changing targeting or implementation mid-test
Stopping early on random spikes
Ignoring sample ratio mismatch and instrumentation drift
Declaring success from p-value without effect-size context
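Sample ratio mismatch can be screened for with a chi-square goodness-of-fit test on the assignment counts. The sketch below uses only the standard library and relies on the fact that, for one degree of freedom, the chi-square tail probability equals `erfc(sqrt(x / 2))`. The function name `srm_p_value` and the 0.001 alert threshold are assumptions, not values given by the skill.

```python
import math


def srm_p_value(n_control, n_treatment, expected_share_control=0.5):
    """p-value of a 1-df chi-square test that the split matches the design."""
    total = n_control + n_treatment
    exp_control = total * expected_share_control
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


p = srm_p_value(n_control=5000, n_treatment=5400)
if p < 0.001:  # assumed alert threshold
    print(f"possible sample ratio mismatch (p = {p:.2e}); check assignment and logging")
else:
    print(f"split looks consistent with the design (p = {p:.3f})")
```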

Statistical Interpretation Guardrails

p-value < alpha indicates evidence against the null, not guaranteed truth.
A confidence interval that crosses zero (no effect) means the direction of the effect is uncertain.
Wide intervals imply low precision even when significant.
Use practical significance thresholds tied to business impact.

See:

references/experiment-playbook.md
references/statistics-reference.md

Tooling

`scripts/sample_size_calculator.py`

Computes required sample size (per variant and total) from:

baseline rate
MDE (absolute or relative)
significance level (alpha)
statistical power

Example:

python3 scripts/sample_size_calculator.py \
  --baseline-rate 0.10 \
  --mde 0.015 \
  --mde-type absolute \
  --alpha 0.05 \
  --power 0.8
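The difference between absolute and relative MDE is easy to get wrong, so the sketch below shows one common convention for turning each into a target rate. How the bundled script actually interprets `--mde-type relative` is not shown here, so treat this as an assumption to verify against the script; `target_rate` is a hypothetical helper.

```python
def target_rate(baseline_rate, mde, mde_type="absolute"):
    """Rate the treatment must reach for the MDE to be met.

    absolute: mde is added to the baseline (0.10 + 0.015 -> 0.115)
    relative: mde is a fraction of the baseline (0.10 * 1.15 -> 0.115)
    """
    if mde_type == "absolute":
        return baseline_rate + mde
    if mde_type == "relative":
        return baseline_rate * (1 + mde)
    raise ValueError(f"unknown mde_type: {mde_type!r}")


# Both calls describe the same experiment under this convention.
print(target_rate(0.10, 0.015, "absolute"))
print(target_rate(0.10, 0.15, "relative"))
```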

Safe usage advice

This skill appears to be what it claims: documentation plus a local Python sample-size calculator. Before using: (1) review the sample_size_calculator.py to ensure its assumptions (two-proportion A/B, equal group sizes, interpretation of relative vs absolute MDE) match your experiment; (2) validate results against another calculator or statistical package when stakes are high; and (3) remember this tool does not handle sequential monitoring, multiple comparisons, or continuous-metric power analyses — apply appropriate statistical corrections in your workflow.

Functional analysis

Type: OpenClaw Skill
Name: experiment-designer
Version: 2.1.1

The experiment-designer skill bundle is a legitimate toolset for planning and analyzing A/B tests. It contains well-documented instructions in SKILL.md and a Python script (scripts/sample_size_calculator.py) that performs statistical calculations using only the Python standard library. No evidence of data exfiltration, malicious execution, or prompt injection was found.

Capability assessment

✓ Purpose & Capability

Name/description (experiment design, hypothesis writing, sample-size estimation) match the included materials: two reference docs and a local sample-size calculator script. No unrelated credentials, binaries, or config paths are requested.

✓ Instruction Scope

SKILL.md stays on-topic (hypothesis format, metrics, sample-size estimation, ICE prioritization, stopping rules). The instructions only reference local files included in the package and show how to run the local Python script; they do not direct the agent to read unrelated files or transmit data externally.

✓ Install Mechanism

No install spec is present (instruction-only skill with one local script). Nothing is downloaded or extracted from external URLs and no packages are installed automatically.

✓ Credentials

The skill requires no environment variables, no credentials, and no config paths. All functionality is local and proportional to the stated purpose.

✓ Persistence & Privilege

always is false and the skill is user-invocable. It does not request persistent system-wide changes or elevated privileges.

How to use

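Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install experiment-designer
Once installed, call the Skill by name or trigger it with /experiment-designer
Provide the inputs described in the Skill's parameter documentation to get structured output.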

Version history

v2.1.1

v2.1.1: optimization, reference splits

v1.0.0

Initial publish

Metadata

Slug experiment-designer

Version 2.1.1

License MIT-0

Total installs 4

Current installs 4

Published versions 2

FAQ