
AB Test Framework

Name: AB Test Framework
Author: nidalghetf

By nidalghETF · GitHub · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 604

Install in OpenClaw

/install ab-test-framework

Description

Compare models with A/B testing for selection

Usage Instructions (SKILL.md)

A/B Testing Framework

Description

Compare models with A/B testing for selection

Source Reference

This skill is derived from section 20, "Testing & Quality Assurance", of the OpenClaw Agent Mastery Index v4.1.

Sub-heading: A/B Testing Frameworks for Model Selection

Complexity: high

Input Parameters

Name	Type	Required	Description
`model_a`	string	Yes	First model
`model_b`	string	Yes	Second model
`test_prompts`	array	Yes	Test prompts

Output Format

{
  "status": <string>,
  "details": <object>,
  "winner": <string>,
  "confidence": <number>
}
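
A caller may want to confirm that a response really has this shape before reading `winner` or `confidence`. The guard below is a minimal sketch: `isAbTestResult` is a hypothetical helper, not part of OpenClaw, and it assumes only that `confidence` is a finite number, since the format above does not state a range.

```javascript
// Hypothetical helper (not an OpenClaw API): checks a value against the
// four fields listed under Output Format.
function isAbTestResult(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof value.status === "string" &&
    value.details !== null &&
    typeof value.details === "object" &&
    typeof value.winner === "string" &&
    typeof value.confidence === "number" &&
    Number.isFinite(value.confidence) // rejects NaN and Infinity
  );
}
```

A response that fails this check is better treated as an error than read for a winner.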

Usage Examples

Example 1: Basic Usage

const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  test_prompts: 123
});

Example 2: With Optional Parameters

const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  test_prompts: []
});
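
The parameter table declares `test_prompts` as an array, whereas Example 1 passes the number 123. A call that matches the table could be prepared as sketched below. `buildAbTestParams`, the model names, and the prompts are hypothetical placeholders, and the sketch assumes the prompts are plain strings, which the table does not confirm.

```javascript
// Hypothetical helper: builds a params object that matches the
// Input Parameters table and fails fast on the wrong types.
function buildAbTestParams(modelA, modelB, prompts) {
  if (typeof modelA !== "string" || typeof modelB !== "string") {
    throw new TypeError("model_a and model_b must be strings");
  }
  if (!Array.isArray(prompts)) {
    throw new TypeError("test_prompts must be an array");
  }
  // Copy the array so later changes by the caller do not affect the request.
  return { model_a: modelA, model_b: modelB, test_prompts: [...prompts] };
}

// Placeholder model names and prompts, for illustration only.
const abTestParams = buildAbTestParams("model-one", "model-two", [
  "Summarize this paragraph.",
  "Translate 'hello' to French.",
]);

// Inside an OpenClaw skill context, the call from the examples would then be:
// const result = await openclaw.skill.run('ab-test-framework', abTestParams);
```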

Security Considerations

Apply the A/B test security guidance in Category 8, and prevent test manipulation.

Additional Security Measures

Input Validation: All inputs are validated before processing
Least Privilege: Operations run with minimal required permissions
Audit Logging: All actions are logged for security review
Error Handling: Errors are sanitized before returning to caller
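
The listing does not show how errors are sanitized. One way to do it is to map internal errors onto a small set of fixed messages, so that stack traces and file paths never reach the caller. In the sketch below, `sanitizeError` and the `ERR_INVALID_INPUT` code are hypothetical; `EACCES` and `MODULE_NOT_FOUND` are standard Node.js error codes, and the `status`/`details` shape follows the Output Format section.

```javascript
// Fixed, caller-safe messages keyed by error code. A Map is used so that
// inherited object keys such as "constructor" can never match.
const SAFE_MESSAGES = new Map([
  ["EACCES", "Permission denied"],
  ["ERR_INVALID_INPUT", "Invalid input"], // hypothetical code for this sketch
  ["MODULE_NOT_FOUND", "Dependency missing"],
]);

// Hypothetical helper: returns only a known code and a fixed message,
// never the original message or stack.
function sanitizeError(err) {
  const rawCode = err && typeof err.code === "string" ? err.code : "";
  const known = SAFE_MESSAGES.has(rawCode);
  return {
    status: "error",
    details: {
      code: known ? rawCode : "UNKNOWN",
      message: known ? SAFE_MESSAGES.get(rawCode) : "Internal error",
    },
  };
}
```

Returning a fixed message per code keeps the response useful for the Troubleshooting table while dropping whatever the underlying error happened to include.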

Troubleshooting

Common Issues

Issue	Cause	Solution
Permission denied	Insufficient privileges	Check file/directory permissions
Invalid input	Malformed parameters	Validate input format
Dependency missing	Required module not installed	Run `npm install`

Debug Mode

Enable debug logging:

openclaw.logger.setLevel('debug');
const result = await openclaw.skill.run('ab-test-framework', { ... });

Related Skills

model-routing-manager
performance-benchmarker


Security Recommendations

This skill appears unfinished: index.js contains a placeholder implementation (no A/B comparison logic), SKILL.md and package.json list dependencies that are not consistently used, and the example and test code contain type and parameter mismatches (e.g., missing or incorrectly typed test_prompts). Before installing or using it in production:
1) Review and complete the core A/B testing implementation and confirm how 'stats-library' will be used.
2) Fix the tests and examples so they match the required inputs (test_prompts is required).
3) Audit logging: the code logs full params; remove or redact sensitive model identifiers or prompt content if needed.
4) Verify the provenance of any npm dependency (stats-library) and ensure the OpenClaw runtime helpers (openclaw/*) are trusted.
If you cannot review or modify the code, avoid using this skill for sensitive workloads.
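
Item 3 asks for params to be redacted before they are logged. A minimal sketch of one way to do that follows. `maskId` and `redactParamsForLog` are hypothetical helpers, and keeping the first four characters of a model identifier is an arbitrary choice for illustration.

```javascript
// Hypothetical helper: keeps at most a short prefix of an identifier.
function maskId(id) {
  if (typeof id !== "string" || id.length === 0) return "[missing]";
  return id.length <= 4 ? "****" : `${id.slice(0, 4)}****`;
}

// Hypothetical helper: returns a log-safe summary of the skill params.
// Prompt text is dropped; only the number of prompts is kept.
function redactParamsForLog(params) {
  const prompts = Array.isArray(params.test_prompts) ? params.test_prompts : [];
  return {
    model_a: maskId(params.model_a),
    model_b: maskId(params.model_b),
    test_prompt_count: prompts.length,
  };
}
```

The object returned by `redactParamsForLog(params)` would then be passed to the logger in place of `params`.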

Functional Analysis

Type: OpenClaw Skill
Name: ab-test-framework
Version: 1.0.0

The skill is classified as suspicious because `index.js` imports high-risk modules (`openclaw/exec`, `fs.promises`) that are currently unused. Combined with the explicit `// TODO: Implement specific logic for A/B Testing Framework` placeholder, this suggests an incomplete implementation that could easily be extended to use these powerful capabilities for malicious purposes in the future. Input sanitization is attempted via `openclaw/validator`, but reflecting sanitized user input in the output also presents a minor risk if the sanitization is imperfect.

Capability Assessment

ℹ Purpose & Capability

The name and description match an A/B testing framework. However, the code is a placeholder: it performs only input validation and returns sanitized params rather than running any A/B comparison logic. SKILL.md declares dependencies (openclaw/llm, stats-library), but package.json lists only stats-library and the code uses OpenClaw runtime helpers; openclaw/llm is not used. The declared complexity 'high' and 'priority: 5 (Critical)' are disproportionate to the actual footprint.
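
For orientation, the missing comparison step could look something like the sketch below. It is not the author's method and does not use stats-library. It assumes each prompt has already been scored for both models by some external grader (higher is better), counts per-prompt wins, and applies a two-sided exact sign test. Reporting `confidence` as 1 minus the p-value is an assumption of this sketch, not a definition taken from the skill, and the `"completed"` status value is likewise hypothetical.

```javascript
// Two-sided exact sign test: p-value for seeing a win split at least this
// lopsided if both models were equally likely to win each prompt.
function signTestPValue(winsA, winsB) {
  const n = winsA + winsB;
  if (n === 0) return 1; // no decisive prompts, no evidence either way
  const k = Math.max(winsA, winsB);
  let logC = 0; // log of the binomial coefficient C(n, i), starting at i = 0
  let tail = 0; // P(X >= k) for X ~ Binomial(n, 0.5)
  for (let i = 0; i <= n; i++) {
    if (i > 0) logC += Math.log((n - i + 1) / i);
    if (i >= k) tail += Math.exp(logC - n * Math.LN2);
  }
  return Math.min(1, 2 * tail);
}

// scoresA[i] and scoresB[i] are the two models' scores on prompt i.
function compareScores(scoresA, scoresB) {
  if (scoresA.length !== scoresB.length) {
    throw new RangeError("score arrays must have the same length");
  }
  let winsA = 0;
  let winsB = 0;
  for (let i = 0; i < scoresA.length; i++) {
    if (scoresA[i] > scoresB[i]) winsA++;
    else if (scoresA[i] < scoresB[i]) winsB++; // equal scores count as ties
  }
  const pValue = signTestPValue(winsA, winsB);
  const winner = winsA === winsB ? "tie" : winsA > winsB ? "model_a" : "model_b";
  return {
    status: "completed",
    details: {
      wins_a: winsA,
      wins_b: winsB,
      ties: scoresA.length - winsA - winsB,
      p_value: pValue,
    },
    winner,
    confidence: 1 - pValue,
  };
}
```

The sign test uses only which model won each prompt, so it needs no assumption about how the scores are distributed, at the cost of ignoring the size of each win.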

ℹ Instruction Scope

SKILL.md inputs and security guidance are reasonable, but the examples contain inconsistencies (one example passes test_prompts as 123, which is the wrong type). The runtime instructions don't request external files, secrets, or unexpected network endpoints. The skill logs full params (logger.info with params), which could capture sensitive identifiers; SKILL.md does not explicitly warn about logging sensitive inputs.

✓ Install Mechanism

No install spec is present (instruction-only install), and package.json only lists 'stats-library' as an npm dependency. No external download URLs or archive extraction are used. This is a low-risk install surface, though the declared dependency should be verified (stats-library from npm).

✓ Credentials

The skill requests no environment variables, credentials, or config paths. It uses OpenClaw runtime helpers (logger, validator, notify), which are expected for a platform skill. The code conditionally calls sendAlert if params.alert_on_failure is set; no secrets are required for that.

✓ Persistence & Privilege

`always` is false, and the skill does not attempt to modify other skills or system settings. It does not request persistent system-level privileges.

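How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install ab-test-framework
Once installation completes, invoke the skill by name or trigger it with /ab-test-framework
Provide the required inputs as described in the skill's parameter list to get structured output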

Version History

v1.0.0

A/B Testing Framework v1.0.0: initial release for comparing models using A/B testing.
- Allows comparison between two models with user-defined test prompts.
- Outputs the status, detailed results, identified winner, and confidence score.
- Includes strict input validation and audit logging for security.
- Runs with least privilege and mitigates test manipulation as per security requirements.
- Depends on openclaw/llm and stats-library.

Metadata

Slug: ab-test-framework

Version: 1.0.0

License: not specified

Total installs: 1

Current installs: 1

Versions: 1

FAQ