AB Test Framework

Name: AB Test Framework
Author: nidalghetf

by nidalghETF · v1.0.0

cross-platform ⚠ suspicious

Downloads: 604

Install in OpenClaw

/install ab-test-framework

Description

Compare models with A/B testing for selection

README (SKILL.md)

A/B Testing Framework

Description

Compare models with A/B testing for selection

Source Reference

This skill is derived from "20. Testing & Quality Assurance" in the OpenClaw Agent Mastery Index v4.1.

Sub-heading: A/B Testing Frameworks for Model Selection

Complexity: high

Input Parameters

Name	Type	Required	Description
`model_a`	string	Yes	First model
`model_b`	string	Yes	Second model
`test_prompts`	array	Yes	Test prompts
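
The table above can be turned into a small pre-flight check. The following is a minimal sketch, not the skill's shipped validation: `validateAbTestParams` is a hypothetical helper name, and the only rules it enforces are the ones the table states (two required strings and a required array).

```javascript
// Hypothetical helper (not part of the skill): checks params against
// the Input Parameters table before the skill is invoked.
function validateAbTestParams(params) {
  const errors = [];
  const input = params ?? {};

  // model_a and model_b are required strings.
  for (const key of ["model_a", "model_b"]) {
    if (typeof input[key] !== "string" || input[key].trim() === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }

  // test_prompts is a required array; the table does not specify an
  // element type, so elements are deliberately not inspected here.
  if (!Array.isArray(input.test_prompts)) {
    errors.push("test_prompts must be an array");
  }

  return { valid: errors.length === 0, errors };
}

// Usage: a number in place of the array is reported as an error.
console.log(
  validateAbTestParams({ model_a: "value", model_b: "value", test_prompts: 123 })
);
```

Returning a list of errors rather than throwing on the first one lets a caller report every malformed parameter in a single pass.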

Output Format

{
  "status": <string>,
  "details": <object>,
  "winner": <string>,
  "confidence": <number>
}
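
The listing does not say how `winner` and `confidence` are computed. One possible way to fill this output shape is sketched below. It is an illustration under stated assumptions, not the skill's method: it assumes each prompt has already been scored numerically for both models, it uses a two-sided sign test, and it reports `1 - pValue` as the confidence score, which is a simplification. The function names, the `"completed"` status string, and the `"tie"` winner value are all hypothetical.

```javascript
// Hypothetical sketch: derive `winner` and `confidence` from per-prompt scores.
// The scoring inputs and the sign test are assumptions, not the skill's logic.

// Binomial coefficient C(n, k), built up iteratively to avoid factorials.
function binomial(n, k) {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Two-sided sign-test p-value for `wins` out of `n` non-tied comparisons.
function signTestPValue(wins, n) {
  if (n === 0) return 1;
  const extreme = Math.max(wins, n - wins);
  let tail = 0;
  for (let k = extreme; k <= n; k++) {
    tail += binomial(n, k) * Math.pow(0.5, n);
  }
  return Math.min(1, 2 * tail);
}

function compareScores(scoresA, scoresB) {
  if (scoresA.length !== scoresB.length) {
    throw new Error("scoresA and scoresB must have the same length");
  }
  let winsA = 0;
  let winsB = 0;
  scoresA.forEach((a, i) => {
    if (a > scoresB[i]) winsA++;
    else if (a < scoresB[i]) winsB++;
  });
  const decided = winsA + winsB; // tied prompts are dropped from the test
  const pValue = signTestPValue(winsA, decided);
  let winner = "tie";
  if (winsA > winsB) winner = "model_a";
  else if (winsB > winsA) winner = "model_b";
  return {
    status: "completed",
    details: { winsA, winsB, ties: scoresA.length - decided, pValue },
    winner,
    confidence: 1 - pValue,
  };
}

console.log(compareScores([0.9, 0.8, 0.7, 0.9, 0.6], [0.5, 0.4, 0.6, 0.2, 0.1]));
```

The sign test only looks at which model scored higher on each prompt, so it needs no distributional assumptions about the scores; the trade-off is that it ignores the size of each difference.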

Usage Examples

Example 1: Basic Usage

const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  // test_prompts is typed as an array; a number such as 123 does not match
  test_prompts: ["value"]
});

Example 2: With Optional Parameters

const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  test_prompts: []
});

Security Considerations

Follow the A/B test security guidance in Category 8 and prevent test manipulation.

Additional Security Measures

Input Validation: All inputs are validated before processing
Least Privilege: Operations run with minimal required permissions
Audit Logging: All actions are logged for security review
Error Handling: Errors are sanitized before returning to caller
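
The error-handling measure above can be illustrated with a short sketch. This is an assumption about what "sanitized" could mean, not the skill's code: `ValidationError` and `toSafeError` are hypothetical names, and the rule shown is that only messages from known validation failures reach the caller while everything else is replaced with a generic message.

```javascript
// Hypothetical sketch of error sanitization; not the skill's implementation.
class ValidationError extends Error {
  constructor(message) {
    super(message);
    this.name = "ValidationError";
  }
}

// Maps any thrown value to a caller-safe result object.
// Stack traces and unexpected messages (which may contain paths or
// other internals) are never copied into the returned object.
function toSafeError(err) {
  const isExpected = err instanceof ValidationError;
  return {
    status: "error",
    details: { message: isExpected ? err.message : "Internal error" },
  };
}

// Usage: an unexpected failure is reduced to a generic message.
try {
  throw new Error("ENOENT: no such file, open '/srv/private/config.json'");
} catch (err) {
  console.log(toSafeError(err));
}
```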

Troubleshooting

Common Issues

Issue	Cause	Solution
Permission denied	Insufficient privileges	Check file/directory permissions
Invalid input	Malformed parameters	Validate input format
Dependency missing	Required module not installed	Run `npm install`

Debug Mode

Enable debug logging:

openclaw.logger.setLevel('debug');
const result = await openclaw.skill.run('ab-test-framework', { ... });

Related Skills

model-routing-manager
performance-benchmarker

Usage Guidance

This skill appears unfinished: index.js contains a placeholder implementation (no A/B comparison logic), SKILL.md and package.json list dependencies that are not consistently used, and example/test code contains type/parameter mismatches (e.g., missing or incorrectly-typed test_prompts). Before installing or using in production:

1) Review and complete the core A/B testing implementation and confirm how 'stats-library' will be used.
2) Fix tests and examples so they match required inputs (test_prompts is required).
3) Audit logging: the code logs full params; remove or redact sensitive model identifiers or prompt content if needed.
4) Verify the provenance of any npm dependency (stats-library) and ensure OpenClaw runtime helpers (openclaw/*) are trusted.

If you cannot review/modify the code, avoid using this skill for sensitive workloads.
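
The logging concern above could be addressed by summarizing params before they are logged. The following is a minimal sketch under assumptions: `redactParamsForLog` and its `redactModels` option are hypothetical, and `console.log` stands in for whatever logger the skill actually uses.

```javascript
// Hypothetical helper: builds a log-safe summary of the skill params.
// Prompt content is never included; model identifiers are optional.
function redactParamsForLog(params, options = {}) {
  const { redactModels = false } = options;
  const prompts = Array.isArray(params.test_prompts) ? params.test_prompts : [];
  return {
    model_a: redactModels ? "[redacted]" : params.model_a,
    model_b: redactModels ? "[redacted]" : params.model_b,
    // Only the number of prompts is logged, not their text.
    test_prompt_count: prompts.length,
  };
}

// Usage: log the summary instead of the raw params object.
const params = {
  model_a: "value",
  model_b: "value",
  test_prompts: ["internal prompt text"],
};
console.log(redactParamsForLog(params, { redactModels: true }));
```

Logging a count instead of the prompt text keeps the audit trail useful for debugging while keeping prompt content out of the logs.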

Capability Analysis

Type: OpenClaw Skill
Name: ab-test-framework
Version: 1.0.0

The skill is classified as suspicious due to the import of high-risk modules (`openclaw/exec`, `fs.promises`) in `index.js` that are currently unused. This, combined with the explicit `// TODO: Implement specific logic for A/B Testing Framework` placeholder, suggests an incomplete implementation that could easily be extended to leverage these powerful capabilities for malicious purposes in the future. While input sanitization is attempted via `openclaw/validator`, the reflection of sanitized user input in the output also presents a minor risk if the sanitization is imperfect.

Capability Assessment

ℹ Purpose & Capability

Name/description match an A/B testing framework. However the code is a placeholder: it performs only input validation and returns sanitized params rather than running any A/B comparison logic. SKILL.md declares dependencies (openclaw/llm, stats-library) but code only depends on a stats-library in package.json and uses OpenClaw runtime helpers; openclaw/llm is not used. Declared complexity 'high' and 'priority: 5 (Critical)' are disproportionate to the actual footprint.

ℹ Instruction Scope

SKILL.md inputs and security guidance are reasonable, but examples contain inconsistencies (example passes test_prompts as '123' — wrong type). The runtime instructions don't request external files, secrets, or unexpected network endpoints. The skill logs full params (logger.info with params), which could capture sensitive identifiers — the SKILL.md does not explicitly warn about logging sensitive inputs.

✓ Install Mechanism

No install spec is present (instruction-only install), and package.json only lists 'stats-library' as an npm dependency. No external download URLs or archive extraction are used. This is a low-risk install surface, though the declared dependency should be verified (stats-library from npm).

✓ Credentials

The skill requests no environment variables, credentials, or config paths. It uses OpenClaw runtime helpers (logger, validator, notify) which are expected for a platform skill. The code conditionally calls sendAlert if params.alert_on_failure is set — no secrets are required for that.

✓ Persistence & Privilege

always is false and the skill does not attempt to modify other skills or system settings. It does not request persistent system-level privileges.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install ab-test-framework
After installation, invoke the skill by name or use /ab-test-framework
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

A/B Testing Framework v1.0.0 – Initial release for comparing models using A/B testing.
- Allows comparison between two models with user-defined test prompts.
- Outputs the status, detailed results, identified winner, and confidence score.
- Includes strict input validation and audit logging for security.
- Runs with least privilege and mitigates test manipulation as per security requirements.
- Depends on openclaw/llm and stats-library.

Metadata

Slug ab-test-framework

Version 1.0.0

License —

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is AB Test Framework?

Compare models with A/B testing for selection. It is an AI Agent Skill for Claude Code / OpenClaw, with 604 downloads so far.

How do I install AB Test Framework?

Run "/install ab-test-framework" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is AB Test Framework free?

Yes, AB Test Framework is completely free (open-source). You can download, install and use it at no cost.

Which platforms does AB Test Framework support?

AB Test Framework is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created AB Test Framework?

It is built and maintained by nidalghETF (@nidalghetf); the current version is v1.0.0.
