AB Test Framework

by nidalghETF · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
Downloads: 604
Stars: 0
Active Installs: 1
Versions: 1
Install in OpenClaw
/install ab-test-framework
Description
Compare models with A/B testing for selection
README (SKILL.md)

A/B Testing Framework

Description

Compare models with A/B testing for selection

Source Reference

This skill is derived from "20. Testing & Quality Assurance" in the OpenClaw Agent Mastery Index v4.1.

Sub-heading: A/B Testing Frameworks for Model Selection

Complexity: high

Input Parameters

| Name         | Type   | Required | Description  |
|--------------|--------|----------|--------------|
| model_a      | string | Yes      | First model  |
| model_b      | string | Yes      | Second model |
| test_prompts | array  | Yes      | Test prompts |
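
The parameter table above can be checked before a run with a small guard. The following is a minimal sketch, not the skill's actual validation: `validateParams` is a hypothetical helper, and the rules that the prompt list must be non-empty and contain only strings are assumptions, since the table only says "array".

```javascript
// Hypothetical helper: not part of the published skill or of openclaw/validator.
// Returns { valid, errors } instead of throwing so callers can report every problem.
function validateParams(params) {
  const errors = [];

  // model_a and model_b are documented as required strings.
  for (const key of ["model_a", "model_b"]) {
    if (typeof params?.[key] !== "string" || params[key].trim() === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }

  // test_prompts is documented as a required array.
  // Assumption: it should be non-empty and hold only strings.
  if (!Array.isArray(params?.test_prompts)) {
    errors.push("test_prompts must be an array");
  } else if (params.test_prompts.length === 0) {
    errors.push("test_prompts must contain at least one prompt");
  } else if (!params.test_prompts.every((p) => typeof p === "string")) {
    errors.push("every entry in test_prompts must be a string");
  }

  return { valid: errors.length === 0, errors };
}
```

Called with `test_prompts: 123`, this guard reports a type error for that field rather than letting the value reach the comparison step.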

Output Format

{
  "status": <string>,
  "details": <object>,
  "winner": <string>,
  "confidence": <number>
}
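
The listing does not say how `winner` and `confidence` are derived, and the review further down notes that the published code has no comparison logic. As an illustration only, here is one way those fields could be filled from per-prompt scores using an exact two-sided sign test. Everything in it is an assumption: the function names, the idea that each prompt yields one numeric score per model, the `"success"` status value, the keys inside `details`, and the convention of reporting `1 - p` as confidence.

```javascript
// Hypothetical sketch: the published index.js contains no comparison logic.
// scoresA[i] and scoresB[i] are assumed to be quality scores for prompt i.

// Binomial coefficient C(n, k), computed iteratively (fine for small n).
function choose(n, k) {
  let c = 1;
  for (let i = 1; i <= k; i++) c = (c * (n - k + i)) / i;
  return c;
}

// Two-sided exact sign test p-value for `wins` successes in `n` decided trials.
function signTestPValue(wins, n) {
  if (n === 0) return 1;
  const k = Math.max(wins, n - wins);
  let tail = 0;
  for (let i = k; i <= n; i++) tail += choose(n, i);
  return Math.min(1, (2 * tail) / 2 ** n);
}

// Builds an object in the documented output shape.
function summarize(modelA, modelB, scoresA, scoresB) {
  if (scoresA.length !== scoresB.length) {
    throw new Error("score arrays must have equal length");
  }
  let winsA = 0;
  let winsB = 0;
  scoresA.forEach((a, i) => {
    if (a > scoresB[i]) winsA++;
    else if (a < scoresB[i]) winsB++;
  });
  const decided = winsA + winsB; // ties are excluded from the sign test
  const p = signTestPValue(winsA, decided);
  const winner = winsA === winsB ? "tie" : winsA > winsB ? modelA : modelB;
  return {
    status: "success",
    details: { wins_a: winsA, wins_b: winsB, ties: scoresA.length - decided, p_value: p },
    winner,
    confidence: 1 - p,
  };
}
```

A sign test only uses which model scored higher on each prompt, so it does not depend on the score scale; the cost is that it ignores the size of each difference. The arithmetic here is only suitable for small prompt sets.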

Usage Examples

Example 1: Basic Usage

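const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  // test_prompts is an array of prompt strings, not a number
  test_prompts: ["First test prompt", "Second test prompt"]
});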

Example 2: With Optional Parameters

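const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "value",
  model_b: "value",
  test_prompts: ["First test prompt", "Second test prompt"],
  // Optional and absent from the parameter table: the capability
  // assessment notes that index.js sends an alert when this is set.
  alert_on_failure: true
});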

Security Considerations

A/B test security follows Category 8; the main concern is preventing test manipulation.

Additional Security Measures

  1. Input Validation: All inputs are validated before processing
  2. Least Privilege: Operations run with minimal required permissions
  3. Audit Logging: All actions are logged for security review
  4. Error Handling: Errors are sanitized before returning to caller

Troubleshooting

Common Issues

Issue Cause Solution
Permission denied Insufficient privileges Check file/directory permissions
Invalid input Malformed parameters Validate input format
Dependency missing Required module not installed Run npm install

Debug Mode

Enable debug logging:

openclaw.logger.setLevel('debug');
const result = await openclaw.skill.run('ab-test-framework', { ... });

Related Skills

  • model-routing-manager
  • performance-benchmarker
Usage Guidance
This skill appears unfinished: index.js contains a placeholder implementation (no A/B comparison logic), SKILL.md and package.json list dependencies that are not consistently used, and example/test code contains type/parameter mismatches (e.g., missing or incorrectly typed test_prompts). Before installing or using in production:
  1. Review and complete the core A/B testing implementation and confirm how 'stats-library' will be used.
  2. Fix tests and examples so they match required inputs (test_prompts is required).
  3. Audit logging: the code logs full params; remove or redact sensitive model identifiers or prompt content if needed.
  4. Verify the provenance of any npm dependency (stats-library) and ensure OpenClaw runtime helpers (openclaw/*) are trusted.
If you cannot review or modify the code, avoid using this skill for sensitive workloads.
Capability Analysis
Type: OpenClaw Skill
Name: ab-test-framework
Version: 1.0.0
The skill is classified as suspicious due to the import of high-risk modules (`openclaw/exec`, `fs.promises`) in `index.js` that are currently unused. This, combined with the explicit `// TODO: Implement specific logic for A/B Testing Framework` placeholder, suggests an incomplete implementation that could easily be extended to leverage these powerful capabilities for malicious purposes in the future. While input sanitization is attempted via `openclaw/validator`, the reflection of sanitized user input in the output also presents a minor risk if the sanitization is imperfect.
Capability Assessment
Purpose & Capability
Name/description match an A/B testing framework. However the code is a placeholder: it performs only input validation and returns sanitized params rather than running any A/B comparison logic. SKILL.md declares dependencies (openclaw/llm, stats-library) but code only depends on a stats-library in package.json and uses OpenClaw runtime helpers; openclaw/llm is not used. Declared complexity 'high' and 'priority: 5 (Critical)' are disproportionate to the actual footprint.
Instruction Scope
SKILL.md inputs and security guidance are reasonable, but examples contain inconsistencies (example passes test_prompts as '123' — wrong type). The runtime instructions don't request external files, secrets, or unexpected network endpoints. The skill logs full params (logger.info with params), which could capture sensitive identifiers — the SKILL.md does not explicitly warn about logging sensitive inputs.
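
The logging concern above can be addressed by passing a reduced copy of the parameters to the logger instead of the raw object. This is a minimal sketch under stated assumptions: `redactForLog` is a hypothetical helper that does not exist in the skill, and the choice to keep model identifiers while hiding prompt text by default is an assumption about what counts as sensitive.

```javascript
// Hypothetical helper: not part of the published skill.
// Returns a copy of params that is safer to pass to a logger.
function redactForLog(params, { redactModels = false } = {}) {
  const prompts = Array.isArray(params.test_prompts) ? params.test_prompts : [];
  return {
    // Assumption: model identifiers are only hidden when the caller asks for it.
    model_a: redactModels ? "[redacted]" : params.model_a,
    model_b: redactModels ? "[redacted]" : params.model_b,
    // Log how many prompts were supplied, never their content.
    test_prompts: `[${prompts.length} prompts redacted]`,
  };
}

// Possible use, assuming a logger with an info(message, data) method:
// logger.info("ab-test-framework invoked", redactForLog(params));
```

Building a new object with an explicit list of fields, rather than deleting keys from a copy, means any parameter added later stays out of the logs until someone decides it is safe to include.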
Install Mechanism
No install spec is present (instruction-only install), and package.json only lists 'stats-library' as an npm dependency. No external download URLs or archive extraction are used. This is a low-risk install surface, though the declared dependency should be verified (stats-library from npm).
Credentials
The skill requests no environment variables, credentials, or config paths. It uses OpenClaw runtime helpers (logger, validator, notify) which are expected for a platform skill. The code conditionally calls sendAlert if params.alert_on_failure is set — no secrets are required for that.
Persistence & Privilege
always is false and the skill does not attempt to modify other skills or system settings. It does not request persistent system-level privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ab-test-framework
  3. After installation, invoke the skill by name or use /ab-test-framework
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
A/B Testing Framework v1.0.0: Initial release for comparing models using A/B testing.
  • Allows comparison between two models with user-defined test prompts.
  • Outputs the status, detailed results, identified winner, and confidence score.
  • Includes strict input validation and audit logging for security.
  • Runs with least privilege and mitigates test manipulation as per security requirements.
  • Depends on openclaw/llm and stats-library.
Metadata
Slug ab-test-framework
Version 1.0.0
License
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is AB Test Framework?

Compare models with A/B testing for selection. It is an AI Agent Skill for Claude Code / OpenClaw, with 604 downloads so far.

How do I install AB Test Framework?

Run "/install ab-test-framework" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is AB Test Framework free?

Yes, AB Test Framework is completely free (open-source). You can download, install and use it at no cost.

Which platforms does AB Test Framework support?

AB Test Framework is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created AB Test Framework?

It is built and maintained by nidalghETF (@nidalghetf); the current version is v1.0.0.
