
AB Test Framework

Author: nidalghETF · v1.0.0
Platform: cross-platform · Flag: ⚠ suspicious
Total downloads: 604 · Favorites: 0 · Current installs: 1 · Versions: 1
Install in OpenClaw
/install ab-test-framework
Feature description
Compare models with A/B testing for selection
Usage instructions (SKILL.md)

A/B Testing Framework

Description

Compare models with A/B testing for selection

Source Reference

This skill is derived from section 20, "Testing & Quality Assurance", of the OpenClaw Agent Mastery Index v4.1.

Sub-heading: A/B Testing Frameworks for Model Selection

Complexity: high

Input Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model_a | string | Yes | First model |
| model_b | string | Yes | Second model |
| test_prompts | array | Yes | Test prompts |
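
The parameter table above can be checked before a run with a small guard. This is a minimal sketch, not the skill's own validation: `validateParams` is a hypothetical helper, and rejecting an empty `test_prompts` array is an assumption that the table does not state.

```javascript
// Hypothetical helper: checks the three documented parameters.
// Returns a list of human-readable problems; an empty list means "looks valid".
function validateParams(params) {
  if (params === null || typeof params !== 'object') {
    return ['params must be an object'];
  }
  const errors = [];
  for (const key of ['model_a', 'model_b']) {
    if (typeof params[key] !== 'string' || params[key].trim() === '') {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  if (!Array.isArray(params.test_prompts)) {
    errors.push('test_prompts must be an array');
  } else if (params.test_prompts.length === 0) {
    // Assumption: an A/B run with no prompts is treated as invalid.
    errors.push('test_prompts must contain at least one prompt');
  }
  return errors;
}
```

Calling `validateParams(params)` before `openclaw.skill.run(...)` surfaces type mistakes, such as passing a number for `test_prompts`, without depending on how the skill itself reports them.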

Output Format

{
  "status": <string>,
  "details": <object>,
  "winner": <string>,
  "confidence": <number>
}
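
The listing does not say how `winner` and `confidence` are computed. One possible way to fill in those fields is sketched below, using a two-sided exact sign test over per-prompt preferences. Everything here is an assumption made for illustration: the `signTest` name, the `'a'` / `'b'` / `'tie'` outcome labels, the `status` strings, the `'none'` winner value, and the convention of reporting `1 - pValue` as `confidence`.

```javascript
// Hypothetical sketch: two-sided exact sign test on per-prompt outcomes.
// outcomes: array of 'a', 'b', or 'tie' (which model's answer was preferred).
function signTest(outcomes) {
  const winsA = outcomes.filter((o) => o === 'a').length;
  const winsB = outcomes.filter((o) => o === 'b').length;
  const n = winsA + winsB; // ties carry no information and are dropped
  if (n === 0) {
    return { status: 'inconclusive', details: { winsA, winsB, pValue: 1 }, winner: 'none', confidence: 0 };
  }
  // P(X <= min(winsA, winsB)) for X ~ Binomial(n, 0.5), built term by term.
  const m = Math.min(winsA, winsB);
  let term = Math.pow(0.5, n); // P(X = 0)
  let tail = 0;
  for (let k = 0; k <= m; k++) {
    tail += term;
    term *= (n - k) / (k + 1); // turns P(X = k) into P(X = k + 1)
  }
  const pValue = Math.min(1, 2 * tail);
  const winner = winsA === winsB ? 'none' : winsA > winsB ? 'model_a' : 'model_b';
  return { status: 'completed', details: { winsA, winsB, pValue }, winner, confidence: 1 - pValue };
}
```

With five prompts that all favor model A, the p-value is 2 × 0.5⁵ = 0.0625, so this sketch reports `winner: 'model_a'` and `confidence: 0.9375`. For very large numbers of prompts (above roughly a thousand), `Math.pow(0.5, n)` underflows and a log-space computation would be needed instead.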

Usage Examples

Example 1: Basic Usage

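// The model identifiers below are placeholders.
const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "first-model-id",
  model_b: "second-model-id",
  // test_prompts is required and must be an array
  // (the published example passed the number 123 here).
  test_prompts: ["First test prompt", "Second test prompt"]
});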

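Example 2: With Optional Parameters

// The model identifiers below are placeholders.
const result = await openclaw.skill.run('ab-test-framework', {
  model_a: "first-model-id",
  model_b: "second-model-id",
  test_prompts: ["First test prompt"],
  // Optional flag: per the capability assessment, the code calls sendAlert when this is set.
  alert_on_failure: true
});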

Security Considerations

Apply the A/B test security requirements from Category 8, and prevent test manipulation.

Additional Security Measures

  1. Input Validation: All inputs are validated before processing
  2. Least Privilege: Operations run with minimal required permissions
  3. Audit Logging: All actions are logged for security review
  4. Error Handling: Errors are sanitized before returning to caller
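
As an illustration of measure 4, the sketch below shows one way an error could be sanitized before it is returned to the caller. It is not the skill's actual error handler: `sanitizeError` is a hypothetical helper, the returned shape only loosely follows the Output Format above, and the path-matching pattern is deliberately crude.

```javascript
// Hypothetical helper: reduce an error to a short message with no stack trace
// and no file-system paths before handing it back to the caller.
function sanitizeError(err) {
  const raw = err instanceof Error ? err.message : String(err);
  const message = raw
    .split('\n')[0] // drop any embedded stack or multi-line detail
    .replace(/(?:[A-Za-z]:)?[\\/][^\s'"]+/g, '[path]') // mask path-like substrings
    .slice(0, 200); // cap the length
  return { status: 'error', details: { message } };
}
```

A caller would wrap the run in `try { ... } catch (err) { return sanitizeError(err); }` so that stack traces and local paths stay in the server-side log rather than in the response.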

Troubleshooting

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Permission denied | Insufficient privileges | Check file/directory permissions |
| Invalid input | Malformed parameters | Validate input format |
| Dependency missing | Required module not installed | Run npm install |

Debug Mode

Enable debug logging:

openclaw.logger.setLevel('debug');
const result = await openclaw.skill.run('ab-test-framework', { ... });

Related Skills

  • model-routing-manager
  • performance-benchmarker
Security usage recommendations
This skill appears unfinished: index.js contains a placeholder implementation (no A/B comparison logic), SKILL.md and package.json list dependencies that are not consistently used, and example/test code contains type/parameter mismatches (e.g., missing or incorrectly-typed test_prompts). Before installing or using in production:

  1. Review and complete the core A/B testing implementation and confirm how 'stats-library' will be used.
  2. Fix tests and examples so they match required inputs (test_prompts is required).
  3. Audit logging: the code logs full params, so remove or redact sensitive model identifiers or prompt content if needed.
  4. Verify the provenance of any npm dependency (stats-library) and ensure OpenClaw runtime helpers (openclaw/*) are trusted.

If you cannot review/modify the code, avoid using this skill for sensitive workloads.
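
The audit-logging concern can be addressed by logging a redacted summary instead of the raw parameters. The following is a minimal sketch under stated assumptions: `redactParams` is a hypothetical helper, and the commented-out `logger.info` call assumes the logger accepts a message plus a data object, which this listing does not confirm.

```javascript
// Hypothetical helper: build a log-safe summary of the skill parameters.
// Model identifiers are masked and prompt text is replaced by counts and lengths.
function redactParams(params) {
  const prompts = Array.isArray(params.test_prompts) ? params.test_prompts : [];
  return {
    model_a: '[redacted]',
    model_b: '[redacted]',
    test_prompt_count: prompts.length,
    test_prompt_lengths: prompts.map((p) => String(p).length),
  };
}

// Assumed usage inside the skill (logger signature not verified):
// logger.info('ab-test-framework invoked', redactParams(params));
```

Whether model identifiers need masking depends on the deployment. If they are not sensitive, they can be passed through unchanged while the prompt content stays out of the log.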
Functional analysis
Type: OpenClaw Skill · Name: ab-test-framework · Version: 1.0.0
The skill is classified as suspicious due to the import of high-risk modules (`openclaw/exec`, `fs.promises`) in `index.js` that are currently unused. This, combined with the explicit `// TODO: Implement specific logic for A/B Testing Framework` placeholder, suggests an incomplete implementation that could easily be extended to leverage these powerful capabilities for malicious purposes in the future. While input sanitization is attempted via `openclaw/validator`, the reflection of sanitized user input in the output also presents a minor risk if the sanitization is imperfect.
Capability assessment
Purpose & Capability
Name/description match an A/B testing framework. However the code is a placeholder: it performs only input validation and returns sanitized params rather than running any A/B comparison logic. SKILL.md declares dependencies (openclaw/llm, stats-library) but code only depends on a stats-library in package.json and uses OpenClaw runtime helpers; openclaw/llm is not used. Declared complexity 'high' and 'priority: 5 (Critical)' are disproportionate to the actual footprint.
Instruction Scope
SKILL.md inputs and security guidance are reasonable, but examples contain inconsistencies (example passes test_prompts as '123' — wrong type). The runtime instructions don't request external files, secrets, or unexpected network endpoints. The skill logs full params (logger.info with params), which could capture sensitive identifiers — the SKILL.md does not explicitly warn about logging sensitive inputs.
Install Mechanism
No install spec is present (instruction-only install), and package.json only lists 'stats-library' as an npm dependency. No external download URLs or archive extraction are used. This is a low-risk install surface, though the declared dependency should be verified (stats-library from npm).
Credentials
The skill requests no environment variables, credentials, or config paths. It uses OpenClaw runtime helpers (logger, validator, notify) which are expected for a platform skill. The code conditionally calls sendAlert if params.alert_on_failure is set — no secrets are required for that.
Persistence & Privilege
always is false and the skill does not attempt to modify other skills or system settings. It does not request persistent system-level privileges.
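How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the dialog: /install ab-test-framework
  3. Once installation finishes, invoke the Skill by name or trigger it with /ab-test-framework.
  4. Provide the required inputs described in the Skill's parameter documentation to receive structured output.
Version history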
v1.0.0
A/B Testing Framework v1.0.0: initial release for comparing models using A/B testing.
  • Allows comparison between two models with user-defined test prompts.
  • Outputs the status, detailed results, identified winner, and confidence score.
  • Includes strict input validation and audit logging for security.
  • Runs with least privilege and mitigates test manipulation as per security requirements.
  • Depends on openclaw/llm and stats-library.
Metadata
Slug: ab-test-framework
Version: 1.0.0
License: (not listed)
Total installs: 1
Current installs: 1
Published versions: 1
FAQ

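What is AB Test Framework?

Compare models with A/B testing for selection. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 604 times in total.

How do I install AB Test Framework?

Run the command "/install ab-test-framework" in the OpenClaw or Claude Code dialog to install it in one step. No additional configuration is required.

Is AB Test Framework free?

Yes. AB Test Framework is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does AB Test Framework support?

AB Test Framework is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops AB Test Framework?

It is developed and maintained by nidalghETF (@nidalghetf). The current version is v1.0.0.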
