← 返回 Skills 市场

Nm Abstract Subagent Testing

Name: Nm Abstract Subagent Testing
Author: athola

作者 athola · GitHub ↗ · v1.8.3 · MIT-0

cross-platform ✓ 安全检测通过

154

总下载

当前安装

版本数

在 OpenClaw 中安装

/install nm-abstract-subagent-testing

功能描述

Test skills via RED/GREEN/REFACTOR TDD with fresh subagents

使用说明 (SKILL.md)

Night Market Skill — ported from claude-night-market/abstract. For the full experience with agents, hooks, and commands, install the Claude Code plugin.

Subagent Testing - TDD for Skills

Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.

Overview
Why Fresh Instances Matter
Testing Methodology
Quick Start
Detailed Testing Guide
Success Criteria

Overview

Fresh instances prevent priming: Each test uses a new Claude conversation to verify the skill's impact is measured, not conversation history effects.

Why Fresh Instances Matter

The Priming Problem

Running tests in the same conversation creates bias:

Prior context influences responses
Skill effects get mixed with conversation history
Can't isolate skill's true impact

Fresh Instance Benefits

Isolation: Each test starts clean
Reproducibility: Consistent baseline state
Measurement: Clear before/after comparison
Validation: Proves skill effectiveness, not priming

Testing Methodology

Three-phase TDD-style approach:

Phase 1: Baseline Testing (RED)

Test without skill to establish baseline behavior.

Phase 2: With-Skill Testing (GREEN)

Test with skill loaded to measure improvements.

Phase 3: Rationalization Testing (REFACTOR)

Test skill's anti-rationalization guardrails.

Quick Start

# 1. Create baseline tests (without skill)
# Use 5 diverse scenarios
# Document full responses

# 2. Create with-skill tests (fresh instances)
# Load skill explicitly
# Use identical prompts
# Compare to baseline

# 3. Create rationalization tests
# Test anti-rationalization patterns
# Verify guardrails work

Detailed Testing Guide

For complete testing patterns, examples, and templates:

Testing Patterns - Full TDD methodology
Test Examples - Baseline, with-skill, rationalization tests
Analysis Templates - Scoring and comparison frameworks

Success Criteria

Baseline: Document 5+ diverse baseline scenarios
Improvement: ≥50% improvement in skill-related metrics
Consistency: Results reproducible across fresh instances
Rationalization Defense: Guardrails prevent ≥80% of rationalization attempts

This skill is a methodology guide and appears coherent and low-risk. Before running tests, avoid including real secrets or production data in prompts or captured model outputs (use synthetic/test data). If you install any additional plugins the guide references (e.g., Claude Code), review those plugins separately for permissions and network access. If you prefer the agent not to invoke skills autonomously, adjust your agent settings — the skill itself does not demand elevated privileges.

功能分析

Type: OpenClaw Skill Name: nm-abstract-subagent-testing Version: 1.8.3 The skill bundle provides a structured methodology and documentation for testing OpenClaw skills using a TDD-inspired approach (RED/GREEN/REFACTOR). It focuses on using fresh subagent instances to eliminate priming bias and includes templates for adversarial testing to identify and fix AI rationalization behaviors. The files (SKILL.md and modules/testing-patterns.md) contain only educational content, procedural instructions, and non-executable Python code skeletons intended for developer use, with no evidence of malicious intent, data exfiltration, or harmful execution.

能力评估

✓ Purpose & Capability

Name/description match the content: the files describe a methodology for testing skills with fresh subagents. No unrelated binaries, env vars, or credentials are requested.

✓ Instruction Scope

SKILL.md and modules/testing-patterns.md instruct the agent to create fresh conversations, run baseline/with-skill/rationalization tests, and capture responses. All referenced actions are within the stated testing purpose; there are no instructions to read arbitrary host files, access credentials, or transmit data to unexpected endpoints.

✓ Install Mechanism

No install spec and no code files — instruction-only — so nothing is written to disk or pulled from external URLs. This is the lowest-risk install profile.

✓ Credentials

The skill declares no required env vars, credentials, or config paths. The guidance to load other skills during tests (e.g., 'secure-api-design') is expected for a testing workflow and does not itself request unrelated secrets.

✓ Persistence & Privilege

Skill does not request persistent presence (always:false) or modify other skills. It allows normal autonomous invocation (platform default), which is expected for skills but not elevated here.

如何使用

确保已安装 OpenClaw（本地或 Docker 部署）
在对话框中输入安装命令：/install nm-abstract-subagent-testing
安装完成后，直接呼叫该 Skill 的名称或使用 /nm-abstract-subagent-testing 触发
根据 Skill 的参数说明提供必要输入，即可获得结构化输出

版本历史

v1.8.3

Release v1.8.3

v1.8.2

Release v1.8.2

vv1.8.2

Release v1.8.2

元数据

Slug nm-abstract-subagent-testing

版本 1.8.3

许可证 MIT-0

累计安装 1

当前安装数 1

历史版本数 3

常见问题