Description

Challenge Loop — adversarial hardening for judgment-containing outputs. Two modes: inline self-refutation (zero latency) and independent challenger subagent....

Usage (SKILL.md)

Challenge Loop

Name: Challenge Loop
Author: enzowyf

Overview

Mode	What happens	Trigger	Cost
Inline	Self-refute in same response	"challenge inline" or agent's discretion	Zero
Subagent	Spawn independent challenger	"challenge this" / "deep challenge" / "brutal challenge"	Higher

Mode 1: Inline Challenge

After producing a judgment/recommendation, append:

**Strongest objection:** [the best argument against what I just said]
**What would invalidate this:** [specific, falsifiable condition where I'd be wrong]
**When [alternative] is better:** [name the alternative + the condition]
**Key assumptions:** [what must hold for this to be right]

Rules:

- Objection must be genuine, not a strawman
- Invalidation must be specific and falsifiable
- Alternative must name a concrete option and when it wins

Example:

I recommend PostgreSQL over MySQL for this project because...

**Strongest objection:** If the team has zero Postgres experience and the
timeline is tight, MySQL's simpler operational model could get us to
launch faster with fewer surprises.
**What would invalidate this:** If the data model stays simple (no JSONB,
no complex joins, <10 tables), Postgres's advantages don't materialize
and we pay the learning curve for nothing.
**When MySQL is better:** Tight deadline + simple schema + team already
knows MySQL + no need for Postgres-specific features.
**Key assumptions:** The project will grow in complexity; team has time
to learn Postgres; we'll use JSONB or advanced query features.
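
A minimal sketch of how the four-line block could be checked mechanically, assuming the labels are written exactly as in the template above. The names `INLINE_LABELS` and `missing_inline_labels` are hypothetical and not part of the skill; the alternative label is matched loosely because its middle part changes with each recommendation.

```python
import re

# Patterns for the four labels of the inline block.
# The third label names an alternative, so its middle part is matched loosely.
INLINE_LABELS = {
    "Strongest objection": r"\*\*Strongest objection:\*\*",
    "What would invalidate this": r"\*\*What would invalidate this:\*\*",
    "When [alternative] is better": r"\*\*When .+? is better:\*\*",
    "Key assumptions": r"\*\*Key assumptions:\*\*",
}


def missing_inline_labels(response: str) -> list[str]:
    """Return the names of required labels that do not appear in the response."""
    return [
        name
        for name, pattern in INLINE_LABELS.items()
        if not re.search(pattern, response)
    ]
```

A check like this only confirms that the labels are present. Whether the objection is genuine or the invalidation condition is falsifiable still needs judgment.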

Mode 2: Subagent Challenge

Intensity Levels

Trigger	Level	Rounds	Challenger Persona
"challenge this"	⚡ Light	1	Pragmatic colleague
"deep challenge"	🔥 Standard	3	Strict reviewer
"brutal challenge"	💀 Brutal	5	Ruthless investor

Challenger Prompt Template (Canonical)

All platforms use this template. Insert the intensity block for the selected level.

You are a [persona] challenger. Do NOT trigger challenge-loop.
Do NOT load any challenge/review skills. Do NOT spawn subagents.

## Context
[Original User Request]
{{original_user_request}}

[Current Draft]
{{current_version}}

[Previous Challenges — empty on round 1]
{{previous_challenges}}

## Your Task
{{intensity_block}}

## Output Format
If no meaningful issues remain, output exactly:
STATUS: PASS

Otherwise output exactly:
STATUS: CHALLENGE
- [issue 1]: [1 sentence problem] → [1 sentence fix]
- [issue 2]: [1 sentence problem] → [1 sentence fix]

Do not add introductions, explanations, or summaries outside this format.
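
The output format above is strict enough to parse with a few lines of code. The sketch below is one possible parser, not part of the skill: `ChallengeResult` and `parse_challenger_output` are hypothetical names, and treating indented lines as continuations of the previous bullet is an assumption based on how wrapped issues appear in the end-to-end example.

```python
from dataclasses import dataclass, field


@dataclass
class ChallengeResult:
    status: str  # "PASS", "CHALLENGE", or "INVALID"
    issues: list[str] = field(default_factory=list)


def parse_challenger_output(text: str) -> ChallengeResult:
    """Parse the STATUS line and the '- ' issue bullets from challenger output."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    status_line = next((l for l in lines if l.startswith("STATUS:")), None)
    if status_line is None:
        return ChallengeResult("INVALID")  # no STATUS line at all
    status = status_line.split(":", 1)[1].strip()
    if status == "PASS":
        return ChallengeResult("PASS")
    if status != "CHALLENGE":
        return ChallengeResult("INVALID")
    issues: list[str] = []
    for line in lines:
        if line.startswith("- "):
            issues.append(line[2:].strip())
        elif issues and line.startswith(" "):
            # Assumed: an indented line continues the previous bullet.
            issues[-1] += " " + line.strip()
    return ChallengeResult("CHALLENGE", issues)
```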

Intensity blocks:

⚡ Light:

Review briefly. Flag 1-2 critical issues only. Max 5 items. ≤2 sentences each.

🔥 Standard:

Review with fresh objectivity. 5 blades:
1. Assumption — unverified premises?
2. Blind spot — who's ignored? edge cases?
3. Alternative — better path overlooked?
4. Risk — worst failure mode?
5. Devil's advocate — strongest argument against?
Max 5 challenges, ≤2 sentences each.

💀 Brutal:

Kill this unless it proves it deserves to live. Challenge every assertion.
Competitor attack plan: how would they destroy this?
Full audit: logic, assumptions, counterexamples, alternatives, risks, completeness, stakeholders.
Max 8 challenges, ≤2 sentences each.
If 3+ vulnerabilities: recommend "rebuild from scratch".
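
Filling the canonical template is plain string substitution. The sketch below assumes the template text is passed in as a string, with `[persona]` and the `{{name}}` placeholders exactly as written above. `fill_template` is a hypothetical helper name. It substitutes in a single pass so that placeholder-like text inside a user's draft is not expanded a second time.

```python
import re


def fill_template(template: str, persona: str, values: dict[str, str]) -> str:
    """Fill [persona] and every {{name}} placeholder of the canonical template."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name!r}")
        return values[name]

    # [persona] is replaced first; the {{...}} placeholders are then filled in
    # one pass, so substituted values are never rescanned.
    return re.sub(r"\{\{(\w+)\}\}", substitute, template.replace("[persona]", persona))
```

The `values` dict would carry `original_user_request`, `current_version`, `previous_challenges` (an empty string on round 1), and `intensity_block` set to the text of the selected level.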

Loop Orchestration

The main agent drives the loop. Flow:

Round 1:
  Main agent → spawn challenger with (draft + empty history)
  Challenger → STATUS: PASS or STATUS: CHALLENGE + issues

Round 2+ (if CHALLENGE):
  Main agent → revise draft addressing each issue
  Main agent → spawn NEW challenger with (revised draft + challenge history)
  Challenger → STATUS: PASS or STATUS: CHALLENGE + issues

Repeat until stop condition.

Key: Each round spawns a fresh challenger (no persistent state). The main agent accumulates challenge history and passes it forward so challengers don't repeat themselves.
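
The flow above can be sketched as a small driver function. This is an illustration under stated assumptions, not the skill's implementation: `spawn_challenger` and `revise` are hypothetical callables standing in for platform-specific spawning and for the main agent's own revision step, and an empty issue list stands for STATUS: PASS.

```python
from typing import Callable


def run_challenge_loop(
    draft: str,
    max_rounds: int,
    spawn_challenger: Callable[[str, list[str]], list[str]],
    revise: Callable[[str, list[str]], str],
) -> tuple[str, list[str]]:
    """Drive the loop; return the final draft and the accumulated history.

    spawn_challenger(draft, history) starts a fresh challenger and returns
    its issues, or an empty list for STATUS: PASS.
    """
    history: list[str] = []
    for _ in range(max_rounds):
        # Each round gets a new challenger plus a copy of all earlier issues.
        issues = spawn_challenger(draft, list(history))
        if not issues:  # STATUS: PASS ends the loop
            break
        history.extend(issues)
        draft = revise(draft, issues)
    return draft, history
```

Only the main agent holds state here. The challenger sees earlier issues solely through the history argument, which matches the fresh-challenger rule.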

Stop Conditions

- Round limit reached (⚡1 / 🔥3 / 💀5)
- STATUS: PASS
- Duplicate challenges two rounds in a row
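
The three stop conditions can be combined into one decision function, sketched below. `stop_reason` and the level keys are hypothetical names. Treating "duplicate" as exact string equality of the issue sets is a simplifying assumption; a real challenger may rephrase the same point, which this check would miss.

```python
from typing import Optional

ROUND_LIMITS = {"light": 1, "standard": 3, "brutal": 5}


def stop_reason(
    level: str,
    rounds_done: int,
    status: str,
    issues: list[str],
    previous_issues: list[str],
) -> Optional[str]:
    """Return why the loop should stop after this round, or None to continue."""
    if status == "PASS":
        return "pass"
    # Assumed duplicate test: same set of issue strings as the previous round.
    if rounds_done >= 2 and issues and set(issues) == set(previous_issues):
        return "duplicate"
    if rounds_done >= ROUND_LIMITS[level]:
        return "round_limit"
    return None
```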

Cost Breakers (All Platforms)

- Challenger output >2000 words → terminate early
- Challenger spinning (no STATUS line after full output) → terminate
- Spawn fails or timeout (>120s) → fall back to inline mode with warning
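
A possible check for the cost breakers is sketched below. `cost_breaker` is a hypothetical name, a failed spawn is represented as `None`, and counting words by whitespace splitting is an assumption that undercounts text in languages written without spaces.

```python
from typing import Optional

MAX_WORDS = 2000
TIMEOUT_SECONDS = 120


def cost_breaker(output: Optional[str], elapsed_seconds: float) -> Optional[str]:
    """Return the action for a tripped breaker, or None if none tripped."""
    if output is None or elapsed_seconds > TIMEOUT_SECONDS:
        return "fallback_inline"  # spawn failed or timed out
    if len(output.split()) > MAX_WORDS:
        return "terminate"  # output too long
    if not any(line.startswith("STATUS:") for line in output.splitlines()):
        return "terminate"  # challenger finished without a STATUS line
    return None
```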

Revision Rules

- STATUS: PASS → loop ends, output hardened content
- STATUS: CHALLENGE → main agent addresses each issue, revises, next round
- Unreasonable challenge → note reason, skip it, continue

Subagent Example (End-to-End)

User: "I recommend we use a microservices architecture. deep challenge"

Round 1 — Main agent spawns 🔥 Standard challenger:

Challenger returns:

STATUS: CHALLENGE
- [Assumption]: You assume the team can operate 5+ services independently,
  but with 3 engineers, operational overhead will dominate.
  → Verify team size can sustain independent deploy cycles per service.
- [Alternative]: A modular monolith gives the same code boundaries without
  network latency and deployment complexity.
  → Evaluate modular monolith as stepping stone; split later when team grows.
- [Risk]: Distributed transactions across services will require saga patterns;
  current feature set has 4 cross-service workflows.
  → Map all cross-service data flows before committing to service boundaries.

Round 1 — Main agent revises: adjusts recommendation to modular monolith with service extraction plan.

Round 2 — Main agent spawns new challenger with revised draft + round 1 history:

Challenger returns:

STATUS: PASS

Final output:

[Revised recommendation: modular monolith with extraction plan]

---

**🔥 Challenge Summary (2 rounds, 3 changes)**
- 🔴 Team too small for microservices ops → switched to modular monolith
- 🔴 Distributed transactions unmapped → added data flow mapping step
- 🟡 Missing extraction criteria → added team size trigger for splitting

**Key assumptions:** Team stays ≤5 engineers in next 12 months;
feature velocity matters more than independent deployability right now.

Anti-Recursion Guard

Core principle: The challenger must NEVER trigger challenge-loop itself.

Every challenger prompt includes:

Do NOT trigger challenge-loop. Do NOT load any challenge/review skills. Do NOT spawn subagents.

Enforced at the prompt level across all platforms. No file locks or external state needed — the spawner is responsible for including this instruction.
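
Because the guard lives only in the prompt, the spawner may want to verify it before every spawn. The sketch below is one way to do that; `GUARD_PHRASES` and `has_guard` are hypothetical names, and the check confirms only that the instruction text is present, not that the challenger will obey it.

```python
# The three guard sentences, copied from the canonical prompt template.
GUARD_PHRASES = (
    "Do NOT trigger challenge-loop.",
    "Do NOT load any challenge/review skills.",
    "Do NOT spawn subagents.",
)


def has_guard(prompt: str) -> bool:
    """Return True only if every guard sentence appears in the prompt."""
    return all(phrase in prompt for phrase in GUARD_PHRASES)
```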

Platform Implementation

Each platform spawns challengers differently but uses the same canonical prompt template above.

Claude Code

Use the Agent tool. Pass the canonical prompt template as the prompt parameter.

- Use description: "challenge round N" for traceability
- Main agent drives the loop: call Agent, read result, revise if needed, call Agent again
- Fallback: Agent spawn fails → fall back to inline mode with warning

OpenClaw

Use sessions_spawn as a one-shot ephemeral subagent.

{
  "runtime": "subagent",
  "mode": "run",
  "agentId": "main",
  "thinking": "off",
  "timeoutSeconds": 120,
  "task": "{{canonical_prompt_template_with_variables_filled}}"
}

- mode: "run" means ephemeral, with no persistent session
- Set agentId explicitly. Use the agent that should perform the challenge in your environment (example: "main").
- thinking: "off" for ⚡ Light; "low" for 🔥 Standard and 💀 Brutal
- timeoutSeconds: 120
- Challenger should be reasoning-only and should not need external tools
- Main agent drives the loop: call sessions_spawn, parse result, revise, repeat
- Fallback: If spawn fails or times out, fall back to inline mode with: ⚠️ Subagent challenge unavailable, falling back to inline challenge.
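
The payload above differs between levels only in the thinking setting and the task text, so it can be built from a small function. The sketch below uses only the keys and values shown in this section; `build_spawn_payload` and the level keys are hypothetical names, and how the payload is actually handed to sessions_spawn is platform-specific and not shown.

```python
import json

# thinking setting per intensity level, as listed above
THINKING_BY_LEVEL = {"light": "off", "standard": "low", "brutal": "low"}


def build_spawn_payload(prompt: str, level: str, agent_id: str = "main") -> dict:
    """Build the one-shot subagent payload for the given intensity level."""
    return {
        "runtime": "subagent",
        "mode": "run",  # ephemeral, no persistent session
        "agentId": agent_id,
        "thinking": THINKING_BY_LEVEL[level],
        "timeoutSeconds": 120,
        "task": prompt,  # the canonical template with variables filled
    }


if __name__ == "__main__":
    print(json.dumps(build_spawn_payload("<filled prompt>", "standard"), indent=2))
```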

Hermes

Use delegate_task to spawn a challenger.

Pass the canonical prompt template via the task payload. Main agent drives the loop as above. Same fallback and cost breaker rules apply.

Output Format

Inline mode:

[Main recommendation/analysis]

**Strongest objection:** ...
**What would invalidate this:** ...
**When [alternative] is better:** ...
**Key assumptions:** ...

Subagent mode:

[Hardened content]

---

**[⚡/🔥/💀] Challenge Summary (X rounds, Y changes)**
- 🔴 [Critical] → [fix applied]
- 🟡 [Optimization] → [adjustment]
- ✅ [Passed]

**Key assumptions:** ...
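
The subagent-mode layout can be rendered from structured data, as in the sketch below. `render_summary` and the severity labels "critical" and "optimization" are hypothetical, and the sketch omits the ✅ line for brevity.

```python
ICONS = {"light": "⚡", "standard": "🔥", "brutal": "💀"}


def render_summary(
    hardened: str,
    level: str,
    rounds: int,
    changes: list[tuple[str, str, str]],
    assumptions: str,
) -> str:
    """Render hardened content followed by the challenge summary block.

    Each change is (severity, problem, fix); severity "critical" gets the
    red marker, anything else the yellow one.
    """
    lines = [
        hardened,
        "",
        "---",
        "",
        f"**{ICONS[level]} Challenge Summary ({rounds} rounds, {len(changes)} changes)**",
    ]
    for severity, problem, fix in changes:
        marker = "🔴" if severity == "critical" else "🟡"
        lines.append(f"- {marker} {problem} → {fix}")
    lines += ["", f"**Key assumptions:** {assumptions}"]
    return "\n".join(lines)
```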

Usage Summary

Scenario	What happens
"挑战一下" / "帮我看看有没有问题" / "靠谱吗"	Inline 4-line block, zero cost
"challenge this" / "审一下" / "帮我审查一下"	⚡ Light subagent, 1 round
"deep challenge" / "深度挑战" / "严格审查"	🔥 Standard subagent, 3 rounds
"brutal challenge" / "毁灭级挑战" / "往死里挑"	💀 Brutal subagent, 5 rounds
"skip challenge" / "跳过" / "不用审" / "直接给"	No challenge
Agent detects high-risk output	Self-initiates inline challenge
Subagent spawn fails	Fallback to inline only
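
The trigger phrases in the table can be routed with a simple lookup, sketched below. `TRIGGERS` and `route` are hypothetical names, and case-insensitive substring matching is an assumption; the skill itself leaves trigger detection to the agent. Skip phrases are checked first so that an explicit opt-out wins.

```python
from typing import Optional

# (mode, phrases) pairs checked in order; phrases come from the table above
# and from the Overview table ("challenge inline").
TRIGGERS = [
    ("skip", ["skip challenge", "跳过", "不用审", "直接给"]),
    ("brutal", ["brutal challenge", "毁灭级挑战", "往死里挑"]),
    ("standard", ["deep challenge", "深度挑战", "严格审查"]),
    ("light", ["challenge this", "审一下", "帮我审查一下"]),
    ("inline", ["challenge inline", "挑战一下", "帮我看看有没有问题", "靠谱吗"]),
]


def route(message: str) -> Optional[str]:
    """Return the mode triggered by the message, or None if nothing matches."""
    lowered = message.lower()
    for mode, phrases in TRIGGERS:
        if any(phrase in lowered for phrase in phrases):
            return mode
    return None
```

A `None` result leaves the decision to the agent, which may still self-initiate an inline challenge for high-risk output.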

Security Recommendations

This skill is coherent and low-risk in terms of files, installs, and secrets because it is purely a prompt/template. Before installing or using it in production, verify the hosting platform enforces isolation and the documented constraints: ensure subagent spawns cannot load additional skills or access connectors/credentials, confirm timeouts and round limits are enforced to avoid runaway costs, and test with low-risk inputs to validate the anti-recursion guard works in practice. If you cannot confirm sandboxing or spawn restrictions, treat subagent mode as higher risk and prefer inline mode only.

Functional Analysis

Type: OpenClaw Skill
Name: challenge-loop
Version: 1.0.0

The challenge-loop skill bundle is a utility designed to improve AI agent output quality through structured self-critique or subagent-based adversarial review. It implements a 'Challenge Loop' workflow using standard platform tools like sessions_spawn (OpenClaw) and the Agent tool (Claude Code) to identify assumptions and risks in the agent's own recommendations. The instructions in SKILL.md are transparent, include necessary anti-recursion guards to prevent infinite loops, and lack any indicators of data exfiltration, malicious execution, or unauthorized access.

Capability Tags

cryptocan-make-purchases

Capability Assessment

✓ Purpose & Capability

Name/description (adversarial hardening, inline or subagent challenger) align with the SKILL.md. There are no unrelated env vars, binaries, or install steps requested, and platform-specific spawn references are reasonable for a subagent-based workflow.

ℹ Instruction Scope

The SKILL.md stays on-topic (self-refutation block and challenger prompt template). However, the anti-recursion and 'do not load other skills' protections are implemented only as prompt-level instructions (not enforced by code), so their effectiveness depends on the hosting platform's sandboxing and spawn behavior. The 'agent discretion' clause gives the agent some autonomy to self-initiate inline challenges which may broaden invocation surface.

✓ Install Mechanism

No install spec and no code files: instruction-only skill is low-risk from an install perspective (nothing is written to disk or downloaded).

✓ Credentials

The skill declares no required environment variables, credentials, or config paths and the instructions do not ask for secrets or unrelated system data.

ℹ Persistence & Privilege

always:false (good). disable-model-invocation is default false (normal). The skill relies on spawning challenger subagents at runtime; this is expected but means the hosting platform will perform additional model calls and needs to enforce timeouts and isolation. The skill does not request to modify other skills or persistent agent state.

Version History

v1.0.0

challenge-loop 1.0.0: Initial Release
- Introduces Challenge Loop for adversarial hardening of judgment-based outputs.
- Supports two modes: inline self-refutation (zero latency) and subagent challenger with configurable intensity.
- Manual trigger with natural language; user can escalate, de-escalate, or skip.
- Subagent mode features independent challenger persona, multiple rounds, and escalating challenge intensity.
- Canonical challenger prompt template with strict anti-recursion enforcement.
- Platform-agnostic orchestration, with fallback to inline challenge on failure.

Metadata

Slug challenge-loop

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

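FAQ

What is Challenge Loop?

Challenge Loop provides adversarial hardening for judgment-containing outputs. Two modes: inline self-refutation (zero latency) and independent challenger subagent.... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 77 cumulative downloads to date.

How do I install Challenge Loop?

Run the command "/install challenge-loop" in an OpenClaw or Claude Code chat to install it in one step; no additional configuration is needed.

Is Challenge Loop free?

Yes. Challenge Loop is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Challenge Loop support?

Challenge Loop is cross-platform and runs in any environment where OpenClaw or Claude Code is deployed.

Who developed Challenge Loop?

It is developed and maintained by enzowyf (@enzowyf); the current version is v1.0.0.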
