功能描述

Adversarial plan review using two different AI models. Supports static mode (fixed roles) and alternating mode (models swap writer/reviewer each round, fully...

使用说明 (SKILL.md)

cross-model-review

Name: Cross Model Review
Author: don-gbot

Metadata

name: cross-model-review
version: 2.0.0
description: >
  Adversarial plan review using two different AI models.
  v2: Alternating mode — models swap writer/reviewer each round.
  Fully autonomous loop — no human input between rounds.
  Use when: building features touching auth/payments/data models,
  plans that will take >1hr to implement.
  NOT for: simple one-file fixes, research tasks, quick scripts.
triggers:
  - "review this plan"
  - "cross review"
  - "challenge this"
  - "is this plan solid?"
  - "adversarial review"

When to Activate

Activate this skill when the user:

Says any trigger phrase above
Shares a plan and asks for adversarial/second-opinion review
Asks you to "sanity check" a multi-step implementation plan

Do NOT activate for: simple fixes, one-liners, pure research tasks.

Modes

Static Mode (v1 — backward compatible)

Fixed roles: planner always writes, reviewer always reviews. Requires human to trigger each round.

Alternating Mode (v2 — recommended)

Models swap roles each round. Fully autonomous — no human input between rounds.

Flow:

Round 1: Model A writes the plan. Model B reviews.
Round 2: Model B rewrites (based on its own review). Model A reviews.
Round 3: Model A rewrites (based on its own review). Model B reviews.
...continues alternating until both agree (reviewer says APPROVED) or max rounds hit.

Why this works:

Each model must implement its own critique — can't nitpick without owning the fix
The other model catches over-engineering or proportionality issues
Natural convergence: each round addresses the other's concerns

Autonomous Orchestration (Alternating Mode)

You (the main agent) run this loop. It's fully autonomous after kickoff.

Step 1 — Save the plan and init

node review.js init \
  --plan /path/to/plan.md \
  --mode alternating \
  --model-a "anthropic/claude-opus-4-6" \
  --model-b "openai-codex/gpt-5.3-codex" \
  --project-context "Brief description for reviewer calibration" \
  --out /home/ubuntu/clawd/tasks/reviews

Captures workspace path from stdout.

Step 2 — The autonomous loop

while true:
  step = next-step(workspace)

  if step.action == "done":
    break  # APPROVED!

  if step.action == "max-rounds":
    ask user: override or manual fix
    break

  if step.action == "review":
    spawn sub-agent with step.model, step.prompt
    save response to workspace/round-N-response.json
    parse-round(workspace, round, response)
    continue

  if step.action == "revise":
    spawn sub-agent with step.model, step.prompt
    save output plan to temp file
    save-plan(workspace, temp file, version)
    continue

Step 3 — Finalize

When the loop exits with APPROVED:

node review.js finalize --workspace \x3Cworkspace>

Present: rounds taken, issues found/resolved, rubric scores, plan-final.md location.

CLI Reference

Commands:
  init           Create a review workspace
  next-step      Get next action for autonomous loop
  parse-round    Parse a reviewer response, update issue tracker
  save-plan      Save a revised plan version from writer output
  finalize       Generate plan-final.md, changelog.md, summary.json
  status         Print current workspace state

init options:
  --plan \x3Cfile>            Path to plan file (required)
  --mode \x3Cm>               "static" (default) or "alternating"
  --model-a \x3Cm>            Model A — writes first (alternating mode, required)
  --model-b \x3Cm>            Model B — reviews first (alternating mode, required)
  --reviewer-model \x3Cm>     Reviewer model (static mode, required)
  --planner-model \x3Cm>      Planner model (static mode, required)
  --project-context \x3Cs>    Brief project context for reviewer calibration
  --out \x3Cdir>              Output base dir (default: tasks/reviews)
  --max-rounds \x3Cn>         Max rounds (default: 5 static, 8 alternating)
  --token-budget \x3Cn>       Token budget for context (default: 8000)

next-step options:
  --workspace \x3Cdir>        Path to review workspace (required)
  Returns JSON: { action, model, round, prompt, planVersion, saveTo }
  Actions: "review", "revise", "done", "max-rounds"

parse-round options:
  --workspace \x3Cdir>        Path to review workspace (required)
  --round \x3Cn>              Round number (required)
  --response \x3Cfile>        Path to raw reviewer response (required)

save-plan options:
  --workspace \x3Cdir>        Path to review workspace (required)
  --plan \x3Cfile>            Path to revised plan markdown (required)
  --version \x3Cn>            Plan version number (required)

finalize options:
  --workspace \x3Cdir>        Path to review workspace (required)
  --override-reason \x3Cs>    Reason for force-approving with open issues
  --ci-force               Required in non-TTY mode when overriding

status options:
  --workspace \x3Cdir>        Path to review workspace (required)

Exit codes:
  0   Approved / OK
  1   Revise / max-rounds
  2   Error

Detailed Orchestration (for agent implementation)

Spawning reviewers

step = next-step(workspace)  # action: "review"
response = sessions_spawn(model=step.model, task=step.prompt, timeout=120s)
# Save raw response to workspace/round-{step.round}-response.json
parse-round(workspace, step.round, response_file)

System instruction for reviewer: "You are a senior engineering reviewer. Output ONLY valid JSON matching the schema. No tool calls. No markdown fences. No preamble."

Spawning writers

step = next-step(workspace)  # action: "revise"
revised_plan = sessions_spawn(model=step.model, task=step.prompt, timeout=300s)
# Save raw output as temp file
save-plan(workspace, temp_file, step.planVersion)

System instruction for writer: none needed — the prompt is self-contained.

Error handling

Reviewer timeout/failure: retry once, then ask user
Writer timeout/failure: retry once, then ask user
Parse error on review JSON: re-prompt reviewer once with "Your response was not valid JSON"
Max rounds hit: present status to user, ask for override or manual fix

Convergence

The loop converges when the reviewer says APPROVED with no open CRITICAL/HIGH blockers. The script enforces this — if reviewer says APPROVED but blockers remain, it overrides to REVISE.

Static Mode (v1 — backward compatible)

For static mode, the original orchestration from v1 still works:

Step 1 — Init

node review.js init --plan \x3Cfile> --reviewer-model \x3Cm> --planner-model \x3Cm>

Step 2 — Manual loop

For each round: build reviewer prompt from template, spawn reviewer, parse-round, revise plan yourself, continue.

Step 3 — Finalize

Same as alternating mode.

Integration with coding-agent

Before dispatching any plan to coding-agent that:

Touches auth, payments, or data models
Has 3+ implementation steps
The user hasn't already reviewed adversarially

Run cross-model-review first. Only proceed if exit code 0.

Notes

Workspace persists in tasks/reviews/ — referenceable later
issues.json tracks full lifecycle of all issues
meta.json stores mode, models, current round, verdict, needsRevision flag
next-step is the state machine — always call it to determine what to do
Dedup warnings help catch semantic drift across rounds
Models must be from different provider families (cross-provider enforcement)
--project-context is injected into reviewer prompts for calibration

安全使用建议

This skill appears to do what it says: it orchestrates an adversarial review loop between two different models and includes on-disk helpers and templates. Before installing or running it, consider the following: (1) Do NOT include secrets, credentials, or PII in plan content or codebase context — those are sent to third-party model APIs. (2) Prefer static/human‑mediated mode for sensitive plans (alternating mode is fully autonomous). (3) Ensure the platform's sessions_spawn uses trusted provider credentials and that you understand where reviewer responses are sent/stored. (4) Review the included scripts (scripts/review.js) and templates yourself — they are provided and straightforward, but you should confirm workspace paths and retention policies meet your security/compliance needs. (5) If you must review sensitive or regulated plans, run this in an isolated environment or redact sensitive fields before invoking the skill.

功能分析

Type: OpenClaw Skill Name: cross-model-review Version: 2.1.0 The cross-model-review skill bundle is a well-architected tool for adversarial AI plan reviews with a strong focus on security and robustness. The core logic in `scripts/review.js` is dependency-free and implements a structured state machine for multi-round reviews, including Jaccard-based issue deduplication and schema validation. Most importantly, the bundle includes explicit prompt-injection mitigations in its templates (e.g., `alternating-reviewer-prompt.md` and `writer-prompt.md`) by wrapping untrusted content in delimiters and providing strict system instructions to sub-agents to treat input as data rather than instructions. No evidence of data exfiltration, malicious execution, or unauthorized persistence was found.

能力评估

✓ Purpose & Capability

Name/description, CLI, templates, and scripts all implement an adversarial cross-model review loop (static and alternating modes). The included Node.js helper (scripts/review.js) manages workspaces, parsing, dedup, and verdicts — this is expected and proportionate to the skill's purpose. There are no unrelated env vars, binaries, or surprising external services requested.

ℹ Instruction Scope

SKILL.md instructs the agent to spawn reviewer/writer sub-agents (sessions_spawn) and to save/parse JSON responses; templates explicitly wrap plan content in UNTRUSTED delimiters and require structured JSON output. This stays within the review orchestration scope, but the skill necessarily transmits plan content to third‑party models and relies on instruction-level sandboxing to mitigate prompt injection. The SKILL.md acknowledges that this is a prompt-level protection (not an API-level isolation) and warns of limitations.

✓ Install Mechanism

No install spec; skill is instruction-first and ships helper scripts and tests that run under Node.js >=18. No downloads from external URLs or package-install steps. The codebase claims zero external dependencies and uses only Node stdlib. This is a low-risk install footprint.

✓ Credentials

The skill declares no required environment variables or credentials. It does assume the platform's sessions_spawn mechanism will provide model access (so the platform will use whatever model/provider credentials it normally has). The absence of required secrets is appropriate; however, users must not include secrets/PII in plan or codebase_context because those values will be sent to external model APIs.

✓ Persistence & Privilege

always:false and no special privileges. The skill writes run artifacts only to a workspace directory supplied at init (user-controlled path). It does not request system-wide changes or modify other skills' configurations.

版本历史

v2.1.0

Round 0 criteria negotiation: Model A proposes 5 task-specific acceptance criteria, Model B challenges/refines. Agreed criteria injected into all reviewer prompts. New command: save-criteria. Backward compatible.

v2.0.1

v2.0: Alternating mode — models swap writer/reviewer each round. Fully autonomous loop. v2.0.1: Atomic meta writes, error states, save-plan validation. Dogfooded on itself (3-round stress test passed).

v1.2.0

Added Differentiation rubric dimension (#8): scores 0 if plan reads like generic LLM output, 5 if decisions grounded in project context. Verification gate catches baseline slop.

v1.1.0

v1.1.0: rubric-based multi-dimensional scoring (7 dimensions, 0-5 scale), backward compatible, 77 tests passing

v1.0.1

Adversarial plan review with stable issue tracking, dedup, fail-closed gating, 55 tests

元数据

Slug cross-model-review

版本 2.1.0

许可证 —

累计安装 3

当前安装数 3

历史版本数 5

常见问题