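Description

A dynamic jury workflow for multi-dimensional scoring and periodic iteration. It automatically generates a jury suited to the task and supports adding extreme reviewers, forming a surround-style review formation. Trigger words: jury (评审团), multi-dimensional scoring (多维评分), iterative optimization (迭代优化), code review (代码评审), quality assessment (质量评估), periodic iteration (周期迭代), dynamic review (动态评审).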

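Usage (SKILL.md)

Dynamic Jury Multi-Dimensional Scoring System

Name: Jury Review
Author: kukuxnd

An intelligent review framework based on the AutoResearch approach.

Core Idea

Task analysis → Generate jury → Extreme challenge → User selection → Final jury → Iterative optimization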

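Workflow

Phase 1: Task Analysis

Analyze the user's task and identify its key dimensions:

task = "Create a high-concurrency C++ HTTP server"

analysis = {
    "type": "network service",
    "keywords": ["high concurrency", "HTTP", "server", "C++"],
    "risk_areas": ["concurrency safety", "memory management", "network protocol"],
    "quality_focus": ["performance", "security", "stability"]
}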
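
The analysis step can be approximated with a simple keyword lookup. The sketch below is only an illustration under stated assumptions: the source leaves the analysis to the agent, and the `TYPE_KEYWORDS` table and the `analyze_task` function are hypothetical names introduced here. Plain substring matching is used for brevity, so short keywords may match inside longer words.

```python
# Hypothetical keyword table (not part of the skill): task type -> trigger keywords.
TYPE_KEYWORDS = {
    "network service": ["http", "server", "socket", "concurrency"],
    "data processing": ["data", "cleaning", "etl", "csv"],
    "ui/frontend": ["frontend", "page", "css"],
}


def analyze_task(task: str) -> dict:
    """Return the task type with the most keyword hits and the matched keywords."""
    text = task.lower()
    best_type, best_hits = "general code", []
    for task_type, keywords in TYPE_KEYWORDS.items():
        hits = [kw for kw in keywords if kw in text]
        if len(hits) > len(best_hits):
            best_type, best_hits = task_type, hits
    return {"type": best_type, "keywords": best_hits}


analysis = analyze_task("Create a high-concurrency C++ HTTP server")
print(analysis["type"], analysis["keywords"])
```

Tasks that match no keyword fall back to "general code", which corresponds to the catch-all jury type in the skill.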

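Phase 2: Generate the Core Jury

Based on the task type, generate a core jury that surrounds the code from the top, bottom, left, and right:

                          [Top] Architect
                                 ↓
[Left] Security Officer ←─── Core code ───→ [Right] Performance Officer
                                 ↑
                     [Bottom] Testing Officer

Core jury generation rules:

| Task type | Core jury | Notes |
|-----------|-----------|-------|
| Network service | Architect, Security Officer, Performance Officer, Testing Officer | Surrounded on four sides |
| Data processing | Data Officer, Performance Officer, Security Officer, Documentation Officer | Data-centered |
| UI/frontend | Aesthetics Officer, UX Officer, Performance Officer, Compatibility Officer | User-centered |
| Algorithm/AI | Algorithm Officer, Performance Officer, Testing Officer, Ethics Officer | Quality first |
| Security tools | Security Officer, Penetration Officer, Compliance Officer, Audit Officer | Security above all |
| General code | Aesthetics Officer, Performance Officer, Security Officer, Testing Officer, Documentation Officer | All five officers |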
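
The generation rules above amount to a lookup table. A minimal sketch, assuming the task type has already been determined in Phase 1; `CORE_JURIES` and `core_jury_for` are illustrative names, not identifiers from the skill:

```python
# Task type -> core jury, transcribed from the rules table above.
CORE_JURIES = {
    "network service": ["Architect", "Security Officer", "Performance Officer", "Testing Officer"],
    "data processing": ["Data Officer", "Performance Officer", "Security Officer", "Documentation Officer"],
    "ui/frontend": ["Aesthetics Officer", "UX Officer", "Performance Officer", "Compatibility Officer"],
    "algorithm/ai": ["Algorithm Officer", "Performance Officer", "Testing Officer", "Ethics Officer"],
    "security tools": ["Security Officer", "Penetration Officer", "Compliance Officer", "Audit Officer"],
    "general code": ["Aesthetics Officer", "Performance Officer", "Security Officer",
                     "Testing Officer", "Documentation Officer"],
}


def core_jury_for(task_type: str) -> list:
    """Look up the core jury; unknown types fall back to the general-code jury (an assumption)."""
    return list(CORE_JURIES.get(task_type.lower(), CORE_JURIES["general code"]))


print(core_jury_for("network service"))
```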

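Phase 3: Extreme Jury Challenge

Generate "extreme reviewers" who probe the blind spots of the core jury:

Extreme reviewer types:

| Extreme reviewer | Role | Challenge question |
|------------------|------|--------------------|
| 🔥 Arsonist | Destructive testing | "What happens if malicious input is passed in on purpose?" |
| 🧟 Zombie | Boundary extremes | "What if only 1KB of memory is left?" |
| ⏰ Timekeeper | Time pressure | "What if this must finish within 10ms?" |
| 💀 Reaper | Failure scenarios | "What if this function crashes?" |
| 🎭 Trickster | Deceptive input | "What if the user lies about the input type?" |
| 🌀 Chaos Officer | Random faults | "What if the network suddenly disconnects?" |
| 📉 Miser | Resource limits | "What if CPU usage must stay < 1%?" |
| 🌪️ Storm Officer | Heavy load | "What if there are 1 million concurrent requests?" |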

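Phase 4: User Selection

Present the extreme jury to the user and let them choose who joins:

## 🎭 Proposed Extreme Reviewers

Based on the characteristics of your task, consider the following extreme reviewers:

| Reviewer | Challenge dimension | Why recommended |
|----------|---------------------|-----------------|
| 🔥 Arsonist | Destructive testing | A network service must withstand malicious input |
| 🌀 Chaos Officer | Fault handling | The network is unstable under high concurrency |
| 🌪️ Storm Officer | Extreme load | High concurrency needs load-test validation |

**Please choose which extreme reviewers to add:**
- [ ] Add all
- [ ] Add selected (specify)
- [ ] Add none, use the core jury only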

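Phase 5: Final Jury

Combine the core and extreme reviewers to form the final jury for this task:

## ⚔️ Final Jury Lineup

### Core Formation

              [Architect] Zhao Gou (赵构)
                        ↓
[Security Officer] Dunshan (盾山) ─── Code ─── [Performance Officer] Lightning (闪电)
                        ↑
              [Testing Officer] Touchstone (试金石)

### Extreme Challenge

🔥 Arsonist · Fentian (焚天) | 🌀 Chaos Officer · Luanwu (乱舞) | 🌪️ Storm Officer · Kuangxiao (狂啸)

7 reviewers in total; the combined weights are assigned automatically.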
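
The source does not say how the weights are assigned. As one possible reading, the sketch below merges the two groups and gives every member an equal share; `build_final_jury` is a hypothetical helper, and the equal split is an assumption rather than the skill's rule. With 7 members each share is about 14.3%, which happens to fall inside the 10-25% weight range listed in the role library.

```python
def build_final_jury(core, extreme, extreme_jury_max=3):
    """Merge core and extreme reviewers and assign equal weights (assumed policy)."""
    if len(extreme) > extreme_jury_max:
        raise ValueError(f"at most {extreme_jury_max} extreme reviewers are allowed")
    members = list(core) + list(extreme)
    if not members:
        raise ValueError("the jury must have at least one member")
    weight = 1.0 / len(members)
    return {name: weight for name in members}


jury = build_final_jury(
    ["Architect", "Security Officer", "Performance Officer", "Testing Officer"],
    ["Arsonist", "Chaos Officer", "Storm Officer"],
)
for name, weight in jury.items():
    print(f"{name}: {weight:.1%}")
```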

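Phase 6: Multi-Round Iteration

def run_iterations(task, core_jury, extreme_jury,
                   max_iterations=5, accept_threshold=80, min_improvement=5):
    feedback = None                          # no feedback before the first round
    best_score, best_code = float("-inf"), None

    for iteration in range(max_iterations):
        # 1. Generate or improve the code using the previous round's feedback
        code = generate_or_improve(task, feedback)

        # 2. Core jury scoring
        core_scores = core_jury.evaluate(code)

        # 3. Extreme reviewer challenges
        extreme_challenges = extreme_jury.challenge(code)

        # 4. Combined score
        total = weighted_average(core_scores, extreme_challenges)

        # 5. Decision
        if total >= accept_threshold:
            return ACCEPT, code
        improvement = total - best_score
        if total > best_score:
            best_score, best_code = total, code
        # "No improvement" is read here as gaining less than min_improvement
        if iteration > 0 and improvement < min_improvement:
            return STAGNANT, best_code
        feedback = generate_feedback(core_scores, extreme_challenges)

    # The source does not name this outcome; MAX_ITERATIONS is an assumed status
    return MAX_ITERATIONS, best_code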
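
The loop above calls `weighted_average` without defining it. One plausible shape, offered only as a sketch: take a weighted mean of the core scores and subtract a fixed penalty for each failed extreme challenge. The `weights` and `penalty` parameters, the 0-100 scale, and the pass/fail representation of challenges are all assumptions introduced here.

```python
def weighted_average(core_scores, extreme_challenges, weights=None, penalty=5.0):
    """Weighted mean of core scores minus a penalty per failed challenge.

    core_scores: reviewer name -> score (assumed 0-100)
    extreme_challenges: reviewer name -> True if the challenge was passed
    weights: reviewer name -> relative weight (defaults to equal weights)
    penalty: points deducted per failed challenge (hypothetical value)
    """
    if not core_scores:
        raise ValueError("core_scores must not be empty")
    if weights is None:
        weights = {name: 1.0 for name in core_scores}
    total_weight = sum(weights[name] for name in core_scores)
    base = sum(score * weights[name] for name, score in core_scores.items()) / total_weight
    failed = sum(1 for passed in extreme_challenges.values() if not passed)
    return max(0.0, base - penalty * failed)


score = weighted_average(
    {"Architect": 82, "Performance Officer": 75, "Security Officer": 68, "Testing Officer": 60},
    {"Arsonist": False, "Storm Officer": True},
)
print(score)
```

Keeping the penalty small relative to the acceptance threshold is in line with the note that extreme reviewers should not punish the code excessively.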

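Reviewer Role Library

Core Reviewers

| Reviewer | Symbol | Dimension | Weight range |
|----------|--------|-----------|--------------|
| 🎨 Aesthetics Officer | 🎨 | Code aesthetics | 10-25% |
| ⚡ Performance Officer | ⚡ | Execution efficiency | 10-25% |
| 🔒 Security Officer | 🔒 | Security | 10-25% |
| 🧪 Testing Officer | 🧪 | Test quality | 10-25% |
| 📝 Documentation Officer | 📝 | Documentation completeness | 10-25% |
| 🏗️ Architect | 🏗️ | Architecture design | 10-20% |
| 📊 Data Officer | 📊 | Data handling | 10-20% |
| 👁️ UX Officer | 👁️ | User experience | 10-20% |
| ⚖️ Compliance Officer | ⚖️ | Compliance | 10-20% |
| 🤖 Algorithm Officer | 🤖 | Algorithm quality | 10-20% |

Extreme Reviewers

| Reviewer | Symbol | Challenge type | Applicable scenarios |
|----------|--------|----------------|----------------------|
| 🔥 Arsonist | 🔥 | Destructive testing | Networking, security, input handling |
| 🧟 Zombie | 🧟 | Resource limits | Embedded, mobile |
| ⏰ Timekeeper | ⏰ | Time pressure | Real-time systems, high-frequency trading |
| 💀 Reaper | 💀 | Failure recovery | Critical systems, finance |
| 🎭 Trickster | 🎭 | Input deception | User input, APIs |
| 🌀 Chaos Officer | 🌀 | Random faults | Distributed systems, networking |
| 📉 Miser | 📉 | Resource limits | Performance-sensitive code |
| 🌪️ Storm Officer | 🌪️ | Extreme load | High concurrency, games |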
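
The "applicable scenarios" column suggests how extreme reviewers could be proposed for a task. A minimal sketch, assuming the task has been tagged with scenario labels; `EXTREME_SCENARIOS`, `propose_extreme`, and the tag vocabulary are illustrative, not defined by the skill:

```python
# Extreme reviewer -> applicable scenario tags, transcribed from the table above.
EXTREME_SCENARIOS = {
    "Arsonist": {"network", "security", "input handling"},
    "Zombie": {"embedded", "mobile"},
    "Timekeeper": {"real-time", "high-frequency trading"},
    "Reaper": {"critical system", "finance"},
    "Trickster": {"user input", "api"},
    "Chaos Officer": {"distributed", "network"},
    "Miser": {"performance-sensitive"},
    "Storm Officer": {"high concurrency", "games"},
}


def propose_extreme(tags, limit=3):
    """Return up to `limit` reviewers whose scenarios overlap the task tags, best match first."""
    wanted = set(tags)
    ranked = sorted(
        EXTREME_SCENARIOS.items(),
        key=lambda item: len(wanted & item[1]),
        reverse=True,  # sorted() is stable, so ties keep table order
    )
    return [name for name, scenarios in ranked if wanted & scenarios][:limit]


print(propose_extreme({"network", "high concurrency"}))
```

The default `limit` of 3 mirrors the `extreme_jury_max` parameter in the skill's configuration.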

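Configuration Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_iterations` | 5 | Maximum number of iterations |
| `accept_threshold` | 80 | Acceptance threshold (score) |
| `min_improvement` | 5 | Minimum score improvement per round |
| `core_jury_size` | 4-5 | Number of core jury members |
| `extreme_jury_max` | 3 | Maximum number of extreme reviewers |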
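
These parameters could be carried in a small validated config object. The sketch below is one way to do it; the `JuryConfig` class is hypothetical, and the 0-100 score scale is inferred from the sample scores in the feedback format rather than stated by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JuryConfig:
    """Defaults taken from the parameter table; validation rules are assumptions."""
    max_iterations: int = 5
    accept_threshold: float = 80
    min_improvement: float = 5
    core_jury_size: tuple = (4, 5)   # allowed (min, max) number of core reviewers
    extreme_jury_max: int = 3

    def __post_init__(self):
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")
        if not 0 <= self.accept_threshold <= 100:
            raise ValueError("accept_threshold must be within 0-100")
        if self.extreme_jury_max < 0:
            raise ValueError("extreme_jury_max must not be negative")


config = JuryConfig(max_iterations=3)
print(config)
```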

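Usage Examples

Example 1: High-Concurrency Server

User: Create a high-concurrency C++ HTTP server

System analysis:
- Type: network service
- Keywords: high concurrency, HTTP, server
- Risk areas: concurrency safety, memory leaks, connection management

Generated core jury:
                 [Architect]
                      ↓
[Security Officer]─── Code ───[Performance Officer]
                      ↑
              [Testing Officer]

Proposed extreme reviewers:
- 🔥 Arsonist (malicious requests)
- 🌪️ Storm Officer (extreme concurrency)
- 🌀 Chaos Officer (network faults)

User choice: add all

Final jury: 7 reviewers
Starting multi-round iteration...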

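Example 2: Data Processing Script

User: Write a Python data cleaning script

System analysis:
- Type: data processing
- Keywords: data, cleaning, script

Generated core jury:
               [Data Officer]
                      ↓
[Security Officer]─── Code ───[Performance Officer]
                      ↑
           [Documentation Officer]

Proposed extreme reviewers:
- 🎭 Trickster (dirty data)
- 💀 Reaper (data loss)

User choice: add the Trickster

Final jury: 5 reviewers
Starting multi-round iteration...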

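Feedback Output Format

## ⚔️ Round N Review

### Core Scores
| Reviewer | Score | Status | Main finding |
|----------|-------|--------|--------------|
| 🏗️ Architect | 82 | ✅ | Clear module boundaries |
| ⚡ Performance Officer | 75 | ⚠️ | Connection pool can be optimized |
| 🔒 Security Officer | 68 | ⚠️ | Missing input validation |
| 🧪 Testing Officer | 60 | ⚠️ | Insufficient test coverage |

### Extreme Challenges
| Reviewer | Passed | Challenge result |
|----------|--------|------------------|
| 🔥 Arsonist | ❌ | Malicious request causes a crash |
| 🌪️ Storm Officer | ⚠️ | Latency increases at 10K concurrent connections |

**Combined score: 71.2**
**Status: continue iterating**

### Improvement Suggestions
1. [Security] Add request header validation
2. [Testing] Add concurrency test cases
3. [Extreme] Add request rate limiting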
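
The core-score table in this format can be produced with a few lines of string formatting. A minimal sketch; `status_icon` and `render_core_table` are hypothetical helpers, and the rule that ✅ means "at or above the acceptance threshold" is inferred from the sample (82 is ✅, 75 and below are ⚠️), not stated by the source.

```python
def status_icon(score, accept_threshold=80):
    """Assumed rule: a score at or above the threshold passes, anything else is a warning."""
    return "✅" if score >= accept_threshold else "⚠️"


def render_core_table(round_no, core_results):
    """Render the heading and core-score table as Markdown.

    core_results: iterable of (reviewer, score, main_finding) tuples.
    """
    lines = [
        f"## ⚔️ Round {round_no} Review",
        "",
        "### Core Scores",
        "| Reviewer | Score | Status | Main finding |",
        "|----------|-------|--------|--------------|",
    ]
    for reviewer, score, finding in core_results:
        lines.append(f"| {reviewer} | {score} | {status_icon(score)} | {finding} |")
    return "\n".join(lines)


print(render_core_table(1, [
    ("🏗️ Architect", 82, "Clear module boundaries"),
    ("🔒 Security Officer", 68, "Missing input validation"),
]))
```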

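Notes

- Keep the number of extreme reviewers moderate to avoid excessive penalties
- Give each iteration a clear improvement goal
- Stop promptly when iteration stagnates
- Record the review history for later analysis and tuning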

Security Recommendations

This skill appears internally consistent and contains a local Python reviewer that flags common issues (e.g., strcpy, system(), scanf patterns, nested loops). Before installing or using it: (1) review scripts/jury-scorer.py yourself to confirm you accept its checks and outputs; (2) don't point the scorer at sensitive files (it opens arbitrary code files passed as an argument); (3) be cautious about automatically executing any code produced by the skill, since generated code should be reviewed and tested in a safe environment; (4) because the skill's source is from an unknown publisher, prefer manual inspection before granting it broader automated access or running it on production data.

Functional Analysis

Type: OpenClaw Skill
Name: jury-review
Version: 1.0.0

The 'jury-review' skill bundle is a structured code review framework that uses multiple AI personas (e.g., Architect, Security Officer) to evaluate code quality. The included Python script (scripts/jury-scorer.py) performs basic static analysis to identify common vulnerabilities like buffer overflows (strcpy) or command injection (system) in the code being reviewed, but it does not execute the code or perform any harmful actions itself. All instructions in SKILL.md and documentation in references/scoring-guide.md are consistent with the stated purpose of multi-dimensional code assessment and iterative improvement.

Capability Assessment

✓ Purpose & Capability

The SKILL.md describes a juried review workflow and the repo includes a scoring guide and a local Python scorer (scripts/jury-scorer.py) that implements detection rules and produces JSON results — these are coherent with the stated goal.

ℹ Instruction Scope

The SKILL.md instructs the agent to analyze tasks, generate reviewer roles, and iteratively generate/improve code based on feedback. Those instructions stay within the declared purpose (code review and iteration). Note: the workflow implies generating and assessing code; the skill does not instruct reading unrelated system secrets, but you should avoid passing sensitive files to the scorer (it reads arbitrary code files you point it at).

✓ Install Mechanism

No install spec or external downloads; the skill is instruction-only with a small included Python script. No unusual installers, archive downloads, or third‑party packages are declared.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The included script operates on a code file passed as an argument and does not access environment secrets.

✓ Persistence & Privilege

always:false and no special persistence or system configuration changes are requested. The skill can be invoked by the agent (default), which is expected for user-invocable review tools.

Version History

v1.0.0

Dynamic, role-based jury review workflow for multi-dimensional scoring and iterative improvement.
- Introduces an automated process to analyze tasks, generate customized core juries, and propose "extreme" review roles for robust evaluation.
- Supports user selection of jury composition, combining core and extreme reviewers for comprehensive review.
- Enables multi-round evaluation with weighted scoring and structured feedback based on both standard and edge-case perspectives.
- Customizable for different task types (e.g., network, data, UI, algorithm) with predefined role libraries and evaluation parameters.
- Offers markdown-based feedback output and clear visual representation of review teams and outcomes.
- Designed for clear iteration, improvement tracking, and adaptive quality assessment.

Metadata

Slug jury-review

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Historical versions 1

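FAQ

What is Jury Review?

A dynamic jury workflow for multi-dimensional scoring and periodic iteration. It automatically generates a jury suited to the task and supports adding extreme reviewers, forming a surround-style review formation. Trigger words: jury (评审团), multi-dimensional scoring (多维评分), iterative optimization (迭代优化), code review (代码评审), quality assessment (质量评估), periodic iteration (周期迭代), dynamic review (动态评审). It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 160 times so far.

How do I install Jury Review?

Run the command "/install jury-review" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Jury Review free?

Yes. Jury Review is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Jury Review support?

Jury Review is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Jury Review?

It is developed and maintained by kukuxNd (@kukuxnd); the current version is v1.0.0.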
