54lynnn

Multi-Agent Skill Evaluator

by 54Lynnn · GitHub ↗ · v1.4.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 62 · Stars: 0 · Active Installs: 0 · Versions: 3
Install in OpenClaw
/install multi-agent-skill-evaluator
Description
Help me evaluate this skill. (Original: "帮我评估一下这个 skill。")
README (SKILL.md)

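Skill Evaluator: Multi-Agent Skill Evaluation

Runs a structured, multi-dimensional evaluation of a target skill. Three isolated sub-agents act as independent examiners; each evaluates the skill in full, and their results are then aggregated.

Workflow

Step 1: Read the target skill

Read every file in the target skill's directory, skipping binary and other non-text files:

  • SKILL.md
  • scripts/* (.sh, .py, .js, etc.)
  • references/* (.md, etc.)
  • Other text configuration files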
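
A minimal sketch of Step 1, assuming the skill lives in a local directory. The helper name `read_skill_files` and the binary-detection heuristic (a NUL byte or invalid UTF-8) are illustrative assumptions, not part of the skill itself:

```python
from pathlib import Path

# Hypothetical helper: illustrates Step 1, not the skill's actual implementation.
def read_skill_files(skill_dir: str) -> dict[str, str]:
    """Return {relative_path: text} for every readable text file under skill_dir."""
    root = Path(skill_dir)
    files: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        # Assumed heuristic: a NUL byte means the file is binary, so skip it.
        if b"\x00" in raw:
            continue
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: treat as non-text and skip.
            continue
        files[path.relative_to(root).as_posix()] = text
    return files
```

The returned mapping keeps relative paths as keys, so each examiner can later be shown which file a given piece of content came from.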

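Step 2: Launch 3 sub-agents in parallel

Take the evaluation protocol in references/evaluation-protocol.md, fill in the target skill's information and the full contents of every file, then spawn 3 sub-agents at the same time (using mode="run").

Each sub-agent's task content must include:

  1. A role statement (you are independent examiner A/B/C)
  2. Information about the skill under evaluation
  3. All evaluation materials (complete file contents)
  4. The evaluation criteria (the 8 dimension definitions, quoted directly from evaluation-protocol.md)
  5. The output format requirements (including the ===SCORE_SUMMARY=== marker line)

Note: send the spawns in parallel with sessions_spawn rather than waiting on each one in turn, then call sessions_yield to wait for all of them to finish.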
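
The five required parts of a task can be assembled as plain text before spawning. The sketch below is only an illustration: `build_task`, the section labels, and the wording of the output-format line are assumptions, and the actual spawning via sessions_spawn is deliberately left out because its parameters are not described here.

```python
MARKER = "===SCORE_SUMMARY==="

# Hypothetical helper: builds one examiner's task text from the five required parts.
def build_task(examiner: str, skill_info: str, files: dict[str, str], criteria: str) -> str:
    materials = "\n\n".join(f"--- {name} ---\n{text}" for name, text in files.items())
    parts = [
        f"You are independent examiner {examiner}.",   # 1. role statement
        f"Skill under evaluation:\n{skill_info}",      # 2. skill information
        f"Evaluation materials:\n{materials}",         # 3. complete file contents
        f"Evaluation criteria:\n{criteria}",           # 4. the 8 dimension definitions
        # 5. output format (assumed wording; the real text comes from the protocol file)
        f"Output format: include a line reading {MARKER} before your score summary.",
    ]
    return "\n\n".join(parts)

# Placeholder inputs for illustration only.
files = {"SKILL.md": "# demo skill"}
criteria = "(8 dimension definitions quoted from evaluation-protocol.md)"
tasks = {name: build_task(name, "demo-skill v0.1.0", files, criteria) for name in "ABC"}
```

Building all three task strings first makes it straightforward to issue the three spawns back to back instead of waiting for each examiner in turn.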

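Step 2.5 (optional): Follow up on disagreements

When aggregating scores, if the highest score minus the lowest score on any dimension is ≥ 3, spawn one follow-up sub-agent to analyze that dimension specifically:

You are the follow-up examiner for Skill Evaluator. Regarding the "Security" dimension of skill xxx:
Examiner A (9 points) reasoning: ...
Examiner B (4 points) reasoning: ...

Analyze the disagreement between the two: whose argument is stronger? Is there a blind spot that neither side noticed?

Add the follow-up result to the final report.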
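
The disagreement check reduces to a max-minus-min comparison per dimension. A minimal sketch, assuming scores are held as a nested dictionary (the function name and data layout are hypothetical, and the sample scores for examiner C and for the second dimension are made up):

```python
# Hypothetical helper: flags dimensions whose score spread reaches the threshold.
def find_disagreements(scores: dict[str, dict[str, int]], threshold: int = 3) -> list[str]:
    """scores maps dimension name -> {examiner: score}."""
    flagged = []
    for dimension, by_examiner in scores.items():
        values = list(by_examiner.values())
        # A spread needs at least two scores to be meaningful.
        if len(values) >= 2 and max(values) - min(values) >= threshold:
            flagged.append(dimension)
    return flagged

sample = {
    "Security": {"A": 9, "B": 4, "C": 7},    # spread 5, flagged
    "Robustness": {"A": 7, "B": 8, "C": 6},  # spread 2, not flagged
}
print(find_disagreements(sample))  # ['Security']
```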

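Step 3: Aggregate the results

From each sub-agent's output, extract the score summary (by parsing the section marked ===SCORE_SUMMARY===) and the detailed comments.

If a sub-agent did not finish or its output is malformed, mark it as N/A and note this in the report.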
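
A sketch of the extraction step with the N/A fallback. The exact layout of the summary section is defined in evaluation-protocol.md, which is not shown here, so this example assumes a JSON object follows the marker and runs to the end of the output; `extract_scores` is a hypothetical name:

```python
import json
from typing import Optional, Union

MARKER = "===SCORE_SUMMARY==="

# Hypothetical helper. Assumes a JSON object follows the marker (an assumption,
# since the real summary format lives in evaluation-protocol.md).
def extract_scores(output: Optional[str]) -> Union[dict, str]:
    """Return the parsed score dict, or "N/A" if the output is missing or malformed."""
    if not output or MARKER not in output:
        return "N/A"  # sub-agent did not finish or omitted the marker
    payload = output.split(MARKER, 1)[1].strip()
    try:
        scores = json.loads(payload)
    except json.JSONDecodeError:
        return "N/A"  # marker present but the summary is not parseable
    return scores if isinstance(scores, dict) else "N/A"
```

Returning a sentinel instead of raising keeps one failed examiner from blocking the report; the aggregator can still fill the other two columns and note the gap.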

Produce the summary output (strictly in the following structure):

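══════════════════════════════════════
  Skill Evaluation Report: <skill name> v<version>
══════════════════════════════════════

📊 Scores by dimension
┌──────────────────────────────┬────┬────┬────┬──────┐
│ Dimension                    │ A  │ B  │ C  │ Avg  │
├──────────────────────────────┼────┼────┼────┼──────┤
│ 1. Functional completeness   │    │    │    │      │
│ 2. Code quality              │    │    │    │      │
│ 3. Robustness                │    │    │    │      │
│ 4. Security                  │    │    │    │      │
│ 5. Documentation quality     │    │    │    │      │
│ 6. Dependency soundness      │    │    │    │      │
│ 7. Expected runtime behavior │    │    │    │      │
│ 8. Overall                   │    │    │    │      │
└──────────────────────────────┴────┴────┴────┴──────┘

Note: dimension average = (A+B+C)/3, rounded to one decimal place

🔍 Main points of disagreement

List the dimensions where the highest score minus the lowest score is ≥ 3 (if any), with each side's arguments and an analysis.

✅ Consensus strengths

Strengths explicitly mentioned by at least 2 examiners (quote key terms from their original text)

⚠️ Consensus issues

Issues explicitly raised by at least 2 examiners (quote key terms from their original text)

📝 Overall assessment

- Overall quality positioning
- The 1-2 points most worth improving
- Suggested rating: Recommended / Usable but with pitfalls / Not recommended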
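
The averaging rule in the report, (A+B+C)/3 to one decimal place, can be sketched as follows. How an N/A examiner affects the average is not specified, so averaging over the scores that are present is an assumption made for this example, as is the function name:

```python
from typing import Optional

# Hypothetical helper: formats one dimension's average for the report table.
def dimension_average(scores: list[Optional[float]]) -> str:
    """scores holds one entry per examiner; None stands for N/A."""
    present = [s for s in scores if s is not None]
    if not present:
        return "N/A"
    # Assumption: average only over examiners that returned a score.
    return f"{sum(present) / len(present):.1f}"

print(dimension_average([9, 4, 7]))     # 6.7
print(dimension_average([8, None, 7]))  # 7.5
```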
Usage Guidance
Install this only if you want a Chinese-language, multi-agent evaluator for other OpenClaw skills. Avoid using it on directories containing private notes, secrets, or unrelated project files, because it is designed to read and pass the target skill’s text files to evaluator sub-agents.
Capability Assessment
Purpose & Capability
The stated purpose is to evaluate a target skill, and the requested capabilities fit that purpose: read the target skill files, send them to three isolated evaluator agents, optionally ask a follow-up agent about large scoring disagreements, and aggregate a report.
Instruction Scope
The Chinese description is broad and conversational, which could cause accidental activation for users asking generally about skill evaluation; the body still clearly limits behavior to evaluating a target skill.
Install Mechanism
The package contains only Markdown instructions and one Markdown reference protocol, with no executable scripts, install hooks, declared dependencies, or static-scan findings.
Credentials
Reading all text files in the target skill and forwarding their contents to sub-agents is proportionate for a multi-agent evaluator, but users should only point it at skill directories they intend to review.
Persistence & Privilege
No credential use, privilege escalation, file mutation, deletion, network exfiltration, or background persistence is shown; the sub-agent spawning is finite and disclosed as part of the evaluation workflow.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install multi-agent-skill-evaluator
  3. After installation, invoke the skill by name or use /multi-agent-skill-evaluator
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
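v1.4.0
v1.4.0 - description trimmed to '帮我评估一下这个 skill' ("Help me evaluate this skill")
v1.3.0
v1.3.0 - description rewritten: opens with the user trigger scenario and supports two uses, human evaluation and agent self-check before download
v1.2.0
v1.2.0 - independent multi-agent evaluation: 3 sub-agents score separately, JSON structured output, and a disagreement follow-up mechanism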
Metadata
Slug multi-agent-skill-evaluator
Version 1.4.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 3
Frequently Asked Questions

What is Multi-Agent Skill Evaluator?

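Multi-Agent Skill Evaluator is an AI Agent Skill for Claude Code / OpenClaw that evaluates a target skill using three independent examiner sub-agents and aggregates their scores into a report. It has 62 downloads so far.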

How do I install Multi-Agent Skill Evaluator?

Run "/install multi-agent-skill-evaluator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Multi-Agent Skill Evaluator free?

Yes, Multi-Agent Skill Evaluator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Multi-Agent Skill Evaluator support?

Multi-Agent Skill Evaluator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Multi-Agent Skill Evaluator?

It is built and maintained by 54Lynnn (@54lynnn); the current version is v1.4.0.
