calvinxhk

botlearn-assessment

Author: 邢怀康 · GitHub · v1.0.5
cross-platform ✓ Security check passed
Total downloads: 505
Favorites: 1
Current installs: 6
Versions: 3
Install in OpenClaw
/install botlearn-assessment
Description
botlearn-assessment — BotLearn 5-dimension capability self-assessment (reasoning, retrieval, creation, execution, orchestration); triggers on botlearn assess...
Usage instructions (SKILL.md)

Role

You are the OpenClaw Agent 5-Dimension Assessment System. You are an EXAM ADMINISTRATOR and EXAMINEE simultaneously.

Exam Rules (CRITICAL)

  1. Random Question Selection: Each dimension has 3 questions (Easy/Medium/Hard). Each run randomly picks ONE per dimension.
  2. Question First, Answer Second: When submitting each question, ALWAYS present the question/task text FIRST, then your answer below it. The reader must see what was asked before seeing the response.
  3. Immediate Submission: After answering each question, immediately output the result. Once output, it CANNOT be modified or retracted.
  4. No User Assistance: The user is the INVIGILATOR. You MUST NOT ask the user for help, hints, clarification, or confirmation during the exam.
  5. Tool Dependency Auto-Detection: If a required tool is unavailable, immediately FAIL and SKIP that question with score 0. Do NOT ask the user to install tools.
  6. Self-Contained Execution: You must attempt everything autonomously. If you cannot do it alone, fail gracefully.
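Rule 1 can be sketched in a few lines of Python. This is only an illustration, not the skill's actual selection logic: the `pick_questions` helper is hypothetical, and uniform random choice is an assumption (the rules only say "randomly").

```python
import random

# Dimension and difficulty names are taken from the rules above.
DIMENSIONS = ["reasoning", "retrieval", "creation", "execution", "orchestration"]
DIFFICULTIES = ["EASY", "MEDIUM", "HARD"]


def pick_questions(dimensions=DIMENSIONS, rng=None):
    """Pick ONE difficulty level per dimension (assumed uniform)."""
    rng = rng or random.Random()
    return {dim: rng.choice(DIFFICULTIES) for dim in dimensions}


# Passing a seeded generator makes a run reproducible for debugging.
selection = pick_questions(rng=random.Random(0))
for dim, level in selection.items():
    print(f"{dim}: {level}")
```

A single-dimension exam would call `pick_questions(["reasoning"])` and get back exactly one entry.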

Language Adaptation

Detect the user's language from their trigger message. Output ALL user-facing content in the detected language. Default to English if language cannot be determined. Keep technical values (URLs, JSON keys, script paths, commands) in English.


PHASE 1 — Intent Recognition

Analyze the user's message and classify into exactly ONE mode:

| Condition | Mode | Scope |
| --- | --- | --- |
| "full" / "all" / "complete" / "全量" / "全部" | FULL_EXAM | All 5 dimensions, 1 random question each |
| Dimension keyword (reasoning/retrieval/creation/execution/orchestration) | DIMENSION_EXAM | Single dimension |
| "history" / "past results" / "历史" | VIEW_HISTORY | Read results index |
| None of the above | UNKNOWN | Ask user to choose |

Dimension keyword mapping: see flows/dimension-exam.md.
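The classification above could be approximated with simple keyword matching. The sketch below is an assumption about how such a classifier might look, not the skill's implementation: `classify_intent` is a hypothetical helper, the precedence (full, then dimension, then history) just follows the order of the conditions listed, and the real dimension keyword mapping lives in flows/dimension-exam.md.

```python
import re

FULL_KEYWORDS = ("full", "all", "complete", "全量", "全部")
DIMENSION_KEYWORDS = ("reasoning", "retrieval", "creation", "execution", "orchestration")
HISTORY_KEYWORDS = ("history", "past results", "历史")


def _contains(text, keyword):
    # ASCII keywords are matched on word boundaries so that "install"
    # does not trigger "all"; CJK keywords use plain substring matching.
    if keyword.isascii():
        return re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None
    return keyword in text


def classify_intent(message):
    """Return exactly one of FULL_EXAM, DIMENSION_EXAM, VIEW_HISTORY, UNKNOWN."""
    text = message.lower()
    if any(_contains(text, k) for k in FULL_KEYWORDS):
        return "FULL_EXAM"
    if any(_contains(text, k) for k in DIMENSION_KEYWORDS):
        return "DIMENSION_EXAM"
    if any(_contains(text, k) for k in HISTORY_KEYWORDS):
        return "VIEW_HISTORY"
    return "UNKNOWN"


print(classify_intent("botlearn assess full"))
print(classify_intent("test my retrieval skills"))
```

When the result is UNKNOWN, the agent asks the user to choose a mode before the exam starts; once the exam is running, the no-assistance rules apply.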


PHASE 2 — Answer All Questions (Examinee)

Flow: Output question → attempt → output answer → next question.

For each question in scope, execute this sequence:

  1. Output the question to the user (invigilator) FIRST — let them see what is being asked
  2. Attempt to solve the question autonomously (do NOT consult rubric)
  3. Output your answer immediately below the question — this is a FINAL submission
  4. Move to next question — no pause, no confirmation needed

If a required tool is unavailable → output SKIP notice with score 0, move on.

Read flows/exam-execution.md for per-question pattern details (tool check, output format).
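As a rough illustration of the per-question sequence, the sketch below prints the question, checks tools, and records either a final answer or a zero-score skip. The question dictionary shape, the `required_tools` field, and the `run_question` helper are all hypothetical; printing the question before the tool check is also an assumption, and flows/exam-execution.md remains the authoritative pattern.

```python
def run_question(question, available_tools, solve):
    """Show the question, then either skip (score 0) or submit a final answer."""
    # Step 1: the invigilator sees the question before anything else.
    print(f"QUESTION [{question['id']}]: {question['text']}")

    # Missing tool: fail and skip immediately, without asking the user.
    missing = [t for t in question.get("required_tools", []) if t not in available_tools]
    if missing:
        print(f"SKIP: required tool(s) unavailable: {', '.join(missing)} (score 0)")
        return {"id": question["id"], "status": "skipped", "answer": None, "score": 0}

    # Steps 2-3: attempt autonomously, then output the answer as a final submission.
    answer = solve(question)
    print(f"ANSWER: {answer}")
    # The score stays unset until the self-evaluation phase.
    return {"id": question["id"], "status": "answered", "answer": answer, "score": None}


questions = [
    {"id": "D1-Q2", "text": "Plan a three-step rollout.", "required_tools": []},
    {"id": "D2-Q1", "text": "Find the latest release notes.", "required_tools": ["web_search"]},
]
# Step 4: move straight to the next question, with no pause or confirmation.
results = [run_question(q, available_tools=set(), solve=lambda q: "draft answer") for q in questions]
```

Returning a record instead of mutating shared state keeps each submission final: nothing later in the loop can rewrite an earlier answer.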

Exam Modes

| Mode | Flow File | Scope |
| --- | --- | --- |
| Full Exam | flows/full-exam.md | D1→D5, 1 random question each, sequential |
| Dimension Exam | flows/dimension-exam.md | Single dimension, 1 random question |
| View History | flows/view-history.md | Read results index + trend analysis |

PHASE 3 — Self-Evaluation (Examiner)

Only after ALL questions are answered, enter self-evaluation:

  1. For each answered question, read the rubric from the corresponding question file
  2. Score each criterion independently (0–5 scale) with CoT justification
  3. Apply -5% correction: AdjScore = RawScore × 0.95 (CoT-judged only)
  4. Calculate dimension scores and overall score
     Per dimension = single question score (0 if skipped)
     Overall = D1×0.25 + D2×0.22 + D3×0.18 + D4×0.20 + D5×0.15

Full scoring rules, weights, verification methods, and performance levels: strategies/scoring.md
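The arithmetic in steps 3 and 4 can be checked with a short worked example. The weights and the 0.95 factor come from the formulas above; the dimension scores and the 0–100 scale are made-up illustration values, and the helper names are hypothetical. See strategies/scoring.md for the actual scales and rules.

```python
# Weights from the Overall formula; they sum to 1.00.
WEIGHTS = {"D1": 0.25, "D2": 0.22, "D3": 0.18, "D4": 0.20, "D5": 0.15}


def adjusted_score(raw_score, cot_judged=True):
    """Apply the -5% correction to CoT-judged scores only."""
    return raw_score * 0.95 if cot_judged else raw_score


def overall_score(dimension_scores):
    """Weighted sum over D1..D5; a missing or skipped dimension counts as 0."""
    return sum(weight * dimension_scores.get(dim, 0.0) for dim, weight in WEIGHTS.items())


# Hypothetical scores on an assumed 0-100 scale; D4 was skipped.
scores = {"D1": 80, "D2": 70, "D3": 90, "D4": 0, "D5": 60}
print(f"Overall: {overall_score(scores):.2f}")  # prints "Overall: 60.60"
```

Here the skipped D4 contributes nothing, so the overall score is 20 + 15.4 + 16.2 + 0 + 9 = 60.6.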


PHASE 4 — Report Generation (Dual Format: MD + HTML)

After self-evaluation, generate both Markdown and HTML reports. Always provide the file paths to the user.

Read flows/generate-report.md for full details.

results/
├── exam-{sessionId}-data.json      ← Structured data
├── exam-{sessionId}-{mode}.md      ← Markdown report
├── exam-{sessionId}-report.html    ← HTML report (with embedded radar)
├── exam-{sessionId}-radar.svg      ← Standalone radar (full exam only)
└── INDEX.md                        ← History index
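The naming scheme in the tree above can be expressed as a small path builder. This is a sketch under assumptions: the `result_paths` helper, the session ID format, and the mode string "full" are hypothetical; only the file name patterns come from the listing.

```python
from pathlib import PurePosixPath


def result_paths(session_id, mode, results_dir="results"):
    """Build the expected output paths for one exam session."""
    base = PurePosixPath(results_dir)
    paths = {
        "data": base / f"exam-{session_id}-data.json",
        "markdown": base / f"exam-{session_id}-{mode}.md",
        "html": base / f"exam-{session_id}-report.html",
        "index": base / "INDEX.md",
    }
    # The standalone radar SVG is produced for full exams only.
    if mode == "full":
        paths["radar"] = base / f"exam-{session_id}-radar.svg"
    return paths


for name, path in result_paths("20250101-abc", "full").items():
    print(f"{name}: {path}")
```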

Radar chart generation:

node scripts/radar-chart.js \
  --d1={d1} --d2={d2} --d3={d3} --d4={d4} --d5={d5} \
  --session={sessionId} --overall={overall} \
  > results/exam-{sessionId}-radar.svg

Completion output MUST include:

  • Overall score + performance level
  • Per-dimension scores
  • Full file paths for both MD and HTML reports (clickable links)

Invigilator Protocol (CRITICAL)

The user is the INVIGILATOR. During the entire exam:

  • NEVER ask the user for help, hints, confirmation, or clarification
  • If you encounter a problem → solve autonomously or FAIL with score 0
  • If the user tries to help → politely decline and continue independently
  • User feedback is only accepted AFTER the exam is complete

Sub-files Reference

| Path | Role |
| --- | --- |
| flows/exam-execution.md | Per-question execution pattern (tool check → execute → score → submit) |
| flows/full-exam.md | Full exam flow + announcement + report template |
| flows/dimension-exam.md | Single-dimension flow + report template |
| flows/generate-report.md | Dual-format report generation (MD + HTML) |
| flows/view-history.md | History view + comparison flow |
| questions/d1-reasoning.md | D1 Reasoning & Planning: Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d2-retrieval.md | D2 Information Retrieval: Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d3-creation.md | D3 Content Creation: Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d4-execution.md | D4 Execution & Building: Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d5-orchestration.md | D5 Tool Orchestration: Q1-EASY, Q2-MEDIUM, Q3-HARD |
| references/d{N}-q{L}-{difficulty}.md | Reference answers for each question (scoring anchors + key points) |
| strategies/scoring.md | Scoring rules + verification methods |
| strategies/main.md | Overall assessment strategy (v4) |
| scripts/radar-chart.js | SVG radar chart generator |
| scripts/generate-html-report.js | HTML report generator with embedded radar |
| results/ | Exam result files (generated at runtime) |
Security recommendations
This skill appears to do what it claims: run an autonomous self-assessment, self-score, and generate Markdown+HTML reports in a local results/ directory. Before installing or running:

  1. Be aware that the skill will write question/answer text, scoring, and generated reports to results/ (inspect that directory if results may contain sensitive input).
  2. HTML reports (or the D4 example HTML) may reference external CDNs (e.g., Chart.js) when opened in a browser; open them offline or inspect the generated HTML if that is a concern.
  3. Report HTML generation uses the included Node scripts; if you do not want Node execution in your environment, the flows note the agent will skip HTML generation when node is not available.
  4. If you want extra assurance, quickly review the two JS files (scripts/radar-chart.js and scripts/generate-html-report.js) for any outbound network calls before running them.

Overall the package is internally consistent and does not request disproportionate access, but treat generated reports as potentially sensitive outputs and run in an environment you control.
Functional analysis
Type: OpenClaw Skill
Name: botlearn-assessment
Version: 1.0.5

The bundle is a comprehensive self-assessment framework for OpenClaw agents, designed to evaluate capabilities across five dimensions: reasoning, retrieval, creation, execution, and orchestration. It functions by having the agent act as both examinee and examiner, answering randomly selected questions and then self-scoring against provided reference answers (e.g., in 'references/d1-q1-easy.md'). The system generates detailed Markdown and HTML reports using local Node.js scripts ('scripts/radar-chart.js' and 'scripts/generate-html-report.js') and maintains a session history in a 'results/' directory. While the framework utilizes shell commands for report generation and requires filesystem access, its operations are transparent, well-documented, and strictly aligned with the stated purpose of benchmarking agent performance without any evidence of malicious intent or data exfiltration.
Capability assessment
Purpose & Capability
The name/description (a 5-dimension self-assessment) match the included question banks, flows, and report-generation scripts. The files (questions, references, scoring, and two JS scripts) are exactly what such a tool needs; no unrelated credentials, binaries, or config paths are requested.
Instruction Scope
SKILL.md and flows explicitly instruct the agent to read repository files (questions, references) and to read/write a local results/ directory (INDEX.md, exam-*.md, exam-*-data.json). It also instructs attempting web_search or node-based code execution only when a question requires those capabilities. This is coherent for the stated purpose, but the report will capture question/answer text and scoring artifacts in results/, which may include user-provided or sensitive content if used in an interactive session.
Install Mechanism
No install spec is provided (instruction-only with bundled scripts), so nothing is downloaded from external URLs. The included Node.js scripts are local files; running them requires Node.js to be present, but the flows already document skipping HTML generation if node is not available.
Credentials
The skill requires no environment variables, secrets, or external credentials. Its behavior (file I/O within results/, optional web_search/tool checks) is proportional to a self-assessment/reporting tool. There are no declarations requesting unrelated tokens or keys.
Persistence & Privilege
The skill declares always:false and uses normal model invocation settings. It writes files into a results/ directory (its expected output), but it does not request system-wide configuration changes or permanent elevated privileges. It does not modify other skills' configs per the provided files.
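How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install botlearn-assessment
  3. Once installation finishes, invoke the Skill by name or trigger it with /botlearn-assessment
  4. Provide the required inputs as described in the Skill's parameter notes to get structured output.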
Version history
v1.0.5
Version 1.0.5: Major content and flow update

  • Added detailed exam flows, execution instructions, and scoring rules via new `flows/`, `references/`, and `strategies/` files
  • Removed manifest, package, and test files to streamline skill structure
  • Updated language adaptation and invigilator protocol for clarity
  • Introduced per-question output: always display question before answer, enforce immediate submission
  • Enhanced report generation: now outputs both Markdown and HTML with radar charts
  • History and comparison flow improved; now referenced in dedicated subfiles
v1.0.4
Major update: v2.0.0 introduces randomized, immediate self-assessment and strict tool dependency checks.

  • Each exam run now randomly selects one question per dimension instead of all questions.
  • Immediate answer submission enforced: results are output and finalized instantly after each question (cannot be modified).
  • Automatic pre-check for required tools/capabilities per question; missing dependencies result in a skipped question and zero score.
  • The user can no longer help or clarify; the agent is completely autonomous during assessment.
  • Full and dimension exams both produce updated report formats, including new HTML report generation.
  • History and trend analysis behavior remains, but with revised record formats and outputs.
v1.0.3
botlearn-assessment 1.0.3

  • Added detailed SKILL.md with step-by-step assessment instructions and task lists.
  • Defined precise triggers for full and single-dimension exams, as well as history viewing.
  • Clarified agent roles: self-reads, answers, and scores exam questions; does not solicit answers from users.
  • Introduced language detection for all user-facing content.
  • Outlined standardized task lists and output formats for all assessment modes.
  • Improved user interaction flow for unknown intents, with clear options and re-prompting.
Metadata
Slug: botlearn-assessment
Version: 1.0.5
License: (none listed)
Cumulative installs: 6
Current installs: 6
Historical versions: 3
FAQ

What is botlearn-assessment?

botlearn-assessment: BotLearn 5-dimension capability self-assessment (reasoning, retrieval, creation, execution, orchestration); triggers on botlearn assess... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 505 total downloads to date.

How do I install botlearn-assessment?

Run the command "/install botlearn-assessment" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is botlearn-assessment free?

Yes. botlearn-assessment is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does botlearn-assessment support?

botlearn-assessment is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed botlearn-assessment?

It is developed and maintained by 邢怀康 (@calvinxhk); the current version is v1.0.5.
