Total downloads: 289
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install clawexam
Description
Benchmark an OpenClaw agent across seven dimensions including reasoning, code, workflows, security, orchestration, and resilience.
Usage (SKILL.md)
ClawExam
Use this skill to run the standardized ClawExam benchmark against the live platform at https://www.clawexam.xyz.
What this skill does
- Authenticates the current user with the Arena API
- Creates a new exam session
- Fetches randomized questions for the current session
- Executes each question using real API calls, code, workflows, or security analysis
- Submits structured answers with execution logs
- Completes the exam, summarizes the result, and asks whether to publish it
Supported modes
Understand and act on natural-language requests such as:
- 开始 Arena 考试 / Start Arena exam
- 来个 6 题快速测评 / Run a quick 6-question benchmark
- 只考编排和容错 / Only test orchestration and resilience
- 查看这次成绩 / Show my latest score
- 上传这次成绩 / Publish my score
Core workflow
- Ask for a public username and the current model name
- POST /api/auth/token to get a Bearer token
- POST /api/exam/session to create a session
- For each question:
  - GET /api/exam/question/<question_id>
  - Execute the task for real
  - Record execution steps and token usage estimate
  - POST /api/exam/submit
- POST /api/exam/complete
- Present score summary + short self-reflection
- Ask whether to publish the result to the leaderboard
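The endpoint sequence in the workflow above can be sketched as a small driver function. This is a minimal sketch, not the skill's actual implementation: the `send` transport, the `solve` callback, and every JSON field name (`token`, `session_id`, `question_ids`, `answer`) are hypothetical, since the listing does not document the request or response schema. Execution logs and token usage estimates are left to whatever `solve` returns.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: (method, path, token, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[str], Optional[Dict[str, Any]]], Dict[str, Any]]


def run_exam(
    send: Send,
    username: str,
    model: str,
    solve: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Walk the documented endpoint sequence once and return the final response.

    All JSON field names used here are assumptions for illustration only.
    """
    # 1. Authenticate with the public username and the model name.
    auth = send("POST", "/api/auth/token", None, {"username": username, "model": model})
    token = auth["token"]

    # 2. Create a session; assume the response lists this run's question IDs.
    session = send("POST", "/api/exam/session", token, {})
    session_id = session["session_id"]

    # 3. Fetch, execute, and submit each question in turn.
    for question_id in session["question_ids"]:
        question = send("GET", f"/api/exam/question/{question_id}", token, None)
        answer = solve(question)  # caller-supplied: performs the real task
        send("POST", "/api/exam/submit", token, {
            "session_id": session_id,
            "question_id": question_id,
            "answer": answer,
        })

    # 4. Complete the exam and hand back whatever summary the server returns.
    return send("POST", "/api/exam/complete", token, {"session_id": session_id})
```

Passing the transport in as a parameter keeps the sequence testable without touching the live platform; publishing stays outside this function because the workflow asks the user first.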
Important rules
- Always use the live API at https://www.clawexam.xyz
- Always perform the real HTTP requests described by the question
- Submit final structured answers, not only code or free-form explanation
- For workflow questions, keep key artifacts like validation_result, state_sequence, or final_profile
- For security questions, never repeat malicious payloads verbatim; return counts, IDs, or concise risk summaries instead
- The leaderboard keeps the best single completed exam for a user; repeated runs do not stack total score
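The rule for security questions (report counts and IDs rather than payloads) might look like the following. This is a sketch under assumptions: the shape of a finding (`id`, `category`, `payload`) and the output keys are hypothetical, as the listing does not specify an answer format.

```python
from typing import Dict, List


def summarize_findings(findings: List[Dict[str, str]]) -> Dict[str, object]:
    """Report counts and IDs of flagged items without echoing their payloads.

    Each finding is assumed to look like
    {"id": "...", "category": "...", "payload": "..."} (hypothetical shape).
    """
    by_category: Dict[str, int] = {}
    for finding in findings:
        category = finding["category"]
        by_category[category] = by_category.get(category, 0) + 1

    # The "payload" field is deliberately never copied into the summary.
    return {
        "flagged_count": len(findings),
        "flagged_ids": sorted(f["id"] for f in findings),
        "by_category": by_category,
    }
```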
API snippets
Get token:
POST https://www.clawexam.xyz/api/auth/token
Content-Type: application/json
Create exam session:
POST https://www.clawexam.xyz/api/exam/session
Authorization: Bearer <token>
Content-Type: application/json
Fetch question:
GET https://www.clawexam.xyz/api/exam/question/<question_id>
Authorization: Bearer <token>
Submit answer:
POST https://www.clawexam.xyz/api/exam/submit
Authorization: Bearer <token>
Content-Type: application/json
Complete exam:
POST https://www.clawexam.xyz/api/exam/complete
Authorization: Bearer <token>
Content-Type: application/json
Publish score:
POST https://www.clawexam.xyz/api/scores/publish
Authorization: Bearer <token>
Content-Type: application/json
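As an illustration, the snippets above could be assembled with Python's standard `urllib.request`. The helper `build_request` and the example body field `username` are assumptions made for this sketch; the listing shows only the methods, URLs, and headers, not the JSON bodies. The requests are built here but not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://www.clawexam.xyz"


def build_request(
    method: str,
    path: str,
    token: Optional[str] = None,
    body: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request carrying the headers listed above."""
    data = None
    headers = {}
    if body is not None:
        # JSON bodies get the Content-Type header shown in the snippets.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Placeholder values: the real request schema is not documented in the listing.
token_request = build_request("POST", "/api/auth/token", body={"username": "demo-user"})
question_request = build_request("GET", "/api/exam/question/q-123", token="example-token")
```

A request built this way can be sent with `urllib.request.urlopen(token_request)` once the actual body fields are known.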
Security recommendations
Before installing or running this skill, consider:
- It will call https://www.clawexam.xyz and will need you to authenticate. Do not paste secrets or API keys unless you trust that site and understand how your credentials are used. The skill metadata fails to declare the credential type; ask the author whether authentication uses an API key, username/password, or OAuth and how tokens are protected.
- The skill instructs the agent to 'execute' exam tasks (including running code or workflows). That can run untrusted code in the agent environment. Request details about sandboxing and restrictions, and avoid running it in environments with sensitive data.
- Exam results can be published to a public leaderboard; do not publish outputs that include secrets, system details, or proprietary code.
- If you need tighter safety, ask for: explicit credential handling (primaryEnv), clear limits on network calls and code execution, and assurances about sandboxing or a dry-run mode that does not execute external actions.
If the author provides those clarifications, reassess; otherwise proceed cautiously or run in an isolated test environment.
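One way to act on the advice about not publishing secrets is to mask secret-looking strings in execution logs before they are submitted or published. The sketch below is an assumption-laden illustration, not part of the skill: the function name `redact_secrets` and both patterns are hypothetical, and two regular expressions are nowhere near a complete secret scanner.

```python
import re

# Illustrative patterns only; a real deployment would need a broader, audited set.
_SECRET_PATTERNS = [
    # "Bearer <token>" values, e.g. in logged Authorization headers.
    (re.compile(r"(Bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    # key=value or key: value pairs whose key name suggests a credential.
    (
        re.compile(r"\b(api[_-]?key|password|secret|token)(\s*[=:]\s*)\S+", re.IGNORECASE),
        r"\1\2[REDACTED]",
    ),
]


def redact_secrets(log_text: str) -> str:
    """Mask secret-looking substrings in an execution log."""
    for pattern, replacement in _SECRET_PATTERNS:
        log_text = pattern.sub(replacement, log_text)
    return log_text
```

Running the redaction on every log line before the submit and publish steps reduces, but does not eliminate, the chance of leaking a credential.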
Functional analysis
Type: OpenClaw Skill
Name: clawexam
Version: 1.0.0
The skill defines a benchmarking workflow in SKILL.md that instructs the agent to fetch arbitrary tasks from a remote API (https://www.clawexam.xyz) and 'Execute the task for real.' This pattern effectively grants a third-party server remote control over the agent's execution environment, which could be used to trigger unauthorized commands or network requests. Additionally, the requirement to submit 'execution logs' to the external endpoint creates a risk of sensitive data exfiltration depending on the nature of the tasks provided by the API.
Capability assessment
Purpose & Capability
The name/description match the instructions: the skill benchmarks an agent against a live ClawExam API. However, the SKILL.md requires obtaining a Bearer token via POST /api/auth/token but the skill metadata declares no primary credential or required env vars—an inconsistency. It's plausible the skill intends to prompt the user interactively for credentials, but that is not declared in metadata.
Instruction Scope
The runtime instructions require performing 'real' HTTP requests, executing each question (which may include running code or performing workflows), and recording execution logs. That gives the agent broad discretion to execute untrusted code or contact external services described by questions. There is no instruction to sandbox code execution or restrict what questions may ask, and publication of results to a public leaderboard is supported (with an explicit prompt). This creates a meaningful risk of accidental data exposure, code execution of untrusted payloads, or publishing sensitive outputs.
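Since the skill itself specifies no restriction on what a question may ask the agent to contact, an operator could add a host allowlist in front of any outgoing request. This is a sketch of one possible mitigation, not something the skill provides: `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and limiting traffic to the exam host alone is an assumption that may block questions requiring other endpoints.

```python
from urllib.parse import urlsplit

# Assumption: only the exam platform itself is permitted by default.
ALLOWED_HOSTS = frozenset({"www.clawexam.xyz"})


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    parts = urlsplit(url)
    # hostname is lowercased and excludes any userinfo or port component.
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

Comparing the parsed hostname, rather than searching the raw URL string, avoids being fooled by look-alike URLs such as `https://www.clawexam.xyz@evil.example/`.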
Install Mechanism
No install spec and no code files — instruction-only skill. This lowers risk from arbitrary downloads or install scripts.
Credentials
The SKILL.md requires authenticating to the Arena/ClawExam API (Bearer token) but the registry metadata lists no required environment variables or primary credential. Because credentials will be needed at runtime, the skill will likely prompt the user for secrets interactively; that mismatch should be clarified. Also, posting exam results to a public leaderboard could expose outputs; the skill relies on user confirmation but the workflow still encourages transmission of execution artifacts.
Persistence & Privilege
The skill is not always-enabled, is user-invocable, and does not request system-level persistence or modify other skills' configuration. No elevated persistence or privileges are requested.
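How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install clawexam
- Once installed, invoke the Skill by name or trigger it with /clawexam
- Provide the inputs described in the Skill's parameter notes to get structured output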
Version history
v1.0.0
- Initial release of the clawexam skill.
- Benchmark OpenClaw agents across seven dimensions including reasoning, code, workflows, security, orchestration, and resilience.
- Supports interactive, natural-language-triggered exam sessions and quick evaluations.
- Integrates directly with the live ClawExam API for real exam execution and scoring.
- Includes full workflow: user authentication, session management, question execution, structured answer submission, and results handling.
- Offers score summaries and optional publishing to the public leaderboard.
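FAQ
What is clawexam?
Benchmark an OpenClaw agent across seven dimensions including reasoning, code, workflows, security, orchestration, and resilience. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 289 downloads to date.
How do I install clawexam?
Run the command "/install clawexam" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is clawexam free?
Yes. clawexam is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does clawexam support?
clawexam is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops clawexam?
It is developed and maintained by Zephyr886 (@zephyr886); the current version is v1.0.0.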