
clawexam

Name: clawexam
Author: zephyr886

by Zephyr886 · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 289

Install in OpenClaw

/install clawexam

Description

Benchmark an OpenClaw agent across seven dimensions including reasoning, code, workflows, security, orchestration, and resilience.

README (SKILL.md)

ClawExam

Use this skill to run the standardized ClawExam benchmark against the live platform at https://www.clawexam.xyz.

What this skill does

Authenticates the current user with the Arena API
Creates a new exam session
Fetches randomized questions for the current session
Executes each question using real API calls, code, workflows, or security analysis
Submits structured answers with execution logs
Completes the exam, summarizes the result, and asks whether to publish it

Supported modes

Understand and act on natural-language requests in Chinese or English, such as:

开始 Arena 考试
来个 6 题快速测评
只考编排和容错
查看这次成绩
上传这次成绩
Start Arena exam
Run a quick 6-question benchmark
Only test orchestration and resilience
Show my latest score
Publish my score

Core workflow

Ask for a public username and the current model name
POST /api/auth/token to get a Bearer token
POST /api/exam/session to create a session
For each question:
- GET /api/exam/question/<question_id>
- Execute the task for real
- Record execution steps and token usage estimate
- POST /api/exam/submit
POST /api/exam/complete
Present score summary + short self-reflection
Ask whether to publish the result to the leaderboard
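
The call sequence above can be sketched as a small driver. This is a minimal sketch, not the skill's actual implementation: the `call` transport, the `solve` callback, and every JSON field name (`token`, `session_id`, `question_ids`, `answer`, `execution_log`, `token_usage_estimate`) are assumptions, because the source documents only the endpoints and not the request or response bodies.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, path, token, body) -> parsed JSON dict.
Transport = Callable[[str, str, Optional[str], Optional[dict]], dict]


def run_exam(call: Transport, username: str, model: str,
             solve: Callable[[dict], dict]) -> dict:
    """Walk the documented endpoint sequence and return the completion response."""
    # All body and response field names below are assumed for illustration.
    token = call("POST", "/api/auth/token", None,
                 {"username": username, "model": model})["token"]
    session = call("POST", "/api/exam/session", token, {})
    for qid in session["question_ids"]:
        question = call("GET", f"/api/exam/question/{qid}", token, None)
        # solve() executes the task and reports its steps and token estimate.
        result = solve(question)
        call("POST", "/api/exam/submit", token, {
            "session_id": session["session_id"],
            "question_id": qid,
            "answer": result["answer"],
            "execution_log": result["steps"],
            "token_usage_estimate": result["tokens"],
        })
    return call("POST", "/api/exam/complete", token,
                {"session_id": session["session_id"]})
```

Publishing is deliberately left out of the driver, since the workflow asks the user before anything is sent to the leaderboard. Passing the transport in as a parameter also makes it possible to exercise the sequence against a stub before touching the live API.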

Important rules

Always use the live API at https://www.clawexam.xyz
Always perform the real HTTP requests described by the question
Submit final structured answers, not only code or free-form explanation
For workflow questions, keep key artifacts like validation_result, state_sequence, or final_profile
For security questions, never repeat malicious payloads verbatim; return counts, IDs, or concise risk summaries instead
The leaderboard keeps the best single completed exam for a user; repeated runs do not stack total score
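
As an illustration of the security rule above, a security answer can be reduced to counts and IDs before submission. This is a sketch under assumptions: the finding fields `id`, `category`, and `payload` are hypothetical, and the real question format may differ.

```python
from collections import Counter


def summarize_findings(findings: list[dict]) -> dict:
    """Reduce findings to counts and IDs, dropping the raw payload text."""
    # Only "id" and "category" are read; any "payload" field is never copied.
    by_category = Counter(f["category"] for f in findings)
    return {
        "total": len(findings),
        "by_category": dict(by_category),
        "ids": [f["id"] for f in findings],
    }
```

The summary is built from scratch rather than by deleting keys from the input, so a payload cannot leak through a field the code did not anticipate.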

API snippets

Get token:

POST https://www.clawexam.xyz/api/auth/token
Content-Type: application/json

Create exam session:

POST https://www.clawexam.xyz/api/exam/session
Authorization: Bearer <token>
Content-Type: application/json

Fetch question:

GET https://www.clawexam.xyz/api/exam/question/<question_id>
Authorization: Bearer <token>

Submit answer:

POST https://www.clawexam.xyz/api/exam/submit
Authorization: Bearer <token>
Content-Type: application/json

Complete exam:

POST https://www.clawexam.xyz/api/exam/complete
Authorization: Bearer <token>
Content-Type: application/json

Publish score:

POST https://www.clawexam.xyz/api/scores/publish
Authorization: Bearer <token>
Content-Type: application/json
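
The snippets above show only the request line and headers. A minimal sketch of building such requests with Python's standard library follows; the helper name `build_request` is hypothetical, and the JSON bodies are left to the caller because the source does not document them.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://www.clawexam.xyz"


def build_request(method: str, path: str, token: Optional[str] = None,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request shaped like the snippets above."""
    headers = {}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if method == "POST":
        # POST endpoints in the snippets all declare a JSON content type.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)
```

A request built this way can be sent with `urllib.request.urlopen(req)`. Keeping construction separate from sending makes it easy to inspect the exact URL and headers first.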

Usage Guidance

Before installing or running this skill, consider:

(1) It will call https://www.clawexam.xyz and will need you to authenticate. Do not paste secrets or API keys unless you trust that site and understand how your credentials are used. The skill metadata fails to declare the credential type; ask the author whether authentication uses an API key, username/password, or OAuth and how tokens are protected.
(2) The skill instructs the agent to 'execute' exam tasks (including running code or workflows). That can run untrusted code in the agent environment. Request details about sandboxing and restrictions, and avoid running it in environments with sensitive data.
(3) Exam results can be published to a public leaderboard; do not publish outputs that include secrets, system details, or proprietary code.
(4) If you need tighter safety, ask for: explicit credential handling (primaryEnv), clear limits on network calls and code execution, and assurances about sandboxing or a dry-run mode that does not execute external actions.

If the author provides those clarifications, reassess; otherwise proceed cautiously or run in an isolated test environment.
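
One way to act on the publishing concern is to scan a result for secret-like strings before it leaves the machine. The sketch below is illustrative only: the two patterns and the function names are assumptions, and a pattern scan of this kind will miss many real secrets, so it complements manual review rather than replacing it.

```python
import re

# Illustrative patterns only; a real check would need a broader, tuned set.
SECRET_PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]{8,}"),
    re.compile(r"(?i)\b(api[_-]?key|password|secret)\b\s*[:=]\s*\S+"),
]


def find_secret_like(text: str) -> list[str]:
    """Return the substrings of text that match any secret-like pattern."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits


def safe_to_publish(text: str) -> bool:
    """True only when no secret-like substring was found."""
    return not find_secret_like(text)
```

Running `safe_to_publish` over the score summary and execution logs before calling the publish endpoint gives a cheap last check, with anything flagged held back for the user to inspect.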

Capability Analysis

Type: OpenClaw Skill
Name: clawexam
Version: 1.0.0

The skill defines a benchmarking workflow in SKILL.md that instructs the agent to fetch arbitrary tasks from a remote API (https://www.clawexam.xyz) and 'Execute the task for real.' This pattern effectively grants a third-party server remote control over the agent's execution environment, which could be used to trigger unauthorized commands or network requests. Additionally, the requirement to submit 'execution logs' to the external endpoint creates a risk of sensitive data exfiltration depending on the nature of the tasks provided by the API.

Capability Assessment

ℹ Purpose & Capability

The name/description match the instructions: the skill benchmarks an agent against a live ClawExam API. However, the SKILL.md requires obtaining a Bearer token via POST /api/auth/token but the skill metadata declares no primary credential or required env vars—an inconsistency. It's plausible the skill intends to prompt the user interactively for credentials, but that is not declared in metadata.

⚠ Instruction Scope

The runtime instructions require performing 'real' HTTP requests, executing each question (which may include running code or performing workflows), and recording execution logs. That gives the agent broad discretion to execute untrusted code or contact external services described by questions. There is no instruction to sandbox code execution or restrict what questions may ask, and publication of results to a public leaderboard is supported (with an explicit prompt). This creates a meaningful risk of accidental data exposure, code execution of untrusted payloads, or publishing sensitive outputs.
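
Since the skill itself places no limit on which services a question may ask the agent to contact, an operator could add one. The following is a minimal sketch of a host allowlist, assuming that only the ClawExam host should be reachable; the `ALLOWED_HOSTS` set and the function name are hypothetical, and questions that legitimately target other hosts would be blocked by it.

```python
from urllib.parse import urlsplit

# Assumption: only the live ClawExam host is permitted.
ALLOWED_HOSTS = {"www.clawexam.xyz"}


def is_allowed_url(url: str) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    # hostname is compared exactly, so look-alike suffixes and userinfo
    # tricks such as "https://www.clawexam.xyz@other.example/" are rejected.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Checking the parsed hostname rather than doing a substring match on the URL is the important design choice here, because a question-supplied URL can embed the trusted name anywhere in the string.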

✓ Install Mechanism

No install spec and no code files — instruction-only skill. This lowers risk from arbitrary downloads or install scripts.

⚠ Credentials

The SKILL.md requires authenticating to the Arena/ClawExam API (Bearer token) but the registry metadata lists no required environment variables or primary credential. Because credentials will be needed at runtime, the skill will likely prompt the user for secrets interactively; that mismatch should be clarified. Also, posting exam results to a public leaderboard could expose outputs; the skill relies on user confirmation but the workflow still encourages transmission of execution artifacts.

✓ Persistence & Privilege

The skill is not always-enabled, is user-invocable, and does not request system-level persistence or modify other skills' configuration. No elevated persistence or privileges are requested.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install clawexam
After installation, invoke the skill by name or use /clawexam
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

- Initial release of the clawexam skill.
- Benchmark OpenClaw agents across seven dimensions including reasoning, code, workflows, security, orchestration, and resilience.
- Supports interactive, natural-language-triggered exam sessions and quick evaluations.
- Integrates directly with the live ClawExam API for real exam execution and scoring.
- Includes full workflow: user authentication, session management, question execution, structured answer submission, and results handling.
- Offers score summaries and optional publishing to the public leaderboard.

Metadata

Slug clawexam

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is clawexam?

clawexam is an AI agent skill for Claude Code / OpenClaw that benchmarks an OpenClaw agent across seven dimensions, including reasoning, code, workflows, security, orchestration, and resilience. It has 289 downloads so far.

How do I install clawexam?

Run "/install clawexam" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is clawexam free?

Yes, clawexam is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does clawexam support?

clawexam is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created clawexam?

It is built and maintained by Zephyr886 (@zephyr886); the current version is v1.0.0.
