Evalpal
/install evalpal
EvalPal Skill
Run AI agent evaluations inline. Trigger eval runs, poll for results, and list available evaluation definitions — all from chat.
Prerequisites
Set the following environment variables in your OpenClaw skill configuration:
| Variable | Required | Description |
|---|---|---|
| EVALPAL_API_KEY | Yes | Your EvalPal API key (starts with sk_) |
| EVALPAL_API_URL | No | Base URL (defaults to https://evalpal.dev) |
Get your API key from Settings → API Keys at evalpal.dev.
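The variable rules above can be checked before any API call is made. The following is a minimal sketch, not taken from the skill's own scripts: the function name `evalpal_load_config` is hypothetical, and the warning on a key that lacks the `sk_` prefix is an assumption about what a helpful check might do.

```shell
# Hypothetical helper: validates the documented environment variables.
# It is an illustration, not part of the published EvalPal scripts.
evalpal_load_config() {
  # The key is required; the message matches the Error Handling table.
  if [ -z "${EVALPAL_API_KEY:-}" ]; then
    echo "Error: EVALPAL_API_KEY is not set" >&2
    return 1
  fi

  # Keys are documented to start with sk_; warn (assumption) rather than fail.
  case "$EVALPAL_API_KEY" in
    sk_*) ;;
    *) echo "Warning: EVALPAL_API_KEY does not start with sk_" >&2 ;;
  esac

  # The base URL is optional and falls back to the documented default.
  EVALPAL_API_URL="${EVALPAL_API_URL:-https://evalpal.dev}"
  # Drop one trailing slash so paths can be appended consistently.
  EVALPAL_API_URL="${EVALPAL_API_URL%/}"
  export EVALPAL_API_URL
}
```

Calling `evalpal_load_config || exit 1` at the top of a script would stop early with the same message the table under Error Handling lists for a missing key.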
Commands
/evalpal run --eval-id <ID>
Trigger an evaluation run and wait for results.
Usage:
bash scripts/run-eval.sh --eval-id <EVAL_DEFINITION_ID>
What it does:
- Triggers a new eval run via the EvalPal API
- Polls for completion with exponential backoff (up to 5 minutes)
- Fetches and formats results as readable markdown
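The polling step can be sketched as a generic loop that doubles its wait between checks and gives up after a deadline. This is an illustration under stated assumptions, not the code in `run-eval.sh`: the function name `poll_with_backoff`, the 2-second starting delay, the 30-second cap, and the terminal status names `completed` and `failed` are all hypothetical. Only the 300-second limit and the timeout message come from this document.

```shell
# poll_with_backoff CHECK_CMD [TIMEOUT_SECONDS] [INITIAL_DELAY] [MAX_DELAY]
# CHECK_CMD is any command or function that prints the current run status.
# Hypothetical sketch; status names and delay values are assumptions.
poll_with_backoff() {
  local check_cmd="$1"
  local timeout="${2:-300}"   # documented limit: 5 minutes
  local delay="${3:-2}"       # assumed starting delay in seconds (must be >= 1)
  local max_delay="${4:-30}"  # assumed cap on a single wait
  local elapsed=0
  local status

  while [ "$elapsed" -lt "$timeout" ]; do
    status="$("$check_cmd")" || return 1
    case "$status" in
      completed|failed)
        echo "$status"
        return 0
        ;;
    esac

    # Never sleep past the deadline.
    if [ $((elapsed + delay)) -gt "$timeout" ]; then
      delay=$((timeout - elapsed))
    fi
    sleep "$delay"
    elapsed=$((elapsed + delay))

    # Exponential backoff: double the wait, up to the cap.
    delay=$((delay * 2))
    if [ "$delay" -gt "$max_delay" ]; then
      delay="$max_delay"
    fi
  done

  echo "Error: Evaluation timed out after ${timeout}s" >&2
  return 1
}
```

Doubling the delay keeps short runs responsive while reducing the number of requests made for long ones, which also lowers the chance of hitting the rate limit.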
Example output:
✅ Episode Quality — PASSED (15/16)
├── Test Case tc_001: ✓ PASS
├── Test Case tc_002: ✓ PASS
├── Test Case tc_003: ✗ FAIL
└── 13 more passed...
Run ID: run_abc123 · 16 test cases · 47s
Exit codes: 0 = all passed, 1 = failures or error.
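Because the script reports its outcome through the exit code, it can gate a later step such as a deploy. A minimal sketch follows; the function name `gate_on_eval` and the `EVALPAL_RUN_SCRIPT` override are hypothetical and exist only to make the example easy to try with a stand-in script.

```shell
# Hypothetical wrapper: succeeds only when the eval run exits with 0.
# EVALPAL_RUN_SCRIPT is an illustrative override, not a documented variable.
gate_on_eval() {
  local eval_id="$1"
  local runner="${EVALPAL_RUN_SCRIPT:-scripts/run-eval.sh}"

  if bash "$runner" --eval-id "$eval_id"; then
    echo "Eval $eval_id passed"
  else
    # Exit code 1 covers both test failures and errors, so they are
    # reported together here.
    echo "Eval $eval_id failed or errored" >&2
    return 1
  fi
}
```

A pipeline step could then run `gate_on_eval "$EVAL_ID" && ./deploy.sh`, where `deploy.sh` stands for whatever step should run only after a passing evaluation.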
/evalpal status --run-id <ID>
Check the current status of a running evaluation.
Usage:
bash scripts/check-status.sh --run-id <RUN_ID>
Example output:
📊 Run Status: run_abc123
Status: running
Started: 2026-03-26T20:00:00Z
/evalpal list
List available evaluation definitions across your projects.
Usage:
bash scripts/list-evals.sh [--project-id <PROJECT_ID>]
If --project-id is omitted, lists evals for all projects.
Example output:
📋 Evaluation Definitions
Project: AI Workforce Lab
abc123 Episode Quality Check
def456 Factual Accuracy Eval
Project: Customer Support Bot
ghi789 Response Quality
Error Handling
All scripts handle common error cases:
| Scenario | Output | Exit Code |
|---|---|---|
| No API key set | Error: EVALPAL_API_KEY is not set | 1 |
| Invalid API key | Error: Authentication failed (401) | 1 |
| Eval not found | Error: Eval definition not found (404) | 1 |
| Rate limited | Error: Rate limited — retry after Xs (429) | 1 |
| Timeout (5 min) | Error: Evaluation timed out after 300s | 1 |
| Network error | Error: Could not reach EvalPal API | 1 |
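One way to produce these messages is to map the HTTP status of each response to a line of output. The sketch below is an assumption about how such a mapping could look, not the scripts' actual code: the function name `evalpal_error_for_status` and the catch-all "Unexpected response" message are hypothetical. It treats `000` as a network failure because curl's `%{http_code}` write-out value is `000` when no HTTP response was received.

```shell
# evalpal_error_for_status HTTP_STATUS [RETRY_AFTER_SECONDS]
# Hypothetical helper: returns 0 for 2xx, otherwise prints the documented
# message to stderr and returns 1.
evalpal_error_for_status() {
  local http_status="$1"
  local retry_after="${2:-}"

  case "$http_status" in
    2??) return 0 ;;
    401) echo "Error: Authentication failed (401)" >&2 ;;
    404) echo "Error: Eval definition not found (404)" >&2 ;;
    429) echo "Error: Rate limited — retry after ${retry_after}s (429)" >&2 ;;
    000) echo "Error: Could not reach EvalPal API" >&2 ;;
    # Not in the table above; an assumed fallback for anything else.
    *)   echo "Error: Unexpected response (${http_status})" >&2 ;;
  esac
  return 1
}
```

Writing the messages to stderr keeps them separate from the formatted results on stdout, so a caller can capture one without the other.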
Security
- The API key is read from the EVALPAL_API_KEY environment variable only
- Scripts never echo or log the API key
- All API calls use HTTPS
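Installation
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install evalpal
- Once installation finishes, invoke the Skill by name or trigger it with /evalpal
- Provide the inputs described in the Skill's parameter documentation to get structured output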
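What is Evalpal?
Evalpal runs AI agent evaluations via EvalPal: it triggers eval runs, checks results, and lists available evaluations. It is an AI agent skill plugin for Claude Code / OpenClaw, with 147 downloads to date.
How do I install Evalpal?
Run the command /install evalpal in the OpenClaw or Claude Code chat box. Installation itself needs no extra configuration, although the skill requires EVALPAL_API_KEY to be set before it can run (see Prerequisites).
Is Evalpal free?
Yes. Evalpal is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used without restriction.
Which platforms does Evalpal support?
Evalpal is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Evalpal?
It is developed and maintained by MatthewEngman (@matthewengman). The current version is v1.0.1.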