matthewengman

Evalpal

Author: MatthewEngman · GitHub · v1.0.1 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 147 · Favorites: 0 · Current installs: 1 · Versions: 2
Install in OpenClaw
/install evalpal
Description
Run AI agent evaluations via EvalPal — trigger eval runs, check results, and list available evaluations
Usage instructions (SKILL.md)

EvalPal Skill

Run AI agent evaluations inline. Trigger eval runs, poll for results, and list available evaluation definitions — all from chat.

Prerequisites

Set the following environment variables in your OpenClaw skill configuration:

| Variable | Required | Description |
|---|---|---|
| EVALPAL_API_KEY | Yes | Your EvalPal API key (starts with sk_) |
| EVALPAL_API_URL | No | Base URL (defaults to https://evalpal.dev) |

Get your API key from Settings → API Keys at evalpal.dev.
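A minimal sketch of how a script could validate this configuration before making any API call. The helper name `evalpal_check_env` is hypothetical and is not taken from the published scripts. Only the variable names, the default URL, and the error message come from this page.

```shell
#!/usr/bin/env bash
# Sketch only: evalpal_check_env is a hypothetical helper, not part of the skill.

evalpal_check_env() {
  # Fail early when the required key is missing or empty.
  if [ -z "${EVALPAL_API_KEY:-}" ]; then
    echo "Error: EVALPAL_API_KEY is not set" >&2
    return 1
  fi
  # Fall back to the documented default when no base URL is configured.
  EVALPAL_API_URL="${EVALPAL_API_URL:-https://evalpal.dev}"
  # Strip one trailing slash so request paths can be appended consistently.
  EVALPAL_API_URL="${EVALPAL_API_URL%/}"
  export EVALPAL_API_URL
}
```

Calling `evalpal_check_env || exit 1` at the top of a script would stop it before any request is attempted.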

Commands

/evalpal run --eval-id <ID>

Trigger an evaluation run and wait for results.

Usage:

bash scripts/run-eval.sh --eval-id <EVAL_DEFINITION_ID>

What it does:

  1. Triggers a new eval run via the EvalPal API
  2. Polls for completion with exponential backoff (up to 5 minutes)
  3. Fetches and formats results as readable markdown
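The polling step could look roughly like the sketch below. It is not the logic of scripts/run-eval.sh. The function name, the 30-second delay cap, the doubling factor, and the terminal status names `completed` and `failed` are all assumptions made for illustration. Only the 300-second limit and the timeout message are documented here.

```shell
#!/usr/bin/env bash
# Sketch only: poll_with_backoff and its parameters are hypothetical.

# poll_with_backoff STATUS_CMD [TIMEOUT_SECONDS] [MAX_DELAY_SECONDS]
# STATUS_CMD must print the run's current status on stdout.
poll_with_backoff() {
  local status_cmd=$1
  local timeout=${2:-300}    # 5 minutes, matching the documented limit
  local max_delay=${3:-30}   # assumed cap on the wait between polls
  local delay=1 elapsed=0 status

  while [ "$elapsed" -lt "$timeout" ]; do
    status=$("$status_cmd")
    case "$status" in
      completed|failed)      # assumed terminal status names
        echo "$status"
        return 0
        ;;
    esac
    # SLEEP_CMD can be set to "true" to skip real waiting in tests.
    "${SLEEP_CMD:-sleep}" "$delay"
    elapsed=$((elapsed + delay))
    # Double the delay after each poll, up to the cap.
    delay=$((delay * 2))
    if [ "$delay" -gt "$max_delay" ]; then delay=$max_delay; fi
  done

  echo "Error: Evaluation timed out after ${timeout}s" >&2
  return 1
}
```

Doubling the delay keeps short runs responsive while limiting the number of requests a long run makes, which also helps stay under rate limits.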

Example output:

✅ Episode Quality — PASSED (15/16)
├── Test Case tc_001: ✓ PASS
├── Test Case tc_002: ✓ PASS
├── Test Case tc_003: ✗ FAIL
└── 12 more passed...

Run ID: run_abc123 · 16 test cases · 47s

Exit codes: 0 = all passed, 1 = failures or error.
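Because the exit code is 0 only when every test case passes, the script can gate a later step such as a deploy. The wrapper below is a hypothetical sketch and `gate_on_eval` is not part of the skill. Exit code 1 covers both test failures and errors, so the wrapper cannot tell the two apart.

```shell
#!/usr/bin/env bash
# Sketch only: gate_on_eval is a hypothetical wrapper around any command.

gate_on_eval() {
  # Run the given command; succeed only when it exits 0 (all test cases passed).
  if "$@"; then
    echo "Eval gate: passed"
  else
    echo "Eval gate: blocked (failures or error)" >&2
    return 1
  fi
}

# Example, commented out because it needs a real API key and eval ID:
# gate_on_eval bash scripts/run-eval.sh --eval-id "$EVAL_ID" && echo "safe to continue"
```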

/evalpal status --run-id <ID>

Check the current status of a running evaluation.

Usage:

bash scripts/check-status.sh --run-id <RUN_ID>

Example output:

📊 Run Status: run_abc123
Status: running
Started: 2026-03-26T20:00:00Z

/evalpal list

List available evaluation definitions across your projects.

Usage:

bash scripts/list-evals.sh [--project-id <PROJECT_ID>]

If --project-id is omitted, lists evals for all projects.
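One way an optional flag like this can be parsed in shell is sketched below. The function `parse_list_args` and its error messages are hypothetical. The real list-evals.sh may parse its arguments differently.

```shell
#!/usr/bin/env bash
# Sketch only: parse_list_args is hypothetical, not the skill's actual parser.

parse_list_args() {
  # An empty PROJECT_ID means "list evals for all projects".
  PROJECT_ID=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --project-id)
        if [ "$#" -lt 2 ]; then
          echo "Error: --project-id requires a value" >&2
          return 1
        fi
        PROJECT_ID=$2
        shift 2
        ;;
      *)
        echo "Error: unknown argument: $1" >&2
        return 1
        ;;
    esac
  done
}
```

After `parse_list_args "$@"`, a script can branch on whether `PROJECT_ID` is empty to decide between one project and all projects.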

Example output:

📋 Evaluation Definitions

Project: AI Workforce Lab
  abc123  Episode Quality Check
  def456  Factual Accuracy Eval

Project: Customer Support Bot
  ghi789  Response Quality

Error Handling

All scripts handle common error cases:

| Scenario | Output | Exit Code |
|---|---|---|
| No API key set | Error: EVALPAL_API_KEY is not set | 1 |
| Invalid API key | Error: Authentication failed (401) | 1 |
| Eval not found | Error: Eval definition not found (404) | 1 |
| Rate limited | Error: Rate limited — retry after Xs (429) | 1 |
| Timeout (5 min) | Error: Evaluation timed out after 300s | 1 |
| Network error | Error: Could not reach EvalPal API | 1 |
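A sketch of how HTTP status codes could be mapped to the messages in this table. The function `evalpal_error_for_status` is hypothetical, and the catch-all message for other codes is an assumption. The `000` case relies on curl's `%{http_code}` write-out variable, which reports 000 when no HTTP response was received.

```shell
#!/usr/bin/env bash
# Sketch only: evalpal_error_for_status is hypothetical, not the skill's code.

# evalpal_error_for_status HTTP_CODE [RETRY_AFTER_SECONDS]
# Returns 0 for 2xx; otherwise prints a message to stderr and returns 1.
evalpal_error_for_status() {
  local code=$1 retry_after=${2:-}
  case "$code" in
    2??) return 0 ;;
    401) echo "Error: Authentication failed (401)" >&2 ;;
    404) echo "Error: Eval definition not found (404)" >&2 ;;
    429) echo "Error: Rate limited — retry after ${retry_after}s (429)" >&2 ;;
    000) echo "Error: Could not reach EvalPal API" >&2 ;;
    *)   echo "Error: Unexpected HTTP status ($code)" >&2 ;;  # assumed fallback
  esac
  return 1
}
```

Returning 1 for every failure matches the table, where each scenario exits with the same code and only the message differs.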

Security

  • The API key is read from EVALPAL_API_KEY environment variable only
  • Scripts never echo or log the API key
  • All API calls use HTTPS
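The sketch below shows one way to enforce these properties in shell. It is not taken from the skill's scripts. Both function names are hypothetical, and the `Authorization: Bearer` header scheme is an assumption, since this page does not say how the key is sent. The curl options used (`--proto`, `--config -`, `--fail`) are standard curl flags.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical; the Bearer scheme is an assumption.

require_https_url() {
  # Reject any base URL override that is not HTTPS.
  case "$1" in
    https://*) return 0 ;;
    *)
      echo "Error: EVALPAL_API_URL must use https://" >&2
      return 1
      ;;
  esac
}

evalpal_curl() {
  # The header is fed to curl through a config file on stdin, so the key never
  # appears in curl's command-line arguments or in this script's output.
  # printf is a shell builtin, so no separate process sees the key either.
  printf 'header = "Authorization: Bearer %s"\n' "$EVALPAL_API_KEY" |
    curl --silent --show-error --fail --proto '=https' --config - "$@"
}
```

Passing `--proto '=https'` makes curl itself refuse non-HTTPS URLs, which backs up the string check if a custom EVALPAL_API_URL is supplied.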
Security recommendations
This skill appears to do exactly what it says: call the EvalPal API to list evals, start runs, and fetch results. Before installing: ensure you trust the https://evalpal.dev service and create an API key with the least privileges needed; avoid supplying a high-privilege or long-lived key if possible. Confirm you are comfortable allowing your agent to call the API (agent invocation is allowed by default). If you override EVALPAL_API_URL, verify the custom domain is trusted. As a routine precaution, rotate the API key if you suspect exposure and review logs for unexpected activity. Finally, you can sanity-check the included scripts locally (they're plain shell) to confirm they meet your operational and security expectations.
Functional analysis
Type: OpenClaw Skill · Name: evalpal · Version: 1.0.1
The evalpal skill is a legitimate integration for the EvalPal AI evaluation platform, allowing users to trigger and monitor AI agent evaluations. The bundle contains shell scripts (run-eval.sh, check-status.sh, list-evals.sh) that interact with the official EvalPal API (https://evalpal.dev) using curl and jq. The implementation follows secure practices, such as using environment variables for API keys, employing HTTPS for all communications, and properly quoting variables to prevent shell injection. No evidence of malicious intent, data exfiltration, or prompt injection was found.
Capability assessment
Purpose & Capability
Name/description describe running EvalPal evaluations and the bundle contains three scripts that call EvalPal API endpoints. The declared required binary/tools (curl, jq) and required env var (EVALPAL_API_KEY) are appropriate for this purpose. The SKILL.md documents an optional EVALPAL_API_URL (defaults to https://evalpal.dev), which is reasonable.
Instruction Scope
All instructions and included scripts consistently perform only API calls to the EvalPal service, poll for run status, and format results. The scripts do not read unrelated files, do not reference other environment secrets, and do not send data to external endpoints beyond the configured API_URL. They explicitly avoid printing the API key and use HTTPS.
Install Mechanism
No install spec; this is instruction-only plus shell scripts. That is low-risk: nothing is downloaded or installed by the skill itself. The only runtime dependencies are standard system tools (curl, jq).
Credentials
The skill requires a single API key (EVALPAL_API_KEY) which matches its stated need to authenticate to EvalPal. The optional EVALPAL_API_URL is documented but not required. No unrelated credentials or config paths are requested.
Persistence & Privilege
The skill is not set to always:true and does not request persistent or cross-skill configuration changes. It can be invoked by the agent (normal default); that autonomous invocation is expected for skills but does not add unusual privileges here.
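How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install evalpal
  3. Once installed, invoke the skill by name or trigger it with /evalpal.
  4. Provide the inputs described in the skill's parameter documentation to get structured output.
Version history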
v1.0.1
Declare EVALPAL_API_KEY env var and curl/jq binary requirements in registry metadata
v1.0.0
  • Initial release of EvalPal skill.
  • Run AI agent evaluations via EvalPal directly from chat.
  • Trigger evaluation runs, poll for results, and display outcomes in markdown.
  • Check status of evaluation runs by ID.
  • List all available evaluation definitions across your projects.
  • Handles authentication, error cases, and API limits securely.
Metadata
Slug: evalpal
Version: 1.0.1
License: MIT-0
Total installs: 1
Current installs: 1
Version count: 2
FAQ

What is Evalpal?

Evalpal runs AI agent evaluations via EvalPal: it triggers eval runs, checks results, and lists available evaluations. It is an AI agent skill plugin for Claude Code / OpenClaw, with 147 total downloads so far.

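How do I install Evalpal?

Run /install evalpal in the OpenClaw or Claude Code chat box to install it in one step. After installing, set EVALPAL_API_KEY in your skill configuration as described under Prerequisites.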

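Is Evalpal free?

Yes. The Evalpal skill is free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Evalpal support?

Evalpal is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Evalpal?

It is developed and maintained by MatthewEngman (@matthewengman). The current version is v1.0.1.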
