Description

AI-powered deliverable evaluation via EvalLayer API. Extracts factual claims, scores quality, returns structured JSON verdicts with pass/fail, confidence sco...

README (SKILL.md)

EvalLayer Evaluator Skill

Name: EvalLayer Evaluator
Author: ryanhall00

AI-powered deliverable evaluation for any OpenClaw agent. Multi-stage verification pipeline extracts factual claims, scores quality, and returns structured JSON verdicts in ~14 seconds.

EvalLayer is a live ERC-8183 evaluator on Virtuals ACP (Agent ID 29588). 250+ evaluations processed. 85% success rate.

Setup

Register for a free API key:

curl -s -X POST https://api.evallayer.ai/register \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "your-agent-id"}'

Save the returned API key — it is shown only once.

Set environment variable:

export EVALLAYER_API_KEY="sk_your_key_here"
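
The registration call and the key lookup above can also be sketched in Python. This is a minimal sketch that assumes only what the curl command shows (a JSON POST of agent_id to /register). The helper names build_register_request and load_api_key are hypothetical, and the response format is not documented here, so nothing in the sketch parses the reply.

```python
import json
import os
import urllib.request

REGISTER_URL = "https://api.evallayer.ai/register"  # endpoint from the curl example


def build_register_request(agent_id: str) -> urllib.request.Request:
    """Build (but do not send) the registration POST shown in the curl command."""
    body = json.dumps({"agent_id": agent_id}).encode("utf-8")
    return urllib.request.Request(
        REGISTER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load_api_key(env=os.environ) -> str:
    """Read EVALLAYER_API_KEY and fail early with a clear message if it is unset."""
    key = env.get("EVALLAYER_API_KEY", "")
    if not key:
        raise RuntimeError("EVALLAYER_API_KEY is not set; register and export it first")
    return key
```

Passing the prepared request to urllib.request.urlopen would perform the actual registration. Because the key is shown only once, store the raw reply before doing anything else with it.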

Evaluate Content

Submit any deliverable for evaluation:

bash scripts/evaluate.sh "topic" "deliverable content"

Arguments:

topic (required): What the deliverable should address (e.g., "Solana DeFi ecosystem")
deliverable (required): The content to evaluate

Example:

bash scripts/evaluate.sh \
  "Bitcoin ETF adoption" \
  "BlackRock IBIT accumulated 20 billion in assets within 6 months of launch. Fidelity FBTC reached 10 billion AUM by Q3 2024. Total spot Bitcoin ETF net inflows exceeded 17 billion."

Dependencies: This script uses curl for HTTP requests and python3 for safe JSON escaping of input text. Both must be available in your PATH.
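
To illustrate why the script relies on python3 for escaping, here is a hedged sketch of building the same request body with json.dumps. It is not the contents of scripts/evaluate.sh, which are not reproduced in this document. The function names are hypothetical, and the task_type field and its "crypto_research" default are taken from the direct curl example in this README.

```python
import json
import urllib.request

EVALUATE_URL = "https://api.evallayer.ai/evaluate"  # endpoint from the curl example


def build_evaluate_request(api_key: str, topic: str, deliverable: str,
                           task_type: str = "crypto_research") -> urllib.request.Request:
    """Build (but do not send) the evaluation POST with a safely encoded body."""
    # json.dumps escapes quotes, backslashes, and newlines, which a
    # hand-written -d '{...}' string would pass through unescaped.
    body = json.dumps({
        "task_type": task_type,
        "topic": topic,
        "deliverable": deliverable,
    }).encode("utf-8")
    return urllib.request.Request(
        EVALUATE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from send makes the payload easy to inspect or log before any content leaves the machine.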

Demo (No API Key Required)

Test with 3 free evaluations per day — no registration needed:

bash scripts/demo.sh "topic" "deliverable content"

Dependencies: Same as evaluate.sh — requires curl and python3.

Quick Evaluate (curl only)

For environments without python3, use curl directly:

curl -s -X POST https://api.evallayer.ai/evaluate \
  -H "Authorization: Bearer $EVALLAYER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"task_type": "crypto_research", "topic": "your topic", "deliverable": "content to evaluate"}'

Note: this approach does not escape special characters in inputs. Use the script for content containing quotes or backslashes.

Output Format

{
  "passed": true,
  "quality_score": 0.833,
  "confidence_score": 0.85,
  "rationale": "Evaluated 6 claims: 5 supported, 1 unsupported.",
  "payout_recommendation": "full",
  "claims_total": 6,
  "claims_supported": 5,
  "claims_unsupported": 1,
  "evaluation_id": "eval_abc123_def456"
}

Key fields:

passed: Boolean — overall pass/fail verdict
quality_score: 0.0-1.0 — overall quality rating (0.4+ = pass, 0.7+ = full payout)
claims_total / claims_supported / claims_unsupported: Number of claims extracted, supported, and unsupported
payout_recommendation: "full", "partial", or "reject"
evaluation_id: Use with GET /evaluate/{id} for detailed claim breakdown
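
The fields above can drive a simple local gate. The following is a minimal sketch rather than the service's own logic. The thresholds come from the quality_score note, while the mapping of scores between 0.4 and 0.7 to "partial" and below 0.4 to "reject" is an assumption, as are the function names. In the sample verdict the quality score happens to equal claims_supported / claims_total; whether that holds in general is not stated.

```python
import json

PASS_THRESHOLD = 0.4  # "0.4+ = pass" from the field notes
FULL_THRESHOLD = 0.7  # "0.7+ = full payout" from the field notes


def expected_payout(quality_score: float) -> str:
    """Map a score to a payout tier; the partial/reject split is an assumption."""
    if quality_score >= FULL_THRESHOLD:
        return "full"
    if quality_score >= PASS_THRESHOLD:
        return "partial"
    return "reject"


def supported_ratio(verdict: dict) -> float:
    """Share of extracted claims that were supported (0.0 when none were extracted)."""
    total = verdict["claims_total"]
    return verdict["claims_supported"] / total if total else 0.0


def should_accept(verdict: dict, min_confidence: float = 0.0) -> bool:
    """Accept only if the service passed it and confidence meets a local floor."""
    return bool(verdict["passed"]) and verdict["confidence_score"] >= min_confidence


# The sample verdict from the Output Format section.
sample = json.loads("""
{
  "passed": true,
  "quality_score": 0.833,
  "confidence_score": 0.85,
  "rationale": "Evaluated 6 claims: 5 supported, 1 unsupported.",
  "payout_recommendation": "full",
  "claims_total": 6,
  "claims_supported": 5,
  "claims_unsupported": 1,
  "evaluation_id": "eval_abc123_def456"
}
""")
```

A caller that wants a stricter gate than the service's own verdict can raise min_confidence, for example should_accept(sample, min_confidence=0.9).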

Check Evaluation Details

Retrieve the full claim-by-claim breakdown for any evaluation:

curl -s https://api.evallayer.ai/evaluate/EVALUATION_ID

Returns each extracted claim with type, support status, confidence score, and notes.

Check Provider Reputation

Look up any agent's evaluation history:

curl -s https://api.evallayer.ai/reputation/AGENT_ID
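
Both lookups (evaluation details and provider reputation) are plain GET requests with an identifier in the path. As a small sketch, the URLs can be built with the identifier percent-encoded so that unexpected characters cannot alter the path. The helper names are hypothetical, and the response schemas are not documented here, so the sketch stops at URL construction.

```python
from urllib.parse import quote

BASE_URL = "https://api.evallayer.ai"  # host from the curl examples


def evaluation_url(evaluation_id: str) -> str:
    """URL for the claim-by-claim breakdown of one evaluation."""
    return f"{BASE_URL}/evaluate/{quote(evaluation_id, safe='')}"


def reputation_url(agent_id: str) -> str:
    """URL for an agent's evaluation history."""
    return f"{BASE_URL}/reputation/{quote(agent_id, safe='')}"
```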

Intelligence API

Access aggregated market intelligence from all evaluations:

curl -s https://api.evallayer.ai/intelligence \
  -H "Authorization: Bearer $EVALLAYER_API_KEY"

Returns trending verified claims, provider leaderboard, and topic trends.

Rate Limits

Free tier: 5 evaluations/day per API key
Demo endpoint: 3 evaluations/day per IP (no key needed)
Pro tier: 5,000 evaluations/day ($99/mo)
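
An agent that calls the API in a loop may want to track its own usage against these limits. Below is a minimal client-side sketch; the DailyQuota class is hypothetical. The reset boundary is not documented, so resetting at midnight UTC is an assumption, and the server's count remains authoritative.

```python
from datetime import date, datetime, timezone
from typing import Optional


class DailyQuota:
    """Local counter that allows at most `limit` calls per calendar day."""

    def __init__(self, limit: int):
        self.limit = limit
        self._day: Optional[date] = None
        self._used = 0

    def try_consume(self, today: Optional[date] = None) -> bool:
        """Reserve one call; return False once the day's allowance is spent."""
        # Assumption: the day rolls over at midnight UTC.
        today = today or datetime.now(timezone.utc).date()
        if today != self._day:
            self._day, self._used = today, 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```

For the free tier this would be DailyQuota(5), checked with try_consume() before each evaluation.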

Use When

You need to verify research quality before acting on it
You want to score deliverables in agent-to-agent workflows
You need to extract and validate factual claims from content
You are building evaluation gates in ACP or other commerce flows
You want to check a provider's reputation before hiring them

NOT For

Evaluating non-text content (images, audio, video)
Real-time price data or trading signals
Content generation — this is verification only

External Endpoints

api.evallayer.ai — EvalLayer evaluation and intelligence API (HTTPS only)

Security & Privacy

Deliverable content is sent to api.evallayer.ai for evaluation over HTTPS
Content is stored for intelligence aggregation (claims extraction)
API key authenticates requests and tracks usage — use a dedicated key with minimal scope
No personally identifiable information is collected
For sensitive content, review the deliverable before submitting

Usage Guidance

This skill appears to do what it says: it forwards the supplied text to https://api.evallayer.ai for evaluation. Before installing or using it, consider:

(1) Metadata mismatch: the registry summary omitted required env/binaries; rely on the SKILL.md, which requires EVALLAYER_API_KEY, curl, and python3.
(2) Privacy: submitted deliverables are sent to and stored by the provider for intelligence aggregation; do not submit sensitive PII or confidential material unless you trust the provider and have reviewed their retention policy.
(3) Use a dedicated, minimal-scope API key as recommended.
(4) Verify the provider/domain (api.evallayer.ai) and reputation if you plan to use this in production.

Operationally, the demo script permits limited no-key testing. If you want extra assurance, ask the publisher to correct the registry metadata to declare the required env and binaries.

Capability Analysis

Type: OpenClaw Skill
Name: evallayer-evaluator
Version: 2.0.1

The skill is a legitimate API wrapper for the EvalLayer evaluation service. The scripts (scripts/evaluate.sh and scripts/demo.sh) use python3 to safely escape user input before sending it via curl to the api.evallayer.ai endpoint, preventing JSON injection. No evidence of data exfiltration, unauthorized execution, or malicious prompt instructions was found.

Capability Tags

crypto

Capability Assessment

ℹ Purpose & Capability

The skill name/description (EvalLayer evaluator) match the behavior in SKILL.md and the scripts: HTTP POSTs to api.evallayer.ai to evaluate content. The declared runtime requirements in SKILL.md (EVALLAYER_API_KEY, curl, python3) are appropriate. There is a minor registry metadata inconsistency: the top-level registry summary provided with this submission lists no required env vars/binaries, while the SKILL.md and scripts clearly require EVALLAYER_API_KEY (for the authenticated endpoint) and curl/python3. This is likely an authoring/metadata omission rather than malicious behavior.

ℹ Instruction Scope

Runtime instructions and the two scripts only send the provided 'topic' and 'deliverable' to the service endpoints; they do not read other files, system state, or extra environment variables. The SKILL.md explicitly warns that submitted content is stored for intelligence aggregation — a privacy consideration but consistent with the stated purpose. No commands grant broad discretion or request unrelated data.

✓ Install Mechanism

No install spec is provided (instruction-only plus two small scripts). No downloads or archive extraction occur. This is low-risk from an installation footprint perspective.

✓ Credentials

Only a single provider API key (EVALLAYER_API_KEY) is used for authenticated evaluations; that is proportional to the skill's function. The demo script provides a no-key demo endpoint. No other secrets, unrelated credentials, or config paths are requested.

✓ Persistence & Privilege

The skill does not request permanent/always-on presence (always: false) and does not modify system or other-skill configs. It allows normal autonomous invocation (platform default) but has no elevated privilege requests.

Version History

v2.0.2

Fixed display name, corrected binary requirements, broadened description

v2.0.1

evallayer-evaluator 2.0.1 changelog
- Documentation updated in SKILL.md; no functional or code changes.
- Clarified usage, setup, output format, and API endpoints in the README.
- Version bump from 2.0.0 to 2.0.1 for documentation refresh.

v2.0.0

**Major update with enhanced evaluation workflow and new output fields.**
- Now uses a multi-stage verification pipeline and supports general deliverable evaluation (not just crypto research).
- New output fields: `claims_unsupported`, `evaluation_id`, and updated pass/fail thresholds.
- Requires both `curl` and `python3` for safe parameter handling in scripts.
- Demo and evaluation scripts now note dependencies and safer input handling.
- Output documentation updated; you can now retrieve a detailed claim-by-claim breakdown using `evaluation_id`.
- Clarified rate limits and safe-use guidance for sensitive content.

v1.0.0

Initial release: Evaluate crypto research quality via the EvalLayer API.
- Extracts and scores factual claims from research deliverables.
- Returns structured verdicts with quality and confidence scores (JSON output).
- Includes pass/fail decisions and payout recommendations.
- Supports script and direct curl usage; offers 3 daily demo runs with no API key.
- Provides utility for provider reputation checks and intelligence aggregation.
- Requires EVALLAYER_API_KEY, and curl/jq binaries.

Metadata

Slug evallayer-evaluator

Version 2.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is EvalLayer Evaluator?

AI-powered deliverable evaluation via EvalLayer API. Extracts factual claims, scores quality, returns structured JSON verdicts with pass/fail, confidence sco... It is an AI Agent Skill for Claude Code / OpenClaw, with 169 downloads so far.

How do I install EvalLayer Evaluator?

Run "/install evallayer-evaluator" in the OpenClaw or Claude Code chat to install it in one step. Authenticated evaluations also need the EVALLAYER_API_KEY environment variable, plus curl and python3 on your PATH.

Is EvalLayer Evaluator free?

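Yes, the skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. The EvalLayer API it calls is rate-limited: the free tier allows 5 evaluations/day per API key, and the Pro tier (5,000 evaluations/day) costs $99/mo.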

Which platforms does EvalLayer Evaluator support?

EvalLayer Evaluator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created EvalLayer Evaluator?

It is built and maintained by Ryan Hall (@ryanhall00); the current version is v2.0.1.
