Model Tester

Name: Model Tester
Author: nandorocker

by Nando Rossi · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads 376

Install in OpenClaw

/install model-tester

Description

Test agents or models against predefined test cases to validate model routing, performance, and output quality. Use when: (1) verifying a specific agent or m...

Usage Guidance

Before installing or running this skill:

(1) Ensure the 'openclaw' CLI is installed and accessible (the skill did not declare this dependency in metadata).
(2) Run it in a safe/sandboxed environment, because it tails OpenClaw logs, which may contain unrelated or sensitive information from your deployment.
(3) Verify your OpenClaw config/gateway credentials are appropriate for testing (the script will use whatever local config the CLI has).
(4) Review and, if desired, edit references/test-cases.json so test prompts contain no sensitive data.
(5) Consider running a single case with verbose output to confirm the tool only parses the expected model/token fields before using it on broader logs or CI.

If you need higher assurance, ask the skill author to:

(a) Declare 'openclaw' as a required binary in metadata.
(b) Add an option to limit log scope/time window.
(c) Avoid reading unrelated log lines, or optionally write raw logs only to a user-specified local file for manual review.
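The suggestion to limit the log time window could look roughly like the sketch below. It is not part of the skill: the function name `within_window` and the `time` field (an ISO-8601 timestamp with a UTC offset) are assumptions, and the real OpenClaw log schema may differ.

```python
import json
from datetime import datetime, timedelta, timezone


def within_window(line: str, start: datetime, window: timedelta) -> bool:
    """Return True only if a JSON log line falls inside [start, start + window].

    Assumes each record has a hypothetical "time" field holding an
    ISO-8601 timestamp with a UTC offset. Anything else is rejected.
    """
    try:
        record = json.loads(line)
        stamp = datetime.fromisoformat(record["time"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return False  # not JSON, not an object, or no usable timestamp
    if stamp.tzinfo is None:
        return False  # refuse naive timestamps rather than guess a zone
    return start <= stamp <= start + window
```

A caller would record `start` just before launching a test case and discard every line for which `within_window` returns False, so log entries written before the test began are never inspected.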

Capability Analysis

Type: OpenClaw Skill
Name: model-tester
Version: 1.0.0

The model-tester skill is a legitimate benchmarking and diagnostic tool designed to verify OpenClaw agent routing and performance. It uses the 'openclaw' CLI to execute predefined test cases from 'references/test-cases.json' and monitors system logs via 'openclaw logs' to extract model usage and token statistics. The Python script 'scripts/model_tester.py' uses standard subprocess handling without shell injection risks, and no evidence of data exfiltration, persistence, or malicious prompt injection was found.
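For readers unfamiliar with what subprocess handling without shell injection risk typically means in Python, here is a minimal sketch. It is not the skill's code: the helper `run_without_shell` is a hypothetical name, and the Python interpreter stands in for the 'openclaw' binary so the example runs anywhere.

```python
import subprocess
import sys


def run_without_shell(argv: list[str]) -> str:
    """Run a command given as an argument list and return its stdout.

    With the default shell=False, each list element reaches the child
    process as one literal argument, so metacharacters such as ';' are
    never interpreted by a shell.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout


# Stand-in child process: prints its first argument back unchanged.
ECHO = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
```

Calling `run_without_shell(ECHO + ["hello; echo injected"])` returns the argument text verbatim. The `; echo injected` part is treated as data and is never executed as a second command.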

Capability Assessment

⚠ Purpose & Capability

The skill's stated purpose (testing agents/models) matches the included code: scripts/model_tester.py runs predefined prompts and checks routing via OpenClaw logs. However, the SKILL metadata declares no required binaries, while the code clearly requires the 'openclaw' CLI (used for both 'openclaw logs --follow --json' and 'openclaw agent ...'). This undeclared dependency is an inconsistency between metadata and code; confirm the CLI is present before installing, and the author should declare it. The code also implicitly requires a valid OpenClaw configuration (gateway/credentials) to be available to the 'openclaw' binary.
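Because the dependency is undeclared, a user may want to check for the binary before running anything. The following is a minimal sketch of such a preflight check; the function name `find_required_binary` is an assumption for illustration and does not come from the skill.

```python
import shutil


def find_required_binary(name: str = "openclaw") -> str:
    """Return the absolute path of `name`, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"Required binary {name!r} was not found on PATH; "
            "install it before running the tester."
        )
    return path
```

Failing early with a clear message is preferable to letting a later subprocess call fail with a less obvious error. Note that this only checks that the binary exists, not that its OpenClaw configuration is valid.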

⚠ Instruction Scope

The runtime instructions and the script explicitly tail OpenClaw logs and run 'openclaw agent' subprocesses. The SKILL.md asserts only structured fields are captured and no user data is sent to models, which the script mostly enforces by using fixed test prompts. However, tailing logs with '--follow' collects arbitrary log lines from the OpenClaw runtime and the script inspects those lines with regexes — that can inadvertently match or expose other log content. The tool does not transmit logs externally, but it reads them and includes parsed tokens/model fields in output; if logs contain unexpected sensitive fields, parsing may capture them. The instruction text is otherwise scoped to the testing task and does not ask for additional unrelated files or env vars.
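One way to reduce the risk described above is to parse each JSON log line and keep only an explicit allowlist of keys, rather than matching free text with regexes. The sketch below shows that alternative technique and is not the skill's implementation. The field names `model`, `input_tokens`, and `output_tokens` are assumptions, since the actual OpenClaw log schema is not given here.

```python
import json
from typing import Optional

# Hypothetical field names; the real OpenClaw log schema may differ.
ALLOWED_FIELDS = ("model", "input_tokens", "output_tokens")


def extract_allowed(line: str) -> Optional[dict]:
    """Parse one JSON log line and keep only allowlisted keys.

    Returns None for lines that are not JSON objects or that carry no
    allowlisted field, so unrelated log content is dropped, not echoed.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    kept = {key: record[key] for key in ALLOWED_FIELDS if key in record}
    return kept or None
```

With this approach, a line that happens to include an extra key such as a credential yields only the allowlisted fields, and the extra key never reaches the report.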

ℹ Install Mechanism

There is no install spec (instruction-only plus a script file). That is low-risk in that nothing is downloaded or executed at install time, but the packaged script will execute subprocesses at runtime. No external archives or network installers are used.

ℹ Credentials

The skill declares no required environment variables or credentials, which is reasonable. However, it relies on the local 'openclaw' CLI and therefore implicitly on whatever credentials/config the user's OpenClaw installation uses (gateway keys, local config). That implicit access is proportional to the tool's purpose but should be understood by the user: running this script will cause the agent/CLI to execute and may read the user's OpenClaw config.

✓ Persistence & Privilege

The skill does not request persistent presence (always:false) and does not modify other skills or system settings. It runs as a normal, user-invoked tool and does not autonomously enable itself or persist new credentials.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install model-tester
After installation, invoke the skill by name or use /model-tester
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release of model-tester.
- Provides a command-line tool to validate agents or models against predefined test cases.
- Supports testing model routing, performance, and output quality with structured JSON reporting.
- Allows targeting specific agents or models using `--agent`, `--model`, and `--case` parameters.
- Extracts actual model usage, token counts, and runtime from OpenClaw logs for verification.
- Ensures privacy by using only static prompts and structured log fields, with no user data involved.
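To make the release notes concrete, here is a hypothetical sketch of how a tool with these three flags might produce a structured JSON routing report. Only the flag names `--agent`, `--model`, and `--case` come from the notes above. Their meanings, the report keys, and the functions `build_parser` and `make_report` are assumptions and may not match scripts/model_tester.py.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three flags named in the release notes."""
    parser = argparse.ArgumentParser(prog="model_tester")
    parser.add_argument("--agent", help="agent to test (assumed: a name)")
    parser.add_argument("--model", help="model expected to serve the request")
    parser.add_argument("--case", help="single test case to run (assumed: an id)")
    return parser


def make_report(args: argparse.Namespace, actual_model: str) -> str:
    """Compare the expected model with the one observed in the logs."""
    report = {
        "agent": args.agent,
        "case": args.case,
        "expected_model": args.model,
        "actual_model": actual_model,
        "routing_ok": args.model == actual_model,  # hypothetical report key
    }
    return json.dumps(report, sort_keys=True)
```

For example, parsing `["--agent", "a", "--model", "m1", "--case", "c1"]` and passing an observed model of `"m2"` yields a report whose `routing_ok` value is false, which is the kind of mismatch a routing test is meant to surface.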

Metadata

Slug model-tester

Version 1.0.0

License MIT-0

All-time Installs 3

Active Installs 3

Total Versions 1

Frequently Asked Questions

What is Model Tester?

Model Tester is an AI Agent Skill for Claude Code / OpenClaw that tests agents or models against predefined test cases to validate model routing, performance, and output quality. It has 376 downloads so far.

How do I install Model Tester?

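Run "/install model-tester" in the OpenClaw or Claude Code chat to install it in one step. Running the skill afterwards requires the 'openclaw' CLI and a working OpenClaw configuration, even though the skill's metadata does not declare that dependency.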

Is Model Tester free?

Yes, Model Tester is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Model Tester support?

Model Tester is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Model Tester?

It is built and maintained by Nando Rossi (@nandorocker); the current version is v1.0.0.
