SkillBench

Name: SkillBench
Author: g9pedro

by G9Pedro · GitHub ↗ · v2.0.0

cross-platform ⚠ suspicious

Downloads: 1344

Install in OpenClaw

/install skillbench

Description

Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.

Usage Guidance

This skill appears to be a legitimate benchmarking CLI, but there are several red flags you should address before installing:
1) npm packages run code on install and at runtime. Review the package source (GitHub repo) and the published package content before running npm install globally.
2) SKILL.md references the tasktime 'tt' CLI and external services (ClawVault/ClawHub), but the manifest doesn't declare those dependencies or any auth variables. Verify how the CLI obtains credentials and ensure it won't read unexpected config files or exfiltrate data.
3) Prefer to install and test this tool in an isolated environment (container or VM) first, inspect what network endpoints it contacts, and check what files/dirs it writes.
4) If you plan to give it access to service tokens, issue scoped tokens with minimal privileges and rotate them after testing.
5) If you need help auditing the npm package contents or confirming the CLI's network behavior, provide the package URL or the package tarball and I can help review it.
Proceed with caution.
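The install-time risk noted in the guidance above can be partly checked without installing anything. The following is a minimal sketch, assuming you have already extracted package.json from the published tarball. It lists the npm lifecycle scripts (preinstall, install, postinstall) that would run during installation. The function name and the sample manifest are hypothetical and are not taken from the real package.

```python
import json

# npm lifecycle scripts that run automatically when a package is installed
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(package_json_text):
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


# Hypothetical manifest used only for illustration; NOT the real package content.
sample = json.dumps({
    "name": "example-cli",
    "bin": {"example": "bin/cli.js"},
    "scripts": {"postinstall": "node setup.js", "test": "jest"},
})

print(find_install_hooks(sample))
```

An empty result only means no install hooks are declared; code reached through the binary entry point still runs at runtime and needs its own review.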

Capability Analysis

Type: OpenClaw Skill
Name: skillbench
Version: 2.0.0

The skill is designed for benchmarking and monitoring AI agent performance, which involves syncing data to external services (ClawVault, ClawHub) and generating cron configurations for automated testing. While these actions are explicitly stated as part of the skill's purpose and do not show clear malicious intent, the capabilities to perform network communication to external endpoints (e.g., `skillbench sync --vault` to clawvault.dev) and to generate system-level automation (e.g., `skillbench schedule` for cron jobs) are considered high-risk behaviors. These capabilities, even if intended for benign use, could be exploited or lead to unintended consequences, warranting a 'suspicious' classification.

Capability Assessment

✓ Purpose & Capability

Name and description match the declared binary and CLI functionality (skillbench CLI). The install spec (npm @versatly/skillbench → skillbench binary) is consistent with the stated purpose of a benchmarking CLI. However, the registry metadata provides no homepage or source repo to review.

⚠ Instruction Scope

SKILL.md instructs the agent to call the skillbench CLI and the tasktime ('tt') CLI and to sync with external services (ClawVault, ClawHub). The skill's requires.bins only lists 'skillbench' and does not declare 'tt' or any other external tool it references, and it does not declare where ClawVault/ClawHub credentials come from — so the runtime instructions rely on tools/credentials not described in the skill manifest.
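The mismatch described above (tools referenced in SKILL.md but missing from requires.bins) can be detected mechanically. Below is a minimal sketch that assumes tool names appear in backticks in the instructions. The function name, the list of tools to look for, and the sample text are all hypothetical; only the `skillbench sync --vault` command and the `tt` name come from this page.

```python
import re


def undeclared_tools(declared_bins, skill_md, known_tools):
    """Return known tools that skill_md mentions in backticks but the manifest omits."""
    found = set()
    for tool in known_tools:
        # A backtick followed by the tool name as a whole word, e.g. `tt` or `tt ...`
        if re.search(rf"`{re.escape(tool)}\b", skill_md):
            found.add(tool)
    return sorted(found - set(declared_bins))


# Hypothetical excerpt standing in for the real SKILL.md.
sample_md = (
    "Call `skillbench sync --vault` to sync results.\n"
    "Time each task with the `tt` CLI.\n"
)

print(undeclared_tools(["skillbench"], sample_md, ["skillbench", "tt"]))
```

A check like this only catches tools you already know to look for, so it complements a manual read of the instructions and does not replace one.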

ℹ Install Mechanism

Install uses npm (@versatly/skillbench) to create a global 'skillbench' binary — a common pattern for CLIs but one that executes third-party code during install/use. There is no homepage or source URL in the metadata to audit the package, increasing the risk because arbitrary npm package code would run on install and at runtime.

⚠ Credentials

The SKILL.md describes automatic syncing to ClawVault and ClawHub and interaction with external dashboards and CI. Yet the skill declares no required environment variables or auth tokens. This is a mismatch: the CLI likely needs credentials or config files to access those services, but the skill does not declare where those credentials come from or what variables/paths it will read.
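One way to narrow down where undeclared credentials might come from is to scan the package's JavaScript source for environment variable reads. The sketch below assumes the CLI is a Node.js program that reads settings through `process.env`. It does not cover config files or other credential sources. The function name and the variable names in the sample are hypothetical.

```python
import re

# Matches process.env.NAME and process.env["NAME"] / process.env['NAME']
ENV_REF = re.compile(
    r"""process\.env(?:\.([A-Za-z_]\w*)|\[\s*['"]([^'"]+)['"]\s*\])"""
)


def env_vars_read(js_source):
    """Return the sorted names of environment variables referenced in JS source."""
    names = set()
    for dotted, bracketed in ENV_REF.findall(js_source):
        names.add(dotted or bracketed)
    return sorted(names)


# Hypothetical source snippet; the variable names are placeholders.
sample_js = '''
const token = process.env.EXAMPLE_TOKEN;
const url = process.env["EXAMPLE_URL"] || "https://example.invalid";
'''

print(env_vars_read(sample_js))
```

Dynamic lookups such as `process.env[name]` are not detected by this pattern, so an empty result is not proof that no environment variables are read.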

✓ Persistence & Privilege

The skill is not 'always' and does not request elevated platform privileges in the manifest. It installs a CLI binary (global npm install) but does not declare modifying other skills or agent-wide config; that is within normal bounds for a user-invokable CLI skill.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install skillbench
3) After installation, invoke the skill by name or use /skillbench
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v2.0.0

18 commands, CI command with JSON output, GitHub Action workflow, improved README

v1.7.0

Added baseline command for regression detection (CI-friendly)

v1.6.0

Added improve, trend, leaderboard, schedule commands for comprehensive skill analysis

v1.4.0

Added health command for alerts, watch command for continuous monitoring

v1.2.0

Added: test command for auto-grading, dashboard for HTML reports, sync for ClawHub integration

v1.0.0

Initial release: record, score, compare, export with tasktime + ClawVault integration

Metadata

Slug skillbench

Version 2.0.0

License —

All-time Installs 1

Active Installs 1

Total Versions 6

Frequently Asked Questions

What is SkillBench?

SkillBench is an AI agent skill for Claude Code / OpenClaw that tracks skill versions, benchmarks performance, compares improvements, and provides self-improvement signals. It integrates with tasktime and ClawVault and has 1344 downloads so far.

How do I install SkillBench?

Run "/install skillbench" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is SkillBench free?

Yes, SkillBench is completely free (open-source). You can download, install and use it at no cost.

Which platforms does SkillBench support?

SkillBench is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created SkillBench?

It is built and maintained by G9Pedro (@g9pedro); the current version is v2.0.0.
