
llama.cpp Benchmark

Name: llama.cpp Benchmark
Author: alexhegit

by alexhegit · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 119

Install in OpenClaw

/install llamacpp-bench

Description

Run llama.cpp benchmarks on GGUF models to measure prompt processing (pp) and token generation (tg) performance. Use when the user wants to benchmark LLM mod...

README (SKILL.md)

llamacpp-bench

Run standardized benchmarks on GGUF models using llama.cpp's llama-bench tool.

Quick Start

# Basic benchmark
llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

# With specific backend
LLAMA_BACKEND=vulkan llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

Benchmark Parameters

Parameter	Description	Default
`-m`	Model path (GGUF file)	required
`-p`	Prompt sizes to test	512
`-n`	Generation lengths to test	128
`-ngl`	GPU layers to offload	99
`-t`	CPU threads	auto
`-dev`	Device selection	auto

Standard Test Suite

For consistent comparisons across models, use:

-p 512,1024,2048 -n 128,256 -ngl 99

This tests:

Prompt processing: 512, 1024, 2048 tokens
Token generation: 128, 256 tokens
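To keep runs comparable, the standard suite can be wrapped in a small shell function so that every model is benchmarked with identical arguments. This is a minimal sketch: the function name `bench_standard` and the `DRY_RUN` variable are hypothetical and not part of the skill, and it assumes `llama-bench` is on your PATH.

```shell
# Hypothetical wrapper (not part of the skill): runs the standard suite on one model.
# Set DRY_RUN=1 to print the command instead of executing it.
bench_standard() {
    model=$1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "llama-bench -m $model -p 512,1024,2048 -n 128,256 -ngl 99"
    else
        # Assumes llama-bench is on PATH.
        llama-bench -m "$model" -p 512,1024,2048 -n 128,256 -ngl 99
    fi
}

# Usage:
#   DRY_RUN=1 bench_standard model.gguf   # preview the command
#   bench_standard model.gguf             # run it
```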

Interpreting Results

Metric	Meaning	Good Performance (rough guide; varies with model size and hardware)
`pp512`	Prompt processing speed at 512 tokens	>1000 t/s
`pp1024`	Prompt processing speed at 1024 tokens	>1000 t/s
`pp2048`	Prompt processing speed at 2048 tokens	>1000 t/s
`tg128`	Token generation speed (128 tokens)	>50 t/s
`tg256`	Token generation speed (256 tokens)	>50 t/s
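The thresholds in the table can be checked automatically. The sketch below assumes llama-bench's default markdown table output, with the test name and t/s as the last two columns; that layout is an assumption about your llama-bench version, and the function name `rate_results` is hypothetical.

```shell
# Hypothetical helper: reads llama-bench markdown output on stdin and prints
# one line per test in the form: <test> <t/s> <ok|below>
# Assumes the last two table columns are "test" and "t/s".
rate_results() {
    awk -F'|' '
        NF >= 4 {
            name = $(NF-2)
            gsub(/[ \t]/, "", name)
            if (name !~ /^(pp|tg)[0-9]+$/) next     # skip header and separator rows
            speed = $(NF-1) + 0                     # numeric prefix; drops the stddev part
            target = (name ~ /^pp/) ? 1000 : 50     # thresholds from the table above
            verdict = (speed > target) ? "ok" : "below"
            printf "%s %.2f %s\n", name, speed, verdict
        }'
}

# Usage (assumes llama-bench is on PATH):
#   llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 | rate_results
```

Lines without table separators, such as the build information llama-bench prints, are ignored.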

Backend Selection

llama-bench auto-detects available backends. Priority order:

CUDA (NVIDIA GPUs)
ROCm (AMD GPUs)
Vulkan (cross-platform GPU)
CPU (fallback)

To force a backend, set the LLAMA_BACKEND environment variable as shown in Quick Start, or check which backends your build includes:

# Check available backends
llama-bench --help | grep -i "backend\|cuda\|rocm\|vulkan"

Batch Benchmarking

Use the provided script for benchmarking multiple models:

./scripts/benchmark_models.sh /path/to/models/*.gguf

Saving Results

Output can be redirected to a file:

llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 > results.txt

Alternatively, use the benchmark script, which automatically saves results to timestamped files.
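If you prefer to save results by hand, a timestamped file name avoids overwriting earlier runs. This is a sketch only: `results_path` is a hypothetical helper, the naming scheme is an assumption rather than the one used by the benchmark script, and the default directory is borrowed from the output directory mentioned elsewhere on this page.

```shell
# Hypothetical helper: prints a timestamped results path for a model file.
# $1 = model path, $2 = output directory (defaults to ./benchmark_results)
results_path() {
    model=$1
    outdir=${2:-./benchmark_results}
    name=$(basename "$model" .gguf)
    printf '%s/%s_%s.txt\n' "$outdir" "$name" "$(date +%Y%m%d_%H%M%S)"
}

# Usage (assumes llama-bench is on PATH):
#   out=$(results_path /path/to/model.gguf)
#   mkdir -p "$(dirname "$out")"
#   llama-bench -m /path/to/model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 | tee "$out"
```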

Common Issues

Out of memory: Reduce -ngl (GPU layers) or test smaller prompt sizes
Slow CPU performance: Ensure -t matches CPU core count
Backend not found: Check llama.cpp was built with the desired backend
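For the thread-count issue, the CPU count can be looked up rather than typed by hand. The sketch below reports logical CPUs using whichever common tool is available; `cpu_count` is a hypothetical name, and on machines with simultaneous multithreading the physical core count may be the better value for `-t`.

```shell
# Hypothetical helper: prints the number of logical CPUs, trying common tools in turn.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                              # GNU coreutils (most Linux systems)
    elif command -v getconf >/dev/null 2>&1 && getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    elif command -v sysctl >/dev/null 2>&1; then
        sysctl -n hw.ncpu                  # macOS and BSD
    else
        echo 1                             # conservative fallback
    fi
}

# Usage (assumes llama-bench is on PATH):
#   llama-bench -m model.gguf -p 512 -n 128 -t "$(cpu_count)"
```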

Building / Updating llama.cpp

Check Current Version

./scripts/build_llamacpp.sh -v

Shows:

Current Git commit and branch
Build date
Whether behind upstream
Available backends

Build or Update

# Interactive mode (prompts for backend selection)
./scripts/build_llamacpp.sh -u

# Specify backend directly
./scripts/build_llamacpp.sh -u -b vulkan   # Vulkan (AMD/Intel GPUs)
./scripts/build_llamacpp.sh -u -b cuda     # CUDA (NVIDIA GPUs)
./scripts/build_llamacpp.sh -u -b rocm     # ROCm (AMD GPUs)
./scripts/build_llamacpp.sh -u -b cpu      # CPU only

# Clean rebuild
./scripts/build_llamacpp.sh -c -b vulkan

# Custom build directory
./scripts/build_llamacpp.sh -u -b cuda -d /custom/path

Build Options

Flag	Description
`-v`	Show version info and exit
`-u`	Update to latest from GitHub
`-c`	Clean build (remove existing)
`-b`	Backend: vulkan, cuda, rocm, cpu
`-d`	Build directory path
`-j`	Parallel jobs (default: CPU count)
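What the build script does internally is not shown here. As a rough illustration of how a backend name could translate into a CMake option, the sketch below maps each `-b` value to the option name used in recent upstream llama.cpp build documentation. These option names have been renamed across llama.cpp versions, so treat them as assumptions and check your checkout; `backend_cmake_flag` is a hypothetical helper.

```shell
# Hypothetical helper: maps a backend name to a CMake option.
# Option names are assumptions based on recent upstream llama.cpp build docs.
backend_cmake_flag() {
    case "$1" in
        vulkan) echo "-DGGML_VULKAN=ON" ;;
        cuda)   echo "-DGGML_CUDA=ON" ;;
        rocm)   echo "-DGGML_HIP=ON" ;;
        cpu)    echo "" ;;                 # CPU-only build needs no extra option
        *)      echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Usage (run inside a llama.cpp checkout):
#   cmake -B build $(backend_cmake_flag vulkan)
#   cmake --build build --config Release
```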

Finding llama-bench

The benchmark script auto-detects llama-bench in these locations:

/DATA/Benchmark/llama.cpp/build/bin/llama-bench
~/Repo/llama.cpp/build/bin/llama-bench
~/lab/build/bin/llama-bench

If llama-bench is not found in any of these locations, the script searches your home directory. You can also build it with the build script described above.
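The lookup over fixed locations can be sketched as a short loop. This is not the benchmark script's actual code: `find_llama_bench` is a hypothetical function, and the fallback to PATH is an assumption that replaces the script's home-directory search.

```shell
# Hypothetical helper: prints the first executable llama-bench among the given
# candidate paths (or the documented defaults), then falls back to PATH.
find_llama_bench() {
    if [ "$#" -eq 0 ]; then
        set -- /DATA/Benchmark/llama.cpp/build/bin/llama-bench \
               "$HOME/Repo/llama.cpp/build/bin/llama-bench" \
               "$HOME/lab/build/bin/llama-bench"
    fi
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    command -v llama-bench   # non-zero exit status if it is not on PATH either
}

# Usage:
#   bench=$(find_llama_bench) || echo "llama-bench not found; build it first"
```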

Usage Guidance

This skill appears to do what it says: it will clone/update the llama.cpp GitHub repo and build llama-bench, then run local benchmarks on GGUF files. Before installing:

1) Be prepared to install and run build tools (git, cmake, make/ninja, a C/C++ compiler); the metadata doesn't list these dependencies.
2) Expect the build to use network access to GitHub and to write files under ~/Repo/llama.cpp and whatever output directory you choose.
3) The benchmark script searches your home directory and /DATA to find llama-bench; this only reads local paths but can traverse many files and may take time.
4) If you need to be extra cautious, review the upstream repository (https://github.com/ggerganov/llama.cpp), run the build inside a sandbox or VM, and ensure you have sufficient disk space and GPU drivers for the chosen backend.
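Because the metadata does not declare the build tools, it can help to check for them before running the build script. A minimal sketch follows; `missing_tools` is a hypothetical helper, and the tool list in the usage line is taken from the guidance above.

```shell
# Hypothetical helper: prints each named tool that is not on PATH and
# returns non-zero if any tool is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   missing_tools git cmake make cc || echo "install the tools listed above first"
```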

Capability Analysis

Type: OpenClaw Skill
Name: llamacpp-bench
Version: 1.0.0

The skill bundle provides legitimate tools for benchmarking LLM models using llama.cpp. The included bash scripts (benchmark_models.sh and build_llamacpp.sh) perform standard tasks such as searching for local executables, cloning the official llama.cpp repository from GitHub, and compiling the source code using CMake. No evidence of data exfiltration, persistence mechanisms, or malicious prompt injection was found.

Capability Assessment

ℹ Purpose & Capability

The skill's scripts and SKILL.md match the stated purpose: finding/building llama.cpp and running llama-bench. One minor inconsistency: the package metadata declares no required binaries, but the build/benchmark scripts assume tools like git, cmake, a C/C++ toolchain, and typical UNIX utilities (find, grep, make). These are expected for building llama.cpp but should be declared.

ℹ Instruction Scope

Runtime instructions and scripts are narrowly scoped to cloning/updating the llama.cpp repository, building it, and running llama-bench on local GGUF files. The benchmark script searches the user's home directory and /DATA to locate llama-bench (find ~ /DATA ...) — this is local-only scanning (no remote upload) but may traverse many user files. The build script runs git fetch/pull/clone (network access to GitHub) and compiles code locally; it may prompt interactively and will write under the chosen build directory.

✓ Install Mechanism

No remote arbitrary binary blobs or obscure download hosts are used; the build script clones from github.com/ggerganov/llama.cpp — a known upstream repository — and builds locally via cmake. No extract-from-unknown-URL operations detected.

✓ Credentials

The skill declares no environment variables or credentials. It references an optional LLAMA_BACKEND env var in docs (expected). It does not request or use tokens/secret env vars. Git operations are against a public GitHub repo and should not require credentials.

✓ Persistence & Privilege

The skill is not always-enabled and does not alter other skills or system-wide configuration. It creates/clobbers files under the chosen build directory (default ~/Repo/llama.cpp) and output directory (default ./benchmark_results), which is expected for a build/benchmark tool.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install llamacpp-bench
After installation, invoke the skill by name or use /llamacpp-bench
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: benchmark GGUF models with llama-bench, auto-detect llama-bench, batch benchmarking, and build/update llama.cpp from source

Metadata

Slug llamacpp-bench

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is llama.cpp Benchmark?

Run llama.cpp benchmarks on GGUF models to measure prompt processing (pp) and token generation (tg) performance. Use when the user wants to benchmark LLM mod... It is an AI Agent Skill for Claude Code / OpenClaw, with 119 downloads so far.

How do I install llama.cpp Benchmark?

Run "/install llamacpp-bench" in the OpenClaw or Claude Code chat to install the skill in one step. Building llama.cpp from source still requires git, cmake, and a C/C++ compiler.

Is llama.cpp Benchmark free?

Yes, llama.cpp Benchmark is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does llama.cpp Benchmark support?

llama.cpp Benchmark is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created llama.cpp Benchmark?

It is built and maintained by alexhegit (@alexhegit); the current version is v1.0.0.
