
llama.cpp Benchmark

by alexhegit · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Install in OpenClaw
/install llamacpp-bench
Description
Run llama.cpp benchmarks on GGUF models to measure prompt processing (pp) and token generation (tg) performance. Use when the user wants to benchmark LLM mod...
README (SKILL.md)

llamacpp-bench

Run standardized benchmarks on GGUF models using llama.cpp's llama-bench tool.

Quick Start

# Basic benchmark
llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

# With specific backend
LLAMA_BACKEND=vulkan llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

Benchmark Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| -m | Model path (GGUF file) | required |
| -p | Prompt sizes to test | 512 |
| -n | Generation lengths to test | 128 |
| -ngl | GPU layers to offload | 99 |
| -t | CPU threads | auto |
| -dev | Device selection | auto |

Standard Test Suite

For consistent comparisons across models, use:

-p 512,1024,2048 -n 128,256 -ngl 99

This tests:

  • Prompt processing: 512, 1024, 2048 tokens
  • Token generation: 128, 256 tokens

Interpreting Results

| Metric | Meaning | Good Performance |
| --- | --- | --- |
| pp512 | Prompt processing speed at 512 tokens | >1000 t/s |
| pp1024 | Prompt processing speed at 1024 tokens | >1000 t/s |
| pp2048 | Prompt processing speed at 2048 tokens | >1000 t/s |
| tg128 | Token generation speed (128 tokens) | >50 t/s |
| tg256 | Token generation speed (256 tokens) | >50 t/s |
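
To check a run against the thresholds above without reading the table by eye, the output can be post-processed. The following is a minimal sketch, assuming llama-bench's default markdown table output with `test` and `t/s` columns. `summarize_bench` is a hypothetical helper name, not part of llama.cpp or this skill.

```shell
# summarize_bench: read a llama-bench markdown table on stdin and print
# "<test> <average t/s> <ok|low>" for each result row.
# Thresholds follow the table above: >1000 t/s for pp*, >50 t/s for tg*.
summarize_bench() {
  awk -F'|' '
    # Until the header row is seen, look for the "test" and "t/s" columns.
    !found {
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        if (name == "test") tcol = i
        if (name == "t/s")  scol = i
      }
      if (tcol && scol) found = 1
      next
    }
    # Ignore blank lines and non-table lines such as the trailing build info.
    NF < scol { next }
    # Ignore the |---| separator row under the header.
    $tcol ~ /^[ \t:-]+$/ { next }
    {
      test = $tcol
      gsub(/[ \t]/, "", test)
      rate = $scol + 0          # numeric prefix only; drops the "± stddev" part
      limit = (test ~ /^pp/) ? 1000 : 50
      printf "%s %.2f %s\n", test, rate, (rate > limit ? "ok" : "low")
    }
  '
}
```

For example, `llama-bench -m model.gguf -p 512 -n 128 | summarize_bench` would print one line per test, each with the test name, its average rate, and `ok` or `low`.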

Backend Selection

llama-bench auto-detects available backends. Priority order:

  1. CUDA (NVIDIA GPUs)
  2. ROCm (AMD GPUs)
  3. Vulkan (cross-platform GPU)
  4. CPU (fallback)

To force a backend, set the LLAMA_BACKEND environment variable as shown in Quick Start, or check which backends your build includes:

# Check available backends
llama-bench --help | grep -i "backend\|cuda\|rocm\|vulkan"

Batch Benchmarking

Use the provided script for benchmarking multiple models:

./scripts/benchmark_models.sh /path/to/models/*.gguf
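
The contents of benchmark_models.sh are not shown here. As a rough sketch of what such a batch loop might look like (not the skill's actual script), assuming a hypothetical `LLAMA_BENCH` variable that overrides the binary path:

```shell
# bench_models: run the standard suite on each GGUF file given as an argument.
# LLAMA_BENCH is a hypothetical override for the binary location (assumption).
bench_models() {
  local bench="${LLAMA_BENCH:-llama-bench}"
  local model failed=0
  for model in "$@"; do
    # An unmatched glob such as *.gguf is passed through literally; skip it.
    if [ ! -f "$model" ]; then
      echo "skipping (not a file): $model" >&2
      continue
    fi
    echo "== $model =="
    # Keep going if one model fails, but remember it for the exit status.
    "$bench" -m "$model" -p 512,1024,2048 -n 128,256 -ngl 99 || failed=1
  done
  return "$failed"
}
```

Called as `bench_models /path/to/models/*.gguf`, this prints a header line per model followed by that model's llama-bench output.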

Saving Results

Output can be redirected to a file:

llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 > results.txt

Or use the benchmark script which auto-saves to timestamped files.
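
A timestamped save can also be done by hand. The following is a minimal sketch, assuming ./benchmark_results as the output directory and a hypothetical `LLAMA_BENCH` override. The file naming scheme is illustrative and may differ from what benchmark_models.sh produces.

```shell
# save_bench: run the standard suite on one model and write the output to
# <outdir>/<model name>_<YYYYmmdd_HHMMSS>.txt, then print that path.
# The naming scheme and the LLAMA_BENCH override are assumptions.
save_bench() {
  local model="$1"
  local outdir="${2:-./benchmark_results}"
  local bench="${LLAMA_BENCH:-llama-bench}"
  local stamp out
  stamp="$(date +%Y%m%d_%H%M%S)"
  out="$outdir/$(basename "$model" .gguf)_${stamp}.txt"
  mkdir -p "$outdir" || return 1
  if "$bench" -m "$model" -p 512,1024,2048 -n 128,256 -ngl 99 > "$out"; then
    printf '%s\n' "$out"
  else
    return 1
  fi
}
```

Usage: `save_bench model.gguf` writes into ./benchmark_results, and `save_bench model.gguf /tmp/runs` writes into a directory of your choice.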

Common Issues

  1. Out of memory: Reduce -ngl (GPU layers) or test smaller prompt sizes
  2. Slow CPU performance: Set -t to the number of physical CPU cores
  3. Backend not found: Check llama.cpp was built with the desired backend
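
For the out-of-memory case, one approach is to retry with progressively fewer offloaded layers. This is a minimal sketch: the step values are arbitrary, the `LLAMA_BENCH` override is hypothetical, and it assumes llama-bench exits non-zero when a run fails, which is worth verifying on your build.

```shell
# bench_with_ngl_fallback: try a short benchmark with decreasing -ngl values
# until one run succeeds. The step list is an arbitrary illustration.
bench_with_ngl_fallback() {
  local model="$1"
  local bench="${LLAMA_BENCH:-llama-bench}"
  local ngl
  for ngl in 99 48 24 12 0; do
    if "$bench" -m "$model" -p 512 -n 128 -ngl "$ngl"; then
      echo "completed with -ngl $ngl" >&2
      return 0
    fi
    echo "failed with -ngl $ngl, retrying with fewer layers" >&2
  done
  # Even the CPU-only run (-ngl 0) failed.
  return 1
}
```

The last successful -ngl value is reported on stderr, so the benchmark table on stdout can still be redirected to a file.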

Building / Updating llama.cpp

Check Current Version

./scripts/build_llamacpp.sh -v

Shows:

  • Current Git commit and branch
  • Build date
  • Whether behind upstream
  • Available backends

Build or Update

# Interactive mode (prompts for backend selection)
./scripts/build_llamacpp.sh -u

# Specify backend directly
./scripts/build_llamacpp.sh -u -b vulkan   # Vulkan (AMD/Intel GPUs)
./scripts/build_llamacpp.sh -u -b cuda     # CUDA (NVIDIA GPUs)
./scripts/build_llamacpp.sh -u -b rocm     # ROCm (AMD GPUs)
./scripts/build_llamacpp.sh -u -b cpu      # CPU only

# Clean rebuild
./scripts/build_llamacpp.sh -c -b vulkan

# Custom build directory
./scripts/build_llamacpp.sh -u -b cuda -d /custom/path

Build Options

| Flag | Description |
| --- | --- |
| -v | Show version info and exit |
| -u | Update to latest from GitHub |
| -c | Clean build (remove existing) |
| -b | Backend: vulkan, cuda, rocm, cpu |
| -d | Build directory path |
| -j | Parallel jobs (default: CPU count) |
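
The internals of build_llamacpp.sh are not shown here. As a rough sketch of how a -b value might map onto a CMake option, assuming the GGML_* option names used by recent llama.cpp checkouts (older checkouts used different names); `backend_cmake_flag` is a hypothetical helper:

```shell
# backend_cmake_flag: print the CMake option for a backend name.
# Option names are assumed from recent llama.cpp checkouts; verify them
# against the build documentation of the commit you are building.
backend_cmake_flag() {
  case "$1" in
    vulkan) echo "-DGGML_VULKAN=ON" ;;
    cuda)   echo "-DGGML_CUDA=ON" ;;
    rocm)   echo "-DGGML_HIP=ON" ;;
    cpu)    echo "" ;;   # CPU-only build needs no extra option
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}
```

Run from the llama.cpp checkout, a manual build could then look like `cmake -B build $(backend_cmake_flag vulkan)` followed by `cmake --build build --config Release -j 8`, with the job count adjusted to your machine.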

Finding llama-bench

The benchmark script auto-detects llama-bench in these locations:

  • /DATA/Benchmark/llama.cpp/build/bin/llama-bench
  • ~/Repo/llama.cpp/build/bin/llama-bench
  • ~/lab/build/bin/llama-bench

If llama-bench is not found in any of these locations, the script searches your home directory. You can also build it with build_llamacpp.sh as described above.
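
The lookup described above can be sketched as follows. This is an illustration, not the skill's actual script: `find_llama_bench` is a hypothetical helper, and the final PATH fallback is an assumption (the documented behavior is a home-directory search).

```shell
# find_llama_bench: print the first usable llama-bench binary.
# Extra candidate paths may be passed as arguments; they are tried first.
find_llama_bench() {
  local candidate
  for candidate in "$@" \
      /DATA/Benchmark/llama.cpp/build/bin/llama-bench \
      "$HOME/Repo/llama.cpp/build/bin/llama-bench" \
      "$HOME/lab/build/bin/llama-bench"; do
    # Accept only executable regular files, not directories.
    if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # Assumption: fall back to PATH rather than scanning the home directory.
  command -v llama-bench
}
```

The function returns non-zero when nothing is found, so it can be used as `bench="$(find_llama_bench)" || echo "llama-bench not found" >&2`.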

Usage Guidance
This skill appears to do what it says: it will clone/update the llama.cpp GitHub repo and build llama-bench, then run local benchmarks on GGUF files. Before installing:

  1. Be prepared to install and run build tools (git, cmake, make/ninja, a C/C++ compiler); the metadata doesn't list these dependencies.
  2. Expect the build to use network access to GitHub and to write files under ~/Repo/llama.cpp and whatever output directory you choose.
  3. The benchmark script searches your home directory and /DATA to find llama-bench; this only reads local paths but can traverse many files and may take time.
  4. If you need to be extra cautious, review the upstream repository (https://github.com/ggerganov/llama.cpp) and run the build inside a sandbox or VM, and ensure you have sufficient disk space and GPU drivers for the chosen backend.
Capability Analysis
Type: OpenClaw Skill
Name: llamacpp-bench
Version: 1.0.0

The skill bundle provides legitimate tools for benchmarking LLM models using llama.cpp. The included bash scripts (benchmark_models.sh and build_llamacpp.sh) perform standard tasks such as searching for local executables, cloning the official llama.cpp repository from GitHub, and compiling the source code using CMake. No evidence of data exfiltration, persistence mechanisms, or malicious prompt injection was found.
Capability Assessment
Purpose & Capability
The skill's scripts and SKILL.md match the stated purpose: finding/building llama.cpp and running llama-bench. One minor inconsistency: the package metadata declares no required binaries, but the build/benchmark scripts assume tools like git, cmake, a C/C++ toolchain, and typical UNIX utilities (find, grep, make). These are expected for building llama.cpp but should be declared.
Instruction Scope
Runtime instructions and scripts are narrowly scoped to cloning/updating the llama.cpp repository, building it, and running llama-bench on local GGUF files. The benchmark script searches the user's home directory and /DATA to locate llama-bench (find ~ /DATA ...) — this is local-only scanning (no remote upload) but may traverse many user files. The build script runs git fetch/pull/clone (network access to GitHub) and compiles code locally; it may prompt interactively and will write under the chosen build directory.
Install Mechanism
No remote arbitrary binary blobs or obscure download hosts are used; the build script clones from github.com/ggerganov/llama.cpp — a known upstream repository — and builds locally via cmake. No extract-from-unknown-URL operations detected.
Credentials
The skill declares no environment variables or credentials. It references an optional LLAMA_BACKEND env var in docs (expected). It does not request or use tokens/secret env vars. Git operations are against a public GitHub repo and should not require credentials.
Persistence & Privilege
The skill is not always-enabled and does not alter other skills or system-wide configuration. It creates/clobbers files under the chosen build directory (default ~/Repo/llama.cpp) and output directory (default ./benchmark_results), which is expected for a build/benchmark tool.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install llamacpp-bench
  3. After installation, invoke the skill by name or use /llamacpp-bench
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: benchmark GGUF models with llama-bench, auto-detect llama-bench, batch benchmarking, and build/update llama.cpp from source
Metadata
Slug llamacpp-bench
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is llama.cpp Benchmark?

Run llama.cpp benchmarks on GGUF models to measure prompt processing (pp) and token generation (tg) performance. Use when the user wants to benchmark LLM mod... It is an AI Agent Skill for Claude Code / OpenClaw, with 119 downloads so far.

How do I install llama.cpp Benchmark?

Run "/install llamacpp-bench" in the OpenClaw or Claude Code chat to install the skill in one step. Building llama.cpp itself still requires git, cmake, and a C/C++ toolchain on the host.

Is llama.cpp Benchmark free?

Yes, llama.cpp Benchmark is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does llama.cpp Benchmark support?

llama.cpp Benchmark is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created llama.cpp Benchmark?

It is built and maintained by alexhegit (@alexhegit); the current version is v1.0.0.
