← 返回 Skills 市场

Model Throughput Tester

Name: Model Throughput Tester
Author: tsag1

作者 TSAG1 · GitHub ↗ · v1.0.3 · MIT-0

cross-platform ⚠ pending

总下载

当前安装

版本数

在 OpenClaw 中安装

/install model-throughput-tester

功能描述

Automation skill for Model Throughput Tester.

使用说明 (SKILL.md)

Model Throughput Tester

测试 LLM 模型的吞吐率（tokens/s）。支持两种模式：

Auto 模式：通过 openclaw infer model run 测试当前模型，无需 API Key
API 模式：直接调用 OpenAI 兼容 API，需要 URL 和 Key

触发规则

当用户提到以下关键词时，默认使用 Auto 模式测试当前 session 模型，无需 API Key：

吞吐率、tokens/s、测速、模型速度、模型性能、benchmark
模型测试、模型吞吐率测试、测一下吞吐率

Agent 触发时应自动执行：

python3 throughput.py --auto --model "\x3C当前session模型>"

不要询问用户是否需要 Key 或其他参数，直接用 Auto 模式开测。

核心能力

1. Auto 模式（无 Key，推荐）

自动检测当前 session 的模型并测试吞吐率，无需任何配置。

python3 throughput.py --auto

指定模型测试：

python3 throughput.py --auto --model "zai/glm-5-turbo"

2. API 模式（直接调用 API）

python3 throughput.py \
  --url https://api.example.com/v1 \
  --key sk-xxx \
  --models gpt-4o-mini,gpt-4o

3. 通用参数

参数	默认值	说明
`--iterations`	`3`	每个模型测试次数
`--max-tokens`	`512`	最大输出 token 数
`--test-prompt`	英文散文（夏天的田野）	测试提示词
`--timeout`	`60`	单次请求超时（秒）
`--output`	`throughput-report.md`	输出报告文件名
`--csv`	false	同时生成 CSV

Workflow

Auto 模式流程

1. 从 openclaw.json 读取当前 session 模型（provider/model）
2. 通过 openclaw infer model run 发送测试 prompt
3. 计时：命令开始 → 输出完成
4. 从返回文本估算 token 数（英文 0.75 word/token，中文 1.5 字/token）
5. 计算 tokens/s
6. 汇总输出报告

API 模式流程

1. 构造 /v1/chat/completions 请求
2. 计时：请求开始 → 最后一个 token
3. 从响应中提取 usage.completion_tokens（精确）
4. 计算 tokens/s、错误率
5. 汇总输出报告

指标定义

指标	说明
Tokens/s	吞吐率 = Output Tokens / Elapsed Time
Avg Latency	平均单次请求延迟
Avg Output Tokens	平均输出 token 数
Error Rate	错误请求占比

输出示例

# Model Throughput Report
**Mode:** Auto (openclaw infer)
**Iterations:** 3

## Summary
| Model | Avg Tokens/s | Avg Latency(s) | Avg Output Tokens | Error Rate |
|-------|-------------|----------------|-------------------|------------|
| zai/glm-5-turbo | 57.9 | 20.6 | 979.0 | 0.0% |

## Detail
### zai/glm-5-turbo
| Iter | Latency(s) | Output Tokens | Tokens/s | Status |
|------|------------|--------------|---------|--------|
| 1 | 19.5 | 950 | 48.7 | ✅ |
| 2 | 21.3 | 1010 | 47.4 | ✅ |
| 3 | 20.9 | 977 | 46.7 | ✅ |

错误处理

场景	Auto 模式	API 模式
未安装 openclaw	cli_error	—
模型不存在	api_error	http_404
网络超时	timeout	timeout
Token 估算	英文 0.75 word/token，中文 1.5 字/token	API 返回精确值

使用示例

安装后立即测试（Auto 模式）

# agent 触发时应传入当前模型
python3 ~/.openclaw/workspace/skills/model-throughput-tester/throughput.py --auto --model "\x3C当前session模型>"

# 或使用自动检测（可能不是 session 覆盖的模型）
python3 ~/.openclaw/workspace/skills/model-throughput-tester/throughput.py --auto

测试多个模型（API 模式）

python3 throughput.py \
  --url "https://api.openai.com/v1" \
  --key "sk-xxx" \
  --models "gpt-4o-mini,gpt-4o" \
  --iterations 5

自定义提示词

python3 throughput.py --auto \
  --test-prompt "Explain quantum computing in detail." \
  --iterations 5

技术实现

Auto 模式：openclaw infer model run --json，Python subprocess 调用
API 模式：urllib（Python 内置），OpenAI 兼容 /v1/chat/completions
计时精度：time.perf_counter() 纳秒级精度
Token 计数：API 模式优先 usage.completion_tokens（精确），Auto 模式按字符估算
URL 拼接：智能检测 /v1、/v4、/chat/completions 路径

注意事项

Auto 模式的吞吐率包含网关路由开销，会比直接 API 略低（约 1-3%）
Auto 模式 Token 数为估算值，API 模式为精确值
建议使用英文 prompt 以获得更准确的 token 估算
防缓存：每次迭代自动附加随机 seed 后缀

能力标签

requires-sensitive-credentials

如何使用

确保已安装 OpenClaw（本地或 Docker 部署）
在对话框中输入安装命令：/install model-throughput-tester
安装完成后，直接呼叫该 Skill 的名称或使用 /model-throughput-tester 触发
根据 Skill 的参数说明提供必要输入，即可获得结构化输出

版本历史

v1.0.3

新增触发规则：提到吞吐率/测速时默认使用Auto模式自动测试当前session模型

v1.0.2

新增Auto模式：无需API Key，通过openclaw infer自动测试当前模型吞吐率；默认英文prompt提高token估算精度；支持自动检测当前session模型

v1.0.1

优化描述，加入 benchmark、模型评测、延迟测试等搜索关键词；修复吞吐率计时 bug（urlopen 响应时间）和 reasoning_content 读取；添加 cache hit 检测；随机 prompt 后缀防缓存

v1.0.0

初始版本：测试 OpenAI 兼容 API 吞吐率，多模型批量、Markdown+CSV 报告

元数据

Slug model-throughput-tester

版本 1.0.3

许可证 MIT-0

累计安装 0

当前安装数 0

历史版本数 4

常见问题