TurboQuant+ KV Cache Compression

Name: TurboQuant+ KV Cache Compression
Author: wukai8289

By wukai8289 · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 122 · Current installs: 0 · Versions: 1

Install in OpenClaw

/install turboquant-plus

Description

TurboQuant+ compresses llama.cpp KV caches on Apple Silicon up to 6.4x with minimal quality loss, enabling larger models and longer contexts efficiently.
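To make the "up to 6.4x" figure concrete, the sketch below estimates the size of an uncompressed KV cache and what the same cache would occupy at that best-case ratio. The model dimensions (32 layers, 8 KV heads, head dimension 128, 32,768-token context, 16-bit values) are hypothetical and chosen only for illustration; they do not come from this listing, and the size formula is the usual back-of-the-envelope estimate rather than TurboQuant+'s own accounting.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value=2):
    """Estimate KV cache size: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value


def compressed_bytes(raw_bytes, ratio):
    """Size after compression at the given ratio (e.g. 6.4 for 6.4x)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_bytes / ratio


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical model shape, for illustration only.
    raw = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=32768)
    best_case = compressed_bytes(raw, 6.4)  # 6.4x is the listing's "up to" figure
    print(f"uncompressed: {raw / GIB:.3f} GiB")
    print(f"at 6.4x:      {best_case / GIB:.3f} GiB")
```

Because 6.4x is stated as an upper bound, the second number is a floor on memory use, not an expected value.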

Safe usage recommendations

This skill appears coherent for configuring TurboQuant+ with llama.cpp, but take these precautions before proceeding:

1) Verify that the external GitHub fork (TheTom/llama-cpp-turboquant) is the intended project, and review its source and commit history before building.
2) Build and run the code in an isolated or trusted environment (a container or a dedicated machine) if possible.
3) Be cautious with the suggested sudo sysctl change (iogpu.wired_limit_mb): it requires elevated privileges and changes system GPU memory limits until reboot. Back up important state and understand the impact first.
4) Prefer official releases or tags over an unknown commit or branch.
5) Check checksums or signatures for any downloaded model files.

If you are uncomfortable reviewing or building third-party native code, treat this as an operational risk and avoid running the build on production systems.
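The checksum precaution can be sketched with Python's standard library. This is a minimal illustration, assuming the model publisher provides a SHA-256 hex digest; the function names and the file path in the usage comment are hypothetical, and the listing itself does not specify which hash or signature scheme applies.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large model files need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


# Usage (hypothetical file name and digest):
#   if not matches_checksum("model.gguf", "<published sha256 hex>"):
#       raise SystemExit("checksum mismatch: do not load this file")
```

A matching digest only shows that the file is the one the publisher hashed; it says nothing about whether the publisher is trustworthy.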

Analysis

Type: OpenClaw Skill
Name: turboquant-plus
Version: 1.0.0

The skill bundle instructs the agent to perform high-risk operations, including cloning and compiling an external GitHub repository (TheTom/llama-cpp-turboquant) and executing a system-level command with elevated privileges (sudo sysctl iogpu.wired_limit_mb) to modify GPU memory limits. Furthermore, the documentation contains likely fabricated references, such as an 'ICLR 2026' paper and a 2026 timestamp in _meta.json. While these actions are contextually related to LLM optimization, the combination of sudo requirements, unverified third-party code execution, and hallucinated references constitutes a significant security risk.

Capability assessment

✓ Purpose & Capability

Name/description claim KV cache compression for llama.cpp on Apple Silicon; the SKILL.md and README exclusively describe using a TurboQuant llama.cpp fork, relevant CLI flags, and platform-specific tuning. No unrelated credentials, binaries, or services are requested.

ℹ Instruction Scope

Instructions stay on-topic (clone/build the turboquant fork, run llama-server with cache-type flags). They also recommend a system-level change (sudo sysctl iogpu.wired_limit_mb) to raise GPU memory caps for large contexts — this is relevant to the stated goal but requires elevated privileges and modifies system state. No instructions collect or transmit user data to unexpected endpoints.
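Since the sysctl change is privileged and modifies system state, one cautious pattern is to print the commands for review instead of executing them, and to read the current value first so it can be noted down. The sketch below only builds command strings and runs nothing. The key name comes from the review above; the 24576 MB value is a hypothetical example, not a recommendation from the skill.

```python
import shlex

SYSCTL_KEY = "iogpu.wired_limit_mb"  # key named in the review above


def wired_limit_commands(new_limit_mb):
    """Return the read and write command lines as strings; nothing is executed."""
    if not isinstance(new_limit_mb, int) or new_limit_mb <= 0:
        raise ValueError("new_limit_mb must be a positive integer (megabytes)")
    read_cmd = ["sysctl", "-n", SYSCTL_KEY]  # show the current value
    write_cmd = ["sudo", "sysctl", f"{SYSCTL_KEY}={new_limit_mb}"]  # needs root
    return [shlex.join(read_cmd), shlex.join(write_cmd)]


if __name__ == "__main__":
    # 24576 MB is an illustrative value only.
    for line in wired_limit_commands(24576):
        print(line)
```

Keeping the step as a dry run leaves the decision to raise the limit, and the choice of value, with the person who owns the machine.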

ℹ Install Mechanism

The skill is instruction-only (no install spec), but its README instructs cloning and building a GitHub repository (TheTom/llama-cpp-turboquant). Downloading and compiling third-party code from GitHub is common for this domain but is a moderate operational risk if the repository is untrusted or has malicious contents. The skill itself does not provide an automated installer or opaque download URLs.

✓ Credentials

No environment variables, credentials, or config paths are requested. The requested actions (build/run a local server, sysctl) are proportionate to compressing KV caches for local inference.

✓ Persistence & Privilege

Skill does not request persistent inclusion (always: false) and does not attempt to modify other skills or agent-wide configs. It does recommend a one-off privileged sysctl change (requires sudo) which alters system GPU memory limits until reboot; this is a legitimate but privileged action and not an automatic persistent installation by the skill.

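How to use

1) Make sure OpenClaw is installed (locally or via Docker).
2) Enter the install command in the chat box: /install turboquant-plus
3) Once installation completes, invoke the Skill by name or trigger it with /turboquant-plus
4) Provide the required inputs according to the Skill's parameter description to get structured output.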

Version history

v1.0.0

v1.0: TurboQuant+ KV cache compression guide, supporting local LLM inference on Apple Silicon

Metadata

Slug: turboquant-plus

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Version count: 1

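FAQ

What is TurboQuant+ KV Cache Compression?

TurboQuant+ compresses llama.cpp KV caches on Apple Silicon up to 6.4x with minimal quality loss, enabling larger models and longer contexts efficiently. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 122 total downloads so far.

How do I install TurboQuant+ KV Cache Compression?

Run the command "/install turboquant-plus" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is TurboQuant+ KV Cache Compression free?

Yes. TurboQuant+ KV Cache Compression is completely free, is released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does TurboQuant+ KV Cache Compression support?

The Skill itself is tagged cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed. The llama.cpp workflow it describes targets Apple Silicon.

Who developed TurboQuant+ KV Cache Compression?

It is developed and maintained by wukai8289 (@wukai8289). The current version is v1.0.0.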