Local Model Optimizer
/install local-model-optimizer
Auto-detect hardware → recommend models → configure Ollama → set up hybrid cloud/local routing.
Quick Start
# Full auto-setup: detect hardware, install Ollama, recommend + pull model, configure routing
python3 scripts/local-model-optimizer.py auto
# Hardware detection only
python3 scripts/local-model-optimizer.py detect
# Recommend models for your hardware (no install)
python3 scripts/local-model-optimizer.py recommend
# Set up hybrid routing (cloud for complex tasks, local for simple ones)
python3 scripts/local-model-optimizer.py routing
# Cost comparison: local vs cloud
python3 scripts/local-model-optimizer.py cost
Commands
auto — Full Automated Setup
- Detects GPU (NVIDIA/AMD/Apple Silicon), VRAM, RAM, CPU cores
- Queries Ollama model registry for compatible models
- Recommends top 3 models ranked by benchmark/size ratio
- Installs Ollama if not present
- Pulls recommended model
- Configures OpenClaw provider entry
- Sets up hybrid routing rules
- Runs verification test
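The "benchmark/size ratio" ranking above could be sketched roughly as follows. This is only an illustration, not the optimizer's actual code: the `Candidate` type, the `rank_models` function, and the sample model names and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str               # model tag (the sample names below are made up)
    size_gb: float          # size of the quantized model in GB
    benchmark_score: float  # any single quality score; higher is better


def rank_models(candidates, vram_gb, top_n=3):
    """Return up to top_n candidates that fit in vram_gb, best score-per-GB first."""
    fitting = [c for c in candidates if 0 < c.size_gb <= vram_gb]
    fitting.sort(key=lambda c: c.benchmark_score / c.size_gb, reverse=True)
    return fitting[:top_n]


# Hypothetical data, for illustration only.
SAMPLE = [
    Candidate("model-a", 2.0, 50.0),
    Candidate("model-b", 5.0, 70.0),
    Candidate("model-c", 9.0, 80.0),
    Candidate("model-d", 20.0, 90.0),
    Candidate("model-e", 4.0, 40.0),
]

if __name__ == "__main__":
    for candidate in rank_models(SAMPLE, vram_gb=16):
        print(candidate.name)
```

Note that a plain score-per-GB ratio tends to favor small models, so a real ranking may weight the two factors differently.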
detect — Hardware Detection
Reports:
- GPU model, VRAM, driver version (NVIDIA/AMD/Apple)
- System RAM (total/available)
- CPU model, core count, architecture
- Estimated model size capacity
- Compatibility tier: Tiny (≤4GB) / Small (4-8GB) / Medium (8-16GB) / Large (16-32GB) / XL (32GB+)
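As a minimal sketch, the tier mapping might look like the function below. The listed ranges share their boundary values (4, 8, 16, 32), so the boundary handling here is an assumption: 4 GB counts as Tiny (matching "≤4GB"), 32 GB counts as XL (matching "32GB+"), and 16 GB counts as Large (matching the sample configuration later in this document). The function name `classify_tier` is hypothetical.

```python
def classify_tier(vram_gb: float) -> str:
    """Map available memory in GB to a compatibility tier.

    Boundary handling is an assumption; the documented ranges overlap
    at 4, 8, 16 and 32 GB.
    """
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    if vram_gb <= 4:
        return "Tiny"
    if vram_gb < 8:
        return "Small"
    if vram_gb < 16:
        return "Medium"
    if vram_gb < 32:
        return "Large"
    return "XL"


if __name__ == "__main__":
    for gb in (2, 6, 12, 24, 48):
        print(gb, classify_tier(gb))
```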
recommend — Model Recommendations
Based on hardware tier, recommends from:
| Tier | VRAM | Models |
|---|---|---|
| Tiny | ≤4GB | Gemma 4 E2B, Phi-3.5 Mini, Qwen2.5-3B |
| Small | 4-8GB | Gemma 4 E4B, Llama 3.1 8B, Mistral 7B |
| Medium | 8-16GB | Gemma 4 12B, Llama 3.1 8B Q8, CodeGemma |
| Large | 16-32GB | Gemma 4 27B, Llama 3.1 70B Q4, Mixtral 8x7B |
| XL | 32GB+ | Gemma 4 27B Q8, Llama 3.1 70B Q8, DeepSeek V2 |
See references/model-matrix.md for full benchmark comparisons.
routing — Hybrid Cloud/Local Routing
Configures OpenClaw to route requests intelligently:
- Local: Simple Q&A, summarization, code completion, memory operations
- Cloud: Complex reasoning, multi-step planning, code generation, creative writing
Options:
- --strategy cost: minimize API spend (prefer local)
- --strategy quality: maximize output quality (prefer cloud)
- --strategy balanced: default, smart routing based on task complexity
- --cloud-provider <name>: which cloud provider to use for fallback (default: anthropic)
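One possible reading of the three strategies, sketched in Python. The task labels, the `choose_backend` function, and the exact behavior of each strategy are assumptions made for illustration: here `cost` always picks local, `quality` always picks cloud, and `balanced` follows the local/cloud task split listed above, sending unlisted tasks to the cloud. The real routing rules may be more nuanced.

```python
# Task labels are hypothetical identifiers for the categories listed above.
LOCAL_TASKS = {"simple_qa", "summarization", "code_completion", "memory_operations"}
CLOUD_TASKS = {"complex_reasoning", "multi_step_planning", "code_generation", "creative_writing"}


def choose_backend(task: str, strategy: str = "balanced") -> str:
    """Return "local" or "cloud" for a task label under a routing strategy."""
    if strategy == "cost":
        return "local"   # assumption: always prefer local to minimize API spend
    if strategy == "quality":
        return "cloud"   # assumption: always prefer cloud to maximize quality
    if strategy == "balanced":
        # Listed local tasks stay local; everything else falls back to cloud.
        return "local" if task in LOCAL_TASKS else "cloud"
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    print(choose_backend("summarization"))           # balanced routing
    print(choose_backend("code_generation", "cost")) # cost routing
```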
cost — Cost Analysis
Calculates monthly savings based on:
- Current API usage pattern (reads from OpenClaw logs if available)
- Estimated electricity cost for local inference
- Token throughput comparison
- Break-even analysis for hardware investment
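The arithmetic behind the savings and break-even figures can be sketched as below. The function names and the simple model (cloud spend minus electricity cost, hardware cost divided by monthly savings) are assumptions; the actual `cost` command may account for more factors, such as token throughput.

```python
def monthly_savings(cloud_api_cost: float, kwh_per_month: float, price_per_kwh: float) -> float:
    """Cloud API spend avoided per month, minus electricity for local inference."""
    return cloud_api_cost - kwh_per_month * price_per_kwh


def break_even_months(hardware_cost: float, savings_per_month: float):
    """Months until the hardware pays for itself, or None if it never does."""
    if savings_per_month <= 0:
        return None
    return hardware_cost / savings_per_month


if __name__ == "__main__":
    # Made-up inputs: $100/month of API usage, 50 kWh at $0.20/kWh, $900 GPU.
    savings = monthly_savings(100.0, 50.0, 0.20)
    print(savings, break_even_months(900.0, savings))
```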
Configuration
The optimizer writes to ~/.openclaw/local-model-config.json:
{
"hardware": { "gpu": "...", "vram_gb": 16, "ram_gb": 32, "tier": "Large" },
"model": { "name": "gemma4:27b", "quantization": "Q4_K_M", "size_gb": 15.2 },
"routing": { "strategy": "balanced", "local_tasks": [...], "cloud_tasks": [...] },
"performance": { "tokens_per_sec": 42, "first_token_ms": 180 }
}
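A minimal sketch of reading this file from another script, assuming the on-disk file is valid JSON with the top-level keys shown above. The helper names `load_config` and `summarize` are hypothetical and not part of the optimizer.

```python
import json
from pathlib import Path

# Path taken from the description above.
CONFIG_PATH = Path.home() / ".openclaw" / "local-model-config.json"


def load_config(path=CONFIG_PATH) -> dict:
    """Parse the optimizer's JSON config file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize(config: dict) -> str:
    """Build a one-line summary, tolerating missing sections."""
    hardware = config.get("hardware", {})
    model = config.get("model", {})
    routing = config.get("routing", {})
    return (
        f"{model.get('name', 'unknown')} "
        f"[{model.get('quantization', 'unknown')}] "
        f"tier={hardware.get('tier', 'unknown')} "
        f"strategy={routing.get('strategy', 'unknown')}"
    )


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        print(summarize(load_config()))
    else:
        print(f"No config found at {CONFIG_PATH}")
```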
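- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install local-model-optimizer
- Once installation finishes, invoke the Skill by name or trigger it with /local-model-optimizer
- Provide the required inputs described in the Skill's parameter notes to get structured output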
What is Local Model Optimizer?
Auto-detect hardware (GPU VRAM, system RAM, CPU), recommend optimal local models from Ollama registry, configure Ollama with tuned parameters, and set up hyb... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 90 downloads to date.
How do I install Local Model Optimizer?
Run the command "/install local-model-optimizer" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.
Is Local Model Optimizer free?
Yes. Local Model Optimizer is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Local Model Optimizer support?
Local Model Optimizer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Local Model Optimizer?
It is developed and maintained by stevojarvisai-star (@stevojarvisai-star). The current version is v1.0.0.