alexsjones

llmfit

by Alex Jones · GitHub ↗ · v0.2.2
cross-platform ⚠ suspicious
960 Downloads · 1 Star · 7 Active Installs · 2 Versions
Install in OpenClaw
/install llmfit
Description
Detect local hardware (RAM, CPU, GPU/VRAM) and recommend the best-fit local LLM models with optimal quantization, speed estimates, and fit scoring.
README (SKILL.md)

llmfit-advisor

Hardware-aware local LLM advisor. Detects your system specs (RAM, CPU, GPU/VRAM) and recommends models that actually fit, with optimal quantization and speed estimates.

When to use (trigger phrases)

Use this skill immediately when the user asks any of:

  • "what local models can I run?"
  • "which LLMs fit my hardware?"
  • "recommend a local model"
  • "what's the best model for my GPU?"
  • "can I run Llama 70B locally?"
  • "configure local models"
  • "set up Ollama models"
  • "what models fit my VRAM?"
  • "help me pick a local model for coding"

Also use this skill when:

  • The user wants to configure models.providers.ollama or models.providers.lmstudio
  • The user mentions running models locally and you need to know what fits
  • A model recommendation is needed and the user has local inference capability (Ollama, vLLM, LM Studio)

Quick start

Detect hardware

llmfit --json system

Returns JSON with CPU, RAM, GPU name, VRAM, multi-GPU info, and whether memory is unified (Apple Silicon).

Get top recommendations

llmfit recommend --json --limit 5

Returns the top 5 models ranked by a composite score (quality, speed, fit, context) with optimal quantization for the detected hardware.

Filter by use case

llmfit recommend --json --use-case coding --limit 3
llmfit recommend --json --use-case reasoning --limit 3
llmfit recommend --json --use-case chat --limit 3

Valid use cases: general, coding, reasoning, chat, multimodal, embedding.

Filter by minimum fit level

llmfit recommend --json --min-fit good --limit 10

Valid fit levels (best to worst): perfect, good, marginal.
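
The flags above can be assembled programmatically. The following is a minimal sketch in Python: `build_recommend_args` is a hypothetical helper, not part of llmfit, and it assumes that `--use-case` and `--min-fit` may be combined in one call, which the examples above do not confirm. The allowed values are copied from the lists above.

```python
from typing import List, Optional

# Values copied from this README; the installed CLI remains the source of truth.
USE_CASES = {"general", "coding", "reasoning", "chat", "multimodal", "embedding"}
FIT_LEVELS = {"perfect", "good", "marginal"}


def build_recommend_args(use_case: Optional[str] = None,
                         min_fit: Optional[str] = None,
                         limit: int = 5) -> List[str]:
    """Build an argv list for `llmfit recommend --json` (hypothetical helper)."""
    if use_case is not None and use_case not in USE_CASES:
        raise ValueError(f"unknown use case: {use_case!r}")
    if min_fit is not None and min_fit not in FIT_LEVELS:
        raise ValueError(f"unknown fit level: {min_fit!r}")
    if limit < 1:
        raise ValueError("limit must be at least 1")

    args = ["llmfit", "recommend", "--json"]
    if use_case is not None:
        args += ["--use-case", use_case]
    if min_fit is not None:
        # Assumption: this flag can be combined with --use-case.
        args += ["--min-fit", min_fit]
    args += ["--limit", str(limit)]
    return args


print(build_recommend_args(use_case="coding", limit=3))
```

Validating the values before invoking the binary keeps a typo such as `--use-case code` from reaching the CLI at all.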

Understanding the output

System JSON

{
  "system": {
    "cpu_name": "Apple M2 Max",
    "cpu_cores": 12,
    "total_ram_gb": 32.0,
    "available_ram_gb": 24.5,
    "has_gpu": true,
    "gpu_name": "Apple M2 Max",
    "gpu_vram_gb": 32.0,
    "gpu_count": 1,
    "backend": "Metal",
    "unified_memory": true
  }
}
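
As an illustration of how an agent might turn this JSON into a one-line hardware summary, here is a minimal Python sketch. `summarize_system` is a hypothetical helper; it assumes the output always has the top-level `system` key and the field names shown in the sample above.

```python
import json

# The sample document from above, embedded so the sketch is self-contained.
SAMPLE = """
{
  "system": {
    "cpu_name": "Apple M2 Max",
    "cpu_cores": 12,
    "total_ram_gb": 32.0,
    "available_ram_gb": 24.5,
    "has_gpu": true,
    "gpu_name": "Apple M2 Max",
    "gpu_vram_gb": 32.0,
    "gpu_count": 1,
    "backend": "Metal",
    "unified_memory": true
  }
}
"""


def summarize_system(raw: str) -> str:
    """Return a short human-readable summary of `llmfit --json system` output."""
    system = json.loads(raw)["system"]
    if system["has_gpu"]:
        kind = "unified memory" if system["unified_memory"] else "dedicated VRAM"
        gpu = (f'{system["gpu_name"]} ({system["gpu_vram_gb"]} GB {kind}, '
               f'{system["backend"]})')
    else:
        gpu = "no GPU detected"
    return (f'{system["cpu_name"]}, {system["cpu_cores"]} cores, '
            f'{system["available_ram_gb"]} of {system["total_ram_gb"]} GB RAM available; '
            f'GPU: {gpu}')


print(summarize_system(SAMPLE))
```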

Recommendation JSON

Each model in the models array includes:

Field                 Meaning
name                  HuggingFace model ID (e.g. meta-llama/Llama-3.1-8B-Instruct)
provider              Model provider (Meta, Alibaba, Google, etc.)
params_b              Parameter count in billions
score                 Composite score 0–100 (higher is better)
score_components      Breakdown: quality, speed, fit, context (each 0–100)
fit_level             Perfect, Good, Marginal, or TooTight
run_mode              GPU, CPU+GPU Offload, or CPU Only
best_quant            Optimal quantization for the hardware (e.g. Q5_K_M, Q4_K_M)
estimated_tps         Estimated tokens per second
memory_required_gb    VRAM/RAM needed at this quantization
memory_available_gb   Available VRAM/RAM detected
utilization_pct       How much of available memory the model uses
use_case              What the model is designed for
context_length        Maximum context window
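
To show how these fields might be presented to a user, here is a minimal Python sketch. `format_recommendation` is a hypothetical helper, and the record below uses made-up illustrative values, not real llmfit output.

```python
def format_recommendation(model: dict) -> str:
    """Render one entry of the models array as a single summary line."""
    return (
        f'{model["name"]} [{model["best_quant"]}]: '
        f'score {model["score"]:.0f}/100, {model["fit_level"]} fit, '
        f'{model["run_mode"]}, ~{model["estimated_tps"]:.0f} tok/s, '
        f'{model["memory_required_gb"]:.1f} of '
        f'{model["memory_available_gb"]:.1f} GB'
    )


# Illustrative record only; the numbers are invented for this example.
example = {
    "name": "meta-llama/Llama-3.1-8B-Instruct",
    "provider": "Meta",
    "params_b": 8.0,
    "score": 83.0,
    "fit_level": "Perfect",
    "run_mode": "GPU",
    "best_quant": "Q5_K_M",
    "estimated_tps": 45.0,
    "memory_required_gb": 6.0,
    "memory_available_gb": 24.0,
}

print(format_recommendation(example))
```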

Fit levels explained

  • Perfect: Model fits comfortably with room to spare. Ideal choice.
  • Good: Model fits but uses most available memory. Will work well.
  • Marginal: Model barely fits. May work but expect slower performance or reduced context.
  • TooTight: Model does not fit. Do not recommend.
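
The ordering of fit levels lends itself to a small client-side filter. The sketch below is one possible approach in Python, assuming `fit_level` is spelled exactly as in the list above; `recommendable` is a hypothetical helper, and the sample records are invented.

```python
from typing import Dict, List

# Best to worst, as listed above.
FIT_ORDER = ["Perfect", "Good", "Marginal", "TooTight"]


def recommendable(models: List[Dict], min_fit: str = "Marginal") -> List[Dict]:
    """Keep models at or above min_fit, sorted by score (highest first).

    TooTight is never accepted as a minimum, so such models are always dropped.
    """
    if min_fit not in FIT_ORDER[:-1]:
        raise ValueError(f"min_fit must be one of {FIT_ORDER[:-1]}")
    cutoff = FIT_ORDER.index(min_fit)
    kept = [m for m in models if FIT_ORDER.index(m["fit_level"]) <= cutoff]
    return sorted(kept, key=lambda m: m["score"], reverse=True)


# Invented records for illustration.
sample_models = [
    {"name": "model-a", "score": 71.0, "fit_level": "Marginal"},
    {"name": "model-b", "score": 90.0, "fit_level": "TooTight"},
    {"name": "model-c", "score": 84.0, "fit_level": "Perfect"},
    {"name": "model-d", "score": 78.0, "fit_level": "Good"},
]

print([m["name"] for m in recommendable(sample_models, min_fit="Good")])
```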

Run modes explained

  • GPU: Full GPU inference. Fastest. Model weights loaded entirely into VRAM.
  • CPU+GPU Offload: Some layers on GPU, rest in system RAM. Slower than pure GPU.
  • CPU Only: All inference on CPU using system RAM. Slowest but works without GPU.

Configuring OpenClaw with results

After getting recommendations, configure the user's local model provider.

For Ollama

Map the HuggingFace model name to its Ollama tag. Common mappings:

llmfit name                                    Ollama tag
meta-llama/Llama-3.1-8B-Instruct               llama3.1:8b
meta-llama/Llama-3.3-70B-Instruct              llama3.3:70b
Qwen/Qwen2.5-Coder-7B-Instruct                 qwen2.5-coder:7b
Qwen/Qwen2.5-72B-Instruct                      qwen2.5:72b
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct    deepseek-coder-v2:16b
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B       deepseek-r1:32b
google/gemma-2-9b-it                           gemma2:9b
mistralai/Mistral-7B-Instruct-v0.3             mistral:7b
microsoft/Phi-3-mini-4k-instruct               phi3:mini
microsoft/Phi-4-mini-instruct                  phi4-mini
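
The mapping above can be kept as a plain lookup table. This is a minimal Python sketch: `ollama_model_id` is a hypothetical helper, and it deliberately returns `None` for names outside the table rather than guessing a tag.

```python
from typing import Optional

# Copied from the mapping table above.
OLLAMA_TAGS = {
    "meta-llama/Llama-3.1-8B-Instruct": "llama3.1:8b",
    "meta-llama/Llama-3.3-70B-Instruct": "llama3.3:70b",
    "Qwen/Qwen2.5-Coder-7B-Instruct": "qwen2.5-coder:7b",
    "Qwen/Qwen2.5-72B-Instruct": "qwen2.5:72b",
    "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct": "deepseek-coder-v2:16b",
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B": "deepseek-r1:32b",
    "google/gemma-2-9b-it": "gemma2:9b",
    "mistralai/Mistral-7B-Instruct-v0.3": "mistral:7b",
    "microsoft/Phi-3-mini-4k-instruct": "phi3:mini",
    "microsoft/Phi-4-mini-instruct": "phi4-mini",
}


def ollama_model_id(hf_name: str) -> Optional[str]:
    """Return 'ollama/<tag>' for a known HuggingFace name, else None."""
    tag = OLLAMA_TAGS.get(hf_name)
    return f"ollama/{tag}" if tag is not None else None


print(ollama_model_id("Qwen/Qwen2.5-Coder-7B-Instruct"))
```

When the lookup returns `None`, the safer path is to ask the user or check the Ollama library for the right tag.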

Then update openclaw.json:

{
  "models": {
    "providers": {
      "ollama": {
        "models": ["ollama/<ollama-tag>"]
      }
    }
  }
}

And optionally set as default:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/<ollama-tag>"
      }
    }
  }
}
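
The two edits above could be applied to an already-loaded config as follows. This is a sketch only: `add_ollama_model` is a hypothetical helper, it assumes openclaw.json has exactly the key layout shown above, and it works on a copy so the caller decides when to write the file back.

```python
import copy
import json


def add_ollama_model(config: dict, model_id: str, make_default: bool = False) -> dict:
    """Return a copy of config with model_id added under models.providers.ollama."""
    updated = copy.deepcopy(config)
    ollama = (updated.setdefault("models", {})
                     .setdefault("providers", {})
                     .setdefault("ollama", {}))
    models = ollama.setdefault("models", [])
    if model_id not in models:  # avoid duplicate entries on repeated runs
        models.append(model_id)
    if make_default:
        (updated.setdefault("agents", {})
                .setdefault("defaults", {})
                .setdefault("model", {}))["primary"] = model_id
    return updated


existing = {"models": {"providers": {"ollama": {"models": ["ollama/mistral:7b"]}}}}
result = add_ollama_model(existing, "ollama/llama3.1:8b", make_default=True)
print(json.dumps(result, indent=2))
```

Because the helper leaves unrelated keys untouched, it is safe to run on a config that already contains other providers or agent settings.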

For vLLM / LM Studio

Use the HuggingFace model name directly as the model identifier with the appropriate provider prefix (vllm/ or lmstudio/).
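
As a small illustration of the prefix rule, assuming the identifier is simply the prefix joined to the unmodified HuggingFace name (`provider_model_id` is a hypothetical helper):

```python
def provider_model_id(provider: str, hf_name: str) -> str:
    """Prefix a HuggingFace model name with 'vllm/' or 'lmstudio/'."""
    if provider not in ("vllm", "lmstudio"):
        raise ValueError(f"unsupported provider: {provider!r}")
    return f"{provider}/{hf_name}"


print(provider_model_id("vllm", "meta-llama/Llama-3.1-8B-Instruct"))
```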

Workflow example

When a user asks "what local models can I run?":

  1. Run llmfit --json system to show hardware summary
  2. Run llmfit recommend --json --limit 5 to get top picks
  3. Present the recommendations with scores and fit levels
  4. If the user wants to configure one, map it to the appropriate Ollama/vLLM/LM Studio tag
  5. Offer to update openclaw.json with the chosen model

When a user asks for a specific use case like "recommend a coding model":

  1. Run llmfit recommend --json --use-case coding --limit 3
  2. Present the coding-specific recommendations
  3. Offer to pull via Ollama and configure
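
The first workflow could be scripted roughly as below. This is a sketch under stated assumptions: the `llmfit` binary is on PATH, its JSON output has the top-level `system` and `models` keys described above, and the `runner` parameter is a hypothetical seam added here so the logic can be exercised without the binary installed.

```python
import json
import subprocess
from typing import Callable, List, Tuple


def run_llmfit(args: List[str], runner: Callable = subprocess.run) -> dict:
    """Run llmfit with the given arguments and parse its JSON output."""
    result = runner(["llmfit", *args], capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def top_picks(limit: int = 5, runner: Callable = subprocess.run) -> Tuple[dict, list]:
    """Steps 1 and 2 of the workflow: detect hardware, then fetch recommendations."""
    system = run_llmfit(["--json", "system"], runner)["system"]
    recs = run_llmfit(["recommend", "--json", "--limit", str(limit)], runner)
    # TooTight models are dropped before anything is shown to the user.
    models = [m for m in recs["models"] if m["fit_level"] != "TooTight"]
    return system, models
```

Calling `top_picks()` with no arguments invokes the real CLI; `check=True` makes a non-zero exit raise `subprocess.CalledProcessError` instead of silently producing empty output.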

Notes

  • llmfit detects NVIDIA GPUs (via nvidia-smi), AMD GPUs (via rocm-smi), and Apple Silicon (unified memory).
  • Multi-GPU setups aggregate VRAM across cards automatically.
  • The best_quant field gives the optimal quantization for the detected hardware. Higher-bit quantizations (Q6_K, Q8_0) preserve more model quality but need more memory, so they are preferable only if VRAM allows.
  • Speed estimates (estimated_tps) are approximate and vary by hardware and quantization.
  • Models with fit_level: "TooTight" should never be recommended to users.
Usage Guidance
This skill appears to do what it says (run the llmfit CLI and recommend models), but the install metadata is inconsistent and the Homebrew tap is a third-party source. Before installing or running anything:

  1. Ask the maintainer for the upstream source or GitHub repo and a homepage.
  2. Inspect the Homebrew formula or cargo package code on a trusted repo.
  3. Prefer installing from an official, traceable release (GitHub releases or crates.io) rather than an anonymous tap.
  4. If you must run the binary first, run `llmfit --version` and `llmfit --json system` in a sandbox/container to inspect output.
  5. Do not provide credentials or elevated privileges to this tool.

If you want help vetting the brew formula or package repository, provide the install URL and I can point out risky patterns.
Capability Analysis
Type: OpenClaw Skill
Name: llmfit
Version: 0.2.2

The skill bundle is benign. It serves as a wrapper for the `llmfit` command-line tool, providing instructions for its installation via `brew` or `cargo`, execution to gather system hardware information and model recommendations, and subsequent configuration of the OpenClaw agent's `openclaw.json` with the chosen local LLM. All actions, including external binary execution and configuration file modification, are transparently documented in `SKILL.md` and directly align with the stated purpose of recommending and configuring local LLMs based on hardware capabilities. There is no evidence of data exfiltration, unauthorized command execution, persistence mechanisms, or prompt injection designed to subvert the agent for malicious purposes.
Capability Assessment
Purpose & Capability
Name and description match the runtime instructions: the SKILL.md tells the agent to run the llmfit CLI to detect hardware and produce model recommendations. Required binaries (llmfit) are consistent with that purpose and no unrelated credentials or files are requested.
Instruction Scope
Instructions are narrowly scoped to running llmfit commands (system, recommend) and mapping outputs to local providers (Ollama, vLLM, LM Studio). They do not instruct reading arbitrary system files or exfiltrating secrets. The skill suggests editing openclaw.json to configure models, which is coherent with its goal.
Install Mechanism
Install metadata is inconsistent and potentially risky: the SKILL.md / registry lists a Homebrew formula 'AlexsJones/llmfit' (a third‑party tap) and a second install entry that is labeled as 'cargo install llmfit' but is marked kind: 'node' (and registry also lists 'node'). This mismatch (node vs cargo label) is sloppy and prevents clear vetting of the install source. Homebrew taps from unknown owners should be reviewed before use because they install binaries from third parties.
Credentials
The skill requests no environment variables or credentials. That is proportionate for a local hardware-detection and recommendation tool.
Persistence & Privilege
always is false and the skill does not request or auto-modify other skills' configs. It only recommends edits to openclaw.json (user-driven). No privileged or persistent presence is requested by the skill metadata or instructions.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install llmfit
  3. After installation, invoke the skill by name or use /llmfit
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.2.2
  • Updated the Homebrew install instructions and formula path in the skill metadata.
  • No functional or documentation changes beyond the installation method update.
v0.2.1
llmfit-advisor v1.0.0
  • Initial release of a hardware-aware LLM recommendation tool.
  • Detects local system specs (RAM, CPU, GPU/VRAM, unified memory).
  • Recommends best-fit local LLMs with optimal quantization, speed estimates, and fit scoring.
  • Supports filtering by use case (coding, reasoning, chat, etc.) and fit level.
  • Provides detailed JSON outputs for both hardware and model recommendations.
  • Includes step-by-step guidance for configuring local model providers (Ollama, vLLM, LM Studio).
Metadata
Slug llmfit
Version 0.2.2
License
All-time Installs 7
Active Installs 7
Total Versions 2
Frequently Asked Questions

What is llmfit?

llmfit detects local hardware (RAM, CPU, GPU/VRAM) and recommends the best-fit local LLM models with optimal quantization, speed estimates, and fit scoring. It is an AI Agent Skill for Claude Code / OpenClaw, with 960 downloads so far.

How do I install llmfit?

Run "/install llmfit" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is llmfit free?

Yes, llmfit is completely free (open-source). You can download, install and use it at no cost.

Which platforms does llmfit support?

llmfit is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created llmfit?

It is built and maintained by Alex Jones (@alexsjones); the current version is v0.2.2.
