twinsgeeks

Ollama Load Balancer

by Twin Geeks · GitHub ↗ · v1.0.4 · MIT-0
darwin · linux · windows ⚠ pending
256 Downloads · 0 Stars · 3 Active Installs · 6 Versions
Install in OpenClaw
/install ollama-load-balancer
Description
Ollama load balancer for Llama, Qwen, DeepSeek, and Mistral inference across multiple machines. Load balancing with auto-discovery via mDNS, health checks, q...
Usage Guidance
This skill appears internally consistent for running a local Ollama load balancer, but it relies on you pip installing a third-party package and running local servers that manage model downloads and expose admin HTTP endpoints. Before installing: (1) review the ollama-herd PyPI package and GitHub repo to ensure the code matches expectations, (2) prefer installing in a sandbox/VM or isolated network, (3) disable or review auto-pull behavior to avoid unexpected large downloads, and (4) restrict access to the daemon's HTTP port (localhost-only or firewall) to prevent unauthorized remote control.
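Point (4) can be spot-checked once the daemon is running. The following is a minimal sketch, assuming the daemon listens on TCP port 11435 (the port named in the skill's instructions). The helper name `port_is_reachable` and the LAN address are hypothetical placeholders. The check only tells you whether a TCP connection succeeds; it says nothing about authentication or which endpoints are exposed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    DAEMON_PORT = 11435  # port named in the skill's instructions
    LAN_ADDRESS = "192.168.1.50"  # placeholder: use this machine's LAN address
    print("loopback reachable:", port_is_reachable("127.0.0.1", DAEMON_PORT))
    print("LAN address reachable:", port_is_reachable(LAN_ADDRESS, DAEMON_PORT))
```

If the second line prints `True` when run against the machine's own LAN address, the port is probably bound beyond localhost and a firewall rule or bind-address change is worth considering.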
Capability Assessment
Purpose & Capability
Name/description (Ollama load balancer) matches the runtime instructions: auto-discovery, health checks, routing, and admin HTTP endpoints. Required binaries (curl/wget) and optional python/pip/sqlite3 are appropriate for a Python-based local service.
Instruction Scope
SKILL.md instructs the agent to pip install ollama-herd and run local commands (herd / herd-node) and to call local HTTP endpoints on localhost:11435. It does not instruct reading unrelated system files or exfiltrating secrets. It does include administrative endpoints (pull/delete models) — expected for a load-balancer but potentially powerful if misused.
Install Mechanism
There is no registry install spec; the README tells the user to pip install ollama-herd from PyPI. pip installs are common but will execute third-party code on the host — moderate risk. The SKILL.md points to a PyPI project and GitHub repo (traceable), which is better than an arbitrary download, but users should review the package/source before installing.
Credentials
The skill declares no required environment variables or credentials. The use of FLEET_MAX_RETRIES and runtime settings is plausible for a load balancer; no unexplained secret access is requested.
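To illustrate how a retry cap such as FLEET_MAX_RETRIES could bound failover in a load balancer, here is a minimal sketch. It is not the ollama-herd implementation: the function name `send_with_failover`, the fallback default of 2 retries, and the in-order node selection are all assumptions. Only the variable name FLEET_MAX_RETRIES comes from the skill.

```python
import os
from typing import Callable, Optional, Sequence


def send_with_failover(
    nodes: Sequence[str],
    send: Callable[[str], str],
    max_retries: Optional[int] = None,
) -> str:
    """Try nodes in order; allow at most 1 + max_retries attempts in total."""
    if max_retries is None:
        # The default of 2 is an assumption, not a documented value.
        max_retries = int(os.environ.get("FLEET_MAX_RETRIES", "2"))
    last_error: Optional[Exception] = None
    for node in nodes[: max_retries + 1]:
        try:
            return send(node)
        except ConnectionError as exc:
            last_error = exc  # this node failed; move on to the next one
    raise RuntimeError("all attempted nodes failed") from last_error


if __name__ == "__main__":
    def fake_send(node: str) -> str:
        # Hypothetical stand-in for forwarding a request to a node.
        if node == "node-a":
            raise ConnectionError("node-a is down")
        return f"handled by {node}"

    # One retry allowed: node-a fails, node-b answers.
    print(send_with_failover(["node-a", "node-b"], fake_send, max_retries=1))
```

The variable carries no secret, which is consistent with the assessment above: it only limits how many nodes are tried before the request fails.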
Persistence & Privilege
always:false (normal). The instructions create and use local config paths (~/.fleet-manager/...), run long-lived processes, and expose an admin HTTP interface. This is consistent with the stated functionality but means the package will persist on disk and run services — run with appropriate isolation and review.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ollama-load-balancer
  3. After installation, invoke the skill by name or use /ollama-load-balancer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.4
Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.
v1.0.3
- Expanded internationalization in the description (added Chinese and Spanish).
- Reworded across the documentation to consistently refer to "load balancer" for clarity.
- Enhanced deployment and API examples for more explicit load balancer usage.
- Updated feature and endpoint explanations to clarify their relation to the load balancer.
- Incremented version metadata to 1.0.3.
v1.1.0
- Updated the description for improved clarity and to highlight support for Llama, Qwen, DeepSeek, and Mistral.
- Reduced the zombie reaper’s request cleanup threshold from 15 minutes to 10 minutes.
- No changes to functionality or API; documentation only.
v1.0.2
- Updated version to 1.0.2 in SKILL.md.
- Clarified that auto-pull of missing models is now optional and disabled by default; enable via settings API.
- Updated metadata for configPaths location.
- No code or feature changes; documentation and metadata only.
v1.0.1
- Added "optionalBins" and "configPaths" fields to the metadata section in SKILL.md.
- Now explicitly lists optional dependencies: python3, sqlite3, and pip.
- Declares expected config and log file locations for fleet manager operation.
- No changes to code or user-facing features.
v1.0.0
Initial release.
- Load balances Ollama inference across multiple machines with automatic discovery, health checks, and real-time monitoring dashboard.
- Built-in queue management, zero configuration setup, and automatic failover with retry logic.
- Includes zombie request cleanup, VRAM-aware model routing, and operational analytics via SQL queries.
- Web dashboard provides fleet status, trends, insights, and runtime toggles.
- Exposes multiple REST API endpoints for fleet health, request traces, usage, and model management.
- Designed for high availability and operational visibility in Ollama deployments.
Metadata
Slug ollama-load-balancer
Version 1.0.4
License MIT-0
All-time Installs 3
Active Installs 3
Total Versions 6
Frequently Asked Questions

What is Ollama Load Balancer?

Ollama load balancer for Llama, Qwen, DeepSeek, and Mistral inference across multiple machines. Load balancing with auto-discovery via mDNS, health checks, q... It is an AI Agent Skill for Claude Code / OpenClaw, with 256 downloads so far.

How do I install Ollama Load Balancer?

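Run "/install ollama-load-balancer" in the OpenClaw or Claude Code chat to install the skill in one step. The skill's instructions then have the agent pip install the third-party ollama-herd package from PyPI, so review that package before proceeding.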

Is Ollama Load Balancer free?

Yes, Ollama Load Balancer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ollama Load Balancer support?

Ollama Load Balancer is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux, windows).

Who created Ollama Load Balancer?

It is built and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.4.
