
Ollama Load Balancer

Name: Ollama Load Balancer
Author: twinsgeeks

by Twin Geeks · v1.0.4 · MIT-0

Platforms: darwin, linux, windows · ⚠ pending

Downloads: 256

Install in OpenClaw

/install ollama-load-balancer

Description

Ollama load balancer for Llama, Qwen, DeepSeek, and Mistral inference across multiple machines. Load balancing with auto-discovery via mDNS, health checks, q...

Usage Guidance

This skill appears internally consistent for running a local Ollama load balancer, but it relies on you pip installing a third-party package and running local servers that manage model downloads and expose admin HTTP endpoints. Before installing: (1) review the ollama-herd PyPI package and GitHub repo to ensure the code matches expectations, (2) prefer installing in a sandbox/VM or isolated network, (3) disable or review auto-pull behavior to avoid unexpected large downloads, and (4) restrict access to the daemon's HTTP port (localhost-only or firewall) to prevent unauthorized remote control.
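As a rough illustration of point (4), the sketch below classifies a bind address as loopback-only or not. The helper name `is_loopback_only` is hypothetical and is not part of ollama-herd; how the daemon's bind address is actually configured is not stated here, so check the package documentation for that.

```python
import ipaddress


def is_loopback_only(bind_host: str) -> bool:
    """Return True if bind_host refers only to the loopback interface.

    Hypothetical helper for illustration; not part of ollama-herd.
    """
    if bind_host.strip().lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_host.strip()).is_loopback
    except ValueError:
        # Not an IP literal (for example a LAN hostname): not provably loopback.
        return False


if __name__ == "__main__":
    for host in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.20"):
        label = "localhost-only" if is_loopback_only(host) else "reachable from other hosts"
        print(f"{host}: {label}")
```

An address such as 0.0.0.0 listens on every interface, so the admin endpoints would be reachable from other machines unless a firewall blocks the port.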

Capability Assessment

✓ Purpose & Capability

Name/description (Ollama load balancer) matches the runtime instructions: auto-discovery, health checks, routing, and admin HTTP endpoints. Required binaries (curl/wget) and optional python/pip/sqlite3 are appropriate for a Python-based local service.

✓ Instruction Scope

SKILL.md instructs the agent to pip install ollama-herd and run local commands (herd / herd-node) and to call local HTTP endpoints on localhost:11435. It does not instruct reading unrelated system files or exfiltrating secrets. It does include administrative endpoints (pull/delete models) — expected for a load-balancer but potentially powerful if misused.
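A minimal sketch for checking whether anything is listening on localhost:11435 before the agent calls it. It opens a plain TCP connection only and deliberately avoids any endpoint paths, since those are not listed on this page. The function name `port_is_open` is an assumption made for illustration.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 11435, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "listening" if port_is_open() else "not reachable"
    print(f"localhost:11435 is {state}")
```

Running the same check from a second machine against the host's LAN address is a quick way to see whether the port is exposed beyond localhost.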

ℹ Install Mechanism

There is no registry install spec; the README tells the user to pip install ollama-herd from PyPI. pip installs are common but will execute third-party code on the host — moderate risk. The SKILL.md points to a PyPI project and GitHub repo (traceable), which is better than an arbitrary download, but users should review the package/source before installing.

✓ Credentials

The skill declares no required environment variables or credentials. The use of FLEET_MAX_RETRIES and runtime settings is plausible for a load balancer; no unexplained secret access is requested.
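For illustration, a retry setting like FLEET_MAX_RETRIES might be read as shown below. This sketch assumes it is an environment variable holding an integer; the default of 2 and the helper name `read_max_retries` are assumptions, not values taken from the package.

```python
import os
from typing import Mapping


def read_max_retries(env: Mapping[str, str] = os.environ, default: int = 2) -> int:
    """Parse FLEET_MAX_RETRIES, falling back to default on missing or invalid input.

    The default of 2 is an assumption for this sketch, not the package's value.
    """
    raw = env.get("FLEET_MAX_RETRIES")
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    # A negative retry count makes no sense; clamp it to zero.
    return max(value, 0)


if __name__ == "__main__":
    print("max retries:", read_max_retries())
```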

ℹ Persistence & Privilege

always:false (normal). The instructions create and use local config paths (~/.fleet-manager/...), run long-lived processes, and expose an admin HTTP interface. This is consistent with the stated functionality but means the package will persist on disk and run services — run with appropriate isolation and review.

How to Use

1. Make sure OpenClaw is installed (local or Docker).
2. Run the install command in chat: /install ollama-load-balancer
3. After installation, invoke the skill by name or use /ollama-load-balancer
4. Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.4

Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.

v1.0.3

- Expanded internationalization in the description (added Chinese and Spanish).
- Reworded the documentation to refer consistently to "load balancer" for clarity.
- Enhanced deployment and API examples for more explicit load balancer usage.
- Updated feature and endpoint explanations to clarify their relation to the load balancer.
- Incremented version metadata to 1.0.3.

v1.1.0

- Updated the description for improved clarity and to highlight support for Llama, Qwen, DeepSeek, and Mistral.
- Reduced the zombie reaper's request cleanup threshold from 15 minutes to 10 minutes.
- No changes to functionality or API; documentation only.
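The cleanup threshold mentioned above can be pictured with a small sketch: any in-flight request older than 10 minutes is flagged as a zombie. The data structure (a mapping of request ID to start time) and the name `find_zombies` are assumptions for illustration and do not reflect the package's internals.

```python
import time
from typing import Dict, List, Optional

# 10 minutes, matching the threshold stated in the v1.1.0 notes.
ZOMBIE_THRESHOLD_SECONDS = 10 * 60


def find_zombies(
    started_at: Dict[str, float],
    now: Optional[float] = None,
    threshold: float = ZOMBIE_THRESHOLD_SECONDS,
) -> List[str]:
    """Return IDs of requests that have been in flight longer than threshold seconds."""
    current = time.monotonic() if now is None else now
    return sorted(rid for rid, t0 in started_at.items() if current - t0 > threshold)


if __name__ == "__main__":
    in_flight = {"req-a": 0.0, "req-b": 500.0, "req-c": 100.0}
    print(find_zombies(in_flight, now=701.0))
```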

v1.0.2

- Updated version to 1.0.2 in SKILL.md.
- Clarified that auto-pull of missing models is now optional and disabled by default; enable via settings API.
- Updated metadata for configPaths location.
- No code or feature changes; documentation and metadata only.

v1.0.1

- Added "optionalBins" and "configPaths" fields to the metadata section in SKILL.md.
- Now explicitly lists optional dependencies: python3, sqlite3, and pip.
- Declares expected config and log file locations for fleet manager operation.
- No changes to code or user-facing features.

v1.0.0

Initial release.
- Load balances Ollama inference across multiple machines with automatic discovery, health checks, and real-time monitoring dashboard.
- Built-in queue management, zero configuration setup, and automatic failover with retry logic.
- Includes zombie request cleanup, VRAM-aware model routing, and operational analytics via SQL queries.
- Web dashboard provides fleet status, trends, insights, and runtime toggles.
- Exposes multiple REST API endpoints for fleet health, request traces, usage, and model management.
- Designed for high availability and operational visibility in Ollama deployments.
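To make "VRAM-aware model routing" and "automatic failover with retry logic" concrete, here is a generic sketch of the idea, not the package's actual algorithm. The `Node` fields, the "most free VRAM first" ordering, and the `send` callback are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Node:
    name: str
    healthy: bool
    free_vram_gb: float


def candidates(nodes: Iterable[Node], required_vram_gb: float) -> List[Node]:
    """Healthy nodes with enough free VRAM, most free VRAM first."""
    fit = [n for n in nodes if n.healthy and n.free_vram_gb >= required_vram_gb]
    return sorted(fit, key=lambda n: n.free_vram_gb, reverse=True)


def route_with_failover(
    nodes: Iterable[Node],
    required_vram_gb: float,
    send: Callable[[Node], str],
    max_retries: int = 2,
) -> str:
    """Try the best candidate, then fail over to the next ones on ConnectionError."""
    last_error: Optional[Exception] = None
    # One initial attempt plus up to max_retries failover attempts.
    for node in candidates(nodes, required_vram_gb)[: max_retries + 1]:
        try:
            return send(node)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError("no node could serve the request") from last_error


if __name__ == "__main__":
    fleet = [Node("a", True, 8.0), Node("b", True, 24.0), Node("c", False, 48.0)]

    def fake_send(node: Node) -> str:
        # Simulate node "b" dropping the connection.
        if node.name == "b":
            raise ConnectionError("node b went away")
        return f"served by {node.name}"

    print(route_with_failover(fleet, required_vram_gb=6.0, send=fake_send))
```

In this toy run the unhealthy node is skipped, the node with the most free VRAM is tried first, and the request falls over to the next candidate when the first one fails.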

Metadata

Slug ollama-load-balancer

Version 1.0.4

License MIT-0

All-time Installs 3

Active Installs 3

Total Versions 6

Frequently Asked Questions

What is Ollama Load Balancer?

Ollama load balancer for Llama, Qwen, DeepSeek, and Mistral inference across multiple machines. Load balancing with auto-discovery via mDNS, health checks, q... It is an AI Agent Skill for Claude Code / OpenClaw, with 256 downloads so far.

How do I install Ollama Load Balancer?

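Run "/install ollama-load-balancer" in the OpenClaw or Claude Code chat to install the skill in one step. The skill's instructions then have you pip install the ollama-herd package from PyPI, so review that package before running it.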

Is Ollama Load Balancer free?

Yes, Ollama Load Balancer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ollama Load Balancer support?

Ollama Load Balancer is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux, windows).

Who created Ollama Load Balancer?

It is built and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.4.
