Gemma Gemma3

Name: Gemma Gemma3
Author: twinsgeeks

by Twin Geeks · GitHub ↗ · v1.0.1 · MIT-0

darwin · linux · windows · ✓ Security Clean

177 Downloads


Install in OpenClaw

/install gemma-gemma3

Description

Gemma 3 by Google — run Gemma 3 (4B, 12B, 27B) across your local device fleet. Google's most capable open model with 128K context, strong coding, and multili...

README (SKILL.md)

Gemma 3 — Run Google's Open Models Across Your Fleet

Gemma 3 is Google's open-weight LLM family, with a 128K context window (32K on the 1B model), strong coding performance, and multilingual support across 140+ languages. The fleet router picks the best device for every request, so no manual load balancing is needed.

Supported Gemma models

Model	Parameters	Ollama name	Best for
Gemma 3 27B	27B	`gemma3:27b`	Highest quality — rivals much larger models
Gemma 3 12B	12B	`gemma3:12b`	Balanced quality and speed
Gemma 3 4B	4B	`gemma3:4b`	Fast, runs on low-RAM devices
Gemma 3 1B	1B	`gemma3:1b`	Ultra-light, instant responses
CodeGemma 7B	7B	`codegemma`	Code-focused model from the earlier Gemma generation (not a Gemma 3 variant)

Quick start

pip install ollama-herd    # PyPI: https://pypi.org/project/ollama-herd/
herd                       # start the router (port 11435)
herd-node                  # run on each device — finds the router automatically

No models are downloaded during installation. Models are pulled on demand when a request arrives, or manually via the dashboard. All pulls require user confirmation.

Use Gemma through the fleet

OpenAI SDK (drop-in replacement)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Gemma 3 27B for complex reasoning
response = client.chat.completions.create(
    model="gemma3:27b",
    messages=[{"role": "user", "content": "Explain quantum entanglement to a 10-year-old"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

Code generation with CodeGemma

response = client.chat.completions.create(
    model="codegemma",
    messages=[{"role": "user", "content": "Write a binary search tree in Rust with insert, delete, and search"}],
)
print(response.choices[0].message.content)

curl (Ollama format)

# Gemma 3 27B
curl http://localhost:11435/api/chat -d '{
  "model": "gemma3:27b",
  "messages": [{"role": "user", "content": "Translate to Japanese: The weather is beautiful today"}],
  "stream": false
}'

curl (OpenAI format)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma3:4b", "messages": [{"role": "user", "content": "Hello"}]}'
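
For scripts that should not depend on the OpenAI SDK, the Ollama-format call shown earlier can be sketched with Python's standard library alone. This is a minimal sketch, not part of Ollama Herd: the helper names `build_chat_request` and `chat` are hypothetical, and the response shape is assumed to mirror Ollama's own `/api/chat` reply.

```python
import json
import urllib.request

# Assumption: the router listens on localhost:11435, as in the quick start.
BASE_URL = "http://localhost:11435"


def build_chat_request(model, prompt, base_url=BASE_URL):
    """Build (but do not send) a non-streaming Ollama-format chat request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, prompt, base_url=BASE_URL, timeout=300):
    """Send the request through the router and return the reply text."""
    request = build_chat_request(model, prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply carries the text under message.content, as Ollama's does.
    return body["message"]["content"]
```

With the router running, `chat("gemma3:4b", "Hello")` would return the model's reply as a string. Keeping request construction separate from sending makes the payload easy to inspect without a live fleet.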

Which Gemma for your hardware

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

Device	RAM	Best Gemma model
MacBook Air (8GB)	8GB	`gemma3:1b` — instant responses
Mac Mini (16GB)	16GB	`gemma3:4b` — strong for its size
Mac Mini (24GB)	24GB	`gemma3:12b` — great balance
MacBook Pro (36GB)	36GB	`gemma3:27b` — full power
Mac Studio (64GB+)	64GB+	`gemma3:27b` + `codegemma` simultaneously
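
The table above can be read as a simple lookup from available RAM to the largest model it lists. The sketch below encodes only those example tiers; the function name `pick_gemma_model` is hypothetical, and real headroom depends on what else is running on the device.

```python
# Thresholds copied from the example table: (minimum RAM in GB, Ollama model name).
RAM_TIERS = [
    (36, "gemma3:27b"),
    (24, "gemma3:12b"),
    (16, "gemma3:4b"),
    (8, "gemma3:1b"),
]


def pick_gemma_model(ram_gb):
    """Return the largest Gemma 3 model the table pairs with this much RAM, or None."""
    for min_ram, model in RAM_TIERS:
        if ram_gb >= min_ram:
            return model
    # Below the smallest tier in the table: no recommendation.
    return None
```

For example, a 32GB machine falls between the 24GB and 36GB rows, so this sketch conservatively returns `gemma3:12b`.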

Why Gemma locally

128K context — process entire codebases and long documents
140+ languages — multilingual without switching models
Google quality, zero cost — no per-token charges after hardware
Privacy — all data stays on your network
Fleet routing — multiple machines share the load

Check what's running

# Models loaded in memory
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Fleet health
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

Web dashboard at http://localhost:11435/dashboard — live monitoring.
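
To use the `/api/ps` output in a script rather than reading it by eye, something like the following may help. It assumes the router returns the same shape as Ollama's own `/api/ps` (a `models` list whose entries carry a `name`); that shape is an assumption about the router, and both function names are hypothetical.

```python
import json
import urllib.request


def loaded_model_names(ps_payload):
    """Extract sorted model names from an Ollama-style /api/ps payload."""
    # Assumption: payload looks like {"models": [{"name": "gemma3:4b", ...}, ...]}.
    return sorted(
        entry["name"] for entry in ps_payload.get("models", []) if "name" in entry
    )


def fetch_loaded_models(base_url="http://localhost:11435", timeout=10):
    """Query the router and return the names of models currently in memory."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=timeout) as response:
        return loaded_model_names(json.load(response))
```

An agent could call `fetch_loaded_models()` before sending a request and prefer a model that is already loaded, avoiding a cold start.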

Also available on this fleet

Other LLMs

Llama 3.3, Qwen 3.5, DeepSeek-V3, DeepSeek-R1, Phi 4, Mistral, Codestral — same endpoint.

Image generation

curl -o image.png http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "a gemstone catching light", "width": 1024, "height": 1024}'

Speech-to-text

curl http://localhost:11435/api/transcribe -F "[email protected]" -F "model=qwen3-asr"

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Google Gemma open source language model"}'
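
Embeddings are usually compared rather than read directly. As a rough sketch, the helpers below pull the first vector out of an embed response and compute cosine similarity between two vectors. The response shape (`{"embeddings": [[...]]}`) is assumed to match Ollama's `/api/embed`, and the function names are hypothetical.

```python
import math


def first_embedding(embed_payload):
    """Return the first vector from an Ollama-style /api/embed response."""
    # Assumption: payload looks like {"embeddings": [[0.1, 0.2, ...], ...]}.
    return embed_payload["embeddings"][0]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Scores near 1 indicate texts the embedding model considers similar; scores near 0 indicate unrelated texts.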

Contribute

Ollama Herd is open source (MIT). Stars, issues, and PRs welcome — from humans and AI agents alike:

GitHub — 444 tests, fully async, CLAUDE.md makes AI agents productive instantly
Found a bug? Open an issue
Want to add a feature? Fork, branch, PR — the test suite runs in under 40 seconds

Guardrails

Model downloads require explicit user confirmation — Gemma models range from 1GB (1B) to 16GB (27B).
Model deletion requires explicit user confirmation.
Never delete or modify files in ~/.fleet-manager/.
No models are downloaded automatically — all pulls are user-initiated or require opt-in via auto_pull.

Usage Guidance

This skill is internally consistent with its purpose, but before installing you should:
1) Verify the upstream project and PyPI package (https://github.com/geeks-accelerator/ollama-herd and the PyPI package 'ollama-herd') to ensure they are official/trustworthy, and inspect the code if possible.
2) Prefer pinning a known-good package version rather than installing an unpinned latest.
3) Run installation/testing in an isolated environment (VM/container) first.
4) Be aware that running 'herd'/'herd-node' opens a local network service (port 11435) and may pull multi-gigabyte model files. Restrict network/firewall access to trusted hosts and confirm that model downloads truly require explicit confirmation.
5) Review ~/.fleet-manager/* logs/configs for sensitive data and follow the documented guardrails rather than blindly deleting/modifying files.
If you cannot verify the package source or code, treat the installation as higher risk.

Capability Assessment

✓ Purpose & Capability

The name/description claim (run Gemma models locally across a fleet via an Ollama Herd router) matches the instructions: pip-install an 'ollama-herd' package and run 'herd' and 'herd-node' to provide a local endpoint. Required binaries (curl/wget) and optional python/pip are reasonable for this functionality.

✓ Instruction Scope

SKILL.md stays on-topic: it tells the agent to install/run the herd/router, how to call the local API (localhost:11435), how to check status, and documents model choices and guardrails (downloads require user confirmation). It does not instruct reading unrelated system files or exfiltrating secrets.

ℹ Install Mechanism

There is no built-in install spec; the instructions tell the user to 'pip install ollama-herd' from PyPI. Installing a third-party package and running a network service is expected for this use case, but it is a higher-risk action because the package code executes locally and is not vetted by this scanner.

✓ Credentials

The skill declares no required environment variables or credentials. Metadata references a couple of config paths (~/.fleet-manager/...), which are plausible for a fleet manager and are mentioned in the guardrails (do not modify). There are no unexplained secret requests.

✓ Persistence & Privilege

The skill is not always-enabled and does not request elevated platform privileges. It instructs running a local service (herd) and per-node agents (herd-node), which is appropriate for a fleet router and does not modify other skill configurations.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gemma-gemma3
After installation, invoke the skill by name or use /gemma-gemma3
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.

v1.0.0

- Initial release of Gemma 3 support via Ollama Herd for Mac and Linux.
- Run Gemma 3 (1B, 4B, 12B, 27B) and CodeGemma 7B models locally, routed across your device fleet.
- 128K context, strong multilingual and coding abilities, with zero cloud costs.
- Fleet routing automatically balances requests to the best available machine.
- Built-in privacy: all data stays on your network; models downloaded only with user confirmation.
- Additional features include dashboard monitoring, compatibility with major LLMs, image generation, speech-to-text, and embeddings.

Metadata

Slug gemma-gemma3

Version 1.0.1

License MIT-0

All-time Installs 2

Active Installs 2

Total Versions 2

Frequently Asked Questions

What is Gemma Gemma3?

Gemma 3 by Google — run Gemma 3 (4B, 12B, 27B) across your local device fleet. Google's most capable open model with 128K context, strong coding, and multili... It is an AI Agent Skill for Claude Code / OpenClaw, with 177 downloads so far.

How do I install Gemma Gemma3?

Run "/install gemma-gemma3" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gemma Gemma3 free?

Yes, Gemma Gemma3 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Gemma Gemma3 support?

Gemma Gemma3 is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux, windows).

Who created Gemma Gemma3?

It is built and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.1.
