Description

Mac Studio AI — run LLMs, image generation, speech-to-text, and embeddings on your Mac Studio. M2 Ultra (192GB), M3 Ultra (512GB), M4 Max (128GB), and M4 Ult...

README (SKILL.md)

Mac Studio AI — The Most Powerful Local AI Machine

Name: Mac Studio Ai
Author: twinsgeeks

The Mac Studio is the best hardware for local AI. Mac Studio M4 Ultra with 256GB of unified memory runs 120B+ parameter models. Mac Studio M3 Ultra with 512GB loads frontier models that need 4-8 NVIDIA A100s elsewhere. The Mac Studio runs everything in one memory pool — no PCIe bottleneck.

One Mac Studio is a powerhouse. Multiple Mac Studios become a fleet.

Mac Studio configurations for AI

Mac Studio Config	Chip	Memory	GPU Cores	Mac Studio LLM Sweet Spot
Mac Studio M4 Max	M4 Max	128GB	40	70B models on Mac Studio
Mac Studio M4 Ultra	M4 Ultra	256GB	80	120B+ models on Mac Studio
Mac Studio M3 Ultra	M3 Ultra	96-512GB	80	236B models on Mac Studio
Mac Studio M2 Ultra	M2 Ultra	192GB	76	70B-120B on Mac Studio
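
As a rough sanity check on the sweet spots in the table, the memory needed for model weights can be estimated as parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate only: it ignores the KV cache and runtime overhead, and the `headroom` fraction is an assumption for illustration, not a figure from Ollama Herd.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 weights) * (bits / 8 bytes each) / 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


def fits_in_memory(params_billions: float, bits_per_weight: float,
                   memory_gb: float, headroom: float = 0.75) -> bool:
    """True if the weights fit within `headroom` of total unified memory.

    `headroom` is a hypothetical safety margin that leaves room for the
    KV cache, the OS, and other apps; tune it for your own workload.
    """
    return estimate_weights_gb(params_billions, bits_per_weight) <= memory_gb * headroom


if __name__ == "__main__":
    # Sizes and memory configurations taken from the table in this README.
    for name, params, mem in [("70B", 70, 128), ("120B", 120, 256), ("236B", 236, 512)]:
        size = estimate_weights_gb(params, 4)
        print(f"{name} at 4-bit: ~{size:.0f} GB of weights, "
              f"fits in {mem} GB: {fits_in_memory(params, 4, mem)}")
```

Higher-precision weights change the picture quickly: the same 236B model at 16 bits per weight needs roughly 472 GB for weights alone.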

Set up your Mac Studio

pip install ollama-herd    # install on your Mac Studio
herd                       # start Mac Studio as the router (port 11435)
herd-node                  # connect additional Mac Studios or other devices

Mac Studios discover each other automatically on your local network.
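
Before pointing apps at the router, it can help to confirm that something is listening on port 11435. This is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and it only checks that the TCP port accepts connections, not that the service behind it is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, `is_port_open("mac-studio", 11435)` uses the hostname from the examples in this README; substitute your own hostname or IP address.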

Add Mac Studio image generation

uv tool install mflux           # Flux models (~5s at 512px on Mac Studio M4 Ultra)
uv tool install diffusionkit    # Stable Diffusion 3/3.5 on Mac Studio

Use your Mac Studio for AI inference

Mac Studio LLM inference — run the biggest models

from openai import OpenAI

# Connect to Mac Studio running Ollama Herd
mac_studio = OpenAI(base_url="http://mac-studio:11435/v1", api_key="not-needed")

# 120B model — runs smoothly on Mac Studio M4 Ultra (256GB unified memory)
response = mac_studio.chat.completions.create(
    model="gpt-oss:120b",  # loaded entirely in Mac Studio unified memory
    messages=[{"role": "user", "content": "How does Mac Studio handle large AI models?"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
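
If you need the full reply as one string rather than printing tokens as they arrive, the streamed chunks can be accumulated. In this small sketch, `collect_stream` is a hypothetical helper that accepts any iterable of OpenAI-style chunk objects; the stand-in chunks in the demo exist only so it can be tried without a running server.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas from an OpenAI-style streaming response."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            # Some servers send bookkeeping chunks with an empty choices list.
            continue
        text = chunk.choices[0].delta.content
        if text:  # content can be None on role-only and final chunks
            parts.append(text)
    return "".join(parts)


if __name__ == "__main__":
    # Fake chunks that mimic the attribute layout used in the example above.
    fake = [
        SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=t))])
        for t in ("Hel", None, "lo")
    ]
    print(collect_stream(fake))  # prints "Hello"
```

With a live server, pass the streaming response object directly: `text = collect_stream(response)`.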

Mac Studio image generation

# Image generation via mflux; the ~5s Mac Studio M4 Ultra figure is for 512px, so 1024px will take longer
curl -o mac_studio_art.png http://mac-studio:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model": "z-image-turbo", "prompt": "a Mac Studio on a minimalist desk with holographic AI display", "width": 1024, "height": 1024}'

# Stable Diffusion 3 on Mac Studio — ~9s
curl -o mac_studio_sd3.png http://mac-studio:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model": "sd3-medium", "prompt": "Mac Studio M4 Ultra rendering AI art", "width": 1024, "height": 1024, "steps": 20}'
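
The same image requests can be sent from Python. The sketch below mirrors the curl calls using only the standard library. The endpoint and JSON fields come from the examples above; the function names are hypothetical, and it assumes (as `curl -o` suggests) that the response body is the raw image.

```python
import json
import urllib.request
from typing import Optional


def build_image_request(prompt: str, model: str = "z-image-turbo",
                        width: int = 1024, height: int = 1024,
                        steps: Optional[int] = None,
                        base_url: str = "http://mac-studio:11435") -> urllib.request.Request:
    """Build the same POST that the curl examples send to /api/generate-image."""
    payload = {"model": model, "prompt": prompt, "width": width, "height": height}
    if steps is not None:
        payload["steps"] = steps  # only sent when set, as in the SD3 example
    return urllib.request.Request(
        f"{base_url}/api/generate-image",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_image(request: urllib.request.Request, path: str,
               timeout: float = 120.0) -> int:
    """Send the request and write the response body to path; return bytes written."""
    with urllib.request.urlopen(request, timeout=timeout) as resp, open(path, "wb") as out:
        return out.write(resp.read())
```

A call such as `save_image(build_image_request("a Mac Studio on a minimalist desk"), "mac_studio_art.png")` would then correspond to the first curl example.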

Mac Studio speech-to-text

# Transcribe on Mac Studio via Qwen3-ASR
curl http://mac-studio:11435/api/transcribe \
  -F "file=@mac_studio_meeting.wav" \
  -F "model=qwen3-asr"
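
The `-F` flags make curl send a multipart/form-data upload. For callers that prefer Python without extra dependencies, the sketch below builds an equivalent request by hand. The endpoint and the `file` and `model` field names come from the curl example; the helper names are hypothetical, and the response format is not shown in this README, so the sketch stops at building the request.

```python
import urllib.request
import uuid
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str], filename: str, file_bytes: bytes,
                     file_field: str = "file") -> Tuple[bytes, str]:
    """Encode text fields plus one file; return (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode("utf-8") + value.encode("utf-8") + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcribe_request(filename: str, audio: bytes, model: str = "qwen3-asr",
                             base_url: str = "http://mac-studio:11435") -> urllib.request.Request:
    """Build a POST to /api/transcribe carrying the audio file and model name."""
    body, content_type = encode_multipart({"model": model}, filename, audio)
    return urllib.request.Request(
        f"{base_url}/api/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To send it, read the audio with `open("mac_studio_meeting.wav", "rb").read()`, build the request, and pass it to `urllib.request.urlopen`.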

Mac Studio embeddings

# Generate embeddings on Mac Studio
curl http://mac-studio:11435/api/embed \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "Mac Studio M4 Ultra unified memory AI inference"}'
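
Embeddings are usually compared with cosine similarity. The sketch below shows that comparison on plain lists of floats. How you extract the vectors from the `/api/embed` response depends on the response schema, which this README does not show, so that step is left out.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 2-D vectors; real embeddings have hundreds of dimensions.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

Values near 1 indicate similar texts; values near 0 indicate unrelated ones.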

Recommended models for Mac Studio

Mac Studio Config	Models for this Mac Studio
Mac Studio M4 Max (128GB)	`llama3.3:70b`, `qwen2.5:72b`, `deepseek-r1:70b`, `codestral`
Mac Studio M4 Ultra (256GB)	`gpt-oss:120b`, `qwen:110b`, two 70B models simultaneously
Mac Studio M3 Ultra (512GB)	`deepseek-v2:236b` (quantized), multiple 70B models at once

Ask the Mac Studio for recommendations: GET http://mac-studio:11435/dashboard/api/recommendations

Multiple Mac Studios as a fleet

Mac Studio #1 (M4 Ultra, 256GB)  ─┐
Mac Studio #2 (M4 Max, 128GB)    ├──→  Mac Studio Router (:11435)  ←──  Your apps
Mac Mini (32GB)                   ─┘

The Mac Studio router scores each device on 7 signals. Big models route to the Mac Studio with the most memory.
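
This README does not list the seven signals, so the sketch below illustrates only the one rule stated here: send a model to the node with the most memory that can hold it. It uses free memory (total minus used) as the measure, which is an assumption. The node dictionaries reuse the shape of the fleet status example in this README, and `pick_node` is hypothetical, not Ollama Herd's actual scoring code.

```python
from typing import Dict, List, Optional


def pick_node(nodes: List[Dict], required_gb: float) -> Optional[str]:
    """Return the node_id with the most free memory that fits required_gb, else None."""
    best_id, best_free = None, -1.0
    for node in nodes:
        mem = node["memory"]
        free = mem["total_gb"] - mem["used_gb"]
        if free >= required_gb and free > best_free:
            best_id, best_free = node["node_id"], free
    return best_id


if __name__ == "__main__":
    fleet = [
        {"node_id": "Mac-Studio-Ultra", "memory": {"total_gb": 256, "used_gb": 120}},
        {"node_id": "Mac-Studio-Max", "memory": {"total_gb": 128, "used_gb": 85}},
    ]
    print(pick_node(fleet, required_gb=60))   # Mac-Studio-Ultra (136 GB free)
    print(pick_node(fleet, required_gb=200))  # None: no node has 200 GB free
```

A real router would weigh this against the other signals, for example queue depth or thermal state as shown on the dashboard.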

Monitor your Mac Studio

Mac Studio dashboard at http://mac-studio:11435/dashboard — models loaded on each Mac Studio, queue depths, thermal state, memory.

# Mac Studio fleet status
curl -s http://mac-studio:11435/fleet/status | python3 -m json.tool

# Mac Studio health checks
curl -s http://mac-studio:11435/dashboard/api/health | python3 -m json.tool

Example Mac Studio fleet status response:

{
  "fleet": {"nodes_online": 2, "nodes_total": 2},
  "nodes": [
    {"node_id": "Mac-Studio-Ultra", "memory": {"total_gb": 256, "used_gb": 120}},
    {"node_id": "Mac-Studio-Max", "memory": {"total_gb": 128, "used_gb": 85}}
  ]
}
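
A response in this shape is easy to turn into a simple memory alert. The sketch below flags nodes whose memory usage exceeds a chosen fraction. It assumes the field names shown in the example response; the function name and the default threshold are hypothetical.

```python
from typing import Dict, List


def nodes_over(status: Dict, threshold: float = 0.8) -> List[str]:
    """Return node_ids whose used/total memory ratio exceeds threshold."""
    busy = []
    for node in status.get("nodes", []):
        mem = node["memory"]
        if mem["total_gb"] > 0 and mem["used_gb"] / mem["total_gb"] > threshold:
            busy.append(node["node_id"])
    return busy


if __name__ == "__main__":
    # The example fleet status response from this README.
    sample = {
        "fleet": {"nodes_online": 2, "nodes_total": 2},
        "nodes": [
            {"node_id": "Mac-Studio-Ultra", "memory": {"total_gb": 256, "used_gb": 120}},
            {"node_id": "Mac-Studio-Max", "memory": {"total_gb": 128, "used_gb": 85}},
        ],
    }
    print(nodes_over(sample, threshold=0.6))  # ['Mac-Studio-Max']
```

In practice the `status` dictionary would come from parsing the JSON returned by the `/fleet/status` call above.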

Full documentation

Contribute

Ollama Herd is open source (MIT). Built by Mac Studio owners for Mac Studio owners:

Star on GitHub — help other Mac Studio users find us
Open an issue — share your Mac Studio AI setup
PRs welcome — CLAUDE.md gives AI agents full context. 444 tests, async Python.

Guardrails

No automatic downloads — Mac Studio model pulls require explicit user confirmation.
Model deletion requires explicit user confirmation.
All Mac Studio requests stay local — no data leaves your network.
Never delete or modify files in ~/.fleet-manager/.
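
The confirmation guardrails can be pictured as a gate in front of any pull or delete action. The sketch below is a hypothetical illustration of that pattern, not how the skill or Ollama Herd implements it. The `ask` parameter defaults to `input` and can be swapped out for testing.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_if_confirmed(description: str, action: Callable[[], T],
                     ask: Callable[[str], str] = input) -> Optional[T]:
    """Run action only if the user answers 'y' or 'yes'; otherwise return None."""
    answer = ask(f"{description} Proceed? [y/N] ").strip().lower()
    if answer in ("y", "yes"):
        return action()
    return None  # anything else, including an empty answer, means "no"
```

For example, `run_if_confirmed("Pull gpt-oss:120b.", pull_model)` would call a caller-supplied `pull_model` function only after an explicit yes.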

Usage Guidance

This skill is coherent with its stated purpose, but take normal precautions before following its install/run instructions:

1) Review the 'ollama-herd' project repository and PyPI package contents before running pip install.
2) Understand that running 'herd' will open a network service on port 11435 and may discover/communicate with other machines on your LAN; consider firewall rules and network segmentation.
3) The metadata references ~/.fleet-manager logs/db files; avoid exposing sensitive files and inspect what the herd service logs.
4) If you want to be extra safe, test installs in an isolated environment (VM/VMware, separate account, or container) and audit any model/tool downloads (uv tool installs) before use.

Capability Assessment

✓ Purpose & Capability

Name/description describe running LLMs, image generation, STT and embeddings on Mac Studio; SKILL.md shows commands to install and run 'ollama-herd', and curl/python examples that target a local service at :11435. Required bins (curl/wget, optional python/pip) and the Darwin OS restriction are consistent with that purpose.

ℹ Instruction Scope

Instructions tell the user to pip install 'ollama-herd', run herd/herd-node, and call local HTTP endpoints (mac-studio:11435) for inference, image gen, transcribe and embeddings — all consistent. Note: SKILL.md metadata includes configPaths (~/.fleet-manager/latency.db and ~/.fleet-manager/logs/herd.jsonl), which suggests the tool reads/writes fleet state and logs; the document does not instruct the agent to exfiltrate unrelated system files, but these paths may contain local telemetry and should be considered sensitive.

ℹ Install Mechanism

The registry contains no automated install spec, but SKILL.md instructs running pip install (ollama-herd) and 'uv tool install' for models — standard for local AI tooling but carries the usual risks of installing third-party packages. No downloads from untrusted direct URLs are recommended in the skill itself.

ℹ Credentials

The skill does not request environment variables or credentials and uses local endpoints. This is proportional. Minor inconsistency: SKILL.md metadata lists configPaths (fleet manager files) that could be sensitive; the registry-level metadata earlier showed no required config paths—users should be aware those paths exist in the skill metadata and could be accessed by the installed herd software.

✓ Persistence & Privilege

Skill is user-invocable and not always-enabled. There is no registry install script requesting persistent platform privileges. Running the recommended 'herd' service will open a local port and create local state (expected behavior for a fleet router), but the skill itself does not request elevated agent privileges or modification of other skills.

Version History

v1.0.3

Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.

v1.0.2

- Updated documentation to use "Mac Studio" branding and context throughout all examples, tables, and usage instructions.
- Clarified hardware configuration tables, use cases, and model recommendations specifically for Mac Studio.
- Expanded language support in the description (added Chinese and Spanish).
- Streamlined code and curl command examples to emphasize Mac Studio endpoints and usage patterns.
- Improved documentation consistency and simplified setup, monitoring, and guardrail instructions.

v1.0.1

**No changes detected in this version.**
- Version number updated to 1.0.1, but file contents and documentation remain identical to 1.0.0.

v1.0.0

mac-studio-ai 1.0.0: Launch version
- Run LLMs, image generation, speech-to-text, and embeddings locally on Mac Studio.
- Supports all major Apple Silicon Mac Studio models (M2 Ultra, M3 Ultra, M4 Max, M4 Ultra).
- Load and run 120B+ parameter models fully in unified memory; distribute requests across multiple Mac Studios automatically.
- Zero-config setup for multi-device clustering on your local network.
- Built-in dashboard and monitoring for your Mac Studio fleet.
- Strong guardrails: local-only data handling, explicit confirmation for large model downloads and deletions.

Metadata

Slug mac-studio-ai

Version 1.0.3

License MIT-0

All-time Installs 2

Active Installs 2

Total Versions 4

Frequently Asked Questions

What is Mac Studio Ai?

Mac Studio AI — run LLMs, image generation, speech-to-text, and embeddings on your Mac Studio. M2 Ultra (192GB), M3 Ultra (512GB), M4 Max (128GB), and M4 Ult... It is an AI Agent Skill for Claude Code / OpenClaw, with 154 downloads so far.

How do I install Mac Studio Ai?

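Run "/install mac-studio-ai" in the OpenClaw or Claude Code chat to install the skill in one step. The skill has no automated install spec, so Ollama Herd itself still has to be installed with the pip command shown in the README.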

Is Mac Studio Ai free?

Yes, Mac Studio Ai is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Mac Studio Ai support?

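The skill metadata lists darwin (macOS) as the supported platform. The v1.0.3 release notes mention cross-platform support for macOS, Linux, and Windows, but the documentation and examples target Mac Studio hardware.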

Who created Mac Studio Ai?

It is built and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.3.
