Mac Studio AI — The Most Powerful Local AI Machine
The Mac Studio is well suited to local AI because the CPU and GPU share a single unified memory pool, so model weights never cross a PCIe bus between system RAM and separate GPU memory. A Mac Studio M4 Ultra with 256GB of unified memory runs 120B+ parameter models. A Mac Studio M3 Ultra with 512GB loads frontier models that would otherwise be split across 4-8 NVIDIA A100s.
One Mac Studio is a powerhouse. Multiple Mac Studios become a fleet.
Mac Studio configurations for AI
| Mac Studio Config | Chip | Memory | GPU Cores | Mac Studio LLM Sweet Spot |
|---|---|---|---|---|
| Mac Studio M4 Max | M4 Max | 128GB | 40 | 70B models on Mac Studio |
| Mac Studio M4 Ultra | M4 Ultra | 256GB | 80 | 120B+ models on Mac Studio |
| Mac Studio M3 Ultra | M3 Ultra | 96-512GB | 60-80 | 236B models on Mac Studio |
| Mac Studio M2 Ultra | M2 Ultra | 192GB | 76 | 70B-120B on Mac Studio |
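As a rough way to read the sweet-spot column, the weights of a model take about (parameters × bits per weight / 8) bytes. The sketch below does that arithmetic. The function name and the 4-bit and 8-bit quantization levels are illustrative assumptions, not figures from Ollama Herd, and the estimate leaves out the KV cache, runtime overhead, and the memory macOS keeps for itself.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 params) * (bits / 8 bytes per param) / 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Model sizes taken from the table above; quantization levels are assumptions.
    for params in (70, 120, 236):
        for bits in (4, 8):
            size = weights_gb(params, bits)
            print(f"{params}B at {bits}-bit: about {size:.0f} GB of weights")
```

By this estimate a 120B model at 4-bit needs about 60GB for weights alone, which is why it sits comfortably in 256GB with room left for context.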
Set up your Mac Studio
pip install ollama-herd # install on your Mac Studio
herd # start Mac Studio as the router (port 11435)
herd-node # connect additional Mac Studios or other devices
Mac Studios discover each other automatically on your local network.
Add Mac Studio image generation
uv tool install mflux # Flux models (~5s at 512px on Mac Studio M4 Ultra)
uv tool install diffusionkit # Stable Diffusion 3/3.5 on Mac Studio
Use your Mac Studio for AI inference
Mac Studio LLM inference — run the biggest models
from openai import OpenAI
# Connect to Mac Studio running Ollama Herd
mac_studio = OpenAI(base_url="http://mac-studio:11435/v1", api_key="not-needed")
# 120B model — runs smoothly on Mac Studio M4 Ultra (256GB unified memory)
response = mac_studio.chat.completions.create(
    model="gpt-oss:120b",  # loaded entirely in Mac Studio unified memory
    messages=[{"role": "user", "content": "How does Mac Studio handle large AI models?"}],
    stream=True,
)
# Print each streamed text delta as it arrives
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
Mac Studio image generation
# Flux via mflux — ~5s on Mac Studio M4 Ultra
curl -o mac_studio_art.png http://mac-studio:11435/api/generate-image \
-H "Content-Type: application/json" \
-d '{"model": "z-image-turbo", "prompt": "a Mac Studio on a minimalist desk with holographic AI display", "width": 1024, "height": 1024}'
# Stable Diffusion 3 on Mac Studio — ~9s
curl -o mac_studio_sd3.png http://mac-studio:11435/api/generate-image \
-H "Content-Type: application/json" \
-d '{"model": "sd3-medium", "prompt": "Mac Studio M4 Ultra rendering AI art", "width": 1024, "height": 1024, "steps": 20}'
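The same image request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint and JSON fields are copied from the curl commands above, the response body is assumed to be the raw image (as `curl -o` implies), and `build_image_request` and `generate_image` are hypothetical helper names rather than part of Ollama Herd.

```python
import json
import urllib.request

# Router address taken from the examples above; adjust to your host name.
ROUTER = "http://mac-studio:11435"


def build_image_request(model: str, prompt: str, width: int = 1024,
                        height: int = 1024, **extra) -> urllib.request.Request:
    """Build the POST request for /api/generate-image without sending it."""
    payload = {"model": model, "prompt": prompt,
               "width": width, "height": height, **extra}
    return urllib.request.Request(
        f"{ROUTER}/api/generate-image",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(path: str, model: str, prompt: str, **kwargs) -> None:
    """Send the request and write the response body to `path`.

    Assumes the body is the raw image, as the curl -o examples imply.
    """
    request = build_image_request(model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=300) as response:
        data = response.read()
    with open(path, "wb") as fh:
        fh.write(data)


if __name__ == "__main__":
    # Only inspects the request; nothing is sent over the network here.
    req = build_image_request("sd3-medium",
                              "Mac Studio M4 Ultra rendering AI art", steps=20)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

With a router running, `generate_image("mac_studio_sd3.png", "sd3-medium", "Mac Studio M4 Ultra rendering AI art", steps=20)` would mirror the second curl command.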
Mac Studio speech-to-text
# Transcribe on Mac Studio via Qwen3-ASR
curl http://mac-studio:11435/api/transcribe \
-F "file=@mac_studio_meeting.wav" \
-F "model=qwen3-asr"
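The `-F` flags make curl send a multipart/form-data upload. The sketch below builds the equivalent request in Python with only the standard library, assuming the same two form fields (`file` and `model`) shown above. `encode_multipart` and `build_transcribe_request` are hypothetical helper names, and the response format is not covered because the source does not show it.

```python
import uuid
import urllib.request

ROUTER = "http://mac-studio:11435"  # router address from the examples above


def encode_multipart(fields: dict, filename: str, file_bytes: bytes,
                     file_field: str = "file"):
    """Encode form fields plus one file as multipart/form-data.

    Returns (body, content_type); content_type carries the boundary.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcribe_request(filename: str, file_bytes: bytes,
                             model: str = "qwen3-asr") -> urllib.request.Request:
    """Build (but do not send) the POST for /api/transcribe."""
    body, content_type = encode_multipart({"model": model}, filename, file_bytes)
    return urllib.request.Request(
        f"{ROUTER}/api/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To send it, read the audio file as bytes, pass the result of `build_transcribe_request` to `urllib.request.urlopen`, and read the response body.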
Mac Studio embeddings
# Generate embeddings on Mac Studio
curl http://mac-studio:11435/api/embed \
-d '{"model": "nomic-embed-text", "input": "Mac Studio M4 Ultra unified memory AI inference"}'
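Embeddings are usually compared with cosine similarity. The sketch below pairs a small client for the endpoint above with that comparison. It assumes the router passes through Ollama's response shape for `/api/embed`, `{"embeddings": [[...]]}`, which the source does not confirm; `embed` and `cosine_similarity` are hypothetical helper names.

```python
import json
import math
import urllib.request

ROUTER = "http://mac-studio:11435"  # router address from the examples above


def embed(text: str, model: str = "nomic-embed-text") -> list:
    """POST to /api/embed and return the first embedding vector.

    Assumption: the response looks like Ollama's, {"embeddings": [[...]]}.
    """
    request = urllib.request.Request(
        f"{ROUTER}/api/embed",
        data=json.dumps({"model": model, "input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["embeddings"][0]


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

With a router running, `cosine_similarity(embed("unified memory"), embed("shared RAM"))` would score how close the two phrases are; values near 1 mean the vectors point in nearly the same direction.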
Recommended models for Mac Studio
| Mac Studio Config | Models for this Mac Studio |
|---|---|
| Mac Studio M4 Max (128GB) | llama3.3:70b, qwen3:72b, deepseek-r1:70b, codestral |
| Mac Studio M4 Ultra (256GB) | gpt-oss:120b, qwen3:110b, two 70B models simultaneously |
| Mac Studio M3 Ultra (512GB) | deepseek-v3:236b (quantized), multiple 70B models at once |
Ask the Mac Studio for recommendations: GET http://mac-studio:11435/dashboard/api/recommendations
Multiple Mac Studios as a fleet
Mac Studio #1 (M4 Ultra, 256GB) ─┐
Mac Studio #2 (M4 Max, 128GB)    ├──→ Mac Studio Router (:11435) ←── Your apps
Mac Mini (32GB)                 ─┘
The Mac Studio router scores each device on 7 signals. Big models route to the Mac Studio with the most memory.
Monitor your Mac Studio
Mac Studio dashboard at http://mac-studio:11435/dashboard — models loaded on each Mac Studio, queue depths, thermal state, memory.
# Mac Studio fleet status
curl -s http://mac-studio:11435/fleet/status | python3 -m json.tool
# Mac Studio health checks
curl -s http://mac-studio:11435/dashboard/api/health | python3 -m json.tool
Example Mac Studio fleet status response:
{
"fleet": {"nodes_online": 2, "nodes_total": 2},
"nodes": [
{"node_id": "Mac-Studio-Ultra", "memory": {"total_gb": 256, "used_gb": 120}},
{"node_id": "Mac-Studio-Max", "memory": {"total_gb": 128, "used_gb": 85}}
]
}
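A status response like this one is enough to see how much headroom each node has. The sketch below computes free memory per node and picks the node with the most room for a model of a given size. It is a memory-only illustration of the "big models go to the Mac Studio with the most memory" idea, not the router's actual 7-signal scoring, and `free_memory_gb` and `node_with_room` are hypothetical helper names.

```python
import json

# The example response shown above, inlined so the sketch runs without a router.
EXAMPLE_STATUS = json.loads("""
{
  "fleet": {"nodes_online": 2, "nodes_total": 2},
  "nodes": [
    {"node_id": "Mac-Studio-Ultra", "memory": {"total_gb": 256, "used_gb": 120}},
    {"node_id": "Mac-Studio-Max", "memory": {"total_gb": 128, "used_gb": 85}}
  ]
}
""")


def free_memory_gb(status: dict) -> dict:
    """Map each node_id to total_gb minus used_gb."""
    return {
        node["node_id"]: node["memory"]["total_gb"] - node["memory"]["used_gb"]
        for node in status["nodes"]
    }


def node_with_room(status: dict, needed_gb: float):
    """Return the node with the most free memory that fits needed_gb, else None."""
    free = free_memory_gb(status)
    candidates = {name: gb for name, gb in free.items() if gb >= needed_gb}
    return max(candidates, key=candidates.get) if candidates else None


if __name__ == "__main__":
    print(free_memory_gb(EXAMPLE_STATUS))
    print(node_with_room(EXAMPLE_STATUS, 60))
```

In the example data the Ultra node has 136GB free and the Max node 43GB, so a model needing 60GB only fits on the Ultra.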
Full documentation
Contribute
Ollama Herd is open source (MIT). Built by Mac Studio owners for Mac Studio owners:
- Star on GitHub — help other Mac Studio users find us
- Open an issue — share your Mac Studio AI setup
- PRs welcome: CLAUDE.md gives AI agents full context. 444 tests, async Python.
Guardrails
- No automatic downloads — Mac Studio model pulls require explicit user confirmation.
- Model deletion requires explicit user confirmation.
- All Mac Studio requests stay local — no data leaves your network.
- Never delete or modify files in ~/.fleet-manager/.
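- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install mac-studio-ai
- After installation, call the Skill by name or trigger it with /mac-studio-ai.
- Provide the inputs described in the Skill's parameter notes to get structured output.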
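What is Mac Studio Ai?
Mac Studio AI: run LLMs, image generation, speech-to-text, and embeddings on your Mac Studio. M2 Ultra (192GB), M3 Ultra (512GB), M4 Max (128GB), and M4 Ult... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 154 downloads to date.
How do I install Mac Studio Ai?
Run the command "/install mac-studio-ai" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is Mac Studio Ai free?
Yes. Mac Studio Ai is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Mac Studio Ai support?
Mac Studio Ai is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (darwin).
Who develops Mac Studio Ai?
It is developed and maintained by Twin Geeks (@twinsgeeks). The current version is v1.0.3.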