twinsgeeks

Llama Llama3

Author: Twin Geeks · GitHub · v1.0.1 · MIT-0
Platforms: darwin, linux, windows ✓ Security check passed
Total downloads: 153
Favorites: 2
Current installs: 2
Versions: 2
Install in OpenClaw
/install llama-llama3
Description
Llama 3 by Meta — run Llama 3.3, Llama 3.2, and Llama 3.1 across your local device fleet. The most popular open-source LLM family routed to the best availabl...
Usage instructions (SKILL.md)

Llama 3 — Run Meta's LLMs Across Your Local Fleet

The Llama family is one of the most widely deployed open-source LLM families. This skill routes Llama requests across your devices, and the fleet picks the best machine for every request automatically.

Supported Llama models

Model | Parameters | Ollama name | Best for
Llama 3.3 | 70B | llama3.3:70b | Best overall; competitive with GPT-4o on many benchmarks
Llama 3.2 | 1B, 3B | llama3.2:3b | Fast responses on low-RAM devices
Llama 3.1 | 8B, 70B, 405B | llama3.1:70b | Proven workhorse, massive community
Llama 3 | 8B, 70B | llama3:70b | Original release, still widely used

Quick start

pip install ollama-herd    # PyPI: https://pypi.org/project/ollama-herd/
herd                       # start the router (port 11435)
herd-node                  # run on each device — finds the router automatically

No models are downloaded during installation. Models are pulled manually via the dashboard or, if you opt in via the auto_pull setting, on demand when a request arrives. All pulls require user confirmation.
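
Before sending requests, it can help to confirm that the router is actually answering. The following is a minimal sketch, not part of ollama-herd: it assumes the router answers GET /api/tags on port 11435 (both appear elsewhere in this document), and the helper names router_is_up and wait_for_router are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request

ROUTER_URL = "http://localhost:11435"  # router port from the quick start


def router_is_up(base_url: str = ROUTER_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /api/tags answers with HTTP 200 and a JSON body."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


def wait_for_router(base_url: str = ROUTER_URL, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll the router until it answers or the attempts run out."""
    for _ in range(attempts):
        if router_is_up(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling wait_for_router() right after starting herd returns True once the endpoint responds, and False if it never does within the allotted attempts.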

Use Llama through the fleet

OpenAI SDK (drop-in replacement)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Explain transformer architecture"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

curl (Ollama format)

curl http://localhost:11435/api/chat -d '{
  "model": "llama3.3:70b",
  "messages": [{"role": "user", "content": "Write a Python quicksort"}],
  "stream": false
}'
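
The same Ollama-format call can be made from Python with only the standard library. This is a sketch under assumptions: it presumes the router mirrors Ollama's non-streaming /api/chat response, where the reply text sits under message.content. The helper names build_chat_request, extract_reply, and chat are hypothetical.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:11435"


def build_chat_request(model: str, prompt: str, base_url: str = ROUTER_URL) -> urllib.request.Request:
    """Build the POST request that the curl example above sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Assumed shape: Ollama's non-streaming reply keeps the text in message.content."""
    return response_body.get("message", {}).get("content", "")


def chat(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send one prompt and return the reply text (large models can be slow, hence the long timeout)."""
    with urllib.request.urlopen(build_chat_request(model, prompt), timeout=timeout) as resp:
        return extract_reply(json.load(resp))
```

For example, chat("llama3.3:70b", "Write a Python quicksort") would return the model's answer as a string if the router is running and the model is available.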

curl (OpenAI format)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b", "messages": [{"role": "user", "content": "Hello"}]}'

Which Llama model for your hardware

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

Pick the model that fits your available memory — smaller models work great for most tasks:

Model | Min RAM | Example hardware
llama3.2:1b | 2GB | Any Mac, even 8GB
llama3.2:3b | 4GB | Mac Mini (16GB)
llama3:8b | 8GB | Mac Mini (16GB)
llama3.3:70b | 48GB | Mac Studio M4 Max (128GB)
llama3.1:405b | 256GB+ | Mac Studio M3 Ultra (256GB or more) or distributed

The fleet router sends requests to the machine where the model is loaded. No manual routing needed.
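
The memory table can be turned into a small lookup when scripting a setup. The sketch below only encodes the minimum-RAM figures listed above; it is not how the fleet router chooses a machine, and the function name largest_model_that_fits is hypothetical.

```python
from typing import Optional

# (Ollama name, minimum RAM in GB), copied from the table above, smallest first.
MODEL_MIN_RAM_GB = [
    ("llama3.2:1b", 2),
    ("llama3.2:3b", 4),
    ("llama3:8b", 8),
    ("llama3.3:70b", 48),
    ("llama3.1:405b", 256),
]


def largest_model_that_fits(available_ram_gb: float) -> Optional[str]:
    """Return the largest listed model whose minimum RAM fits, or None if none does."""
    best = None
    for name, min_ram_gb in MODEL_MIN_RAM_GB:
        if min_ram_gb <= available_ram_gb:
            best = name  # the list is ordered, so later matches are larger models
    return best


print(largest_model_that_fits(16))   # llama3:8b
print(largest_model_that_fits(128))  # llama3.3:70b
```

Treat the result as an upper bound: the operating system and other applications also need memory, so a smaller model may be the more comfortable choice.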

Why run Llama locally

  • Free after hardware: Meta's Llama license permits commercial use for most organizations (services above 700 million monthly active users need a separate license from Meta), with no per-token cost
  • Privacy: prompts and responses never leave your network
  • No rate limits: your hardware, your throughput
  • Fleet routing: multiple machines share the load automatically

See what's running

# Models loaded in memory right now
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# All models available across the fleet
curl -s http://localhost:11435/api/tags | python3 -m json.tool
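
The two listings can also be compared from Python to see which models are available but not currently loaded. This sketch assumes the router returns Ollama's usual shape for both endpoints, a "models" array whose entries carry a "name" field; the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Dict, List

ROUTER_URL = "http://localhost:11435"


def fetch_json(path: str, base_url: str = ROUTER_URL, timeout: float = 5.0) -> Dict:
    """GET a router endpoint such as /api/ps or /api/tags and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.load(resp)


def model_names(listing: Dict) -> List[str]:
    """Extract sorted model names from an Ollama-style {"models": [...]} listing."""
    return sorted(m["name"] for m in listing.get("models", []) if "name" in m)


def available_but_not_loaded(tags: Dict, ps: Dict) -> List[str]:
    """Names present in /api/tags that do not appear in /api/ps."""
    loaded = set(model_names(ps))
    return [name for name in model_names(tags) if name not in loaded]
```

With a running router, available_but_not_loaded(fetch_json("/api/tags"), fetch_json("/api/ps")) lists the models that would need to be loaded into memory before they can serve a request.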

Monitor Llama performance

# Recent request traces — see latency, tokens, which node handled each request
curl -s "http://localhost:11435/dashboard/api/traces?limit=10" | python3 -m json.tool

# Fleet health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

Web dashboard at http://localhost:11435/dashboard — live view of all nodes, queues, and models.
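
To keep a record of the monitoring endpoints over time, the raw JSON can be saved to disk. The sketch below makes no assumption about the fields inside the traces or health responses, since this document does not describe them. It only reuses the two URLs shown above, and the helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

ROUTER_URL = "http://localhost:11435"


def traces_url(limit: int = 10, base_url: str = ROUTER_URL) -> str:
    """Build the traces URL with the limit query parameter used in the curl example."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"{base_url}/dashboard/api/traces?{urlencode({'limit': limit})}"


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def save_snapshot(path: str, limit: int = 10) -> None:
    """Write the raw traces and health JSON to a file for later comparison."""
    snapshot = {
        "traces": fetch_json(traces_url(limit)),
        "health": fetch_json(f"{ROUTER_URL}/dashboard/api/health"),
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2)
```

Running save_snapshot("fleet-snapshot.json") on a schedule gives a simple history to diff when latency changes.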

Also available on this fleet

Other LLM models

Qwen 3.5, DeepSeek-V3, DeepSeek-R1, Phi 4, Mistral, Gemma 3, Codestral — any Ollama model routes through the same endpoint.

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "a llama in the mountains", "width": 512, "height": 512}'

Speech-to-text

curl http://localhost:11435/api/transcribe -F "[email protected]" -F "model=qwen3-asr"

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Meta Llama open source language model"}'
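
A common use of embeddings is comparing two texts by cosine similarity. The following sketch assumes the router returns Ollama's /api/embed shape, an "embeddings" array with one vector per input. The helper names embed and cosine_similarity are hypothetical.

```python
import json
import math
import urllib.request
from typing import List, Sequence

ROUTER_URL = "http://localhost:11435"


def embed(texts: List[str], model: str = "nomic-embed-text", base_url: str = ROUTER_URL) -> List[List[float]]:
    """POST the texts to /api/embed and return one vector per input (assumed response shape)."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp).get("embeddings", [])


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With a running router, the two vectors returned by embed(["first text", "second text"]) can be passed straight to cosine_similarity.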

Full documentation

Guardrails

  • Model downloads require explicit user confirmation — Llama models range from 1GB (1B) to 230GB+ (405B). Always confirm before pulling.
  • Model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • If a model is too large for available memory, suggest a smaller variant.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in via the auto_pull setting.
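
A script or agent that drives the fleet can enforce the confirmation guardrail itself. This is a minimal sketch of such a check, not the mechanism ollama-herd uses internally. The function name confirm_pull and the injectable ask parameter are hypothetical, and the size passed in is whatever estimate the caller has.

```python
from typing import Callable


def confirm_pull(model: str, size_gb: float, ask: Callable[[str], str] = input) -> bool:
    """Ask the user before a download; anything other than y/yes counts as a refusal."""
    answer = ask(f"Pull {model} (about {size_gb:g} GB)? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


# Non-interactive demonstration: the ask callable stands in for a real prompt.
print(confirm_pull("llama3.1:405b", 230, ask=lambda _prompt: "n"))  # False
print(confirm_pull("llama3.2:1b", 1, ask=lambda _prompt: "yes"))    # True
```

Defaulting to "no" on empty or unexpected input keeps an accidental Enter key from starting a download of several hundred gigabytes.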
Security recommendations
This skill is internally consistent with being a local fleet router, but you should still do basic hygiene before installing:
  1. Inspect the PyPI package 'ollama-herd' and the linked GitHub repository to confirm the code matches the docs and that model downloads are interactive as stated.
  2. Run the software in an isolated/test environment first (or a VM) to verify it only listens on localhost or your intended network interfaces.
  3. Review ~/.fleet-manager/ contents and logs for any sensitive data; back them up if needed.
  4. Confirm that model pulls truly require explicit confirmation and that no automatic outbound traffic uploads prompts/responses.
  5. Limit which devices/users can join your fleet (authentication/ACLs) to avoid exposing local models.
If you cannot review the package source, treat the PyPI install as a moderate risk.
Capability assessment
Purpose & Capability
Name/description (a local fleet router for Llama models) lines up with what's requested and documented: the SKILL.md instructs installing a herd router package, running local binaries (herd, herd-node), and talking to localhost endpoints. Required binaries (curl/wget, optional python/pip) are appropriate. Declared config paths (~/.fleet-manager/latency.db and logs/herd.jsonl) are consistent with a fleet manager that records latency and logs.
Instruction Scope
SKILL.md is instruction-only and stays within the stated purpose: it tells the operator to pip install the herd package, run herd and herd-node, and call local HTTP endpoints. It does not instruct reading arbitrary user files or exfiltrating data. One point to note: metadata lists fleet config paths (logs/db) which are sensitive — the doc warns not to modify them, but if installed, the herd software will likely read/write those files. Verify that behavior in the package source before trusting logs/latency data.
Install Mechanism
There is no formal install spec in the registry (instruction-only), but the SKILL.md recommends 'pip install ollama-herd' from PyPI. Installing third-party packages from PyPI is a normal distribution route but carries modest risk—inspect the PyPI package and the linked GitHub repo before installing. No downloads from unknown personal servers or archive extracts are specified.
Credentials
The skill requests no environment variables, no credentials, and no system config paths beyond its own fleet config directory. That is proportionate for a local fleet router that operates on localhost and local devices.
Persistence & Privilege
The always flag is false, and the skill does not request system-wide privileges or modifications to other skills. The declared config paths imply it will maintain local state under ~/.fleet-manager/, which is expected for this type of software; ensure you are comfortable with that directory being created/used.
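How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install llama-llama3
  3. Once installed, invoke the Skill by name or trigger it with /llama-llama3
  4. Provide the inputs described in the Skill's parameters to get structured output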
Version history
v1.0.1
Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.
v1.0.0
llama-llama3 1.0.0: Initial release
  • Route Meta's Llama 3 family of models (including 3.3, 3.2, 3.1, and original 3) across your local device fleet.
  • Automatically selects the best available device for each Llama request.
  • Supports OpenAI-compatible API and Ollama API endpoints.
  • Manual, user-confirmed model downloads; no automatic pulls.
  • Includes fleet monitoring, dashboard, and support for multiple tasks (chat, images, speech-to-text, embeddings).
  • Commercial use supported with zero cloud or per-token costs.
Metadata
Slug: llama-llama3
Version: 1.0.1
License: MIT-0
Total installs: 2
Current installs: 2
Versions released: 2
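FAQ

What is Llama Llama3?

Llama 3 by Meta: run Llama 3.3, Llama 3.2, and Llama 3.1 across your local device fleet. The most popular open-source LLM family routed to the best availabl... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 153 total downloads so far.

How do I install Llama Llama3?

Run the command "/install llama-llama3" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is Llama Llama3 free?

Yes. Llama Llama3 is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Llama Llama3 support?

Llama Llama3 is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed (darwin, linux, windows).

Who develops Llama Llama3?

It is developed and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.1.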
