Local Coding
/install local-coding
Local Coding Assistant — Code Models Across Your Fleet
Run the best open-source coding models on your own hardware. DeepSeek-Coder, Codestral, StarCoder, and Qwen-Coder routed across your devices — the fleet picks the best machine for every code generation request.
Your code never leaves your network. No GitHub Copilot subscription, no cloud API costs.
Coding models available
| Model | Parameters | Ollama name | Strengths |
|---|---|---|---|
| Codestral | 22B | codestral | 80+ languages, fill-in-the-middle, Mistral's code specialist |
| DeepSeek-Coder-V2 | 236B MoE (21B active) | deepseek-coder-v2 | Matches GPT-4 Turbo on code tasks |
| DeepSeek-Coder | 6.7B, 33B | deepseek-coder:33b | Purpose-built for code (87% code training data) |
| Qwen2.5-Coder | 7B, 32B | qwen2.5-coder:32b | Strong multi-language code generation |
| StarCoder2 | 3B, 7B, 15B | starcoder2:15b | Trained on The Stack v2, 600+ languages |
| CodeGemma | 7B | codegemma | Google's code-focused Gemma variant |
Quick start
pip install ollama-herd # PyPI: https://pypi.org/project/ollama-herd/
herd # start the router (port 11435)
herd-node # run on each device — finds the router automatically
No models are downloaded during installation. All pulls require user confirmation.
Code generation
Write new code
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="codestral",
    messages=[{"role": "user", "content": "Write a thread-safe LRU cache in Python with TTL support"}],
)
print(response.choices[0].message.content)
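Coding models usually answer in Markdown, with the generated code wrapped in fenced blocks. The sketch below shows one way to pull those blocks out of the reply text so they can be saved or executed. The helper name `extract_code_blocks` and the sample reply are illustrative assumptions, not part of ollama-herd.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Matches an opening fence with an optional language tag, then everything up to the closing fence.
FENCE_RE = re.compile(FENCE + r"([\w+-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(markdown_text):
    """Return a list of (language, code) tuples found in a Markdown reply."""
    return [
        (lang or "text", code.rstrip("\n"))
        for lang, code in FENCE_RE.findall(markdown_text)
    ]


# Hypothetical reply text standing in for response.choices[0].message.content
reply = (
    "Here is the cache:\n"
    + FENCE + "python\nclass LRUCache:\n    pass\n" + FENCE
    + "\nDone."
)
blocks = extract_code_blocks(reply)
for language, code in blocks:
    print(language)
    print(code)
```

A reply without any fenced block yields an empty list, so callers can fall back to treating the whole reply as prose.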
Code review
curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-coder-v2:16b",
    "messages": [{"role": "user", "content": "Review this code for bugs and security issues:\n\n```python\ndef process_payment(amount, card_number):\n    ...\n```"}]
  }'
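Hand-escaping source code inside a JSON string is error-prone, so it can be easier to build the review request in Python and let `json` do the escaping. This is a minimal sketch using only the standard library. The function names are hypothetical, and `request_review` assumes the router returns the usual OpenAI-style `choices` array.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:11435/v1/chat/completions"  # endpoint from the curl example
FENCE = "`" * 3  # Markdown code fence


def build_review_payload(source_code, model="deepseek-coder-v2:16b", language="python"):
    """Wrap source code in a fenced block and build the chat request body."""
    prompt = (
        "Review this code for bugs and security issues:\n\n"
        f"{FENCE}{language}\n{source_code.rstrip()}\n{FENCE}"
    )
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def request_review(payload, url=ROUTER_URL, timeout=300):
    """POST the payload to the router. Requires a running fleet."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # json.dumps handles all escaping
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-compatible response shape
    return body["choices"][0]["message"]["content"]


payload = build_review_payload("def process_payment(amount, card_number):\n    ...\n")
print(json.dumps(payload, indent=2))
```

With the fleet running, `print(request_review(payload))` would send the request and print the review.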
Refactoring
curl http://localhost:11435/api/chat -d '{
"model": "qwen2.5-coder:32b",
"messages": [{"role": "user", "content": "Refactor this to use async/await: ..."}],
"stream": false
}'
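The examples above use two different routes: the OpenAI-style `/v1/chat/completions` and the Ollama-style `/api/chat`. Assuming the router mirrors the standard response shapes of those two APIs, a small helper can extract the assistant text from either one. The function name and the sample bodies below are illustrative.

```python
def extract_reply(body):
    """Return the assistant text from either response shape."""
    if "choices" in body:
        # OpenAI-style response from /v1/chat/completions
        return body["choices"][0]["message"]["content"]
    if "message" in body:
        # Ollama-style response from /api/chat with "stream": false
        return body["message"]["content"]
    raise ValueError("unrecognized response shape: " + ", ".join(sorted(body)))


# Hand-written sample bodies, trimmed to the fields the helper reads
openai_style = {
    "choices": [{"message": {"role": "assistant", "content": "async def fetch(): ..."}}]
}
ollama_style = {
    "message": {"role": "assistant", "content": "async def fetch(): ..."},
    "done": True,
}

print(extract_reply(openai_style))
print(extract_reply(ollama_style))
```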
Works with your IDE tools
The fleet exposes an OpenAI-compatible API at http://localhost:11435/v1. Point any coding tool at it:
| Tool | Config |
|---|---|
| Aider | aider --openai-api-base http://localhost:11435/v1 --openai-api-key not-needed --model openai/codestral |
| Continue.dev | Set API base to http://localhost:11435/v1 in VS Code settings |
| Cline | Set provider to OpenAI-compatible, base URL http://localhost:11435/v1 |
| Open WebUI | Set Ollama URL to http://localhost:11435 |
| LangChain | ChatOpenAI(base_url="http://localhost:11435/v1", api_key="not-needed", model="codestral") |
Pick the right model for your RAM
Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works.
| Device | RAM | Best coding model |
|---|---|---|
| MacBook Air (8GB) | 8GB | starcoder2:3b or deepseek-coder:6.7b |
| Mac Mini (16GB) | 16GB | codestral or starcoder2:15b |
| Mac Mini (32GB) | 32GB | qwen2.5-coder:32b or deepseek-coder:33b |
| Mac Studio (128GB) | 128GB | deepseek-coder-v2 — frontier code quality |
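The table can be expressed as a simple lookup: take the largest tier whose RAM requirement the device meets. The sketch below encodes only the rows shown above. The tier thresholds come from the table, and the function name is hypothetical.

```python
# (minimum RAM in GB, suggested models), taken from the table above, largest tier first
RAM_TIERS = [
    (128, ["deepseek-coder-v2"]),
    (32, ["qwen2.5-coder:32b", "deepseek-coder:33b"]),
    (16, ["codestral", "starcoder2:15b"]),
    (8, ["starcoder2:3b", "deepseek-coder:6.7b"]),
]


def suggest_models(ram_gb):
    """Return the suggestions for the largest tier that fits in ram_gb."""
    for min_ram, models in RAM_TIERS:
        if ram_gb >= min_ram:
            return models
    return []  # below the smallest tier listed in the table


for ram in (8, 24, 64, 128):
    print(ram, suggest_models(ram))
```

A 24GB machine falls back to the 16GB tier, since the table lists nothing between 16GB and 32GB.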
Check what's running
# Models loaded in memory
curl -s http://localhost:11435/api/ps | python3 -m json.tool
# All available models
curl -s http://localhost:11435/api/tags | python3 -m json.tool
# Recent coding request traces
curl -s "http://localhost:11435/dashboard/api/traces?limit=5" | python3 -m json.tool
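The same checks can be scripted. The sketch below assumes `/api/ps` and `/api/tags` return Ollama-style listings (a `models` array whose entries carry `name` and `size` in bytes). The helper names and the sample sizes are illustrative, not measured values.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # router address from the quick start


def fetch_json(path, base_url=BASE_URL, timeout=10):
    """GET a JSON document from the router. Requires a running fleet."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def summarize_models(payload):
    """Return sorted 'name (X.X GB)' strings from an Ollama-style model listing."""
    lines = []
    for model in payload.get("models", []):
        size_gb = model.get("size", 0) / 1e9
        lines.append(f"{model['name']} ({size_gb:.1f} GB)")
    return sorted(lines)


# Hand-written sample in the assumed shape; sizes are placeholders
sample = {
    "models": [
        {"name": "starcoder2:15b", "size": 9100000000},
        {"name": "codestral:latest", "size": 12600000000},
    ]
}
print(summarize_models(sample))
```

With the router running, `summarize_models(fetch_json("/api/ps"))` lists what is loaded in memory, and `summarize_models(fetch_json("/api/tags"))` lists everything available.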
Also available on this fleet
General-purpose LLMs
Llama 3.3, Qwen 3.5, DeepSeek-R1, Mistral Large — for non-code tasks through the same endpoint.
Image generation
curl http://localhost:11435/api/generate-image \
-d '{"model": "z-image-turbo", "prompt": "developer workspace illustration", "width": 512, "height": 512}'
Speech-to-text
curl http://localhost:11435/api/transcribe -F "[email protected]" -F "model=qwen3-asr"
Full documentation
- Agent Setup Guide — all 4 model types
- API Reference — complete endpoint docs
Guardrails
- Model downloads require explicit user confirmation — coding models range from 2GB to 130GB+. Always confirm before pulling.
- Model deletion requires explicit user confirmation.
- Never delete or modify files in ~/.fleet-manager/.
- No models are downloaded automatically; all pulls are user-initiated or require opt-in.
- Your code stays local — no prompts or generated code leave your network.
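Install in OpenClaw
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install local-coding
- After installation, invoke the Skill by name or trigger it with /local-coding
- Provide the required inputs described in the Skill's parameter notes to get structured output.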
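What is Local Coding?
Local coding assistant: run DeepSeek-Coder, Codestral, StarCoder, and Qwen-Coder across your device fleet. Code generation, review, refactoring, and debuggi... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 125 downloads to date.
How do I install Local Coding?
Run the command /install local-coding in the OpenClaw or Claude Code chat box to install it in one step. No additional configuration is needed.
Is Local Coding free?
Yes. Local Coding is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does Local Coding support?
Local Coding is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed (darwin, linux, windows).
Who develops Local Coding?
It is developed and maintained by Twin Geeks (@twinsgeeks). The current version is v1.0.1.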