
DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices

Author: twinsgeeks

by Twin Geeks · GitHub ↗ · v1.0.1 · MIT-0

darwin · linux · ✓ Security Clean

Downloads: 105

Install in OpenClaw

/install deepseek-deepseek-v3

Description

DeepSeek models on your local fleet — DeepSeek-V3, DeepSeek-V3.2, DeepSeek-R1, DeepSeek-Coder routed across multiple devices via Ollama Herd. 7-signal scorin...

README (SKILL.md)

DeepSeek — Run DeepSeek Models Across Your Local Fleet

Run DeepSeek-V3, DeepSeek-R1, and DeepSeek-Coder on your own hardware. The fleet router picks the best device for every request — no cloud API needed, zero per-token costs, all data stays on your machines.

Supported DeepSeek models

Model	Parameters	Ollama name	Best for
DeepSeek-V3	671B MoE (37B active)	`deepseek-v3`	General use: matches GPT-4o on most benchmarks
DeepSeek-V3.1	671B MoE	`deepseek-v3.1`	Hybrid thinking/non-thinking modes
DeepSeek-V3.2	671B MoE	`deepseek-v3.2`	Improved reasoning and agent performance
DeepSeek-R1	1.5B–671B	`deepseek-r1`	Reasoning: approaches OpenAI o3 and Gemini 2.5 Pro
DeepSeek-Coder	1.3B–33B	`deepseek-coder`	Code generation (trained on 87% code, 13% natural language)
DeepSeek-Coder-V2	236B MoE (21B active)	`deepseek-coder-v2`	Code: matches GPT-4 Turbo on code tasks

Setup

pip install ollama-herd
herd              # start the router (port 11435)
herd-node         # run on each machine

# Pull a DeepSeek model
ollama pull deepseek-r1:70b

Package: ollama-herd | Repo: github.com/geeks-accelerator/ollama-herd
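
Before sending requests, it can help to confirm that the model you pulled is actually visible. The sketch below parses an Ollama-style `/api/tags` listing. It assumes the router on port 11435 proxies that Ollama endpoint, which the README does not state explicitly; the helper names are hypothetical.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:11435"  # router port from the setup steps


def model_names(tags_payload: dict) -> set[str]:
    """Extract model names from an Ollama-style /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def fetch_tags(base_url: str = ROUTER_URL, timeout: float = 5.0) -> dict:
    """GET /api/tags. Assumption: the router forwards this Ollama endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def is_pulled(model: str, tags_payload: dict) -> bool:
    """True if `model` (e.g. "deepseek-r1:70b") appears in the tag listing."""
    return model in model_names(tags_payload)
```

With the router running, `is_pulled("deepseek-r1:70b", fetch_tags())` reports whether the pull above is visible. If the router does not expose `/api/tags`, pointing `base_url` at a node's own Ollama port is a reasonable fallback.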

Use DeepSeek through the fleet

OpenAI SDK

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# DeepSeek-R1 for reasoning
response = client.chat.completions.create(
    model="deepseek-r1:70b",
    messages=[{"role": "user", "content": "Prove that there are infinitely many primes"}],
    stream=True,
)
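for chunk in response:
    # Each chunk carries an incremental delta; content may be None on the last one
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()  # end the streamed output with a newline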
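
DeepSeek-R1 typically emits its chain of thought before the final answer. A minimal sketch for separating the two follows, assuming the reasoning arrives inline wrapped in `<think>...</think>` tags; some Ollama versions may return it in a separate field instead, in which case this helper is unnecessary. The function name is hypothetical.

```python
import re

# Assumption: the model wraps its reasoning in <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a complete R1 response string."""
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer
```

When streaming, accumulate the chunks into one string first, then call `split_reasoning` on the full text, since a tag can be split across chunk boundaries.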

DeepSeek-Coder for code

response = client.chat.completions.create(
    model="deepseek-coder-v2:16b",
    messages=[{"role": "user", "content": "Write a Redis cache decorator in Python"}],
)
print(response.choices[0].message.content)

Ollama API

# DeepSeek-V3 general chat
curl http://localhost:11435/api/chat -d '{
  "model": "deepseek-v3",
  "messages": [{"role": "user", "content": "Explain quantum computing"}],
  "stream": false
}'

# DeepSeek-R1 reasoning
curl http://localhost:11435/api/chat -d '{
  "model": "deepseek-r1:70b",
  "messages": [{"role": "user", "content": "Solve this step by step: ..."}],
  "stream": false
}'
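
The same `/api/chat` call can be made from Python without extra dependencies. This is a sketch using only the standard library; the function names are hypothetical, and the response parsing assumes the usual Ollama non-streaming shape (`{"message": {"content": ...}}`).

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str,
                       base_url: str = "http://localhost:11435") -> urllib.request.Request:
    """Build (but do not send) a non-streaming POST to /api/chat."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 600.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["message"]["content"]
```

For example, `send_chat(build_chat_request("deepseek-v3", "Explain quantum computing"))` mirrors the first curl call. The generous timeout is a guess to allow for cold model loads.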

Hardware recommendations

DeepSeek models are large. Here's what fits where:

Model	Min RAM	Recommended hardware
`deepseek-r1:1.5b`	4GB	Any Mac
`deepseek-r1:7b`	8GB	Mac Mini M4 (16GB)
`deepseek-r1:14b`	12GB	Mac Mini M4 (24GB)
`deepseek-r1:32b`	24GB	Mac Mini M4 Pro (48GB)
`deepseek-r1:70b`	48GB	Mac Studio M4 Max (128GB)
`deepseek-coder-v2:16b`	12GB	Mac Mini M4 (24GB)
`deepseek-v3`	256GB+	Mac Studio M3 Ultra (512GB)

The fleet router automatically sends requests to the machine where the model is loaded — no manual routing needed.
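
To choose a variant for a given machine, the table can be turned into a small lookup. The sketch below copies the Min RAM column for the `deepseek-r1` tags and returns the largest one that fits; the function name is hypothetical, and the figures are only as accurate as the table.

```python
from typing import Optional

# Minimum RAM in GB, copied from the hardware table, ordered smallest to largest.
R1_MIN_RAM_GB = [
    ("deepseek-r1:1.5b", 4),
    ("deepseek-r1:7b", 8),
    ("deepseek-r1:14b", 12),
    ("deepseek-r1:32b", 24),
    ("deepseek-r1:70b", 48),
]


def largest_fitting_r1(available_gb: float) -> Optional[str]:
    """Return the largest deepseek-r1 tag whose minimum RAM fits, else None."""
    fitting = [tag for tag, min_gb in R1_MIN_RAM_GB if min_gb <= available_gb]
    return fitting[-1] if fitting else None
```

A 16GB machine resolves to `deepseek-r1:14b`. Leaving headroom above the minimum for the OS and context window is sensible in practice.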

Why run DeepSeek locally

Zero cost — DeepSeek API charges per token. Local is free after hardware.
Privacy — code and business data never leave your network.
No rate limits — DeepSeek API throttles during peak hours. Local has no throttle.
Availability — DeepSeek API has had outages. Your hardware doesn't depend on their servers.
Fleet routing — multiple machines share the load. One busy? Request goes to the next.

Fleet features

7-signal scoring — picks the optimal node for every request
Auto-retry — fails over to next best node transparently
VRAM-aware fallback — routes to a loaded model in the same category instead of cold-loading
Context protection — prevents expensive model reloads from num_ctx changes
Request tagging — track per-project DeepSeek usage
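
The auto-retry behaviour can be pictured as trying candidates in ranked order until one answers. The sketch below is a generic illustration of that pattern, not ollama-herd's implementation: the README does not describe how nodes are ranked or which errors trigger failover, so the node names, the `send` callable, and the choice of `OSError` are all assumptions.

```python
from typing import Callable, Iterable


class AllNodesFailed(RuntimeError):
    """Raised when every candidate node rejected the request."""


def send_with_failover(ranked_nodes: Iterable[str],
                       send: Callable[[str], str]) -> tuple[str, str]:
    """Try nodes in ranked order; return (node, reply) from the first success."""
    errors = []
    for node in ranked_nodes:
        try:
            return node, send(node)
        except OSError as exc:  # connection refused, timeout, and similar
            errors.append(f"{node}: {exc}")
    raise AllNodesFailed("; ".join(errors) or "no nodes supplied")
```

The router does this for you behind a single endpoint, so client code normally never needs such a loop.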

Also available on this fleet

Other LLM models

Llama 3.3, Qwen 3.5, Phi 4, Mistral, Gemma 3 — any Ollama model routes through the same endpoint.

Image generation

curl -o image.png http://localhost:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model":"z-image-turbo","prompt":"a sunset","width":1024,"height":1024,"steps":4}'

Speech-to-text

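# Upload an audio file as multipart form data (the field name "file" and the filename are placeholders)
curl http://localhost:11435/api/transcribe -F "file=@audio.wav"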

Embeddings

curl http://localhost:11435/api/embeddings -d '{"model":"nomic-embed-text","prompt":"query"}'
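
Embeddings are usually compared with cosine similarity. A minimal sketch follows, assuming each response has the Ollama-style shape `{"embedding": [...]}` so that the two vectors can be passed in directly; the function name is hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Scores near 1.0 indicate similar text and scores near 0.0 indicate unrelated text.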

Dashboard

http://localhost:11435/dashboard — monitor DeepSeek requests alongside all other models. Per-model latency, token throughput, health checks.

Full documentation

Agent Setup Guide

Guardrails

Never pull or delete DeepSeek models without user confirmation: downloads range from about 4 GB to more than 400 GB.
Never delete or modify files in ~/.fleet-manager/.
If a DeepSeek model is too large for available memory, suggest a smaller variant.
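
One way an agent or script could honour the confirmation guardrail is to gate `ollama pull` behind an explicit yes/no callback. This is a sketch only; `guarded_pull`, `confirm`, and `run` are hypothetical names, and the injectable `run` exists so the logic can be exercised without downloading anything.

```python
import subprocess
from typing import Callable, Optional, Sequence


def guarded_pull(model: str,
                 confirm: Callable[[str], bool],
                 run: Optional[Callable[[Sequence[str]], object]] = None) -> bool:
    """Run `ollama pull <model>` only if `confirm` approves. Returns True if pulled."""
    if not confirm(f"Pull {model}? Downloads can be very large."):
        return False  # user declined, nothing is downloaded
    runner = run or (lambda cmd: subprocess.run(cmd, check=True))
    runner(["ollama", "pull", model])
    return True
```

In an interactive script, `confirm` could be `lambda msg: input(msg + " [y/N] ").strip().lower() == "y"`.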

Usage Guidance

This skill appears to be what it claims: a guide to running DeepSeek models locally via an Ollama Herd router. Before installing, verify the ollama-herd PyPI package and its GitHub repository (review code, recent activity, and maintainers). Be prepared for large downloads and big disk/RAM usage when pulling models. Run installations on a trusted machine or isolated environment, check network access (model pulls will download large artifacts), and inspect the ~/.fleet-manager directory and any created services before granting broader network access. If you need higher assurance, review the package source or run it in a VM/container first.

Capability Analysis

Type: OpenClaw Skill
Name: deepseek-deepseek-v3
Version: 1.0.1

The skill bundle provides instructions and metadata for managing a local fleet of DeepSeek models using the 'ollama-herd' utility. It includes transparent configuration paths (~/.fleet-manager/) and explicit guardrails in SKILL.md that instruct the AI agent to avoid resource-intensive downloads or file deletions without user confirmation. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found.

Capability Assessment

✓ Purpose & Capability

Name/description (running DeepSeek via an Ollama Herd router) align with the runtime instructions: installing ollama-herd, running herd/herd-node, and using ollama pull to fetch models. Declared binaries (curl/wget, optional python/pip) make sense for interacting with local HTTP endpoints and installing the Python package.

✓ Instruction Scope

SKILL.md contains only setup and usage steps for a local fleet router and examples showing how to call localhost endpoints. It does not instruct reading or exfiltrating unrelated system files or environment variables; it even warns not to delete/edit ~/.fleet-manager. Sample code points at localhost (http://localhost:11435).

ℹ Install Mechanism

Installation is via pip install ollama-herd (PyPI) and running local binaries (herd, herd-node). Using PyPI is a common approach but carries moderate supply‑chain risk — the package and its GitHub repo should be reviewed before installation.

✓ Credentials

The skill declares no required environment variables or unrelated credentials. Metadata lists config paths under ~/.fleet-manager, which are consistent with a fleet manager and are not excessive for the stated purpose.

✓ Persistence & Privilege

No 'always' privilege requested; the skill is user‑invocable only. It does not request writing to other skills' configs or system‑wide settings in the instructions.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install deepseek-deepseek-v3
After installation, invoke the skill by name or use /deepseek-deepseek-v3
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

Initial public release of DeepSeek models on local hardware through Ollama Herd.
- Run DeepSeek-V3, V3.2, R1, and Coder models locally on Apple Silicon or Linux, with zero cloud costs.
- Supports automatic fleet routing: selects the best node for each request based on 7-signal scoring, with seamless failover and VRAM-aware fallback.
- Compatible with OpenAI and Ollama APIs for chat, code, image generation, speech-to-text, and embeddings.
- Provides setup instructions, recommended hardware guidance, and dashboard monitoring at a unified endpoint.
- Prioritizes privacy, local performance, and user control over model management.

Metadata

Slug deepseek-deepseek-v3

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices?

DeepSeek models on your local fleet — DeepSeek-V3, DeepSeek-V3.2, DeepSeek-R1, DeepSeek-Coder routed across multiple devices via Ollama Herd. 7-signal scorin... It is an AI Agent Skill for Claude Code / OpenClaw, with 105 downloads so far.

How do I install DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices?

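Run "/install deepseek-deepseek-v3" in the OpenClaw or Claude Code chat to install the skill in one step. The skill itself needs no extra setup, but the fleet it talks to requires ollama-herd and at least one pulled model, as described in the README's Setup section.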

Is DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices free?

Yes, DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices support?

DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux).

Who created DeepSeek — DeepSeek-V3, DeepSeek-R1, DeepSeek-Coder on Your Local Devices?

It is built and maintained by Twin Geeks (@twinsgeeks); the current version is v1.0.1.
