Description

Reduces token usage from paid providers by offloading work to local LM Studio models. Use when: (1) Cutting costs—use local models for summarization, extraction, classification, rewriting, first-pass review, brainstorming when quality suffices, (2) Avoiding paid API calls for high-volume or repetitive tasks, (3) No extra model configuration—JIT loading and REST API work with existing LM Studio setup, (4) Local-only or privacy-sensitive work. Requires LM Studio 0.4+ with server (default :1234). No CLI required.

README (SKILL.md)

LM Studio Models

Name: Offload Tasks to LM Studio Models
Author: t-sinclair2500

Offload tasks to local models when quality suffices. Base URL: http://127.0.0.1:1234. Auth: Authorization: Bearer lmstudio. instance_id = loaded_instances[].id (same model can have multiple, e.g. key and key:2).

Key Terms

model: The key field of an entry returned by GET /api/v1/models; use it as model in chat and in the optional load call.
lm_studio_api_url: Default http://127.0.0.1:1234 (paths /api/v1/...).
response_id / previous_response_id: Chat returns response_id; pass it as previous_response_id on the next call for stateful multi-turn chat.
instance_id: For unload, use only the value from GET /api/v1/models for that model: each loaded_instances[].id. Do not assume it equals the model key; with multiple instances ids can be like key:2. LM Studio docs: List (loaded_instances[].id), Unload (instance_id).

The trigger conditions live in the frontmatter; everything below describes the implementation.

Prerequisites

LM Studio 0.4+ with the server running on :1234 and models on disk. Load/unload goes through the API (JIT optional). Node is needed for the helper scripts; curl works as a fallback.

Quick start

Minimal path: list models, then one chat. Replace <model> with a key from GET /api/v1/models and <task> with the task text.

curl -s -H 'Authorization: Bearer lmstudio' http://127.0.0.1:1234/api/v1/models
node scripts/lmstudio-api.mjs <model> '<task>' --temperature=0.5 --max-output-tokens=200

Stateful multi-turn: pass --previous-response-id=<id> from the prior script output. Or use --stateful to persist response_id automatically. Optional --log <path> for request/response.

# $ID1 is the response_id printed by the first turn
node scripts/lmstudio-api.mjs <model> 'First turn...'
node scripts/lmstudio-api.mjs <model> 'Second turn...' --previous-response-id=$ID1
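The id threading described above can be sketched in a few lines. This is an illustration only, not the skill's own script: `runTurns` and `send` are hypothetical names, `send` stands in for one chat call (a real one would be asynchronous), and all request fields other than model, input and previous_response_id are left out for brevity.

```javascript
// Sketch: thread response_id through a multi-turn conversation.
// `send` is a hypothetical stand-in for one chat call; it takes a
// request body and returns the parsed reply (with a response_id).
function runTurns(model, inputs, send) {
  let previousId;
  const replies = [];
  for (const input of inputs) {
    const body = { model, input };
    // The first turn has no prior response, so the field is omitted.
    if (previousId) body.previous_response_id = previousId;
    const reply = send(body);
    previousId = reply.response_id;
    replies.push(reply);
  }
  return replies;
}
```

Each turn after the first carries the response_id returned by the turn before it, which is the same hand-off the --previous-response-id flag performs on the command line.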

Complete Workflow

Step 0: Preflight

GET <base>/api/v1/models; non-200 or connection error = server not ready.

exec command:"curl -s -o /dev/null -w '%{http_code}' -H 'Authorization: Bearer lmstudio' http://127.0.0.1:1234/api/v1/models"
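The curl command above prints only the HTTP status code, and curl prints 000 when it never received an HTTP response (for example, connection refused). A minimal sketch of interpreting that output follows; `isServerReady` is a hypothetical helper name and not part of the skill's scripts.

```javascript
// Hypothetical helper: interpret the status string printed by
// curl -s -o /dev/null -w '%{http_code}' ...
// Anything other than 200 (including curl's "000" for "no HTTP
// response received") is treated as "server not ready".
function isServerReady(httpCode) {
  return String(httpCode).trim() === '200';
}

console.log(isServerReady('200')); // true
console.log(isServerReady('000')); // false
```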

Step 1: List Models and Select

GET /api/v1/models to list models. Parse each entry in models[]: key, type, loaded_instances, max_context_length, capabilities, params_string. If a model already has loaded_instances.length > 0 and fits the task, skip to Step 5; otherwise pick a key for chat (and optional load in Step 3). Choose by task: vision -> capabilities.vision; embedding -> type=embedding; context -> max_context_length. Prefer already-loaded; prefer smaller for speed, larger for reasoning. Note loaded_instances[].id for optional unload later.

Example (list models):

exec command:"curl -s -H 'Authorization: Bearer lmstudio' http://127.0.0.1:1234/api/v1/models"
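The parsing described in this step might look like the sketch below. It assumes the response body has the shape { models: [...] } with the field names listed above; `summarizeModels` is a hypothetical name, and missing fields fall back to defaults rather than failing.

```javascript
// Sketch: reduce a GET /api/v1/models response to the fields Step 1 uses.
// Assumes a body of the form { models: [...] }; absent fields get defaults.
function summarizeModels(response) {
  return (response.models || []).map((m) => ({
    key: m.key,
    type: m.type,
    maxContext: m.max_context_length ?? 0,
    vision: Boolean(m.capabilities && m.capabilities.vision),
    // Keep the instance ids: they are what unload needs later.
    instanceIds: (m.loaded_instances || []).map((i) => i.id),
  }));
}
```

A model whose instanceIds array is non-empty is already loaded and is a candidate for skipping straight to Step 5.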

Step 2: Model Selection

Pick key from GET response; use as model in chat (optional load). Constraints: vision -> capabilities.vision; embedding -> type=embedding; context -> max_context_length. Prefer loaded (loaded_instances non-empty), smaller for speed/larger for reasoning; fallback primary. If unsure, use the first loaded instance for that key or the smallest loaded model that fits the task. Optional POST load; else JIT on first chat.
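One way to express these selection rules is sketched below. `pickModel` and the shape of its `needs` argument are assumptions made for illustration. The size preference ("smaller for speed, larger for reasoning") is deliberately left out, because the text does not say how size should be measured.

```javascript
// Sketch: apply the hard constraints, then prefer an already-loaded model.
// `needs` is a hypothetical options object: { vision, embedding, minContext }.
function pickModel(models, needs = {}) {
  const fits = models.filter((m) => {
    // Embedding tasks want type=embedding; all other tasks exclude it.
    if (needs.embedding ? m.type !== 'embedding' : m.type === 'embedding') return false;
    if (needs.vision && !(m.capabilities && m.capabilities.vision)) return false;
    if (needs.minContext && (m.max_context_length || 0) < needs.minContext) return false;
    return true;
  });
  const loaded = fits.find((m) => (m.loaded_instances || []).length > 0);
  const chosen = loaded || fits[0];
  return chosen ? chosen.key : null; // null: nothing fits, fall back to the primary model
}
```

When no loaded model fits, the sketch returns the first fitting key in list order and leaves loading to JIT on the first chat.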

Step 3: Load Model (optional)

Optional: POST /api/v1/models/load { model, context_length?, ... }. Or run scripts/load.mjs <model>. JIT: first chat loads; explicit load only for specific options.

Step 4: Verify Loaded (optional)

If explicit load: GET models, confirm loaded_instances. If JIT: no verify; first chat returns model_instance_id, stats.model_load_time_seconds.

Step 5: Call API

From the skill folder: node scripts/lmstudio-api.mjs <model> '<task>' [options].

exec command:"node scripts/lmstudio-api.mjs <model> '<task>' --temperature=0.7 --max-output-tokens=2000"

Stateful: add --previous-response-id=<response_id>. Curl: POST <base>/api/v1/chat, body model, input, store, temperature, max_output_tokens; optional previous_response_id. Parse: output (type message) -> content; response_id, model_instance_id, stats. Script outputs content, model_instance_id, response_id, usage.
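The request body and the response parsing described above can be sketched as two small functions. Both names are hypothetical and this is not the code in lmstudio-api.mjs. The parser assumes that `output` is an array of items and that items of type message carry their text in `content`; check this against a real response before relying on it.

```javascript
// Sketch: build the JSON body for POST /api/v1/chat from the fields
// named in the text. Optional fields are only included when supplied.
function buildChatBody({ model, input, store, temperature, maxOutputTokens, previousResponseId }) {
  const body = { model, input };
  if (store !== undefined) body.store = store;
  if (temperature !== undefined) body.temperature = temperature;
  if (maxOutputTokens !== undefined) body.max_output_tokens = maxOutputTokens;
  if (previousResponseId) body.previous_response_id = previousResponseId;
  return body;
}

// Sketch: pull out the fields the text says to parse.
// Assumption: output is an array; message items hold text in `content`.
function parseChatResponse(response) {
  const content = (response.output || [])
    .filter((item) => item.type === 'message')
    .map((item) => item.content)
    .join('\n');
  return {
    content,
    response_id: response.response_id,
    model_instance_id: response.model_instance_id,
    stats: response.stats,
  };
}
```

Keeping body construction separate from the network call makes it easy to reuse the same body with curl or with the helper script.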

Step 6: Unload (optional)

For the model key you used: GET /api/v1/models, then for each loaded_instances[].id for that model, POST /api/v1/models/unload with body {"instance_id": "<that id>"}. Use the id from the response only (do not send the model key unless it exactly equals that id). Or run scripts/unload.mjs <model_key> (script does GET then unloads each instance id). Optional --unload-after (default off); use --keep to leave loaded. Unload only that model's instances. JIT+TTL auto-unload; explicit when needed.

# One unload per instance_id; repeat for each id in that model's loaded_instances
exec command:"curl -s -X POST http://127.0.0.1:1234/api/v1/models/unload -H 'Content-Type: application/json' -H 'Authorization: Bearer lmstudio' -d '{\"instance_id\": \"<instance_id>\"}'"
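The rule "one unload per instance id, taken from the list response" can be sketched as a function that turns a GET /api/v1/models body into the unload request bodies for one model key. `unloadBodiesFor` is a hypothetical name; the response shape ({ models: [...] }) is assumed from the steps above.

```javascript
// Sketch: derive unload request bodies for exactly one model key.
// The ids come from loaded_instances[].id, never from the key itself.
function unloadBodiesFor(response, modelKey) {
  const model = (response.models || []).find((m) => m.key === modelKey);
  if (!model) return [];
  return (model.loaded_instances || []).map((i) => ({ instance_id: i.id }));
}
```

Each returned object is the body of one POST /api/v1/models/unload call. Instances that belong to other model keys are never included.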

Step 7: Verify unload (optional)

After unloading, confirm no instances remain for that model key. Run the jq check below; result must be 0. If non-zero, unload the remaining instance_id(s) from that model and re-run the check. Do not infer from "model object exists"; the object still exists with an empty loaded_instances array.

exec command:"curl -s -H 'Authorization: Bearer lmstudio' http://127.0.0.1:1234/api/v1/models | jq '.models[]|select(.key==\"<model_key>\")|.loaded_instances|length'"

Expect output 0. If not, unload remaining instance_ids and re-run.
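For callers without jq, the same check can be sketched in JavaScript. `isFullyUnloaded` is a hypothetical name. Unlike the jq filter, which prints nothing when the key is absent, this sketch returns false for an unknown key so that a mistyped key is not mistaken for a successful unload; that behaviour is an assumption of the sketch.

```javascript
// Sketch: true only when the model object is present AND has no
// loaded instances. The object's mere existence proves nothing.
function isFullyUnloaded(response, modelKey) {
  const model = (response.models || []).find((m) => m.key === modelKey);
  if (!model) return false; // unknown key: treat as unverified (assumption)
  return (model.loaded_instances || []).length === 0;
}
```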

Error Handling

Script retries on transient failure (2-3 attempts with backoff).
Model not found -> pick another model from GET response.
API/server errors -> GET models, check URL.
Invalid output -> retry.
Memory -> unload or smaller model.
Unload fails -> instance_id must be exactly from GET /api/v1/models for that model's loaded_instances[].id (not the model key unless it matches).

Copy-paste

Replace <model> with a key from GET /api/v1/models and <task> with the task text. Optional unload per Step 6 (instance_id from GET models for that key).

exec command:"curl -s -H 'Authorization: Bearer lmstudio' http://127.0.0.1:1234/api/v1/models"
exec command:"node scripts/lmstudio-api.mjs <model> '<task>' --temperature=0.7 --max-output-tokens=2000"

LM Studio API Details

Helper/API: see Step 5. Output: content, model_instance_id, response_id, usage. Auth: Bearer lmstudio. List GET /api/v1/models. Load POST /api/v1/models/load (optional). Unload POST /api/v1/models/unload { instance_id }.

Scripts

lmstudio-api.mjs: chat; options --stateful, --unload-after, --keep, --log <path>, --previous-response-id, --temperature, --max-output-tokens.
load.mjs: load model by key.
unload.mjs: unload by model key (all instances).
test.mjs: smoke test (load, chat, unload one model).

Notes

LM Studio 0.4+.
JIT (first chat loads; model_load_time_seconds in stats); stateful (response_id / previous_response_id).

Usage Guidance

This skill appears to do what it says: it lists models and calls an LM Studio REST API (default http://127.0.0.1:1234) to run local models. Before installing, check two things: (1) Ensure the configured LM Studio API URL is actually your local, trusted instance. The scripts allow overriding the API URL, and if it is pointed at an external host, your task text (and any logs you enable) could be sent off-host. (2) Be aware the skill writes .lmstudio-state (response_id) to the current working directory when used in stateful mode and will write request/response data to any log path you pass; avoid enabling logging to untrusted locations or including sensitive content in logs. Finally, review the included scripts (they are short and readable) and ensure you are comfortable running Node-based helper scripts in your environment.

Capability Analysis

Type: OpenClaw Skill
Name: lm-studio-subagents
Version: 1.0.3

The skill bundle is designed to integrate OpenClaw agents with local LM Studio models, primarily by interacting with the LM Studio REST API at `http://127.0.0.1:1234`. All network calls observed in `SKILL.md` and the Node.js scripts (`lmstudio-api.mjs`, `load.mjs`, `unload.mjs`, `test.mjs`) are directed to this local endpoint. File system access is limited to a local state file (`.lmstudio-state`) and an optional log file, both within the current working directory. There is no evidence of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts against the agent that would lead to unauthorized actions or access to sensitive data.

Capability Assessment

✓ Purpose & Capability

Name and description match the implementation: the skill discovers models and issues GET/POST requests to LM Studio endpoints (/api/v1/models, /api/v1/chat, /api/v1/models/load|unload). There are no unrelated binaries or environment credentials requested.

ℹ Instruction Scope

SKILL.md and scripts only reference LM Studio REST endpoints and local files. The instructions show curl/node usage and allow overriding the API URL (LM_STUDIO_API_URL / --api-url) and optional logging. That is expected for this skill, but it means task content will be transmitted to whichever API endpoint is configured; if a user points the API URL at a remote host, data would be sent off-host.

✓ Install Mechanism

No install spec; this is instruction- and script-based only. No downloads from third-party URLs or archive extraction are present.

✓ Credentials

The skill requires no external credentials or config paths. It uses an optional LM_STUDIO_API_URL env var and optionally MODEL. Scripts include a fixed Authorization header ('Bearer lmstudio') which appears to be the LM Studio default; no secrets are requested from the user.

ℹ Persistence & Privilege

always:false and the skill is user-invocable. The scripts write a .lmstudio-state file to the working directory (for --stateful) and can append full requests/responses to a user-specified log path. This is limited to files in the agent's working dir (no system-wide changes), but users should be aware of local state/log creation and their contents.

Version History

v1.0.3

Changelog: **New helper scripts for load/unload/test added; unload guidance clarified**

- Added scripts: load.mjs (model loader), unload.mjs (unloads all instances for model key), test.mjs (basic test suite).
- Significantly expanded and clarified unload instructions: always use instance_id from GET /api/v1/models (never assume it matches model key; keys can have multiple instances with ids like key:2).
- Improved example workflows, especially for multi-turn and unload/verify unload steps.
- Documented new script features (like --unload-after, --keep, --stateful, --log).
- Added troubleshooting and error handling procedures for instance management.
- Updated copy-paste and example sections for practical ease-of-use.

v1.0.2

Summary: Significantly leaner skill, now on LM Studio's latest API (v0.4+).

- Token diet: Skill content cut by ~80%. Agents load less context and use fewer paid tokens. Shorter, clearer steps and no redundant API detail.
- LM Studio's v1 REST API: Migrated to LM Studio's v1 REST API, which adds support for JIT loading, explicit load/unload, and stateful chats (response_id). No CLI required; everything goes through the API.
- Simplified workflow: One streamlined flow: preflight, list/select model, chat, optional load/unload. README added for quick onboarding.
- Requires: LM Studio 0.4+ with server running. Old CLI-based flow removed.

v1.0.1

- Added Node.js helper script `scripts/lmstudio-api.mjs` for robust LM Studio API calls.
- Updated documentation to recommend using the helper script for making API requests (with fallback to curl).
- Clarified that no Clawdbot configuration for API URL is needed; the default LM Studio server is sufficient.
- Noted Node.js as a new prerequisite for using the script.

v1.0.0

Initial release of the lmstudio-subagents skill for Clawdbot. This skill allows agents to discover, select, and utilize locally-hosted AI models within LM Studio for efficient and private task offloading.

- Enables agents to search for, load, and use local LLMs and VLMs via LM Studio.
- Offloads suitable tasks (e.g., summarization, extraction, classification, drafting) to available local models for privacy and cost savings.
- Automatically discovers models, matches them to task requirements, and manages model loading and unloading.
- Provides a complete workflow: model discovery, selection, loading, verification, and API invocation.
- Enforces safety checks to confirm models are properly loaded before use.
- Requires LM Studio with lms CLI and server running.

Metadata

Slug lm-studio-subagents

Version 1.0.3

License —

All-time Installs 1

Active Installs 1

Total Versions 4

Frequently Asked Questions

What is Offload Tasks to LM Studio Models?

It is an AI agent skill for Claude Code / OpenClaw that reduces token usage from paid providers by offloading work to local LM Studio models. It requires LM Studio 0.4+ with the server running (default :1234) and needs no CLI. It has 2511 downloads so far.

How do I install Offload Tasks to LM Studio Models?

Run "/install lm-studio-subagents" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Offload Tasks to LM Studio Models free?

Yes, Offload Tasks to LM Studio Models is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Offload Tasks to LM Studio Models support?

Offload Tasks to LM Studio Models is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Offload Tasks to LM Studio Models?

It is built and maintained by Tyler Sinclair (@t-sinclair2500); the current version is v1.0.3.
