← 返回 Skills 市场
princnl

Smart Model Routing for Z.AI

作者 T · GitHub ↗ · v1.0.0
cross-platform ✓ 安全检测通过
1329
总下载
1
收藏
2
当前安装
1
版本数
在 OpenClaw 中安装
/install smart-model-routing-for-zai
功能描述
Auto-route tasks to the cheapest z.ai (GLM) model that works correctly. Three-tier progression: Flash → Standard → Plus/32B. Classify before responding. FLASH (default): factual Q&A, greetings, reminders, status checks, lookups, simple file ops, heartbeats, casual chat, 1–2 sentence tasks, cron jobs. ESCALATE TO STANDARD: code >10 lines, analysis, comparisons, planning, reports, multi-step reasoning, tables, long writing >3 paragraphs, summarization, research synthesis, most user conversations. ESCALATE TO PLUS/32B: architecture decisions, complex debugging, multi-file refactoring, strategic planning, nuanced judgment, deep research, critical production decisions. Rule: If a human needs >30 seconds of focused thinking, escalate. If Standard struggles with complexity, go to Plus/32B. Save major API costs by starting cheap and escalating only when needed.
使用说明 (SKILL.md)

Smart Model Switching

Three-tier z.ai (GLM) routing: Flash → Standard → Plus / 32B

Start with the cheapest model. Escalate only when needed. Designed to minimize API cost without sacrificing correctness.


The Golden Rule

If a human would need more than 30 seconds of focused thinking, escalate from Flash to Standard.
If the task involves architecture, complex tradeoffs, or deep reasoning, escalate to Plus / 32B.


Model Reality (Relative)

Tier Example Models Purpose
Flash GLM-4.5-Flash, GLM-4.7-Flash Fastest & cheapest
Standard GLM-4.6, GLM-4.7 Strong reasoning & code
Plus / 32B GLM-4-Plus, GLM-4-32B-128K Heavy reasoning & architecture

Bottom line: Wrong model selection wastes money OR time. Flash for simple, Standard for normal work, Plus/32B for complex decisions.


💚 FLASH — Default for Simple Tasks

Stay on Flash for:

  • Factual Q&A — “what is X”, “who is Y”, “when did Z”
  • Quick lookups — definitions, unit conversions, short translations
  • Status checks — monitoring, file reads, session state
  • Heartbeats — periodic checks, OK responses
  • Memory & reminders
  • Casual conversation — greetings, acknowledgments
  • Simple file ops — read, list, basic writes
  • One-liner tasks — anything answerable in 1–2 sentences
  • Cron jobs (always Flash by default)

NEVER do these on Flash

  • ❌ Write code longer than 10 lines
  • ❌ Create comparison tables
  • ❌ Write more than 3 paragraphs
  • ❌ Do multi-step analysis
  • ❌ Write reports or proposals

💛 STANDARD — Core Workhorse

Escalate to Standard for:

Code & Technical

  • Code generation — functions, scripts, features
  • Debugging — normal bug investigation
  • Code review — PRs, refactors
  • Documentation — README, comments, guides

Analysis & Planning

  • Comparisons and evaluations
  • Planning — roadmaps, task breakdowns
  • Research synthesis
  • Multi-step reasoning

Writing & Content

  • Long-form writing (>3 paragraphs)
  • Summaries of long documents
  • Structured output — tables, outlines

Most real user conversations belong here.


❤️ PLUS / 32B — Complex Reasoning Only

Escalate to Plus / 32B for:

Architecture & Design

  • System and service architecture
  • Database schema design
  • Distributed or multi-tenant systems
  • Major refactors across multiple files

Deep Analysis

  • Complex debugging (race conditions, subtle bugs)
  • Security reviews
  • Performance optimization strategy
  • Root cause analysis

Strategic & Judgment-Based Work

  • Strategic planning
  • Nuanced judgment and ambiguity
  • Deep or multi-source research
  • Critical production decisions

🔄 Implementation

For Subagents

// Routine monitoring
sessions_spawn(task="Check backup status", model="GLM-4.5-Flash")

// Standard code work
sessions_spawn(task="Build the REST API endpoint", model="GLM-4.7")

// Architecture decisions
sessions_spawn(task="Design the database schema for multi-tenancy", model="GLM-4-Plus")
For Cron Jobs
json
Copy code
{
  "payload": {
    "kind": "agentTurn",
    "model": "GLM-4.5-Flash"
  }
}
Always use Flash for cron unless the task genuinely needs reasoning.

📊 Quick Decision Tree
pgsql
Copy code
Is it a greeting, lookup, status check, or 1–2 sentence answer?
  YES → FLASH
  NO ↓

Is it code, analysis, planning, writing, or multi-step?
  YES → STANDARD
  NO ↓

Is it architecture, deep reasoning, or a critical decision?
  YES → PLUS / 32B
  NO → Default to STANDARD, escalate if struggling
📋 Quick Reference Card
less
Copy code
┌─────────────────────────────────────────────────────────────┐
│                  SMART MODEL SWITCHING                      │
│              Flash → Standard → Plus / 32B                  │
├─────────────────────────────────────────────────────────────┤
│  💚 FLASH (cheapest)                                        │
│  • Greetings, status checks, quick lookups                  │
│  • Factual Q&A, reminders                                   │
│  • Simple file ops, 1–2 sentence answers                    │
├─────────────────────────────────────────────────────────────┤
│  💛 STANDARD (workhorse)                                    │
│  • Code > 10 lines, debugging                               │
│  • Analysis, comparisons, planning                          │
│  • Reports, long writing                                    │
├─────────────────────────────────────────────────────────────┤
│  ❤️ PLUS / 32B (complex)                                    │
│  • Architecture decisions                                   │
│  • Complex debugging, multi-file refactoring                │
│  • Strategic planning, deep research                        │
├─────────────────────────────────────────────────────────────┤
│  💡 RULE: >30 sec human thinking → escalate                 │
│  💰 START CHEAP → SCALE ONLY WHEN NEEDED                    │
└─────────────────────────────────────────────────────────────┘
Built for z.ai (GLM) setups.
安全使用建议
This skill is a policy document (routing heuristics) rather than an integration: it won't call z.ai or incur costs on its own and it doesn't ask for any secrets. Before relying on it, confirm your agent platform actually supports and enforces model selection (sessions_spawn or equivalent) and respects these rules. Test the escalation rules with non-critical tasks to ensure they don't cause excessive upscaling (and cost). If you intend automatic spawning of z.ai models, you'll need appropriate platform wiring and credentials elsewhere — the skill itself does not provide or request them. Finally, monitor logs and set budget/timeout guards so repeated automatic escalation or retries can't run up unexpected API charges.
功能分析
Type: OpenClaw Skill Name: smart-model-routing-for-zai Version: 1.0.0 The skill bundle defines a model routing strategy for an AI agent to select appropriate z.ai (GLM) models based on task complexity, aiming to optimize costs. The `SKILL.md` file provides detailed guidelines and examples for model selection but contains no malicious instructions, prompt injection attempts to subvert the agent's core purpose, data exfiltration, malicious execution commands, persistence mechanisms, or obfuscation. The `_meta.json` file is standard metadata. All content is aligned with the stated purpose of smart model switching.
能力评估
Purpose & Capability
The name and description (auto-route to the cheapest working GLM model) match the SKILL.md guidance. However, the skill is purely an instruction document — it contains rules and example session_spawn calls but no code, no integration, and no credentials. That means it cannot itself contact z.ai or enforce routing; it only tells an agent how to decide which model to pick. This is coherent but important to understand.
Instruction Scope
SKILL.md confines itself to classification and escalation rules for model selection and includes example usage (sessions_spawn). It does not instruct reading unrelated files, sending data to external endpoints, or harvesting secrets. The guidance assumes the agent/platform provides a sessions_spawn capability.
Install Mechanism
No install spec and no code files are present. This minimizes disk writes and third-party installs; lower risk and consistent with an instruction-only policy.
Credentials
No environment variables, credentials, or config paths are requested. That is proportionate to an instruction-only routing policy.
Persistence & Privilege
Skill does not request always:true, does not modify other skills, and keeps no persistent privileges. Autonomous invocation is allowed by platform default but the skill itself doesn't demand elevated presence or cross-skill access.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install smart-model-routing-for-zai
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /smart-model-routing-for-zai 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
smart-model-routing-for-zai 1.0.0 introduces an auto-escalating model-routing strategy for cost-effective, accurate task processing on z.ai (GLM): - Implements three-tier task routing: Flash → Standard → Plus/32B, starting with the cheapest model and escalating only when needed. - Clear rules and decision trees guide when to escalate tasks based on task complexity and required reasoning. - Provides detailed examples and quick-reference guides for selecting the correct model tier. - Aims to minimize API costs while ensuring correct and efficient responses for a wide range of user tasks.
元数据
Slug smart-model-routing-for-zai
版本 1.0.0
许可证
累计安装 3
当前安装数 2
历史版本数 1
常见问题

Smart Model Routing for Z.AI 是什么?

Auto-route tasks to the cheapest z.ai (GLM) model that works correctly. Three-tier progression: Flash → Standard → Plus/32B. Classify before responding. FLASH (default): factual Q&A, greetings, reminders, status checks, lookups, simple file ops, heartbeats, casual chat, 1–2 sentence tasks, cron jobs. ESCALATE TO STANDARD: code >10 lines, analysis, comparisons, planning, reports, multi-step reasoning, tables, long writing >3 paragraphs, summarization, research synthesis, most user conversations. ESCALATE TO PLUS/32B: architecture decisions, complex debugging, multi-file refactoring, strategic planning, nuanced judgment, deep research, critical production decisions. Rule: If a human needs >30 seconds of focused thinking, escalate. If Standard struggles with complexity, go to Plus/32B. Save major API costs by starting cheap and escalating only when needed. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 1329 次。

如何安装 Smart Model Routing for Z.AI?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install smart-model-routing-for-zai」即可一键安装,无需额外配置。

Smart Model Routing for Z.AI 是免费的吗?

是的,Smart Model Routing for Z.AI 完全免费(开源免费),可自由下载、安装和使用。

Smart Model Routing for Z.AI 支持哪些平台?

Smart Model Routing for Z.AI 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 Smart Model Routing for Z.AI?

由 T(@princnl)开发并维护,当前版本 v1.0.0。

💬 留言讨论