Description

Tiered model selection and cost optimization for multi-agent AI workflows. Use this skill whenever you are choosing a model for a task, spinning up a sub-age...

Usage Instructions (SKILL.md)

Agent Cost Strategy

Name: Agent Cost Strategy
Author: djc00p

Use the cheapest model that can reliably do the job. Most tasks don't need your most powerful model.

The Three Tiers

Tier	When to Use	Examples
Fast/Cheap	Sub-agents, background tasks, automated fixes, simple lookups, short replies	Claude Haiku, GPT-4o-mini, Gemini Flash
Mid-tier	Main session dialogue, moderate reasoning, multi-step tasks	Claude Sonnet, GPT-4o, Gemini Pro
Powerful	Architecture decisions, deep reviews, hard problems, after cheaper models fail twice	Claude Opus, GPT-4.5, Gemini Ultra

Task → Tier Routing

Fix failing tests          → Fast/Cheap
Write boilerplate          → Fast/Cheap
Research / search          → Fast/Cheap
Cron / scheduled tasks     → Fast/Cheap (always)
Short replies (hi, ok)     → Fast/Cheap (always)
Background monitoring      → Fast/Cheap (always)
Build new feature          → Mid-tier
Review a PR                → Mid-tier
Main assistant dialogue    → Mid-tier (default)
Architecture decisions     → Powerful
Deep code review           → Powerful
Stuck after 2 attempts     → Escalate one tier up
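The routing table above can be sketched as a small lookup function. This is a minimal illustration, not part of the skill itself: the task keys, the tier labels, the mid-tier fallback for unknown tasks, and the reading of "(always)" as "never escalate" are all assumptions made for the example.

```python
TIERS = ["fast_cheap", "mid_tier", "powerful"]

# Hypothetical task labels mapped to the tiers listed in the routing table.
TASK_TIER = {
    "fix_failing_tests": "fast_cheap",
    "write_boilerplate": "fast_cheap",
    "research": "fast_cheap",
    "cron": "fast_cheap",
    "short_reply": "fast_cheap",
    "background_monitoring": "fast_cheap",
    "build_feature": "mid_tier",
    "review_pr": "mid_tier",
    "main_dialogue": "mid_tier",
    "architecture": "powerful",
    "deep_code_review": "powerful",
}

# Tasks marked "(always)" in the table; assumed here to never escalate.
PINNED = {"cron", "short_reply", "background_monitoring"}


def pick_tier(task: str, failed_attempts: int = 0) -> str:
    """Return the tier for a task, moving one tier up per two failed attempts."""
    base = TASK_TIER.get(task, "mid_tier")  # assumption: unknown tasks use the default tier
    if task in PINNED:
        return base
    steps_up = failed_attempts // 2
    index = min(TIERS.index(base) + steps_up, len(TIERS) - 1)  # cap at the top tier
    return TIERS[index]
```

For example, `pick_tier("fix_failing_tests", failed_attempts=2)` moves a stuck test fix from the fast tier to the mid tier, while `pick_tier("cron", failed_attempts=5)` stays on the fast tier.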

Heartbeat / Cron Model Rule

Always specify the cheapest model for scheduled and background tasks — they run frequently and costs add up fast. Check your platform's config for how to set a model per cron/heartbeat job.

For heartbeat intervals: set them just under your provider's cache TTL to keep the prompt cache warm and pay cache-read rates instead of full input rates. Check your provider's docs for the exact TTL.
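As a rough sketch of the interval rule, the helper below subtracts a safety margin from the cache TTL. The function name and the five-minute default margin are assumptions for illustration, and the TTL itself must come from your provider's documentation.

```python
def heartbeat_interval_seconds(cache_ttl_seconds: int, margin_seconds: int = 300) -> int:
    """Return a heartbeat interval that fires shortly before the cache would expire."""
    if cache_ttl_seconds <= 0:
        raise ValueError("cache_ttl_seconds must be positive")
    if not 0 < margin_seconds < cache_ttl_seconds:
        raise ValueError("margin_seconds must be positive and smaller than the TTL")
    return cache_ttl_seconds - margin_seconds
```

With a one-hour TTL (used here only as an example value), `heartbeat_interval_seconds(3600)` gives 3300 seconds, i.e. a 55-minute heartbeat.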

Communication Pattern Rule

One-word and short conversational messages (hi, thanks, ok, sure, yes, no) should always route to Fast/Cheap. Never burn a mid-tier or powerful model on an acknowledgment.
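One possible way to detect such messages before routing is a simple word check. The sketch below is an assumption-laden illustration: the function name and the three-word limit are invented for the example, and the word list is just the set of examples given in the rule above.

```python
import re

# The example acknowledgments listed in the rule above.
ACKNOWLEDGMENTS = {"hi", "thanks", "ok", "sure", "yes", "no"}


def is_short_acknowledgment(message: str, max_words: int = 3) -> bool:
    """True if the message consists only of a few known acknowledgment words."""
    words = re.findall(r"[a-z']+", message.lower())
    return 0 < len(words) <= max_words and all(w in ACKNOWLEDGMENTS for w in words)
```

A router could call this first and send any message for which it returns `True` straight to the Fast/Cheap tier.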

Cache Optimization

Prompt caching cuts costs 50-90% on repeated context. Cache writes cost ~25% more but pay off after just 1-2 reuses. See references/cache-optimization.md for patterns and break-even math.
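The break-even claim can be checked with a little arithmetic. The sketch below works in relative units (1.0 = the normal input price of the repeated context) and assumes a 1.25x cache-write rate and a 0.10x cache-read rate, the figures this document uses elsewhere; actual multipliers vary by provider, and the function names are illustrative.

```python
def cached_cost(reuses: int, write_multiplier: float = 1.25, read_multiplier: float = 0.10) -> float:
    """Relative cost of one cache write followed by `reuses` cache reads."""
    return write_multiplier + reuses * read_multiplier


def uncached_cost(reuses: int) -> float:
    """Relative cost of sending the same context at full price every time."""
    return 1.0 + reuses


def break_even_reuses(write_multiplier: float = 1.25, read_multiplier: float = 0.10) -> int:
    """Smallest number of reuses at which caching is strictly cheaper."""
    if read_multiplier >= 1.0:
        raise ValueError("caching never pays off if reads cost as much as normal input")
    reuses = 0
    while cached_cost(reuses, write_multiplier, read_multiplier) >= uncached_cost(reuses):
        reuses += 1
    return reuses
```

Under these assumed rates, caching is already cheaper after a single reuse (1.35 versus 2.0 in relative units), which is consistent with the "1-2 reuses" rule of thumb.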

Batch API (Non-Urgent Tasks)

For cron jobs, scheduled analysis, or anything else that doesn't need an immediate response, use the Batch API (Anthropic and OpenAI both offer one). It gives a 50% discount in exchange for asynchronous delivery, with results within 24 hours. Never use the real-time API for background work that can wait.

Sub-Agent Model Rule (Critical)

Always set the model explicitly when spawning sub-agents. Never rely on defaults: a sub-agent spawned without a model inherits the parent session's model, which is usually the more expensive mid-tier one. One month of sub-agents defaulting to Sonnet meant 96% of costs went to Sonnet, when the split should have been roughly 80/20 Haiku/Sonnet.

sessions_spawn → always include model: "claude-haiku-4-5-20251001" (or equivalent fast-cheap)

Default sub-agent tasks to Haiku for cost efficiency. Override with a stronger model when task complexity or accuracy requirements justify it.
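One way to enforce this rule is to build every spawn request through a helper that always fills in a model. The sketch below is hypothetical: `build_spawn_request` and the `task`/`model` field names are assumptions for illustration and are not the documented `sessions_spawn` schema, so check your platform's reference for the real parameter names.

```python
from typing import Any, Dict

# Fast/cheap default taken from the rule above.
DEFAULT_SUBAGENT_MODEL = "claude-haiku-4-5-20251001"


def build_spawn_request(task: str, model: str = DEFAULT_SUBAGENT_MODEL) -> Dict[str, Any]:
    """Build a spawn payload (hypothetical shape) that always names a model."""
    if not model:
        # Refuse to fall back to the platform default, which inherits the parent model.
        raise ValueError("sub-agents must be spawned with an explicit model")
    return {"task": task, "model": model}
```

Callers that need a stronger model pass it deliberately, for example `build_spawn_request("audit the auth module", model="<mid-tier model id>")`, so every override is visible at the call site.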

New Session / Machine Cold Start Cost

When starting a fresh session (new machine, new session after /new), the cache is empty. The first few messages will write the entire context (skills, workspace files, memory) to cache at 1.25x the normal input rate. This is unavoidable but temporary — it pays off within 2-3 messages once the cache warms up.

Don't panic at the first few messages being expensive on a new machine. The cache write cost is a one-time investment that makes every subsequent message ~90% cheaper.

Signs You're Over-Spending

Running powerful models on tasks Fast/Cheap can handle
No caching on repeated system prompts
Heartbeat/cron jobs using the default (expensive) model
Sub-agents spawned without an explicit model (the biggest cost leak)

Session & Cache Management

Keep sessions alive when possible — longer sessions build cache and reduce costs. Only end sessions when context is genuinely full or for privacy reasons.

Anthropic's prompt cache builds from repeated context within a live session. When a session starts fresh, all context (system prompt, workspace files, skills) loads cold — typically 400-600k tokens at full cost. Once cached, subsequent messages cost ~10% of that.

The math:

Cold session start: 600k tokens × full price = expensive
After cache warms up: 600k tokens × 10% cache price = ~90% cheaper per message
Ending a session destroys the cache and forces a full cold reload next time
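To make the comparison concrete, the sketch below estimates input cost in "token-equivalents" (tokens multiplied by their price relative to normal input). It is a deliberate simplification: it assumes a fixed context size, a 1.25x write rate on the first message, a 0.10x read rate afterwards, and a cache that never expires mid-session. The function name is illustrative.

```python
def session_input_cost(
    context_tokens: int,
    messages: int,
    write_multiplier: float = 1.25,
    read_multiplier: float = 0.10,
) -> float:
    """Input cost of one session, in token-equivalents at the normal input price."""
    if messages <= 0:
        return 0.0
    # First message writes the cache; every later message reads it.
    return context_tokens * (write_multiplier + (messages - 1) * read_multiplier)


# 20 messages in one long session versus the same 20 messages split over 4 sessions.
one_long_session = session_input_cost(600_000, 20)
four_short_sessions = 4 * session_input_cost(600_000, 5)
```

Under these assumptions the single long session costs about 1.89M token-equivalents and the four short sessions about 3.96M, roughly double, because each restart pays the cold cache write again.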

Rules:

Let sessions run as long as possible for cost efficiency
Only start a new session (/new) when context is genuinely full (>80%) or when you need a fresh privacy boundary
Ending sessions should be intentional — for privacy/data-retention reasons, not routine cost management
The longer a session runs, the cheaper each message gets
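The session rules above reduce to a simple check. This is a minimal sketch with assumed names; how you measure context usage depends on your platform.

```python
def should_start_new_session(
    context_tokens_used: int,
    context_window: int,
    needs_privacy_boundary: bool = False,
    full_threshold: float = 0.80,
) -> bool:
    """True only when context is genuinely full or a fresh privacy boundary is needed."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    context_is_full = context_tokens_used / context_window > full_threshold
    return needs_privacy_boundary or context_is_full
```

For instance, a session using 170,000 of a 200,000-token window (85%) qualifies for a new session, while one at 50% does not unless a privacy boundary is required.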

Privacy & Cache Note: Cached context may include workspace files and memory — avoid caching sessions containing secrets or sensitive PII. If a session will cache sensitive data, plan to end it when done.

Delegation rule (keep main agent lean):

Main agent (Sonnet/mid-tier) = conversational only: planning, coordination, reviewing results
Sub-agents (Haiku/fast-cheap) = all actual doing: file edits, research, builds, data tasks
Keeping the main agent conversational reduces its context growth and keeps cache hits high

Security Recommendations

This skill is coherent and useful for reducing API spend, but pay attention to the privacy and operational trade-offs it recommends: keeping sessions long and caching large system prompts, workspace files, and memory will reduce cost but also increase data retention and the chance that sensitive data is stored and reused. Before enabling broadly:

1) Verify your provider's cache TTLs and how cached data is stored and isolated by tenant.
2) Ensure organizational policies prohibit caching secrets or PII (or implement filters to strip them).
3) Test the sub-agent spawn rule in a non-production environment to confirm model selection is enforced, so defaults don't leak expensive models.
4) Monitor cache hit rates, costs, and any unexpected retention of sensitive context.
5) If you have compliance or privacy constraints, consult those teams before adopting the 'keep sessions alive' recommendations.

Functional Analysis

Type: OpenClaw Skill
Name: agent-cost-strategy
Version: 1.3.6

The skill bundle provides legitimate strategies for cost optimization in multi-agent AI workflows, focusing on tiered model selection, prompt caching, and Batch API usage. It includes instructions for routing tasks to appropriate models (e.g., using Claude Haiku for sub-agents and background tasks) and managing session longevity to maximize cache hits. No malicious code, data exfiltration, or harmful prompt injections were identified in SKILL.md or the associated documentation.

Capability Tags

crypto
can-make-purchases

Capability Assessment

✓ Purpose & Capability

The skill's name and description (tiered model selection, sub-agent model rules, cron/heartbeat guidance) match the SKILL.md content. It asks for no binaries, no env vars, and contains only policy-style instructions appropriate for cost optimization.

ℹ Instruction Scope

Instructions stay focused on cost strategies (tier routing, cache patterns, batch API, explicit model selection for sub-agents). However, several recommendations increase data retention/caching (keep sessions alive, cache workspace files and memory, put static content in system prompts). The skill does warn to avoid caching secrets, but those caching/long-session recommendations materially affect privacy and attack surface and should be considered before application in sensitive contexts.

✓ Install Mechanism

Instruction-only skill with no install spec or code files — lowest install risk. Nothing will be written to disk or downloaded by the skill itself.

✓ Credentials

No environment variables, credentials, or config paths are requested. The lack of requested secrets is proportionate to an instructions-only cost-optimization guide.

✓ Persistence & Privilege

The skill does not request permanent presence (always: false) and contains no mechanism to alter other skills or system settings. Note: autonomous invocation is allowed by platform default (not flagged here). Combined with the caching guidance, this increases operational impact and should be monitored.

Version History

v1.3.6

Soften absolute session/model rules; add privacy note on cached context

v1.3.5

Fix: replace bare code blocks with ```text for consistent rendering

v1.3.4

Remove internal FUTURE.md from published bundle — not intended for public distribution

v1.3.3

Add explicit sub-agent model rule (always specify Haiku, never rely on defaults) and cold start cache write cost explanation

v1.3.2

Remove internal mmlog self-improvement section (not for public skills)

v1.3.1

Add cache break-even math and Batch API section (50% discount for non-urgent tasks). Cache writes pay off after 1-2 reuses.

v1.3.0

Added session & cache management section — explains why deleting sessions spikes costs, how prompt cache works, and the delegation rule for keeping main agent lean

v1.2.1

Fix: Added clawdbot runtime metadata

v1.2.0

Refactor: Fully provider-agnostic. Removed Anthropic-specific heartbeat config. Punchier description for better triggering. Cleaner task-to-tier routing table. Cache optimization reference updated to cover all providers.

v1.1.0

Added: 55min heartbeat TTL alignment for Anthropic cache, always-Haiku rule for cron/scheduled tasks, communication pattern rule (short replies never use expensive models)

v1.0.0

Initial release — tiered model selection framework for multi-agent workflows with cache optimization patterns

Metadata

Slug agent-cost-strategy

Version 1.3.6

License MIT-0

Total installs 2

Current installs 2

Number of versions 11

FAQ