nimaansari

LLM Cost Watchdog

Author: Nima Ansari · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 108
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install llm-cost-watchdog
Description
Monitors real-time LLM API costs, detects runaway loops, enforces budgets, audits code risk, and reports usage across multiple providers and models.
Usage instructions (SKILL.md)

Cost Watchdog 💰

Real-time cost tracking layer for LLM-based agents. Prices every call live, detects runaway loops in code, enforces budget ceilings mid-execution.

1. Identity

Observes LLM spend without disturbing the agent. Prevents $2,400-overnight-loop disasters by making cost a first-class concern: priced at write time, budgeted at check time, surfaced in reports.

2. Triggers

Activate when:

  • User mentions cost, budget, tokens, billing.
  • Code contains LLM API calls (Anthropic, OpenAI, OpenRouter, Google, Groq, ...).
  • Agent loops or recursive workflows.
  • Batch / streaming processing with unclear bounds.
  • /cost-watchdog [command] is invoked.

3. Commands

Run via python3 scripts/cost_watchdog.py <cmd> (or hook into your own CLI).

| Command | What it does |
| --- | --- |
| session | Spend totals from usage.jsonl: calls, tokens, cost, top models. |
| report | 24h / 7d / 30d windows with top model per window. |
| tail [--once] | Watch OpenClaw session JSONL and log every assistant turn. |
| detect [--json] | Identify which model the agent is currently using (5 probes). |
| audit <file.py> | AST-based code risk scan: unbounded loops, recursion, missing max_tokens. |
| price <model> | Live pricing for one model, with source + cache age. |
| estimate <model> | Project cost for n iterations of a given call. |
| alternatives <model> | Cheaper same-unit models. |
| errors [--limit N] | Recent swallowed exceptions (silent failures made visible). |
| validate-tokens <model> | Compare our heuristic against provider's authoritative count. |
| reset [--all] | Clear current-day log (--all also clears rolled files). |
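
The projection behind the estimate command comes down to simple arithmetic. The sketch below assumes per-million-token pricing with separate input and output rates. The function name `estimate_cost`, its parameters, and the prices used are illustrative placeholders, not the skill's actual API or real quotes.

```python
def estimate_cost(input_tokens, output_tokens, iterations,
                  input_price_per_mtok, output_price_per_mtok):
    """Project total USD cost for `iterations` identical calls.

    Prices are expressed in USD per one million tokens (an assumption).
    """
    per_call = (input_tokens * input_price_per_mtok
                + output_tokens * output_price_per_mtok) / 1_000_000
    return per_call * iterations


# Placeholder prices, not real quotes: $1.00 in / $5.00 out per million tokens.
total = estimate_cost(2_000, 500, 1_000, 1.00, 5.00)
print(f"${total:.2f}")  # prints $4.50
```

Running the projection before starting a loop makes the cost of "just 1,000 more iterations" visible up front.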

4. Pricing layer

Source chain

openrouter/*            → OpenRouter API (live)           → static fallback
anything else           → LiteLLM JSON (live, cached 24h) → OpenRouter (permissive) → static fallback
  • 2600+ models indexed across chat, completion, embedding, image, audio, video, rerank, OCR, search modes.
  • 30+ providers in the static fallback: Anthropic, OpenAI, Google, Groq, Mistral, Cohere, DeepSeek, Perplexity, xAI, Bedrock, Azure, and more.
  • Unit-aware: token, image, second, query, page, character, pixel. Alternatives never compare across units.
  • Circuit breaker opens after 3 consecutive network failures for a host; falls through to cache/static until the cool-down ends (60s).
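
The circuit-breaker behaviour described in the last bullet can be sketched as follows. This is a minimal illustration, assuming a per-host counter of consecutive failures. The class and method names are hypothetical and not taken from scripts/_sources.py.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; closes after `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock       # injectable so tests can fake time
        self.failures = {}       # host -> consecutive failure count
        self.opened_at = {}      # host -> time the breaker opened

    def allow(self, host):
        """Return True if a network request to `host` may be attempted."""
        opened = self.opened_at.get(host)
        if opened is None:
            return True
        if self.clock() - opened >= self.cooldown:
            # Cool-down is over: close the breaker and reset the counter.
            del self.opened_at[host]
            self.failures[host] = 0
            return True
        return False

    def record_success(self, host):
        self.failures[host] = 0

    def record_failure(self, host):
        self.failures[host] = self.failures.get(host, 0) + 1
        if self.failures[host] >= self.threshold:
            self.opened_at[host] = self.clock()
```

A caller would check `allow(host)` before each fetch and fall through to the cache or static table when it returns False, which avoids paying a network timeout on every call while a host is down.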

Tuning

| Env var | Default | Effect |
| --- | --- | --- |
| CW_PRICE_TTL_SECONDS | 86400 (24h) | Cache lifetime. 0 = hit network every call. |
| CW_OFFLINE | unset | If 1, never touch the network. |
| CW_STATIC_ONLY | unset | If 1, skip live sources entirely. Used by tests. |
| CW_LOG_DIR | ~/.cost-watchdog | Where usage/errors/cache files live. |
| CW_BUDGET_USD | unset | Ceiling; wrappers raise BudgetExceeded when crossed. |
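
As a rough illustration of how these variables could be read with the defaults listed above, consider the sketch below. The `load_config` helper and the dictionary keys are hypothetical; the skill's own parsing may differ.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read the CW_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    budget = env.get("CW_BUDGET_USD")
    return {
        "price_ttl_seconds": int(env.get("CW_PRICE_TTL_SECONDS", "86400")),
        "offline": env.get("CW_OFFLINE") == "1",
        "static_only": env.get("CW_STATIC_ONLY") == "1",
        "log_dir": Path(env.get("CW_LOG_DIR", "~/.cost-watchdog")).expanduser(),
        # None means "no ceiling configured".
        "budget_usd": float(budget) if budget else None,
    }


cfg = load_config({"CW_BUDGET_USD": "25", "CW_OFFLINE": "1"})
print(cfg["budget_usd"], cfg["offline"])  # prints 25.0 True
```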

Refresh static pricing

python3 scripts/refresh_pricing.py

Regenerates references/pricing.md from the live sources so the offline fallback is fresh. Aborts if fewer than 100 rows came back (protects against clobbering on a network outage).
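
The "abort below 100 rows" guard can be sketched like this. The helper name `write_pricing` is hypothetical, and the temp-file-plus-rename step is an assumption about how the overwrite might be made safe; only the 100-row threshold comes from the description above.

```python
import os
import tempfile

MIN_ROWS = 100  # threshold stated above


def write_pricing(path, rows, min_rows=MIN_ROWS):
    """Replace `path` atomically, but only if enough rows came back."""
    if len(rows) < min_rows:
        # Too few rows usually means a failed fetch: keep the old file.
        raise RuntimeError(f"only {len(rows)} rows fetched; keeping existing file")
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write("\n".join(rows) + "\n")
        os.replace(tmp, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```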

5. Tracking layer — how we know what was spent

Four independent paths, all write to ~/.cost-watchdog/usage.jsonl:

| Path | When to use | Covers streams? |
| --- | --- | --- |
| openclaw_tailer.py --watch | Running OpenClaw. Zero code changes. | yes (reads completed turns) |
| track_openai(client) | You call OpenAI-compatible SDK (covers OpenRouter, Groq, DeepSeek, Mistral, Together, Fireworks, Cerebras, Anyscale, ...). | yes (tee'd iterator, auto-injects stream_options={"include_usage": True}) |
| track_anthropic(client) | Direct Anthropic SDK. | yes (wraps messages.stream()) |
| track_gemini(model) / track_cohere(client) / track_bedrock(client) | Direct provider SDKs. | no (add wrappers if you need streams) |
| install_global_capture() (httpx) | Any modern Python SDK using httpx. | no; streams are flagged into errors.jsonl so the gap is visible. Use the SDK wrappers for stream coverage. |

Usage log rotates daily: usage.YYYY-MM-DD.jsonl. session_total(since=...) skips files outside the window before scanning.
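
Skipping files outside the window can be done purely from the file names, without opening anything. The following is a minimal sketch under that assumption; `files_in_window` is a hypothetical helper, not the actual `session_total` implementation.

```python
import re
from datetime import date
from pathlib import Path

# Matches rotated files such as usage.2025-01-31.jsonl
_ROTATED = re.compile(r"^usage\.(\d{4}-\d{2}-\d{2})\.jsonl$")


def files_in_window(log_dir, since):
    """Return rotated usage files dated on or after `since`, oldest first."""
    selected = []
    for path in Path(log_dir).iterdir():
        match = _ROTATED.match(path.name)
        if match and date.fromisoformat(match.group(1)) >= since:
            selected.append(path)
    # ISO dates sort chronologically as plain strings.
    return sorted(selected)
```

Only the files this returns need to be parsed line by line, so a 24h report does not pay for months of history.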

Aggregation uses canonical_family() so claude-haiku-4-5-20251001, claude-haiku-4-5, and claude-haiku-4.5 are one row in reports.
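
One way such collapsing could work is shown below. This is a hypothetical stand-in, deliberately named `canonical_family_sketch`, that assumes two normalisation rules (drop a trailing `-YYYYMMDD` snapshot suffix, treat `4.5` and `4-5` as the same version). The real `canonical_family()` in scripts/model_canon.py may apply different or additional rules.

```python
import re


def canonical_family_sketch(model: str) -> str:
    """Collapse snapshot-dated and dotted variants of a model name."""
    name = model.strip().lower()
    name = re.sub(r"-\d{8}$", "", name)            # drop trailing -YYYYMMDD
    name = re.sub(r"(?<=\d)\.(?=\d)", "-", name)   # 4.5 -> 4-5
    return name


variants = ["claude-haiku-4-5-20251001", "claude-haiku-4-5", "claude-haiku-4.5"]
print({canonical_family_sketch(v) for v in variants})  # prints {'claude-haiku-4-5'}
```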

6. Budget enforcement

Two mechanisms:

  1. Write-time check (race-safe): append_usage(entry, budget_ceiling=X) takes an fcntl.flock on a sidecar, sums the current session, and refuses the write (raises BudgetExceeded) if the call would cross X.
  2. Post-write check: wrappers compare cumulative spend to CW_BUDGET_USD after logging and raise if over. Used when the wrapper doesn't know the ceiling at call time.

Either path stops the agent mid-loop; the LLM call still returns to the caller, but the next one blocks.
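
The write-time check can be sketched as a lock, sum, then append sequence. This is an illustration only: the function name `append_usage_sketch`, the `cost_usd` field, and the `.lock` sidecar name are assumptions, and `fcntl` is available on Unix-like systems only.

```python
import fcntl
import json
import os


class BudgetExceeded(Exception):
    """Raised when a write would push spend past the ceiling."""


def append_usage_sketch(log_path, entry, budget_ceiling):
    """Append `entry` to the JSONL log unless it would cross `budget_ceiling`."""
    with open(log_path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until this process owns the lock
        try:
            spent = 0.0
            if os.path.exists(log_path):
                with open(log_path, encoding="utf-8") as f:
                    spent = sum(json.loads(line)["cost_usd"]
                                for line in f if line.strip())
            if spent + entry["cost_usd"] > budget_ceiling:
                raise BudgetExceeded(
                    f"spent {spent:.4f} + {entry['cost_usd']:.4f} > {budget_ceiling}")
            with open(log_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(entry) + "\n")
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because the sum and the append happen under the same exclusive lock, two processes cannot both read "under budget" and then both write past the ceiling.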

7. Code audit (AST)

python3 scripts/cost_watchdog.py audit path/to/agent.py

Walks the AST and reports:

  • CRITICAL: while True with an LLM call and no max_iterations-style bound.
  • CRITICAL: function that recurses and calls an LLM API with no depth argument.
  • HIGH: plain while that calls an API with no retry/iteration counter.
  • MEDIUM: LLM call missing max_tokens / max_completion_tokens.
  • MEDIUM: function with ≥5 sequential LLM calls (batching candidate).

Every finding has a file line number. No more count('def ') > 3 and count('self.') > 5 → "recursion detected" false positives.
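
To give a feel for how an AST walk yields line-numbered findings, here is a minimal sketch of the first check only. It assumes that a method call named `create`, `generate`, or `complete` counts as an LLM call, which is a crude heuristic chosen for illustration, and it does not look for `break` statements or iteration bounds the way the real scripts/code_audit.py is described as doing.

```python
import ast

# Assumed heuristic: attribute names that suggest an LLM API call.
LLM_CALL_ATTRS = {"create", "generate", "complete"}


def find_unbounded_llm_loops(source):
    """Return (line, message) pairs for `while True` loops containing an LLM call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.While):
            continue
        is_while_true = (isinstance(node.test, ast.Constant)
                         and node.test.value is True)
        calls_llm = any(
            isinstance(inner, ast.Call)
            and isinstance(inner.func, ast.Attribute)
            and inner.func.attr in LLM_CALL_ATTRS
            for inner in ast.walk(node)
        )
        if is_while_true and calls_llm:
            findings.append((node.lineno, "CRITICAL: while True with an LLM call"))
    return findings


sample = "import x\nwhile True:\n    client.messages.create(model='m')\n"
print(find_unbounded_llm_loops(sample))
# prints [(2, 'CRITICAL: while True with an LLM call')]
```

Working on the parsed tree rather than on raw text is what lets the audit attach a precise line number to each finding.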

8. Detection — "what model is the agent using?"

python3 scripts/cost_watchdog.py detect

Five probe layers, ranked by confidence:

| Probe | Confidence |
| --- | --- |
| OpenClaw session JSONL | high |
| Claude Code session JSONL | high |
| Most recent usage-log entry | high |
| Claude Code settings.json | medium |
| Env vars (ANTHROPIC_MODEL, OPENAI_MODEL, ...) | medium |

Emits a table or --json.
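
The layered approach amounts to "try each probe in ranked order and return the first hit". The sketch below shows that shape with only the environment-variable probe filled in. The function names and the result dictionary are hypothetical, not the interface of scripts/detect_model.py.

```python
import os


def probe_env(env):
    """Lowest-ranked probe: look for a model name in known env vars."""
    for var in ("ANTHROPIC_MODEL", "OPENAI_MODEL"):
        if env.get(var):
            return {"model": env[var], "source": f"env:{var}", "confidence": "medium"}
    return None


def detect(probes, env=None):
    """Run probes in ranked order and return the first non-empty result."""
    env = os.environ if env is None else env
    for probe in probes:
        hit = probe(env)
        if hit is not None:
            return hit
    return None


result = detect([probe_env], env={"OPENAI_MODEL": "gpt-4o"})
print(result["model"], result["confidence"])  # prints gpt-4o medium
```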

9. Files

| Path | Purpose |
| --- | --- |
| scripts/_pricing.py | Router: picks LiteLLM / OpenRouter / static per query. |
| scripts/_sources.py | Three PricingSource classes + disk cache + circuit breaker. |
| scripts/tokenizer.py | Provider-aware token counting (tiktoken for OpenAI; calibrated heuristics for others). |
| scripts/model_canon.py | canonical_family(): collapses model variants. |
| scripts/code_audit.py | AST cost-risk walker. |
| scripts/usage_log.py | JSONL writer + rotation + aggregation. |
| scripts/tracker.py | SDK wrappers + streaming + budget enforcement. |
| scripts/http_capture.py | install_global_capture(): httpx transport hook. |
| scripts/openclaw_tailer.py | Watches OpenClaw sessions. |
| scripts/detect_model.py | Multi-layer detector. |
| scripts/errors.py | errors.jsonl writer + reader. |
| scripts/io_utils.py | write_json_atomic / read_json. |
| scripts/refresh_pricing.py | Regenerates static pricing.md from live sources. |
| scripts/cost_watchdog.py | Unified CLI dispatcher. |
| references/pricing.md | Static fallback (regenerated; ~2600 models). |
| tests/test_cost_watchdog.py | 73 tests: router, cache, AST, tokenizer, rotation, cassettes, circuit breaker, canonicalization. |

10. Quality checklist

  • Live pricing from LiteLLM + OpenRouter, 24h-cached, with static fallback.
  • Exact-match model lookup (no substring conflation).
  • Multi-modal (token / image / second / query / page / character).
  • Unit-aware alternatives (never compares tokens to images).
  • AST-based code audit with line numbers.
  • Provider-aware tokenization (no more tiktoken-for-Claude).
  • Variance-based confidence (no += 0.05 theater).
  • Atomic writes to all shared state files.
  • fcntl.flock-guarded budget check-and-log (no race).
  • Circuit breaker on flaky networks (no 5s hang per call).
  • Streaming capture via SDK wrappers; streams flagged in errors.jsonl via HTTP capture.
  • Daily log rotation + date-scoped aggregation.
  • Canonical model families (variants collapse in reports).
  • errors.jsonl surfaces silent failures; cost_watchdog errors shows them.
  • Cassette tests for LiteLLM + OpenRouter parse paths (schema-drift safety net).
  • 73 logic tests passing.

11. Known limits (be honest)

  • Tokenizer heuristics for Claude/Gemini/etc. are calibrated from docs, not measured. Run cost_watchdog validate-tokens <model> to check drift against the provider's authoritative count when you have an API key.
  • install_global_capture() can't see streaming responses — httpx exposes an empty body until the user reads the stream. Use track_openai / track_anthropic for stream coverage; http_capture logs skipped streams to errors.jsonl so the gap is visible.
  • Non-httpx SDKs (older Cohere, boto3 with custom transport) need the per-SDK wrappers — HTTP capture won't see them.
  • LiteLLM community data can lag 24-48h on brand-new models. OpenRouter's API is truly live for anything it routes.

12. Testing

python3 -m unittest tests.test_cost_watchdog     # 73 tests
python3 scripts/code_audit.py test_risky_code.py # sample risks
python3 scripts/cost_watchdog.py report          # current spend summary
Security recommendations
This skill appears to implement what it claims and does not request unrelated credentials, but it can capture and log full LLM requests/responses and (if enabled) any Python httpx traffic. Before installing:

  1. Review the code paths that enable global capture (install_global_capture, openclaw_tailer, and the SDK wrappers) to ensure you understand what is logged and where.
  2. Set CW_LOG_DIR to a directory with restricted permissions and consider encrypting or rotating logs if they may contain secrets.
  3. Prefer per-SDK wrappers (track_openai, track_anthropic) over the global httpx capture to reduce unintended data capture.
  4. Run in an isolated environment (container or dedicated VM) if you will enable live pricing/network access.
  5. If you need deterministic tests or to avoid network lookups, set CW_STATIC_ONLY=1 (the test suite already does this).
  6. If you aren't comfortable with the possibility of recording sensitive payloads, do not enable global capture or automatic tailing; otherwise the skill is coherent with its purpose.

Additional code review is recommended if you will deploy this in production or on systems with sensitive data.
Functional analysis
Type: OpenClaw Skill
Name: llm-cost-watchdog
Version: 1.0.0

The skill bundle provides a comprehensive suite for monitoring LLM costs and enforcing budgets, but it employs several high-risk technical methods. Specifically, `scripts/http_capture.py` monkey-patches the `httpx` library globally to intercept and inspect network traffic for telemetry, and `scripts/detect_model.py` performs broad file system access to read session logs and settings from other applications (e.g., Claude Code in `~/.claude`). While these behaviors are aligned with the tool's stated purpose and no evidence of data exfiltration was found, the use of global library hooks and cross-application data access represents a significant security surface. The tool fetches pricing data from legitimate endpoints at `githubusercontent.com` (LiteLLM) and `openrouter.ai`.
Capability tags
crypto · can-make-purchases · requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description match what the package implements: pricing loaders, trackers, SDK wrappers, AST audit, visualizer, and budget enforcement. It does not request unrelated credentials or external OS-level access in the registry metadata. The included scripts and references (pricing.md, trackers, audit) are proportionate to the stated feature set.
Instruction Scope
SKILL.md and scripts instruct the agent to: tail OpenClaw session JSONL, wrap provider SDK clients (track_openai, track_anthropic, etc.), and optionally install a global httpx capture (install_global_capture()). These mechanisms write to ~/.cost-watchdog/usage.jsonl and errors.jsonl and may record full prompts, completions, and HTTP payloads. Capturing arbitrary httpx traffic can include non-LLM endpoints or secrets in request bodies/headers; the README/tests also reference regenerating pricing via live network calls. Tests expect a LICENSE file that isn't listed in the manifest — a small inconsistency. Overall the actions are explainable for cost-tracking but broaden the data surface (sensitive content may be logged) and grant broad discretion to the skill when enabled.
Install Mechanism
No external install spec or remote downloads are declared; the bundle contains Python scripts and docs. That reduces supply-chain concerns. The README suggests optional symlinking into global skill directories (user action). No use of URL-based extract/install was observed in metadata.
Credentials
No privileged environment variables or external credentials are required by the skill metadata. The skill uses its own env toggles (CW_PRICE_TTL_SECONDS, CW_OFFLINE, CW_STATIC_ONLY, CW_LOG_DIR, CW_BUDGET_USD) which are proportional to caching, offline/test mode, log location, and budget enforcement. It does not ask for unrelated service keys, which is appropriate.
Persistence & Privilege
always:false (not force-included). The skill writes logs to ~/.cost-watchdog by default and exposes wrappers that can patch SDK behavior at runtime (inject stream options, tee iterators). Those changes are expected for a monitoring/enforcement layer but increase the blast radius if the global httpx capture or automatic wrapper injection is enabled—review and restrict usage to controlled environments.
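How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install llm-cost-watchdog
  3. Once installed, invoke the Skill by name or trigger it with /llm-cost-watchdog.
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output.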
Version history
v1.0.0
Cost Watchdog v1.0.0: Initial Release

  • Tracks real-time LLM spending across 2600+ models and 30+ providers, enforcing budget ceilings to prevent runaway costs.
  • Detects runaway agent loops and unbounded API calls with AST-based code auditing.
  • Supports live model pricing, usage tracking, and budget enforcement via CLI and SDK wrappers.
  • Aggregates and reports spend by session, time window, and top models; provides alerts for silent errors and budgeting breaches.
  • Offers multi-layer detection of active models and robust offline/static fallbacks for pricing.
  • Includes unified CLI with commands for session tracking, auditing, pricing, alternatives, reporting, error review, and more.
Metadata
Slug: llm-cost-watchdog
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
FAQ

What is LLM Cost Watchdog?

Monitors real-time LLM API costs, detects runaway loops, enforces budgets, audits code risk, and reports usage across multiple providers and models. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 108 total downloads to date.

How do I install LLM Cost Watchdog?

Run the command "/install llm-cost-watchdog" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is LLM Cost Watchdog free?

Yes. LLM Cost Watchdog is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does LLM Cost Watchdog support?

LLM Cost Watchdog is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops LLM Cost Watchdog?

It is developed and maintained by Nima Ansari (@nimaansari); the current version is v1.0.0.
