Description

Prevent runaway API costs with 3-tier protection: caution, emergency, and hard cap. Works for ANY expensive API GPT-4, Claude Opus, Gemini, cloud services.

README (SKILL.md)

Cost Control 3-Tier API Spend Protection

Name: Cost Control
Author: theshadowrose

Prevent runaway API costs with 3-tier protection: caution, emergency, and hard cap. Works for ANY expensive API GPT-4, Claude Opus, Gemini, cloud services.

Prevent runaway API costs. Works for ANY expensive API (GPT-4, Claude Opus, Gemini, cloud services).

Cost Control System is a production-grade cost monitoring framework with tiered responses, external watchdog, and kill switch.

Originally built to prevent catastrophic AI API spend after a real runaway cost incident, now generalized for any expensive API that charges per request/token/compute.

The Problem

You're running a system that makes API calls to an expensive service (OpenAI GPT-4, Anthropic Claude Opus, Google Gemini, cloud compute, etc.).

One bug, one infinite loop, one configuration error → hundreds of dollars burned in minutes.

Manual monitoring doesn't work. You need automated cost controls with failsafes.

The Solution

Cost Control System implements 3-tier protection:

Tier 1: Caution Mode (Warning)

When cost velocity exceeds threshold (e.g., $3/15min):

Slow down check intervals
Log warnings
Continue operating but throttled

Tier 2: Emergency Mode (Stop)

When cost velocity is dangerous (e.g., $5/15min):

Stop all API calls
Enter maintenance mode
Preserve existing state
Alert operator

Tier 3: Kill Switch (External Watchdog)

Independent process monitors cost separately:

If Tier 2 fails to stop runaway spend
Kill the entire process
Prevent restart until manually cleared
Last line of defense

Quick Start (3 Steps)

1. Install

pip install cost-control-system
# OR copy cost_control.py to your project

2. Integrate with Your API Client

from cost_control import CostTracker

tracker = CostTracker(
    cost_caution_15min=3.00,      # Warning at $3/15min
    cost_emergency_15min=5.00,    # Emergency at $5/15min
    cost_daily_cap=25.00,         # Hard cap at $25/day
)

# Before each API call, check if allowed
allowed, reason = tracker.is_call_allowed()
if not allowed:
    print(f"API call blocked: {reason}")
    # Handle emergency mode (alert, exit, etc.)
    return

# Make your API call
response = your_expensive_api_call()

# Record actual cost
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens
tracker.record_call(input_tokens, output_tokens, request_id="req-123")

3. Deploy External Watchdog (Optional but Recommended)

# Run external watchdog every 2 minutes (via cron)
*/2 * * * * cd /your/project && python3 cost_watchdog.py >> logs/watchdog.log 2>&1

Time to integrate: 15-20 minutes.

Use Cases

AI API Calls (GPT-4, Claude Opus, Gemini)

Prevent runaway token usage from:

Infinite loops generating responses
Misconfigured context windows (100K+ tokens per call)
Prompt injection causing excessive output

Cloud Compute (AWS Lambda, GCP Functions, Azure)

Monitor spend on:

Serverless function invocations
Compute hours (EC2, VMs)
Database queries (expensive per-query pricing)

Third-Party APIs (Stripe, Twilio, SendGrid)

Track costs for:

SMS/email sends
Payment processing fees
Data enrichment APIs

Research/ML Training

Control costs for:

GPU compute time
Model training runs
Batch inference jobs

Any API with measurable per-request or per-unit costs benefits from this.

Architecture

3-Tier Response System

Normal → Caution (Tier 1) → Emergency (Tier 2) → Kill (Tier 3)
  ↓          ↓                    ↓                  ↓
  OK      Slow down          Stop calls      Kill process

Tier 1: Caution Mode

Trigger: 15-minute cost > $3 (configurable)
Response:

Slow down check intervals (reduce API call frequency)
Log warnings
Continue operating

Exit: 15-minute cost \x3C $2 for 5 minutes

Tier 2: Emergency Mode

Trigger: 15-minute cost > $5 OR daily cost > $25 (configurable)
Response:

Block all API calls (is_call_allowed() returns False)
Write emergency flag file
Send alerts (Discord, Slack, email, etc.)
System remains alive but paused

Exit: Manual intervention required (rm state/cost_emergency.flag)

Tier 3: Kill Switch (External Watchdog)

Trigger: Hourly cost > $12 OR daily cost > $30 (configurable, higher than Tier 2)
Response:

Kill the entire process (SIGTERM → SIGKILL)
Write emergency flag (prevents restart)
Independent of main process (can't be bypassed)

Purpose: Catch cases where Tier 2 fails (e.g., bug in cost tracker itself)

Configuration

Basic Config (Minimal Setup)

from cost_control import CostTracker

tracker = CostTracker(
    # Tier 1 (Caution)
    cost_caution_15min=3.00,
    
    # Tier 2 (Emergency)
    cost_emergency_15min=5.00,
    cost_daily_cap=25.00,
    
    # State persistence
    state_dir="./state",
)

Advanced Config (All Options)

tracker = CostTracker(
    # Thresholds
    cost_caution_15min=3.00,
    cost_emergency_15min=5.00,
    cost_daily_cap=25.00,
    max_cost_per_call=0.50,           # Single-call sanity check
    
    # Recovery
    caution_recovery_threshold=2.00,  # Exit caution when \x3C$2/15min
    caution_recovery_duration=300,    # Must stay below for 5 min
    
    # State
    state_dir="./state",
    cost_log_file="cost_log.jsonl",
    
    # Alerting
    alert_callback=my_alert_function,
)

Pricing Config (Your API)

# Define your API's pricing
tracker.set_pricing(
    input_price_per_unit=15.00,   # e.g., $15 per million input tokens
    output_price_per_unit=75.00,  # e.g., $75 per million output tokens
    unit_size=1_000_000,          # e.g., per million tokens
)

# For non-token APIs (e.g., per-request pricing):
tracker.set_pricing(
    fixed_cost_per_call=0.01,     # $0.01 per API call
)

See config_example.py for complete examples.

API Reference

Core Methods

`record_call(input_units, output_units, request_id=None)`

Record an API call with actual usage.

Args:

input_units (int): Input units consumed (e.g., tokens, requests)
output_units (int): Output units consumed
request_id (str, optional): Identifier for logging

Returns: True if within limits, False if emergency triggered

`is_call_allowed()`

Check if API calls are currently allowed.

Returns: (allowed: bool, reason: str)

allowed=True, reason="ok" → Safe to call API
allowed=False, reason="cost_emergency_mode" → Blocked

Usage:

allowed, reason = tracker.is_call_allowed()
if not allowed:
    handle_emergency(reason)
    return

`get_stats()`

Get current cost statistics.

Returns: Dict with keys:

{
    "cost_5min": 0.45,
    "cost_15min": 1.23,
    "cost_1hour": 3.45,
    "cost_daily": 12.34,
    "calls_1hour": 23,
    "calls_session": 156,
    "caution_mode": False,
    "emergency_mode": False,
}

`clear_emergency()`

Clear emergency mode (manual intervention).

Usage:

tracker.clear_emergency()
# Or: rm state/cost_emergency.flag

Properties

caution_mode (bool) — Whether in Tier 1 caution
emergency_mode (bool) — Whether in Tier 2 emergency
cost_15min (float) — Rolling 15-minute cost
cost_daily (float) — Total cost today (UTC)

External Watchdog Setup

1. Copy Watchdog Script

cp cost_watchdog.py /your/project/

2. Configure Watchdog

Edit cost_watchdog.py:

KILL_THRESHOLD_HOURLY = 12.00  # $12/hour (higher than Tier 2)
KILL_THRESHOLD_DAILY = 30.00   # $30/day (higher than Tier 2)

3. Deploy via Cron

crontab -e

# Add this line (run every 2 minutes)
*/2 * * * * cd /your/project && python3 cost_watchdog.py >> logs/watchdog.log 2>&1

4. Verify Watchdog is Running

tail -f logs/watchdog.log
# Should see entries every 2 minutes

Kill Switch Usage

Manual Kill Switch (Immediate Stop)

python3 kill_switch.py enable
# Stops all API calls immediately

Check Status

python3 kill_switch.py status

Resume After Emergency

python3 kill_switch.py disable
rm state/cost_emergency.flag  # Clear emergency flag
# Restart your application

Testing

Unit Tests

pytest tests/test_cost_control.py

Integration Test (Simulate Runaway)

from cost_control import CostTracker

tracker = CostTracker(
    cost_emergency_15min=1.00,  # Low threshold for testing
)

# Simulate 50 expensive calls
for i in range(50):
    allowed, reason = tracker.is_call_allowed()
    if not allowed:
        print(f"Blocked after {i} calls: {reason}")
        break
    
    # Simulate $0.10 per call
    tracker.record_call(10000, 50000, request_id=f"test-{i}")
    # Should trigger emergency after ~10 calls

# Verify emergency mode
assert tracker.emergency_mode
print("✅ Emergency mode triggered successfully")

Limitations

See LIMITATIONS.md for details, including:

Clock skew between system and API
Delayed cost reporting (API quotas update hourly)
Watchdog reliability (cron timing jitter)
Multi-process coordination

Performance

Memory

~100 bytes per recorded call
Rolling window of last 5000 calls ≈ 500KB memory
Daily total persists across restarts

CPU

Cost calculation: O(1) per call
Threshold checks: O(1)
Rolling window: O(N) where N = calls in window (typically \x3C500)

I/O

State saves: atomic writes on emergency only
Cost log: append-only, rotates at 10K entries

Troubleshooting

Problem: Emergency mode triggered but spend seems normal

Cause: Clock skew — system time doesn't match API billing time
Fix: Sync system clock with NTP (ntpdate or chrony)

Problem: Watchdog doesn't kill runaway process

Cause: Cron not running or watchdog script errors
Fix: Check cron logs (/var/log/syslog), verify script permissions

Problem: Frequent caution mode cycling

Cause: Threshold too low for your use case
Fix: Raise cost_caution_15min to match normal usage peaks

Real-World Example: Runaway API Burn Incident

What happened:

A trading bot had dozens of orphaned positions
Each position triggered an expensive AI API call every few minutes
Costs compounded rapidly — over $100/hour burn rate
Ran for over an hour before manual discovery = $150+ burn

How Cost Control would have stopped it:

Tier 1 (~5 min): Caution at velocity threshold → slow down checks
Tier 2 (~8 min): Emergency at higher threshold → stop all API calls
Tier 3 (~10 min): Watchdog kill at hourly cap → process terminated

Result: Max damage $5-8 instead of $150+

Migration Guide

From No Cost Tracking

Calculate baseline cost (1 day of normal usage)
Set cost_caution_15min = 2x baseline peak
Set cost_emergency_15min = 3x baseline peak
Set cost_daily_cap = acceptable daily budget
Deploy watchdog with higher thresholds (2x emergency)
Run parallel for 1-2 days (log only, no blocking)
Enable blocking once validated

From Manual Monitoring

Export historical cost data to CSV
Analyze 95th percentile cost per 15-minute window
Set thresholds based on percentiles
Integrate tracker with existing alerting
Gradually enable tiers (start with Tier 1 only)

Contributing

Cost Control System is open source and community-maintained.

Report Issues

Bugs, false positives, performance problems
Integration confusion
Feature requests (multi-currency, better alerting, etc.)

Submit Improvements

Additional API pricing examples
Watchdog reliability improvements
Dashboard/visualization tools

Contribution guidelines: Open an issue or PR on GitHub. We review everything.

Philosophy: Why Open Source?

Cost Control solves a universal infrastructure problem — preventing runaway API spend. This affects everyone building on paid APIs.

This is a goodwill project to:

Prevent the $153 incident from happening to others
Prove the vibe coding methodology (adaptive systems that heal themselves)
Build reputation for future paid Shadow Rose products

Support & Community

Get Help

📖 Documentation: You're reading it
💬 Discord: https://discord.com/invite/clawd (OpenClaw community)
🐦 Twitter: https://x.com/TheShadowyRose
🐛 Issues: [GitHub link]

Support Development

Cost Control is free, but if it saved you money:

☕ Ko-fi: https://ko-fi.com/theshadowrose

Donations support ongoing testing, docs, and new features.

License

MIT License — use freely, credit appreciated.

See LICENSE for details.

What's Next?

Planned Features

Multi-currency support — Track costs in multiple currencies
Budget forecasting — Predict end-of-day/month spend
Granular alerting — Per-endpoint or per-user cost tracking
Dashboard — Web UI for real-time cost monitoring

Want to work on these? Open an issue or reach out on Discord.

Credits

Created by: Shadow Rose
Extracted from: Production trading bot (battle-tested in live operation)
Philosophy: Vibe coding — infrastructure that protects itself
License: MIT

Built from a real runaway cost incident — learned the hard way so you don't have to.

Final Thought

Every system that calls expensive APIs will eventually have a runaway cost incident.

The question is: does your system detect and stop it in 5 minutes, or do you find out when the $1000 bill arrives?

Cost Control ensures you find out in 5 minutes.

Ship it. Break it. Tell us what you find.

— Shadow Rose
February 2026

☕ Support: https://ko-fi.com/theshadowrose
🐦 Follow: https://x.com/TheShadowyRose
💬 Community: https://discord.com/invite/clawd

⚠️ Disclaimer

This software is provided "AS IS", without warranty of any kind, express or implied.

USE AT YOUR OWN RISK.

The author(s) are NOT liable for any damages, losses, or consequences arising from the use or misuse of this software — including but not limited to financial loss, data loss, security breaches, business interruption, or any indirect/consequential damages.
This software does NOT constitute financial, legal, trading, or professional advice.
Users are solely responsible for evaluating whether this software is suitable for their use case, environment, and risk tolerance.
No guarantee is made regarding accuracy, reliability, completeness, or fitness for any particular purpose.
The author(s) are not responsible for how third parties use, modify, or distribute this software after purchase.

By downloading, installing, or using this software, you acknowledge that you have read this disclaimer and agree to use the software entirely at your own risk.

TRADING DISCLAIMER: This software is a tool, not a trading system. It does not make trading decisions for you and does not guarantee profits. Trading cryptocurrency, stocks, or any financial instrument carries significant risk of loss. You can lose some or all of your investment. Past performance of any system or methodology is not indicative of future results. Never trade with money you cannot afford to lose. Consult a qualified financial advisor before making investment decisions.

Support & Links


🐛 Bug Reports	[email protected]
☕ Ko-fi	ko-fi.com/theshadowrose
🛒 Gumroad	shadowyrose.gumroad.com
🐦 Twitter	@TheShadowyRose
🐙 GitHub	github.com/TheShadowRose
🧠 PromptBase	promptbase.com/profile/shadowrose

Built with OpenClaw — thank you for making this possible.

🛠️ Need something custom? Custom OpenClaw agents & skills starting at $500. If you can describe it, I can build it. → Hire me on Fiverr

Usage Guidance

This package appears to implement exactly what it claims: a local cost tracker, an external watchdog that can kill a process, and a manual kill switch. Before installing or deploying: - Run code review and test in staging; verify record_call() is invoked reliably by your app. - Ensure the state directory (default ./state) is writable only by the intended user and not writable by untrusted users to prevent tampering (someone creating cost_emergency.flag or KILL_SWITCH will block API calls). - If using the watchdog, make sure your application writes state/app.pid with its own PID and that the PID path in cost_watchdog.py points to that file. Be careful: pointing the watchdog to the wrong PID file can kill other processes. - Prefer running the watchdog under a dedicated, unprivileged cron/systemd service and restrict who can edit its config or state files. - Consider adding safeguards: validate PID ownership (compare uid), check process command line/name before killing, enforce file permissions, and test kill/restart procedures. - Understand limitations in LIMITATIONS.md: file-based state is single-process, clock skew and delayed billing can cause false positives/negatives, and emergency recovery is manual. If you cannot enforce these deployment constraints or secure the state files, exercise caution — the feature that kills processes is necessary for the purpose but can be disruptive if misconfigured.

Capability Analysis

Type: OpenClaw Skill Name: cost-control Version: 1.0.3 The bundle is a legitimate utility designed to prevent runaway API costs through a 3-tier protection system (Caution, Emergency, and Kill Switch). The core logic in cost_control.py and the external cost_watchdog.py script function as described, using local file-based state tracking and standard process signals (os.kill) to manage spend limits. There is no evidence of malicious intent, data exfiltration, or unauthorized remote access; the high-risk capability of process termination is explicitly documented as the 'Tier 3' safety mechanism.

Capability Assessment

✓ Purpose & Capability

Name/description (cost control for expensive APIs) matches the provided code and SKILL.md. All included files (CostTracker, watchdog, kill switch, config examples) are directly relevant to implementing a 3‑tier cost control system. There are no unrelated credentials, network endpoints, or binaries requested.

ℹ Instruction Scope

Runtime instructions focus on integration (call is_call_allowed() before calls, record_call() after, deploy watchdog via cron). The docs explicitly instruct creating a PID file and state directory and writing/reading flag files. This is expected for the design, but the SKILL.md/code instruct the agent/operator to create or rely on on-disk artifacts (state/, state/app.pid, cost_emergency.flag, KILL_SWITCH). Those files are central to behavior and can be misused or misconfigured if not secured.

✓ Install Mechanism

No install spec is present (instruction-only). Code is provided as Python files that the user places in their project or pip can install; no external downloads from untrusted URLs or package installs are embedded in the skill metadata.

✓ Credentials

The skill requests no environment variables or credentials. It uses only local filesystem and process signaling, which is proportionate to a local kill-switch/watchdog design. There are no unexplained secret or network access requirements.

ℹ Persistence & Privilege

The skill does not request 'always:true' and allows normal opt-in/autonomous invocation. However, the external watchdog will (if configured) send SIGTERM/SIGKILL to the PID read from state/app.pid and write an emergency flag file that prevents restart until manual clearing. This behavior is coherent with the purpose but is high-impact: misconfigured PID paths or tampered state files could cause unintended process termination or denial of service. The implementation does not validate PID ownership, process identity, or require elevated permissions — responsibility for safe deployment falls to the operator.

Version History

v1.0.3

## Changelog for version 1.0.3 - No file changes detected; functionality, documentation, and configuration remain unchanged from v1.0.2. - Version number updated, but the SKILL.md documentation and codebase stay the same as the previous release.

v1.0.2

## Version 1.0.2 Changelog - Updated version number in SKILL.md to 1.0.2. - No other changes made; documentation and features remain the same.

v1.0.1

- Fixed character encoding issues in the skill name and description (removed replacement characters). - Updated version in metadata from 1.0.0 to 1.0.1. - No functional code or documentation changes beyond improved text presentation.

v1.0.0

Initial upload

Metadata

Slug cost-control

Version 1.0.3

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 4

Frequently Asked Questions

What is Cost Control?

Prevent runaway API costs with 3-tier protection: caution, emergency, and hard cap. Works for ANY expensive API GPT-4, Claude Opus, Gemini, cloud services. It is an AI Agent Skill for Claude Code / OpenClaw, with 387 downloads so far.

How do I install Cost Control?

Run "/install cost-control" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Cost Control free?

Yes, Cost Control is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Cost Control support?

Cost Control is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Cost Control?

It is built and maintained by Shadow Rose (@theshadowrose); the current version is v1.0.3.

More Skills

Cost Control

Cost Control 3-Tier API Spend Protection

Prevent runaway API costs. Works for ANY expensive API (GPT-4, Claude Opus, Gemini, cloud services).

The Problem

The Solution

Tier 1: Caution Mode (Warning)

Tier 2: Emergency Mode (Stop)

Tier 3: Kill Switch (External Watchdog)

Quick Start (3 Steps)

1. Install

2. Integrate with Your API Client

3. Deploy External Watchdog (Optional but Recommended)

Use Cases

AI API Calls (GPT-4, Claude Opus, Gemini)

Cloud Compute (AWS Lambda, GCP Functions, Azure)

Third-Party APIs (Stripe, Twilio, SendGrid)

Research/ML Training

Architecture

3-Tier Response System

Tier 1: Caution Mode

Tier 2: Emergency Mode

Tier 3: Kill Switch (External Watchdog)

Configuration

Basic Config (Minimal Setup)

Advanced Config (All Options)

Pricing Config (Your API)

API Reference

Core Methods

record_call(input_units, output_units, request_id=None)

is_call_allowed()

get_stats()

clear_emergency()

Properties

External Watchdog Setup

1. Copy Watchdog Script

2. Configure Watchdog

3. Deploy via Cron

4. Verify Watchdog is Running

Kill Switch Usage

Manual Kill Switch (Immediate Stop)

Check Status

Resume After Emergency

Testing

Unit Tests

Integration Test (Simulate Runaway)

Limitations

Performance

Memory

CPU

I/O

Troubleshooting

Problem: Emergency mode triggered but spend seems normal

Problem: Watchdog doesn't kill runaway process

Problem: Frequent caution mode cycling

Real-World Example: Runaway API Burn Incident

Migration Guide

From No Cost Tracking

From Manual Monitoring

Contributing

Report Issues

Submit Improvements

Philosophy: Why Open Source?

Support & Community

Get Help

Support Development

License

What's Next?

Planned Features

Credits

Final Thought

⚠️ Disclaimer

Support & Links

What is Cost Control?

How do I install Cost Control?

Is Cost Control free?

Which platforms does Cost Control support?

Who created Cost Control?

💬 Comments

`record_call(input_units, output_units, request_id=None)`

`is_call_allowed()`

`get_stats()`

`clear_emergency()`