Anthropic Token Optimizer

by ngocgd · GitHub ↗ · v1.3.0 · MIT-0
cross-platform · ✓ Security Clean
112 Downloads · 0 Stars · 0 Active Installs · 4 Versions
Install in OpenClaw
/install anthropic-token-optimizer
Description
Reduce Anthropic API costs (cache read, compaction, context bloat) for OpenClaw agents. Use when users ask about token optimization, reducing API costs, cach...
README (SKILL.md)

Anthropic Token Optimizer

Minimize Anthropic API costs without sacrificing quality. Focus on cache read reduction — the #1 cost driver for long sessions with Opus.

Anthropic Pricing Reference

| Model | Input | Cache Write | Cache Read | Output |
|---|---|---|---|---|
| Haiku | $0.80 | $1.00 | $0.08 | $4 |
| Sonnet | $3 | $3.75 | $0.30 | $15 |
| Opus | $15 | $18.75 | $1.50 | $75 |

(per MTok — multiply by millions of tokens)

Key insight: Cache read cost ≈ context_size × num_turns × cache read price. A 200k context over 50 turns is 10 MTok of cache reads, which on Opus is ~$15 for cache reads alone.
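
The arithmetic behind that figure can be checked with a short sketch. The function name is hypothetical, and it assumes every turn re-reads the full context at the cache read price; real sessions grow turn by turn, so treat the result as an upper-bound estimate.

```python
# Cache read prices in USD per million tokens (MTok), from the table above.
CACHE_READ_PER_MTOK = {"haiku": 0.08, "sonnet": 0.30, "opus": 1.50}


def cache_read_cost(context_tokens: int, turns: int, model: str) -> float:
    """Estimate cache read spend, assuming each turn re-reads the full context."""
    total_tokens = context_tokens * turns
    return total_tokens / 1_000_000 * CACHE_READ_PER_MTOK[model]


if __name__ == "__main__":
    # 200k context x 50 turns on Opus
    print(f"${cache_read_cost(200_000, 50, 'opus'):.2f}")  # prints $15.00
```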


Part 1: Config Optimizations (openclaw.json)

1. Compaction model — use cheaper model (highest ROI ✅)

"compaction": {
  "mode": "safeguard",
  "model": "anthropic/claude-sonnet-4-20250514"
}

Sonnet summarizes at one-fifth the cost of Opus ($3 vs. $15 per MTok input, $15 vs. $75 output), and summary quality is comparable.
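
As a rough illustration of that ratio, the sketch below prices a single compaction pass on both models. The token counts (150k of history, a 4k summary) are made-up example values, and the sketch assumes the history is billed at the plain input price rather than from cache.

```python
# USD per MTok, from the pricing table above.
PRICES = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def compaction_cost(model: str, input_tokens: int, summary_tokens: int) -> float:
    """Cost of one compaction pass: read the history, write a summary."""
    price = PRICES[model]
    total = input_tokens * price["input"] + summary_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    for model in ("opus", "sonnet"):
        print(model, f"${compaction_cost(model, 150_000, 4_000):.2f}")
```

With these example numbers the pass costs about $2.55 on Opus and $0.51 on Sonnet.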

2. Context pruning — trim old tool results

"contextPruning": { "mode": "cache-ttl", "ttl": "1h" }

| TTL | Behavior | Verdict |
|---|---|---|
| 1h | Trims after 1 hour | ✅ Best balance |
| 30m | Moderate | OK for active sessions |
| 5m | Aggressive | ⚠️ Loses tool results |

3. Cache retention — keep "long"

"params": { "cacheRetention": "long", "context1m": true }

"long" = fewer cache writes ($18.75/MTok on Opus!). Don't switch to "default".

4. Cache TTL heartbeat alignment

With long cache retention, the Anthropic prompt cache expires after ~1h idle (the API's default TTL is 5 minutes). A heartbeat every 55 minutes keeps it warm:

"heartbeat": { "every": "55m" }

Prevents expensive cache re-writes when agent resumes after idle.
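
Whether the heartbeat pays off depends on how long the agent sits idle. The sketch below compares the two costs using the Opus prices from the table. It assumes each heartbeat is billed as one cache read of the full context and ignores the heartbeat's own output tokens; both function names are hypothetical.

```python
OPUS_CACHE_WRITE = 18.75  # USD per MTok, from the pricing table
OPUS_CACHE_READ = 1.50    # USD per MTok, from the pricing table


def rewrite_cost(context_tokens: int) -> float:
    """Cost of re-writing the whole context to cache after it has expired."""
    return context_tokens / 1_000_000 * OPUS_CACHE_WRITE


def heartbeat_cost(context_tokens: int, idle_minutes: int, every_minutes: int = 55) -> float:
    """Cost of the heartbeats needed to keep the cache warm across an idle gap."""
    beats = idle_minutes // every_minutes
    return beats * context_tokens / 1_000_000 * OPUS_CACHE_READ


if __name__ == "__main__":
    context = 200_000
    print(f"re-write after expiry: ${rewrite_cost(context):.2f}")
    for idle in (120, 720):
        print(f"heartbeats over {idle} min idle: ${heartbeat_cost(context, idle):.2f}")
```

Under these assumptions the break-even is 18.75 / 1.50 = 12.5 heartbeats, so keeping the cache warm is cheaper for idle gaps up to roughly 11 hours and more expensive beyond that.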

5. Bootstrap size limits — cap workspace injection

"bootstrapMaxChars": 8000,
"bootstrapTotalMaxChars": 30000

Check current injection: /context list. Prevents oversized files from inflating every turn.


Part 2: Workspace Hygiene (biggest long-term win)

File budgets

| File | Target | Review cycle |
|---|---|---|
| AGENTS.md | <500 tokens (~2KB) | Monthly |
| SOUL.md | <250 tokens (~1KB) | Rarely |
| TOOLS.md | <500 tokens (~2KB) | When tools change |
| MEMORY.md | <400 tokens (~1.5KB) | Weekly prune |
| HEARTBEAT.md | <400 tokens (~1.5KB) | When done |
| memory/YYYY-MM-DD.md | <30 lines | Collapse at EOD |
| Total injected | <2,800 tokens | |
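
A budget check along these lines could be scripted. This is a minimal sketch: the 4-characters-per-token ratio is only the rough heuristic implied by "500 tokens (~2KB)" in the table, not a real tokenizer, and the function names are hypothetical.

```python
# Token targets from the file budget table above.
BUDGETS = {
    "AGENTS.md": 500,
    "SOUL.md": 250,
    "TOOLS.md": 500,
    "MEMORY.md": 400,
    "HEARTBEAT.md": 400,
}
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not an actual tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def over_budget(files: dict[str, str]) -> dict[str, int]:
    """Return {filename: tokens over target} for files that exceed their budget."""
    excess = {}
    for name, text in files.items():
        budget = BUDGETS.get(name)
        if budget is None:
            continue  # no target defined for this file
        tokens = estimate_tokens(text)
        if tokens > budget:
            excess[name] = tokens - budget
    return excess
```

In practice the `files` mapping would be built by reading the workspace files, for example with `pathlib.Path.read_text`.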

Reduction tactics

  • Merge redundant files (IDENTITY.md + USER.md → SOUL.md)
  • Move non-essential docs to subfolders (docs/, notes/) — not auto-injected
  • Collapse exploration into decisions: keep "what we decided", delete "how we got here"
  • Prune ghost context: references to old paths, removed tools, fixed bugs
  • Deduplicate: info in SOUL.md shouldn't repeat in MEMORY.md or AGENTS.md

Daily memory rules

  • Write: decisions + why (1 line), new tools/config, lessons, user preferences.
  • Skip: exploration steps, command outputs, things already in MEMORY.md, delivered content.
  • Format: Bullets, not paragraphs. One fact per line.


Part 3: Behavioral Patterns

6. /compact after each topic (most effective manual action)

/compact Focus on [topic summary]

7. /new when switching topics entirely

Context resets to 0. Don't carry 200k into unrelated work.

8. Subagents for tool-heavy work

Spawn subagents (cheaper model) for: codebase grep, reading 5+ files, research/web fetch. Tool results stay isolated.

9. Tool output discipline

  • Truncate: | head -20, | jq '.key'
  • Request only needed fields from APIs
  • Never paste full JSON when you need one value
  • Output >50 lines → summarize, don't quote
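
When a tool is called from code rather than a shell pipeline, the same discipline can be applied with a small helper. This is an illustrative sketch only; the function name and the wording of the marker line are assumptions, while the 50-line and 20-line limits come from the bullets above.

```python
def clip_output(text: str, max_lines: int = 50, keep: int = 20) -> str:
    """Return short output unchanged; otherwise keep the head and note what was cut."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    omitted = len(lines) - keep
    return "\n".join(lines[:keep] + [f"[... {omitted} more lines omitted]"])


if __name__ == "__main__":
    long_output = "\n".join(f"line {i}" for i in range(100))
    print(clip_output(long_output))
```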

10. File loading discipline

  • Startup: only today + yesterday memory files
  • Read SKILL.md only when task needs that skill
  • Don't re-read files already in context

Part 4: Context Budgeting

Information partitioning

| Budget | Content |
|---|---|
| 10% | Task instructions + constraints |
| 40% | Recent 5-10 turns of dialogue |
| 20% | Decision logs ("tried X, failed because Y") |
| 20% | High-relevance MEMORY.md snippets |
| 10% | Tool schemas + system prompt |
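
To make the percentages concrete, this sketch converts them into token counts for a given context window. The category keys are hypothetical labels for the rows above.

```python
# Shares from the partitioning table above; they sum to 100.
SHARES = {
    "task_instructions": 10,
    "recent_dialogue": 40,
    "decision_logs": 20,
    "memory_snippets": 20,
    "tools_and_system": 10,
}


def allocate(context_tokens: int) -> dict[str, int]:
    """Split a context window into per-category token budgets."""
    return {name: context_tokens * pct // 100 for name, pct in SHARES.items()}


if __name__ == "__main__":
    for name, tokens in allocate(200_000).items():
        print(f"{name}: {tokens}")
```

For a 200k window this gives 80,000 tokens for recent dialogue and 20,000 for task instructions.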

Compaction survival

Before compaction hits, critical state must be captured:

  1. WAL Protocol: On corrections, decisions, specific values → write to SESSION-STATE.md before responding
  2. Working buffer: At 60%+ context → append exchange summaries to memory/working-buffer.md
  3. Recovery: After compaction, read buffer + session state first. Never ask "where were we?"
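
The three steps above can be sketched as plain file operations. In an OpenClaw session the agent would do this with its own file tools; the function names, the timestamp format, and the way context usage is passed in as a fraction are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from pathlib import Path

BUFFER_THRESHOLD = 0.60  # "At 60%+ context" from step 2


def log_state(workspace: Path, note: str) -> None:
    """Step 1: append a decision or correction to SESSION-STATE.md before replying."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    with (workspace / "SESSION-STATE.md").open("a", encoding="utf-8") as f:
        f.write(f"- [{stamp}] {note}\n")


def buffer_exchange(workspace: Path, context_used: float, summary: str) -> bool:
    """Step 2: once context use reaches the threshold, append an exchange summary."""
    if context_used < BUFFER_THRESHOLD:
        return False
    path = workspace / "memory" / "working-buffer.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {summary}\n")
    return True


def recover(workspace: Path) -> str:
    """Step 3: after compaction, read session state and working buffer first."""
    parts = []
    for rel in ("SESSION-STATE.md", "memory/working-buffer.md"):
        path = workspace / rel
        if path.exists():
            parts.append(path.read_text(encoding="utf-8"))
    return "\n".join(parts)
```

Appending rather than rewriting keeps each write cheap and means an interrupted session still leaves earlier entries intact.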

Session lifespan

After 85% context or 3+ compactions → start fresh with /new. Good MEMORY.md means minimal context loss.


Part 5: Codebase Map Caching (Atris Pattern)

Problem: Every code review or exploration session re-reads the same files, burning tokens on repeated grep/read calls. A 500-file project can cost 50k+ tokens just for navigation.

Solution: Generate a persistent codebase map (atris/MAP.md) once, reuse across sessions.

Setup (one-time per project)

# Install atris skill
npx clawhub@latest install atris

# Or manually: create atris/ folder in project root, then scan
rg "^(export|function|class|const|def |async def |router\.|app\.)" \
  --line-number -g "!node_modules" -g "!.git" -g "!dist" -g "!.env*"
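
Where ripgrep is not available, the same scan can be approximated in Python. This is a sketch, not a drop-in replacement: it reuses the pattern and the exclusions from the command above, but unlike rg it does not honor .gitignore or skip hidden files by default. The function names are hypothetical.

```python
import re
from pathlib import Path

# Same pattern as the rg command above.
DEFINITION = re.compile(r"^(export|function|class|const|def |async def |router\.|app\.)")
SKIP_DIRS = {"node_modules", ".git", "dist"}


def scan_text(path: str, text: str) -> list[str]:
    """Return 'path:line: source' entries for lines that look like definitions."""
    return [
        f"{path}:{number}: {line.strip()}"
        for number, line in enumerate(text.splitlines(), start=1)
        if DEFINITION.match(line)
    ]


def scan_tree(root: Path) -> list[str]:
    """Walk a project, skipping the directories and .env* files excluded above."""
    entries = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.name.startswith(".env"):
            continue
        if SKIP_DIRS & set(path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        entries.extend(scan_text(str(path.relative_to(root)), text))
    return entries
```

The resulting `path:line` entries are raw material for MAP.md; they still need to be grouped and annotated by hand or by the agent.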

MAP.md structure

# MAP.md — [Project] Navigation Guide
> Last updated: YYYY-MM-DD

## Quick Reference
- `src/index.ts:1` — App entry point
- `src/routes/auth.ts:15` — POST /login handler
- `src/models/user.ts:8` — User schema

### Feature: Authentication
- **Entry:** `src/auth/login.ts:45-89` (handleLogin)
- **Validation:** `src/auth/validate.ts:12` (validateToken)
- **Routes:** `src/routes/auth.ts:5-28`

### Feature: Billing
- **Controller:** `src/controllers/billing.ts:20`
- **Service:** `src/services/stripe.ts:1-45`

MAP-first rule

Before searching codebase:

  1. Read atris/MAP.md — found? Go directly to file:line
  2. Not found? Search with rg, then add result to MAP.md

Map gets smarter every session. Never let a discovery go unrecorded.
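
The MAP-first rule amounts to a lookup followed by a write-back on a miss. The sketch below works on the map's text and assumes entries carry a backticked `path:line` or `path:start-end` location as in the sample structure above; the function names and the format of the appended entry are hypothetical.

```python
import re

# Matches `path:line` or `path:start-end` inside backticks, as in the MAP.md sample.
LOCATION = re.compile(r"`([^`:]+):(\d+)(?:-(\d+))?`")


def lookup(map_text: str, keyword: str) -> list[tuple[str, int]]:
    """Return (path, first_line) for every map entry that mentions the keyword."""
    hits = []
    for line in map_text.splitlines():
        if keyword.lower() not in line.lower():
            continue
        match = LOCATION.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2))))
    return hits


def record(map_text: str, path: str, line: int, note: str) -> str:
    """Append a newly discovered location so the next lookup finds it."""
    return map_text.rstrip("\n") + f"\n- `{path}:{line}` ({note})\n"
```

An empty result from `lookup` is the signal to fall back to rg and then call `record` with whatever the search turned up.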

Keeping fresh

  • New file → add to relevant section
  • Deleted file → remove from map
  • Major refactor → regenerate affected sections only
  • Small updates, not full regeneration

Token savings

| Codebase | Without map | With map | Savings |
|---|---|---|---|
| Small (50 files) | ~5k tokens/explore | ~1k | 80% |
| Medium (200 files) | ~20k tokens/explore | ~3k | 85% |
| Large (500+ files) | ~50k tokens/explore | ~5k | 90% |

Diagnostics

/context list    → token count per injected file
/context detail  → full breakdown (tools, skills, system)
/usage tokens    → append token count to every reply
/usage cost      → cumulative cost summary
/status          → model, context %, cost estimate

Decision Matrix

| Situation | Action |
|---|---|
| Session >100k context | /compact immediately |
| Switching topics | /new or /compact |
| Reading 5+ files | Spawn subagent |
| Compaction cost high | Set compaction model to Sonnet |
| Daily cost >$10 | Audit session count, compact more |
| Cache writes spiking | Heartbeat ≤55min, keep cacheRetention: long |
| Workspace injection >20KB | Merge/move files, set bootstrapMaxChars |
| Context >85% | /new (start fresh) |
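
The measurable rows of the matrix can be expressed as a small rule function. This sketch covers only the rows with numeric thresholds, plus the "3+ compactions" rule from the Session lifespan section; the function name and its parameters are hypothetical, and the values would come from something like /status.

```python
def recommend(context_tokens: int, context_pct: float,
              compactions: int = 0, files_to_read: int = 0) -> list[str]:
    """Map session metrics to the actions in the decision matrix."""
    actions = []
    if context_pct > 85 or compactions >= 3:
        actions.append("/new")  # start fresh rather than compacting again
    elif context_tokens > 100_000:
        actions.append("/compact")
    if files_to_read >= 5:
        actions.append("spawn subagent")
    return actions


if __name__ == "__main__":
    print(recommend(context_tokens=120_000, context_pct=60))  # ['/compact']
    print(recommend(context_tokens=180_000, context_pct=90))  # ['/new']
```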

Impact Summary

| Technique | Savings | UX Impact |
|---|---|---|
| Compaction model = Sonnet | ~80% compaction cost | None |
| Workspace file budgets | ~30-50% base cost | None |
| /compact after topics | ~40-60% cache read | Manual step |
| Cache TTL heartbeat (55m) | ~20-30% cache writes | None |
| Bootstrap size limits | ~20-30% base cost | None |
| Subagent delegation | ~30% cache read | Better (parallel) |
| Tool output discipline | ~10-20% per turn | Requires habit |
| /new for new topics | ~100% (reset) | Lose old context |
| Codebase map (Atris) | ~80-90% code exploration | One-time setup |

Credits

Incorporates ideas from: openclaw-token-optimizer, context-slimmer, context-budgeting, compaction-survival, context-hygiene, atris (codebase map caching).

Usage Guidance
This is an instruction-only optimization guide (no code shipped, no credentials requested). It's coherent with its purpose, but before applying recommendations:

  1. Back up openclaw.json and workspace files before making config changes.
  2. Verify Anthropic pricing/metrics against official docs; numbers in the guide may be estimates.
  3. Confirm availability of any referenced tools (npx, rg/ripgrep, clawhub) before running suggested commands.
  4. Be cautious about persisting sensitive data to MEMORY.md or MAP.md (avoid storing PII/credentials).
  5. Note that following the npx/clawhub suggestion will install third-party code from npm; inspect that package before running it.
Capability Analysis
Type: OpenClaw Skill
Name: anthropic-token-optimizer
Version: 1.3.0

The 'anthropic-token-optimizer' skill bundle provides legitimate documentation and configuration strategies for reducing Anthropic API costs. It includes helpful advice on context pruning, compaction models, and workspace hygiene, along with standard shell commands (e.g., using 'ripgrep' in SKILL.md) to create codebase maps for efficiency. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found.
Capability Assessment
Purpose & Capability
Name and description claim token-cost/compaction/workspace guidance and the SKILL.md contains only guidance and concrete config/file-management suggestions that match that purpose. No unrelated credentials, binaries, or system-level access are requested.
Instruction Scope
Instructions focus on config changes (openclaw.json), workspace hygiene, compaction/heartbeat patterns, and creating/maintaining files (MEMORY.md, SESSION-STATE.md, atris/MAP.md). This is within scope, but the doc also suggests running tools/commands (npx, rg, clawhub, grep) and using chat slash commands like `/context list`, none of which are declared in the registry metadata. Users should expect the agent or operator to run these and should ensure the binaries exist. The skill will recommend writing and reading local workspace files (intended behavior); verify no sensitive secrets will be persisted accidentally.
Install Mechanism
There is no install spec in the registry (instruction-only). The SKILL.md suggests a manual npx-based installation step for the optional 'atris' pattern (npx clawhub@latest install atris). That is a manual command that would fetch code from npm when run; the skill itself does not automatically download or install anything. Users should be aware that following that suggestion will run a network-installed package.
Credentials
The skill requests no environment variables, credentials, or config paths. All recommended actions are local config edits and file hygiene; the lack of requested secrets is proportional to the described functionality.
Persistence & Privilege
Skill is user-invocable, not always-on, and does not request persistent privileges or modify other skills. It advises creating/maintaining local files but does not itself persist or alter agent-wide settings via an install spec.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install anthropic-token-optimizer
  3. After installation, invoke the skill by name or use /anthropic-token-optimizer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.0: Added codebase map caching (Atris pattern): scan once, navigate forever. 80-90% token savings on code exploration. MAP-first rule prevents repeated grep/read.
v1.2.0: Merged best practices from 6 community skills: workspace file budgets, context budgeting (info partitioning), compaction survival (WAL protocol + working buffer), tool output discipline, file loading discipline, daily memory rules, session lifespan guidance.
v1.1.0: Added cache TTL heartbeat alignment, bootstrap size limits, OpenClaw diagnostics commands, fork credit. Enhanced pricing table with Output costs.
v1.0.0: Initial release: cache read optimization, compaction model config, behavioral patterns
Metadata
Slug anthropic-token-optimizer
Version 1.3.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is Anthropic Token Optimizer?

Reduce Anthropic API costs (cache read, compaction, context bloat) for OpenClaw agents. Use when users ask about token optimization, reducing API costs, cach... It is an AI Agent Skill for Claude Code / OpenClaw, with 112 downloads so far.

How do I install Anthropic Token Optimizer?

Run "/install anthropic-token-optimizer" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Anthropic Token Optimizer free?

Yes, Anthropic Token Optimizer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Anthropic Token Optimizer support?

Anthropic Token Optimizer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Anthropic Token Optimizer?

It is built and maintained by ngocgd (@ngocgd); the current version is v1.3.0.
