Description

Reduce Anthropic API costs (cache read, compaction, context bloat) for OpenClaw agents. Use when users ask about token optimization, reducing API costs, cach...

README (SKILL.md)

Anthropic Token Optimizer

Name: Anthropic Token Optimizer
Author: ngocgd

Minimize Anthropic API costs without sacrificing quality. Focus on cache read reduction — the #1 cost driver for long sessions with Opus.

Anthropic Pricing Reference

Model	Input	Cache Write	Cache Read	Output
Haiku	$0.80	$1.00	$0.08	$4
Sonnet	$3	$3.75	$0.30	$15
Opus	$15	$18.75	$1.50	$75

(USD per MTok, i.e. per million tokens)

Key insight: Cache read tokens ≈ context_size × num_turns, so cost grows with both. 200k context × 50 turns on Opus = 10 MTok × $1.50 = ~$15 in cache reads alone.
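The arithmetic behind this estimate can be sketched as follows. This is a minimal sketch that assumes the context stays the same size on every turn and is served entirely from cache; the function name `cache_read_cost` is illustrative, and the price is taken from the table above.

```python
def cache_read_cost(context_tokens: int, turns: int, price_per_mtok: float) -> float:
    """Estimate total cache-read cost in USD for one session.

    Assumes every turn re-reads the same `context_tokens` from cache.
    """
    total_tokens = context_tokens * turns
    return total_tokens / 1_000_000 * price_per_mtok


# Opus cache read price from the table above: $1.50 per MTok.
opus_cost = cache_read_cost(200_000, 50, 1.50)
print(f"${opus_cost:.2f}")  # prints "$15.00"
```

In a real session the context grows turn by turn, so this figure is closer to an upper bound for a session that ends at 200k than to an exact bill.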

Part 1: Config Optimizations (openclaw.json)

1. Compaction model — use cheaper model (highest ROI ✅)

"compaction": {
  "mode": "safeguard",
  "model": "anthropic/claude-sonnet-4-20250514"
}

Sonnet summarizes at one-fifth the cost of Opus ($3 vs. $15 input, $15 vs. $75 output per MTok). Quality is comparable for summaries.
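To make the ratio concrete, here is a rough cost comparison for a single compaction. It is a sketch under stated assumptions: the history is billed at the uncached input price, the summary at the output price, and the 150k/2k token sizes are hypothetical. Prices are copied from the table above.

```python
# Prices in USD per million tokens, copied from the pricing table above.
PRICES = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
}


def compaction_cost(model: str, input_tokens: int, summary_tokens: int) -> float:
    """Cost of one compaction: read the history as input, write a summary as output."""
    p = PRICES[model]
    return (input_tokens * p["input"] + summary_tokens * p["output"]) / 1_000_000


# Hypothetical compaction: 150k tokens of history summarized into 2k tokens.
opus = compaction_cost("opus", 150_000, 2_000)
sonnet = compaction_cost("sonnet", 150_000, 2_000)
print(f"Opus ${opus:.2f} vs Sonnet ${sonnet:.2f}")  # prints "Opus $2.40 vs Sonnet $0.48"
```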

2. Context pruning — trim old tool results

"contextPruning": { "mode": "cache-ttl", "ttl": "1h" }

TTL	Behavior	Verdict
`1h`	Trims after 1 hour	✅ Best balance
`30m`	Moderate	OK for active sessions
`5m`	Aggressive	⚠️ Loses tool results

3. Cache retention — keep "long"

"params": { "cacheRetention": "long", "context1m": true }

"long" = fewer cache writes ($18.75/MTok on Opus!). Don't switch to "default".

4. Cache TTL heartbeat alignment

With long retention, the Anthropic cache expires after ~1h idle (the default TTL is 5 minutes). A heartbeat every 55min keeps it warm:

"heartbeat": { "every": "55m" }

Prevents expensive cache re-writes when agent resumes after idle.
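Whether the heartbeat pays off depends on how long the agent stays idle. The sketch below estimates the break-even point. It assumes each heartbeat is billed as one cache read of the full context and ignores the heartbeat's own output tokens. The function name is illustrative and the prices come from the table above.

```python
def heartbeat_break_even(cache_write_price: float, cache_read_price: float) -> float:
    """Number of heartbeats whose combined cache-read cost equals one cache re-write.

    Both prices are USD per MTok. The context size cancels out because the
    same cached context is either re-read by each heartbeat or re-written once.
    """
    return cache_write_price / cache_read_price


# Opus prices from the table above: $18.75 cache write, $1.50 cache read.
beats = heartbeat_break_even(18.75, 1.50)
print(beats)  # prints 12.5
```

Under these assumptions, keeping the cache warm is cheaper than a re-write as long as the agent resumes within about 12 heartbeats, which is roughly 11 hours at 55-minute intervals. For longer idle periods, letting the cache expire may cost less.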

5. Bootstrap size limits — cap workspace injection

"bootstrapMaxChars": 8000,
"bootstrapTotalMaxChars": 30000

Check current injection: /context list. Prevents oversized files from inflating every turn.

Part 2: Workspace Hygiene (biggest long-term win)

File budgets

File	Target	Review cycle
AGENTS.md	<500 tokens (~2KB)	Monthly
SOUL.md	<250 tokens (~1KB)	Rarely
TOOLS.md	<500 tokens (~2KB)	When tools change
MEMORY.md	<400 tokens (~1.5KB)	Weekly prune
HEARTBEAT.md	<400 tokens (~1.5KB)	When done
memory/YYYY-MM-DD.md	<30 lines	Collapse at EOD
Total injected	<2,800 tokens
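A quick audit against these budgets could look like the sketch below. It uses a rough 4-characters-per-token heuristic (implied by "500 tokens (~2KB)"), not a real tokenizer, and takes file contents as strings so the example stays self-contained. `BUDGETS`, `estimate_tokens`, and `over_budget` are illustrative names.

```python
# Token budgets from the table above, keyed by file name.
BUDGETS = {
    "AGENTS.md": 500,
    "SOUL.md": 250,
    "TOOLS.md": 500,
    "MEMORY.md": 400,
    "HEARTBEAT.md": 400,
}

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def over_budget(files: dict[str, str]) -> dict[str, int]:
    """Return {file name: estimated tokens} for files that meet or exceed their budget."""
    return {
        name: estimate_tokens(text)
        for name, text in files.items()
        if name in BUDGETS and estimate_tokens(text) >= BUDGETS[name]
    }


# Hypothetical workspace contents.
workspace = {"SOUL.md": "x" * 800, "MEMORY.md": "x" * 4000}
print(over_budget(workspace))  # prints {'MEMORY.md': 1000}
```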

Reduction tactics

Merge redundant files (IDENTITY.md + USER.md → SOUL.md)
Move non-essential docs to subfolders (docs/, notes/) — not auto-injected
Collapse exploration into decisions: keep "what we decided", delete "how we got here"
Prune ghost context: references to old paths, removed tools, fixed bugs
Deduplicate: info in SOUL.md shouldn't repeat in MEMORY.md or AGENTS.md

Daily memory rules

Write: decisions + why (1 line), new tools/config, lessons, user preferences.
Skip: exploration steps, command outputs, things already in MEMORY.md, delivered content.
Format: Bullets, not paragraphs. One fact per line.

Part 3: Behavioral Patterns

6. `/compact` after each topic (most effective manual action)

/compact Focus on [topic summary]

7. `/new` when switching topics entirely

Context resets to 0. Don't carry 200k into unrelated work.

8. Subagents for tool-heavy work

Spawn subagents (cheaper model) for: codebase grep, reading 5+ files, research/web fetch. Tool results stay isolated.

9. Tool output discipline

Truncate: | head -20, | jq '.key'
Request only needed fields from APIs
Never paste full JSON when you need one value
Output >50 lines → summarize, don't quote
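When output is captured programmatically rather than piped through `head`, the same discipline can be applied in code. The helper below is a minimal sketch; `clip_output` is a hypothetical name, and the 20-line default mirrors the `| head -20` example above.

```python
def clip_output(text: str, max_lines: int = 20) -> str:
    """Keep the first `max_lines` lines and note how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    omitted = len(lines) - max_lines
    return "\n".join(lines[:max_lines]) + f"\n[... {omitted} more lines omitted]"


# Hypothetical 60-line command output clipped to the first 20 lines.
raw = "\n".join(f"line {i}" for i in range(1, 61))
clipped = clip_output(raw)
print(clipped.splitlines()[-1])  # prints "[... 40 more lines omitted]"
```

The trailing marker tells the reader that output was dropped, so a truncated result is not mistaken for a complete one.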

10. File loading discipline

Startup: only today + yesterday memory files
Read SKILL.md only when task needs that skill
Don't re-read files already in context

Part 4: Context Budgeting

Information partitioning

Budget	Content
10%	Task instructions + constraints
40%	Recent 5-10 turns of dialogue
20%	Decision logs ("tried X, failed because Y")
20%	High-relevance MEMORY.md snippets
10%	Tool schemas + system prompt

Compaction survival

Before compaction hits, critical state must be captured:

WAL Protocol: On corrections, decisions, specific values → write to SESSION-STATE.md before responding
Working buffer: At 60%+ context → append exchange summaries to memory/working-buffer.md
Recovery: After compaction, read buffer + session state first. Never ask "where were we?"
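The write-before-respond ordering (WAL presumably stands for write-ahead log) can be sketched as below. This is an illustration, not OpenClaw's own mechanism: `record_state` and `respond` are hypothetical helpers, and the entry format is an assumption. Only the file name SESSION-STATE.md comes from the text above.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_state(entry: str, state_file: Path = Path("SESSION-STATE.md")) -> None:
    """Append a timestamped bullet to the state file."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    with state_file.open("a", encoding="utf-8") as f:
        f.write(f"- [{stamp}] {entry}\n")


def respond(reply: str, decisions: list[str], state_file: Path = Path("SESSION-STATE.md")) -> str:
    """Write-ahead: persist every decision first, then return the reply."""
    for decision in decisions:
        record_state(decision, state_file)
    return reply


# Usage with a throwaway directory so the sketch does not touch a real workspace.
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "SESSION-STATE.md"
    answer = respond("Done.", ["Compaction model set to Sonnet"], state)
    saved = state.read_text(encoding="utf-8")
```

Because the state is written before the reply is returned, a compaction that lands between turns cannot drop a decision that was already acknowledged.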

Session lifespan

After 85% context or 3+ compactions → start fresh with /new. Good MEMORY.md means minimal context loss.

Part 5: Codebase Map Caching (Atris Pattern)

Problem: Every code review or exploration session re-reads the same files, burning tokens on repeated grep/read calls. A 500-file project can cost 50k+ tokens just for navigation.

Solution: Generate a persistent codebase map (atris/MAP.md) once, reuse across sessions.

Setup (one-time per project)

# Install atris skill
npx clawhub@latest install atris

# Or manually: create atris/ folder in project root, then scan
# (single quotes keep interactive bash from history-expanding the "!" in the globs)
rg '^(export|function|class|const|def |async def |router\.|app\.)' \
  --line-number -g '!node_modules' -g '!.git' -g '!dist' -g '!.env*'

MAP.md structure

# MAP.md — [Project] Navigation Guide
> Last updated: YYYY-MM-DD

## Quick Reference
- `src/index.ts:1` — App entry point
- `src/routes/auth.ts:15` — POST /login handler
- `src/models/user.ts:8` — User schema

### Feature: Authentication
- **Entry:** `src/auth/login.ts:45-89` (handleLogin)
- **Validation:** `src/auth/validate.ts:12` (validateToken)
- **Routes:** `src/routes/auth.ts:5-28`

### Feature: Billing
- **Controller:** `src/controllers/billing.ts:20`
- **Service:** `src/services/stripe.ts:1-45`

MAP-first rule

Before searching codebase:

Read atris/MAP.md — found? Go directly to file:line
Not found? Search with rg, then add result to MAP.md

Map gets smarter every session. Never let a discovery go unrecorded.
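The lookup half of the MAP-first rule can be sketched as a small parser over the MAP.md format shown earlier. The regex and the `lookup` function are illustrative assumptions about how entries are written (a backticked `path:line` or `path:start-end`), not part of the atris skill itself.

```python
import re

# Matches backticked references such as `src/auth/login.ts:45-89` or `src/index.ts:1`.
ENTRY = re.compile(r"`([^`:]+):(\d+)(?:-\d+)?`")


def lookup(map_text: str, keyword: str) -> list[tuple[str, int]]:
    """Return (path, start_line) for every MAP.md line mentioning `keyword`."""
    hits = []
    for line in map_text.splitlines():
        if keyword.lower() in line.lower():
            match = ENTRY.search(line)
            if match:
                hits.append((match.group(1), int(match.group(2))))
    return hits


sample = """\
- **Entry:** `src/auth/login.ts:45-89` (handleLogin)
- **Routes:** `src/routes/auth.ts:5-28`
- **Service:** `src/services/stripe.ts:1-45`
"""
print(lookup(sample, "handleLogin"))  # prints [('src/auth/login.ts', 45)]
```

An empty result is the signal to fall back to `rg` and then record the discovery in MAP.md.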

Keeping fresh

New file → add to relevant section
Deleted file → remove from map
Major refactor → regenerate affected sections only
Small updates, not full regeneration

Token savings

Codebase	Without map	With map	Savings
Small (50 files)	~5k tokens/explore	~1k	80%
Medium (200 files)	~20k tokens/explore	~3k	85%
Large (500+ files)	~50k tokens/explore	~5k	90%

Diagnostics

/context list    → token count per injected file
/context detail  → full breakdown (tools, skills, system)
/usage tokens    → append token count to every reply
/usage cost      → cumulative cost summary
/status          → model, context %, cost estimate

Decision Matrix

Situation	Action
Session >100k context	`/compact` immediately
Switching topics	`/new` or `/compact`
Reading 5+ files	Spawn subagent
Compaction cost high	Set compaction model to Sonnet
Daily cost >$10	Audit session count, compact more
Cache writes spiking	Heartbeat ≤55min, keep `cacheRetention: long`
Workspace injection >20KB	Merge/move files, set `bootstrapMaxChars`
Context >85%	`/new` — start fresh
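The context-related rows of this matrix, together with the "3+ compactions" rule from the session lifespan section, reduce to a short check. This is a sketch: `next_action` is a hypothetical helper, and the thresholds are the ones quoted in this guide.

```python
def next_action(context_tokens: int, context_pct: float, compactions: int = 0) -> str:
    """Pick a session action from the context-related rows of the matrix."""
    if context_pct > 85 or compactions >= 3:
        return "/new"  # start fresh
    if context_tokens > 100_000:
        return "/compact"
    return "continue"


print(next_action(120_000, 60))  # prints "/compact"
```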

Impact Summary

Technique	Savings	UX Impact
Compaction model = Sonnet	~80% compaction cost	None
Workspace file budgets	~30-50% base cost	None
`/compact` after topics	~40-60% cache read	Manual step
Cache TTL heartbeat (55m)	~20-30% cache writes	None
Bootstrap size limits	~20-30% base cost	None
Subagent delegation	~30% cache read	Better (parallel)
Tool output discipline	~10-20% per turn	Requires habit
`/new` for new topics	~100% (reset)	Lose old context
Codebase map (Atris)	~80-90% code exploration	One-time setup

Credits

Incorporates ideas from: openclaw-token-optimizer, context-slimmer, context-budgeting, compaction-survival, context-hygiene, atris (codebase map caching).

Usage Guidance

This is an instruction-only optimization guide (no code shipped, no credentials requested). It's coherent with its purpose, but before applying recommendations: (1) back up openclaw.json and workspace files before making config changes; (2) verify Anthropic pricing/metrics against official docs—numbers in the guide may be estimates; (3) confirm availability of any referenced tools (npx, rg/ripgrep, clawhub) before running suggested commands; (4) be cautious about persisting sensitive data to MEMORY.md or MAP.md (avoid storing PII/credentials); and (5) note that following the npx/clawhub suggestion will install third-party code from npm — inspect that package before running it.

Capability Analysis

Type: OpenClaw Skill
Name: anthropic-token-optimizer
Version: 1.3.0

The 'anthropic-token-optimizer' skill bundle provides legitimate documentation and configuration strategies for reducing Anthropic API costs. It includes helpful advice on context pruning, compaction models, and workspace hygiene, along with standard shell commands (e.g., using 'ripgrep' in SKILL.md) to create codebase maps for efficiency. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found.

Capability Assessment

✓ Purpose & Capability

Name and description claim token-cost/compaction/workspace guidance and the SKILL.md contains only guidance and concrete config/file-management suggestions that match that purpose. No unrelated credentials, binaries, or system-level access are requested.

ℹ Instruction Scope

Instructions focus on config changes (openclaw.json), workspace hygiene, compaction/heartbeat patterns, and creating/maintaining files (MEMORY.md, SESSION-STATE.md, atris/MAP.md). This is within scope, but the doc also suggests running external tools (npx, rg, clawhub, grep) and in-chat slash commands like `/context list`, none of which are declared in the registry metadata. Users should expect the agent or operator to run these and should confirm the binaries exist. The skill will recommend writing and reading local workspace files (intended behavior), so verify that no sensitive secrets will be persisted accidentally.

ℹ Install Mechanism

There is no install spec in the registry (instruction-only). The SKILL.md suggests a manual npx-based installation step for the optional 'atris' pattern (npx clawhub@latest install atris). That is a manual command that would fetch code from npm when run; the skill itself does not automatically download or install anything. Users should be aware that following that suggestion will run a network-installed package.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. All recommended actions are local config edits and file hygiene; the lack of requested secrets is proportional to the described functionality.

✓ Persistence & Privilege

Skill is user-invocable, not always-on, and does not request persistent privileges or modify other skills. It advises creating/maintaining local files but does not itself persist or alter agent-wide settings via an install spec.

Version History

v1.3.0

v1.3.0: Added codebase map caching (Atris pattern) — scan once, navigate forever. 80-90% token savings on code exploration. MAP-first rule prevents repeated grep/read.

v1.2.0

v1.2.0: Merged best practices from 6 community skills — workspace file budgets, context budgeting (info partitioning), compaction survival (WAL protocol + working buffer), tool output discipline, file loading discipline, daily memory rules, session lifespan guidance.

v1.1.0

v1.1.0: Added cache TTL heartbeat alignment, bootstrap size limits, OpenClaw diagnostics commands, fork credit. Enhanced pricing table with Output costs.

v1.0.0

Initial release: cache read optimization, compaction model config, behavioral patterns

Metadata

Slug anthropic-token-optimizer

Version 1.3.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is Anthropic Token Optimizer?

Reduce Anthropic API costs (cache read, compaction, context bloat) for OpenClaw agents. Use when users ask about token optimization, reducing API costs, cach... It is an AI Agent Skill for Claude Code / OpenClaw, with 112 downloads so far.

How do I install Anthropic Token Optimizer?

Run "/install anthropic-token-optimizer" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Anthropic Token Optimizer free?

Yes, Anthropic Token Optimizer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Anthropic Token Optimizer support?

Anthropic Token Optimizer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Anthropic Token Optimizer?

It is built and maintained by ngocgd (@ngocgd); the current version is v1.3.0.
