The Complete Claude Guide: From API to Production AI Agents
The most comprehensive Claude technical book, covering model selection, the Messages API with 27 parameters, Extended/Adaptive Thinking, Tool Use/Computer Use, MCP protocol, Managed Agents, Claude Code (Skills/Plugins/Hooks/Sub-agents), Plugin development and publishing, Bedrock/Vertex/Azure multi-cloud integration, Admin API enterprise management, and Constitutional AI safety compliance. 80 chapters: the definitive guide for developers moving from 'calling the API' to 'building production AI systems'.
80 Chapters
Free Forever
Table of Contents
Ch01
Constitutional AI and Anthropic's Safety Philosophy: Why Claude Is Different from Other LLMs
Constitutional AI mechanics, RLHF vs RLAIF, Anthropic's safety-first philosophy, Model Spec and RSP impacts on developers
Ch02
Model Family Deep Dive: Opus 4.7 / Sonnet 4.6 / Haiku 4.5 Selection Decision Tree
Capability benchmarks, context windows, pricing, knowledge cutoffs, and use-case comparisons for all three models, with a complete selection decision tree
Ch03
Token Economics: Precise Calculation and Cost Estimation for Input/Output/Thinking/Cache Tokens
Billing rules for each token type, MTok pricing table, cost estimation formulas, monthly bill control strategies, and the key mechanism of cache reads not counting toward rate limits
Ch04
Service Tiers and Rate Limits: Complete Guide to Priority / Standard / Batch Three-Tier SLA
Tier 1-4 promotion conditions, three-dimensional rate limits (RPM/ITPM/OTPM), Priority Tier 99.5% SLA application process, and Batch tier 50% discount mechanism
Ch05
Seven-Language SDK Complete Guide: Python / TypeScript / Java / Go / C# / Ruby / PHP
Installation, authentication, basic calls, async streaming, and error handling patterns for all seven official SDKs, with production best practices for each language
Ch06
OpenAI-Compatible Endpoint and Zero-Change Migration: Complete Guide from GPT-4 to Claude
Anthropic's OpenAI-compatible API endpoint config, parameter mapping tables, incompatibility handling, migration checklist, and cost comparison calculations
Ch07
Messages API Complete Parameter Reference: All 27 Parameters, Defaults, and Best Practices
Line-by-line breakdown of all parameters including model, max_tokens, messages, system, temperature, top_p, top_k, stop_sequences, stream, thinking, tools, tool_choice, metadata, service_tier, inference_geo
Ch08
Multi-Turn Conversation Design: Context Trimming, State Management and 200K Window Optimization
Conversation history trimming algorithms, sliding window strategies, key information retention, assistant prefill migration, and token budget allocation across turns
Ch09
System Prompt Engineering: Complete Methodology for Role Injection, Constraint Design and Persona Consistency
System prompt mechanics, Operator/User permission hierarchy, role definition templates, constraint writing patterns, and multilingual persona consistency strategies
Ch10
Streaming Output: SSE Protocol, Resumable Connections and Front-End Real-Time Rendering in Practice
SSE protocol deep dive, async stream implementations in Python/TypeScript, six content_block_delta types, resumable connection design, and real-time rendering in React/Vue
Ch11
Prompt Caching Deep Dive: 5-Minute/1-Hour TTL, Four Breakpoints, Complete Strategy for 90% Cost Savings
Two caching implementations (automatic/explicit), TTL pricing comparison, four breakpoint design, cache invalidation triggers, usage metric interpretation, and throughput boost from cache reads not counting toward ITPM
Ch12
Structured Outputs: JSON Schema Enforcement, Pydantic/Zod Integration and Production Pitfalls
Two implementation approaches (output_config vs SDK parse()), supported JSON Schema features, handling unsupported features, incompatibility with Citations, 24-hour Grammar cache, and PHI restrictions in HIPAA contexts
Ch13
Token Counting API + Batch API: Free Count Estimation and 50% Cost Savings with Batch Processing
Free counting, rate limits, and accuracy notes for Token Counting API; workflow, JSONL result format, 100K request batch limit, 300K output beta header, and 29-day data retention for Batch API
Ch14
Prefill Deprecation Migration + Effort Parameter: Two Must-Know Changes When Upgrading to Claude 4.x
Why prefill returns 400 in Claude 4.6+, migration alternatives using Structured Outputs and System Prompts, and the five effort levels (low/medium/high/xhigh/max) vs budget_tokens differences
Ch15
Extended Thinking Deep Dive: budget_tokens, display Modes and Multi-Turn Propagation Mechanics
Complete Extended Thinking configuration, summarized vs omitted display mode comparison, thinking block propagation rules in multi-turn dialogs, tool use compatibility constraints, and Interleaved Thinking beta
Ch16
Adaptive Thinking: Opus 4.7 Adaptive Reasoning and Interleaved Thinking in Practice
How Adaptive Thinking works (model autonomously determines reasoning depth), fundamental differences from Extended Thinking, Interleaved Thinking between tool calls, and special token billing rules for display:summarized
Ch17
Long Context Strategies: Handling 1M Token Windows, 100-Page PDFs and 600 Images
Opus 4.7's 1M context window and new tokenizer, PDF limits (max 50/request), image token billing, chunked summarization for ultra-long documents, and combining Memory Tool with Compaction
Ch18
Complete Tool Use Guide: Single Tool, Parallel Calls, Tool Chain Design Patterns and Token Overhead Calculation
Three elements of tool definitions, four tool_choice strategies, parallel tool call design, tool chain orchestration, 346/313 token system prompt overhead, and strict:true structured tool calls
Ch19
Tool Search Tool: BM25 + Regex Dual-Engine Dynamic Loading for 10,000-Tool Libraries
How defer_loading:true works, BM25 relevance-ranked keyword search vs regex exact match, organizational strategies for large tool libraries, token savings calculation, and combination with Programmatic Tool Calling
Ch20
Advisor Tool: Executor + Advisor Dual-Model Single API Call for High-Quality Long-Range Tasks
Advisor Tool architecture (Haiku/Sonnet executor + Opus advisor), advisor_20260301 tool type configuration, applicable scenarios (code review/security analysis/complex decisions), and cost-quality tradeoff analysis
Ch21
Server Tools in Practice: The web_search / web_fetch / code_execution Trio
Configuration for all three Server Tools, web_search additional billing, web_fetch content extraction strategies, code_execution container environment and supported languages, and mixing Server and Client Tools
Ch22
Programmatic Tool Calling: Direct Tool Calls Inside Code Execution Containers to Reduce Round-Trips
allowed_callers configuration, tool call execution flow inside containers, actual benefits of reducing API round-trips, performance comparison with regular tool calls, and applicable scenarios and limitations
Ch23
Citations API: Source Attribution for Document QA Systems and Handling Incompatibility with Structured Outputs
citations:enabled configuration, three document type citation formats, special rule for cited_text not counting tokens, and solutions for mutual exclusion with Structured Outputs
Ch24
Files API: Cross-Request PDF/Image Reuse, 500MB File Handling and ZDR Limitations
Five Files API operations, supported file types and corresponding content blocks, 500MB single-file and 500GB organization-total limits, and workarounds for Bedrock/Vertex incompatibility
Ch25
Computer Use: Complete Practical Guide and Security Protection for Screenshot Control, Browser Automation and Desktop Operations
All operation types for computer_20251124 tool, enable_zoom feature, 466-735 token overhead, minimal-privilege VM configuration, Prompt Injection protection, and combining with bash/str_replace_editor
Ch26
Memory Tool: Complete Mechanism of memory_20250818 for Cross-Session Persistent Memory
Memory Tool mechanics (reading/writing the /memories directory), six operation commands, Python BetaAbstractMemoryTool custom backends, and three storage options (filesystem/database/cloud)
Ch27
Context Editing + Compaction: Complete Strategy for Selective History Clearing and Server-Side Auto-Summarization
Three clearing types in Context Editing, compact-2026-01-12 beta server summarization, Memory Tool and Compaction combination patterns, and complete solution for long conversation cost control
Ch28
Multi-Session Software Development Pattern: Three-Phase Architecture of Initializer / Subsequent / End-of-session
Design rationale for the three-phase pattern, Initializer session responsibilities, Subsequent session state restoration, standardized End-of-session update process, and integration with Claude Code CLAUDE.md
Ch29
Claude Managed Agents Overview: Three-Layer Architecture of Sessions / Agents / Environments APIs
Managed Agents as a standalone product line, fundamental differences from the regular Messages API, managed-agents-2026-04-01 beta header, and responsibility boundaries of the three-layer API
Ch30
Sessions API Deep Dive: Persistent Session State Management and Multi-Turn Agent Task Orchestration
Complete lifecycle of Sessions API (create/query/update/terminate), session state persistence, cross-request context maintenance, session timeout and recovery strategies, and integration with Memory Tool
Ch31
Agents API: Managed Agent Lifecycle Control and Concurrent Agent Orchestration
Agent definitions in Agents API, start/pause/resume/terminate operations, resource isolation for concurrent agents, inter-agent communication, and parent-child agent task decomposition patterns
Ch32
Environments API: Containerized Execution Environment Configuration and Persistent Workspaces
Container configuration in Environments API (dependency installation/environment variables/filesystem), persistent workspace lifecycle management, isolation strategies for multi-agent shared environments, and relationship with the code_execution tool
Ch33
Managed Agents in Production: Error Recovery, Monitoring and Cost Control
Error classification and recovery for Managed Agents, task timeout handling, agent state monitoring and alerting, cost attribution for Sessions/Agents/Environments, and production best practices checklist
Ch34
MCP Protocol Internals: Host / Client / Server Triangular Architecture and JSON-RPC 2.0 Message Format
MCP design motivation, Host/Client/Server role responsibilities, JSON-RPC 2.0 message structure, Stdio vs Streamable HTTP transport options, and protocol version history
Ch35
MCP Three Core Primitives: Design Specifications and Implementation Patterns for Tools / Resources / Prompts
Tools list/call interface specs, Resources subscription and push mechanics, Prompts template parameterization, Sampling client primitive, Elicitation user input requests, and Notifications server-side push
Ch36
Building Your Own MCP Server: Complete Implementation with Python/TypeScript SDK and Debug Deployment
MCP SDK installation and initialization, implementation code for all three primitive types, MCP Inspector debugging tool, npm/PyPI packaging, and Docker containerized deployment
Ch37
Official MCP Server Ecosystem: Core Servers Including filesystem / postgres / redis / puppeteer
Overview of modelcontextprotocol/servers repository, configuration and usage of core servers, and Claude Desktop local MCP configuration file format
Ch38
Messages API Direct Remote MCP Connection: mcp_servers Parameter in Practice with OAuth Authentication
mcp-client-2025-11-20 beta header, mcp_servers configuration format, MCP Toolset tool filtering, OAuth token passing, and current limitations (Tools only/HTTP only/no Bedrock support)
Ch39
Claude Code Installation and Core Workflows: CLI / Desktop / VS Code / JetBrains Across All Platforms
Four installation methods, cross-platform feature differences, core workflows (task decomposition/file editing/code review/test running), and authentication options (claude.ai account vs API Key)
Ch40
CLAUDE.md System: Complete Design of Three-Layer Configuration, Auto Memory and Path-Specific Rules
Priority and use cases for global/project/local CLAUDE.md layers, Auto Memory learning mechanism, path-specific rules, and CLAUDE.md best practice templates
Ch41
Slash Commands Complete Reference: All Built-in Commands and Bundled Skills Usage
Complete breakdown of all built-in slash commands, Bundled Skills descriptions and parameters, and command argument syntax
Ch42
Skills Development: SKILL.md Complete Parameters, Dynamic Context Injection and Invocation Control Matrix
All 20+ SKILL.md frontmatter parameters, !`command` dynamic context injection, invocation control matrix with disable-model-invocation/user-invocable, and skill content lifecycle with post-compaction retention
Ch43
Hooks System: Complete Guide to 26 Events, 5 Hook Types and stdin-stdout Protocol
26 Hook event categories (session/turn/tool/filesystem/agent lifecycle), five hook types (command/http/mcp_tool/prompt/agent), exit code semantics, and PreToolUse fine-grained interception and input modification
Ch44
Sub-agents Multi-Agent Collaboration: Definitions, Built-in Types and Agent Teams Concurrent Orchestration
Complete sub-agent Markdown frontmatter parameters, built-in agent types (Explore/Plan/general-purpose), Agent Teams P2P communication, and isolation:worktree code isolation
Ch45
Permission System: 5-Layer Priority, Rule Syntax and Complete Sandbox Configuration Guide
Five-layer permission priority (managed/CLI/local/project/user), rule syntax for Bash/Read/Edit/WebFetch/Skill/MCP/Agent, Glob pattern permissions, and macOS Sandbox process-level isolation configuration
Ch46
Claude Code Analytics API: Team Usage Tracking and Productivity Metrics Monitoring
Admin API Key acquisition and usage, per-user/workspace Claude Code usage tracking, productivity metrics (code generation volume/acceptance rate/time saved), and relationship with Usage Report API
Ch47
Claude Code SDK Mode: --print, JSON Output, Programmatic Invocation and Automation Integration
--print non-interactive mode, JSON output format, subprocess claude CLI invocation, programmatic task submission and result retrieval, and automation integration patterns with CI systems
Ch48
Claude Code + CI/CD: PR Auto-Review, Issue Handling and Complete GitHub Actions Configuration
Claude Code configuration in GitHub Actions, PR auto-trigger review workflow, Issue auto-handling and label classification, code quality gate integration, GitLab CI/CD configuration differences, and secure API Key management
Ch49
Plugin vs Skill: Core Differences, Capability Boundaries and Ten Plugin-Exclusive Capabilities
Namespace differences, distribution method comparison, and Plugin's ten exclusive capabilities (MCP bundling/Hook injection/LSP/Monitor/PATH injection/themes/dependency management/userConfig collection/message channels/persistent data)
Ch50
plugin.json Complete Parameter Reference: Directory Structure, Versioning Strategy and userConfig Sensitive Data Collection
Complete plugin.json schema, plugin directory structure overview (including the most common .claude-plugin placement error), explicit version vs Commit SHA strategy, sensitive:true keychain storage for userConfig, and CLAUDE_PLUGIN_ROOT environment variables
Ch51
Plugin Bundling MCP Servers: Install-and-Launch Implementation and Dynamic userConfig Configuration
.mcp.json configuration within plugins, collecting API tokens via userConfig and injecting into MCP environment variables, multiple MCP server dependency declarations, and automatic MCP server shutdown on plugin uninstall
Ch52
Hooks in Plugins: Injecting Lifecycle Events for Deterministic Engineering Control
hooks/hooks.json configuration within plugins, priority relationship with global hooks, CLAUDE_PLUGIN_ROOT usage in hook scripts, and practical cases (dangerous command interception/auto-formatting/audit logging/external approval system integration)
Ch53
LSP Server Plugins: Integrating Nine Official Language Servers and Custom LSP Configuration
Complete .lsp.json configuration format, nine official LSP plugins, and capability differences between LSP and grep-based code navigation
Ch54
Background Monitor: Implementation Mechanism for Claude to Proactively Respond to Log and State Changes
monitors/monitors.json configuration format, always vs on-skill-invoke startup modes, Monitor stdout as Claude notification mechanism, and production use cases (deployment monitoring/error log tracking/CI status awareness)
Ch55
Private/Team Marketplace: marketplace.json and Five Plugin Source Types
Complete marketplace.json format, five source types, team marketplace publish and management commands, and allowCrossMarketplaceDependenciesOn cross-marketplace dependencies
Ch56
Submitting to the Official Marketplace: claude-plugins-official Review Process and Analysis of 101 Official Plugins
Dual submission channels for the official marketplace, review criteria and common rejection reasons, categorized analysis of 101 official plugins, reserved name list, and plugin growth strategies
Ch57
Claude.ai Connector Development: Complete Remote MCP + OAuth 2.1 Integration Guide
Core differences between Connector and Plugin, Streamable HTTP implementation, complete OAuth 2.1 + PKCE flow with sequence diagram, Dynamic Client Registration, claude.ai callback URL, tool result size limits, and submission to Connectors Directory
Ch58
Browser Extension Development: Manifest V3 + Service Worker for Secure Claude API Calls
MV3 architecture and Service Worker cross-origin advantages, secure API Key storage in chrome.storage.local, Content Script Prompt Injection risk (research reporting a 23.6% attack success rate), and Native Messaging Host communication with the Claude Code extension
Ch59
Amazon Bedrock (Mantle) Integration: Complete Guide for SigV4 Authentication, Regional Endpoints and Quota Requests
Bedrock Mantle new endpoint format, three authentication methods (SigV4/IAM role/Bearer Token), global vs regional endpoints (10% surcharge), default 2M TPM quota application, and workarounds for unsupported features
Ch60
Google Vertex AI Integration: Multi-Region Endpoints, Data Residency and Feature Differences from Direct API
Vertex AI global/EU/US/AP multi-region endpoint configuration, feature differences from direct API (no Structured Outputs), IAM authentication, and Vertex AI SDK vs anthropic[vertex] selection
Ch61
Azure AI Foundry Integration: Deployment Configuration, Structured Outputs Beta and Enterprise Entra ID Integration
Claude deployment configuration on Azure AI Foundry, Structured Outputs Beta support, complete Microsoft Entra ID OAuth integration for Remote MCP, and feature comparison table with Bedrock/Vertex
Ch62
LangChain / LlamaIndex / Vercel AI SDK: Three Framework Integration Practice and Performance Comparison
Code practices for all three integrations, framework differences in Streaming/Tool Calling/Structured Output support, and a complete Next.js + Vercel AI SDK RAG example
Ch63
n8n / Make / Zapier Zero-Code Workflow Integration: Automation Scenarios and Webhook Trigger Design
Claude node configuration comparison across three platforms, common automation scenarios (email processing/content generation/data extraction), Webhook trigger design and error handling, and capability boundary with custom MCP Servers
Ch64
Admin API: Complete Enterprise Management Guide for Organizations / Members / Workspaces / API Keys
sk-ant-admin key acquisition and scope, organization member management (GET/POST/DELETE), invite management, workspace creation and member assignment, API key listing and status management, and five role types
Ch65
Usage & Cost API: Usage Reports, Cost Tracking and Integration with Datadog and Other Observability Platforms
Time bucket aggregation (1m/1h/1d) for usage reports, grouping and filtering by model/workspace/API key/service tier, USD cost details in cost reports, and official integrations with Datadog/Grafana Cloud/Honeycomb/CloudZero/Vantage
Ch66
Data Residency and Compliance: Complete Enterprise Configuration for inference_geo / ZDR / HIPAA
inference_geo parameter (us/global) 10% surcharge mechanism, ZDR coverage scope and exclusions, HIPAA BAA acquisition process, PHI restrictions in schema definitions, and HIPAA-unsupported feature list
Ch67
SSO/SAML + SCIM: Enterprise Integration with Okta / Microsoft Entra ID / Google Workspace
WorkOS as Anthropic's identity provider architecture, SAML 2.0/OIDC configuration steps, specific configs for major IdPs (Okta/Entra ID/Google Workspace/JumpCloud), and SCIM automated user management with role mapping
Ch68
Model Spec and Responsible Scaling Policy: Impact on Developer System Design
Model Spec four-level priority (safety > ethics > Anthropic principles > helpfulness), Operator/User permission hierarchy in system design, RSP AI Safety Level impacts on model capability limits, and developer compliance checklist
Ch69
Content Policy and Usage Guidelines: Absolute Prohibitions, High-Risk Use Cases and Operator Permission Boundaries
Five categories of absolute prohibitions, human review requirements for high-risk use cases, expandable vs non-expandable Operator permission boundaries, and violation handling process
Ch70
Prompt Injection Defense: Attack Vectors, Input Sanitization, Built-in Classifiers and Defense Architecture Design
Direct vs indirect injection attack mechanics, case analysis of the 23.6% attack success rate for Content Scripts, Claude's built-in Prompt Injection classifier, input sanitization strategies, and system-level defense architecture (sandbox isolation/whitelisting/double confirmation)
Ch71
Production Best Practices: Multi-Key Rotation, Exponential Backoff, Circuit Breaker Design and Reliability Architecture
Multi-API Key priority rotation strategy, exponential backoff for 429/quota errors, circuit breaker pattern to prevent cascading failures, request queue design, fallback strategies (Haiku model fallback), and health check with SLA monitoring
Ch72
Complete Cost Optimization Playbook: Combined Strategies for Model Routing, Caching and Batch Processing with ROI Calculation
Three-layer cost optimization framework, dynamic routing decision tree for Haiku/Sonnet/Opus, cache hit rate optimization techniques, Batch scenario identification and scheduling, and complete monthly cost control calculation model
Ch73
Workbench + Evaluation Frameworks: Prompt Generator, Datadog Monitoring and Systematic Eval Methodology
Console Workbench's three tools (Prompt Generator/Improver/Templates), Datadog/Grafana Cloud official integration configuration, A/B testing framework design, and systematic Eval methodology (test set construction/scoring criteria/regression detection)
Ch74
Claude-Specific Prompt Techniques: Practical Handbook for XML Tags, Document Positioning and Parallel Tool Templates
Correct XML tag usage (structured I/O/output control/role separation), why document-first positioning can improve performance by up to 30%, multi-document <documents> structure, quote-then-analyze pattern, and parallel tool call prompt templates
Ch75
Model Behavior Tuning: Opus 4.7 Literal Execution, Default Style Override and Subagent Frequency Control
Design rationale and strategies for Opus 4.7's literal instruction execution, overriding Opus 4.7/4.6 default frontend style (cream color/serif fonts), controlling subagent auto-spawning frequency, and prompt adaptation differences across models
Ch76
Long Document Processing Prompt Best Practices: Chunking Strategies, Recursive Summarization and RAG Retrieval Augmentation
Document chunking algorithms (fixed-size/semantic/recursive), hierarchical compression via recursive summarization, RAG retrieval strategies (vector/BM25/hybrid), Files API integration, and prompt design for multi-document fusion summarization
Ch77
Fast Mode in Practice: 2.5x Speed, 6x Price and Scenario Judgment for Opus 4.6 High-Speed Inference
fast-mode-2026-02-01 beta application process, speed:fast parameter configuration, $30/$150 per MTok cost-benefit analysis, Fast Mode suitable scenarios (real-time conversation/low-latency requirements/user wait sensitivity), and latency-cost comparison with Haiku
Ch78
Case Study 1: Enterprise Code Review System (Complete Architecture with Plugin + Hook + LSP + CI/CD)
Requirements analysis (multilingual codebase/team standards enforcement/PR pipeline integration), Plugin design (LSP for semantic analysis/Hooks for dangerous operation interception/Skills for review templates), GitHub Actions integration, and cost and efficiency metrics
Ch79
Case Study 2: Enterprise Knowledge Base Agent (Complete Implementation with RAG + Memory Tool + Managed Agents)
Knowledge base architecture design (document indexing/vector storage/incremental updates), Memory Tool for user preferences and history, Managed Agents Sessions API for long-term state, multi-agent division (retrieval/generation/quality agents), and production deployment architecture
Ch80
Case Study 3: Multilingual Content Production Pipeline (Batch + Structured Outputs + Connector Automation)
Business scenario for global content production (10 languages/1000 articles daily/quality consistency), Batch API for bulk translation and localization, Structured Outputs for enforced formatting, Connector-triggered auto-publishing to CMS, and optimization path reducing cost from $0.80 to $0.08 per article