Description

Use when building AI-powered features with the Claude API or Anthropic SDK — structured outputs, tool calling, streaming, multi-provider routing, multi-agent...

Usage Instructions (SKILL.md)

AI Integration Skill

Name: Ai Integration
Author: jimmy974

Comprehensive patterns for integrating the Anthropic Claude API into production systems — from basic API calls to full multi-agent orchestration with state management, memory, and evaluation.

When to Use This Skill

Activate when:

Building a Claude API integration or wrapper
Implementing structured outputs, tool calling, or streaming
Setting up multi-provider LLM routing (LiteLLM, fallbacks)
Designing multi-agent orchestration or agentic loops
Implementing RAG or persistent agent memory
Evaluating LLM output quality or building evals
Deploying agents to Next.js, Python FastAPI, or Docker

Don't use this skill for:

Kubernetes/Terraform config unrelated to AI infra
General React/Next.js features not involving LLM calls

Core Principles

1. Single Agent vs Multi-Agent

Pattern	When to Use	Cost
Single agent	Linear tasks, simple I/O, <5 steps	Low
Subagent delegation	Parallel tasks, specialized expertise needed	Medium
Multi-agent swarm	Complex autonomous workflows, >10 steps	High — budget like a team

Infrastructure math (2026): Multi-agent compute costs jump ~3x when moving from single to orchestrated swarms. Budget before you build.
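The table and the ~3x rule of thumb above can be turned into a rough planning helper. This is a minimal sketch: the function names are hypothetical, treating the unspecified 5 to 10 step range as subagent delegation is an assumption, and the flat 3x multiplier is just the estimate quoted above, not a measured figure.

```python
def choose_pattern(steps: int, needs_parallelism: bool = False) -> str:
    """Map a task profile onto the three patterns in the table above."""
    if steps > 10:
        return "multi-agent swarm"
    if needs_parallelism or steps >= 5:
        # Assumption: the table leaves 5-10 steps unspecified; treat it as delegation.
        return "subagent delegation"
    return "single agent"


def estimate_orchestrated_cost(single_agent_cost: float, multiplier: float = 3.0) -> float:
    """Apply the ~3x rule of thumb to a measured single-agent cost."""
    return single_agent_cost * multiplier
```

Measuring the single-agent cost on a few representative tasks first gives the multiplier something real to scale.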

2. Agent Communication Patterns

Hub-and-spoke (most common): Orchestrator delegates to specialist agents.

orchestrator
  ├── researcher-agent   (web search, docs)
  ├── coder-agent        (code generation, tests)
  └── reviewer-agent     (quality, security check)

Pipeline: Output of one agent is input to next (linear, predictable).

Swarm: Agents with shared memory, no single orchestrator. Use for exploration tasks.
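The pipeline pattern is the simplest of the three to sketch, because each stage is just a function from text to text. The example below uses stand-in stages instead of real model calls; `run_pipeline` and the three stage functions are hypothetical names for illustration.

```python
from typing import Callable

# An "agent" here is any callable that maps input text to output text.
Agent = Callable[[str], str]


def run_pipeline(stages: list[Agent], task: str) -> str:
    """Feed the output of each stage into the next, in order."""
    result = task
    for stage in stages:
        result = stage(result)
    return result


# Stand-in stages; in practice each would wrap a model call.
def researcher(text: str) -> str:
    return f"notes({text})"


def coder(text: str) -> str:
    return f"code({text})"


def reviewer(text: str) -> str:
    return f"review({text})"
```

Because every stage shares one signature, stages can be reordered or swapped for test doubles without touching the runner.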

3. Context Window Management

import anthropic

client = anthropic.Anthropic()

def sliding_window(messages: list[dict], max_tokens: int = 150_000) -> list[dict]:
    """Drop oldest messages to stay within token budget."""
    # Rough estimate: 1 token ≈ 4 chars (assumes each message's content is a plain string)
    while len(messages) > 2:
        total = sum(len(m["content"]) // 4 for m in messages)
        if total <= max_tokens:
            break
        messages = messages[1:]  # drop the oldest message
    return messages

def summarize_history(messages: list[dict]) -> list[dict]:
    """Compress old turns into a summary to reclaim context budget."""
    if len(messages) <= 4:
        return messages
    history = "\n".join(f"{m['role']}: {m['content']}" for m in messages[:-2])
    summary = client.messages.create(
        model="claude-haiku-4-5", max_tokens=512,
        messages=[{"role": "user", "content": f"Summarize concisely:\n{history}"}],
    ).content[0].text
    return [{"role": "user", "content": f"[Prior context]\n{summary}"}] + messages[-2:]
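Dropping the oldest message one at a time can leave an assistant turn at the front of the history, while the Messages API is generally expected to receive a conversation that opens with a user turn. A small guard, sketched here with the hypothetical name `trim_to_user_start`, can be applied after any trimming step.

```python
def trim_to_user_start(messages: list[dict]) -> list[dict]:
    """Drop leading messages until the history starts with a user turn."""
    for index, message in enumerate(messages):
        if message["role"] == "user":
            return messages[index:]
    return []  # no user turn left at all
```

The function returns a slice and leaves the original list untouched, so it composes with the trimming helpers above.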

Structured Outputs

Pydantic binding with `instructor` (recommended)

import anthropic
import instructor
from pydantic import BaseModel

class Entity(BaseModel):
    name: str
    type: str       # person | org | location | concept
    description: str

class ExtractionResult(BaseModel):
    entities: list[Entity]
    summary: str

# instructor patches the client — returns validated Pydantic models
client = instructor.from_anthropic(anthropic.Anthropic())

result: ExtractionResult = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Extract all entities from:\n{text}"}],
    response_model=ExtractionResult,
)
print(result.entities[0].name)   # fully typed, validated

Schema enforcement without instructor (TypeScript)

import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";

const client = new Anthropic();

const EntitySchema = z.object({
  entities: z.array(z.object({ name: z.string(), type: z.string() })),
  summary: z.string(),
});

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{
    role: "user",
    content: `Extract entities. Respond ONLY with valid JSON matching this schema:
{"entities": [{"name": string, "type": string}], "summary": string}

Text: ${inputText}`,
  }],
});

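// content is a union of block types, so narrow to a text block before reading .text
const first = response.content[0];
if (first.type !== "text") throw new Error("Expected a text block in the response");
const parsed = EntitySchema.parse(JSON.parse(first.text));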
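Prompt-only JSON enforcement can still fail when the model wraps the object in a code fence or adds a sentence around it. The sketch below shows one tolerant way to pull the object out before schema validation. `extract_json` is a hypothetical helper, not part of any SDK, and it only handles a single top-level object.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so the marker never appears literally in this example


def extract_json(text: str) -> dict:
    """Parse a JSON object from model output, tolerating fences and surrounding prose."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    # Keep only the span from the first "{" to the last "}".
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(candidate[start:end + 1])
```

The parsed dict still needs schema validation (Zod, Pydantic, or similar); this step only removes the wrapping.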

Tool Calling (Function Calling)

Parallel tool calls + agentic loop (TypeScript)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  { name: "search_web", description: "Search the web",
    input_schema: { type: "object" as const, properties: { query: { type: "string" } }, required: ["query"] } },
  { name: "read_file", description: "Read a local file",
    input_schema: { type: "object" as const, properties: { path: { type: "string" } }, required: ["path"] } },
];

async function runAgentLoop(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6", max_tokens: 4096,
      tools, tool_choice: { type: "auto" },  // or { type: "tool", name: "search_web" }
      messages,
    });

    // Stop on anything other than a tool request (end_turn, max_tokens, ...);
    // otherwise the loop would send an empty tool_result turn.
    if (response.stop_reason !== "tool_use") {
      return response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("");
    }

    // Claude may call multiple tools in parallel; handle all at once
    const toolUses = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );
    const toolResults = await Promise.all(
      toolUses.map(async (tu) => ({
        type: "tool_result" as const,
        tool_use_id: tu.id,
        // executeTool is not defined in this guide; supply your own dispatcher that returns a string
        content: await executeTool(tu.name, tu.input),
      }))
    );

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}
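The loop above relies on a tool dispatcher that the guide does not define. One possible shape, sketched in Python, is a registry that converts each `tool_use` block into a `tool_result` block and reports failures through the `is_error` field instead of raising. The registry contents and the `run_tool` name are assumptions for illustration, and the `search_web` handler is a stand-in.

```python
from typing import Any, Callable

# Hypothetical registry: tool name -> handler taking the tool input dict.
TOOL_HANDLERS: dict[str, Callable[[dict], str]] = {
    "search_web": lambda args: f"results for {args['query']}",  # stand-in handler
}


def run_tool(tool_use: dict[str, Any]) -> dict[str, Any]:
    """Turn one tool_use block into a tool_result block without raising."""
    result = {"type": "tool_result", "tool_use_id": tool_use["id"]}
    handler = TOOL_HANDLERS.get(tool_use["name"])
    if handler is None:
        return {**result, "content": f"Unknown tool: {tool_use['name']}", "is_error": True}
    try:
        return {**result, "content": handler(tool_use["input"])}
    except Exception as exc:  # report the failure to the model instead of crashing the loop
        return {**result, "content": f"{type(exc).__name__}: {exc}", "is_error": True}
```

Returning errors as tool results lets the model see what went wrong and retry with different input.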

Streaming Responses

Python streaming

import anthropic

client = anthropic.Anthropic()

# Stream text
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    final = stream.get_final_message()
    print(f"\n[{final.usage.input_tokens} in / {final.usage.output_tokens} out tokens]")

TypeScript streaming

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const stream = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 4096,
  stream: true,
  messages: [{ role: "user", content: prompt }],
});

for await (const chunk of stream) {
  if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
    process.stdout.write(chunk.delta.text);
  }
}

Next.js streaming route (plain-text chunks, not SSE framing)

// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";
import { NextRequest } from "next/server";

const client = new Anthropic();

export async function POST(req: NextRequest) {
  const { messages } = await req.json();
  const stream = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    stream: true,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
          controller.enqueue(encoder.encode(chunk.delta.text));
        }
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
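The route above streams raw text with a `text/plain` content type. If a browser client consumes the stream through `EventSource`, each chunk needs Server-Sent Events framing and the response needs a `text/event-stream` content type. The generator below is a minimal sketch of that framing. The `sse_events` name and the closing `done` event are assumptions (a common convention, not part of the SSE format).

```python
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap text chunks in Server-Sent Events framing."""
    for chunk in chunks:
        # Every line of a payload needs its own "data:" field;
        # a blank line terminates the event.
        lines = chunk.split("\n")
        yield "".join(f"data: {line}\n" for line in lines) + "\n"
    # Assumed convention: a final named event so the client knows to close.
    yield "event: done\ndata: [DONE]\n\n"
```

Splitting on newlines matters because a chunk containing a raw line break would otherwise end the event early.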

Multi-Provider Routing (LiteLLM)

from litellm import completion, Router

# Provider-agnostic call — same interface for Claude, OpenAI, Gemini
def llm_call(messages: list[dict], model: str = "claude-sonnet-4-6") -> str:
    response = completion(
        model=model,       # "claude-sonnet-4-6" | "gpt-4o" | "gemini/gemini-1.5-pro"
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

# Automatic fallback: try primary model, fall back on error
response = completion(
    model="claude-opus-4-6",
    messages=messages,
    fallbacks=["claude-sonnet-4-6", "gpt-4o"],
    max_tokens=1024,
)

# Cost-aware routing: route by quality tier
router = Router(model_list=[
    {"model_name": "fast",    "litellm_params": {"model": "claude-haiku-4-5"}},
    {"model_name": "smart",   "litellm_params": {"model": "claude-sonnet-4-6"}},
    {"model_name": "premium", "litellm_params": {"model": "claude-opus-4-6"}},
])

# Pick tier based on task complexity
tier = "fast" if simple_task else "smart"
response = router.completion(model=tier, messages=messages)
print(response.choices[0].message.content)
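To make the fallback behaviour concrete without depending on any provider, the sketch below implements the same idea by hand: try each provider in order and return the first success. It is not how LiteLLM implements `fallbacks` internally. `complete_with_fallbacks` and the provider callables are hypothetical.

```python
from typing import Callable

Provider = tuple[str, Callable[[str], str]]  # (name, function that completes a prompt)


def complete_with_fallbacks(providers: list[Provider], prompt: str) -> tuple[str, str]:
    """Return (provider_name, completion) from the first provider that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would only catch retryable errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the text makes it easy to log how often the fallback path is taken.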

Prompt Versioning

import hashlib

# Version-pinned prompt registry — pin versions to prevent silent regressions
PROMPTS = {
    "summarize:v1": "Summarize in {max_words} words:\n{text}",
    "summarize:v2": "Create a {max_words}-word summary focusing on key decisions:\n{text}",
}

def run_prompt(key: str, **kwargs) -> str:
    template = PROMPTS[key]
    hash_id = hashlib.sha256(template.encode()).hexdigest()[:8]
    # Log key + hash for reproducibility and A/B analysis
    print(f"[prompt] key={key} hash={hash_id}")
    return llm_call([{"role": "user", "content": template.format(**kwargs)}])
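Logging the hash is most useful when something checks it. One way to catch silent edits, sketched below, is to store the expected hash for each key and fail a test when a template drifts. `prompt_hash` and `changed_prompts` are hypothetical helpers that reuse the same 8-character SHA-256 prefix as the registry above.

```python
import hashlib


def prompt_hash(template: str) -> str:
    """Short content hash, matching the 8-character prefix used for logging."""
    return hashlib.sha256(template.encode()).hexdigest()[:8]


def changed_prompts(prompts: dict[str, str], pinned: dict[str, str]) -> list[str]:
    """Return keys whose template no longer matches its pinned hash (or has no pin)."""
    return sorted(
        key for key, template in prompts.items()
        if pinned.get(key) != prompt_hash(template)
    )
```

Running `changed_prompts` in CI turns an accidental prompt edit into a visible failure instead of a quiet regression.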

Multi-Agent Orchestration

Orchestrator pattern (Python)

import anthropic, asyncio

client = anthropic.Anthropic()

AGENTS = {
    "planner":     "Break this task into subtasks. Output JSON: {\"research_tasks\": [], \"code_tasks\": []}",
    "researcher":  "Research the provided topics. Be concise and factual.",
    "coder":       "Write clean, tested Python code for the provided specs.",
    "synthesizer": "Combine these results into a final cohesive answer.",
}

def call_agent(role: str, content: str, model: str = "claude-sonnet-4-6") -> str:
    resp = client.messages.create(
        model=model,
        max_tokens=2048,
        system=AGENTS[role],
        messages=[{"role": "user", "content": content}],
    )
    return resp.content[0].text

async def orchestrate(task: str) -> str:
    """Hub-and-spoke orchestrator: plan → parallel execute → synthesize."""
    import json
    # call_agent is blocking, so run every call in a worker thread
    plan = json.loads(await asyncio.to_thread(call_agent, "planner", task))

    research, code = await asyncio.gather(
        asyncio.to_thread(call_agent, "researcher", str(plan.get("research_tasks", []))),
        asyncio.to_thread(call_agent, "coder",      str(plan.get("code_tasks", []))),
    )
    return await asyncio.to_thread(
        call_agent, "synthesizer", f"Research:\n{research}\n\nCode:\n{code}"
    )

Agent Memory Patterns

Medium-term: SQLite (cross-session)

import sqlite3, json

conn = sqlite3.connect("agent_memory.db")
conn.execute("CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT, updated_at TEXT)")

def remember(key: str, value: dict):
    conn.execute("INSERT OR REPLACE INTO memory VALUES (?, ?, datetime('now'))", [key, json.dumps(value)])
    conn.commit()

def recall(key: str) -> dict | None:
    row = conn.execute("SELECT value FROM memory WHERE key=?", [key]).fetchone()
    return json.loads(row[0]) if row else None

Long-term: Vector DB (semantic search / RAG)

from qdrant_client import QdrantClient
import anthropic

qdrant = QdrantClient(":memory:")
claude = anthropic.Anthropic()

def rag_query(query: str, context_collection: str = "memory") -> str:
    # get_embedding is not defined in this guide; supply your own embedding function
    hits = qdrant.search(collection_name=context_collection,
                         query_vector=get_embedding(query), limit=5)
    context = "\n".join(h.payload["text"] for h in hits)
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"Answer using this context:\n{context}",
        messages=[{"role": "user", "content": query}],
    )
    return response.content[0].text
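For intuition about what the vector search step does, the sketch below ranks documents by cosine similarity in plain Python. It stands in for the retrieval call only. Cosine is an assumption here (the distance metric in a vector database is typically configurable per collection), and the tiny two-dimensional vectors are made up for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], docs: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Return the texts of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda doc: cosine(query, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

A brute-force scan like this is fine for a few thousand documents; a vector database adds indexing so the same query stays fast at larger scale.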

LLM Evaluation

Basic eval harness

import anthropic

def run_eval(cases: list[tuple[str, str]], system_prompt: str) -> dict:
    """cases: list of (input, expected_output) tuples."""
    client = anthropic.Anthropic()
    results = {"pass": 0, "fail": 0, "score": 0.0, "cases": []}
    for inp, expected in cases:
        actual = client.messages.create(
            model="claude-haiku-4-5", max_tokens=512,  # use cheap model for evals
            system=system_prompt,
            messages=[{"role": "user", "content": inp}],
        ).content[0].text.strip()
        passed = actual == expected
        results["pass" if passed else "fail"] += 1
        results["cases"].append({"input": inp, "actual": actual, "pass": passed})
    results["score"] = results["pass"] / len(cases) if cases else 0.0
    return results
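Exact string equality, as used in the harness above, fails on harmless differences such as casing or a trailing period. A lenient comparison is one option; the sketch below normalizes both sides first. `normalize` and `lenient_match` are hypothetical helpers, and the normalization rules (lowercase, strip ASCII punctuation, collapse whitespace) are an assumption that may be too loose for some tasks.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower().strip()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text)


def lenient_match(actual: str, expected: str) -> bool:
    """Compare two answers after normalization."""
    return normalize(actual) == normalize(expected)
```

This still treats "The answer is Paris" and "Paris" as different, so free-form answers need a judge or a containment check instead.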

LLM-as-judge

import json

import anthropic

client = anthropic.Anthropic()

def llm_judge(question: str, answer: str, rubric: str) -> dict:
    response = client.messages.create(
        model="claude-haiku-4-5",
        messages=[{"role": "user", "content": f"""Rate this answer 1-5.

Question: {question}
Answer: {answer}
Rubric: {rubric}

Output JSON: {{"score": int, "reasoning": str}}"""}],
        max_tokens=256,
    )
    return json.loads(response.content[0].text)

Production Deployment

Error handling and retries

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ maxRetries: 3, timeout: 60_000 });

try {
  const response = await client.messages.create({ /* ... */ });
} catch (err) {
  if (err instanceof Anthropic.RateLimitError) {
    const retryAfter = Number(err.headers?.["retry-after"] ?? 30) * 1000;
    await new Promise((r) => setTimeout(r, retryAfter));
  } else if (err instanceof Anthropic.APIConnectionError) {
    // network issue: the SDK has already retried up to maxRetries before throwing
  }
}
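When retries are handled outside the SDK (for example around a whole agent step), exponential backoff with jitter is a common choice. The sketch below is a generic version, not the SDK's own retry logic. `backoff_delays` and `call_with_backoff` are hypothetical names, and the injectable `sleep` parameter exists only to make the function testable.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Upper bounds for each wait: base * 2^attempt, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to `retries` times with full-jitter exponential backoff."""
    for delay in backoff_delays(retries, base, cap):
        try:
            return fn()
        except Exception:  # real code should catch only retryable errors
            sleep(random.uniform(0, delay))  # full jitter: wait a random time up to the bound
    return fn()  # final attempt; any exception propagates to the caller
```

Jitter spreads out retries from many clients so they do not all hit the API again at the same instant.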

Cost tracking

import anthropic

def track_cost(response: anthropic.types.Message) -> float:
    # Illustrative (input, output) USD prices per 1k tokens; verify against current pricing
    PRICES = {
        "claude-opus-4-6":    (0.015, 0.075),
        "claude-sonnet-4-6":  (0.003, 0.015),
        "claude-haiku-4-5":   (0.00025, 0.00125),
    }
    # response.model may carry a dated suffix, so match on the prefix
    rates = next((p for name, p in PRICES.items() if response.model.startswith(name)), None)
    if rates is None:
        return 0.0
    in_cost  = response.usage.input_tokens  / 1000 * rates[0]
    out_cost = response.usage.output_tokens / 1000 * rates[1]
    return in_cost + out_cost
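Per-call cost is more useful when it is accumulated per model across a session. The sketch below keeps running totals. `CostTracker` is a hypothetical class, and the single price entry is copied from the snippet above purely as an example value, so it should be checked against current pricing before use.

```python
from collections import defaultdict

# (input, output) USD per 1k tokens; example value copied from the guide, not verified.
PRICES_PER_1K = {"claude-sonnet-4-6": (0.003, 0.015)}


class CostTracker:
    """Accumulate estimated spend per model."""

    def __init__(self, prices: dict[str, tuple[float, float]]):
        self.prices = prices
        self.totals: dict[str, float] = defaultdict(float)

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add one call's cost to the running total and return it."""
        in_price, out_price = self.prices[model]
        cost = input_tokens / 1000 * in_price + output_tokens / 1000 * out_price
        self.totals[model] += cost
        return cost

    def total(self) -> float:
        """Total estimated spend across all models."""
        return sum(self.totals.values())
```

As a worked example with these prices, 1,000 input tokens and 500 output tokens cost 0.003 + 0.0075 = 0.0105 USD.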

Prompt Engineering

Chain-of-thought: Prefix with "Think step by step:" or enumerate reasoning steps before the answer.
Output format pinning: Specify format in system prompt AND show a concrete example. Never rely on defaults for structured data.
Temperature: 0 = deterministic (evals, extraction) | 0.3-0.7 = balanced | 1.0 = creative/diverse.

Related Skills

temporal-testing — test async agent workflows
browser-automation — give agents web browsing capability
frontend-design — build AI-powered Next.js UIs
data-analysis-report — agent-driven data analysis pipelines
llm-observability — trace and monitor LLM calls in production

Security Recommendations

This is an authored guide for building Anthropic/Claude integrations and is broadly coherent, but it leaves two practical security questions unanswered:

(1) It doesn't declare required API keys or credentials (you will almost certainly need an Anthropic API key and possibly other provider credentials).
(2) Its tool-calling examples include a read_file tool and open tool definitions that, if implemented without safeguards, allow agents to read arbitrary local files or call external endpoints.

Before installing or enabling this skill:

1) Confirm with the publisher what credentials are required and how they should be provided/stored.
2) If you implement any tools the skill suggests, enforce strict input validation and path whitelisting (deny access to /etc, home/.ssh, vaults, etc.).
3) Avoid giving an autonomous agent unrestricted filesystem or network access; prefer manual review or tightly scoped tools.
4) Review the full SKILL.md for any other implicit behaviors (streaming, multi-provider routing) and only enable the parts you need.

If the publisher can supply an updated SKILL.md that explicitly lists required env vars and documents safe tool constraints, my confidence in this assessment would increase and many concerns would be resolved.
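The path restriction recommended above can be sketched as a resolver that refuses anything outside a single allowed root. This is a minimal illustration under stated assumptions: `resolve_inside` is a hypothetical helper, it needs Python 3.9+ for `Path.is_relative_to`, and it covers path traversal only, not file size limits or content filtering.

```python
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting paths that escape it."""
    base = Path(root).resolve()
    # Joining an absolute path replaces the base, and resolve() collapses "..",
    # so both tricks end up outside `base` and are caught by the check below.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"path escapes allowed root: {requested}")
    return target
```

A read_file handler would call this before opening anything, so the model can only ever see files under the chosen directory.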

Functional Analysis

Type: OpenClaw Skill
Name: ai-integration
Version: 1.0.0

The skill bundle provides standard educational patterns and boilerplate code for integrating the Anthropic Claude API, including structured outputs, tool calling, and multi-agent orchestration. The code snippets in SKILL.md use legitimate libraries (Anthropic SDK, instructor, LiteLLM, Qdrant) and follow industry best practices for context management and error handling. No evidence of data exfiltration, malicious execution, or prompt injection attacks was found.

Capability Assessment

ℹ Purpose & Capability

The name/description (Anthropic/Claude integration, structured outputs, tool calling, multi-agent) aligns with the SKILL.md content and code examples. However, the skill does not declare any required environment variables (e.g., Anthropic API keys) or primary credential even though the examples assume an Anthropic client and multi-provider routing; that omission is notable and reduces clarity about what secrets the integration will need.

⚠ Instruction Scope

The runtime instructions include examples that define tools such as a read_file tool (path parameter) and web-search tools and show agentic loops that can call tools autonomously. Those examples implicitly permit reading arbitrary local files and making external calls unless implementers add constraints; the SKILL.md does not explicitly instruct limiting file paths, validating inputs, or preventing sensitive-data reads, which is scope creep relative to a simple integration guide.

✓ Install Mechanism

This is an instruction-only skill with no install spec and no code files. That minimizes direct install risk because nothing is downloaded or written by the skill itself.

⚠ Credentials

The skill describes using Anthropic, LiteLLM, and other providers but declares no environment variables or primary credential. Real integrations will require API keys or credentials; the absence of declared env vars is an inconsistency that makes it unclear what secrets the agent or developer must supply and how they will be used.

✓ Persistence & Privilege

always is false and the skill does not request persistent or system-wide modifications. Autonomous invocation (model-invocation not disabled) is the default; it is only a concern combined with the instruction scope issues (tooling that can read local files).

Version History

v1.0.0

ai-integration v1.0.0

- Initial release offering patterns and best practices for integrating the Claude API and Anthropic SDK.
- Covers single-agent, subagent, and multi-agent orchestration for Next.js, Python, and TypeScript stacks.
- Provides guidance for structured outputs (Pydantic, Zod), tool/function calling, streaming, agent memory, context management, and RAG pipelines.
- Includes practical code examples for API integration, tool calling, streaming responses, schema enforcement, and agentic loops.
- Highlights multi-provider LLM routing, evaluation techniques, and production deployment scenarios.

Metadata

Slug ai-integration

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

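FAQ

What is Ai Integration?

Use when building AI-powered features with the Claude API or Anthropic SDK — structured outputs, tool calling, streaming, multi-provider routing, multi-agent... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 99 cumulative downloads to date.

How do I install Ai Integration?

Run the command "/install ai-integration" in the OpenClaw or Claude Code chat box for one-click installation; no additional configuration is required.

Is Ai Integration free?

Yes. Ai Integration is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Ai Integration support?

Ai Integration is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Ai Integration?

It is developed and maintained by Jimmy974 (@jimmy974); the current version is v1.0.0.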
