Chapter 58: From Skill to Plugin: Upgrade Paths and Capability Leaps

58.1 The Essential Difference Between Skills and Plugins

In the Claude ecosystem, the terms Skill and Plugin are often used interchangeably, but they denote different capability levels and very different implementation complexity.

A Skill is a lightweight, stateless capability extension. At its core, it is a set of carefully crafted prompt templates that tell Claude how to handle specific task types—translation, code review, sentiment analysis, format conversion. A Skill makes no external API calls, persists no data, and completes all logic within Claude's reasoning process.

Skill characteristics:
- No external API calls
- Stateless (no persistence between requests)
- Pure prompt-driven
- Fast to deploy, no infrastructure needed
- Capability boundary: Claude's language understanding and generation

A Plugin is a heavyweight extension with genuine tool-calling capability. It equips Claude with real tools—querying databases, calling REST APIs, reading and writing files, executing code. Plugins are injected via the tools parameter in the API call; Claude decides during reasoning when to call which tool and incorporates the tool results into its continued reasoning.

Plugin characteristics:
- Can call any external system
- Stateful (tools can persist data)
- Tool schema + execution logic
- Requires server/serverless infrastructure
- Capability boundary: anything accessible via API

When your Skill starts hitting requirements like "I need real-time data," "I need to save results," or "I need to interact with an external system," that is your signal to upgrade to a Plugin.
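
In request terms, the difference can be sketched roughly as follows. This is an illustration under assumptions, not an official definition: the helper names `build_skill_request` and `build_plugin_request` and the translation prompt are hypothetical, and the dictionaries simply hold keyword arguments in the shape that `client.messages.create` accepts in the examples later in this chapter.

```python
# Hypothetical helpers: they only build keyword arguments for
# client.messages.create(...) and make no network call themselves.
MODEL = "claude-opus-4-5"  # same model name as the other examples in this chapter


def build_skill_request(user_message: str) -> dict:
    """A Skill: the behavior lives entirely in the prompt."""
    return {
        "model": MODEL,
        "max_tokens": 512,
        "system": "You are a translator. Translate the user's text into English.",
        "messages": [{"role": "user", "content": user_message}],
    }


def build_plugin_request(user_message: str, tools: list) -> dict:
    """A Plugin: the same kind of request plus tool definitions."""
    return {
        "model": MODEL,
        "max_tokens": 1024,
        "tools": tools,
        "messages": [{"role": "user", "content": user_message}],
    }
```

The only structural difference in the request is the `tools` entry; the rest of a Plugin's complexity sits in the code that executes the tools.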

58.2 Capability Assessment Matrix Before Upgrading

Not every Skill needs to become a Plugin. Use this matrix to evaluate your scenario:

Dimension	Better as Skill	Better as Plugin
Data source	Claude's training knowledge	Real-time or private data
Operation type	Pure analysis/generation	Operations with side effects
State requirements	No cross-conversation state needed	Needs to read/write persistent storage
Accuracy requirements	Probabilistic LLM errors acceptable	Precise structured data required
Latency sensitivity	LLM inference latency acceptable	Extremely latency-sensitive (consider caching)
Maintenance cost	Low (prompt only)	High (must maintain tool services)

Typical upgrade triggers:

- User asks "What's today's weather?": needs real-time data → Plugin
- User asks "Look up my order": needs private database → Plugin
- User says "Send this report to the CEO": needs side-effect operation → Plugin
- User asks "Review this code": pure analysis → Skill is sufficient
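
The matrix can also be encoded as a small checklist. The sketch below is one possible reading of it, not a definitive rule: the `Requirements` fields and the "any trigger means Plugin" policy are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical checklist mirroring the Plugin column of the matrix."""
    needs_realtime_or_private_data: bool = False
    has_side_effects: bool = False
    needs_persistent_state: bool = False
    needs_exact_structured_data: bool = False


def recommend(req: Requirements) -> str:
    """Return "plugin" if any Plugin-side criterion applies, else "skill"."""
    triggers = [
        req.needs_realtime_or_private_data,
        req.has_side_effects,
        req.needs_persistent_state,
        req.needs_exact_structured_data,
    ]
    return "plugin" if any(triggers) else "skill"


print(recommend(Requirements(needs_realtime_or_private_data=True)))  # weather lookup
print(recommend(Requirements()))  # code review
```

Latency and maintenance cost are left out on purpose: they are trade-offs to weigh rather than yes/no triggers.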

58.3 Gradual Upgrade Path

The upgrade from Skill to Plugin doesn't have to be a one-step "big refactor." A gradual path is recommended to reduce risk.

Phase 1: Skill + Manual Tools (Validating Requirements)

Before implementing a real Plugin, validate requirements with a "pseudo Plugin": have Claude output structured "tool call instructions" in its response, then parse and manually execute them in your application code.

# Phase 1: Pseudo Plugin for validation
import anthropic
import json

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are an order assistant. When users ask about orders,
output the following JSON format, and I'll execute the query for you:

<tool_call>
{
  "action": "query_order",
  "order_id": "the order ID provided by the user"
}
</tool_call>

After I return results, provide the final response."""

def pseudo_plugin_flow(user_message: str, mock_order_data: dict) -> str:
    # Round 1: Claude outputs tool call instruction
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": user_message}]
    )
    
    text = response.content[0].text
    
    if "<tool_call>" in text:
        start = text.index("<tool_call>") + len("<tool_call>")
        end = text.index("</tool_call>")
        tool_call = json.loads(text[start:end].strip())
        
        # Execute manually (using mock data here)
        result = mock_order_data.get(tool_call["order_id"], "Order not found")
        
        # Round 2: Feed results back to Claude
        final_response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=512,
            system=SYSTEM_PROMPT,
            messages=[
                {"role": "user", "content": user_message},
                {"role": "assistant", "content": text},
                {"role": "user", "content": f"Query result: {result}"}
            ]
        )
        return final_response.content[0].text
    
    return text

The purpose of this phase is to validate business logic, not to pursue engineering quality. If users genuinely need the feature, proceed to Phase 2.
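
Even in a throwaway prototype, the parsing step is the part most likely to break, because the model may omit the tags or emit malformed JSON. A slightly more defensive parser might look like the sketch below. The function name `extract_tool_call` is hypothetical, and the check for an `action` key is an assumption based on the JSON format requested in the system prompt above.

```python
import json
import re
from typing import Optional

# Matches the <tool_call>...</tool_call> convention from the Phase 1 system prompt
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_call(text: str) -> Optional[dict]:
    """Return the parsed tool call, or None if it is absent or malformed."""
    match = TOOL_CALL_RE.search(text)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        return None
    # Only accept a JSON object that names an action
    if not isinstance(payload, dict) or "action" not in payload:
        return None
    return payload
```

Returning `None` lets the caller fall back to treating the response as plain text, which matches the final `return text` branch of `pseudo_plugin_flow`.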

Phase 2: Native Tool Use (Real Plugin-ification)

# Phase 2: Native Tool Use
import json  # needed for json.dumps on tool results below

import anthropic
from database import get_order_by_id

client = anthropic.Anthropic()

TOOLS = [
    {
        "name": "query_order",
        "description": "Query order details by order ID",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Unique order identifier"
                }
            },
            "required": ["order_id"]
        }
    }
]

def tool_use_flow(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    
    while True:
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages
        )
        
        if response.stop_reason == "end_turn":
            return response.content[0].text
        
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = get_order_by_id(block.input["order_id"])
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })
            
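            messages.extend([
                {"role": "assistant", "content": response.content},
                {"role": "user", "content": tool_results}
            ])
            continue

        # Any other stop reason (e.g. "max_tokens") would otherwise repeat the same request forever
        raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")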

Key changes from Phase 1 to Phase 2:

- <tool_call> XML tags → official tools parameter
- Manual JSON parsing → native tool_use block structure
- Two independent requests → continuous requests within native loop
- Mock data → real database calls
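
One way to exercise this loop without network access is to inject a scripted stand-in for the client. The sketch below assumes nothing beyond the response attributes the Phase 2 code already reads (`stop_reason`, `content`, and each block's `type`, `id`, `name`, `input`, `text`). `ScriptedMessages`, `run_loop`, and the ID `toolu_1` are made up for illustration.

```python
import json
from types import SimpleNamespace


class ScriptedMessages:
    """Hypothetical stand-in for client.messages: replays canned responses."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = []  # snapshot of the messages passed on each call

    def create(self, **kwargs):
        self.calls.append(list(kwargs["messages"]))
        return self._responses.pop(0)


def run_loop(messages_api, user_message, handlers):
    """Same control flow as the Phase 2 loop, with the client injected."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = messages_api.create(messages=messages)
        if response.stop_reason != "tool_use":
            return next((b.text for b in response.content if b.type == "text"), "")
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(handlers[block.name](**block.input)),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.extend([
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": results},
        ])


# Scripted turn 1: a tool request; turn 2: the final answer
tool_block = SimpleNamespace(type="tool_use", id="toolu_1", name="query_order",
                             input={"order_id": "A1"})
text_block = SimpleNamespace(type="text", text="Order A1 has shipped.")
api = ScriptedMessages([
    SimpleNamespace(stop_reason="tool_use", content=[tool_block]),
    SimpleNamespace(stop_reason="end_turn", content=[text_block]),
])
answer = run_loop(api, "Where is order A1?",
                  {"query_order": lambda order_id: {"status": "shipped"}})
```

The second recorded call contains three messages (user, assistant tool request, user tool result), which is the conversation shape the native loop builds up.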

Phase 3: Production Hardening

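Once native Tool Use works, the Plugin still needs hardening for production: retries with exponential backoff for flaky tools, a cap on loop iterations, error logging, and per-call timing.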

# Phase 3: Production-hardened Plugin
import anthropic
import json  # needed for json.dumps in _call_tool
import logging
import time
from functools import wraps

logger = logging.getLogger(__name__)

def with_retry(max_attempts: int = 3, delay: float = 1.0):
    """Tool call retry decorator with exponential backoff"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    if attempt < max_attempts - 1:
                        logger.warning(f"Tool call failed (attempt {attempt+1}): {e}")
                        time.sleep(delay * (2 ** attempt))
            raise last_error
        return wrapper
    return decorator

class ProductionPlugin:
    def __init__(self, tools: list, tool_handlers: dict):
        self.client = anthropic.Anthropic()
        self.tools = tools
        self.tool_handlers = tool_handlers
        self.max_iterations = 10

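    @with_retry(max_attempts=3)
    def _invoke_handler(self, handler, tool_input: dict):
        # Exceptions must propagate from here, otherwise with_retry never sees a failure
        return handler(**tool_input)

    def _call_tool(self, tool_name: str, tool_input: dict) -> str:
        handler = self.tool_handlers.get(tool_name)
        if not handler:
            # Not retried: an unknown tool will not become known on a second attempt
            raise ValueError(f"Unknown tool: {tool_name}")
        try:
            return json.dumps(self._invoke_handler(handler, tool_input))
        except Exception as e:
            # All retries exhausted: log, then report the failure to Claude as data
            logger.error(f"Tool {tool_name} failed: {e}", exc_info=True)
            return json.dumps({"error": str(e), "tool": tool_name})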

    def run(self, user_message: str, system: str = "") -> dict:
        messages = [{"role": "user", "content": user_message}]
        iterations = 0
        all_tool_calls = []

        while iterations < self.max_iterations:
            iterations += 1
            response = self.client.messages.create(
                model="claude-opus-4-5",
                max_tokens=2048,
                system=system,
                tools=self.tools,
                messages=messages
            )

            if response.stop_reason == "end_turn":
                final_text = next(
                    (b.text for b in response.content if hasattr(b, "text")), ""
                )
                return {"response": final_text, "tool_calls": all_tool_calls, "iterations": iterations}

            if response.stop_reason == "tool_use":
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        start_time = time.time()
                        result_str = self._call_tool(block.name, block.input)
                        elapsed = time.time() - start_time
                        all_tool_calls.append({
                            "name": block.name, "input": block.input,
                            "elapsed_ms": int(elapsed * 1000)
                        })
                        tool_results.append({
                            "type": "tool_result", "tool_use_id": block.id,
                            "content": result_str
                        })
                messages.extend([
                    {"role": "assistant", "content": response.content},
                    {"role": "user", "content": tool_results}
                ])

        raise RuntimeError(f"Exceeded max iterations ({self.max_iterations})")
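
To reason about how long a failing tool can stall the loop, it helps to write out the sleep schedule that `with_retry` produces. The helper names below are hypothetical; the formula `delay * (2 ** attempt)` and the absence of a sleep after the last attempt are taken from the decorator above.

```python
def backoff_delays(max_attempts: int = 3, delay: float = 1.0) -> list:
    """Sleep durations between attempts; there is no sleep after the final attempt."""
    return [delay * (2 ** attempt) for attempt in range(max_attempts - 1)]


def worst_case_wait(max_attempts: int = 3, delay: float = 1.0) -> float:
    """Total time spent sleeping if every attempt fails."""
    return sum(backoff_delays(max_attempts, delay))


print(backoff_delays())         # [1.0, 2.0]
print(worst_case_wait(5, 0.5))  # 0.5 + 1.0 + 2.0 + 4.0 = 7.5
```

With the defaults, one persistently failing tool adds about three seconds of sleep per call, and with `max_iterations = 10` that cost can recur on every loop iteration.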

58.4 Capability Leaps: New Paradigms Unlocked by Plugins

Upgrading from Skill to Plugin is not merely "adding tool calls"—it unlocks entirely new interaction paradigms.

Leap 1: From Static Knowledge to Real-Time Awareness

REALTIME_TOOLS = [
    {
        "name": "get_stock_price",
        "description": "Get real-time price for a stock symbol",
        "input_schema": {
            "type": "object",
            "properties": {
                "symbol": {"type": "string", "description": "Stock ticker, e.g. AAPL"},
                "currency": {"type": "string", "enum": ["USD", "HKD", "CNY"], "default": "USD"}
            },
            "required": ["symbol"]
        }
    }
]
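
A handler behind this schema might look like the following sketch. The quote table `FAKE_QUOTES`, its price, and the error codes are placeholders invented for illustration; a real handler would call a market data service instead. The handler applies the `USD` default itself rather than relying on the schema's `default` having been filled in.

```python
import time

# Hypothetical price source; a real handler would call a market data API here
FAKE_QUOTES = {("AAPL", "USD"): 189.5}


def get_stock_price(symbol: str, currency: str = "USD") -> dict:
    """Handler matching the get_stock_price schema above."""
    if currency not in {"USD", "HKD", "CNY"}:
        return {"error": "unsupported_currency", "currency": currency}
    price = FAKE_QUOTES.get((symbol.upper(), currency))
    if price is None:
        return {"error": "unknown_symbol", "symbol": symbol}
    # Include a timestamp so the model can tell how fresh the quote is
    return {"symbol": symbol.upper(), "currency": currency,
            "price": price, "as_of": int(time.time())}
```

Returning the timestamp alongside the price is a design choice: it lets the final answer state when the quote was taken instead of implying it is current.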

Leap 2: From One-Off Analysis to Continuous Tracking

TRACKING_TOOLS = [
    {
        "name": "create_tracker",
        "description": "Create a persistent tracking job",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "condition": {"type": "string", "description": "Natural language description of trigger condition"},
                "notify_channel": {"type": "string", "enum": ["email", "slack", "webhook"]}
            },
            "required": ["name", "condition", "notify_channel"]
        }
    }
]
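
What makes this tool different from the previous one is that it writes state that outlives the conversation. A minimal sketch of such a handler, assuming SQLite as the store, follows. The table layout, the `init_db` helper, and the returned fields are hypothetical, and the job that later evaluates the condition is out of scope here.

```python
import sqlite3
import uuid

ALLOWED_CHANNELS = {"email", "slack", "webhook"}


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trackers ("
        "id TEXT PRIMARY KEY, name TEXT, trigger_condition TEXT, notify_channel TEXT)"
    )


def create_tracker(conn, name: str, condition: str, notify_channel: str) -> dict:
    """Persist a tracking job and return its ID."""
    if notify_channel not in ALLOWED_CHANNELS:
        return {"error": "invalid_channel", "notify_channel": notify_channel}
    tracker_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO trackers VALUES (?, ?, ?, ?)",
            (tracker_id, name, condition, notify_channel),
        )
    return {"tracker_id": tracker_id, "status": "created"}


conn = sqlite3.connect(":memory:")  # use a file path for real persistence
init_db(conn)
created = create_tracker(conn, "AAPL dip", "AAPL falls below 150 USD", "email")
```

Because `conn` is not part of the tool schema, it would be bound in application code (for example with `functools.partial`) before the handler is registered.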

Leap 3: From Passive Response to Active Execution

The most powerful capability leap is that Plugins transform Claude from an "assistant that answers questions" into an "agent that executes tasks":

# With these tools, user can simply say:
# "Research competitor XYZ's recent activity, write a report, and email it to my boss"
# Claude will autonomously: search → read pages → synthesize → write doc → send email

AGENT_TOOLS = [
    {"name": "web_search", ...},
    {"name": "read_url", ...},
    {"name": "write_document", ...},
    {"name": "send_email", ...}
]

58.5 Common Pitfalls During Upgrade

Pitfall 1: Wrong Tool Granularity

# Wrong: tool granularity too coarse (one tool does too much)
{
    "name": "handle_customer_request",
    "description": "Handle customer requests including order queries, address changes, refunds, etc."
    # Claude cannot make precise choices with such a broad tool
}

# Right: each tool has a single responsibility
[
    {"name": "get_order", "description": "Query order details"},
    {"name": "update_shipping_address", "description": "Update shipping address"},
    {"name": "create_refund_request", "description": "Submit a refund request"}
]

Pitfall 2: Ignoring Idempotency

# Dangerous: non-idempotent tool causes duplicate execution on retry
def send_notification(user_id: str, message: str) -> dict:
    notification_service.send(user_id, message)  # Sends twice on retry!
    return {"sent": True}

# Safe: idempotent design using a request key
def send_notification(user_id: str, message: str, idempotency_key: str) -> dict:
    if notification_service.already_sent(idempotency_key):
        return {"sent": True, "duplicate": True}
    notification_service.send(user_id, message, key=idempotency_key)
    return {"sent": True, "duplicate": False}
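
The safe version leaves open where `idempotency_key` comes from. One option, sketched below as an assumption rather than a prescribed approach, is to derive it in application code from the `tool_use` block's ID plus the tool name and input, so that a retry of the same tool call always yields the same key. The function name `make_idempotency_key` is hypothetical.

```python
import hashlib
import json


def make_idempotency_key(tool_use_id: str, tool_name: str, tool_input: dict) -> str:
    """Derive a stable key so retries of the same tool call map to one send."""
    # Canonical JSON: key order and whitespace must not change the hash
    canonical = json.dumps(tool_input, sort_keys=True, separators=(",", ":"))
    raw = f"{tool_use_id}|{tool_name}|{canonical}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Deriving the key outside the model keeps it out of the tool schema, so Claude never has to invent or remember it.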

Pitfall 3: Not Handling Tool Failures

def execute_tool_safely(tool_name: str, tool_input: dict) -> str:
    try:
        result = tool_handlers[tool_name](**tool_input)
        return json.dumps({"success": True, "data": result})
    except PermissionError as e:
        return json.dumps({"success": False, "error": "permission_denied", "message": str(e)})
    except ValueError as e:
        return json.dumps({"success": False, "error": "invalid_input", "message": str(e)})
    except Exception as e:
        logger.error(f"Unexpected error in tool {tool_name}", exc_info=True)
        return json.dumps({"success": False, "error": "internal_error",
                          "message": "Operation temporarily unavailable, please retry"})

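Structured errors are most useful when the failure is also flagged on the `tool_result` block itself, which accepts an optional `is_error` field. The wrapper below is a sketch: the name `to_tool_result` is hypothetical, and it assumes the `{"success": ...}` payload convention used by `execute_tool_safely` above.

```python
import json


def to_tool_result(tool_use_id: str, payload_json: str) -> dict:
    """Wrap an execute_tool_safely-style JSON string as a tool_result block."""
    block = {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": payload_json,
    }
    parsed = json.loads(payload_json)
    # Flag failures explicitly so they are not mistaken for ordinary data
    if isinstance(parsed, dict) and parsed.get("success") is False:
        block["is_error"] = True
    return block
```

Successful results are left without the flag, so the block stays identical to the ones built in the Phase 2 and Phase 3 loops.
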
58.6 From Single Plugin to Plugin Ecosystem

A mature Plugin architecture is not a massive monolithic toolset, but a composition of multiple focused, reusable Plugin modules.

class PluginRegistry:
    def __init__(self):
        self._plugins: dict[str, dict] = {}

    def register(self, plugin_id: str, tools: list, handlers: dict,
                 permissions: list[str] = None):
        self._plugins[plugin_id] = {
            "tools": tools,
            "handlers": handlers,
            "permissions": permissions or []
        }

    def get_tools_for_user(self, user_id: str, user_roles: list[str]) -> tuple:
        """Dynamically return available tools based on user permissions"""
        available_tools = []
        available_handlers = {}

        for plugin_id, plugin in self._plugins.items():
            required = set(plugin["permissions"])
            if not required or required.intersection(user_roles):
                available_tools.extend(plugin["tools"])
                available_handlers.update(plugin["handlers"])

        return available_tools, available_handlers

# Usage
registry = PluginRegistry()
registry.register("orders", ORDER_TOOLS, ORDER_HANDLERS, permissions=["customer_service"])
registry.register("analytics", ANALYTICS_TOOLS, ANALYTICS_HANDLERS, permissions=["analyst", "admin"])
registry.register("search", SEARCH_TOOLS, SEARCH_HANDLERS, permissions=[])  # Available to all

tools, handlers = registry.get_tools_for_user("user_001", ["customer_service"])
plugin = ProductionPlugin(tools=tools, tool_handlers=handlers)
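
One thing the registry above does not guard against is two plugins defining a tool with the same name: `available_handlers.update(...)` would silently let the later plugin's handler win. A check along the following lines could run at registration time. The function name is hypothetical, and it takes the same `{plugin_id: {"tools": [...]}}` structure that `_plugins` holds.

```python
from collections import defaultdict


def find_tool_name_collisions(plugins: dict) -> dict:
    """Map each tool name defined by more than one plugin to those plugin IDs."""
    owners = defaultdict(list)
    for plugin_id, plugin in plugins.items():
        for tool in plugin["tools"]:
            owners[tool["name"]].append(plugin_id)
    return {name: ids for name, ids in owners.items() if len(ids) > 1}


collisions = find_tool_name_collisions({
    "orders": {"tools": [{"name": "get_order"}, {"name": "search"}]},
    "search": {"tools": [{"name": "search"}]},
})
print(collisions)  # {'search': ['orders', 'search']}
```

Rejecting the registration, or prefixing tool names with the plugin ID, are two possible responses to a reported collision.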

Summary

The upgrade from Skill to Plugin represents a critical leap for Claude applications from "intelligent text processing" to "real-world system agency." The recommended upgrade path follows three phases: pseudo-Plugin for requirements validation, native Tool Use implementation, and production hardening. The three dimensions of capability leaps are: static knowledge to real-time awareness, one-off analysis to continuous tracking, and passive response to active execution. Key design principles include single-responsibility tools, idempotency guarantees, and explicit error propagation. When a single Plugin cannot cover a complex scenario, the Plugin Registry pattern enables permission-based dynamic tool composition, building a scalable Plugin ecosystem.
