Chapter 31: Skill Input/Output Contract Design

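The quality of a Skill depends largely on how rigorously its interface is designed. Poor input validation lets a Skill crash on unexpected input; a vague output format leaves callers unable to process results reliably. This chapter walks through the design principles for Skill input/output contracts to help you build Skill interfaces that are robust, predictable, and able to evolve.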


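31.1 Input Schema Design Principles

Principle 1: Require Only What Is Necessary

Mark only the parameters that are truly necessary as required, and give every other parameter a sensible default. Too many required parameters make a Skill harder to use:

// Bad: too many required parameters ❌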
{
  "type": "object",
  "required": ["topic", "time_range", "max_articles", "language", "format", "save_file"],
  "properties": { ... }
}

// Good: minimal required set ✓
{
  "type": "object",
  "required": ["topic"],
  "properties": {
    "topic": {
      "type": "string",
      "description": "The news topic to research"
    },
    "time_range": {
      "type": "string",
      "enum": ["today", "24h", "this_week"],
      "default": "today",
      "description": "Time range for news search"
    },
    "max_articles": {
      "type": "integer",
      "minimum": 1,
      "maximum": 10,
      "default": 5
    }
  }
}
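
One caveat when relying on defaults: JSON Schema treats `default` as an annotation, and validators such as the Python `jsonschema` package do not fill in missing values on their own. The sketch below assumes the Skill runtime applies defaults itself before executing; the helper name `apply_defaults` is hypothetical and it only handles top-level properties.

```python
from copy import deepcopy

# Same shape as the "minimal required" schema above
SCHEMA = {
    "type": "object",
    "required": ["topic"],
    "properties": {
        "topic": {"type": "string"},
        "time_range": {
            "type": "string",
            "enum": ["today", "24h", "this_week"],
            "default": "today",
        },
        "max_articles": {"type": "integer", "minimum": 1, "maximum": 10, "default": 5},
    },
}


def apply_defaults(schema: dict, params: dict) -> dict:
    """Return a copy of params with top-level schema defaults filled in.

    Hypothetical helper: nested objects are not handled.
    """
    result = dict(params)
    for name, prop in schema.get("properties", {}).items():
        if name not in result and "default" in prop:
            # deepcopy so callers can never mutate the schema's default value
            result[name] = deepcopy(prop["default"])
    return result


filled = apply_defaults(SCHEMA, {"topic": "AI regulation"})
print(filled)
```

Values the caller supplies are never overwritten; only missing keys receive their default.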

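Principle 2: Make Type Constraints Precise

Do not use a bare string for a value drawn from a finite set, and give numeric parameters an explicit range. (The snippet repeats each key to show the bad and good variants side by side; a real schema would contain only one.)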

{
  "properties": {
    // Bad: an enum expressed as a free-form string ❌
    "format": {
      "type": "string",
      "description": "Output format: prose, bullets, or structured"
    },
    
    // Good: use enum ✓
    "format": {
      "type": "string",
      "enum": ["prose", "bullets", "structured"],
      "default": "prose"
    },
    
    // Bad: no numeric bounds ❌
    "timeout_seconds": {
      "type": "integer"
    },
    
    // Good: bounded range ✓
    "timeout_seconds": {
      "type": "integer",
      "minimum": 5,
      "maximum": 120,
      "default": 30
    }
  }
}

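Principle 3: Write Rich Descriptions

The description field directly affects whether the LLM fills in a parameter correctly. A good description states the expected format, gives example values, and spells out boundary conditions: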

{
  "properties": {
    // Poor: too terse ❌
    "date_range": {
      "type": "string",
      "description": "Date range"
    },
    
    // Good: includes format, examples, and constraints ✓
    "date_range": {
      "type": "string",
      "description": "Date range in ISO 8601 format. Use 'YYYY-MM-DD/YYYY-MM-DD' for a range, or 'YYYY-MM-DD' for a single day. Examples: '2024-11-01/2024-11-30', '2024-11-20'. Maximum range: 365 days.",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}(/\\d{4}-\\d{2}-\\d{2})?$",
      "examples": ["2024-11-20", "2024-11-01/2024-11-30"]
    }
  }
}
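
The `pattern` above checks only the shape of the string. It cannot reject an impossible date such as `2024-13-45`, and it cannot enforce the 365-day limit that the description promises. A minimal sketch of the extra check, assuming it runs after schema validation; the function name `parse_date_range` is hypothetical.

```python
import re
from datetime import date

# Same regular expression as the schema's "pattern"
DATE_RANGE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}(/\d{4}-\d{2}-\d{2})?$")
MAX_RANGE_DAYS = 365  # limit stated in the description


def parse_date_range(value: str) -> tuple[date, date]:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD' into (start, end).

    Raises ValueError for a malformed string, an impossible calendar date,
    a reversed range, or a range longer than MAX_RANGE_DAYS.
    """
    if not DATE_RANGE_PATTERN.fullmatch(value):
        raise ValueError(f"Unexpected date_range format: {value!r}")
    parts = value.split("/")
    start = date.fromisoformat(parts[0])   # ValueError for e.g. month 13
    end = date.fromisoformat(parts[-1])    # a single day is its own end
    if end < start:
        raise ValueError("date_range end is before start")
    if (end - start).days > MAX_RANGE_DAYS:
        raise ValueError(f"date_range exceeds {MAX_RANGE_DAYS} days")
    return start, end


print(parse_date_range("2024-11-01/2024-11-30"))
print(parse_date_range("2024-11-20"))
```

Keeping the regular expression in the schema still pays off: the LLM sees the expected shape, and the semantic check only has to handle strings that already look right.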

Principle 4: Use additionalProperties: false to Reject Unexpected Parameters

{
  "type": "object",
  "required": ["topic"],
  "properties": { ... },
  "additionalProperties": false  // reject undeclared parameters
}
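
Rejecting an undeclared parameter is most useful when the error tells the caller what went wrong. The following sketch turns an unknown key into an actionable message, using only the standard library; `explain_unknown_params` is a hypothetical helper, not part of any validator API.

```python
from difflib import get_close_matches


def explain_unknown_params(schema: dict, params: dict) -> list[str]:
    """List undeclared top-level parameters, with a spelling hint when one is close."""
    declared = list(schema.get("properties", {}))
    messages = []
    for name in params:
        if name in declared:
            continue
        hint = get_close_matches(name, declared, n=1)
        if hint:
            messages.append(f"Unknown parameter '{name}'; did you mean '{hint[0]}'?")
        else:
            messages.append(f"Unknown parameter '{name}'")
    return messages


# Illustrative schema; the property names mirror the examples above
schema = {
    "type": "object",
    "required": ["topic"],
    "properties": {"topic": {"type": "string"}, "time_range": {"type": "string"}},
    "additionalProperties": False,
}

messages = explain_unknown_params(schema, {"topic": "AI", "timerange": "today"})
print(messages)
```

A hint like this gives an LLM caller enough information to correct the call on its next attempt instead of repeating the same misspelled key.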

A Complete Input Schema Example

# Complete Skill input schema example: Daily News Digest
NEWS_DIGEST_INPUT_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "title": "DailyNewsDigestInput",
    "description": "Input parameters for the Daily News Digest skill",
    "required": ["topics"],
    "additionalProperties": False,
    "properties": {
        "topics": {
            "type": "array",
            "description": "List of news topics to research. Each topic should be a specific subject, event, or entity (not a question). Examples: ['AI regulation', 'Tesla stock'], ['climate summit 2024']",
            "items": {
                "type": "string",
                "minLength": 2,
                "maxLength": 100
            },
            "minItems": 1,
            "maxItems": 5,
            "examples": [
                ["AI regulation in Europe"],
                ["quantum computing", "space exploration"]
            ]
        },
        "time_range": {
            "type": "string",
            "enum": ["today", "24h", "this_week", "this_month"],
            "default": "today",
            "description": "Time range for news articles. 'today' = past 24 hours, 'this_week' = past 7 days."
        },
        "max_articles_per_topic": {
            "type": "integer",
            "minimum": 1,
            "maximum": 10,
            "default": 5,
            "description": "Maximum number of articles to include per topic."
        },
        "output_language": {
            "type": "string",
            "description": "Language code for the output. Use 'auto' to detect from user's message. Examples: 'en', 'zh', 'es', 'auto'",
            "default": "auto",
            "pattern": "^(auto|[a-z]{2})$"
        },
        "output_format": {
            "type": "string",
            "enum": ["prose", "bullets", "structured", "brief"],
            "default": "structured",
            "description": "Output format. 'prose': narrative paragraphs; 'bullets': bullet-point lists; 'structured': headers + bullets; 'brief': one sentence per article."
        },
        "save_to_file": {
            "type": "boolean",
            "default": False,
            "description": "If true, save the digest to a Markdown file. Requires write_file tool."
        },
        "file_path": {
            "type": "string",
            "description": "File path for saving the digest. Only used if save_to_file is true. If not specified, auto-generates: 'news_digest_YYYY-MM-DD.md'",
            "pattern": "^[^<>:\"\\|?*]+\\.md$"
        }
    },
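    # file_path stays optional even when save_to_file is true (a default name
    # is generated), so this block only restates its type. "required" in the
    # "if" keeps the condition from matching when save_to_file is absent.
    "if": {
        "properties": {"save_to_file": {"const": True}},
        "required": ["save_to_file"]
    },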
    "then": {
        "properties": {
            "file_path": {
                "type": "string"
            }
        }
    }
}

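31.2 Output Format Specification

Structured Output vs. Natural-Language Output

| Scenario | Recommended format | Reason |
|---|---|---|
| Skill is called by another Agent | Structured (JSON/YAML) | Machine-parseable, avoids ambiguity |
| Skill output is shown directly to the user | Natural language + Markdown | Readable |
| Mixed scenario | Structured Markdown backed by a schema | Serves both |
| Error response | Always structured | Simplifies error handling |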
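
For the mixed scenario, one option is to keep the structured result as the source of truth and derive the Markdown view from it. The sketch below assumes the field names of the digest output schema used in this section (`topics_covered`, `summary`, `articles`); the function name and the sample data are made up for illustration.

```python
def render_digest_markdown(digest: dict) -> str:
    """Render a structured digest as Markdown for direct display to a user."""
    lines = ["# Daily News Digest", ""]
    for topic in digest.get("topics_covered", []):
        lines.append(f"## {topic['topic']}")
        if topic.get("summary"):
            lines.append(topic["summary"])
        for article in topic.get("articles", []):
            lines.append(f"- [{article['title']}]({article['url']}) ({article['source']})")
        lines.append("")  # blank line between topics
    return "\n".join(lines).rstrip() + "\n"


# Made-up sample data, for illustration only
sample = {
    "status": "success",
    "topics_covered": [
        {
            "topic": "AI regulation",
            "summary": "Lawmakers advanced the draft.",
            "articles": [
                {"title": "Draft advances", "source": "Example News", "url": "https://example.com/a"}
            ],
        }
    ],
}

markdown = render_digest_markdown(sample)
print(markdown)
```

An Agent caller consumes the dictionary directly, while a user-facing channel shows the rendered text; both views come from the same data, so they cannot drift apart.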

Structured Output Schema

# Output schema for the news digest
NEWS_DIGEST_OUTPUT_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "title": "DailyNewsDigestOutput",
    "required": ["status", "generated_at", "topics_covered"],
    "properties": {
        "status": {
            "type": "string",
            "enum": ["success", "partial_success", "failed"],
            "description": "Overall execution status"
        },
        "generated_at": {
            "type": "string",
            "format": "date-time",
            "description": "ISO 8601 timestamp of digest generation"
        },
        "topics_covered": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["topic", "article_count", "articles"],
                "properties": {
                    "topic": {"type": "string"},
                    "article_count": {"type": "integer"},
                    "summary": {
                        "type": "string",
                        "description": "2–3 sentence executive summary"
                    },
                    "articles": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "required": ["title", "source", "url"],
                            "properties": {
                                "title": {"type": "string"},
                                "source": {"type": "string"},
                                "url": {"type": "string", "format": "uri"},
                                "published_at": {
                                    "type": "string",
                                    "description": "Publication date or relative time"
                                },
                                "key_points": {
                                    "type": "array",
                                    "items": {"type": "string"},
                                    "maxItems": 5
                                },
                                "sentiment": {
                                    "type": "string",
                                    "enum": ["positive", "negative", "neutral", "mixed"]
                                }
                            }
                        }
                    }
                }
            }
        },
        "metadata": {
            "type": "object",
            "properties": {
                "search_queries_used": {
                    "type": "array",
                    "items": {"type": "string"}
                },
                "sources_searched": {"type": "integer"},
                "file_saved": {
                    "type": ["string", "null"],
                    "description": "File path if saved, null otherwise"
                }
            }
        },
        "errors": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["code", "message"],
                "properties": {
                    "code": {"type": "string"},
                    "message": {"type": "string"},
                    "recoverable": {"type": "boolean"}
                }
            }
        }
    }
}
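
The schema lists three `status` values but does not say when each applies. One plausible rule, stated here as an assumption rather than the Skill's documented behavior: no topic produced results means `failed`, results plus at least one error means `partial_success`, and otherwise `success`. The helper name `build_digest_output` is hypothetical.

```python
from datetime import datetime, timezone


def build_digest_output(topic_results: list[dict], errors: list[dict]) -> dict:
    """Assemble an output object with the three required top-level fields.

    Assumed status rule (not taken from the schema itself):
    - no topic results          -> "failed"
    - results and some errors   -> "partial_success"
    - results and no errors     -> "success"
    """
    if not topic_results:
        status = "failed"
    elif errors:
        status = "partial_success"
    else:
        status = "success"
    return {
        "status": status,
        # timezone-aware ISO 8601 timestamp, matching "format": "date-time"
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "topics_covered": topic_results,
        "errors": errors,
    }


one_topic = [{"topic": "AI regulation", "article_count": 0, "articles": []}]
one_error = [{"code": "FETCH_FAILED", "message": "Could not fetch one source", "recoverable": True}]

print(build_digest_output(one_topic, [])["status"])
print(build_digest_output(one_topic, one_error)["status"])
print(build_digest_output([], one_error)["status"])
```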

31.3 Parameter Validation and Type Constraints

Python Validation Implementation

"""Skill input validator."""
import jsonschema

class SkillInputValidator:
    """Validate Skill input against a JSON Schema."""
    
    def __init__(self, schema: dict):
        self.schema = schema
        # Build the validator with format checking enabled
        self.validator = jsonschema.Draft202012Validator(
            schema,
            format_checker=jsonschema.FormatChecker()
        )
    
    def validate(self, data: dict) -> tuple[bool, list[str]]:
        """
        Validate the input data.
        Returns: (is_valid, error_messages)
        """
        errors = list(self.validator.iter_errors(data))
        
        if not errors:
            return True, []
        
        error_messages = []
        for error in errors:
            path = " -> ".join(str(p) for p in error.absolute_path)
            msg = f"{'[' + path + '] ' if path else ''}{error.message}"
            error_messages.append(msg)
        
        return False, error_messages
    
    def validate_and_raise(self, data: dict):
        """Raise an exception with detailed messages if validation fails."""
        is_valid, errors = self.validate(data)
        if not is_valid:
            raise SkillInputError(
                "Input validation failed",
                errors=errors
            )
    
    def coerce_and_validate(self, data: dict) -> dict:
        """Coerce types first, then validate (handles type mistakes LLMs commonly make)."""
        coerced = self._coerce_types(data)
        self.validate_and_raise(coerced)
        return coerced
    
    def _coerce_types(self, data: dict) -> dict:
        """
        Handle type mistakes LLMs commonly make:
        - numeric strings -> integer / float
        - boolean strings -> bool
        - a single value -> one-element array
        Only top-level properties are inspected.
        """
        result = dict(data)
        
        schema_props = self.schema.get("properties", {})
        
        for key, value in data.items():
            prop_schema = schema_props.get(key, {})
            expected_type = prop_schema.get("type")
            
            if expected_type == "integer" and isinstance(value, str):
                try:
                    result[key] = int(value)
                except ValueError:
                    pass  # leave as-is so jsonschema reports a clear error
            
            elif expected_type == "number" and isinstance(value, str):
                try:
                    result[key] = float(value)
                except ValueError:
                    pass  # leave as-is so jsonschema reports a clear error
            
            elif expected_type == "boolean" and isinstance(value, str):
                if value.lower() in ("true", "yes", "1"):
                    result[key] = True
                elif value.lower() in ("false", "no", "0"):
                    result[key] = False
            
            elif expected_type == "array" and not isinstance(value, list):
                result[key] = [value]  # single value -> one-element array
        
        return result


class SkillInputError(Exception):
    """Skill input validation error."""
    def __init__(self, message: str, errors: list[str] = None):
        super().__init__(message)
        self.errors = errors or []
    
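    def to_response(self) -> dict:
        # Same shape as the standard error response in 31.4: a list under "errors"
        return {
            "status": "failed",
            "errors": [{
                "code": "INVALID_INPUT",
                "message": str(self),
                "recoverable": True,
                "details": {"validation_errors": self.errors}
            }]
        }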


# Usage example
validator = SkillInputValidator(NEWS_DIGEST_INPUT_SCHEMA)

# Valid input
valid_input = {"topics": ["AI regulation"], "max_articles_per_topic": 3}
is_valid, errors = validator.validate(valid_input)
# is_valid = True, errors = []

# A common LLM mistake (integer written as a string)
llm_typo_input = {"topics": ["AI news"], "max_articles_per_topic": "5"}
coerced = validator.coerce_and_validate(llm_typo_input)
# coerced["max_articles_per_topic"] == 5 (int)

# Missing required parameter
missing_required = {"time_range": "today"}
is_valid, errors = validator.validate(missing_required)
# is_valid = False
# errors = ["'topics' is a required property"]

31.4 Standard Error Response Format

Standard Error Response Structure

from enum import Enum
from dataclasses import dataclass
from typing import Optional, Any

class ErrorCode(str, Enum):
    # Input errors
    INVALID_INPUT = "INVALID_INPUT"
    MISSING_REQUIRED_PARAM = "MISSING_REQUIRED_PARAM"
    PARAM_OUT_OF_RANGE = "PARAM_OUT_OF_RANGE"
    
    # Tool errors
    TOOL_NOT_AVAILABLE = "TOOL_NOT_AVAILABLE"
    TOOL_EXECUTION_FAILED = "TOOL_EXECUTION_FAILED"
    TOOL_TIMEOUT = "TOOL_TIMEOUT"
    
    # External service errors
    SEARCH_API_ERROR = "SEARCH_API_ERROR"
    FETCH_FAILED = "FETCH_FAILED"
    RATE_LIMIT_EXCEEDED = "RATE_LIMIT_EXCEEDED"
    
    # Skill-internal errors
    INTERNAL_ERROR = "INTERNAL_ERROR"
    PARTIAL_SUCCESS = "PARTIAL_SUCCESS"

@dataclass
class SkillError:
    code: ErrorCode
    message: str
    recoverable: bool
    details: Optional[dict] = None
    suggested_action: Optional[str] = None

@dataclass 
class SkillResponse:
    status: str  # "success" | "partial_success" | "failed"
    data: Optional[Any] = None
    errors: Optional[list[SkillError]] = None
    metadata: Optional[dict] = None
    
    def to_dict(self) -> dict:
        result = {"status": self.status}
        if self.data is not None:
            result["data"] = self.data
        if self.errors:
            result["errors"] = [
                {
                    "code": e.code.value,
                    "message": e.message,
                    "recoverable": e.recoverable,
                    **({"details": e.details} if e.details else {}),
                    **({"suggested_action": e.suggested_action} if e.suggested_action else {})
                }
                for e in self.errors
            ]
        if self.metadata:
            result["metadata"] = self.metadata
        return result

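class RateLimitError(Exception):
    """Assumed exception type for search API rate limiting (not defined in the original text)."""
    def __init__(self, message: str, retry_after: int):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait before retrying

# Error response example (SkillInputError is the class from section 31.3)
def handle_skill_error(exception: Exception, context: dict) -> SkillResponse:
    """Convert an exception into a standard error response."""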
    
    if isinstance(exception, SkillInputError):
        return SkillResponse(
            status="failed",
            errors=[SkillError(
                code=ErrorCode.INVALID_INPUT,
                message="Input parameters are invalid",
                recoverable=True,
                details={"validation_errors": exception.errors},
                suggested_action="Please check the parameter types and values"
            )]
        )
    
    elif isinstance(exception, TimeoutError):
        return SkillResponse(
            status="failed",
            errors=[SkillError(
                code=ErrorCode.TOOL_TIMEOUT,
                message=f"Operation timed out after {context.get('timeout', '?')}s",
                recoverable=True,
                suggested_action="Try again or reduce the number of topics"
            )]
        )
    
    elif isinstance(exception, RateLimitError):
        return SkillResponse(
            status="failed",
            errors=[SkillError(
                code=ErrorCode.RATE_LIMIT_EXCEEDED,
                message="Search API rate limit exceeded",
                recoverable=True,
                details={"retry_after_seconds": exception.retry_after},
                suggested_action=f"Please wait {exception.retry_after}s before retrying"
            )]
        )
    
    else:
        return SkillResponse(
            status="failed",
            errors=[SkillError(
                code=ErrorCode.INTERNAL_ERROR,
                message="An unexpected error occurred",
                recoverable=False,
                details={"exception_type": type(exception).__name__}
            )]
        )
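
The `recoverable` flag and the `retry_after_seconds` detail only pay off if callers act on them. The sketch below shows one way a caller might decide whether to retry, working on the dictionary form of a response. The backoff policy and the function name `plan_retry` are assumptions for illustration. `INVALID_INPUT` is excluded from automatic retries because resending the same parameters cannot succeed.

```python
from typing import Optional


def plan_retry(response: dict, attempt: int, max_attempts: int = 3) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the caller should stop.

    attempt is the number of attempts already made (1 for the first call).
    """
    if response.get("status") == "success" or attempt >= max_attempts:
        return None
    errors = response.get("errors", [])
    if not errors:
        return None
    for error in errors:
        if not error.get("recoverable", False):
            return None  # at least one error cannot be fixed by retrying
        if error.get("code") == "INVALID_INPUT":
            return None  # the caller must change the parameters first
    delay = 2.0 ** attempt  # assumed exponential backoff: 2s, 4s, ...
    for error in errors:
        retry_after = (error.get("details") or {}).get("retry_after_seconds")
        if retry_after is not None:
            delay = max(delay, float(retry_after))  # honor the server's hint
    return delay


rate_limited = {
    "status": "failed",
    "errors": [{
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Search API rate limit exceeded",
        "recoverable": True,
        "details": {"retry_after_seconds": 30},
    }],
}
print(plan_retry(rate_limited, attempt=1))
```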

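31.5 Version Compatibility Constraints

Semantic Versioning (SemVer) Rules

Version format: MAJOR.MINOR.PATCH

MAJOR: incompatible API changes
  - Removing or renaming a parameter, or adding a new required one
  - Removing or changing a required field in the output schema
  - Changing tool dependencies (removing or replacing a tool)
  Example: 1.0.0 → 2.0.0

MINOR: backward-compatible new functionality
  - Adding a new optional parameter (with a sensible default)
  - Adding a new optional field to the output
  - Adding a new enum value to an input parameter (a new value in an output enum can break consumers that match exhaustively)
  Example: 1.0.0 → 1.1.0

PATCH: backward-compatible bug fixes
  - Fixing parameter validation logic
  - Improving error messages
  - Performance optimizations
  Example: 1.0.0 → 1.0.1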
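
Part of these rules can be checked mechanically by comparing two versions of an input schema. The sketch below is a deliberately narrow illustration: it looks only at top-level `properties`, `required`, `type`, and `enum`, and the function name `classify_input_change` is hypothetical. A real check would also need to cover nested schemas, numeric ranges, and the output side.

```python
def classify_input_change(old: dict, new: dict) -> str:
    """Classify the change between two input schemas as 'major', 'minor', or 'patch'."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))

    # Breaking: a parameter disappeared or a parameter became required
    if set(old_props) - set(new_props):
        return "major"
    if new_required - old_required:
        return "major"
    for name, old_prop in old_props.items():
        new_prop = new_props[name]
        # Breaking: the type changed
        if old_prop.get("type") != new_prop.get("type"):
            return "major"
        # Breaking: the set of accepted values shrank
        old_enum, new_enum = old_prop.get("enum"), new_prop.get("enum")
        if new_enum is not None and (old_enum is None or set(old_enum) - set(new_enum)):
            return "major"

    # Compatible additions: new optional parameters or new enum values
    if set(new_props) - set(old_props):
        return "minor"
    for name, old_prop in old_props.items():
        old_enum = old_prop.get("enum")
        if old_enum is not None and set(new_props[name].get("enum", old_enum)) - set(old_enum):
            return "minor"
    return "patch"


V1 = {
    "required": ["topic"],
    "properties": {
        "topic": {"type": "string"},
        "time_range": {"type": "string", "enum": ["today", "24h"]},
    },
}
V1_PLUS_OPTIONAL = {
    "required": ["topic"],
    "properties": {**V1["properties"], "max_articles": {"type": "integer", "default": 5}},
}
V2_ARRAY_TOPIC = {
    "required": ["topic"],
    "properties": {**V1["properties"], "topic": {"type": "array"}},
}

print(classify_input_change(V1, V1_PLUS_OPTIONAL))
print(classify_input_change(V1, V2_ARRAY_TOPIC))
print(classify_input_change(V1, V1))
```
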

# Version compatibility check
from packaging.version import Version
from packaging.specifiers import SpecifierSet

def check_skill_compatibility(
    skill_hermes_requirement: str,  # e.g. ">=3.0.0,<4.0.0"
    current_hermes_version: str     # e.g. "3.1.2"
) -> tuple[bool, str]:
    """Check whether a Skill is compatible with the current Hermes version."""
    spec = SpecifierSet(skill_hermes_requirement)
    current = Version(current_hermes_version)
    
    if current in spec:
        return True, ""
    else:
        return False, (
            f"Skill requires Hermes {skill_hermes_requirement}, "
            f"but current version is {current_hermes_version}"
        )

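# Versioning strategy for evolving the input schema
class SkillInputSchemaV2:
    """
    Schema migration example:
    v1.0.0: topics is a string (single topic)
    v2.0.0: topics is an array (multiple topics; breaking change)
    
    Migration: detect the old format and upgrade it automatically
    """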
    
    @staticmethod
    def migrate_from_v1(v1_input: dict) -> dict:
        """Migrate v1-format input to the v2 format."""
        v2_input = dict(v1_input)
        
        # topics is a string in v1 and an array in v2
        if isinstance(v1_input.get("topics"), str):
            v2_input["topics"] = [v1_input["topics"]]
        
        return v2_input

31.6 A Practical Skill API Contract Example (JSON Schema)

Complete API Contract Document

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Daily News Digest Skill API Contract",
  "description": "Complete API contract for the Daily News Digest skill v1.0.0",
  "version": "1.0.0",
  
  "input": {
    "$ref": "#/definitions/DailyNewsDigestInput"
  },
  
  "output": {
    "oneOf": [
      {"$ref": "#/definitions/SuccessResponse"},
      {"$ref": "#/definitions/ErrorResponse"}
    ]
  },
  
  "definitions": {
    "DailyNewsDigestInput": {
      "type": "object",
      "required": ["topics"],
      "additionalProperties": false,
      "properties": {
        "topics": {
          "type": "array",
          "items": {"type": "string", "minLength": 2, "maxLength": 100},
          "minItems": 1,
          "maxItems": 5
        },
        "time_range": {
          "type": "string",
          "enum": ["today", "24h", "this_week", "this_month"],
          "default": "today"
        },
        "max_articles_per_topic": {
          "type": "integer",
          "minimum": 1,
          "maximum": 10,
          "default": 5
        },
        "output_language": {
          "type": "string",
          "default": "auto",
          "pattern": "^(auto|[a-z]{2})$"
        },
        "output_format": {
          "type": "string",
          "enum": ["prose", "bullets", "structured", "brief"],
          "default": "structured"
        },
        "save_to_file": {
          "type": "boolean",
          "default": false
        },
        "file_path": {
          "type": "string",
          "pattern": "^[^<>:\"\\|?*]+\\.md$"
        }
      }
    },
    
    "SuccessResponse": {
      "type": "object",
      "required": ["status", "generated_at", "topics_covered"],
      "properties": {
        "status": {"const": "success"},
        "generated_at": {"type": "string", "format": "date-time"},
        "topics_covered": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["topic", "article_count", "articles"],
            "properties": {
              "topic": {"type": "string"},
              "article_count": {"type": "integer"},
              "summary": {"type": "string"},
              "articles": {
                "type": "array",
                "items": {
                  "type": "object",
                  "required": ["title", "source", "url"],
                  "properties": {
                    "title": {"type": "string"},
                    "source": {"type": "string"},
                    "url": {"type": "string", "format": "uri"},
                    "published_at": {"type": "string"},
                    "key_points": {
                      "type": "array",
                      "items": {"type": "string"}
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    
    "ErrorResponse": {
      "type": "object",
      "required": ["status", "errors"],
      "properties": {
        "status": {"enum": ["failed", "partial_success"]},
        "errors": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["code", "message", "recoverable"],
            "properties": {
              "code": {"type": "string"},
              "message": {"type": "string"},
              "recoverable": {"type": "boolean"},
              "suggested_action": {"type": "string"}
            }
          }
        }
      }
    }
  }
}

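31.7 Summary

Input/output contracts are the foundation of a Skill's quality. An input schema that requires only what is necessary, constrains types precisely, documents each parameter, and rejects undeclared ones keeps calls predictable; a structured output schema and a standard error format let callers handle results reliably; and semantic versioning with explicit migration lets the interface evolve without silently breaking existing users.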


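Questions for Reflection

  1. additionalProperties: false rejects every undeclared parameter. In some scenarios, though, a Skill needs to pass unknown parameters through to an underlying tool. How do you balance strict validation against flexible extension?

  2. This chapter suggests "friendly coercion" of LLM type mistakes (for example, the string "5" → the integer 5). Could this mask real errors? Where is the boundary between friendly correction and error tolerance?

  3. Design a version migration strategy: when your Skill upgrades from v1 (topics is a string) to v2 (topics is an array), how are users already running v1 in production affected? How would you handle this breaking change gracefully on ClawHub?

  4. The description field in JSON Schema matters to both humans and LLMs, but their needs may differ: humans need complete documentation, while LLMs need concise, accurate prompts. How can a single description serve both?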
