Chapter 30

Developing Your First Skill: From Zero to Release

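Theory eventually has to be put into practice. This chapter walks through a complete hands-on exercise: building a "Daily News Digest" Skill from scratch, covering project initialization, writing SKILL.md, tool-call integration, local testing, debugging techniques, and publishing to ClawHub. By the end of the chapter you will have a Skill that actually runs and can be shared.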

30.1 Project Planning: A Daily News Digest Skill

Feature Design

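The Skill we are going to build: Daily News Digest

Description:
  Fetches news on specified topics every day, generates a structured digest,
  supports multilingual output, and can save the result as a Markdown file.

Core capabilities:
  1. Fetch the day's news from multiple sources
  2. Classify and rank items by topic and importance
  3. Generate a structured report with summaries, sources, and timestamps
  4. Optional: save the report to a local file

Tools used:
  - web_search (search for news)
  - fetch_url (fetch article content)
  - write_file (optional, save the report)
  - get_current_time (optional, get the current time)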

Target File Structure

daily-news-digest/
├── SKILL.md
├── skill.json
├── tools/
│   ├── __init__.py
│   ├── news_fetcher.py
│   └── digest_formatter.py
├── prompts/
│   └── system_fragment.md
├── tests/
│   ├── test_news_fetcher.py
│   ├── test_formatter.py
│   └── fixtures/
│       └── sample_news_response.json
├── requirements.txt
└── README.md

30.2 Project Initialization

Step 1: Create the Skill skeleton with the Hermes CLI

# Install the Hermes CLI (if it is not installed yet)
pip install hermes-cli

# Create a new Skill project
hermes skill new daily-news-digest

# Output:
# ✓ Created directory: daily-news-digest/
# ✓ Generated: SKILL.md (template)
# ✓ Generated: skill.json (template)
# ✓ Generated: tools/__init__.py
# ✓ Generated: tests/test_basic.py
# ✓ Generated: requirements.txt
# 
# Next steps:
#   cd daily-news-digest
#   Edit SKILL.md with your skill description
#   Run: hermes skill validate

cd daily-news-digest

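Step 2: Set up a Python virtual environment

# Create the virtual environment
python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# or: .venv\Scripts\activate  # Windows

# Install the base dependencies (pytest-cov provides the --cov option used later)
pip install hermes-sdk anthropic pytest pytest-cov httpx python-dotenv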
pip freeze > requirements.txt

Step 3: Configure environment variables

# Create the .env file
cat > .env << 'EOF'
ANTHROPIC_API_KEY=your_api_key_here
SEARCH_API_KEY=your_search_api_key_here
HERMES_MODEL=claude-3-5-sonnet-20241022
EOF
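
The variables above still have to reach the tool code at runtime. The following is a minimal sketch of one way to read them, assuming the `.env` file has already been loaded into the process environment (for example by calling `load_dotenv()` from python-dotenv at startup). The `Settings` class and the `load_settings` function are hypothetical names introduced for illustration; they are not part of the Hermes SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Values read from the three variables defined in .env."""
    anthropic_api_key: str
    search_api_key: str
    model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing early on missing keys."""
    required = ("ANTHROPIC_API_KEY", "SEARCH_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        search_api_key=env["SEARCH_API_KEY"],
        # Fall back to the model named in the .env example when HERMES_MODEL is unset
        model=env.get("HERMES_MODEL", "claude-3-5-sonnet-20241022"),
    )
```

Passing the mapping as a parameter keeps the function easy to test with a plain dict, and failing at startup tends to be easier to diagnose than a 401 from the search API halfway through a digest.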

30.3 Writing SKILL.md

This is the most important file; it directly determines how Hermes understands and uses your Skill:

---
id: daily-news-digest
version: 1.0.0
name: Daily News Digest
description: >
  Fetches today's top news on specified topics, generates structured 
  summaries with sources, and optionally saves to a Markdown file.
author: your-username
license: MIT
tags:
  - news
  - research
  - summarization
  - daily-digest
  - journalism
hermes_version: ">=3.0.0"
tools_required:
  - web_search
  - fetch_url
tools_optional:
  - write_file
  - get_current_time
language: en
created: 2024-11-20
updated: 2024-11-20
---

# Daily News Digest

## Overview

This skill enables Hermes to create a comprehensive daily news digest on any topic
or set of topics. It searches multiple sources, extracts key information, and
synthesizes a structured report with proper attribution.

**When to use this skill:**
- User says "give me today's news on X"
- User wants "a summary of what happened with X today/this week"
- User requests "daily briefing", "news digest", or "news roundup"
- User wants to stay updated on a specific topic without browsing

**Do NOT use this skill when:**
- User asks a specific factual question (use direct search instead)
- User wants analysis/opinion rather than news reporting
- User asks about events older than 2 weeks (use general research skill)
- User is in an offline environment

## Usage

### Invocation Phrases

Hermes will automatically invoke this skill for phrases like:
- "What's in the news about [topic] today?"
- "Give me a daily digest of [topic]"
- "Morning briefing on [topic]"
- "Catch me up on [topic]"

### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `topics` | array[string] | Yes | — | List of news topics (1–5) |
| `time_range` | string | No | "today" | "today" / "this_week" / "24h" |
| `max_articles` | integer | No | 5 | Articles per topic (1–10) |
| `output_language` | string | No | auto | Output language, auto-detects from user |
| `save_to_file` | boolean | No | false | Save digest to Markdown file |
| `file_path` | string | No | auto | File path if save_to_file is true |

### Execution Process

Follow this exact process when this skill is invoked:

**Step 1: Topic Analysis**
- Extract the main topic(s) from user input
- Formulate 2 search queries per topic (one broad, one specific)
- Note the requested time range

**Step 2: News Search**
For each topic, call `web_search`:
- Query 1: "[topic] news today [current_year]"
- Query 2: "[topic] latest developments [current_month] [current_year]"

**Step 3: Article Selection**
- Select the top 3–5 most recent and authoritative results
- Prioritize: major news outlets > industry publications > blogs
- Skip: paywalled content (unless fetch works), social media, opinion pieces

**Step 4: Content Extraction**
For each selected article, call `fetch_url` to get full content.
If fetch fails, use the search result snippet.

**Step 5: Synthesis**
Create a structured digest with:
- Executive summary (2–3 sentences per topic)
- Key developments (bullet points)
- Notable quotes (if any)
- Sources with publication date

**Step 6: Output**
Format the digest in the user's language.
If `save_to_file` is true, call `write_file` with path "news_digest_[date].md".

## Examples

### Example 1: Single Topic Digest

**User input:**

Give me today's news about AI regulation in Europe.


**Search calls:**
```json
{"tool": "web_search", "input": {"query": "AI regulation Europe news today 2024"}}
{"tool": "web_search", "input": {"query": "EU AI Act latest developments November 2024"}}
```

**Expected output structure:**

# Daily News Digest: AI Regulation in Europe
*Generated: November 20, 2024*

## Executive Summary
The European Parliament is accelerating implementation of the EU AI Act, 
with new compliance deadlines announced for high-risk AI systems...

## Key Developments

### 1. EU AI Act Implementation Timeline
**Source:** Reuters | Nov 20, 2024
The European Commission confirmed that...

### 2. [Next article]
...

## Sources
1. [Reuters] EU AI Act Update - https://... - Nov 20, 2024
2. [FT] European AI Regulation - https://... - Nov 19, 2024

### Example 2: Multi-Topic Digest with File Save

**User input:**

Morning briefing on: 1) crypto markets, 2) climate policy. Save it.

**Process:** Run the two topic searches in parallel, combine the results into a single digest, then call `write_file` with path "morning_digest_2024-11-20.md".

## Dependencies

### Required Tools

- **web_search**: Must support news-focused queries with date filters
  - Recommended: Brave Search API, Bing News API, or SerpAPI News
  - Minimum: Any search API returning titles, snippets, and URLs
- **fetch_url**: Must extract main content from news article URLs
  - Should handle JavaScript-rendered pages (or degrade gracefully)
  - Timeout: 10 seconds per URL

### Optional Tools

- **write_file**: For saving digest to local filesystem
- **get_current_time**: For accurate date context in search queries

## Limitations

- Cannot access paywalled content (NYT, WSJ premium, etc.)
- Breaking news from the last 30 minutes may not appear in search results
- Accuracy depends entirely on the quality of source material
- For financial news, always remind users to verify before trading decisions
- Rate limits on search API may restrict number of topics per call

## Configuration

# hermes.yaml
skills:
  - id: daily-news-digest
    version: "^1.0"
    config:
      default_max_articles: 5
      trusted_sources:
        - reuters.com
        - bbc.com
        - apnews.com
        - techcrunch.com
      exclude_domains:
        - reddit.com

## Changelog

### 1.0.0 (2024-11-20)

- Initial release
- Support for 1–5 topics per digest
- Multi-language output
- Optional file saving
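
Outside SKILL.md itself, the parameter table and Step 2 of the execution process can be mirrored in code so that invalid input is rejected before any search call is made. The sketch below is only an illustration under stated assumptions: `DigestParams` and `build_queries` are hypothetical names, the limits are copied from the parameter table, and the query templates are copied from Step 2.

```python
from dataclasses import dataclass
from datetime import date

# Values listed for time_range in the SKILL.md parameter table
VALID_TIME_RANGES = {"today", "24h", "this_week"}


@dataclass
class DigestParams:
    """Hypothetical container mirroring the SKILL.md parameter table."""
    topics: list[str]
    time_range: str = "today"
    max_articles: int = 5
    save_to_file: bool = False

    def __post_init__(self) -> None:
        # Enforce the ranges documented in the table: 1-5 topics, 1-10 articles
        if not 1 <= len(self.topics) <= 5:
            raise ValueError("topics must contain 1 to 5 entries")
        if self.time_range not in VALID_TIME_RANGES:
            raise ValueError(f"unsupported time_range: {self.time_range!r}")
        if not 1 <= self.max_articles <= 10:
            raise ValueError("max_articles must be between 1 and 10")


def build_queries(topic: str, today: date) -> list[str]:
    """Return the broad query and the specific query described in Step 2."""
    return [
        f"{topic} news today {today.year}",
        f"{topic} latest developments {today:%B} {today.year}",
    ]
```

In the real Skill the model composes these queries itself by following SKILL.md; a helper like this is mainly useful in tests, where it gives the expected query strings a single source of truth.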


---

## 30.4 Implementing the Tool-Call Integration

### tools/news_fetcher.py

```python
"""News fetching tools: wraps the search and content-extraction logic"""
import httpx
import json
from datetime import datetime, timedelta
from typing import Optional
from dataclasses import dataclass

@dataclass
class NewsArticle:
    title: str
    url: str
    snippet: str
    source: str
    published_date: Optional[str]
    full_content: Optional[str] = None

class NewsFetcher:
    """Wraps news search and content extraction as tool implementations for the Hermes Agent"""
    
    def __init__(self, search_api_key: str, timeout: int = 10):
        self.api_key = search_api_key
        self.timeout = timeout
        self.client = httpx.Client(timeout=timeout)
    
    def search_news(self, query: str, time_range: str = "today") -> list[NewsArticle]:
        """
        Fetch news through a search API.
        This function corresponds to the web_search tool call in Hermes.
        """
        # Build the time-range filter
        date_restrict = self._build_date_filter(time_range)
        
        # Call the search API (Brave Search is used as the example)
        response = self.client.get(
            "https://api.search.brave.com/res/v1/news/search",
            params={
                "q": query,
                "count": 10,
                "freshness": date_restrict,
            },
            headers={
                "Accept": "application/json",
                "X-Subscription-Token": self.api_key
            }
        )
        response.raise_for_status()
        data = response.json()
        
        articles = []
        for result in data.get("results", []):
            articles.append(NewsArticle(
                title=result.get("title", ""),
                url=result.get("url", ""),
                snippet=result.get("description", ""),
                source=self._extract_domain(result.get("url", "")),
                published_date=result.get("age", "Unknown")
            ))
        
        return articles
    
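    def fetch_article_content(self, url: str) -> Optional[str]:
        """
        Fetch the full text of an article.
        Corresponds to the fetch_url tool call in Hermes.
        """
        try:
            response = self.client.get(
                url,
                headers={"User-Agent": "HermesNewsDigest/1.0"}
            )
            response.raise_for_status()
            
            # Naive text extraction (use BeautifulSoup or trafilatura in production)
            from html.parser import HTMLParser
            
            class TextExtractor(HTMLParser):
                def __init__(self):
                    super().__init__()
                    self.text_parts = []
                    self.skip_depth = 0  # > 0 while inside <script> or <style>
                
                def handle_starttag(self, tag, attrs):
                    if tag in ("script", "style"):
                        self.skip_depth += 1
                
                def handle_endtag(self, tag):
                    if tag in ("script", "style") and self.skip_depth:
                        self.skip_depth -= 1
                
                def handle_data(self, data):
                    # Keep visible text only; script and style bodies are noise
                    if not self.skip_depth and data.strip():
                        self.text_parts.append(data.strip())
            
            extractor = TextExtractor()
            extractor.feed(response.text)
            full_text = " ".join(extractor.text_parts)
            
            # Keep the first 2000 characters to stay within the token budget
            return full_text[:2000] if full_text else None
            
        except Exception as e:
            # Any network or HTTP error degrades to None; the caller falls back to the snippet
            print(f"Failed to fetch {url}: {e}")
            return None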
    
    def _build_date_filter(self, time_range: str) -> str:
        """Map a time range to the API's freshness parameter"""
        mapping = {
            "today": "pd",      # Past day
            "24h": "pd",
            "this_week": "pw",  # Past week
            "this_month": "pm"  # Past month
        }
        return mapping.get(time_range, "pd")
    
    def _extract_domain(self, url: str) -> str:
        """Extract the domain from a URL to use as the source name"""
        try:
            from urllib.parse import urlparse
            # removeprefix strips only a leading "www.", not one in the middle of the host
            return urlparse(url).netloc.removeprefix("www.")
        except ValueError:
            return "Unknown"
```
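
Step 3 of SKILL.md (article selection) and the `trusted_sources` / `exclude_domains` settings in the configuration example are left to the model and to Hermes. For offline tests it can help to have a deterministic stand-in. The sketch below is one possible approach, not the chapter's implementation: `Candidate` and `select_articles` are hypothetical names, and the rule used (drop duplicates and excluded domains, then move trusted domains to the front while keeping search order otherwise) is an assumption.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Candidate:
    """Minimal stand-in for a search result; only the fields needed here."""
    title: str
    url: str


def _domain(url: str) -> str:
    """Lower-cased host name without a leading 'www.'."""
    return urlparse(url).netloc.lower().removeprefix("www.")


def select_articles(
    candidates: list[Candidate],
    trusted: set[str],
    excluded: set[str],
    limit: int = 5,
) -> list[Candidate]:
    """Drop duplicate URLs and excluded domains, then put trusted domains first."""
    seen: set[str] = set()
    kept: list[Candidate] = []
    for item in candidates:
        if item.url in seen or _domain(item.url) in excluded:
            continue
        seen.add(item.url)
        kept.append(item)
    # sort() is stable, so items keep their search order within each group;
    # False sorts before True, which places trusted domains at the front
    kept.sort(key=lambda item: _domain(item.url) not in trusted)
    return kept[:limit]
```

Because each topic is searched with two queries, the same URL often appears twice; removing duplicates before ranking keeps the `max_articles` budget for distinct articles.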

tools/digest_formatter.py

"""News digest formatting: turns a raw article list into a structured report"""
from datetime import datetime
from typing import Optional
from .news_fetcher import NewsArticle

class DigestFormatter:
    """Formats the news digest output"""
    
    def format_digest(
        self,
        topic: str,
        articles: list[NewsArticle],
        language: str = "en",
        max_articles: int = 5
    ) -> str:
        """Generate the complete news digest as Markdown"""
        
        selected = articles[:max_articles]
        today = datetime.now().strftime("%B %d, %Y")
        
        sections = [
            f"# Daily News Digest: {topic}",
            f"*Generated: {today}*",
            "",
            "## Executive Summary",
            self._generate_executive_summary(selected),
            "",
            "## Key Developments",
        ]
        
        for i, article in enumerate(selected, 1):
            sections.extend([
                f"### {i}. {article.title}",
                f"**Source:** {article.source} | {article.published_date}",
                "",
                article.full_content or article.snippet,
                "",
            ])
        
        sections.extend([
            "## Sources",
        ])
        
        for i, article in enumerate(selected, 1):
            sections.append(f"{i}. [{article.source}] {article.title} - {article.url} - {article.published_date}")
        
        return "\n".join(sections)
    
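    def _generate_executive_summary(self, articles: list[NewsArticle]) -> str:
        """Build an executive summary from the article list (in the real Skill the model does this step)"""
        # Note: during real Skill execution this step is performed by Hermes/the LLM
        # This implementation exists for testing only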
        if not articles:
            return "No articles found for this topic today."
        
        sources = [a.source for a in articles[:3]]
        return f"Coverage from {', '.join(sources)} and other sources. See detailed sections below."
    
    def to_markdown_file(self, content: str, filepath: str) -> bool:
        """Save to a Markdown file"""
        try:
            with open(filepath, 'w', encoding='utf-8') as f:
                f.write(content)
            return True
        except Exception as e:
            print(f"Failed to save file: {e}")
            return False
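
The formatter above writes to whatever path it is given. Step 6 of SKILL.md names the default file "news_digest_[date].md", and the `file_path` parameter may come from user input, so it can be worth resolving the path in one place before writing. The following is a minimal sketch under those assumptions; `resolve_digest_path` and its `base_dir` argument are hypothetical and not part of the chapter's code.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def resolve_digest_path(file_path: Optional[str], day: date, base_dir: str = ".") -> Path:
    """Return the path to write the digest to, confined to base_dir."""
    base = Path(base_dir).resolve()
    # Default name follows Step 6: news_digest_[date].md, with an ISO date
    name = file_path or f"news_digest_{day.isoformat()}.md"
    target = (base / name).resolve()
    # Reject absolute paths and "../" segments that would leave base_dir
    if not target.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {file_path!r}")
    return target
```

The result can then be passed to `to_markdown_file`, for example `formatter.to_markdown_file(content, str(resolve_digest_path(None, date.today())))`.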

30.5 Local Testing

tests/test_news_fetcher.py

"""Unit tests for the news fetcher"""
import pytest
from unittest.mock import Mock, patch, MagicMock
import json
from pathlib import Path
from tools.news_fetcher import NewsFetcher, NewsArticle

# Load test fixture data
FIXTURES_DIR = Path(__file__).parent / "fixtures"

@pytest.fixture
def sample_search_response():
    with open(FIXTURES_DIR / "sample_news_response.json") as f:
        return json.load(f)

@pytest.fixture
def fetcher():
    return NewsFetcher(search_api_key="test-key")

class TestNewsFetcher:
    
    def test_search_returns_articles(self, fetcher, sample_search_response):
        """Search returns a correctly shaped list of articles"""
        with patch.object(fetcher.client, 'get') as mock_get:
            mock_response = Mock()
            mock_response.json.return_value = sample_search_response
            mock_response.raise_for_status = Mock()
            mock_get.return_value = mock_response
            
            articles = fetcher.search_news("AI regulation Europe")
            
            assert len(articles) > 0
            assert isinstance(articles[0], NewsArticle)
            assert articles[0].title != ""
            assert articles[0].url.startswith("http")
    
    def test_empty_search_returns_empty_list(self, fetcher):
        """Empty search results are handled"""
        with patch.object(fetcher.client, 'get') as mock_get:
            mock_response = Mock()
            mock_response.json.return_value = {"results": []}
            mock_response.raise_for_status = Mock()
            mock_get.return_value = mock_response
            
            articles = fetcher.search_news("extremely_unlikely_nonexistent_topic_xyz")
            assert articles == []
    
    def test_fetch_article_handles_timeout(self, fetcher):
        """Article fetch timeouts degrade gracefully"""
        import httpx
        with patch.object(fetcher.client, 'get', side_effect=httpx.TimeoutException("timeout")):
            content = fetcher.fetch_article_content("https://example.com/article")
            assert content is None  # on timeout, return None instead of raising
    
    @pytest.mark.parametrize("time_range,expected", [
        ("today", "pd"),
        ("24h", "pd"),
        ("this_week", "pw"),
        ("unknown_range", "pd"),  # unknown ranges fall back to "today"
    ])
    def test_date_filter_mapping(self, fetcher, time_range, expected):
        """Time-range parameter mapping"""
        result = fetcher._build_date_filter(time_range)
        assert result == expected

# Contents of fixtures/sample_news_response.json (shown here as a Python dict; store it as JSON)
SAMPLE_FIXTURE = {
    "results": [
        {
            "title": "EU AI Act: New Compliance Deadlines Announced",
            "url": "https://reuters.com/technology/eu-ai-act-2024",
            "description": "The European Commission today announced...",
            "age": "2 hours ago"
        },
        {
            "title": "European Parliament Debates AI Regulation",
            "url": "https://bbc.com/news/eu-parliament-ai",
            "description": "Members of the European Parliament...",
            "age": "5 hours ago"
        }
    ]
}
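
The `sample_search_response` fixture reads a JSON file, while the data above is written as a Python dict. One way to bridge the two is a small script that serializes the dict into the fixtures directory. This is a sketch only: `write_fixture` is a hypothetical helper, and the dict is shortened to a single entry for brevity.

```python
import json
from pathlib import Path

# Shortened copy of the sample data: one entry is enough to show the shape
SAMPLE_FIXTURE = {
    "results": [
        {
            "title": "EU AI Act: New Compliance Deadlines Announced",
            "url": "https://reuters.com/technology/eu-ai-act-2024",
            "description": "The European Commission today announced...",
            "age": "2 hours ago",
        }
    ]
}


def write_fixture(data: dict, fixtures_dir: Path) -> Path:
    """Serialize data to fixtures_dir/sample_news_response.json and return the path."""
    fixtures_dir.mkdir(parents=True, exist_ok=True)
    target = fixtures_dir / "sample_news_response.json"
    target.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    return target
```

Run once from the project root, for example `write_fixture(SAMPLE_FIXTURE, Path("tests/fixtures"))`, and commit the generated file so the tests do not depend on the script.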

Running the Tests

# Run all tests
pytest tests/ -v

# Run a specific test file
pytest tests/test_news_fetcher.py -v

# With a coverage report
pytest tests/ --cov=tools --cov-report=html

# Validate the Skill format
hermes skill validate
# Output:
# ✓ SKILL.md format valid
# ✓ skill.json valid
# ✓ Required tools declared: web_search, fetch_url
# ✓ Examples present (2 found)
# ✓ Version format valid: 1.0.0
# ✓ All checks passed!

30.6 Debugging Techniques

Technique 1: Use Hermes debug mode

# Enable verbose logging to watch Skill injection and tool calls
import hermes
import logging

logging.basicConfig(level=logging.DEBUG)

agent = hermes.Agent(
    skills=["./daily-news-digest"],  # load the local Skill
    debug=True,                       # enable debug mode
    trace_tokens=True                 # record token usage
)

# Sample debug-mode output:
# [DEBUG] Loading skill: daily-news-digest v1.0.0
# [DEBUG] Skill injected: 342 tokens added to system prompt
# [DEBUG] Tool called: web_search {"query": "AI regulation Europe news today 2024"}
# [DEBUG] Tool result: 2847 chars, 3 articles found
# [DEBUG] Token usage: input=8412, output=623, cache_hit=False

Technique 2: Mock tool calls

# Test Skill behavior without spending API quota
from hermes.testing import SkillTestHarness

harness = SkillTestHarness(skill_path="./daily-news-digest")

# Register mock tool responses
harness.mock_tool("web_search", returns=[
    {"title": "AI News Today", "url": "https://example.com", "snippet": "..."},
])
harness.mock_tool("fetch_url", returns="Full article content here...")

# Run the Skill
result = harness.run("Give me today's news about AI")
print(result)

# Assert the output contains the expected content
assert "AI" in result
assert "Sources" in result
assert harness.tool_call_count("web_search") >= 1

Technique 3: Step through the tool-call chain

# Use step-through mode to observe each decision
harness.run_with_steps(
    "Morning briefing on crypto and AI",
    on_tool_call=lambda tool, args: print(f"→ Calling {tool}: {args}"),
    on_tool_result=lambda tool, result: print(f"← {tool} returned: {str(result)[:100]}"),
    on_model_think=lambda text: print(f"💭 {text[:200]}")
)
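
The same call-by-call visibility can be approximated in plain Python when the Hermes test harness is not available, for instance while unit-testing the functions in `tools/` directly. The sketch below uses only the standard library; `traced` is a hypothetical decorator introduced here, and `fake_web_search` is a placeholder for a real tool function.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("skill.trace")


def traced(tool_name: str, calls: list) -> Callable:
    """Decorator that logs each call and records it in the given list."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            logger.debug("-> calling %s args=%r kwargs=%r", tool_name, args, kwargs)
            result = func(*args, **kwargs)
            # Record the call only after it succeeded
            calls.append({"tool": tool_name, "args": args, "kwargs": kwargs})
            # Truncate the result in the log, as the step-through example does
            logger.debug("<- %s returned: %s", tool_name, str(result)[:100])
            return result
        return wrapper
    return decorator


calls: list = []


@traced("web_search", calls)
def fake_web_search(query: str) -> list:
    """Placeholder tool that returns one canned result."""
    return [{"title": "AI News Today", "url": "https://example.com", "snippet": "..."}]
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, each call to `fake_web_search` prints an arrow pair similar to the step-through output, and `len(calls)` plays the role of a call counter in assertions.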

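30.7 Publishing to ClawHub

Pre-publish Checklist

# 1. Validate the Skill format
hermes skill validate
# Every check must pass ✓

# 2. Run the full test suite
pytest tests/ -v --tb=short
# Every test must pass ✓

# 3. Check the license file
ls LICENSE  # must exist

# 4. Check README.md
ls README.md  # must exist and include installation instructions

# 5. Confirm the version number
cat skill.json | python -m json.tool | grep version
# Confirm it is the intended version

# 6. Build the release package
hermes skill build
# Output: daily-news-digest-1.0.0.skill (ZIP format)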
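
The file-existence and version checks in this list are easy to forget when done by hand, so they can also be scripted. The sketch below is an illustration rather than a Hermes feature: `preflight` is a hypothetical function, and it assumes that `skill.json` has a top-level `version` field and that SKILL.md carries an unindented `version:` line in its front matter, as in the example earlier in this chapter.

```python
import json
import re
from pathlib import Path

REQUIRED_FILES = ("SKILL.md", "skill.json", "README.md", "LICENSE")


def preflight(skill_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (skill_dir / name).is_file()
    ]
    if problems:
        return problems

    manifest = json.loads((skill_dir / "skill.json").read_text(encoding="utf-8"))
    manifest_version = manifest.get("version")

    # Match an unindented "version:" line only, so "hermes_version:" and the
    # indented "version:" inside the hermes.yaml example are ignored
    skill_md = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    match = re.search(r"^version:\s*(\S+)\s*$", skill_md, re.MULTILINE)
    doc_version = match.group(1).strip("\"'") if match else None

    if manifest_version != doc_version:
        problems.append(
            f"version mismatch: skill.json={manifest_version!r}, SKILL.md={doc_version!r}"
        )
    return problems
```

A script like this complements `hermes skill validate` rather than replacing it; it only covers the manual items in the list above.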

Publishing Workflow

# Method 1: publish from the CLI (recommended)
hermes skill publish

# Follow the prompts:
# ClawHub username: your-username
# ClawHub API token: [enter token]
# Publishing: daily-news-digest v1.0.0
# ✓ Validating...
# ✓ Uploading...
# ✓ Published! View at: https://clawhub.io/skills/your-username/daily-news-digest

# Method 2: publish automatically from GitHub
# Configure it in GitHub Actions:
# .github/workflows/publish.yml

# .github/workflows/publish.yml
name: Publish to ClawHub

on:
  push:
    tags:
      - 'v*'

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      
      - name: Install Hermes CLI and project dependencies
        run: pip install hermes-cli -r requirements.txt
      
      - name: Validate Skill
        run: hermes skill validate
      
      - name: Run Tests
        run: pytest tests/ -v
      
      - name: Publish to ClawHub
        env:
          CLAWHUB_TOKEN: ${{ secrets.CLAWHUB_TOKEN }}
        run: hermes skill publish --token "$CLAWHUB_TOKEN"

Post-publish Verification

# Verify the Skill was published and can be installed
hermes skill install your-username/daily-news-digest
hermes skill info daily-news-digest

# Test it in a real Agent
hermes run --skill daily-news-digest "Today's AI news digest please"

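30.8 Summary

Through this chapter's end-to-end exercise, you have worked through the complete process of developing a Hermes Skill:

1. Planning: define the feature boundaries, the required tools, and the target users
2. SKILL.md: this is the heart of the Skill; spend the most effort making the "When to use" and "Execution Process" sections accurate
3. Tool implementation: encapsulate business logic in clear Python modules aligned with the tool-call interface
4. Testing: mocking tool calls lets tests run offline; aim for coverage above 80%
5. Debugging: use debug mode and step-through mode to observe actual execution
6. Publishing: publish to ClawHub through the CLI or automate it with CI/CD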

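Questions for Reflection

1. In the "Do NOT use" section of SKILL.md, we advise against using this Skill for news more than two weeks old. If a user asks about older news, what should Hermes do: report an error, or fall back to general search? How would you express that in SKILL.md?
2. The docstring of DigestFormatter._generate_executive_summary notes that in the real Skill this step is done by the LLM. Redesign this flow: in a Skill, which steps should be implemented in code and which should be left to the LLM? Where is the boundary?
3. If the news digest Skill is used by 1,000 users at once in production, the search API's rate limit becomes a bottleneck. How would you design a caching layer to relieve it? How long should entries be cached?
4. Design a "security audit" checklist for a Skill: which security issues must be checked and ruled out before publishing to ClawHub?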
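
As one possible starting point for the caching question, the sketch below shows an in-memory cache with a time-to-live, keyed by whatever identifies a search (for example the query string and time range). It is an assumption-laden illustration, not a recommended design: `TTLCache` is a hypothetical class, it is neither thread-safe nor shared across processes, and choosing the TTL is left open as part of the exercise.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store: dict[Hashable, tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, or call compute() and cache its result."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = compute()
        self._store[key] = (now, value)
        return value
```

A call such as `cache.get_or_compute(("AI regulation", "today"), lambda: fetcher.search_news("AI regulation"))` would then reach the search API at most once per TTL window for that key, however many users ask for the same topic.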
