Description

Daily tech news collection and distribution system. Automated methodology for collecting, curating, and distributing industry news via scheduled cron jobs. U...

Usage (SKILL.md)

Daily News Collector

Name: Daily News Collector
Author: lgy2020

Automated methodology for collecting, curating, and distributing industry news.

Architecture: Collect → Cache → Distribute

Separate collection (slow) from distribution (fast) using a file cache.

07:00  Collect news → Write to weekly file (slow, ~3-5 min)
08:00  Read file → Push to chat (instant, <10 sec)

Key insight: Collection and distribution happen the same morning. News is at most ~1 hour old when delivered, not 9+ hours.

Setup Steps

1. Create Weekly File

Create a markdown file for the current week: weekly-news-YYYY-WNN.md
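The file name can be derived from the date. The sketch below assumes that WNN means the ISO 8601 week number, which the source does not state, and `weeklyFileName` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds weekly-news-YYYY-WNN.md for a given date.
// Assumption: WNN is the ISO 8601 week number (weeks start on Monday,
// week 1 contains the first Thursday of the year), computed in UTC.
function weeklyFileName(date = new Date()) {
  // Work on a UTC copy so time-of-day does not affect the result.
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  // Move to the Thursday of the current ISO week; its year is the ISO year.
  const dayOfWeek = d.getUTCDay() || 7; // Sunday (0) becomes 7
  d.setUTCDate(d.getUTCDate() + 4 - dayOfWeek);
  const isoYear = d.getUTCFullYear();
  const yearStart = new Date(Date.UTC(isoYear, 0, 1));
  const week = Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
  return `weekly-news-${isoYear}-W${String(week).padStart(2, '0')}.md`;
}

console.log(weeklyFileName(new Date(Date.UTC(2026, 2, 16))));
```

Using the ISO year rather than the calendar year keeps the last days of December in the same file as the rest of their week.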

2. Collection Cron (daily, 07:00)

Schedule an isolated agentTurn cron job that:

Checks if today's report already exists in the weekly file (anti-duplication)
Searches news sources via two methods:
- Tavily API (~80% weight): Broad search across multiple keyword groups
- web_fetch (~20% weight): Deep crawl of core technical blogs
Selects top 10-15 stories, grouped by category
Writes formatted report to weekly file with today's date

3. Distribution Cron (daily, 08:00)

Schedule an isolated agentTurn cron job that:

Reads the weekly file
Finds today's report by date header
Pushes content to chat
If not found, notifies user instead of generating new content

Timing: 1 hour gap between collection and distribution ensures collection completes before push.
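The read-only lookup in the distribution step could be sketched as follows. It assumes each daily report starts with a `## YYYY.M.D` header (as in the Report Format section) and runs until the next `## ` header. `dateHeaderPrefix` and `findReport` are hypothetical names, and pushing to chat is left out because it depends on the host agent.

```javascript
// Hypothetical helpers for the read-only distribution step.
// Assumption: a report spans from its "## YYYY.M.D" header to the next "## " header.
function dateHeaderPrefix(date) {
  return `## ${date.getFullYear()}.${date.getMonth() + 1}.${date.getDate()}`;
}

function findReport(markdown, date) {
  const prefix = dateHeaderPrefix(date);
  const lines = markdown.split('\n');
  // The prefix must not be followed by a digit, so that "## 2026.3.1"
  // does not match "## 2026.3.15".
  const isHeader = (line) =>
    line.startsWith(prefix) && !/\d/.test(line.charAt(prefix.length));
  const start = lines.findIndex(isHeader);
  if (start === -1) return null; // caller notifies the user, never regenerates
  let end = lines.findIndex((line, i) => i > start && line.startsWith('## '));
  if (end === -1) end = lines.length;
  return lines.slice(start, end).join('\n').trim();
}
```

A `null` result maps to the "notify user" branch. The function never writes, which keeps the distribution job read-only.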

Search Strategy

Layer 1: AI Search API (Broad Discovery, ~80% weight)

Use Tavily API (or similar) with keyword groups tailored to your domain. Example for browser/AI news (7 groups, ~27 candidates):

Group 1: browser Chrome Firefox Safari Edge news 2026 (5 results, topic: news)
Group 2: AI machine learning LLM technology news March 2026 (5 results, topic: news)
Group 3: Local language keywords for regional coverage — e.g. 中国科技 AI 浏览器最新消息 (5 results, topic: news)
Group 4: Web standards W3C WHATWG V8 JavaScript new features 2026 (3 results, topic: news)
Group 5: Platform-specific keywords — e.g. Android Chrome mobile browser development 2026 (3 results, topic: news)
Group 6: Chinese AI media keywords — e.g. APPSO 机器之心量子位 AI 人工智能最新 (3 results, topic: news)
Group 7: Chinese tech industry keywords — e.g. 虎嗅雷科技科技行业消费电子 (3 results, topic: news)

See references/tavily-setup.md for Tavily API setup.
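One way to keep the groups maintainable is a small config array. The sketch below copies the seven example queries verbatim. The `keywordGroups` shape and the `toArgs` helper are assumptions for illustration; the argument order follows the tavily-search.js usage line.

```javascript
// Hypothetical config mirroring the seven example groups above.
const keywordGroups = [
  { query: 'browser Chrome Firefox Safari Edge news 2026', maxResults: 5, topic: 'news' },
  { query: 'AI machine learning LLM technology news March 2026', maxResults: 5, topic: 'news' },
  { query: '中国科技 AI 浏览器最新消息', maxResults: 5, topic: 'news' },
  { query: 'Web standards W3C WHATWG V8 JavaScript new features 2026', maxResults: 3, topic: 'news' },
  { query: 'Android Chrome mobile browser development 2026', maxResults: 3, topic: 'news' },
  { query: 'APPSO 机器之心量子位 AI 人工智能最新', maxResults: 3, topic: 'news' },
  { query: '虎嗅雷科技科技行业消费电子', maxResults: 3, topic: 'news' },
];

// Upper bound on candidates per run: 5 + 5 + 5 + 3 + 3 + 3 + 3.
const totalCandidates = keywordGroups.reduce((sum, g) => sum + g.maxResults, 0);

// Arguments in the order tavily-search.js expects: query, max_results, topic.
function toArgs(group) {
  return [group.query, String(group.maxResults), group.topic];
}

console.log(totalCandidates, toArgs(keywordGroups[0]));
```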

Layer 2: Core Blog Crawl (Deep Coverage, ~20% weight)

Use web_fetch to directly crawl authoritative blogs. These guarantee coverage of domain-specific news that AI search might miss.

Example sources for browser/AI domain:

WebKit Blog: https://webkit.org/blog/
V8 Blog: https://v8.dev/blog
Mozilla Hacks: https://hacks.mozilla.org/
Chromium Blog: https://blog.chromium.org/

Layer 3: Aggregator Check (Community Pulse)

Check community aggregators for trending discussions:

Hacker News: https://news.ycombinator.com

Three-Layer Information Source Model

Layer	Weight	Purpose	Speed	Depth
AI Search API	~80%	Broad discovery	Fast (1-3s/query)	Medium
Core blogs	~20%	Domain authority	Slow (5-10s/source)	Deep
Aggregators	Optional	Community trends	Fast	Shallow

Each layer in use should contribute at least 2-3 stories to keep coverage balanced.
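A per-layer minimum can be enforced with a two-pass selection. This is only a sketch. The `layer` and `score` fields are assumptions: Tavily results carry a score, but blog and aggregator items would need one assigned. It also assumes the total is at least the number of layers times the minimum.

```javascript
// Hypothetical selection: guarantee each layer a minimum share, then fill by score.
// Assumption: every candidate has { layer, score }, and
// total >= (number of layers) * minPerLayer.
function selectBalanced(candidates, total = 10, minPerLayer = 2) {
  const byScore = [...candidates].sort((a, b) => b.score - a.score);
  const picked = new Set();
  const perLayer = new Map();
  // Pass 1: take the best minPerLayer stories from each layer.
  for (const story of byScore) {
    const count = perLayer.get(story.layer) || 0;
    if (count < minPerLayer) {
      picked.add(story);
      perLayer.set(story.layer, count + 1);
    }
  }
  // Pass 2: fill the remaining slots with the best leftover stories.
  for (const story of byScore) {
    if (picked.size >= total) break;
    picked.add(story);
  }
  // Return the selection in descending score order.
  return byScore.filter((story) => picked.has(story));
}
```

Without the first pass, a high-volume search layer could crowd out the core blogs entirely.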

Report Format

Title

## YYYY.M.D Report Title | Day N

Categories (ordered by priority)

Use domain-specific categories. Examples:

### 🔧 Browser Engine & Web Standards
### 🦊 Firefox / Mozilla
### 🤖 AI & Browser Tech
### 🇨🇳 Regional Tech
### 📱 Mobile / Web Dev

Story Format

N. emoji **Title** — Description (2-3 sentences with specific details like version numbers, data, impact)
   - Source: full clickable URL

Insights Section

#### 💡 Analyst Insights

💡 **Insight Title** — Analysis (2-3 sentences with actionable perspective)

Footer

*Sources: Source1 · Source2 · Source3*
*Collected: YYYY-MM-DD HH:MM TZ*
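The title, category, story, and footer templates above can be assembled mechanically. The sketch below is one possible renderer. The `report` object shape is hypothetical, story numbering is assumed to run continuously across categories, and the insights section is omitted for brevity.

```javascript
// Hypothetical renderer for the templates above (insights section omitted).
// The separator in formatStory is the one used by the Story Format template.
function formatStory(index, story) {
  return `${index}. ${story.emoji} **${story.title}** — ${story.description}\n   - Source: ${story.url}`;
}

function renderReport(report) {
  const { date, title, day, categories, sources, collected } = report;
  const header = `## ${date.getFullYear()}.${date.getMonth() + 1}.${date.getDate()} ${title} | Day ${day}`;
  const lines = [header, ''];
  let n = 0; // assumption: numbering continues across categories
  for (const category of categories) {
    lines.push(`### ${category.heading}`, '');
    for (const story of category.stories) {
      n += 1;
      lines.push(formatStory(n, story));
    }
    lines.push('');
  }
  lines.push(`*Sources: ${sources.join(' · ')}*`, `*Collected: ${collected}*`);
  return lines.join('\n');
}
```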

Anti-Duplication Rules

Critical: Multiple cron sessions may run simultaneously and cause conflicts.

Collection cron: Before writing, scan file for today's date header (## YYYY.M.D). If found, output "Report exists, skipping" and exit. Do NOT overwrite.
Distribution cron: Read-only. Never search, never write. If report missing, only notify user.
Strict division: Collection writes, distribution reads. Never cross.
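The collection-side guard could look like the sketch below. `hasReportFor` and `appendReportOnce` are hypothetical names. The check-then-append sequence is not atomic, so two sessions that start at the same moment could still both write. Closing that gap would need a lock file or a similar mechanism, which the source does not describe.

```javascript
const fs = require('fs');

// Hypothetical guard: true if the file already has a "## YYYY.M.D" header for date.
function hasReportFor(markdown, date) {
  const y = date.getFullYear();
  const m = date.getMonth() + 1;
  const d = date.getDate();
  // (?!\d) stops "## 2026.3.1" from matching "## 2026.3.15".
  const header = new RegExp(`^## ${y}\\.${m}\\.${d}(?!\\d)`, 'm');
  return header.test(markdown);
}

// Appends the report only if today's header is absent; never overwrites.
function appendReportOnce(filePath, date, report) {
  const existing = fs.existsSync(filePath) ? fs.readFileSync(filePath, 'utf8') : '';
  if (hasReportFor(existing, date)) {
    console.log('Report exists, skipping');
    return false;
  }
  let separator = '';
  if (existing) separator = existing.endsWith('\n') ? '\n' : '\n\n';
  fs.appendFileSync(filePath, separator + report.trimEnd() + '\n');
  return true;
}
```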

Quality Control

Select for technical depth and impact, not quantity
"Better 8 great stories than 15 mediocre ones"
Cross-reference: prefer the original source when story appears in multiple feeds
Each story must have a clickable URL
Insights must add analysis, not just repeat the news
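Two of these rules lend themselves to a mechanical pre-filter. The sketch below drops stories without an http(s) URL and, when the same title appears more than once, keeps the copy hosted on an original source. `ORIGINAL_HOSTS` is a hypothetical allow-list built from the core blogs listed earlier. Matching on normalized titles is a simplification, since the same story often appears under different headlines.

```javascript
// Hypothetical allow-list of original-source hosts (taken from the core blog list).
const ORIGINAL_HOSTS = new Set(['webkit.org', 'v8.dev', 'hacks.mozilla.org', 'blog.chromium.org']);

// A story link counts as clickable only if it parses as an http(s) URL.
function isClickable(url) {
  try {
    const { protocol } = new URL(url);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false;
  }
}

// Keeps one story per normalized title, preferring original sources.
function dedupeStories(stories) {
  const byKey = new Map();
  for (const story of stories) {
    if (!isClickable(story.url)) continue;
    // Normalize: lowercase, collapse punctuation and whitespace.
    const key = story.title.toLowerCase().replace(/[^\p{L}\p{N}]+/gu, ' ').trim();
    const isOriginal = ORIGINAL_HOSTS.has(new URL(story.url).hostname);
    const current = byKey.get(key);
    if (!current || (isOriginal && !current.isOriginal)) {
      byKey.set(key, { story, isOriginal });
    }
  }
  return [...byKey.values()].map((entry) => entry.story);
}
```

Judging technical depth and writing insights remain editorial decisions; this filter only covers the rules that can be checked automatically.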

Tavily Search Script (tavily-search.js)

Save this as tavily-search.js and run with: node tavily-search.js "query" [max_results] [topic] [search_depth]

Requires TAVILY_API_KEY environment variable.

#!/usr/bin/env node
/**
 * Tavily Search — AI-optimized search API wrapper for news collection
 * Usage: node tavily-search.js "query" [max_results] [topic] [search_depth]
 *
 * Env: TAVILY_API_KEY required
 * Output: JSON with results array (title, url, content, score)
 */

const https = require('https');

const API_KEY = process.env.TAVILY_API_KEY;
if (!API_KEY) {
  console.error('Error: TAVILY_API_KEY environment variable not set');
  process.exit(1);
}

const query = process.argv[2];
const maxResults = parseInt(process.argv[3], 10) || 5;
const topic = process.argv[4] || 'general';
const searchDepth = process.argv[5] || 'basic';

if (!query) {
  console.error('Usage: node tavily-search.js "query" [max_results] [topic] [search_depth]');
  process.exit(1);
}

const payload = JSON.stringify({
  query,
  max_results: maxResults,
  topic,
  search_depth: searchDepth,
  include_answer: true,
  include_raw_content: false,
});

const req = https.request({
  hostname: 'api.tavily.com',
  path: '/search',
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${API_KEY}`,
  },
}, (res) => {
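  let body = '';
  // Decode as UTF-8 so multi-byte characters split across chunks stay intact.
  res.setEncoding('utf8');
  res.on('data', (chunk) => body += chunk);
  res.on('end', () => {
    // Surface HTTP errors (bad key, rate limit) instead of printing empty results.
    if (res.statusCode < 200 || res.statusCode >= 300) {
      console.error(`HTTP ${res.statusCode}:`, body.substring(0, 500));
      process.exit(1);
    }
    try {
      const data = JSON.parse(body);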
      const output = {
        query: data.query,
        answer: data.answer || null,
        results: (data.results || []).map(r => ({
          title: r.title,
          url: r.url,
          content: r.content?.substring(0, 500),
          score: r.score,
        })),
      };
      console.log(JSON.stringify(output, null, 2));
    } catch (e) {
      console.error('Parse error:', e.message);
      console.error('Raw:', body.substring(0, 500));
      process.exit(1);
    }
  });
});

req.on('error', (e) => {
  console.error('Request error:', e.message);
  process.exit(1);
});

req.write(payload);
req.end();

References

references/usage-guide.md — Beginner-friendly usage guide with FAQ
references/tavily-setup.md — Tavily API setup and configuration
references/sources.md — Curated information source list by domain

Security Recommendations

This skill appears to implement a plausible daily-news workflow, but it has inconsistent metadata and asks the agent (via instructions) to create cron jobs, write persistent files, and set a Tavily API key (a secret). Before installing:

1) Confirm you trust the Tavily service and the skill's source.
2) Prefer to create the cron jobs and environment variables yourself rather than giving the agent automatic permission to do so.
3) Run the workflow in a restricted environment or container that limits file/cron access.
4) If you must let the agent set up automation, supply a scoped API key (least privilege) and review any scripts it will run (tavily-search.js) for unexpected network endpoints.
5) Update the registry metadata to declare required env vars and config paths so the requested privileges are explicit.

These steps will reduce the risk that the agent persists secrets or modifies system-level scheduler state without your full understanding.

Functional Analysis

Type: OpenClaw Skill
Name: info-stream
Version: 1.0.0

The skill bundle implements a legitimate automated news aggregation and distribution system. It includes a Node.js script (tavily-search.js) that interfaces with the Tavily Search API and detailed instructions in SKILL.md for the agent to manage scheduled news collection and delivery via cron jobs. The methodology is well-structured, utilizing a three-layer source model (API search, technical blogs, and aggregators) and includes logic for deduplication. All requested capabilities, such as network access and file system operations, are strictly necessary for the stated purpose, and no indicators of malicious intent, data exfiltration, or harmful prompt injection were found.

Capability Assessment

⚠ Purpose & Capability

Name/description promises automated daily news collection and distribution — that purpose is coherent with the SKILL.md workflow. However, the skill declares no required environment variables or config paths in the registry metadata while the runtime instructions and included script explicitly require a Tavily API key (TAVILY_API_KEY) and file write/read permissions for weekly-news files. The omission of these required capabilities in the metadata is an inconsistency.

⚠ Instruction Scope

SKILL.md instructs the agent to create two cron jobs (collection and distribution), write/read weekly markdown files, use web_fetch and an external Tavily API, and set/check an environment variable. Those actions are within the stated goal, but they also require filesystem and scheduler privileges and handling of an API secret. The instructions also promise the AI will 'set up your Tavily API Key as an environment variable' — that grants the agent discretion to persist a secret and modify system state, which is not explicitly declared or scoped.

✓ Install Mechanism

This is an instruction-only skill with no install spec or code files to be fetched/installed. That minimizes install-time risk (no downloaded executables). The included tavily-search.js is provided as a script in SKILL.md/references, not as an installer.

⚠ Credentials

Although the registry lists no required env vars, the tavily-search script and setup docs explicitly require TAVILY_API_KEY. The usage guide says the AI will set the environment variable for you. Requesting and persisting an external API key is reasonable for search integration, but the skill fails to declare this requirement up front. Also the skill suggests swapping in other search APIs (which would require other credentials) without enumerating them, increasing the secret surface.

⚠ Persistence & Privilege

The skill instructs creation of recurring cron jobs and writing persistent weekly files. While 'always' is false and autonomous invocation is allowed (normal), the skill still directs the agent to modify system scheduler state and store secrets as env vars. That is a material privilege escalation relative to a purely read-only assistant and should be consented to explicitly by an administrator or run in a sandbox.

Version History

v1.0.0

- Initial release of Daily News Collector skill.
- Automates daily collection, curation, and distribution of tech industry news using scheduled cron jobs.
- Separates news collection (cached to file) from fast distribution (read from file then pushed to chat).
- Utilizes a multi-layered news search strategy combining AI search APIs (e.g., Tavily), direct blog crawls, and community aggregators for balanced coverage.
- Prevents duplication with anti-duplication checks for both collection and distribution processes.
- Includes report formatting guidelines, technical quality control tips, and integration scripts for API-based news discovery.

Metadata

Slug info-stream

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ