qinthqod

Fox Veille

Author: GarfieldQin · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 95
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install fox-veille
Description
RSS feed aggregator, deduplication engine, LLM scoring, and output dispatcher for OpenClaw agents. Use when: fetching recent articles from configured sources...
Usage (SKILL.md)

Skill Veille - RSS Aggregator

RSS feed aggregator with URL deduplication and topic-based deduplication for OpenClaw agents. Fetches articles from 20+ configured sources, filters already-seen URLs (TTL 14 days), and deduplicates articles covering the same story using Jaccard similarity + named entities.
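
The topic deduplication described above can be illustrated with a minimal sketch. It assumes titles are reduced to lowercase word sets and that two articles cover the same story when their Jaccard similarity reaches a threshold. The helper names and the 0.5 threshold are hypothetical (not taken from the skill's code), and the named-entity component is left out.

```python
import re

def tokenize(title: str) -> set[str]:
    """Lowercase a title and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |A & B| / |A | B| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def is_same_topic(title_a: str, title_b: str, threshold: float = 0.5) -> bool:
    """Hypothetical rule: titles at or above the threshold cover the same story."""
    return jaccard(tokenize(title_a), tokenize(title_b)) >= threshold
```

A higher threshold keeps more near-duplicates; a lower one merges more aggressively. The value the skill actually uses is not stated in this document.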

No external dependencies: stdlib Python only (urllib, xml.etree, email.utils).


Trigger phrases

  • "fais une veille"
  • "quoi de neuf en securite / tech / crypto / IA ?"
  • "donne-moi les news du jour"
  • "articles recents sur [sujet]"
  • "veille RSS"
  • "digest du matin"
  • "nouvelles non vues"

Quick Start

# 1. Setup
python3 scripts/setup.py

# 2. Validate
python3 scripts/init.py

# 3. Fetch + Score + Send (full pipeline)
python3 scripts/veille.py fetch --filter-seen --filter-topic \
  | python3 scripts/veille.py score \
  | python3 scripts/veille.py send

Setup

Requirements

  • Python 3.9+
  • Network access to RSS feeds (public, no auth required)
  • No pip installs needed

Installation

# From the skill directory
python3 scripts/setup.py

# Validate
python3 scripts/init.py

The wizard creates:

  • ~/.openclaw/config/veille/config.json (from config.example.json)
  • ~/.openclaw/data/veille/ (data directory)

Customizing sources

Edit ~/.openclaw/config/veille/config.json and add/remove entries in the "sources" dict:

{
  "sources": {
    "My Blog": "https://example.com/feed.xml",
    "BleepingComputer": "https://www.bleepingcomputer.com/feed/"
  }
}

Storage and credentials

Files written by this skill

Path | Written by | Purpose | Contains secrets
~/.openclaw/config/veille/config.json | setup.py | Sources, outputs, options | NO
~/.openclaw/data/veille/seen_urls.json | veille.py | URL dedup store (TTL 14d) | NO
~/.openclaw/data/veille/topic_seen.json | veille.py | Topic dedup store (TTL 5d) | NO

Files read from outside the skill

Path | Read by | Key accessed | When
~/.openclaw/openclaw.json | dispatch.py | channels.telegram.botToken (read-only) | Only when telegram_bot output is enabled and no bot_token is set in the output config

This is the only cross-config read. To avoid it entirely, set bot_token explicitly in your output config:

{ "type": "telegram_bot", "bot_token": "YOUR_BOT_TOKEN", "chat_id": "...", "enabled": true }
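
The lookup order described above (explicit bot_token first, then the cross-config read) might look like the following sketch. The function name `resolve_bot_token` and its signature are hypothetical; only the file path and the `channels.telegram.botToken` key come from this document.

```python
import json
import sys
from pathlib import Path
from typing import Optional

OPENCLAW_CONFIG = Path.home() / ".openclaw" / "openclaw.json"

def resolve_bot_token(output_cfg: dict, openclaw_path: Path = OPENCLAW_CONFIG) -> Optional[str]:
    """Prefer the explicit bot_token; otherwise fall back to the OpenClaw config."""
    token = output_cfg.get("bot_token")
    if token:
        return token  # no cross-config read needed
    # Fallback: the cross-config read, logged to stderr
    print(f"reading Telegram bot token from {openclaw_path}", file=sys.stderr)
    try:
        cfg = json.loads(openclaw_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    return cfg.get("channels", {}).get("telegram", {}).get("botToken")
```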

Output credentials (optional)

Credentials are only used if you enable the corresponding output. None are required for core functionality (RSS fetch + dedup).

Output | Credential source | What is used
telegram_bot | ~/.openclaw/openclaw.json or bot_token in output config | Bot token (read-only)
mail-client | Delegated to mail-client skill (its own creds) | Nothing read directly
mail-client (SMTP fallback) | smtp_user / smtp_pass in output config | SMTP login
nextcloud | Delegated to nextcloud-files skill (its own creds) | Nothing read directly

Cleanup on uninstall

python3 scripts/setup.py --cleanup

Security model

Credential isolation

  • API keys are read from dedicated files (default ~/.openclaw/secrets/), never from config.json. The scorer warns at runtime if a key file has overly permissive filesystem permissions.
  • SMTP credentials (fallback only) are stored in the output config block — use the mail-client skill delegation to avoid storing SMTP passwords.
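
As an illustration of the permission warning mentioned above, here is a minimal sketch of how a key file could be checked before it is read. It assumes a POSIX filesystem and treats any group or other access bits as "overly permissive"; the function names are hypothetical and the scorer's real check may differ.

```python
import os
import stat
import sys

def key_file_too_permissive(path: str) -> bool:
    """Return True if group or others have any access to the file (POSIX modes)."""
    mode = os.stat(os.path.expanduser(path)).st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))

def read_api_key(path: str) -> str:
    """Read the key, warning on stderr when the file is not private to the owner."""
    if key_file_too_permissive(path):
        print(f"warning: {path} is accessible to other users; consider chmod 600",
              file=sys.stderr)
    with open(os.path.expanduser(path), encoding="utf-8") as fh:
        return fh.read().strip()
```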

Subprocess boundaries

  • Dispatch delegates to other OpenClaw skills via subprocess.run() (never shell=True). Script paths are validated to reside under ~/.openclaw/workspace/skills/ before execution, preventing path traversal.
  • No credentials are passed as subprocess arguments — each skill manages its own authentication.
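
The subprocess boundary above can be sketched as follows. This is an assumption-laden illustration, not the code in dispatch.py: `is_under_skills_root` and `run_skill_script` are hypothetical names, and the containment check shown (resolve, then `relative_to`) is one common way to reject path traversal.

```python
import subprocess
from pathlib import Path

SKILLS_ROOT = Path.home() / ".openclaw" / "workspace" / "skills"

def is_under_skills_root(script: Path, root: Path = SKILLS_ROOT) -> bool:
    """Resolve symlinks and '..' segments first, then check containment."""
    try:
        script.resolve().relative_to(root.resolve())
        return True
    except ValueError:
        return False

def run_skill_script(script: Path, args: list[str], stdin_text: str) -> subprocess.CompletedProcess:
    """Run another skill's script only if it lives under the skills root."""
    if not is_under_skills_root(script):
        raise PermissionError(f"refusing to run script outside {SKILLS_ROOT}: {script}")
    # Argument list without shell=True: no shell ever parses the arguments
    return subprocess.run(["python3", str(script), *args],
                          input=stdin_text, capture_output=True, text=True, check=False)
```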

File output safety

  • The file output type validates the target path before writing: only ~/.openclaw/ is allowed by default. Additional directories can be whitelisted via config.security.allowed_output_dirs. Sensitive paths (.ssh, .gnupg, /etc/, .bashrc, etc.) are always blocked regardless of allowlist.
  • Written content is checked for suspicious patterns (shell shebangs, SSH keys, PGP blocks, code injection) and size-limited to 1 MB.

Cross-config reads

  • The only cross-config file read is ~/.openclaw/openclaw.json for the Telegram bot token, and only when telegram_bot output is enabled without an explicit bot_token. This read is logged to stderr. Set bot_token in the output config to eliminate this read entirely.

Autonomous dispatch

  • When scheduled (cron), the skill can send messages/files to configured outputs without user interaction. All dispatch actions are logged to stderr with an audit summary. Use enabled: false on any output to disable it without removing its config.

CLI reference

fetch

python3 veille.py fetch [--hours N] [--filter-seen] [--filter-topic] [--sources FILE]

Options:

  • --hours N : lookback window in hours (default: from config, usually 24)
  • --filter-seen : filter already-seen URLs (uses seen_urls.json TTL store)
  • --filter-topic : deduplicate by topic (uses topic_seen.json + Jaccard similarity)
  • --sources FILE : path to custom JSON sources file

Output (JSON on stdout):

{
  "hours": 24,
  "count": 42,
  "skipped_url": 5,
  "skipped_topic": 3,
  "articles": [...],
  "wrapped_listing": "=== UNTRUSTED EXTERNAL CONTENT ..."
}

seen-stats

python3 veille.py seen-stats

Shows URL seen store statistics (count, TTL, file path).

topic-stats

python3 veille.py topic-stats

Shows topic deduplication store statistics.

mark-seen

python3 veille.py mark-seen URL [URL ...]

Marks one or more URLs as already seen (prevents them from appearing in future fetches with --filter-seen).
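
To make the TTL behaviour of the seen-URL store concrete, here is a minimal in-memory sketch. It assumes the store maps each URL to the Unix timestamp at which it was first seen; the actual layout of seen_urls.json is not documented here, and all function names are hypothetical.

```python
import time
from typing import Optional

TTL_SECONDS = 14 * 24 * 3600  # 14-day TTL, as documented for seen_urls.json

def prune(store: dict[str, float], now: Optional[float] = None) -> dict[str, float]:
    """Return a copy of the store without entries older than the TTL."""
    now = time.time() if now is None else now
    return {url: ts for url, ts in store.items() if now - ts < TTL_SECONDS}

def mark_seen(store: dict[str, float], urls: list[str], now: Optional[float] = None) -> None:
    """Record URLs as seen, keeping the original timestamp for known URLs."""
    now = time.time() if now is None else now
    for url in urls:
        store.setdefault(url, now)

def filter_unseen(store: dict[str, float], urls: list[str], now: Optional[float] = None) -> list[str]:
    """Keep only URLs that are absent from the store or whose entry has expired."""
    fresh = prune(store, now)
    return [u for u in urls if u not in fresh]
```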

score

python3 veille.py score [--dry-run]

Reads a digest JSON from stdin (output of fetch) and scores articles using an OpenAI-compatible LLM. Returns enriched JSON with scored, ghost_picks, and per-article score/reason fields.

Options:

  • --dry-run : print summary on stderr without calling the LLM API

When llm.enabled is false (default), articles pass through unchanged ("scored": false).

Pipeline usage:

python3 veille.py fetch --filter-seen --filter-topic | python3 veille.py score | python3 veille.py send

send

python3 veille.py send [--profile NAME]

Reads a digest JSON from stdin and dispatches to all enabled outputs configured in config.json. Accepts both raw fetch output (articles key) and LLM-processed digests (categories key).

Output types: telegram_bot, mail-client, nextcloud, file.

  • telegram_bot: the bot token is read automatically from the OpenClaw config, so no extra setup is needed if Telegram is already configured.
  • mail-client: delegates to mail-client skill if installed, falls back to raw SMTP config.
  • nextcloud: delegates to nextcloud-files skill if installed (append mode by default with date separator).
  • file: writes digest to a local file. Path must be under ~/.openclaw/ (default) or a directory listed in config.security.allowed_output_dirs. Sensitive paths and suspicious content are blocked (see Security model).

Configure outputs interactively:

python3 scripts/setup.py --manage-outputs

config

python3 veille.py config

Prints the active configuration (no secrets).


LLM scoring configuration

The llm key in config.json controls the optional LLM-based article scoring:

{
  "llm": {
    "enabled": false,
    "base_url": "https://api.openai.com/v1",
    "api_key_file": "~/.openclaw/secrets/openai_api_key",
    "model": "gpt-4o-mini",
    "top_n": 10,
    "ghost_threshold": 5
  }
}

Key | Default | Description
enabled | false | Enable LLM scoring (requires API key)
base_url | https://api.openai.com/v1 | OpenAI-compatible API endpoint
api_key_file | ~/.openclaw/secrets/openai_api_key | Path to file containing the API key
model | gpt-4o-mini | Model to use for scoring
top_n | 10 | Max articles to send to LLM per batch
ghost_threshold | 5 | Score threshold for ghost_picks (blog-worthy articles)

Scoring rules:

  • Only the first top_n articles are sent to the LLM. Articles beyond top_n are excluded from the digest entirely. fetch returns articles sorted by date desc, so top_n selects the most recent ones. Increase top_n to evaluate more articles per run (higher token cost).
  • Score >= ghost_threshold : added to ghost_picks list
  • Score >= 3 : kept in articles list
  • Score <= 2 : excluded from output
  • Articles are sorted by score (descending)

When disabled, the score subcommand passes data through unchanged.
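
The scoring rules above can be sketched as a small post-processing step. This assumes each article already carries an integer `score` field returned by the LLM; the function name `apply_scoring_rules` is hypothetical and the real scorer may structure this differently.

```python
def apply_scoring_rules(articles: list[dict], top_n: int = 10, ghost_threshold: int = 5) -> dict:
    """Apply the documented rules to articles that already carry a 'score' field."""
    candidates = articles[:top_n]                      # articles beyond top_n are dropped
    kept = [a for a in candidates if a["score"] >= 3]  # scores of 2 or less are excluded
    kept.sort(key=lambda a: a["score"], reverse=True)  # sorted by score, descending
    ghost_picks = [a for a in kept if a["score"] >= ghost_threshold]
    return {"scored": True, "articles": kept, "ghost_picks": ghost_picks}
```

For example, with top_n=3 and scores [2, 5, 3, 9], the fourth article is never evaluated, the first is excluded, and only the article scored 5 lands in ghost_picks.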

Nextcloud output mode

The nextcloud output now defaults to append mode with a date separator. Each dispatch adds content below a ## YYYY-MM-DD HH:MM header, preserving previous entries.

Set "mode": "overwrite" in the output config to restore the old behavior:

{ "type": "nextcloud", "path": "/Veille/digest.md", "mode": "overwrite" }
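
The difference between the two modes can be sketched with a pure function that combines the existing file content with a new digest. The name `merge_digest` is hypothetical, and the exact spacing around the header is an assumption; only the `## YYYY-MM-DD HH:MM` header format and the mode names come from this document.

```python
from datetime import datetime
from typing import Optional

def merge_digest(existing: str, new_content: str, mode: str = "append",
                 now: Optional[datetime] = None) -> str:
    """Combine previous file content with a new digest according to the mode."""
    if mode == "overwrite":
        return new_content  # old behavior: previous entries are replaced
    now = now or datetime.now()
    header = f"## {now:%Y-%m-%d %H:%M}"
    parts = [header, new_content.strip()]
    if existing.strip():
        parts.insert(0, existing.rstrip())  # preserve earlier entries above the new one
    return "\n\n".join(parts) + "\n"
```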

File output configuration

The file output writes digests to the local filesystem. By default, only paths under ~/.openclaw/ are allowed. To authorize additional directories, use config.security.allowed_output_dirs:

{
  "security": {
    "allowed_output_dirs": [
      "~/Documents/veille",
      "/srv/digests"
    ]
  }
}

Blocked paths (always rejected, even if inside an allowed directory): .ssh, .gnupg, .config/systemd, crontab, /etc/, .bashrc, .profile, .bash_profile, .zshrc, .env

Content validation — written content is rejected if it:

  • Exceeds 1 MB
  • Contains shell shebangs (#!/), SSH keys, PGP blocks, or code injection patterns (eval(, exec(, __import__(, import os, import subprocess)

All blocked attempts are logged to stderr with the reason.
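
A rough sketch of the two validation steps above, under stated assumptions: blocked paths are matched by naive substring search, SSH and PGP material is detected by its `-----BEGIN ...-----` armor line, and the helper names `check_path` and `check_content` are hypothetical. The skill's real checks may be stricter or structured differently.

```python
import re
from pathlib import Path
from typing import Optional

MAX_BYTES = 1024 * 1024  # 1 MB limit
BLOCKED_PARTS = (".ssh", ".gnupg", ".config/systemd", "crontab", "/etc/",
                 ".bashrc", ".profile", ".bash_profile", ".zshrc", ".env")
SUSPICIOUS = re.compile(
    r"#!/|-----BEGIN (?:OPENSSH|RSA|PGP)[A-Z ]*-----"
    r"|eval\(|exec\(|__import__\(|import os|import subprocess"
)

def check_path(target: str, allowed_dirs: list[str]) -> Optional[str]:
    """Return a rejection reason, or None if the path is acceptable."""
    resolved = Path(target).expanduser().resolve()
    text = resolved.as_posix()
    for part in BLOCKED_PARTS:  # blocked even inside an allowed directory
        if part in text:
            return f"blocked path component: {part}"
    roots = [Path(d).expanduser().resolve() for d in ["~/.openclaw", *allowed_dirs]]
    if not any(resolved == r or r in resolved.parents for r in roots):
        return "path outside allowed directories"
    return None

def check_content(content: str) -> Optional[str]:
    """Return a rejection reason, or None if the content may be written."""
    if len(content.encode("utf-8")) > MAX_BYTES:
        return "content exceeds 1 MB"
    match = SUSPICIOUS.search(content)
    if match:
        return f"suspicious pattern: {match.group(0)!r}"
    return None
```

Checking the resolved path rather than the raw string means that `..` segments and symlinks cannot be used to step outside an allowed directory.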


Templates (agent usage)

Basic digest

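import json
import subprocess

# In an agent tool call (sketched here with subprocess):
result = subprocess.run(
    ["python3", "scripts/veille.py", "fetch", "--hours", "24", "--filter-seen", "--filter-topic"],
    capture_output=True, text=True, check=True,
)
data = json.loads(result.stdout)
# data["wrapped_listing"] is ready to insert into an LLM prompt
# data["count"] = number of new articles
# data["articles"] = list of article dicts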

Prompt template

You are a news analyst. Here are today's articles:

{data["wrapped_listing"]}

Please summarize the 5 most important stories, focusing on security and tech.

Agent workflow example

1. Call veille fetch --filter-seen --filter-topic
2. Pipe through veille score (LLM scoring, if enabled)
3. If count > 0: pass wrapped_listing to LLM for analysis
4. LLM produces digest summary
5. Pipe through veille send (dispatches to configured outputs)
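
Step 3 of the workflow above, combined with the prompt template, could be sketched like this. `build_prompt` is a hypothetical helper; it only relies on the `count` and `wrapped_listing` keys of the fetch output.

```python
from typing import Optional

PROMPT_TEMPLATE = (
    "You are a news analyst. Here are today's articles:\n\n"
    "{listing}\n\n"
    "Please summarize the 5 most important stories, focusing on security and tech."
)

def build_prompt(data: dict) -> Optional[str]:
    """Return the analysis prompt, or None when there is nothing new to analyze."""
    if data.get("count", 0) <= 0:
        return None
    return PROMPT_TEMPLATE.format(listing=data["wrapped_listing"])
```

Returning None for an empty digest lets the agent skip the LLM call entirely on quiet days.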

Pipeline (CLI)

python3 scripts/veille.py fetch --filter-seen --filter-topic \
  | python3 scripts/veille.py score \
  | python3 scripts/veille.py send

Filtering by keyword (post-fetch)

import json

# fetch_output: the JSON string printed on stdout by the fetch subcommand
data = json.loads(fetch_output)
security_articles = [
    a for a in data["articles"]
    if any(kw in a["title"].lower() for kw in ["cve", "vuln", "patch", "breach"])
]

Ideas

  • Add keyword-based filtering (--keywords security,cve,linux)
  • Add per-source TTL override in config
  • Export digest as HTML or Markdown
  • Schedule with cron: 0 8 * * * python3 veille.py fetch --filter-seen --filter-topic
  • Weight articles by source tier for LLM prioritization
  • Add OPML import/export for source list management
  • Integrate with ntfy or Telegram for real-time alerts on high-priority articles

Combine with

  • mail-client : send the digest by email after fetching

    veille fetch --filter-seen | ... | mail-client send
    
  • nextcloud-files : archive the daily digest as a Markdown file

    veille fetch --filter-seen | jq .wrapped_listing -r > /tmp/digest.md
    nextcloud-files upload /tmp/digest.md /Digests/$(date +%Y-%m-%d).md
    

Troubleshooting

See references/troubleshooting.md for detailed troubleshooting steps.

Common issues:

  • No articles returned: check --hours value, verify feed URLs in config
  • XML parse error on a feed: some feeds use non-standard XML; the skill skips broken items silently
  • All articles filtered as seen: run seen-stats to check store size; reset with rm ~/.openclaw/data/veille/seen_urls.json
  • Import error: ensure you run veille.py from its directory or via full path
  • File output blocked: path is outside ~/.openclaw/ — add the target directory to config.security.allowed_output_dirs (see File output configuration)
Security recommendations
This skill appears internally consistent with its stated purpose. Before installing:
  1. Inspect setup.py (it can create a cron job) and confirm you want scheduled autonomous dispatch.
  2. If you enable Telegram output and want to avoid any cross-config reads, set bot_token explicitly in the skill output config (SKILL.md shows how).
  3. If you enable LLM scoring, place the API key in a dedicated file (e.g. ~/.openclaw/secrets/openai_api_key) and check its filesystem permissions.
  4. Note the minor administrative mismatch between registry metadata and _meta.json (owner/version); verify the skill source/repo you install from.
  5. Review any other installed skills before allowing this skill to delegate to them (dispatch validates script paths, but it will run scripts under ~/.openclaw/workspace/skills/).
If you want extra caution, run python3 scripts/init.py to validate network access and python3 scripts/veille.py fetch --hours 24 in a non-production environment to observe behaviour before enabling outputs.
Functional analysis
Type: OpenClaw Skill
Name: fox-veille
Version: 1.0.0

The fox-veille skill is a well-engineered RSS aggregator that demonstrates a high level of security awareness. It implements several proactive defenses, including SSRF protection in `veille.py` (blocking private IP ranges and localhost addresses), XXE prevention in XML parsing, and path traversal checks in `dispatch.py` when calling other skills. Notably, it protects against prompt injection by wrapping untrusted RSS content in security markers, and it performs content and path validation in the `file` output handler to prevent writing dangerous payloads (such as shebangs or SSH keys) to sensitive system locations.
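
For readers unfamiliar with the SSRF protection mentioned in the analysis above, the general technique can be sketched as follows. This is a generic illustration using the standard library, not the code in `veille.py`; the function name `is_public_url` is hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_public_url(url: str) -> bool:
    """Reject non-HTTP(S) URLs and hosts that resolve to non-public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are treated as unsafe
    for info in infos:
        # Strip any IPv6 scope id (e.g. "%eth0") before parsing the address
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_reserved or addr.is_multicast or addr.is_unspecified):
            return False
    return True
```

A check like this only covers the initial resolution; a complete defense also has to consider redirects and DNS changes between the check and the actual request.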
Capability assessment
Purpose & Capability
Name/description (RSS aggregator, dedup, LLM scoring, dispatch) matches the provided code and runtime instructions. Files show feed fetching, TTL-based seen-store, topic dedup, optional LLM scoring (reads an API key file), and dispatchers for Telegram/email/Nextcloud/file—all appropriate for the stated purpose. Minor metadata mismatch: _meta.json ownerId/version differs from registry metadata, which is an administrative inconsistency but not a functional mismatch.
Instruction Scope
SKILL.md and CLI instruct only feed fetching, scoring, deduplication, and dispatch. The only cross-config read documented is ~/.openclaw/openclaw.json for a Telegram bot token when telegram output is enabled; scorer reads an API key file only when LLM scoring is enabled. No instructions asking to read unrelated secrets or broad system state. Scheduled/autonomous dispatch is documented (cron), which is expected for a digest skill.
Install Mechanism
No install spec (instruction-only); code is included in the skill bundle. There are no network downloads or arbitrary extract/install steps in the manifest. Running occurs locally via provided Python scripts, consistent with the project's description.
Credentials
The skill declares no required environment variables. It reads local config/data under ~/.openclaw and an optional LLM API key file (default ~/.openclaw/secrets/openai_api_key) only when scoring is enabled. The only cross-config read is the Telegram bot token from ~/.openclaw/openclaw.json when telegram output is enabled and no bot_token is provided—this behaviour is documented. Overall credential access is proportional to functionality.
Persistence & Privilege
always:false (no forced inclusion). The skill can create a scheduled job (cron) via setup.py and perform autonomous dispatch when scheduled; this is documented. Delegation to other skills uses subprocess.run() with path validation to ~/.openclaw/workspace/skills/, which limits but does not eliminate risk if other installed skills are malicious. No evidence the skill modifies other skills' configs or elevates privileges.
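How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install fox-veille
  3. Once installed, invoke the Skill by name or trigger it with /fox-veille
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output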
Version history
v1.0.0
fox-veille 1.0.0 - Initial Release
  • Introduces a standalone RSS aggregator, deduplication engine, LLM scoring, and output dispatcher for OpenClaw agents.
  • Fetches articles from 20+ sources, filters already-seen URLs, and deduplicates by topic using Jaccard similarity and named entities.
  • Supports scoring articles with an LLM (OpenAI-compatible) and dispatching digests to Telegram, email (via mail-client), Nextcloud (via nextcloud-files), or local files.
  • Provides a CLI with commands for fetching, scoring, and sending digests, plus stats and manual controls for the seen and topic stores.
  • No external dependencies; uses the Python 3.9+ standard library only.
  • Includes security and isolation measures for credentials, subprocesses, and file outputs.
Metadata
Slug fox-veille
Version 1.0.0
License MIT-0
Total installs 1
Current installs 1
Versions 1
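FAQ

What is Fox Veille?

RSS feed aggregator, deduplication engine, LLM scoring, and output dispatcher for OpenClaw agents. Use when: fetching recent articles from configured sources... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 95 total downloads to date.

How do I install Fox Veille?

Run the command "/install fox-veille" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Fox Veille free?

Yes. Fox Veille is completely free and released under the MIT-0 license; you may download, install, and use it freely.

Which platforms does Fox Veille support?

Fox Veille is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Fox Veille?

Fox Veille is developed and maintained by GarfieldQin (@qinthqod). The current version is v1.0.0.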
