Fox Veille
/install fox-veille
Skill Veille - RSS Aggregator
RSS feed aggregator with URL deduplication and topic-based deduplication for OpenClaw agents. Fetches articles from 20+ configured sources, filters already-seen URLs (TTL 14 days), and deduplicates articles covering the same story using Jaccard similarity + named entities.
No external dependencies: stdlib Python only (urllib, xml.etree, email.utils).
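As a rough illustration of the topic deduplication idea, the sketch below computes Jaccard similarity between the token sets of two article titles. The tokenizer, the `is_same_topic` helper, and the 0.5 threshold are assumptions made for this example; the skill also uses named entities, and its actual matching may differ.

```python
import re

def tokenize(title: str) -> set:
    """Lowercase a title and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def is_same_topic(title_a: str, title_b: str, threshold: float = 0.5) -> bool:
    # threshold is a hypothetical value, not taken from the skill
    return jaccard(tokenize(title_a), tokenize(title_b)) >= threshold
```

Two headlines that share most of their words land above the threshold and would be treated as one story; unrelated headlines score near zero.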
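Trigger phrases
- "fais une veille" (run a news watch)
- "quoi de neuf en securite / tech / crypto / IA ?" (what's new in security / tech / crypto / AI?)
- "donne-moi les news du jour" (give me today's news)
- "articles recents sur [sujet]" (recent articles about [topic])
- "veille RSS" (RSS watch)
- "digest du matin" (morning digest)
- "nouvelles non vues" (unseen news)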
Quick Start
# 1. Setup
python3 scripts/setup.py
# 2. Validate
python3 scripts/init.py
# 3. Fetch + Score + Send (full pipeline)
python3 scripts/veille.py fetch --filter-seen --filter-topic \
| python3 scripts/veille.py score \
| python3 scripts/veille.py send
Setup
Requirements
- Python 3.9+
- Network access to RSS feeds (public, no auth required)
- No pip installs needed
Installation
# From the skill directory
python3 scripts/setup.py
# Validate
python3 scripts/init.py
The wizard creates:
- ~/.openclaw/config/veille/config.json (from config.example.json)
- ~/.openclaw/data/veille/ (data directory)
Customizing sources
Edit ~/.openclaw/config/veille/config.json and add/remove entries in the "sources" dict:
{
"sources": {
"My Blog": "https://example.com/feed.xml",
"BleepingComputer": "https://www.bleepingcomputer.com/feed/"
}
}
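To check that a feed you add can be parsed with the standard library alone, something like the following sketch may help. It handles only RSS 2.0 `<item>` elements, and the `parse_rss` name and the keys of the returned dicts (`source`, `title`, `url`, `published`) are assumptions for illustration, not the skill's internal format.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

def parse_rss(xml_text: str, source: str) -> list:
    """Extract title, link and publication date from RSS 2.0 <item> elements."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # skip items without a URL
        published = None
        pub = item.findtext("pubDate")
        if pub:
            try:
                published = parsedate_to_datetime(pub)
            except (TypeError, ValueError):
                published = None  # unparseable date: keep the item anyway
        articles.append({
            "source": source,  # hypothetical key names
            "title": (item.findtext("title") or "").strip(),
            "url": link,
            "published": published,
        })
    return articles
```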
Storage and credentials
Files written by this skill
| Path | Written by | Purpose | Contains secrets |
|---|---|---|---|
| ~/.openclaw/config/veille/config.json | setup.py | Sources, outputs, options | NO |
| ~/.openclaw/data/veille/seen_urls.json | veille.py | URL dedup store (TTL 14d) | NO |
| ~/.openclaw/data/veille/topic_seen.json | veille.py | Topic dedup store (TTL 5d) | NO |
Files read from outside the skill
| Path | Read by | Key accessed | When |
|---|---|---|---|
| ~/.openclaw/openclaw.json | dispatch.py | channels.telegram.botToken (read-only) | Only when telegram_bot output is enabled and no bot_token is set in the output config |
This is the only cross-config read. To avoid it entirely, set bot_token explicitly in your output config:
{ "type": "telegram_bot", "bot_token": "YOUR_BOT_TOKEN", "chat_id": "...", "enabled": true }
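The fallback order described above can be sketched as follows. The `resolve_bot_token` function and the log message are hypothetical; only the `bot_token` field and the `channels.telegram.botToken` key come from this document.

```python
import json
import sys
from pathlib import Path
from typing import Optional

def resolve_bot_token(output_cfg: dict, openclaw_json: Path) -> Optional[str]:
    """Prefer an explicit bot_token; otherwise fall back to the OpenClaw config."""
    token = output_cfg.get("bot_token")
    if token:
        return token  # no cross-config read needed
    # Assumed log format: the document only says the read is logged to stderr.
    print(f"[veille] reading Telegram token from {openclaw_json}", file=sys.stderr)
    try:
        cfg = json.loads(openclaw_json.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    return cfg.get("channels", {}).get("telegram", {}).get("botToken")
```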
Output credentials (optional)
Credentials are only used if you enable the corresponding output. None are required for core functionality (RSS fetch + dedup).
| Output | Credential source | What is used |
|---|---|---|
| telegram_bot | ~/.openclaw/openclaw.json or bot_token in output config | Bot token (read-only) |
| mail-client | Delegated to mail-client skill (its own creds) | Nothing read directly |
| mail-client (SMTP fallback) | smtp_user / smtp_pass in output config | SMTP login |
| nextcloud | Delegated to nextcloud-files skill (its own creds) | Nothing read directly |
Cleanup on uninstall
python3 scripts/setup.py --cleanup
Security model
Credential isolation
- API keys are read from dedicated files (default ~/.openclaw/secrets/), never from config.json. The scorer warns at runtime if a key file has overly permissive filesystem permissions.
- SMTP credentials (fallback only) are stored in the output config block; use the mail-client skill delegation to avoid storing SMTP passwords.
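A permission check of the kind mentioned for key files might look like the sketch below on a POSIX system. The `warn_if_too_permissive` name, the message, and the rule "any group or other access counts as too open" are assumptions; the scorer's actual criteria are not documented here.

```python
import os
import stat
import sys

def warn_if_too_permissive(path: str) -> bool:
    """Return True (and warn on stderr) if group or others have any access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    too_open = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
    if too_open:
        print(f"[veille] warning: {path} has mode {oct(mode)}; consider chmod 600",
              file=sys.stderr)
    return too_open
```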
Subprocess boundaries
- Dispatch delegates to other OpenClaw skills via subprocess.run() (never shell=True). Script paths are validated to reside under ~/.openclaw/workspace/skills/ before execution, preventing path traversal.
- No credentials are passed as subprocess arguments; each skill manages its own authentication.
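The path validation and delegation described above could be sketched like this. `is_allowed_script` and `run_skill` are hypothetical helpers, not the skill's actual functions; the sketch relies on `Path.is_relative_to`, available from Python 3.9.

```python
import subprocess
import sys
from pathlib import Path

SKILLS_ROOT = Path.home() / ".openclaw" / "workspace" / "skills"

def is_allowed_script(script: str, root: Path = SKILLS_ROOT) -> bool:
    """True only if the resolved script path lies under the skills directory."""
    resolved = Path(script).expanduser().resolve()  # collapses ".." and symlinks
    return resolved.is_relative_to(root.resolve())

def run_skill(script: str, args: list, payload: str, root: Path = SKILLS_ROOT):
    """Run another skill's script with an argument list (no shell involved)."""
    if not is_allowed_script(script, root):
        raise ValueError(f"script outside skills directory: {script}")
    # The payload goes through stdin; no credentials appear in the arguments.
    return subprocess.run([sys.executable, script, *args],
                          input=payload, capture_output=True, text=True)
```

Resolving the path before the comparison is what defeats `../` tricks: the check runs on the final location, not on the string as written.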
File output safety
- The file output type validates the target path before writing: only ~/.openclaw/ is allowed by default. Additional directories can be whitelisted via config.security.allowed_output_dirs. Sensitive paths (.ssh, .gnupg, /etc/, .bashrc, etc.) are always blocked regardless of allowlist.
- Written content is checked for suspicious patterns (shell shebangs, SSH keys, PGP blocks, code injection) and size-limited to 1 MB.
Cross-config reads
- The only cross-config file read is ~/.openclaw/openclaw.json for the Telegram bot token, and only when telegram_bot output is enabled without an explicit bot_token. This read is logged to stderr. Set bot_token in the output config to eliminate this read entirely.
Autonomous dispatch
- When scheduled (cron), the skill can send messages/files to configured outputs without user interaction. All dispatch actions are logged to stderr with an audit summary. Use enabled: false on any output to disable it without removing its config.
CLI reference
fetch
python3 veille.py fetch [--hours N] [--filter-seen] [--filter-topic] [--sources FILE]
Options:
- --hours N: lookback window in hours (default: from config, usually 24)
- --filter-seen: filter already-seen URLs (uses seen_urls.json TTL store)
- --filter-topic: deduplicate by topic (uses topic_seen.json + Jaccard similarity)
- --sources FILE: path to custom JSON sources file
Output (JSON on stdout):
{
"hours": 24,
"count": 42,
"skipped_url": 5,
"skipped_topic": 3,
"articles": [...],
"wrapped_listing": "=== UNTRUSTED EXTERNAL CONTENT ..."
}
seen-stats
python3 veille.py seen-stats
Shows URL seen store statistics (count, TTL, file path).
topic-stats
python3 veille.py topic-stats
Shows topic deduplication store statistics.
mark-seen
python3 veille.py mark-seen URL [URL ...]
Marks one or more URLs as already seen (prevents them from appearing in future fetches with --filter-seen).
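For intuition, a TTL-based seen store can be modeled as a mapping from URL to the time it was first seen. The sketch below assumes that layout and the helper names `prune` and `filter_unseen`; the real structure of seen_urls.json is not specified in this document.

```python
TTL_SECONDS = 14 * 24 * 3600  # 14 days, the TTL documented for seen_urls.json

def prune(seen: dict, now: float, ttl: float = TTL_SECONDS) -> dict:
    """Drop entries whose timestamp is older than the TTL."""
    return {url: ts for url, ts in seen.items() if now - ts < ttl}

def filter_unseen(urls: list, seen: dict, now: float) -> list:
    """Return URLs missing from the store and record them as seen at `now`."""
    fresh = [u for u in urls if u not in seen]
    for u in fresh:
        seen[u] = now
    return fresh
```

Under this model, marking a URL as seen is just writing its timestamp, and an entry older than 14 days expires so the URL can surface again.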
score
python3 veille.py score [--dry-run]
Reads a digest JSON from stdin (output of fetch) and scores articles using an OpenAI-compatible LLM.
Returns enriched JSON with scored, ghost_picks, and per-article score/reason fields.
Options:
--dry-run: print summary on stderr without calling the LLM API
When llm.enabled is false (default), articles pass through unchanged ("scored": false).
Pipeline usage:
python3 veille.py fetch --filter-seen --filter-topic | python3 veille.py score | python3 veille.py send
send
python3 veille.py send [--profile NAME]
Reads a digest JSON from stdin and dispatches to all enabled outputs configured in config.json.
Accepts both raw fetch output (articles key) and LLM-processed digests (categories key).
Output types: telegram_bot, mail-client, nextcloud, file.
- telegram_bot: bot token auto-read from OpenClaw config; no extra setup if Telegram is already configured.
- mail-client: delegates to mail-client skill if installed, falls back to raw SMTP config.
- nextcloud: delegates to nextcloud-files skill if installed (append mode by default with date separator).
- file: writes digest to a local file. Path must be under ~/.openclaw/ (default) or a directory listed in config.security.allowed_output_dirs. Sensitive paths and suspicious content are blocked (see Security model).
Configure outputs interactively:
python3 scripts/setup.py --manage-outputs
config
python3 veille.py config
Prints the active configuration (no secrets).
LLM scoring configuration
The llm key in config.json controls the optional LLM-based article scoring:
{
"llm": {
"enabled": false,
"base_url": "https://api.openai.com/v1",
"api_key_file": "~/.openclaw/secrets/openai_api_key",
"model": "gpt-4o-mini",
"top_n": 10,
"ghost_threshold": 5
}
}
| Key | Default | Description |
|---|---|---|
| enabled | false | Enable LLM scoring (requires API key) |
| base_url | https://api.openai.com/v1 | OpenAI-compatible API endpoint |
| api_key_file | ~/.openclaw/secrets/openai_api_key | Path to file containing the API key |
| model | gpt-4o-mini | Model to use for scoring |
| top_n | 10 | Max articles to send to LLM per batch |
| ghost_threshold | 5 | Score threshold for ghost_picks (blog-worthy articles) |
Scoring rules:
- Only the first top_n articles are sent to the LLM. Articles beyond top_n are excluded from the digest entirely. fetch returns articles sorted by date desc, so top_n selects the most recent ones. Increase top_n to evaluate more articles per run (higher token cost).
- Score >= ghost_threshold: added to ghost_picks list
- Score >= 3: kept in articles list
- Score <= 2: excluded from output
- Articles are sorted by score (descending)
When disabled, the score subcommand passes data through unchanged.
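The scoring rules can be expressed as a small post-processing step. This is a sketch that assumes the LLM has already returned one integer score per article; `apply_scores` is a hypothetical helper, and the LLM call itself is left out.

```python
def apply_scores(articles: list, scores: list,
                 top_n: int = 10, ghost_threshold: int = 5) -> dict:
    """Apply the documented thresholds to scores for the first top_n articles."""
    kept = []
    for article, score in zip(articles[:top_n], scores):
        if score >= 3:  # scores of 2 or below are dropped
            kept.append({**article, "score": score})
    kept.sort(key=lambda a: a["score"], reverse=True)  # highest score first
    ghost_picks = [a for a in kept if a["score"] >= ghost_threshold]
    return {"scored": True, "articles": kept, "ghost_picks": ghost_picks}
```

With `top_n=3` and scores `[2, 5, 3]`, the first article is dropped, the other two are kept in descending score order, and only the one scoring 5 becomes a ghost pick. Any fourth article is never evaluated.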
Nextcloud output mode
The nextcloud output now defaults to append mode with a date separator. Each dispatch adds content below a ## YYYY-MM-DD HH:MM header, preserving previous entries.
Set "mode": "overwrite" in the output config to restore the old behavior:
{ "type": "nextcloud", "path": "/Veille/digest.md", "mode": "overwrite" }
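The difference between the two modes can be pictured with the sketch below. `merge_digest` is a hypothetical helper and the exact spacing around the header is an assumption; only the `## YYYY-MM-DD HH:MM` header format and the mode names come from this document.

```python
from datetime import datetime
from typing import Optional

def merge_digest(existing: str, new_content: str, mode: str = "append",
                 now: Optional[datetime] = None) -> str:
    """Return the file content after one dispatch in the given mode."""
    if mode == "overwrite":
        return new_content  # old behavior: replace everything
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M")
    section = f"## {stamp}\n\n{new_content.rstrip()}\n"
    # Append below previous entries, or start the file with the first section.
    return f"{existing.rstrip()}\n\n{section}" if existing.strip() else section
```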
File output configuration
The file output writes digests to the local filesystem. By default, only paths under ~/.openclaw/ are allowed. To authorize additional directories, use config.security.allowed_output_dirs:
{
"security": {
"allowed_output_dirs": [
"~/Documents/veille",
"/srv/digests"
]
}
}
Blocked paths (always rejected, even if inside an allowed directory):
.ssh, .gnupg, .config/systemd, crontab, /etc/, .bashrc, .profile, .bash_profile, .zshrc, .env
Content validation — written content is rejected if it:
- Exceeds 1 MB
- Contains shell shebangs (#!/), SSH keys, PGP blocks, or code injection patterns (eval(, exec(, __import__(, import os, import subprocess)
All blocked attempts are logged to stderr with the reason.
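Taken together, the path and content rules amount to a check along these lines. This is a simplified sketch: `check_output` is a hypothetical function, the substring matching is cruder than a real implementation would want, and the `-----BEGIN ` marker for SSH/PGP material is an assumption.

```python
from pathlib import Path

BLOCKED_PARTS = (".ssh", ".gnupg", ".config/systemd", "crontab", "/etc/",
                 ".bashrc", ".profile", ".bash_profile", ".zshrc", ".env")
SUSPICIOUS = ("#!/", "eval(", "exec(", "__import__(", "import os",
              "import subprocess",
              "-----BEGIN ")  # assumed marker for SSH keys and PGP blocks
MAX_BYTES = 1024 * 1024  # 1 MB

def check_output(path: str, content: str, allowed_dirs: list,
                 default_root: str = "~/.openclaw") -> str:
    """Return "" if the write is allowed, otherwise a short rejection reason."""
    resolved = Path(path).expanduser().resolve()
    if any(part in str(resolved) for part in BLOCKED_PARTS):
        return "blocked path"  # rejected even inside an allowed directory
    roots = [Path(d).expanduser().resolve() for d in [default_root, *allowed_dirs]]
    if not any(resolved.is_relative_to(r) for r in roots):
        return "outside allowed directories"
    if len(content.encode("utf-8")) > MAX_BYTES:
        return "content exceeds 1 MB"
    if any(pattern in content for pattern in SUSPICIOUS):
        return "suspicious content"
    return ""
```

The blocked-path test runs before the allowlist test, which mirrors the rule that sensitive paths are rejected even when they sit inside an allowed directory.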
Templates (agent usage)
Basic digest
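# In an agent tool call (shown here with subprocess so it runs as plain Python):
import json
import subprocess
result = subprocess.run(
    ["python3", "scripts/veille.py", "fetch", "--hours", "24", "--filter-seen", "--filter-topic"],
    capture_output=True, text=True, check=True,
)
data = json.loads(result.stdout)
# data["wrapped_listing"] is ready to insert into an LLM prompt
# data["count"] = number of new articles
# data["articles"] = list of article dicts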
Prompt template
You are a news analyst. Here are today's articles:
{data["wrapped_listing"]}
Please summarize the 5 most important stories, focusing on security and tech.
Agent workflow example
1. Call veille fetch --filter-seen --filter-topic
2. Pipe through veille score (LLM scoring, if enabled)
3. If count > 0: pass wrapped_listing to LLM for analysis
4. LLM produces digest summary
5. Pipe through veille send (dispatches to configured outputs)
Pipeline (CLI)
python3 scripts/veille.py fetch --filter-seen --filter-topic \
| python3 scripts/veille.py score \
| python3 scripts/veille.py send
Filtering by keyword (post-fetch)
import json
# fetch_output: the JSON string printed by veille.py fetch
data = json.loads(fetch_output)
keywords = ["cve", "vuln", "patch", "breach"]
security_articles = [
    a for a in data["articles"]
    if any(kw in a.get("title", "").lower() for kw in keywords)
]
Ideas
- Add keyword-based filtering (--keywords security,cve,linux)
- Add per-source TTL override in config
- Export digest as HTML or Markdown
- Schedule with cron: 0 8 * * * python3 veille.py fetch --filter-seen --filter-topic
- Weight articles by source tier for LLM prioritization
- Add OPML import/export for source list management
- Integrate with ntfy or Telegram for real-time alerts on high-priority articles
Combine with
- mail-client: send the digest by email after fetching
  veille fetch --filter-seen | ... | mail-client send
- nextcloud-files: archive the daily digest as a Markdown file
  veille fetch --filter-seen | jq .wrapped_listing -r > /tmp/digest.md
  nextcloud-files upload /tmp/digest.md /Digests/$(date +%Y-%m-%d).md
Troubleshooting
See references/troubleshooting.md for detailed troubleshooting steps.
Common issues:
- No articles returned: check the --hours value and verify the feed URLs in config
- XML parse error on a feed: some feeds use non-standard XML; the skill skips broken items silently
- All articles filtered as seen: run seen-stats to check store size; reset with rm ~/.openclaw/data/veille/seen_urls.json
- Import error: ensure you run veille.py from its directory or via full path
- File output blocked: path is outside ~/.openclaw/; add the target directory to config.security.allowed_output_dirs (see File output configuration)
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install fox-veille
- After installation, invoke the skill by name or use /fox-veille
- Provide required inputs per the skill's parameter spec and get structured output
What is Fox Veille?
RSS feed aggregator, deduplication engine, LLM scoring, and output dispatcher for OpenClaw agents. Use when: fetching recent articles from configured sources... It is an AI Agent Skill for Claude Code / OpenClaw, with 95 downloads so far.
How do I install Fox Veille?
Run "/install fox-veille" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Fox Veille free?
Yes, Fox Veille is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Fox Veille support?
Fox Veille is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Fox Veille?
It is built and maintained by GarfieldQin (@qinthqod); the current version is v1.0.0.