
image-collector

Name: image-collector
Author: lunadelo

by luna · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 101

Install in OpenClaw

/install image-collector

Description

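An image collection tool for an AI tech daily. It automatically collects news illustrations from official sources and supports watermark detection, quality checks, and relevance verification.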

README (SKILL.md)

image-collector Skill

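Feature Overview

Automatically collects illustrations for AI tech daily news items, ensuring that:

✅ Images are closely tied to the content: official sources are preferred
✅ No watermarks: watermarked images are detected and filtered out automatically
✅ High quality: resolution ≥ 800x600, no visible compression artifacts
✅ No arbitrary images: relevance is verified and random images are rejected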

Quick Start

# Check dependencies
bash ~/.openclaw/workspace/skills/image-collector/scripts/check-deps.sh

Basic Usage

# Collect images for a single news item
python3 ~/.openclaw/workspace/skills/image-collector/scripts/collect_images.py \
  --news "苹果国行 AI 凌晨偷跑" \
  --keywords "Apple,Intelligence,Baidu" \
  --source "apple.com"

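Image Source Priority

Priority	Source type	Example domains
P0	Official press releases	apple.com, microsoft.com, openai.com
P1	Authoritative media	36kr.com, bloomberg.com, reuters.com
P2	Product screenshots	Captured manually (using the web-access skill)
P3	Self-made charts	Python matplotlib / Excel
Blocked	Random images	unsplash.com, pixabay.com
Blocked	WeChat images	mmbiz.qpic.cn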
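
The priority table can be read as a lookup from an image URL's host to a tier. The following is a minimal sketch of that idea, not the code shipped in collect_images.py: the domain lists are copied from the table, while the names `SOURCE_PRIORITY`, `BLOCKED_DOMAINS`, and `classify_source` are hypothetical. P2 and P3 are manual sources, so they are not represented here.

```python
from urllib.parse import urlparse

# Domains copied from the table; the dictionary layout is an assumption.
SOURCE_PRIORITY = {
    "apple.com": "P0",
    "microsoft.com": "P0",
    "openai.com": "P0",
    "36kr.com": "P1",
    "bloomberg.com": "P1",
    "reuters.com": "P1",
}
BLOCKED_DOMAINS = {"unsplash.com", "pixabay.com", "mmbiz.qpic.cn"}


def _matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_source(url: str) -> str:
    """Return 'P0', 'P1', 'blocked', or 'unknown' for an image URL."""
    host = (urlparse(url).hostname or "").lower()
    # Blocked domains are checked first so they can never be promoted.
    if any(_matches(host, d) for d in BLOCKED_DOMAINS):
        return "blocked"
    for domain, priority in SOURCE_PRIORITY.items():
        if _matches(host, domain):
            return priority
    return "unknown"
```

Matching on a leading dot keeps look-alike hosts such as notapple.com from being treated as apple.com.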

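Verification Flow

Source verification → collect only from whitelisted sources
Watermark detection → check the four corners and the bottom edge; filter out watermarked images
Quality check → resolution ≥ 800x600, normal aspect ratio
Relevance verification → score keyword matches in the file name
✅ Final output → optimized image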
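
One way to picture the flow is as an ordered list of checks that stops at the first failure. The sketch below assumes each check has already been reduced to a field on a candidate record; `run_checks`, `CHECKS`, and the field names are hypothetical and are not taken from the skill's scripts.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A check returns None on success or a short rejection reason.
Check = Callable[[Dict], Optional[str]]


def run_checks(candidate: Dict, checks: List[Tuple[str, Check]]) -> Tuple[bool, str]:
    """Run the checks in order and stop at the first one that fails."""
    for name, check in checks:
        reason = check(candidate)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, "accepted"


# Hypothetical checks over precomputed fields, in the order of the flow above.
CHECKS: List[Tuple[str, Check]] = [
    ("source", lambda c: None if c["whitelisted"] else "not a whitelisted source"),
    ("watermark", lambda c: "watermark detected" if c["has_watermark"] else None),
    ("quality", lambda c: None if c["width"] >= 800 and c["height"] >= 600 else "below 800x600"),
    ("relevance", lambda c: None if c["keyword_hits"] > 0 else "no keyword in file name"),
]
```

Running the cheap source check first means a rejected candidate never needs to be downloaded or decoded.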

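Using with the web-access skill

When automatic collection fails:

# 1. Open the official site with web-access and take a screenshot
curl -s "http://localhost:3456/new?url=https://www.apple.com/newsroom"

# 2. After taking the screenshot, save it manually to article-images/

# 3. Optimize it with image-collector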
python3 collect_images.py --optimize /tmp/screenshot.png

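Verification Criteria

1. Relevance verification

The image subject matches the news headline
The image contains the news keywords
The image source is related to the subject of the news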
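
The README describes the automated relevance check as keyword-match scoring on the file name. A minimal sketch of such a score follows, assuming it is the fraction of `--keywords` values found in the file name; the function name `filename_keyword_score` and the scoring formula are assumptions, and the actual script may weight matches differently.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import unquote, urlparse


def filename_keyword_score(image_url: str, keywords: List[str]) -> float:
    """Fraction of keywords that appear in the image file name (0.0 to 1.0)."""
    # Use only the last path component, without its extension.
    path = unquote(urlparse(image_url).path)
    stem = PurePosixPath(path).stem.lower()
    wanted = [k.strip().lower() for k in keywords if k.strip()]
    if not wanted:
        return 0.0
    hits = sum(1 for k in wanted if k in stem)
    return hits / len(wanted)
```

A file named Apple-Intelligence-hero.jpg checked against the keywords Apple, Intelligence, and Baidu matches two of three keywords. Any acceptance threshold applied to the score would be a further assumption.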

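2. Watermark check

No logo watermarks in the four corners
No WeChat official account name along the bottom
No conspicuous copyright marks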
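
The watermark check looks at the four corners and the bottom of the image. The sketch below only computes those regions as crop boxes in the (left, upper, right, lower) form accepted by Pillow's `Image.crop`. The detection heuristic itself is not described in the README and is not reproduced here; the function name `watermark_regions` and the 20% and 10% fractions are hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower)


def watermark_regions(width: int, height: int,
                      corner_frac: float = 0.2,
                      bottom_frac: float = 0.1) -> Dict[str, Box]:
    """Return crop boxes for the four corners and the bottom strip.

    The fractions are assumed values, not taken from the skill's code.
    """
    cw = round(width * corner_frac)   # corner width in pixels
    ch = round(height * corner_frac)  # corner height in pixels
    bh = round(height * bottom_frac)  # bottom strip height in pixels
    return {
        "top_left": (0, 0, cw, ch),
        "top_right": (width - cw, 0, width, ch),
        "bottom_left": (0, height - ch, cw, height),
        "bottom_right": (width - cw, height - ch, width, height),
        "bottom_strip": (0, height - bh, width, height),
    }
```

Each box can then be cropped and handed to whatever detector is in use, so the full image is never analyzed.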

3. Quality check

Resolution ≥ 800x600
No visible compression artifacts
Normal colors
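
The resolution rule is straightforward to express in code. The sketch below takes the width and height as plain integers (with Pillow they would come from `Image.size`) and reads "≥ 800x600" as width ≥ 800 and height ≥ 600, which is an interpretation. The aspect-ratio bounds are assumed values for the "normal aspect ratio" mentioned in the verification flow, and `check_quality` is a hypothetical name. Compression artifacts and color are not covered.

```python
from typing import Tuple

MIN_WIDTH, MIN_HEIGHT = 800, 600  # thresholds stated in the README
MIN_RATIO, MAX_RATIO = 0.5, 3.0   # assumed bounds for a "normal" aspect ratio


def check_quality(width: int, height: int) -> Tuple[bool, str]:
    """Return (passed, reason) for the resolution and aspect-ratio rules."""
    if width < MIN_WIDTH or height < MIN_HEIGHT:
        return False, f"resolution {width}x{height} is below {MIN_WIDTH}x{MIN_HEIGHT}"
    ratio = width / height
    if not MIN_RATIO <= ratio <= MAX_RATIO:
        return False, f"aspect ratio {ratio:.2f} is outside {MIN_RATIO}-{MAX_RATIO}"
    return True, "ok"
```

Under this reading a portrait image of 600x800 is rejected; comparing the longer and shorter sides instead would accept it.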

Author

九万

Usage Guidance

This skill appears coherent for automated image collection: it will make outbound HTTP requests and save downloaded images under /home/node/.openclaw/workspace/article-images. Before running it, review the full collect_images.py (the provided content was truncated in the listing) to confirm there are no unexpected network endpoints or commands. If you care about privacy or network egress, run it in a sandboxed environment or inspect/limit allowed domains. Also review the hardcoded lists of example image URLs and the use of the jina.ai proxy search URL (r.jina.ai) if you do not want search queries routed through third-party proxies.

Capability Analysis

Type: OpenClaw Skill
Name: image-collector
Version: 1.0.0

The image-collector skill is a legitimate tool designed to automate the gathering and processing of news images from official sources. It includes scripts for dependency checking (check-deps.sh), heuristic-based watermark and quality detection (collect_images.py), and direct downloading from a whitelist of official URLs (collect_simple.py). The code uses standard libraries like Pillow and requests and follows the stated purpose without any evidence of data exfiltration, malicious execution, or prompt injection. The use of r.jina.ai as a search proxy is a common pattern for AI agents to parse web content.

Capability Assessment

✓ Purpose & Capability

Name/description (image collection, watermark/quality/relevance checks) align with the provided scripts and README. The code implements search/download, watermark/quality checks, and local optimization; required Python packages (Pillow, requests) are consistent with the task.

ℹ Instruction Scope

SKILL.md instructs running the included scripts and (optionally) calling a local web-access service for screenshots. The scripts perform HTTP requests to external sites, download image bytes, and write them to the workspace /home/node/.openclaw/workspace/article-images. This behavior is expected for an image collection tool but means network I/O and filesystem writes will occur; the skill does not attempt to read unrelated system files or environment secrets.

✓ Install Mechanism

No install spec; code is shipped as plain Python scripts. No remote installers or archive downloads are used. Dependencies are standard Python packages (Pillow, requests) and a simple shell dependency check script.

✓ Credentials

The skill requests no environment variables or credentials. The scripts do set and use a fixed workspace path for output but do not require secret access tokens. External network requests are limited to image/search endpoints and hardcoded example URLs.

✓ Persistence & Privilege

Skill is not always-enabled and does not request elevated platform privileges. It writes files into its own workspace directory and does not modify other skills or global agent configuration.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install image-collector
After installation, invoke the skill by name or use /image-collector
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

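image-collector 1.0.0
- Initial release: automatically collects illustrations for AI tech daily news
- Fetches images from official, authoritative sources and automatically detects and filters out watermarked images
- Supports image quality checks (resolution ≥ 800x600)
- Verifies that images are closely related to the news content
- Ships with usage examples and a dependency check script
- Can be used together with the web-access skill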

Metadata

Slug image-collector

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is image-collector?

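image-collector is an image collection tool for an AI tech daily: it automatically collects news illustrations from official sources and supports watermark detection, quality checks, and relevance verification. It is an AI Agent Skill for Claude Code / OpenClaw, with 101 downloads so far.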

How do I install image-collector?

Run "/install image-collector" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is image-collector free?

Yes, image-collector is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does image-collector support?

image-collector is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created image-collector?

It is built and maintained by luna (@lunadelo); the current version is v1.0.0.
