/install photo-screener
Photo Screener — MobileCLIP-powered Smart Pre-screening
Intelligently filter, deduplicate, and classify photos using Apple MobileCLIP2-S0, preparing them for efficient multimodal LLM processing.
Why MobileCLIP2-S0?
Based on a four-model comparison test; the table shows MobileCLIP2-S0 against the ViT-L/14 baseline:
| Metric | MobileCLIP2-S0 | ViT-L/14 (baseline) |
|---|---|---|
| Encoding Speed | 26.7ms/img ⚡ | 483.7ms/img |
| Speed Ratio | 18.1x faster | 1x |
| Pearson Correlation | 0.78 | 1.0 (baseline) |
| Top-10 Overlap | 8/10 | 10/10 |
| Model Size | 74.8M | 427.6M |
| Embed Dim | 512 | 768 |
💡 About 1/18 of the encoding time with 80% selection consistency (8 of the baseline's top 10 picks retained): the best speed/quality tradeoff in this comparison.
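For readers who want to reproduce this kind of comparison on their own photos, the three headline metrics can be sketched as below. This is a minimal illustration, not the benchmark script behind the table; the function names are hypothetical, and only the two timing figures come from the table above.

```python
import math


def speed_ratio(baseline_ms, candidate_ms):
    """How many times faster the candidate encodes one image."""
    return baseline_ms / candidate_ms


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def top_k_overlap(scores_a, scores_b, k=10):
    """Count photos that appear in the top k of both score lists."""
    def top(scores):
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return set(order[:k])
    return len(top(scores_a) & top(scores_b))


# Timings taken from the table above.
print(round(speed_ratio(483.7, 26.7), 1))  # 18.1
```

An overlap of 8 with `k=10` corresponds to the 80% selection consistency quoted above.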
Dependencies
Declaration file: requirements.txt
Prefer venv: Before running scripts, activate the project-root virtual environment (e.g. .venv/). If it doesn't exist, create one first:
# Create venv and install dependencies (recommended)
python3 -m venv .venv
source .venv/bin/activate
pip install -r photo-screener/requirements.txt
# Or use the skill's setup script (checks + installs)
bash photo-screener/scripts/setup_deps.sh
# Before each session, activate venv
source .venv/bin/activate
Alternatively, install globally:
pip3 install -r photo-screener/requirements.txt
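To confirm from Python that the venv is active and the dependencies are importable, something like the following sketch may help. It is an assumption about what a useful check looks like, not a description of what setup_deps.sh does; `open_clip` is the only module name taken from this document, and the helper names are hypothetical.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def missing_modules(names=("open_clip",)):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not in_virtualenv():
        print("Warning: no virtual environment is active")
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
```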
Model Download Policy
The model is NOT pre-downloaded. This is by design to avoid:
- Unexpected large downloads (~300MB)
- Wasted bandwidth if the user doesn't need this skill
Download Behavior
| Mode | Behavior |
|---|---|
| Interactive (terminal) | Prompts user: "是否下载模型?[Y/n]" ("Download the model? [Y/n]") |
| Non-interactive (piped/agent) | Exits with manual download instructions |
| --auto-download flag | Downloads without confirmation |
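The three modes in the table can be read as a small decision function. The sketch below is one plausible way to express that logic, assuming the interactive/non-interactive split is detected with `sys.stdin.isatty()`; the function name and the English prompt text are hypothetical and not taken from screen.py.

```python
import sys


def should_download(auto_download, stdin_is_tty, ask=input):
    """Return True to download the model, False to exit with manual instructions."""
    if auto_download:
        # --auto-download: no confirmation needed.
        return True
    if not stdin_is_tty:
        # Piped or agent context: never block on a prompt.
        return False
    # Interactive terminal: ask, treating an empty answer as "yes" ([Y/n]).
    answer = ask("Download the model? [Y/n] ").strip().lower()
    return answer in ("", "y", "yes")


# Typical call site (not executed here):
#   should_download(args.auto_download, sys.stdin.isatty())
```

Passing `ask` as a parameter keeps the function testable without a real terminal.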
Manual Download
# Using China mirror (recommended)
HF_ENDPOINT=https://hf-mirror.com python3 -c \
"import open_clip; open_clip.create_model_and_transforms('MobileCLIP2-S0', pretrained='dfndr2b')"
# Or run setup script
bash photo-screener/scripts/setup_deps.sh
Configuration
Copy config.example.toml to config.toml and edit. See config.example.toml for all available options.
Usage
# Basic screening
python3 scripts/screen.py ~/data/output/thumbnails
# Custom thresholds
python3 scripts/screen.py ~/data/output/thumbnails \
--min-score 5.0 --sim-threshold 0.95
# Keep top 50
python3 scripts/screen.py ~/data/output/thumbnails --top-k 50
# Auto-download model (skip confirmation)
python3 scripts/screen.py ~/data/output/thumbnails --auto-download
# Pass specific file paths instead of a directory
python3 scripts/screen.py \
--paths ~/data/RAW/001/thumbnails/DSC_0001.jpg \
~/data/RAW/001/thumbnails/DSC_0002.jpg
# Dry run
python3 scripts/screen.py ~/data/output/thumbnails --dry-run
| Option | Description | Default |
|---|---|---|
| input_dir | Directory with photos (optional with --paths) | required |
| --paths | Specific image paths (alternative to input_dir) | none |
| --output, -o | Output JSON path | auto |
| --min-score | Min aesthetic score (1-10) | 4.0 |
| --sim-threshold | Dedup threshold (0-1) | 0.97 |
| --batch-size | Max photos per LLM batch | 20 |
| --top-k | Keep only top K | all |
| --recursive | Search subdirectories | off |
| --auto-download | Skip model download prompt | off |
| --dry-run | Preview only | off |
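For orientation, the option table maps naturally onto an `argparse` definition. The following is a sketch of how such a parser could look, not the actual code in screen.py; the flag names and defaults come from the table, while the function names and the "input_dir or --paths" check are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="screen.py")
    parser.add_argument("input_dir", nargs="?", help="directory with photos")
    parser.add_argument("--paths", nargs="+", default=None, help="specific image paths")
    parser.add_argument("--output", "-o", default=None, help="output JSON path")
    parser.add_argument("--min-score", type=float, default=4.0)
    parser.add_argument("--sim-threshold", type=float, default=0.97)
    parser.add_argument("--batch-size", type=int, default=20)
    parser.add_argument("--top-k", type=int, default=None, help="None keeps all")
    parser.add_argument("--recursive", action="store_true")
    parser.add_argument("--auto-download", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def parse_cli(argv):
    """Parse argv and require either input_dir or --paths (assumed rule)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.input_dir is None and not args.paths:
        parser.error("provide input_dir or --paths")
    return args
```

With this layout, `parse_cli(["thumbs", "--top-k", "50"])` yields `top_k == 50` and leaves the other options at their documented defaults.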
Pipeline
Photos (thumbnails)
│
▼ Stage 1: MobileCLIP Encoding (~27ms/image)
│ → 512-dim normalized embeddings
│
├── Stage 2: Aesthetic Scoring
│ └── LAION MLP (zero-padded 512→768 dim)
│ └── Remove below threshold (default: 4.0)
│
├── Stage 3: Similarity Dedup
│ └── Cosine similarity + greedy dedup
│ └── Higher score = higher priority
│
├── Stage 4: Scene Classification
│ └── Zero-shot text matching (14 categories)
│
└── Output: filter_report.json
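Stages 2 to 4 can be illustrated with a few lines of dependency-free Python. This is a simplified sketch under stated assumptions, not the skill's implementation: embeddings are assumed to be L2-normalized (so a dot product equals cosine similarity), a photo is treated as a duplicate when its similarity to an already kept photo reaches the threshold, and the scene labels in the usage note are made up because the 14 real categories are not listed here. A real pipeline would more likely use NumPy or PyTorch tensors.

```python
def zero_pad(vec, target_dim=768):
    """Right-pad an embedding with zeros (512 -> 768 for the LAION MLP input)."""
    if len(vec) > target_dim:
        raise ValueError("embedding is longer than target_dim")
    return list(vec) + [0.0] * (target_dim - len(vec))


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def filter_and_dedup(photos, min_score=4.0, sim_threshold=0.97):
    """photos: list of (name, aesthetic_score, embedding) tuples.

    Drops photos scoring below min_score, then walks the rest from highest
    to lowest score and keeps a photo only if it is not too similar to one
    that was already kept (greedy dedup, higher score wins).
    """
    candidates = sorted(
        (p for p in photos if p[1] >= min_score),
        key=lambda p: p[1],
        reverse=True,
    )
    kept = []
    for name, score, emb in candidates:
        if all(dot(emb, other[2]) < sim_threshold for other in kept):
            kept.append((name, score, emb))
    return [name for name, _, _ in kept]


def classify(embedding, label_embeddings):
    """Zero-shot label: the text embedding most similar to the image embedding."""
    return max(label_embeddings, key=lambda label: dot(embedding, label_embeddings[label]))
```

For example, `classify([1.0, 0.0], {"portrait": [0.0, 1.0], "landscape": [1.0, 0.0]})` picks `"landscape"`, the label whose vector points the same way as the image embedding.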
Agent Integration
When using this skill from an agent:
- Check dependencies: bash scripts/setup_deps.sh
- Run with --auto-download: in agent context, use --auto-download to skip the interactive prompt
- Or pre-download: run setup_deps.sh first, which handles the model download with user confirmation
# Agent-friendly command (auto-download)
python3 photo-screener/scripts/screen.py \
~/data/output/{session-id}/thumbnails \
--output ~/data/output/{session-id}/filter_report.json \
--auto-download
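An agent written in Python could wrap the same command with `subprocess`. The sketch below assumes the directory layout shown in the command above; the helper names are hypothetical, and since the structure of filter_report.json is not documented here, the report is returned as parsed JSON without interpreting its fields.

```python
import json
import subprocess
from pathlib import Path


def build_command(session_id, data_root="~/data/output",
                  script="photo-screener/scripts/screen.py"):
    """Assemble the agent-friendly command line for one session."""
    session_dir = Path(data_root).expanduser() / session_id
    return [
        "python3", script,
        str(session_dir / "thumbnails"),
        "--output", str(session_dir / "filter_report.json"),
        "--auto-download",
    ]


def run_screener(session_id):
    """Run the screener and return the parsed JSON report."""
    cmd = build_command(session_id)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    report_path = Path(cmd[cmd.index("--output") + 1])
    with report_path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Using `check=True` surfaces a non-zero exit status as an exception instead of silently reading a stale or missing report.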
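- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install photo-screener
- Once installed, call the Skill by name or trigger it with /photo-screener
- Provide the required inputs described in the Skill's parameter documentation to get structured output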
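What is Photo Screener?
AI-powered photo pre-screening using the MobileCLIP2-S0 model. 18x faster than ViT-L/14 with 80% selection consistency (Top-10 overlap 8/10). Use when the user w... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 43 times to date.
How do I install Photo Screener?
Run the command "/install photo-screener" in the OpenClaw or Claude Code chat box for a one-click install; no extra configuration is required.
Is Photo Screener free?
Yes. Photo Screener is completely free and is released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does Photo Screener support?
Photo Screener is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Photo Screener?
It is developed and maintained by konanok (@konanok); the current version is v1.0.0.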