43622283

Li PhotoIndexWithLLM

by Terry S Fisher · GitHub ↗ · v1.1.2 · MIT-0
cross-platform ⚠ suspicious
96 Downloads · 0 Stars · 0 Active Installs · 2 Versions
Install in OpenClaw
/install li-photoindexwithllm
Description
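Scans and indexes photos in multiple formats using a vision-language (VL) model, supports natural-language semantic search and structured JSON output, and is easy to call from agents.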
README (SKILL.md)

Photo Search Skill

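📸 Smart photo search skill: photo indexing and semantic search powered by a VL model

📋 Overview

Photo Search Skill is a standalone skill program for scanning, indexing, and searching photos. It uses a VL (Vision-Language) model to analyze photo content, builds a structured index, and supports semantic search through natural-language queries.

✨ Features

  • 🔍 Fully standalone: can be called from any directory, with no environment variables to set
  • 🤖 Agent-friendly: designed for agents such as hermes and openclaw
  • 📦 Zero-configuration calls: locates the main project automatically, with no manual path setup
  • 🌐 JSON output: supports structured output that agents can parse easily

Core capabilities

  • 🔍 Photo scanning: scans all photo files in the specified directories
  • 🤖 VL analysis: uses a local or remote VL model to interpret photo content
  • 📊 Automatic indexing: generates structured data such as scenes, objects, people, and tags
  • 🔎 Semantic search: supports natural-language queries and interprets search intent
  • 🏷️ Manual annotation: users can define custom tags to train personalized recognition
  • 🌐 CLI interface: invoked from the command line, so it is easy to integrate into agents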

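📷 Supported image formats (17)

| Type | Formats |
| --- | --- |
| Common formats | .jpg .jpeg .png .webp .bmp .tiff .gif |
| iPhone/Apple | .heic .heif |
| Canon DSLR | .cr2 |
| Nikon DSLR | .nef |
| Sony DSLR | .arw |
| Olympus | .orf |
| Fujifilm | .raf |
| Generic RAW | .dng |
| Panasonic | .rw2 |
| Pentax | .pef |
| Sony (legacy) | .sr2 |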
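
As an illustration of how the format table can be applied, the sketch below filters files by extension before anything is handed to the skill. It is not the skill's own scanner: the names `SUPPORTED_EXTENSIONS`, `is_supported_photo`, and `find_photos` are hypothetical, and only the extension list comes from the table.

```python
from pathlib import Path

# Extensions copied from the format table; the set name is illustrative.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tiff", ".gif",  # common formats
    ".heic", ".heif",                                           # iPhone/Apple
    ".cr2", ".nef", ".arw", ".orf", ".raf",                     # camera RAW
    ".dng", ".rw2", ".pef", ".sr2",
}


def is_supported_photo(path: str) -> bool:
    """Return True when the file extension (case-insensitive) is in the table."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_photos(root: str) -> list[Path]:
    """Recursively collect files under root whose extension is supported."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_supported_photo(p.name)
    )
```

A check like this can help an agent decide whether a directory is worth scanning at all, but the skill's real matching rules may differ.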

🚀 Quick start

Calling from an agent

Agents (such as hermes and openclaw) can call this skill directly from the command line:

# Call from any directory (using an absolute path)
python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py search "beach sunset"

# Scan photos
python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py scan --dir D:\Photos

# Scan and search in one step
python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py scan_and_search --dir D:\Photos --query "seaside"

# JSON output (easy for agents to parse)
python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py search "beach" --format json

Calling from the project directory

cd G:\python\PhotoIndexWithLLM

# Scan photos
python skills/photo-search/skill.py scan --dir D:\Photos

# Search photos
python skills/photo-search/skill.py search "beach sunset"

# Scan and search
python skills/photo-search/skill.py scan_and_search --dir D:\Photos --query "seaside"

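📖 Agent integration guide

Environment requirements

The main project must be set up with:

  • Python 3.10+
  • Dependencies installed: pip install -r requirements.txt
  • A configured .env file
  • LM Studio running on port 1234 (only if a local model is used)

The skill itself needs no additional configuration.

Basic call patterns

# 1. Scan and index photos
python <skill-path> scan --dir <photo-dir>

# 2. Search photos
python <skill-path> search "<search-keywords>"

# 3. Scan and search (combined command)
python <skill-path> scan_and_search --dir <dir> --query "<keywords>"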

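Agent workflow

User request: "Help me find photos of the seaside"
    ↓
Agent runs:
    python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py search "seaside" --format json
    ↓
Result returned:
    {
      "results": [...],
      "total": 5,
      "search_type": "hybrid"
    }
    ↓
Agent replies to the user:
    "Found 5 seaside photos..."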

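The last step of the workflow, turning the returned payload into a reply, might look like the sketch below. The function name `format_reply` and the reply wording are hypothetical. The `results`, `total`, and `file_name` keys follow the JSON shown in this document, and the `error` key follows the convention used in Example 1 of the integration examples.

```python
def format_reply(result: dict, preview_count: int = 3) -> str:
    """Summarize a parsed search payload as a short user-facing reply."""
    if "error" in result:
        return f"Photo search failed: {result['error']}"

    total = result.get("total", 0)
    if total == 0:
        return "No matching photos were found."

    # List a few file names so the user sees concrete matches.
    names = [item.get("file_name", "(unnamed)") for item in result.get("results", [])]
    reply = f"Found {total} photos"
    if names:
        reply += ", including " + ", ".join(names[:preview_count])
    return reply + "."
```

Keeping this step separate from the subprocess call makes it easy to test the reply logic without a configured main project.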
🎯 Full command reference

Scanning photos

# Scan a specific directory
python skill.py scan --dir D:\MyPhotos

# Scan multiple directories
python skill.py scan --dir D:\Photos E:\Pictures

# Force re-indexing
python skill.py scan --force --dir D:\Photos

Searching photos

# Keyword search
python skill.py search "beach sunset"

# Semantic search (natural language)
python skill.py search "blue seaside scenery"

# Filter by tags
python skill.py search "travel" --tags landscape,people

# Filter by scene
python skill.py search "landscape" --scene outdoor

# Filter by date range
python skill.py search "travel" --date-from 2024-01-01 --date-to 2024-12-31

# Limit the number of results
python skill.py search "beach" --limit 10

# JSON output
python skill.py search "beach" --format json
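
When an agent assembles these search flags programmatically, building the argument list in one place avoids quoting mistakes. The following is a minimal sketch under that assumption: `build_search_command` is a hypothetical helper, and it uses only the flags listed in this reference (`--tags`, `--scene`, `--date-from`, `--date-to`, `--limit`, `--format`).

```python
import sys
from typing import Optional


def build_search_command(
    skill_path: str,
    query: str,
    tags: Optional[list[str]] = None,
    scene: Optional[str] = None,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    limit: Optional[int] = None,
    as_json: bool = True,
) -> list[str]:
    """Build the argv list for `skill.py search` from optional filters."""
    # sys.executable reuses the interpreter that runs the agent itself.
    cmd = [sys.executable, skill_path, "search", query]
    if tags:
        cmd += ["--tags", ",".join(tags)]  # tags are passed comma-separated
    if scene:
        cmd += ["--scene", scene]
    if date_from:
        cmd += ["--date-from", date_from]
    if date_to:
        cmd += ["--date-to", date_to]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if as_json:
        cmd += ["--format", "json"]
    return cmd
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with paths or queries that contain spaces.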

Scan and search (combined command)

# One step: scan first, then search
python skill.py scan_and_search --dir D:\Photos --query "seaside"

# JSON output
python skill.py scan_and_search --dir D:\Photos --query "seaside" --format json

Manual annotation

# Add a person tag to a photo
python skill.py annotate --photo D:\Photos\img001.jpg --type person --name "Zhang San"

# Add a scene tag
python skill.py annotate --photo D:\Photos\img002.jpg --type scene --name seaside

# JSON output
python skill.py annotate --photo D:\Photos\img001.jpg --type person --name "Zhang San" --format json

Training the model

# Train the personalized model
python skill.py train

# JSON output
python skill.py train --format json

Other commands

# Show statistics
python skill.py stats

# Test the LLM connection
python skill.py test

# List photos
python skill.py list --limit 20

🤖 Agent integration examples

Example 1: Python agents (Hermes, OpenClaw, etc.)

import json
import subprocess

def search_photos(query: str, limit: int = 20) -> dict:
    """Search photos and return the parsed JSON result."""
    skill_path = r"G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py"

    result = subprocess.run(
        ["python", skill_path, "search", query, "--limit", str(limit), "--format", "json"],
        capture_output=True,
        text=True
    )

    if result.returncode == 0:
        return json.loads(result.stdout)
    else:
        return {"error": result.stderr}

# Usage
photos = search_photos("beach sunset")
if "error" in photos:
    # The error branch has no "total" key, so check before reading it
    print(f"Search failed: {photos['error']}")
else:
    print(f"Found {photos['total']} photos")

Example 2: Shell script agents

#!/bin/bash
# photo_agent.sh - photo search script for agents

# Single quotes keep the backslashes in the Windows path literal
SKILL='G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py'

# Search photos
search_photos() {
    local query="$1"
    local limit="${2:-20}"

    python "$SKILL" search "$query" --limit "$limit" --format json
}

# Scan photos
scan_photos() {
    local dir="$1"
    python "$SKILL" scan --dir "$dir"
}

# Usage
search_photos "beach" 10

Example 3: Generic agent wrapper

import json
import subprocess

class PhotoSearchSkill:
    """Wrapper around the photo search skill CLI."""

    def __init__(self, skill_path: str):
        self.skill_path = skill_path

    def search(self, query: str, **kwargs) -> dict | None:
        """Search photos; returns the parsed JSON result, or None on failure."""
        # Always request JSON, because stdout is parsed with json.loads below
        cmd = ["python", self.skill_path, "search", query, "--format", "json"]

        if "limit" in kwargs:
            cmd += ["--limit", str(kwargs["limit"])]

        result = subprocess.run(cmd, capture_output=True, text=True)
        return json.loads(result.stdout) if result.returncode == 0 else None

    def scan(self, directories: list) -> bool:
        """Scan one or more photo directories."""
        cmd = ["python", self.skill_path, "scan", "--dir"] + directories
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0

    def annotate(self, photo: str, type: str, name: str) -> bool:
        """Add an annotation to a photo."""
        cmd = [
            "python", self.skill_path, "annotate",
            "--photo", photo,
            "--type", type,
            "--name", name
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0

📊 Output formats

JSON output example

{
  "results": [
    {
      "file_name": "beach_sunset_001.jpg",
      "file_path": "D:\\Photos\\2024\\beach_sunset_001.jpg",
      "scene_type": "landscape",
      "description": "Sunset by the sea, with an orange-red sky",
      "tags": ["beach", "sunset", "landscape", "ocean"],
      "confidence_score": 0.92
    }
  ],
  "total": 5,
  "search_type": "hybrid"
}
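
To work with this payload in typed form, an agent could map each result onto a small record. The sketch below assumes the field names shown in the JSON example. The `PhotoResult` class, the `parse_results` function, and the `min_confidence` filter are illustrative additions, not part of the skill.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PhotoResult:
    """One entry of the "results" array in the JSON output."""
    file_name: str
    file_path: str
    scene_type: str = ""
    description: str = ""
    tags: list[str] = field(default_factory=list)
    confidence_score: float = 0.0


def parse_results(stdout: str, min_confidence: float = 0.0) -> list[PhotoResult]:
    """Parse the skill's JSON stdout and keep results at or above a threshold."""
    payload = json.loads(stdout)
    photos = []
    for item in payload.get("results", []):
        photo = PhotoResult(
            file_name=item["file_name"],
            file_path=item["file_path"],
            scene_type=item.get("scene_type", ""),
            description=item.get("description", ""),
            tags=list(item.get("tags", [])),
            confidence_score=float(item.get("confidence_score", 0.0)),
        )
        if photo.confidence_score >= min_confidence:
            photos.append(photo)
    return photos
```

Filtering on `confidence_score` is a client-side choice here; whether the CLI offers an equivalent option is not stated in this document.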

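Text output example

🔍 Search query: 'beach sunset'
📊 Found 5 results

1. beach_sunset_001.jpg
   📁 D:\Photos\2024\beach_sunset_001.jpg
   🏷️ Scene: landscape
   📝 Description: Sunset by the sea, with an orange-red sky
   🔖 Tags: beach, sunset, landscape, ocean
   ⭐ Confidence: 0.92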

⚙️ Configuration

Main project configuration

The skill relies on the main project's configuration (the .env file), which mainly includes:

# Local LLM
LOCAL_LLM_ENDPOINT=http://localhost:1234/v1
LOCAL_LLM_MODEL=qwen3-vl-8b-q4_k_m

# Remote LLM (optional)
REMOTE_LLM_API_KEY=your-api-key
REMOTE_LLM_MODEL=nvidia/nemotron-nano-12b-v2-vl:free

# Photo directories
PHOTO_SCAN_DIRS=D:\Photos
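
Before calling the skill, an agent may want to confirm that the `.env` file contains the local-model settings. The sketch below reads simple `KEY=VALUE` lines and reports missing keys. It is an assumption that `LOCAL_LLM_ENDPOINT` and `LOCAL_LLM_MODEL` are the keys needed for local mode, since the document lists them without marking any as mandatory, and the helper names are hypothetical. The main project may load its configuration differently.

```python
from pathlib import Path


def load_env(path: str) -> dict[str, str]:
    """Read KEY=VALUE lines from a .env file, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_local_settings(env: dict[str, str]) -> list[str]:
    """Return the local-model keys that are absent or empty (assumed required)."""
    required = ["LOCAL_LLM_ENDPOINT", "LOCAL_LLM_MODEL"]
    return [key for key in required if not env.get(key)]
```

This parser handles only plain `KEY=VALUE` lines; quoted values and variable expansion are out of scope for the sketch.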

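Skill parameters

| Parameter | Description | Example |
| --- | --- | --- |
| query | Search keywords | "beach" |
| --dir | Directory to scan | "D:\Photos" |
| --tags | Tag filter | "landscape,people" |
| --scene | Scene type | "outdoor" |
| --date-from | Start date | "2024-01-01" |
| --date-to | End date | "2024-12-31" |
| --limit | Number of results | 20 |
| --format | Output format | json or text |
| --no-vector | Disable vector search | - |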

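🔧 Troubleshooting

Issue 1: skill.py cannot be found

Solution:

# Use an absolute path
python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py search "beach"

Issue 2: the project root cannot be found

Solution:

# Make sure the main project exists
dir G:\python\PhotoIndexWithLLM\config.py

Issue 3: connection to the local model fails

# Test the connection
python skill.py test

# Check whether LM Studio is running
netstat -ano | findstr :1234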
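
The `netstat` check above is Windows-specific. A cross-platform alternative, sketched here with the standard library only, is to attempt a TCP connection to the port. The function name `is_port_open` is hypothetical, and a successful connection shows only that something is listening on the port, not that it is LM Studio.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 1234, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"Port 1234 on localhost is {state}")
```

For a check of the model endpoint itself, `python skill.py test` remains the command this document provides.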

Issue 4: search returns no results

# 1. Confirm that photos have been indexed
python skill.py stats

# 2. Rescan
python skill.py scan --force --dir D:\Photos

# 3. Try different keywords
python skill.py search "seaside"

🌟 Advanced usage

Batch search

# Search for several keywords
python skill.py search "beach" --format json > beach.json
python skill.py search "mountains" --format json > mountain.json
python skill.py search "city" --format json > city.json
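
The same batch can be driven from Python, which is convenient when the keyword list is generated by an agent. This is a minimal sketch: `run_search` and `batch_search` are hypothetical names, and the search function is passed in as a parameter so the file-writing logic can be exercised without a configured main project.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def run_search(skill_path: str, keyword: str) -> str:
    """Run one `search --format json` call and return its stdout text."""
    result = subprocess.run(
        [sys.executable, skill_path, "search", keyword, "--format", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


def batch_search(
    keyword_to_file: dict[str, str],
    out_dir: str,
    search: Callable[[str], str],
) -> list[Path]:
    """Write the JSON output of each keyword search to its own file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for keyword, file_name in keyword_to_file.items():
        target = out / file_name
        target.write_text(search(keyword), encoding="utf-8")
        written.append(target)
    return written
```

A call could look like `batch_search({"beach": "beach.json", "city": "city.json"}, "out", lambda kw: run_search(skill_path, kw))`, where `skill_path` holds the path to `skill.py`.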

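Agent pipeline

# Search → filter → send (send_to_user is a placeholder for the agent's own delivery command)
python skill.py search "beach" --format json | jq '.results[] | .file_path' | send_to_user

Scheduled tasks

# Scan for new photos every day at 02:00
# Windows Task Scheduler (the task name "PhotoScan" is only an example)
schtasks /Create /TN "PhotoScan" /SC DAILY /ST 02:00 /TR "python G:\python\PhotoIndexWithLLM\skills\photo-search\skill.py scan --dir D:\Photos"

# cron equivalent on Linux/macOS, with the paths adjusted for that system:
# 0 2 * * * python /path/to/skills/photo-search/skill.py scan --dir /path/to/Photos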

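📝 Notes

  1. Independence: the skill is a standalone program, but it depends on the main project's configuration and functionality
  2. Paths: calling skill.py by absolute path is recommended
  3. First use: photos must be scanned and indexed before they can be searched
  4. Local model: requires LM Studio running on port 1234
  5. JSON output: always pass --format json when an agent parses the output

📞 Getting help

  • Show help: python skill.py --help
  • Read the documentation: SKILL.md (this file)
  • Read the main project documentation: README.md, USAGE.md

Skill version: v1.1.1
Standalone version: ✅
Compatible agents: hermes, openclaw, and any agent that supports a CLI
Last updated: 2026-05-17
Changes: added support for 17 image formats (HEIC/iPhone, RAW DSLR formats, and others)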

Usage Guidance
Install only if you are comfortable with a photo index being created. For private or family photos, use local-only mode, disable remote LLM upload, protect the `.env` and SQLite database files, scan only specific folders, and confirm how to delete or encrypt the generated index.
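
One way to follow the "scan only specific folders" advice is to check every requested directory against an allow-list before passing it to `scan --dir`. The sketch below assumes the agent keeps such a list. The function `is_allowed_scan_dir` is hypothetical and is not a control built into the skill.

```python
from pathlib import Path


def is_allowed_scan_dir(candidate: str, allowed_roots: list[str]) -> bool:
    """Return True if candidate is one of the allowed roots or lies inside one."""
    # resolve() normalizes ".." segments and symlinks before comparing.
    target = Path(candidate).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if target == root_path or root_path in target.parents:
            return True
    return False
```

An agent would refuse the scan when this returns False, rather than indexing a directory the user never approved.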
Capability Analysis
Type: OpenClaw Skill · Name: li-photoindexwithllm · Version: 1.1.2
The Photo Search Skill is a well-documented and legitimate tool designed for indexing and searching local photos using Vision-Language (VL) models. The core logic in `skill.py` handles directory scanning, metadata storage in a local SQLite database, and image analysis via LLM APIs (defaulting to a local LM Studio instance). The code includes robust privacy controls, such as a mandatory user confirmation prompt in `VLClient.analyze_image` before any data is sent to remote endpoints like OpenRouter. No evidence of malicious intent, data exfiltration, or prompt injection was found; the script's behavior is entirely consistent with its stated purpose of providing a searchable photo index.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The skill’s stated purpose—scan photo directories, analyze images with VL models, index them, and support natural-language search—is coherent, but it inherently handles private image content and metadata.
Instruction Scope
Agent-facing commands can scan and index user-specified directories, and the privacy documents describe remote VL analysis of full images; the artifacts give mixed signals about whether local-only mode and consent controls are actually enforced.
Install Mechanism
There is no install spec and no evidence of hidden automatic installation; the docs instruct users to install the normal Python dependency `requests`, but the source/homepage provenance is limited.
Credentials
Remote model support may use API keys from a `.env` file and may transmit complete photos to configured third-party endpoints, while registry metadata declares no credential requirements.
Persistence & Privilege
The artifacts describe a persistent SQLite photo index containing full paths, descriptions, and tags, and also state that the database is not encrypted.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install li-photoindexwithllm
  3. After installation, invoke the skill by name or use /li-photoindexwithllm
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.2
- Added multilingual documentation: README files in Arabic, German, English, Spanish, French, Japanese, Korean, Portuguese, Russian, and Chinese.
- No changes to skill logic or core functionality; all new files are documentation related.
- Improved accessibility for international users by providing guides in multiple languages.
v1.1.1
- Major documentation overhaul: SKILL.md now provides a detailed, step-by-step usage guide in Chinese, covering scanning, indexing, searching, annotation, and integration for intelligent agents.
- Expanded feature descriptions, command references, and troubleshooting sections.
- Clear CLI examples for both Windows and cross-platform scenarios.
- Added environment and project configuration instructions.
- Included integration samples for Python scripts and shell agents.
- Detailed output format documentation, including JSON examples.
Metadata
Slug li-photoindexwithllm
Version 1.1.2
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is Li PhotoIndexWithLLM?

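Li PhotoIndexWithLLM scans and indexes photos in multiple formats using a vision-language model, supports natural-language semantic search and structured JSON output, and is easy to call from agents. It is an AI Agent Skill for Claude Code / OpenClaw, with 96 downloads so far.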

How do I install Li PhotoIndexWithLLM?

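Run "/install li-photoindexwithllm" in the OpenClaw or Claude Code chat to install it in one step. The skill itself needs no extra configuration, but the main PhotoIndexWithLLM project must already be set up (Python 3.10+, installed dependencies, and a configured `.env` file).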

Is Li PhotoIndexWithLLM free?

Yes, Li PhotoIndexWithLLM is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Li PhotoIndexWithLLM support?

Li PhotoIndexWithLLM is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Li PhotoIndexWithLLM?

It is built and maintained by Terry S Fisher (@43622283); the current version is v1.1.2.
