Lora Pipeline

Name: Lora Pipeline
Author: iskwang

Author: iskWang · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 265

Install in OpenClaw

/install lora-pipeline

Description

Manages end-to-end LoRA training: collects and verifies photos, scrapes datasets, applies quality checks, captions, and trains the LoRA model locally.

Usage Instructions (SKILL.md)

LoRA Pipeline

Orchestrates the full LoRA dataset-to-model pipeline. Each phase is self-contained and can be delegated to a sub-agent independently.

Pipeline Overview

Phase 1: Collect reference photos  → collect 3–6 reference face photos
Phase 2: Confirm face identity     → user confirms refs; deepface cross-check
Phase 3: Collect datasets          → scrape web sources guided by face features
Phase 4: Verify photos             → face verify + dedup + quality filter + crop
Phase 5: Start captioning          → WD14 local tagging + trigger word
Phase 6: LoRA training             → RunPod Kohya training → retrieve outputs

Phase Index

| Phase | File | Delegable to Sub-Agent | Model | Est. Time |
| --- | --- | --- | --- | --- |
| 01 — Reference Collection | `phases/01-reference.md` | ✅ | Haiku (Worker) | 5–10 min |
| 02 — Scraping | `phases/02-scraping.md` | ✅ | Haiku (Worker) | 10–30 min |
| 03 — Verify & Clean | `phases/03-verify.md` | ✅ | Haiku (Worker) | 2–5 min |
| 04 — Caption | `phases/04-caption.md` | ✅ | Haiku (Worker) | 1–3 min |
| 05 — Training | `phases/05-training.md` | ✅ | Haiku (Worker) + Sentry | 15–30 min |

To load a specific phase: read `skills/lora-pipeline/phases/<phase-file>`. Each file is independently readable.

Directory Structure

~/.openclaw/workspace/
└── datasets/
    ├── face_references/
    │   └── <lora_name>/             # Phase 1–2: Gold standard refs (3–6 photos)
    │       ├── ref_01.jpg
    │       └── ...
    ├── <lora_name>_raw/             # Phase 3: Raw scraped images (pre-verification)
    │   └── ...
    └── <lora_name>/                 # Phase 4–5: Verified + captioned training set
        ├── image001.png
        ├── image001.txt
        └── ...
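
The layout above can be expressed as a small path helper. This is a minimal sketch, assuming the workspace root shown in the tree; `dataset_paths` and its dictionary keys are hypothetical names for illustration and are not part of the skill's scripts.

```python
from pathlib import Path

# Workspace root as shown in the tree above; the helper itself is hypothetical.
WORKSPACE = Path.home() / ".openclaw" / "workspace"


def dataset_paths(lora_name: str, workspace: Path = WORKSPACE) -> dict[str, Path]:
    """Return the three per-LoRA directories from the documented layout."""
    # Reject names that would escape datasets/ or collide with the shared
    # face_references/ directory, which sits next to the training sets.
    if (
        not lora_name
        or "/" in lora_name
        or lora_name in {".", "..", "face_references"}
    ):
        raise ValueError(f"invalid lora_name: {lora_name!r}")
    datasets = workspace / "datasets"
    return {
        "references": datasets / "face_references" / lora_name,  # Phase 1-2
        "raw": datasets / f"{lora_name}_raw",                    # Phase 3
        "training": datasets / lora_name,                        # Phase 4-5
    }
```

Deriving all three directories from one name keeps phases from disagreeing about where their inputs and outputs live.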

Privacy Rules (CRITICAL — All Phases)

NO DATA INSPECTION: Do NOT cat, read, or analyze image file contents or .txt caption files.
NO CLOUD UPLOAD: All face verification (DeepFace) must run locally. Never send images to cloud APIs.
NO DATA LEAKAGE: Do not describe dataset details (person names, attributes) to the LLM unnecessarily.
Treat datasets as opaque binary blobs except when running local scripts.

Quality Standards (SDXL)

Resolution: 1024×1024 minimum after crop
Format: Convert all to PNG before training
No black borders: Run autocrop before final save
Dataset diversity: ≥30% clothed/natural skin shots
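
Two of the standards above (minimum resolution and PNG format) can be checked from image metadata alone. The following is a rough sketch under that assumption; `ImageMeta` and `quality_issues` are hypothetical names, and the border and diversity rules are not covered here.

```python
from dataclasses import dataclass

MIN_SIDE = 1024  # SDXL minimum from the standards above


@dataclass(frozen=True)
class ImageMeta:
    """Metadata only; pixel data is never loaded or inspected here."""
    name: str
    width: int
    height: int
    fmt: str  # e.g. "PNG", "JPEG"


def quality_issues(meta: ImageMeta, min_side: int = MIN_SIDE) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    issues = []
    if meta.width < min_side or meta.height < min_side:
        issues.append(
            f"resolution {meta.width}x{meta.height} below {min_side}x{min_side}"
        )
    if meta.fmt.upper() != "PNG":
        issues.append(f"format {meta.fmt} is not PNG")
    return issues
```

How the width, height, and format are obtained is left to a local script, in line with the privacy rules.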

Scripts

| Script | Location | Purpose |
| --- | --- | --- |
| `tag_batch.py` | `skills/lora-pipeline/scripts/tag_batch.py` | Local WD14 ONNX tagger for a directory |
| `smart_crop.py` | `skills/lora-pipeline/scripts/smart_crop.py` | Interactive or automated single-subject cropping |
| `batch_lora_train.py` | `skills/lora-pipeline/scripts/batch_lora_train.py` | Kohya batch training runner for RunPod |

Sub-Agent Protocol

Each phase file contains:

Input Contract — what must already exist before this phase starts
Output Contract — what this phase produces
Completion Signal — how to report back (sessions_send + status file fallback)
Error Escalation — sub-agent reports to parent, never self-escalates model tier
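
The status-file fallback named in the Completion Signal could look roughly like the sketch below. The file name, JSON fields, and `write_status` function are assumptions for illustration; the phase files define the actual contract, and `sessions_send` is not modeled here.

```python
import json
import os
import tempfile
from pathlib import Path


def write_status(status_dir: Path, phase: str, ok: bool, detail: str = "") -> Path:
    """Write <phase>.status.json atomically so a parent never reads a partial file.

    Keep `detail` free of dataset specifics, per the privacy rules.
    """
    status_dir.mkdir(parents=True, exist_ok=True)
    payload = {"phase": phase, "status": "done" if ok else "error", "detail": detail}
    target = status_dir / f"{phase}.status.json"
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp = tempfile.mkstemp(dir=status_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    os.replace(tmp, target)
    return target
```

A parent agent can then poll for the file and read the `status` field when no session message arrives.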

Security Recommendations

This skill implements a full LoRA training pipeline, but it is sloppy: it does not declare the system tools and Python libraries it needs, contains hardcoded paths (e.g., /Users/mini/...), and assumes you have runpodctl, SSH keys, and local model files. Before installing or running:
1) Do not run it blindly: inspect and fix the absolute paths in tag_batch.py and other scripts.
2) Make sure you understand and consent to uploading datasets to remote RunPod pods, and that you control the SSH keys used.
3) Verify that the required Python packages and ONNX/WD14 models are installed in known locations, or change the scripts to use configurable paths.
4) Confirm you have permission to scrape and use the images (privacy and legal risk).
5) If you expect a small, local-only helper, this skill is overprivileged; if you intend cloud training, validate your runpodctl configuration and review the SCP/SSH commands carefully.
If you want, provide the missing dependency list and replace the hardcoded paths, and I'll re-evaluate.

Functional Analysis

Type: OpenClaw Skill
Name: lora-pipeline
Version: 1.0.0

The skill bundle provides a complex LoRA training pipeline involving web scraping, local face verification (DeepFace), and remote training on RunPod. It is classified as suspicious because it requires high-risk capabilities, including remote command execution via SSH, automated cloud instance management (runpodctl), and web scraping with JavaScript execution. While plausibly needed for the stated purpose, these represent a significant attack surface. Additionally, 'scripts/tag_batch.py' contains hardcoded absolute file paths ('/Users/mini/...') that would cause failures or unexpected behavior on other systems. Despite these risks, the bundle includes explicit privacy-preserving instructions for the agent, such as 'NO CLOUD UPLOAD' for face data and 'NO DATA INSPECTION' of dataset contents.

Capability Assessment

⚠ Purpose & Capability

The skill's description (end-to-end LoRA pipeline) matches the instructions and included scripts. However the registry metadata declares no required binaries or env vars while the SKILL.md explicitly depends on runpodctl, ssh/scp, unzip, Python + many Python packages (deepface, opencv, onnxruntime, pandas, PIL), and local ONNX/WD14 tagger models. That mismatch (no declared dependencies vs. heavy toolchain required) is incoherent and will cause failures or implicit network activity to fetch models/tools.
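
A preflight check is one way to surface the undeclared dependencies before any phase runs. The sketch below uses the binary and package names listed in the paragraph above; `cv2` and `PIL` are the usual import names for OpenCV and Pillow, and `missing_dependencies` is a hypothetical helper rather than something the skill ships.

```python
import importlib.util
import shutil

# Names taken from the assessment above.
REQUIRED_BINARIES = ["runpodctl", "ssh", "scp", "unzip"]
REQUIRED_MODULES = ["deepface", "cv2", "onnxruntime", "pandas", "PIL"]


def missing_dependencies(binaries=REQUIRED_BINARIES, modules=REQUIRED_MODULES):
    """Return (missing_binaries, missing_modules) without importing the packages."""
    # shutil.which returns None when the executable is not on PATH.
    missing_bins = [b for b in binaries if shutil.which(b) is None]
    # find_spec returns None for a top-level module that cannot be located.
    missing_mods = [m for m in modules if importlib.util.find_spec(m) is None]
    return missing_bins, missing_mods


if __name__ == "__main__":
    bins, mods = missing_dependencies()
    if bins or mods:
        raise SystemExit(f"missing binaries: {bins}; missing modules: {mods}")
    print("all declared dependencies found")
```

Failing early like this avoids the implicit downloads and mid-pipeline failures the assessment warns about. It does not verify versions or the presence of the WD14 model files.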

⚠ Instruction Scope

Runtime instructions include web scraping (browser JS snippets and instructions to bypass SNS login via mirrors), extensive filesystem operations, spawning sub-agents, scp/ssh upload to remote RunPod pods, and automated remote training. The SKILL.md's 'NO DATA INSPECTION/NO CLOUD UPLOAD' guidance is contradictory in places (e.g., it forbids sending images to cloud APIs for verification but instructs uploading datasets to remote pods for training). The agent is instructed to perform network transfers (scp/ssh) and spawn long-running sub-agents which are beyond simple local helper behavior — these are appropriate for training but require clear declared permissions and user consent.

ℹ Install Mechanism

There is no install spec (instruction-only), which lowers install risk. But included scripts assume many preinstalled binaries and libraries (accelerate path '/venv/bin/accelerate', runpodctl, system Python packages) and expect model files to exist locally. No mechanism is provided to install or verify those dependencies; this is an operational risk (failures or implicit downloads at runtime).

⚠ Credentials

The skill requests no declared environment variables or credentials, yet the workflow requires access to the user's SSH key, runpodctl configuration, and possibly local model directories (e.g., tag_batch.py hardcodes '/Users/mini/.openclaw/...'). Hardcoded absolute paths and implicit reliance on SSH keys / known_hosts files are disproportionate to a clean, portable skill design and risk accidental use of personal files or keys. The skill also requires RunPod credits / account access (implied) but doesn't declare or request credentials explicitly.
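
One possible remedy for the hardcoded paths is to resolve the model directory from an explicitly declared setting. This is a sketch only: the environment variable `LORA_PIPELINE_MODEL_DIR`, the default location, and `resolve_model_dir` are all hypothetical and do not appear in the skill.

```python
import os
from pathlib import Path

# Hypothetical variable name; the skill declares no environment variables.
ENV_VAR = "LORA_PIPELINE_MODEL_DIR"


def resolve_model_dir(env=None) -> Path:
    """Pick the model directory from the environment, else a per-user default."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw:
        return Path(raw).expanduser()
    # Hypothetical default under the current user's home, not a fixed /Users/... path.
    return Path.home() / ".openclaw" / "models"
```

Passing `env` explicitly makes the lookup easy to test, and a declared variable gives the registry metadata something concrete to list.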

✓ Persistence & Privilege

The skill is not force-installed (always:false) and follows the normal model-invocation defaults. It uses sub-agents and sessions_spawn as part of its design; this autonomous behavior is expected for long-running training tasks. Nothing in the package attempts to modify other skills or grant itself permanent system-wide privileges.

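How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install lora-pipeline
Once installed, invoke the Skill by name or trigger it with /lora-pipeline
Provide the required inputs according to the Skill's parameter description to get structured output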

Version History

v1.0.0

Initial release of the lora-pipeline skill: end-to-end LoRA model training pipeline.
- Automates the full process: photo collection, face verification, dataset scraping and cleaning, captioning, and LoRA training.
- Each phase is modular and can be delegated to a sub-agent independently.
- Includes strict privacy rules: no cloud uploads, all verifications run locally, never read or leak dataset contents.
- Provides scripts for captioning, smart cropping, and batch training.
- Ensures high-quality datasets: PNG format, 1024×1024 resolution minimum, no black borders, and enforced diversity.
- Detailed directory structure and phase documentation for transparency and reproducibility.

Metadata

Slug: lora-pipeline

Version: 1.0.0

License: MIT-0

Total installs: 1

Current installs: 1

Released versions: 1

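FAQ

What is Lora Pipeline?

Manages end-to-end LoRA training: collects and verifies photos, scrapes datasets, applies quality checks, captions, and trains the LoRA model locally. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 265 total downloads to date.

How do I install Lora Pipeline?

Run the command "/install lora-pipeline" in the OpenClaw or Claude Code chat box to install it in one step, with no additional configuration.

Is Lora Pipeline free?

Yes. Lora Pipeline is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Lora Pipeline support?

Lora Pipeline is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Lora Pipeline?

It is developed and maintained by iskWang (@iskwang). The current version is v1.0.0.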