
expflow Pipeline HPO

Author: diamond2nv · GitHub · v0.5.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 49
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install expflow-pipeline-hpo
Description
PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam...
Usage (SKILL.md)

# expflow PDEBench Pipeline & HPO

Orchestrate experiment workflows for the AI4S PDE competition using expflow.
Three modes for three competition phases.

## Triggers

- User says "run HPO", "submit pipeline", "distributed experiment"
- User says "competition sprint" or "fast iterate"
- User asks about automating the train→eval→submit loop
- User mentions needing to find best hyperparams

## Installation

```bash
pip install "expflow-pde[pipeline]"
```

## Available Pipeline Modes

Three pipeline modes, each mapped to a CLI command:

### Mode A: Full (HPO → Train → Eval)

For the **exploration phase** of a competition task. Optuna searches for the best
parameters via distributed clearml-agent trials; the pipeline then trains with the
best parameters and evaluates the result.

```bash
expflow pipeline submit-full train_task1.py \
    --queue default \
    --trials 50 --parallel 4 \
    --eval-script eval_task1.py \
    --metric seg_total --direction maximize
```

Flags used:
- `--trials N`: total number of HPO trials
- `--parallel M`: maximum number of concurrent trials (set to the number of GPU nodes)
- `--metric`: name of the objective metric, which the script prints to stdout as `METRIC:<name>=<value>`
- `--pruner hyperband|median|percentile`: stop poorly performing trials early
- `--study-name`: Optuna study name (auto-generated if omitted; the study persists to SQLite)
- `--skip hpo --skip eval`: run only the training step within the full pipeline skeleton

### Mode B: Fast (Train → Eval)

For the **competition sprint** phase, when the best parameters are already known.
Skip HPO and run directly with fixed arguments.

```bash
expflow pipeline submit train_task1.py \
    --queue default \
    --train-param lr=0.001 --train-param epochs=80 \
    --eval-script eval_task1.py \
    --eval-param sub_step=5
```

Flags:
- `--skip eval`: train only (just submit the checkpoint)
- `--train-param key=val`: passed to the training script as `--key=val`
- `--eval-param key=val`: passed to the eval script as `--key=val`

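The `key=val` to `--key=val` mapping described for `--train-param` and `--eval-param` can be pictured with a small sketch. The helper name `params_to_argv` is hypothetical and is not taken from expflow; the sketch only illustrates the mapping, assuming each pair is split on its first `=`.

```python
def params_to_argv(pairs):
    """Turn ["lr=0.001", "epochs=80"] into ["--lr=0.001", "--epochs=80"]."""
    argv = []
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=val, got {pair!r}")
        argv.append(f"--{key}={value}")
    return argv


train_argv = params_to_argv(["lr=0.001", "epochs=80"])
print(train_argv)  # ['--lr=0.001', '--epochs=80']
```

Under this assumption, the Mode B command above would invoke the training script with `--lr=0.001 --epochs=80` and the eval script with `--sub_step=5`.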
### Mode C: Flexible Skip

Override which steps run in either mode:

```bash
expflow pipeline submit-full train_task1.py \
    --skip hpo --skip eval          # = train only
expflow pipeline submit-full train_task1.py \
    --skip train --skip eval         # = HPO only
```

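One way to reason about `--skip` is as a filter over the ordered step list of each mode. The sketch below is an assumption about the behavior described here, not expflow's implementation; `PIPELINE_STEPS` and `steps_to_run` are hypothetical names.

```python
# Ordered steps per mode, as described in this document (assumed, not read from expflow).
PIPELINE_STEPS = {
    "full": ["hpo", "train", "eval"],  # Mode A
    "fast": ["train", "eval"],         # Mode B
}


def steps_to_run(mode, skip=()):
    """Return the steps of `mode` that remain after dropping the skipped ones."""
    skipped = set(skip)
    return [step for step in PIPELINE_STEPS[mode] if step not in skipped]


print(steps_to_run("full", ["hpo", "eval"]))    # ['train']
print(steps_to_run("full", ["train", "eval"]))  # ['hpo']
```

Keeping the steps in an ordered list means the remaining steps always run in pipeline order, regardless of the order in which `--skip` flags are given.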
## HPO: Three Execution Modes

HPO (`expflow optuna run`) has three backends:

| Mode | Flag | Description | Best for |
|------|------|-------------|----------|
| Local | (default) | Serial subprocesses on CPU | ≤20 trials, quick tests |
| Distributed | `--distributed` | ask/tell + ClearML Task clone | Multi-GPU, custom control |
| Optimizer | `--optimizer -O` | ClearML `HyperParameterOptimizer` | Production, 50-200+ trials |

### Key flags across all HPO modes
- `--pruner hyperband|median|percentile|none`: early-stopping pruner (ASHA-style pruning saves roughly 40% of GPU time)
- `--metric <name>`: reads `METRIC:<name>=<value>` from script stdout
- `--direction maximize|minimize`: whether the objective is maximized or minimized
- `--timeout <min>`: safety cutoff, in minutes

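To make the `METRIC:<name>=<value>` convention concrete, here is a minimal sketch of how such a line could be extracted from captured stdout. `read_metric` is a hypothetical helper, and taking the last matching line as the objective is an assumption; expflow's actual parsing rules are not shown in this document.

```python
import re

# One metric per line, e.g. "METRIC:seg_total=0.87"
METRIC_LINE = re.compile(r"^METRIC:(?P<name>[^=\s]+)=(?P<value>\S+)$")


def read_metric(stdout, name):
    """Return the last value reported for `name`, or None if it never appears."""
    found = None
    for line in stdout.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match and match.group("name") == name:
            try:
                found = float(match.group("value"))
            except ValueError:
                continue  # ignore lines whose value is not numeric
    return found


log = "epoch 1 done\nMETRIC:seg_total=0.81\nepoch 2 done\nMETRIC:seg_total=0.87\n"
print(read_metric(log, "seg_total"))  # 0.87
```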
## Script Requirements

The training/eval script must:
1. Accept hyperparameters as `--key=value` CLI arguments
2. Print `METRIC:<name>=<value>` to stdout for objective capture (local mode)
3. Report ClearML scalars for distributed/optimizer mode:
   ```python
   from clearml import Task
   Task.current_task().get_logger().report_scalar("Score", "seg_total", value, iteration=epoch)
   ```

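The first two requirements can be sketched as a toy script. Everything here is illustrative: the `--lr` and `--epochs` arguments echo the Mode B example, `train` is a placeholder that returns a made-up score, and the ClearML reporting needed for distributed/optimizer mode is left out so the sketch runs without extra dependencies.

```python
import argparse


def parse_args(argv=None):
    """Accept hyperparameters in --key=value form."""
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--epochs", type=int, default=80)
    return parser.parse_args(argv)


def train(lr, epochs):
    """Placeholder for a real training loop; returns a fake score in (0, 1]."""
    return 1.0 / (1.0 + lr * epochs)


def metric_line(name, value):
    """Format the line that the HPO runner reads from stdout."""
    return f"METRIC:{name}={value:.6f}"


def main(argv=None):
    args = parse_args(argv)
    score = train(args.lr, args.epochs)
    print(metric_line("seg_total", score))
    return score


# A real script would call main() so that arguments come from sys.argv.
main(["--lr=0.001", "--epochs=80"])
```

Printing the metric once at the end is enough for objective capture in local mode; pruning additionally needs intermediate reports during training.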
## Pitfalls

- **The pruner needs `trial.report()` calls during training.** If the script only reports at the end, the pruner has no intermediate values to prune on. Call `trial.report(val_loss, epoch)` at least every 10 epochs.
- **`HyperParameterOptimizer` needs the metric name in `Title/Series` format.** A bare metric name such as `seg_total` is interpreted as `title=seg_total, series=seg_total`. If the script reports `report_scalar("Score", "seg_total", v)`, pass `--metric Score/seg_total`.
- **clearml-agent must be running on the GPU nodes** before you submit. Verify with `expflow clearml workers` or check the Web UI.
- **`_collect_one_trial` polls every 5 s** and waits up to 60 min per trial. If trials are expected to run longer, increase `timeout_minutes`.

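The `Title/Series` rule can be illustrated with a few lines of Python. `split_metric` is a hypothetical helper written to mirror the behavior described in the pitfall above; it is not expflow's or ClearML's code.

```python
def split_metric(metric):
    """Map a --metric value to a (title, series) pair."""
    title, sep, series = metric.partition("/")
    if not sep:
        # A bare name is used for both title and series.
        return metric, metric
    return title, series


print(split_metric("Score/seg_total"))  # ('Score', 'seg_total')
print(split_metric("seg_total"))        # ('seg_total', 'seg_total')
```

The second call shows why a bare `seg_total` fails to match a scalar reported under the title `Score`: the optimizer would look for title `seg_total` instead.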
## Architecture Reference

Key files in `expflow_pde/`:
- `hpo.py`: 3-mode HPO runner (local/distributed/optimizer)
- `pipeline.py`: ExperimentPipeline class (fast/full modes)
- `cli_pipeline.py`: `pipeline submit` + `pipeline submit-full`
- `cli_optuna.py`: `optuna run` with all three backends

## Related

- `experiment-lifecycle-governance`: PIN, metrics registry, compare-scores, competition rules audit
- `pde-experiment-hyperparameters`: PDEBench-specific hyperparameter reference
- `multi-agent-distributed-experiment-workflow`: Hermes → OpenCode → clearml
Security advice
Treat this review as incomplete because local artifact reads failed; review metadata.json and the artifact directory before installing.
Capability assessment
Purpose & Capability
No reviewed artifact evidence showed a purpose-capability mismatch.
Instruction Scope
No reviewed artifact evidence showed hidden or overbroad runtime instructions.
Install Mechanism
No reviewed artifact evidence showed a risky install mechanism.
Credentials
No reviewed artifact evidence showed disproportionate environment access.
Persistence & Privilege
No reviewed artifact evidence showed persistence or privilege abuse.
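How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install expflow-pipeline-hpo
  3. Once installed, invoke the Skill by name or trigger it with /expflow-pipeline-hpo
  4. Provide the inputs required by the Skill's parameter description to get structured output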
Version history
v0.5.0
expflow-pipeline-hpo 0.5.0
- Adds robust PDEBench competition pipeline orchestration via expflow with three selectable modes: Full (HPO→Train→Eval), Fast (Train→Eval), and Flexible Skip.
- Integrates distributed hyperparameter optimization (HPO) with pruner support and native ClearML HyperParameterOptimizer.
- CLI supports custom step skipping, dynamic parameter injection, and three HPO execution backends (local, distributed, optimizer).
- Enhances script compatibility requirements, pruner integration details, and error-proofing for distributed workflows.
- Documentation updated with detailed usage, CLI flags, and troubleshooting tips for efficient competition submissions.
Metadata
Slug expflow-pipeline-hpo
Version 0.5.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
FAQ

What is expflow Pipeline HPO?

PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 49 total downloads so far.

How do I install expflow Pipeline HPO?

Run the command "/install expflow-pipeline-hpo" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is expflow Pipeline HPO free?

Yes. expflow Pipeline HPO is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does expflow Pipeline HPO support?

expflow Pipeline HPO is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed expflow Pipeline HPO?

It is developed and maintained by diamond2nv (@diamond2nv); the current version is v0.5.0.
