expflow Pipeline HPO

Name: expflow Pipeline HPO
Author: diamond2nv

by diamond2nv · GitHub ↗ · v0.5.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install expflow-pipeline-hpo

Description

PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam...

README (SKILL.md)

# expflow PDEBench Pipeline & HPO

Orchestrate experiment workflows for the AI4S PDE competition using expflow.
Three modes for three competition phases.

## Triggers

- User says "run HPO", "submit pipeline", "distributed experiment"
- User says "competition sprint" or "fast iterate"
- User asks about automating the train→eval→submit loop
- User mentions needing to find best hyperparams

## Installation

```bash
pip install "expflow-pde[pipeline]"
```

## Available Pipeline Modes

Three pipeline modes, each mapped to a CLI command:

### Mode A: Full (HPO → Train → Eval)

For the **exploration phase** of a competition task. Optuna searches for the best params
via distributed clearml-agent trials, then the pipeline trains with the best params and evaluates.

```bash
expflow pipeline submit-full train_task1.py \
    --queue default \
    --trials 50 --parallel 4 \
    --eval-script eval_task1.py \
    --metric seg_total --direction maximize
```

Flags:
- `--trials N`: total number of HPO trials
- `--parallel M`: max concurrent trials (use the GPU node count)
- `--metric`: name of the objective metric, read from `METRIC:<name>=<value>` lines in the script's stdout
- `--pruner hyperband|median|percentile`: early-stop poor trials
- `--study-name`: Optuna study name (auto-generated if omitted; persists to SQLite)
- `--skip hpo --skip eval`: run only the train step within the full pipeline

### Mode B: Fast (Train → Eval)

For the **competition sprint** phase. You already know the best params, so skip HPO
and run directly with fixed args.

```bash
expflow pipeline submit train_task1.py \
    --queue default \
    --train-param lr=0.001 --train-param epochs=80 \
    --eval-script eval_task1.py \
    --eval-param sub_step=5
```

Flags:
- `--skip eval`: train only (just submit the checkpoint)
- `--train-param key=val`: injected as `--key=val` into the training script's arguments
- `--eval-param key=val`: injected as `--key=val` into the eval script's arguments
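
The `key=val` to `--key=val` injection can be pictured with a short sketch. This is not expflow's actual implementation: `build_script_argv` is a hypothetical helper that only illustrates how repeated `--train-param` values could end up on the training script's command line.

```python
import sys
from typing import List


def build_script_argv(script: str, params: List[str]) -> List[str]:
    """Turn ["lr=0.001", "epochs=80"] into an argv list for `script`.

    Hypothetical helper for illustration; not part of expflow's API.
    """
    argv = [sys.executable, script]
    for item in params:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=val, got {item!r}")
        # Each pair becomes a single --key=val token for the script.
        argv.append(f"--{key}={value}")
    return argv


# Skip the interpreter path, which differs per machine.
print(build_script_argv("train_task1.py", ["lr=0.001", "epochs=80"])[1:])
# ['train_task1.py', '--lr=0.001', '--epochs=80']
```

Because each pair arrives as one `--key=val` token, the receiving script can parse it with a standard argument parser.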

### Mode C: Flexible Skip

Override step inclusion on either mode:

```bash
expflow pipeline submit-full train_task1.py \
    --skip hpo --skip eval          # = train only
expflow pipeline submit-full train_task1.py \
    --skip train --skip eval         # = HPO only
```

## HPO: Three Execution Modes

HPO (`expflow optuna run`) has three backends:

| Mode | Flag | Description | Best for |
|------|------|-------------|----------|
| Local | (default) | Serial subprocesses on CPU | ≤20 trials, quick tests |
| Distributed | `--distributed` | ask/tell + ClearML Task clone | Multi-GPU, custom control |
| Optimizer | `--optimizer -O` | ClearML `HyperParameterOptimizer` | Production, 50-200+ trials |

### Key flags across all HPO modes:
- `--pruner hyperband|median|percentile|none`: early-stops poor trials; the ASHA pruner saves ~40% GPU time
- `--metric <name>`: reads `METRIC:<name>=<value>` from script stdout
- `--direction maximize|minimize`: whether the metric is maximized or minimized
- `--timeout <min>`: safety cutoff, in minutes
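
To make the `METRIC:<name>=<value>` convention concrete, the sketch below scans captured stdout for the named metric and returns the last value seen. It is an assumption about how such a parser might look, not expflow's code. Both `parse_metric` and the choice to keep the last occurrence are hypothetical.

```python
import re
from typing import Optional


def parse_metric(stdout: str, name: str) -> Optional[float]:
    """Return the last METRIC:<name>=<value> found in stdout, or None."""
    pattern = re.compile(
        rf"^METRIC:{re.escape(name)}=(\S+)\s*$",
        re.MULTILINE,
    )
    matches = pattern.findall(stdout)
    if not matches:
        return None
    # Assumption: later lines supersede earlier ones (e.g. one per epoch).
    return float(matches[-1])


log = "epoch 1 done\nMETRIC:seg_total=0.71\nepoch 2 done\nMETRIC:seg_total=0.78\n"
print(parse_metric(log, "seg_total"))  # 0.78
```

Anchoring the pattern to the start of a line keeps ordinary log output that happens to mention the metric name from being picked up as the objective.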

## Script Requirements

The training/eval script must:
1. Accept hyperparams as `--key=value` CLI arguments
2. Print `METRIC:<name>=<value>` to stdout for objective capture (local mode)
3. Report ClearML scalars through the task's logger for distributed/optimizer mode:
   ```python
   from clearml import Task
   Task.current_task().get_logger().report_scalar("Score", "seg_total", value, iteration=epoch)
   ```
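
The first two requirements can be combined into a minimal script skeleton. The hyperparameter names (`lr`, `epochs`) and the `seg_total` metric are taken from the examples in this README. The score computation is a placeholder, not real PDEBench code. The ClearML call is left out so the sketch runs without a ClearML server.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argparse accepts both `--lr=0.001` and `--lr 0.001`.
    parser = argparse.ArgumentParser(description="toy training script")
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--epochs", type=int, default=80)
    return parser.parse_args(argv)


def format_metric(name: str, value: float) -> str:
    # One line per metric, in the METRIC:<name>=<value> form read from stdout.
    return f"METRIC:{name}={value}"


def main(argv: Optional[List[str]] = None) -> None:
    args = parse_args(argv)
    score = 0.0
    for epoch in range(args.epochs):
        # Placeholder for a real train + validate step.
        score = 1.0 - 1.0 / (epoch + 2)
    print(format_metric("seg_total", score))


if __name__ == "__main__":
    main()
```

Saved as a standalone file, the script could be invoked as `python train_task1.py --lr=0.001 --epochs=80`, matching the `--key=val` form the pipeline injects.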

## Pitfalls

- **Pruner needs `trial.report()` calls during training.** If the script only reports at the end, the pruner has nothing to prune on. Call `trial.report(val_loss, epoch)` at least every 10 epochs.
- **HyperParameterOptimizer needs the metric name in `Title/Series` format.** A bare name such as `seg_total` is interpreted as `title=seg_total, series=seg_total`. If your script reports with `report_scalar("Score", "seg_total", v)`, pass `--metric Score/seg_total`.
- **clearml-agent must be running on the GPU nodes** before submitting. Verify with `expflow clearml workers` or check the Web UI.
- **`_collect_one_trial` polls every 5s** and waits up to 60 min per trial. If trials are expected to run longer, increase `timeout_minutes`.
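
The `Title/Series` rule from the second pitfall can be expressed as a small helper. This is a sketch of the naming convention as described here, not the code expflow or ClearML actually runs. `split_metric` is a hypothetical name.

```python
from typing import Tuple


def split_metric(metric: str) -> Tuple[str, str]:
    """Map a --metric value to the (title, series) pair it refers to."""
    title, sep, series = metric.partition("/")
    if not sep:
        # Bare name: the same string is used for both title and series.
        return metric, metric
    return title, series


print(split_metric("Score/seg_total"))  # ('Score', 'seg_total')
print(split_metric("seg_total"))        # ('seg_total', 'seg_total')
```

A script that reports under the title `Score` therefore needs `--metric Score/seg_total`. Passing plain `seg_total` would look for a scalar titled `seg_total`, which that script never reports.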

## Architecture Reference

Key files in `expflow_pde/`:
- `hpo.py`: 3-mode HPO runner (local/distributed/optimizer)
- `pipeline.py`: ExperimentPipeline class (fast/full modes)
- `cli_pipeline.py`: `pipeline submit` + `pipeline submit-full`
- `cli_optuna.py`: `optuna run` with all three backends

## Related

- `experiment-lifecycle-governance`: PIN, metrics registry, compare-scores, competition rules audit
- `pde-experiment-hyperparameters`: PDEBench-specific hyperparameter reference
- `multi-agent-distributed-experiment-workflow`: Hermes → OpenCode → clearml

Usage Guidance

Treat this review as incomplete because local artifact reads failed; review metadata.json and the artifact directory before installing.

Capability Assessment

✓ Purpose & Capability

No reviewed artifact evidence showed a purpose-capability mismatch.

✓ Instruction Scope

No reviewed artifact evidence showed hidden or overbroad runtime instructions.

✓ Install Mechanism

No reviewed artifact evidence showed a risky install mechanism.

✓ Credentials

No reviewed artifact evidence showed disproportionate environment access.

✓ Persistence & Privilege

No reviewed artifact evidence showed persistence or privilege abuse.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install expflow-pipeline-hpo
After installation, invoke the skill by name or use /expflow-pipeline-hpo
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.5.0

expflow-pipeline-hpo 0.5.0

- Adds robust PDEBench competition pipeline orchestration via expflow with three selectable modes: Full (HPO→Train→Eval), Fast (Train→Eval), and Flexible Skip.
- Integrates distributed hyperparameter optimization (HPO) with pruner support and native ClearML HyperParameterOptimizer.
- CLI supports custom step skipping, dynamic parameter injection, and three HPO execution backends (local, distributed, optimizer).
- Enhances script compatibility requirements, pruner integration details, and error-proofing for distributed workflows.
- Documentation updated with detailed usage, CLI flags, and troubleshooting tips for efficient competition submissions.

Metadata

Slug expflow-pipeline-hpo

Version 0.5.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is expflow Pipeline HPO?

PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam... It is an AI Agent Skill for Claude Code / OpenClaw, with 49 downloads so far.

How do I install expflow Pipeline HPO?

Run "/install expflow-pipeline-hpo" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is expflow Pipeline HPO free?

Yes, expflow Pipeline HPO is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does expflow Pipeline HPO support?

expflow Pipeline HPO is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created expflow Pipeline HPO?

It is built and maintained by diamond2nv (@diamond2nv); the current version is v0.5.0.
