diamond2nv

expflow Pipeline HPO

by diamond2nv · GitHub ↗ · v0.5.0 · MIT-0
cross-platform ✓ Security Clean
49
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install expflow-pipeline-hpo
Description
PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam...
README (SKILL.md)


# expflow PDEBench Pipeline & HPO

Orchestrate experiment workflows for the AI4S PDE competition using expflow.
Three modes for three competition phases.

## Triggers

- User says "run HPO", "submit pipeline", "distributed experiment"
- User says "competition sprint" or "fast iterate"
- User asks about automating the train→eval→submit loop
- User mentions needing to find best hyperparams

## Installation

```bash
pip install "expflow-pde[pipeline]"
```

## Available Pipeline Modes

Three pipeline modes, each mapped to a CLI command:

### Mode A: Full (HPO → Train → Eval)

For the **exploration phase** of a competition task. Optuna searches for the best
parameters via distributed clearml-agent trials; the pipeline then trains with the
best parameters and evaluates.

```bash
expflow pipeline submit-full train_task1.py \
    --queue default \
    --trials 50 --parallel 4 \
    --eval-script eval_task1.py \
    --metric seg_total --direction maximize
```

Flags used:
- `--trials N`: total number of HPO trials
- `--parallel M`: maximum number of concurrent trials (use the GPU node count)
- `--metric`: name of the objective metric, which the script prints to stdout as `METRIC:<name>=<value>`
- `--pruner hyperband|median|percentile`: early-stop poor trials
- `--study-name`: Optuna study name (generated automatically if omitted; persists to SQLite)
- `--skip hpo --skip eval`: run only the train step within the full pipeline skeleton

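
To make the idea behind `--pruner median` concrete, here is a minimal sketch of a median-style stopping rule in plain Python. It is a simplified illustration, not Optuna's implementation and not expflow code; the function name `should_prune` and the sample history are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def should_prune(current: float, step: int,
                 finished: Sequence[Sequence[float]],
                 maximize: bool = True) -> bool:
    """Simplified median rule (illustrative only).

    Prune when the current intermediate value is worse than the median
    of the values earlier trials reported at the same step.
    """
    peers: List[float] = [t[step] for t in finished if len(t) > step]
    if not peers:
        return False  # nothing to compare against yet
    mid = median(peers)
    return current < mid if maximize else current > mid


# Intermediate scores reported by three earlier (hypothetical) trials.
history = [[0.2, 0.5, 0.7], [0.3, 0.6, 0.8], [0.1, 0.4, 0.6]]

print(should_prune(0.35, 1, history))  # below the step-1 median of 0.5
print(should_prune(0.65, 1, history))  # above the step-1 median of 0.5
```

The rule only has something to compare once trials report intermediate values, which is why a script that reports a single final score gives a pruner nothing to work with.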
### Mode B: Fast (Train → Eval)

For the **competition sprint** phase, when the best parameters are already known.
Skip HPO and run directly with fixed arguments.

```bash
expflow pipeline submit train_task1.py \
    --queue default \
    --train-param lr=0.001 --train-param epochs=80 \
    --eval-script eval_task1.py \
    --eval-param sub_step=5
```

Flags:
- `--skip eval`: train-only (just submit checkpoint)
- `--train-param key=val`: passed to the training script as `--key=val`
- `--eval-param key=val`: passed to the eval script as `--key=val`

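
The `key=val` to `--key=val` mapping described above can be sketched as follows. This is a hypothetical helper (`to_script_args` is not an expflow function) that only mirrors the documented behaviour of `--train-param` and `--eval-param`.

```python
from typing import Iterable, List


def to_script_args(params: Iterable[str]) -> List[str]:
    """Turn 'key=val' strings into '--key=val' script arguments.

    Hypothetical helper: mirrors the documented flag behaviour,
    not expflow's actual code.
    """
    args = []
    for item in params:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=val, got {item!r}")
        args.append(f"--{key}={value}")
    return args


print(to_script_args(["lr=0.001", "epochs=80"]))
# ['--lr=0.001', '--epochs=80']
```

Under this reading, the Mode B command above would invoke `train_task1.py` with `--lr=0.001 --epochs=80`.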
### Mode C: Flexible Skip

Use `--skip` to drop individual steps from either mode:

```bash
expflow pipeline submit-full train_task1.py \
    --skip hpo --skip eval      # train only
expflow pipeline submit-full train_task1.py \
    --skip train --skip eval    # HPO only
```

## HPO: Three Execution Modes

HPO (`expflow optuna run`) has three backends:

| Mode | Flag | Description | Best for |
|------|------|-------------|----------|
| Local | (default) | Serial subprocesses on CPU | ≤20 trials, quick tests |
| Distributed | `--distributed` | ask/tell + ClearML Task clone | Multi-GPU, custom control |
| Optimizer | `--optimizer -O` | ClearML `HyperParameterOptimizer` | Production, 50-200+ trials |

### Key flags across all HPO modes
- `--pruner hyperband|median|percentile|none`: early-stops poor trials (ASHA-style pruning saves ~40% GPU time)
- `--metric <name>`: reads `METRIC:<name>=<value>` from script stdout
- `--direction maximize|minimize`
- `--timeout <min>`: safety cutoff in minutes

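
As an illustration of the `METRIC:<name>=<value>` convention, the sketch below extracts a metric from captured stdout. The helper `parse_metric` is hypothetical, and taking the last reported value is an assumption; expflow's own parsing may differ.

```python
import re
from typing import Optional

# Matches lines such as "METRIC:seg_total=0.873".
_METRIC_RE = re.compile(r"^METRIC:(?P<name>[^=\s]+)=(?P<value>\S+)$")


def parse_metric(stdout: str, name: str) -> Optional[float]:
    """Return the last value reported for `name`, or None if absent.

    Assumption: when a metric is printed several times, the last wins.
    """
    found = None
    for line in stdout.splitlines():
        match = _METRIC_RE.match(line.strip())
        if match and match.group("name") == name:
            try:
                found = float(match.group("value"))
            except ValueError:
                continue  # skip values that are not numeric
    return found


log = "epoch 1 done\nMETRIC:seg_total=0.5\nepoch 2 done\nMETRIC:seg_total=0.75\n"
print(parse_metric(log, "seg_total"))  # 0.75
```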
## Script Requirements

The training/eval script must:
1. Accept hyperparams as `--key=value` CLI arguments
2. Output `METRIC:<name>=<value>` to stdout for objective capture (local mode)
3. Report ClearML scalars for distributed/optimizer mode:
   ```python
   from clearml import Task
   Task.current_task().get_logger().report_scalar("Score", "seg_total", value, iteration=epoch)
   ```
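
A minimal script skeleton that satisfies requirements 1 and 2 might look like the sketch below. The flags `--lr` and `--epochs` and the metric name `seg_total` are borrowed from the examples above; `train_one_epoch` is a placeholder that returns a made-up score, not a real training step.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argparse accepts both "--lr 0.001" and "--lr=0.001".
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--epochs", type=int, default=3)
    return parser.parse_args(argv)


def train_one_epoch(epoch: int, lr: float) -> float:
    """Placeholder for real training; returns a made-up score."""
    return 1.0 - 1.0 / (1.0 + lr * 1000 * (epoch + 1))


def main(argv: Optional[List[str]] = None) -> float:
    args = parse_args(argv)
    score = 0.0
    for epoch in range(args.epochs):
        score = train_one_epoch(epoch, args.lr)
        # For distributed/optimizer mode, a ClearML scalar would be
        # reported here as well (requirement 3).
    # Final objective in the METRIC:<name>=<value> format.
    print(f"METRIC:seg_total={score:.6f}")
    return score


if __name__ == "__main__":
    main()
```

Saved as a hypothetical `train_task1.py`, this could be run as `python train_task1.py --lr=0.001 --epochs=80`.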

## Pitfalls

- **The pruner needs `trial.report()` calls during training.** If the script only reports at the end, the pruner has no intermediate values to act on. Call `trial.report(val_loss, epoch)` at least every 10 epochs.
- **HyperParameterOptimizer needs the metric name in `Title/Series` format.** A bare metric name such as `seg_total` is interpreted as `title=seg_total, series=seg_total`. If your script calls `report_scalar("Score", "seg_total", v)`, pass `--metric Score/seg_total`.
- **clearml-agent must be running on the GPU nodes** before submitting. Verify with `expflow clearml workers` or check the Web UI.
- **`_collect_one_trial` polls every 5s** and waits up to 60min per trial. If trials are expected to run longer, increase `timeout_minutes`.

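
The `Title/Series` rule from the second pitfall can be written out as a small sketch. The helper name `split_metric` is hypothetical; it only restates the behaviour described above and is not taken from expflow's source.

```python
from typing import Tuple


def split_metric(metric: str) -> Tuple[str, str]:
    """Split 'Title/Series' into (title, series).

    A bare name is used for both parts, matching the behaviour
    described in the pitfall above.
    """
    title, sep, series = metric.partition("/")
    if not sep:
        return metric, metric
    return title, series


print(split_metric("Score/seg_total"))  # ('Score', 'seg_total')
print(split_metric("seg_total"))        # ('seg_total', 'seg_total')
```

The second call shows why a bare `--metric seg_total` would not match a scalar reported under the title `Score`.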
## Architecture Reference

Key files in `expflow_pde/`:
- `hpo.py`: 3-mode HPO runner (local/distributed/optimizer)
- `pipeline.py`: ExperimentPipeline class (fast/full modes)
- `cli_pipeline.py`: `pipeline submit` + `pipeline submit-full`
- `cli_optuna.py`: `optuna run` with all three backends

## Related

- `experiment-lifecycle-governance`: PIN, metrics registry, compare-scores, competition rules audit
- `pde-experiment-hyperparameters`: PDEBench-specific hyperparameter reference
- `multi-agent-distributed-experiment-workflow`: Hermes → OpenCode → clearml
Usage Guidance
Treat this review as incomplete because local artifact reads failed; review metadata.json and the artifact directory before installing.
Capability Assessment
Purpose & Capability
No reviewed artifact evidence showed a purpose-capability mismatch.
Instruction Scope
No reviewed artifact evidence showed hidden or overbroad runtime instructions.
Install Mechanism
No reviewed artifact evidence showed a risky install mechanism.
Credentials
No reviewed artifact evidence showed disproportionate environment access.
Persistence & Privilege
No reviewed artifact evidence showed persistence or privilege abuse.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install expflow-pipeline-hpo
  3. After installation, invoke the skill by name or use /expflow-pipeline-hpo
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.5.0
expflow-pipeline-hpo 0.5.0
- Adds robust PDEBench competition pipeline orchestration via expflow with three selectable modes: Full (HPO→Train→Eval), Fast (Train→Eval), and Flexible Skip.
- Integrates distributed hyperparameter optimization (HPO) with pruner support and native ClearML HyperParameterOptimizer.
- CLI supports custom step skipping, dynamic parameter injection, and three HPO execution backends (local, distributed, optimizer).
- Enhances script compatibility requirements, pruner integration details, and error-proofing for distributed workflows.
- Documentation updated with detailed usage, CLI flags, and troubleshooting tips for efficient competition submissions.
Metadata
Slug expflow-pipeline-hpo
Version 0.5.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is expflow Pipeline HPO?

PDEBench competition workflow orchestration with expflow — three pipeline modes (full/fast/skip), distributed HPO, pruner integration, and ClearML HyperParam... It is an AI Agent Skill for Claude Code / OpenClaw, with 49 downloads so far.

How do I install expflow Pipeline HPO?

Run "/install expflow-pipeline-hpo" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is expflow Pipeline HPO free?

Yes, expflow Pipeline HPO is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does expflow Pipeline HPO support?

expflow Pipeline HPO is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created expflow Pipeline HPO?

It is built and maintained by diamond2nv (@diamond2nv); the current version is v0.5.0.
