
Clearml Metrics Logging Pattern

by diamond2nv · GitHub ↗ · v0.5.0 · MIT-0
cross-platform ✓ Security Clean
Install in OpenClaw
/install clearml-metrics-logging-pattern
Description
Standardized ClearML metrics logging patterns for PDEBench experiment scripts — train loss, validation metrics, competition scores, PDE residual, and TensorB...
README (SKILL.md)

ClearML Metrics Logging Pattern

When to Use

  • Creating or modifying PDEBench training/evaluation scripts
  • Adding clearml logging to train_task1.py, train_task1_phys.py, train_task1_ft.py, train_task1_unroll.py
  • Ensuring expflow (single-node + distributed) can auto-capture metrics
  • Standardizing metric naming for compare-scores and gating

Installation

pip install "expflow-pde[clearml]"

Standardized Metric Naming Convention

All clearml metrics use Group/Metric naming, compatible with expflow clearml compare-scores. In report_scalar, the group is the first argument (title) and the metric is the second (series):

# Loss group — error/cost related scalars
clearml_logger.report_scalar('Loss', 'Train MSE',     float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Val MSE',       float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Val RelMSE',    float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Physics',       float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Commut',        float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Stability',     float_val, iteration=epoch)

# Score group — competition segment scores (100-point scale)
clearml_logger.report_scalar('Score', 'Seg Total',    float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg1',         float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg2',         float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg3',         float_val, iteration=epoch)

# PDE group — PDE residuals (per-segment)
clearml_logger.report_scalar('PDE', 'Mean Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg1 Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg2 Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg3 Residual',  float_val, iteration=epoch)

# System group — system monitoring
clearml_logger.report_scalar('System', 'GPU Alloc MB',   float_val, iteration=epoch)
clearml_logger.report_scalar('System', 'GPU Reserved MB', float_val, iteration=epoch)
clearml_logger.report_scalar('System', 'LR',              float_val, iteration=epoch)

# Kfold group — k-fold cross-validation results
clearml_logger.report_scalar('Kfold', 'Mean Seg',    float_val, iteration=0)
clearml_logger.report_scalar('Kfold', 'Std Seg',     float_val, iteration=0)
clearml_logger.report_scalar('Kfold', 'CV Seg%',     float_val, iteration=0)
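
One way to keep these names consistent is to check each call against a small registry before reporting. The sketch below is an assumption and not part of expflow or ClearML: METRIC_REGISTRY and check_metric are hypothetical names, and the registry only mirrors the groups listed above.

```python
# Hypothetical registry mirroring the Group/Metric names listed above.
METRIC_REGISTRY = {
    "Loss": {"Train MSE", "Val MSE", "Val RelMSE", "Physics", "Commut", "Stability"},
    "Score": {"Seg Total", "Seg1", "Seg2", "Seg3"},
    "PDE": {"Mean Residual", "Seg1 Residual", "Seg2 Residual", "Seg3 Residual"},
    "System": {"GPU Alloc MB", "GPU Reserved MB", "LR"},
    "Kfold": {"Mean Seg", "Std Seg", "CV Seg%"},
}


def check_metric(group: str, metric: str) -> None:
    """Raise ValueError if (group, metric) is not one of the standardized names."""
    if group not in METRIC_REGISTRY:
        raise ValueError(f"unknown group {group!r}")
    if metric not in METRIC_REGISTRY[group]:
        raise ValueError(f"unknown metric {metric!r} in group {group!r}")
```

Calling check_metric('Score', 'Seg Total') passes silently, while check_metric('Score', 'Seg_Total') raises, which catches the underscore variant before it reaches the server.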

Code Templates

Template A: Add clearml logging to training loop

Insert into existing train_task1.py / train_task1_phys.py / train_task1_ft.py / train_task1_unroll.py:

# After Task.init(), get logger
clearml_logger = None
if clearml_task is not None:
    try:
        clearml_logger = clearml_task.get_logger()
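    except Exception as exc:
        # Keep training if ClearML is unavailable, but say why logging is off
        print(f'ClearML logger unavailable, metrics will not be reported: {exc}')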

# At end of epoch loop (after avg_loss is computed)
if clearml_logger is not None:
    clearml_logger.report_scalar('Loss', 'Train MSE', avg_loss, iteration=epoch + 1)
    clearml_logger.report_scalar('System', 'LR', scheduler.get_last_lr()[0], iteration=epoch + 1)
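    if DEVICE.type == 'cuda':
        # Compute gpu_alloc here if the script does not already define it
        # (assumes torch is imported; memory_allocated() returns bytes)
        gpu_alloc = torch.cuda.memory_allocated(DEVICE) / 1024**2
        clearml_logger.report_scalar('System', 'GPU Alloc MB', round(gpu_alloc, 1), iteration=epoch + 1)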

# After validation (after val_mse, val_rel, seg are computed)
if clearml_logger is not None:
    clearml_logger.report_scalar('Loss', 'Val MSE', val_mse, iteration=epoch + 1)
    clearml_logger.report_scalar('Loss', 'Val RelMSE', val_rel, iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg Total', seg['total_segmented_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg1', seg['seg1_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg2', seg['seg2_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg3', seg['seg3_score'], iteration=epoch + 1)

# For physics loss (train_task1_phys.py)
if clearml_logger is not None and phys_loss is not None:
    clearml_logger.report_scalar('Loss', 'Physics', phys_loss.item(), iteration=epoch + 1)
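
The templates above pass a mix of Python floats and tensor values. A minimal sketch of a helper that normalizes them is shown here; to_float and report_if_finite are hypothetical names, not part of ClearML or expflow, and skipping NaN/inf values (rather than reporting them) is a design choice assumed for illustration.

```python
import math


def to_float(value):
    """Convert a scalar-like value to a Python float, or None if it is not finite."""
    # 0-d tensors and NumPy scalars expose .item(); plain numbers do not.
    if hasattr(value, "item"):
        value = value.item()
    value = float(value)
    return value if math.isfinite(value) else None


def report_if_finite(logger, group, metric, value, iteration):
    """Call logger.report_scalar unless the logger is None or the value is NaN/inf.

    Returns True when a scalar was reported, False otherwise.
    """
    scalar = to_float(value)
    if logger is None or scalar is None:
        return False
    logger.report_scalar(group, metric, scalar, iteration=iteration)
    return True
```

With this in place, a call such as report_if_finite(clearml_logger, 'Loss', 'Physics', phys_loss, epoch + 1) would cover both the "logger is None" guard and the .item() conversion in one step.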

Template B: Eval script clearml logging

def run_eval_and_log(model, val_data, cl_task, tag):
    # Note: tag is accepted but not used in this template
    clearml_logger = cl_task.get_logger() if cl_task is not None else None
    val_mse, val_rel, seg_scores = evaluate_autoregressive(model, val_data)

    if clearml_logger is not None:
        clearml_logger.report_scalar('Score', 'Seg Total', seg_scores['total_segmented_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg1', seg_scores['seg1_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg2', seg_scores['seg2_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg3', seg_scores['seg3_score'], iteration=1)
        clearml_logger.report_scalar('Loss', 'Val MSE', val_mse, iteration=1)
        clearml_logger.report_scalar('Loss', 'Val RelMSE', val_rel, iteration=1)

    return val_mse, val_rel, seg_scores

Template C: Double Logger (TensorBoardX + ClearML)

class DoubleLogger:
    def __init__(self, tb_writer=None, cl_logger=None):
        self.tb = tb_writer
        self.cl = cl_logger

    def scalar(self, group, name, value, iteration):
        if self.tb is not None:
            self.tb.add_scalar(f'{group}/{name}', value, iteration)
        if self.cl is not None:
            self.cl.report_scalar(group, name, value, iteration=iteration)
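
To see what each backend receives, the class can be exercised with in-memory stand-ins. In this sketch RecordingWriter and RecordingLogger are hypothetical test doubles, not TensorBoardX or ClearML classes; DoubleLogger is repeated from Template C only so the snippet runs on its own.

```python
class DoubleLogger:
    """Same class as Template C, repeated so this snippet is self-contained."""

    def __init__(self, tb_writer=None, cl_logger=None):
        self.tb = tb_writer
        self.cl = cl_logger

    def scalar(self, group, name, value, iteration):
        if self.tb is not None:
            self.tb.add_scalar(f'{group}/{name}', value, iteration)
        if self.cl is not None:
            self.cl.report_scalar(group, name, value, iteration=iteration)


class RecordingWriter:
    """Hypothetical stand-in exposing add_scalar like a TensorBoardX SummaryWriter."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, value, step):
        self.records.append((tag, value, step))


class RecordingLogger:
    """Hypothetical stand-in exposing report_scalar like a ClearML Logger."""

    def __init__(self):
        self.records = []

    def report_scalar(self, title, series, value, iteration):
        self.records.append((title, series, value, iteration))


tb, cl = RecordingWriter(), RecordingLogger()
log = DoubleLogger(tb_writer=tb, cl_logger=cl)
log.scalar('Loss', 'Train MSE', 0.25, iteration=1)

print(tb.records)  # TensorBoard side gets a single 'Group/Metric' tag
print(cl.records)  # ClearML side gets group and metric as separate fields
```

Either backend can be left as None, in which case scalar() writes only to the other one.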

Consistency with expflow

  • Group names match compare-scores display names
  • Metric names match STANDARD_METRICS keys (via underscore)
  • iteration must increment monotonically (clearml x-axis requirement)
  • Single-value eval metrics use iteration=1
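
The monotonic-iteration rule can be enforced locally before anything is sent. The following is a sketch under assumptions: IterationGuard is a hypothetical helper, not part of ClearML or expflow, and it treats "monotonically" as strictly increasing per (group, metric) pair.

```python
class IterationGuard:
    """Remember the last iteration per (group, metric) and reject non-increasing ones."""

    def __init__(self):
        self._last = {}

    def check(self, group, metric, iteration):
        key = (group, metric)
        last = self._last.get(key)
        if last is not None and iteration <= last:
            raise ValueError(
                f"{group}/{metric}: iteration {iteration} does not exceed previous {last}"
            )
        self._last[key] = iteration
```

Calling guard.check('Loss', 'Train MSE', epoch + 1) just before each report_scalar call would surface an accidental epoch reset (for example after resuming from a checkpoint) as an immediate error.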

Known Pitfalls

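  1. Get the logger from the task returned by Task.init(). get_logger() is an instance method, so before Task.init() there is no task to call it on (Task.current_task() returns None)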
  2. capture_tensorboard=True: dual-writing to TensorBoardX and clearml works, but clearml adds the TensorBoard path prefix to group names
  3. Distributed metrics are stored per trial: the parent optuna study stores only user_objective, not aggregated trial metrics
  4. Group + Metric name must be consistent — always Score/Seg Total, never Score/Seg_Total
Usage Guidance
This result is low confidence because local artifact inspection failed with the available execution environment. Review metadata.json and artifact contents before installing, especially for broad file access, credential use, persistence, or automatic commands.
Capability Assessment
Purpose & Capability
No artifact evidence was available to show a purpose or capability mismatch.
Instruction Scope
No artifact evidence was available to show unsafe or hidden runtime instructions.
Install Mechanism
No artifact evidence was available to show a risky install mechanism.
Credentials
No artifact evidence was available to show disproportionate environment access.
Persistence & Privilege
No artifact evidence was available to show persistence, privilege abuse, or credential misuse.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install clearml-metrics-logging-pattern
  3. After installation, invoke the skill by name or use /clearml-metrics-logging-pattern
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.5.0
Version 0.5.0
  • Adds SKILL.md documentation, detailing standardized ClearML metrics logging patterns for PDEBench experiment scripts.
  • Provides installation instructions and code templates for integrating ClearML logging in training and evaluation scripts.
  • Defines consistent metric group/naming conventions for loss, score, PDE residual, system monitoring, and k-fold results.
  • Includes guidance for combining TensorBoardX and ClearML loggers.
  • Documents compatibility notes and common pitfalls for proper usage in PDEBench and expflow environments.
Metadata
Slug clearml-metrics-logging-pattern
Version 0.5.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Clearml Metrics Logging Pattern?

Standardized ClearML metrics logging patterns for PDEBench experiment scripts — train loss, validation metrics, competition scores, PDE residual, and TensorB... It is an AI Agent Skill for Claude Code / OpenClaw, with 38 downloads so far.

How do I install Clearml Metrics Logging Pattern?

Run "/install clearml-metrics-logging-pattern" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Clearml Metrics Logging Pattern free?

Yes, Clearml Metrics Logging Pattern is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Clearml Metrics Logging Pattern support?

Clearml Metrics Logging Pattern is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Clearml Metrics Logging Pattern?

It is built and maintained by diamond2nv (@diamond2nv); the current version is v0.5.0.
