Clearml Metrics Logging Pattern

Author: diamond2nv · v0.5.0 · MIT-0
Install in OpenClaw
/install clearml-metrics-logging-pattern
Description
Standardized ClearML metrics logging patterns for PDEBench experiment scripts — train loss, validation metrics, competition scores, PDE residual, and TensorB...
Usage (SKILL.md)

ClearML Metrics Logging Pattern

When to Use

  • Creating or modifying PDEBench training/evaluation scripts
  • Adding ClearML logging to train_task1.py, train_task1_phys.py, train_task1_ft.py, train_task1_unroll.py
  • Ensuring expflow (single-node + distributed) can auto-capture metrics
  • Standardizing metric naming for compare-scores and gating

Installation

pip install "expflow-pde[clearml]"

Standardized Metric Naming Convention

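All ClearML metrics use Group/Metric naming, which is compatible with expflow clearml compare-scores. In the calls below, float_val stands for the scalar being reported and epoch for the current iteration index: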

# Loss group — error/cost related scalars
clearml_logger.report_scalar('Loss', 'Train MSE',     float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Val MSE',       float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Val RelMSE',    float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Physics',       float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Commut',        float_val, iteration=epoch)
clearml_logger.report_scalar('Loss', 'Stability',     float_val, iteration=epoch)

# Score group — competition segment scores (100-point scale)
clearml_logger.report_scalar('Score', 'Seg Total',    float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg1',         float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg2',         float_val, iteration=epoch)
clearml_logger.report_scalar('Score', 'Seg3',         float_val, iteration=epoch)

# PDE group — PDE residuals (per-segment)
clearml_logger.report_scalar('PDE', 'Mean Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg1 Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg2 Residual',  float_val, iteration=epoch)
clearml_logger.report_scalar('PDE', 'Seg3 Residual',  float_val, iteration=epoch)

# System group — system monitoring
clearml_logger.report_scalar('System', 'GPU Alloc MB',   float_val, iteration=epoch)
clearml_logger.report_scalar('System', 'GPU Reserved MB', float_val, iteration=epoch)
clearml_logger.report_scalar('System', 'LR',              float_val, iteration=epoch)

# Kfold group — k-fold cross-validation results
clearml_logger.report_scalar('Kfold', 'Mean Seg',    float_val, iteration=0)
clearml_logger.report_scalar('Kfold', 'Std Seg',     float_val, iteration=0)
clearml_logger.report_scalar('Kfold', 'CV Seg%',     float_val, iteration=0)
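
One way to keep scripts on this convention is to route every call through a small checker. The sketch below is only an illustration: STANDARD_NAMES, report_standard, and RecordingLogger are hypothetical names that are not part of ClearML or expflow. The stand-in logger mimics only the report_scalar(title, series, value, iteration) call used above, so the example runs without a ClearML server.

```python
# Hypothetical registry mirroring the Group/Metric names listed above.
STANDARD_NAMES = {
    "Loss": {"Train MSE", "Val MSE", "Val RelMSE", "Physics", "Commut", "Stability"},
    "Score": {"Seg Total", "Seg1", "Seg2", "Seg3"},
    "PDE": {"Mean Residual", "Seg1 Residual", "Seg2 Residual", "Seg3 Residual"},
    "System": {"GPU Alloc MB", "GPU Reserved MB", "LR"},
    "Kfold": {"Mean Seg", "Std Seg", "CV Seg%"},
}


def report_standard(logger, group, metric, value, iteration):
    """Report a scalar only if group/metric is in the registry above."""
    if metric not in STANDARD_NAMES.get(group, ()):
        raise ValueError(f"non-standard metric name: {group}/{metric}")
    logger.report_scalar(group, metric, float(value), iteration=iteration)


class RecordingLogger:
    """Stand-in for a ClearML logger; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def report_scalar(self, title, series, value, iteration):
        self.calls.append((title, series, value, iteration))


log = RecordingLogger()
report_standard(log, "Loss", "Train MSE", 0.25, iteration=3)

try:
    # An underscore variant is rejected before it reaches the logger.
    report_standard(log, "Score", "Seg_Total", 91.0, iteration=3)
except ValueError as err:
    print(err)
```

In a real script the RecordingLogger would be replaced by the object returned from clearml_task.get_logger().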

Code Templates

Template A: Add ClearML logging to training loop

Insert into existing train_task1.py / train_task1_phys.py / train_task1_ft.py / train_task1_unroll.py:

# After Task.init(), get logger
clearml_logger = None
if clearml_task is not None:
    try:
        clearml_logger = clearml_task.get_logger()
    except Exception:
        pass

# At end of epoch loop (after avg_loss is computed)
if clearml_logger is not None:
    clearml_logger.report_scalar('Loss', 'Train MSE', avg_loss, iteration=epoch + 1)
    clearml_logger.report_scalar('System', 'LR', scheduler.get_last_lr()[0], iteration=epoch + 1)
    if DEVICE.type == 'cuda':
        clearml_logger.report_scalar('System', 'GPU Alloc MB', round(gpu_alloc, 1), iteration=epoch + 1)

# After validation (after val_mse, val_rel, seg are computed)
if clearml_logger is not None:
    clearml_logger.report_scalar('Loss', 'Val MSE', val_mse, iteration=epoch + 1)
    clearml_logger.report_scalar('Loss', 'Val RelMSE', val_rel, iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg Total', seg['total_segmented_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg1', seg['seg1_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg2', seg['seg2_score'], iteration=epoch + 1)
    clearml_logger.report_scalar('Score', 'Seg3', seg['seg3_score'], iteration=epoch + 1)

# For physics loss (train_task1_phys.py)
if clearml_logger is not None and phys_loss is not None:
    clearml_logger.report_scalar('Loss', 'Physics', phys_loss.item(), iteration=epoch + 1)
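
Template A uses gpu_alloc without showing where it comes from. A minimal sketch of one way to obtain it is given here, assuming the training scripts use PyTorch (suggested by DEVICE.type, but not stated). The helper names gpu_memory_mb and log_gpu_memory are hypothetical; torch.cuda.memory_allocated() and torch.cuda.memory_reserved() return bytes, which are converted to MB to match the metric names.

```python
def gpu_memory_mb():
    """Return (allocated_mb, reserved_mb), or None if torch or CUDA is unavailable."""
    try:
        import torch  # imported lazily so the helper also works on CPU-only setups
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    mb = 1024 ** 2
    return (torch.cuda.memory_allocated() / mb, torch.cuda.memory_reserved() / mb)


def log_gpu_memory(logger, iteration):
    """Report both System GPU scalars; return False when there is nothing to log."""
    mem = gpu_memory_mb()
    if logger is None or mem is None:
        return False
    alloc, reserved = mem
    logger.report_scalar('System', 'GPU Alloc MB', round(alloc, 1), iteration=iteration)
    logger.report_scalar('System', 'GPU Reserved MB', round(reserved, 1), iteration=iteration)
    return True
```

Inside the epoch loop this would be called as log_gpu_memory(clearml_logger, epoch + 1), which also covers the case where clearml_logger is None.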

Template B: ClearML logging in an eval script

def run_eval_and_log(model, val_data, cl_task, tag):
    clearml_logger = cl_task.get_logger() if cl_task is not None else None
    val_mse, val_rel, seg_scores = evaluate_autoregressive(model, val_data)

    if clearml_logger is not None:
        clearml_logger.report_scalar('Score', 'Seg Total', seg_scores['total_segmented_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg1', seg_scores['seg1_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg2', seg_scores['seg2_score'], iteration=1)
        clearml_logger.report_scalar('Score', 'Seg3', seg_scores['seg3_score'], iteration=1)
        clearml_logger.report_scalar('Loss', 'Val MSE', val_mse, iteration=1)
        clearml_logger.report_scalar('Loss', 'Val RelMSE', val_rel, iteration=1)

    return val_mse, val_rel, seg_scores

Template C: Double Logger (TensorBoardX + ClearML)

class DoubleLogger:
    def __init__(self, tb_writer=None, cl_logger=None):
        self.tb = tb_writer
        self.cl = cl_logger

    def scalar(self, group, name, value, iteration):
        if self.tb is not None:
            self.tb.add_scalar(f'{group}/{name}', value, iteration)
        if self.cl is not None:
            self.cl.report_scalar(group, name, value, iteration=iteration)
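
A short usage sketch for Template C follows. The class is repeated so the example runs on its own, and StubTensorBoard and StubClearML are hypothetical stand-ins that record calls instead of writing to disk or a server. It shows that the TensorBoard tag becomes group/name while ClearML receives group and name separately, and that either backend can be omitted.

```python
class DoubleLogger:
    """Same shape as Template C, repeated so this sketch is self-contained."""

    def __init__(self, tb_writer=None, cl_logger=None):
        self.tb = tb_writer
        self.cl = cl_logger

    def scalar(self, group, name, value, iteration):
        if self.tb is not None:
            self.tb.add_scalar(f'{group}/{name}', value, iteration)
        if self.cl is not None:
            self.cl.report_scalar(group, name, value, iteration=iteration)


class StubTensorBoard:
    """Stand-in for a TensorBoardX SummaryWriter (add_scalar only)."""

    def __init__(self):
        self.rows = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.rows.append((tag, scalar_value, global_step))


class StubClearML:
    """Stand-in for a ClearML logger (report_scalar only)."""

    def __init__(self):
        self.rows = []

    def report_scalar(self, title, series, value, iteration):
        self.rows.append((title, series, value, iteration))


tb, cl = StubTensorBoard(), StubClearML()

# Both backends attached: one call writes to each.
both = DoubleLogger(tb_writer=tb, cl_logger=cl)
both.scalar('Loss', 'Train MSE', 0.125, 1)

# ClearML disabled: only the TensorBoard stub receives the value.
tb_only = DoubleLogger(tb_writer=tb)
tb_only.scalar('System', 'LR', 0.001, 1)
```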

Consistency with expflow

  • Group names match compare-scores display names
  • Metric names match STANDARD_METRICS keys (via underscore)
  • iteration must increment monotonically (clearml x-axis requirement)
  • Single-value eval metrics use iteration=1
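
The monotonic-iteration rule can be enforced in code rather than by convention. The wrapper below is a sketch under one assumption: "monotonically" is read as strictly increasing per (group, metric) series. MonotonicLogger and ListLogger are hypothetical names, not part of ClearML or expflow.

```python
class MonotonicLogger:
    """Wrap a logger and reject iterations that do not increase per series."""

    def __init__(self, logger):
        self._logger = logger
        self._last = {}  # (title, series) -> last iteration reported

    def report_scalar(self, title, series, value, iteration):
        key = (title, series)
        last = self._last.get(key)
        if last is not None and iteration <= last:
            raise ValueError(
                f"{title}/{series}: iteration {iteration} does not exceed {last}"
            )
        self._last[key] = iteration
        self._logger.report_scalar(title, series, value, iteration=iteration)


class ListLogger:
    """Stand-in logger that stores calls in a list."""

    def __init__(self):
        self.calls = []

    def report_scalar(self, title, series, value, iteration):
        self.calls.append((title, series, value, iteration))


inner = ListLogger()
guarded = MonotonicLogger(inner)
guarded.report_scalar('Loss', 'Train MSE', 0.5, iteration=1)
guarded.report_scalar('Loss', 'Train MSE', 0.4, iteration=2)
# A different series keeps its own counter, so iteration=1 is accepted here.
guarded.report_scalar('Loss', 'Val MSE', 0.6, iteration=1)
```

Because the counter is kept per series, a single-value eval metric reported once with iteration=1 never conflicts with training metrics.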

Known Pitfalls

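  1. Get the logger only after Task.init(): call get_logger() on the task object that Task.init() returns. Before that, Task.current_task() returns None and there is no task to log to
  2. capture_tensorboard=True: dual-writing to TensorBoardX and ClearML works, but ClearML prepends the TensorBoard path to group names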
  3. Distributed metrics are stored per-trial — parent optuna study only stores user_objective, not aggregated trial metrics
  4. Group + Metric name must be consistent — always Score/Seg Total, never Score/Seg_Total
Security advice
This result is low confidence because local artifact inspection failed with the available execution environment. Review metadata.json and artifact contents before installing, especially for broad file access, credential use, persistence, or automatic commands.
Capability assessment
Purpose & Capability
No artifact evidence was available to show a purpose or capability mismatch.
Instruction Scope
No artifact evidence was available to show unsafe or hidden runtime instructions.
Install Mechanism
No artifact evidence was available to show a risky install mechanism.
Credentials
No artifact evidence was available to show disproportionate environment access.
Persistence & Privilege
No artifact evidence was available to show persistence, privilege abuse, or credential misuse.
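How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install clearml-metrics-logging-pattern
  3. After installation, invoke the Skill by name or trigger it with /clearml-metrics-logging-pattern
  4. Provide the required inputs as described in the Skill's parameter notes to get structured output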
Version history
v0.5.0
  • Adds SKILL.md documentation, detailing standardized ClearML metrics logging patterns for PDEBench experiment scripts.
  • Provides installation instructions and code templates for integrating ClearML logging in training and evaluation scripts.
  • Defines consistent metric group/naming conventions for loss, score, PDE residual, system monitoring, and k-fold results.
  • Includes guidance for combining TensorBoardX and ClearML loggers.
  • Documents compatibility notes and common pitfalls for proper usage in PDEBench and expflow environments.
Metadata
Slug clearml-metrics-logging-pattern
Version 0.5.0
License MIT-0
Total installs 0
Current installs 0
Number of versions 1
FAQ

What is Clearml Metrics Logging Pattern?

Standardized ClearML metrics logging patterns for PDEBench experiment scripts: train loss, validation metrics, competition scores, PDE residual, and TensorB... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 38 downloads to date.

How do I install Clearml Metrics Logging Pattern?

Run the command "/install clearml-metrics-logging-pattern" in the OpenClaw or Claude Code chat box. No additional configuration is needed.

Is Clearml Metrics Logging Pattern free?

Yes. Clearml Metrics Logging Pattern is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Clearml Metrics Logging Pattern support?

Clearml Metrics Logging Pattern is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Clearml Metrics Logging Pattern?

It is developed and maintained by diamond2nv (@diamond2nv). The current version is v0.5.0.
