
huawei-cloud-ascend-op-mfu-calculator

Name: huawei-cloud-ascend-op-mfu-calculator
Author: huaweiclouddev

by huaweicloud-skills-team · GitHub ↗ · v0.0.2 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install huawei-cloud-ascend-op-mfu-calculator

Description

Calculate MFU (Machine FLOP Utilization) for operators like matmul/GEMM/FlashAttention on Ascend NPU, providing clear formulas and derivation process Use thi...

README (SKILL.md)

Huawei Cloud Ascend Operator MFU Calculator

Overview

This skill calculates MFU (Machine FLOP Utilization) for operators such as matmul, GEMM, and FlashAttention on Ascend NPUs, and shows the formulas and derivation steps behind each result.

Architecture: Input Validation → FLOPs Calculation → Achieved TFLOPs/s → MFU Calculation → Result Analysis

Related Skills:

huawei-cloud-ascend-profiler-db-explorer - Profiling data analysis for operator performance data

Prerequisites

Python 3.8+ installed
Basic understanding of FLOPs calculation concepts

Usage Scenarios

Typical Problem Scenarios:

Evaluating how well an operator utilizes Ascend NPU compute power
Comparing performance of different operator implementations
Identifying optimization opportunities for matrix operations

Typical User Utterances:

"Calculate MFU for my GEMM operator"
"What's the machine FLOP utilization for FlashAttention?"
"Analyze my matmul operator performance efficiency"

Workflow

Input Collection: Gather operator parameters (matrix dimensions, data types, execution time)
FLOPs Calculation: Compute theoretical FLOPs for the operation
Achieved Performance: Calculate achieved TFLOP/s from the FLOPs count and the execution time
MFU Calculation: Apply the formula MFU = Achieved FLOP/s / Peak FLOP/s
Result Analysis: Provide interpretation and optimization suggestions
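The FLOPs Calculation step can be sketched as follows. This is a minimal illustration, not the skill's own code: the function names are hypothetical, and the counts follow the common convention of 2 FLOPs per multiply-add for matmul and 4 · batch · heads · seq_len² · head_dim for a FlashAttention forward pass (halved under a causal mask). Softmax and other elementwise work is ignored.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOPs for C[m, n] = A[m, k] @ B[k, n]; each multiply-add counts as 2."""
    return 2 * m * n * k


def flash_attention_forward_flops(batch: int, heads: int, seq_len: int,
                                  head_dim: int, causal: bool = False) -> int:
    """Forward attention FLOPs: Q@K^T and P@V each cost 2 * s^2 * d per head."""
    flops = 4 * batch * heads * seq_len * seq_len * head_dim
    # A causal mask skips roughly half of the score matrix.
    return flops // 2 if causal else flops


print(matmul_flops(4096, 4096, 4096))                   # 137438953472
print(flash_attention_forward_flops(1, 32, 4096, 128))  # 274877906944
```

Whichever counting convention is used, the same one should be applied to every implementation being compared, because MFU values are only comparable when the FLOPs numerator is defined identically.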

MFU Calculation Formula

MFU = (Achieved FLOP/s / Peak FLOP/s) × 100%

Where:

Achieved FLOP/s = Operation FLOPs / Execution Time (in seconds)
Peak FLOP/s = Hardware-specific peak throughput for the data type in use (e.g., Ascend 910B: 256 TFLOP/s for FP16)
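As a worked example of the formula, the sketch below computes MFU for a 4096 × 4096 × 4096 FP16 matmul that runs in 1 ms. The function name `compute_mfu` is hypothetical, and the 256 TFLOP/s peak is the figure quoted above, used here as an assumption that should be checked against current official specifications.

```python
def compute_mfu(flops: float, time_ms: float, peak_tflops: float) -> float:
    """Return MFU as a percentage of the hardware peak."""
    if time_ms <= 0 or peak_tflops <= 0:
        raise ValueError("time_ms and peak_tflops must be positive")
    # Convert milliseconds to seconds, then FLOP/s to TFLOP/s.
    achieved_tflops = flops / (time_ms * 1e-3) / 1e12
    return achieved_tflops / peak_tflops * 100.0


# 2 * 4096^3 = 137,438,953,472 FLOPs in 1 ms is about 137.4 TFLOP/s.
mfu = compute_mfu(flops=2 * 4096**3, time_ms=1.0, peak_tflops=256.0)
print(f"MFU = {mfu:.1f}%")  # MFU = 53.7%
```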

Reference Documents

Document	Description
Ascend 910B Series Technical Specifications	Official Ascend 910B series product specifications
MFU Calculation Methodology	Detailed MFU calculation formulas and examples
FlashAttention Technical Paper	Original FlashAttention research paper

Enhanced Features

Intelligent Bottleneck Diagnoser

AI-powered bottleneck diagnosis that analyzes profiling data to identify root causes automatically
Classifies bottlenecks into categories: memory-bound, compute-bound, communication-bound, or operator-fallback
Provides actionable optimization recommendations with priority ranking
Includes pattern matching for known performance anti-patterns
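The listing does not describe how the diagnoser separates categories. One common way to tell memory-bound from compute-bound operators is a roofline-style comparison of arithmetic intensity against the hardware ridge point; the sketch below illustrates that idea only. The function name, the `bytes_moved` input, and the 1000 GB/s bandwidth are all hypothetical placeholders, not values from the skill or from Ascend documentation.

```python
def classify_roofline(flops: float, bytes_moved: float,
                      peak_tflops: float, peak_bandwidth_gbs: float) -> str:
    """Classify an operator by comparing its arithmetic intensity to the ridge point."""
    intensity = flops / bytes_moved                            # FLOPs per byte
    ridge = (peak_tflops * 1e12) / (peak_bandwidth_gbs * 1e9)  # FLOPs per byte
    return "compute-bound" if intensity >= ridge else "memory-bound"


n = 4096
# FP16 matmul: reads A and B and writes C, 2 bytes per element.
print(classify_roofline(2 * n**3, 3 * n * n * 2, 256.0, 1000.0))  # compute-bound
# FP16 elementwise add: 1 FLOP per element, 3 tensors of 2-byte elements.
print(classify_roofline(n * n, 3 * n * n * 2, 256.0, 1000.0))     # memory-bound
```

This heuristic covers only two of the four categories listed above; communication-bound and operator-fallback cases need profiling data that a simple FLOPs-per-byte ratio does not capture.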

Parameter Confirmation

Parameter	Description	Required
operator	Operator type (matmul/flash_attention/gemm, etc.)	Yes
flops	Theoretical FLOPs of the operator	Yes
time_ms	Operator execution time (milliseconds)	Yes
peak_tflops	Hardware peak computing power (TFLOPS)	Yes
device	NPU device type (910B/910, etc.)	No
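The parameter table maps naturally onto a small validated record. The sketch below is one possible shape, assuming the field names from the table; the class name `MfuInputs` and the specific validation rules are illustrative assumptions rather than part of the skill.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MfuInputs:
    operator: str                  # e.g. "matmul", "gemm", "flash_attention"
    flops: float                   # theoretical FLOPs of the operator
    time_ms: float                 # execution time in milliseconds
    peak_tflops: float             # hardware peak in TFLOPS
    device: Optional[str] = None   # optional, e.g. "910B"

    def __post_init__(self) -> None:
        if not self.operator.strip():
            raise ValueError("operator must be a non-empty string")
        for name in ("flops", "time_ms", "peak_tflops"):
            value = getattr(self, name)
            # "not value > 0" also rejects NaN.
            if not value > 0:
                raise ValueError(f"{name} must be positive, got {value!r}")


inputs = MfuInputs(operator="matmul", flops=2 * 4096**3, time_ms=1.0, peak_tflops=256.0)
print(inputs.device)  # None
```

Rejecting non-positive values up front avoids a division by zero in the later MFU step and catches unit mistakes such as a missing execution time.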

Usage Guidance

Install this if you need Ascend MFU calculation guidance. Treat hardware peak numbers and formulas as analysis aids to verify against current official documentation, and only provide profiler CSV files you intend the assistant to analyze.
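When the inputs come from a profiler CSV, the same calculation can be applied row by row. The sketch below assumes a hypothetical layout with `op_name`, `flops`, and `duration_us` columns; real profiler exports will likely use different column names and units, so the mapping should be adjusted to the actual file.

```python
import csv
import io

# Hypothetical sample data; not taken from a real profiler export.
SAMPLE_CSV = """op_name,flops,duration_us
matmul_0,137438953472,1000
matmul_1,137438953472,2000
"""


def mfu_per_row(csv_text: str, peak_tflops: float) -> dict:
    """Return {op_name: MFU percent} for each row of the CSV text."""
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        seconds = float(row["duration_us"]) * 1e-6
        achieved_tflops = float(row["flops"]) / seconds / 1e12
        results[row["op_name"]] = achieved_tflops / peak_tflops * 100.0
    return results


for name, mfu in mfu_per_row(SAMPLE_CSV, peak_tflops=256.0).items():
    print(f"{name}: {mfu:.1f}%")
```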

Capability Assessment

✓ Purpose & Capability

The skill purpose, references, and examples all align around calculating MFU for matmul, GEMM, and FlashAttention performance analysis on Ascend NPUs.

ℹ Instruction Scope

The skill declares python3 as an allowed tool and includes examples that may read a user-provided profiling CSV, which is purpose-aligned but should be used only with intended performance data.

✓ Install Mechanism

The artifact contains only markdown files and references; there are no executable scripts, package install hooks, dependencies, or setup commands.

✓ Credentials

Requested capability is proportionate for arithmetic and CSV-based performance calculations; there is no request for credentials, account access, network calls, broad filesystem access, or data mutation.

✓ Persistence & Privilege

No persistence, background workers, privilege escalation, credential/session handling, or long-running execution is present in the artifacts.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install huawei-cloud-ascend-op-mfu-calculator
After installation, invoke the skill by name or use /huawei-cloud-ascend-op-mfu-calculator
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.0.2

No file changes were made compared to the previous version.

v0.0.1

- Initial release of the Huawei Cloud Ascend Operator MFU Calculator skill.
- Calculates MFU (Machine FLOP Utilization) for operators such as MatMul, GEMM, and FlashAttention on Ascend NPUs.
- Provides step-by-step calculation: input parameters → FLOPs calculation → achieved performance → MFU computation → result analysis.
- Includes intelligent bottleneck diagnosis with optimization suggestions and pattern recognition.
- Offers clear usage scenarios, formula documentation, and hardware-specific references for Ascend devices.

Metadata

Slug huawei-cloud-ascend-op-mfu-calculator

Version 0.0.2

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is huawei-cloud-ascend-op-mfu-calculator?

Calculate MFU (Machine FLOP Utilization) for operators like matmul/GEMM/FlashAttention on Ascend NPU, providing clear formulas and derivation process Use thi... It is an AI Agent Skill for Claude Code / OpenClaw, with 38 downloads so far.

How do I install huawei-cloud-ascend-op-mfu-calculator?

Run "/install huawei-cloud-ascend-op-mfu-calculator" in the OpenClaw or Claude Code chat to install it in one step. No extra setup is required.

Is huawei-cloud-ascend-op-mfu-calculator free?

Yes, huawei-cloud-ascend-op-mfu-calculator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does huawei-cloud-ascend-op-mfu-calculator support?

huawei-cloud-ascend-op-mfu-calculator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created huawei-cloud-ascend-op-mfu-calculator?

It is built and maintained by huaweicloud-skills-team (@huaweiclouddev); the current version is v0.0.2.
