← 返回 Skills 市场
huaweiclouddev

huawei-cloud-ascend-command

作者 huaweicloud-skills-team · GitHub ↗ · v0.0.1 · MIT-0
cross-platform ⚠ suspicious
36
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install huawei-cloud-ascend-command
功能描述
Huawei Ascend NPU natural language management skill, supporting both local direct connection and SSH remote modes. Provides comprehensive npu-smi command cap...
使用说明 (SKILL.md)

Provides Huawei Ascend NPU natural language command translation and execution. Supports full npu-smi commands: device queries, configuration management, firmware upgrade, vNPU virtualization, certificate management, and compute power testing.

Overview

This skill provides natural language management for Huawei Ascend NPU devices via npu-smi commands. Supports both local direct execution and SSH remote modes.

Related Skills:

  • huawei-cloud-ascend-remote-connect - General SSH remote operations (disk, Docker, logs, processes)
  • npu-smi-skill - Alternative NPU management implementation
  • model-deploy-test-skill - Model deployment and testing on Ascend DevServer

Capabilities:

  • Device queries (health, temperature, power, HBM, ECC, topology)
  • Configuration management (ECC, fan mode, fan speed)
  • Firmware upgrade and status
  • vNPU virtualization
  • Certificate management
  • Compute power testing (FLOPS)

Architecture

System Architecture Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                         Agent Orchestration                         │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │  1. Parse user intent → Match Trigger keywords                        │    │
│  │  2. Prepare params → host, user, password explicit                 │    │
│  │  3. Invoke Skill → skill exec --name ascend-command            │    │
│  │  4. Return result → User                                          │    │
│  └────────────────────────────┬────────────────────────────────┘    │
│                               │ Explicit param passing (Rule 1)                │
│                               ▼                                     │
├─────────────────────────────────────────────────────────────────────┤
│                        User Interaction Layer                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │  Natural Language Commands / CLI Arguments                  │    │
│  │  (NPU health check, temperature, firmware upgrade)          │    │
│  └────────────────────────────┬────────────────────────────────┘    │
│                               │                                     │
│                               ▼                                     │
├─────────────────────────────────────────────────────────────────────┤
│                      Skill Core Components (Stateless (Rule 2))           │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐      │
│  │ Command Executor│←→│   NPU Client    │←→│ Trend Recorder  │      │
│  │  - NL Parsing   │  │  - npu-smi cmd  │  │  - Metric Log   │      │
│  │  - Command Route│  │  - Local/SSH    │  │  - Trend Analysis│      │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘      │
│         │                      │                                     │
│         ▼                      ▼                                     │
├─────────────────────────────────────────────────────────────────────┤
│                    Huawei Cloud Ascend Infrastructure               │
│  Data Flow: Agent → Skill(host,user,pwd) → SSH → npu-smi → Response │
└─────────────────────────────────────────────────────────────────────┘

Agent Orchestration Flow

User request: "Query NPU status on 116.204.23.145"
         │
         ▼
┌─────────────────────────────────────────────────────────────┐
│ Agent parses intent                                               │
│   - Keyword "NPU" → Match ascend-command                       │
│   - IP address → SSH params needed                                  │
└─────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────────────┐
│ Agent prepares params (explicit, Rule 1)                              │
│   --host 116.204.23.145                                      │
│   --user root                                                │
│   --password ***                                             │
│   --command "NPU list"                                       │
└─────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────────────┐
│ Skill executes (Stateless, Rule 2)                                    │
│   - NpuClient(host, user, password) explicit receive                 │
│   - Execute npu-smi command                                        │
│   - Return result                                                 │
└─────────────────────────────────────────────────────────────┘
         │
         ▼
      Return to user

Related Skills (Agent orchestrated, no direct call, Rule 3)

Skill Purpose Orchestration Scenario
huawei-cloud-ascend-remote-connect SSH remote ops Agent selects when user needs disk/Docker/logs
npu-smi-skill NPU management Alternative impl, Agent selects by scenario
model-deploy-test-skill Model deploy Use ascend-command to monitor NPU after deploy

Note: No direct calls between Skills, orchestrated by Agent based on user intent.

Prerequisites

System Requirements

  • Python 3.8+
  • Pure Python standard library implementation, no additional dependencies required

Environment Check

Prerequisite check: Python3 required

python3 --version  # Python3 >= 3.8

Authentication

Security rules (must be followed):

  • Prohibited from reading, echoing, or printing password values
  • Prohibited from asking the user to input passwords directly in the conversation
  • Only allowed to read credentials from command line arguments

Parameter Confirmation

Input Parameters

Parameter Required Description
command Yes (one-shot) Natural language command to execute
host No SSH remote host IP address
user No SSH username
password No SSH password

Confirmation Requirements

The following operations require explicit user confirmation:

  • ECC enable/disable
  • Fan mode changes
  • Firmware upgrade
  • vNPU creation/deletion
  • Certificate management

Core Workflow

Task 1: One-Shot Command Execution

python3 scripts/main.py --command "NPU health check"

Task 2: SSH Remote Execution

python3 scripts/main.py --command "NPU list" --host 192.168.1.100 --user root --password xxx

Task 3: Interactive Mode

python3 scripts/main.py

Usage Instructions

Device Queries

List all NPU devices
NPU health check
Get NPU temperature
Get NPU power usage
Get NPU memory info

Configuration Management

Set ECC mode
Set fan mode

Firmware Upgrade

Query firmware version
Upgrade firmware

Virtualization

Query vNPU info
Create vNPU

FLOPS Test

Run FLOPS test
Run FP32 FLOPS test
Run multi-device FLOPS test

Output Format

Standard Response Format

All command outputs follow a structured format:

{
  "status": "success",
  "command": "npu-smi info -l",
  "description": "List all NPU devices"
}

Error Response Format

[ERROR] \x3Cerror-code>
Message: \x3Cerror-description>
Suggestion: \x3Ctroubleshooting-tip>

Verification Method

Basic Verification Steps

  1. Environment Check

    python3 --version  # Verify Python 3.8+
    
  2. Local Command Test

    python3 scripts/main.py --command "NPU list"
    
  3. Remote Command Test

    python3 scripts/main.py --host \x3Cascend-ip> --user root --password \x3Cpwd> --command "NPU health check"
    

Expected Results

Test Case Expected Output
Environment check Python version >= 3.8
Local command NPU device information or "npu-smi not found"
Remote command NPU status from remote server

Script Files

Entry File

  • main.py: Skill entry file (required)
    • Function: Provide interactive mode, one-shot mode, SSH remote mode
    • Supports: Command line arguments, interactive prompt

Core Scripts

  • executor.py: NPU command executor (main entry)

    • Function: Natural language parsing, command routing, sensitive operation confirmation
    • Supported commands: Device queries, configuration, firmware upgrade, virtualization, certificate management
    • Core methods: handle_command(text), confirm_sensitive()
  • npu_client.py: NPU client

    • Function: Wraps npu-smi command execution, supports local and SSH remote execution
    • Core methods: list_devices(), get_health(), get_temperature(), get_power(), get_memory(), set_ecc_enable(), upgrade_query(), create_vnpu(), get_tls_cert()
    • Optimization: SSH ControlMaster, batch execution, parallel FLOPS testing

Command Translation Mapping

Device Queries

Natural Language Command Description
List all NPU devices npu-smi info -l List all NPU devices
NPU card info npu-smi info -m Card and chip mapping
NPU info npu-smi info Full NPU information

Health Check

Natural Language Command Description
NPU health check npu-smi info -t health -i 0 Check device health
Check NPU status npu-smi info -t health -i 0 Check device health

Temperature/Power/Memory

Natural Language Command Description
NPU temperature npu-smi info -t temp -i 0 -c 0 Get temperature
NPU power npu-smi info -t power -i 0 -c 0 Get power usage
NPU memory npu-smi info -t memory -i 0 -c 0 Get memory info
NPU utilization npu-smi info -t usages -i 0 -c 0 Get utilization

Configuration

Natural Language Command Description
Set ECC npu-smi set -t ecc-enable -i 0 -c 0 -d 1 Enable ECC (requires confirmation)
Set fan mode npu-smi set -t pwm-mode -d 0 Set fan mode (requires confirmation)

Firmware Upgrade

Natural Language Command Description
Firmware version npu-smi upgrade -b mcu -i 0 Query firmware version
Upgrade firmware npu-smi upgrade -t mcu -i 0 -f file.hpm Upload firmware (requires confirmation)

Virtualization

Natural Language Command Description
vNPU info npu-smi info -t info-vnpu -i 0 -c 0 Query vNPU info
Template info npu-smi info -t template-info -i 0 Query template info
Create vNPU npu-smi set -t create-vnpu -i 0 -c 0 -f vir04 Create vNPU (requires confirmation)

Certificate Management

Natural Language Command Description
Certificate info npu-smi info -t tls-cert -i 0 -c 0 View TLS certificate
Certificate expiration npu-smi info -t tls-cert-period -i 0 -c 0 Query certificate expiration threshold

FLOPS Test (ascend-dmi)

Default behavior: single device FP16, fast return (~2 seconds)

Natural Language Command Description
FLOPS test ascend-dmi -f -d 0 -t fp16 -q Default: NPU 0, FP16
NPU FLOPS ascend-dmi -f -d 0 -t fp16 -q Same as above
NPU 3 FLOPS ascend-dmi -f -d 3 -t fp16 -q Specify NPU 3
FP32 FLOPS ascend-dmi -f -d 0 -t fp32 -q FP32 precision
All devices FLOPS 8 devices async parallel All NPUs tested simultaneously

Execution strategy:

  • Single device: Direct execution, ~2 seconds return
  • Multi-device: Async parallel execution, total ~2 seconds (not serial 16 seconds)
  • Precision: Default fp16, supports fp32/bf16/int8/hf32

Performance Optimization

Optimization Description Effect
SSH persistent connection ControlMaster reuse Connection time ~0.1s
Batch execution execute_batch() for multiple commands Reduce SSH round trips
Parallel query test_flops_parallel() 8-device async test 16s -> 2s
Single parse parse_full_info() get all card info at once O(N) -> O(1)

Troubleshooting

Command Not Recognized

  • Check if input contains supported keywords
  • Try using more explicit commands

SSH Connection Failed

  • Verify SSH parameters are correct
  • Ensure SSH service is running on target machine

npu-smi Not Available

  • Ensure CANN and npu-smi are installed on target machine
  • Check LD_LIBRARY_PATH environment variable

Notes

Security Warnings

  • Sensitive operations require user confirmation before execution
  • Passwords are only stored in memory during session
  • Blocked commands cannot be bypassed

Limitations

  • Requires npu-smi or ascend-dmi installed on target machine
  • SSH password authentication only (no interactive password prompt)

Best Practices

Best Practices

  1. Connection Mode Selection

    • Local direct: For NPU installed on local machine
    • SSH remote: For managing remote Ascend servers
  2. Command Execution Tips

    • Sensitive ops (firmware upgrade, config change) require user confirmation
    • Test on single card before batch operations
  3. Performance Monitoring

    • Regularly check NPU temperature and power
    • Monitor HBM usage to avoid OOM
  4. Troubleshooting

    • First check NPU health: npu-smi info
    • View detailed error log: npu-smi info -t board -i \x3Cnpu_id>
    • See troubleshooting.md
  5. Trigger Keywords

    • Chinese: NPU, 昇腾, 显存, 算力, 温度, 功耗, 固件, 虚拟化
    • English: npu, ascend, temperature, power, HBM, firmware, vNPU, FLOPS

Directory Structure

huawei-cloud-ascend-command/
├── SKILL.md                    # Skill description (required)
├── scripts/                    # Scripts directory (required)
│   ├── __init__.py             # Module export
│   ├── main.py                 # Skill entry file (required)
│   ├── executor.py             # NPU command executor
│   └── npu_client.py           # NPU client
└── references/                 # Reference documentation
    ├── acceptance-criteria.md  # Acceptance criteria
    ├── verification-method.md  # Verification steps
    ├── troubleshooting.md      # Troubleshooting guide
    ├── device-queries.md       # Device queries reference
    ├── configuration.md        # Configuration reference
    ├── firmware-upgrade.md     # Firmware upgrade reference
    ├── virtualization.md       # Virtualization reference
    └── certificate-management.md # Certificate management

References

Document Description
acceptance-criteria.md Acceptance criteria for skill validation
verification-method.md Verification steps and test methods
troubleshooting.md Troubleshooting guide and common issues
device-queries.md Device queries command reference
configuration.md Configuration management reference
firmware-upgrade.md Firmware upgrade reference
virtualization.md Virtualization management reference
certificate-management.md Certificate management reference

Technical Support

安全使用建议
Install only if you intentionally need an agent to administer Huawei Ascend NPUs. Prefer read-only use, SSH keys and least-privilege accounts, explicit device IDs, and manual review before any firmware, certificate, vNPU, ECC, fan, reset, or raw npu-smi command. Avoid command-line passwords and do not follow the chmod 666 troubleshooting step on shared or production systems.
能力评估
Purpose & Capability
The stated purpose covers broad NPU administration, including local and SSH remote execution, configuration, firmware, vNPU, certificates, and FLOPS testing; however the artifacts also expose direct npu-smi passthrough and a generic batch execution helper that can run arbitrary command strings on local or remote hosts.
Instruction Scope
Many triggers are broad, and the direct npu-smi path bypasses the natural-language confirmation flow, so sensitive operations can be reached without the same guardrails documented for ECC, fan, firmware, vNPU, and certificate changes.
Install Mechanism
The skill appears to run from source with no hidden installer or persistence, but documentation inconsistently claims no external dependencies while SSH mode imports paramiko.
Credentials
SSH remote administration, root examples, disabled host-key verification via AutoAddPolicy, hardware configuration changes, firmware actions, and troubleshooting guidance such as chmod 666 on device nodes are high-impact for shared or production Ascend hosts.
Persistence & Privilege
No long-running persistence was found, but the skill handles privileged SSH sessions and accepts passwords on the command line, which can expose credentials through process listings, shell history, logs, or orchestration metadata.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install huawei-cloud-ascend-command
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /huawei-cloud-ascend-command 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v0.0.1
Initial release – Provides natural language management for Huawei Ascend NPU devices: - Supports full npu-smi command set: device queries, configuration management, firmware upgrade, vNPU virtualization, certificate management, and compute power testing. - Enables both local execution and SSH remote management for real-time device monitoring and control. - Responds to a wide range of trigger keywords for intuitive user interaction. - Requires explicit parameter input for remote operations; prohibits handling passwords in conversation. - Core tasks include device health checks, resource monitoring, configuration changes (with confirmation), and compute power tests. - No external dependencies; pure Python 3.8+ implementation.
元数据
Slug huawei-cloud-ascend-command
版本 0.0.1
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 1
常见问题

huawei-cloud-ascend-command 是什么?

Huawei Ascend NPU natural language management skill, supporting both local direct connection and SSH remote modes. Provides comprehensive npu-smi command cap... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 36 次。

如何安装 huawei-cloud-ascend-command?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install huawei-cloud-ascend-command」即可一键安装,无需额外配置。

huawei-cloud-ascend-command 是免费的吗?

是的,huawei-cloud-ascend-command 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

huawei-cloud-ascend-command 支持哪些平台?

huawei-cloud-ascend-command 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 huawei-cloud-ascend-command?

由 huaweicloud-skills-team(@huaweiclouddev)开发并维护,当前版本 v0.0.1。

💬 留言讨论