
Gpu Container Setup Flagos

Name: Gpu Container Setup Flagos
Author: wbavon

by Flagos · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install gpu-container-setup-flagos

Description

Automatically detect GPU vendor, find appropriate PyTorch container image, launch with correct mounts, and validate GPU functionality. Supports NVIDIA, Ascen...

README (SKILL.md)

GPU Container Setup Skill

This skill automates multi-vendor GPU container setup for PyTorch workloads.

Supported GPU Vendors

Vendor	PyTorch Backend	Detection
NVIDIA	CUDA	`nvidia-smi`
AMD	ROCm (HIP)	`rocm-smi`, `/opt/rocm`
Ascend	torch_npu	`npu-smi`, `/usr/local/Ascend`
Metax	torch_musa	`mx-smi`, `/opt/metax`
Iluvatar	torch_corex	`ixsmi`, `/opt/iluvatar`

Execution Flow

When invoked, follow these steps:

Step 1: Parse Arguments

Check if user provided:

--vendor <name> - Force specific vendor (skip detection)
--image <image> - Force specific container image
--data <path> - Force specific data mount path
--name <name> - Container name (default: pytorch-gpu)
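
The four flags above map naturally onto a small argument parser. This is a minimal sketch, assuming the arguments arrive as an ordinary argv-style list; the `build_parser` helper and the `prog` name are illustrative and not part of the skill's scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="gpu-container-setup")
    parser.add_argument("--vendor", help="force a specific vendor (skips detection)")
    parser.add_argument("--image", help="force a specific container image")
    parser.add_argument("--data", help="force a specific data mount path")
    # The default container name comes from the flag list above.
    parser.add_argument("--name", default="pytorch-gpu", help="container name")
    return parser


args = build_parser().parse_args(["--vendor", "ascend"])
print(args.vendor, args.name, args.image)
```

Flags that are not supplied come back as `None`, which makes it easy to tell "user forced a value" apart from "run detection".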

Step 2: Detect GPU Vendor

Run the detection script:

python3 .claude/skills/gpu-container-setup/scripts/detect_gpu.py

Expected output:

{"vendor": "ascend", "devices": ["Ascend 910B"], "count": 8}

If detection fails and no --vendor flag was provided, ask the user which vendor to use.
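
For orientation, the probes in the vendor table can be checked with a few lines of Python. This is only a sketch of the idea, not the contents of detect_gpu.py: the probe order, the lowercase vendor names other than "ascend", and the `detect_vendor` helper are assumptions.

```python
import os
import shutil

# (vendor, CLI tool, install directory or None), taken from the vendor table.
# Probe order and lowercase names (except "ascend") are assumptions.
PROBES = [
    ("nvidia", "nvidia-smi", None),
    ("amd", "rocm-smi", "/opt/rocm"),
    ("ascend", "npu-smi", "/usr/local/Ascend"),
    ("metax", "mx-smi", "/opt/metax"),
    ("iluvatar", "ixsmi", "/opt/iluvatar"),
]


def detect_vendor(which=shutil.which, isdir=os.path.isdir):
    """Return the first vendor whose tool is on PATH or whose directory exists."""
    for vendor, tool, path in PROBES:
        if which(tool) or (path and isdir(path)):
            return vendor
    return None  # caller falls back to asking the user


print(detect_vendor())
```

The `which` and `isdir` parameters exist only so the logic can be exercised on a machine without any accelerator installed.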

Step 3: Find Data Disk

Run the data disk detection:

python3 .claude/skills/gpu-container-setup/scripts/find_data_disk.py

Expected output:

{"data_disk": "/mnt/data", "found": true, "size": "2.0T", "available": "1.5T"}

If no suitable disk is found, ask the user for a data mount path.
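
One way to consume that JSON is sketched below, assuming the script prints a single object with the `data_disk` and `found` keys shown above. The `resolve_data_path` helper is hypothetical; a `None` result means the user should be asked.

```python
import json


def resolve_data_path(detector_output, override=None):
    """Pick the data mount path: --data override first, then the detected disk."""
    if override:
        return override
    try:
        result = json.loads(detector_output)
    except json.JSONDecodeError:
        return None  # unreadable output: ask the user
    if isinstance(result, dict) and result.get("found") and result.get("data_disk"):
        return result["data_disk"]
    return None  # nothing suitable: ask the user


sample = '{"data_disk": "/mnt/data", "found": true, "size": "2.0T", "available": "1.5T"}'
print(resolve_data_path(sample))
```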

Step 4: Find Container Image

Follow strict priority order (only proceed to next if current fails):

1. Primary Vendor Hub (hardcoded) → 2. BAAI Harbor → 3. Web Search → 4. Local Images → 5. Ask User

Step 4.1: Primary Vendor Hub (hardcoded URLs)

Vendor	Registry	API/Query
NVIDIA	`nvcr.io`	`https://api.ngc.nvidia.com/v2/repos/nvidia/pytorch/tags`
Ascend	`ascendhub.huawei.com`	Portal: https://ascendhub.huawei.com
Metax	`registry.metax-tech.com`	`https://registry.metax-tech.com/v2/pytorch/metax-pytorch/tags/list`
Iluvatar	`hub.iluvatar.com`	`https://hub.iluvatar.com/v2/pytorch/iluvatar-pytorch/tags/list`
AMD	`docker.io` (rocm/pytorch)	`https://hub.docker.com/v2/repositories/rocm/pytorch/tags`

# Example: Query NGC for latest NVIDIA PyTorch
TAG=$(curl -s "https://api.ngc.nvidia.com/v2/repos/nvidia/pytorch/tags" | jq -r '.tags[].name' | grep -E '^[0-9]{2}\.[0-9]{2}-py3$' | sort -rV | head -1)
IMAGE="nvcr.io/nvidia/pytorch:${TAG}"
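
The tag filter in the pipeline above (`grep` for `YY.MM-py3`, then a reverse version sort) can also be written as a pure function, which is easier to test offline. This sketch assumes the tag names have already been extracted into a list; `pick_latest_tag` is an illustrative name.

```python
import re

# Same shape as the grep pattern above: two digits, dot, two digits, "-py3".
TAG_PATTERN = re.compile(r"(\d{2})\.(\d{2})-py3")


def pick_latest_tag(tags):
    """Return the newest YY.MM-py3 tag, or None if no tag matches."""
    versions = []
    for tag in tags:
        match = TAG_PATTERN.fullmatch(tag)
        if match:
            versions.append(((int(match.group(1)), int(match.group(2))), tag))
    return max(versions)[1] if versions else None


tag = pick_latest_tag(["23.12-py3", "24.01-py3", "24.01-py3-igpu", "latest"])
print(f"nvcr.io/nvidia/pytorch:{tag}")
```

Comparing `(year, month)` integer pairs avoids relying on string ordering, and `fullmatch` rejects variants such as `24.01-py3-igpu`.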

Step 4.2: BAAI Harbor (fallback)

Use only if Step 4.1 fails (registry unreachable, no matching image, or pull fails).

# Query BAAI Harbor
curl -s "https://harbor.baai.ac.cn/api/v2.0/projects/flagrelease-public/repositories?page_size=100" | jq -r '.[].name' | grep "flagrelease-<vendor>"

Step 4.3: Web Search (fallback)

Only if Steps 4.1 and 4.2 fail. Search for "<vendor> pytorch docker official".

Step 4.4: Local Images (fallback)

Only if Steps 4.1-4.3 fail. Check `docker images | grep pytorch`.

Test Before Use

docker pull "${IMAGE}" && docker run --rm "${IMAGE}" python -c "import torch; print(torch.__version__)"

If the test fails, try the next source. If all sources fail, ask the user for an image.
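
The strict priority order with a test gate can be sketched as a loop over lookup functions. This is an illustration under assumptions: each source is modelled as a callable returning an image name or `None`, and `test` stands in for the pull-and-import check; `first_working_image` and the stub sources are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

Source = Tuple[str, Callable[[], Optional[str]]]


def first_working_image(sources: Iterable[Source], test: Callable[[str], bool]):
    """Walk sources in priority order; return (source name, image) or None."""
    for name, find in sources:
        try:
            image = find()
        except Exception:
            image = None  # an unreachable source counts as a miss
        if image and test(image):
            return name, image
    return None  # every source failed: ask the user for an image


def unreachable_hub():
    raise ConnectionError("registry unreachable")


# Stubs for illustration; the image name is taken from the usage examples.
sources = [
    ("vendor hub", unreachable_hub),
    ("BAAI Harbor", lambda: "harbor.baai.ac.cn/flagrelease-public/ngctorch:2601"),
    ("local images", lambda: None),
]
print(first_working_image(sources, test=lambda image: True))
```

Later sources are never queried once an earlier one yields an image that passes the test, which matches the "only proceed to next if current fails" rule.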

Step 4.5: Update Skill (self-improvement)

IMPORTANT: If an image found via Web Search (Step 4.3) passes all tests, update references/image-sources.md to add the newly discovered vendor hub as a primary source. This makes future lookups faster.

# After successful web search discovery:
# 1. Verify image works (pull + pytorch test + GPU test)
# 2. Extract registry URL pattern
# 3. Update references/image-sources.md Step 1 section with new vendor hub

Step 5: Build Docker Command

Refer to references/mount-requirements.md for vendor-specific requirements.

NVIDIA:

docker run -d --gpus all \
  --name pytorch-gpu \
  --shm-size=16g \
  -v <data_disk>:/data \
  <image> sleep infinity

AMD/ROCm:

docker run -d \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --group-add render \
  --name pytorch-gpu \
  --shm-size=16g \
  -v <data_disk>:/data \
  <image> sleep infinity

Ascend:

docker run -d \
  --device=/dev/davinci0 --device=/dev/davinci1 ... \
  --device=/dev/davinci_manager \
  --device=/dev/devmm_svm \
  --device=/dev/hisi_hdc \
  -v /usr/local/Ascend:/usr/local/Ascend:ro \
  -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi:ro \
  --name pytorch-gpu \
  --shm-size=16g \
  -v <data_disk>:/data \
  <image> sleep infinity

Metax:

docker run -d \
  --device=/dev/mx0 --device=/dev/mx1 ... \
  -v /opt/metax:/opt/metax:ro \
  --name pytorch-gpu \
  --shm-size=16g \
  -v <data_disk>:/data \
  <image> sleep infinity

Iluvatar:

docker run -d \
  --device=/dev/bi0 --device=/dev/bi1 ... \
  -v /opt/iluvatar:/opt/iluvatar:ro \
  --name pytorch-gpu \
  --shm-size=16g \
  -v <data_disk>:/data \
  <image> sleep infinity
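
The five commands above share most of their arguments, so they can be assembled from a per-vendor flag table. The sketch below is one possible arrangement, not the skill's implementation: `build_docker_command` is a hypothetical helper, and the per-card device nodes (which the commands above elide with `...`) must be supplied by the caller.

```python
from typing import List, Sequence

# Fixed flags per vendor, copied from the commands above.
VENDOR_FLAGS = {
    "nvidia": ["--gpus", "all"],
    "amd": ["--device=/dev/kfd", "--device=/dev/dri",
            "--group-add", "video", "--group-add", "render"],
    "ascend": ["--device=/dev/davinci_manager", "--device=/dev/devmm_svm",
               "--device=/dev/hisi_hdc",
               "-v", "/usr/local/Ascend:/usr/local/Ascend:ro",
               "-v", "/usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi:ro"],
    "metax": ["-v", "/opt/metax:/opt/metax:ro"],
    "iluvatar": ["-v", "/opt/iluvatar:/opt/iluvatar:ro"],
}


def build_docker_command(vendor: str, image: str, data_disk: str,
                         name: str = "pytorch-gpu",
                         devices: Sequence[str] = ()) -> List[str]:
    """Return the docker run argv; `devices` lists per-card nodes found on the host."""
    if vendor not in VENDOR_FLAGS:
        raise ValueError(f"unsupported vendor: {vendor}")
    cmd = ["docker", "run", "-d"]
    cmd += [f"--device={node}" for node in devices]
    cmd += VENDOR_FLAGS[vendor]
    cmd += ["--name", name, "--shm-size=16g", "-v", f"{data_disk}:/data",
            image, "sleep", "infinity"]
    return cmd


print(" ".join(build_docker_command(
    "nvidia", "nvcr.io/nvidia/pytorch:24.01-py3", "/mnt/data")))
```

Returning a list rather than a shell string means the result can be passed straight to `subprocess.run` without quoting concerns.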

Step 6: Start Container

Execute the docker run command. If a container with the same name already exists:

If it is running, offer to use the existing container or replace it
If it is stopped, offer to restart it or replace it
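
The name-collision check can be sketched with `docker container inspect`, which exits non-zero when no container has that name. Both helper names below are hypothetical, and treating every non-running state as "stopped" is a simplifying assumption.

```python
import subprocess
from typing import List, Optional


def container_state(name: str) -> Optional[str]:
    """Return Docker's state string (e.g. "running", "exited"), or None if absent."""
    result = subprocess.run(
        ["docker", "container", "inspect", "--format", "{{.State.Status}}", name],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def options_for(state: Optional[str]) -> List[str]:
    """Map a container state to the choices offered to the user."""
    if state is None:
        return ["create"]
    if state == "running":
        return ["use existing", "replace"]
    return ["restart", "replace"]  # assumption: any other state counts as stopped


print(options_for("exited"))
```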

Step 7: Validate PyTorch GPU

Copy and run validation script inside container:

docker cp .claude/skills/gpu-container-setup/scripts/validate_pytorch.py pytorch-gpu:/tmp/
docker exec pytorch-gpu python3 /tmp/validate_pytorch.py

Expected output:

{
  "status": "PASS",
  "backend": "npu",
  "device_count": 8,
  "device_names": ["Ascend 910B", ...],
  "tests": {
    "device_detection": true,
    "tensor_creation": true,
    "matrix_multiply": true,
    "gpu_to_cpu_transfer": true
  }
}
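
A report of this shape is easy to check programmatically. The sketch below assumes the validation script prints one JSON object with the `status` and `tests` keys shown above; `summarize_validation` is an illustrative helper, and the sample report is made up for the example.

```python
import json


def summarize_validation(report_text):
    """Return (passed, failed test names) for a validation report."""
    report = json.loads(report_text)
    failures = [name for name, ok in report.get("tests", {}).items() if not ok]
    passed = report.get("status") == "PASS" and not failures
    return passed, failures


# Made-up sample report for illustration only.
sample_report = json.dumps({
    "status": "FAIL",
    "backend": "npu",
    "device_count": 8,
    "tests": {
        "device_detection": True,
        "tensor_creation": True,
        "matrix_multiply": False,
        "gpu_to_cpu_transfer": True,
    },
})
print(summarize_validation(sample_report))
```

Listing the failing test names gives the "detailed error" needed when validation does not pass.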

Step 8: Report Results

Summarize to user:

GPU vendor and devices detected
Container name and image used
Data mount path
Validation status
How to access: docker exec -it pytorch-gpu bash

Error Handling

Error	Action
No GPU detected	Ask user for vendor or check drivers
Image pull fails	Try alternative registry or web search
Container start fails	Check device permissions, show error
Validation fails	Show detailed error, suggest fixes

Reference Files

references/gpu-detection.md - Detection methods by vendor
references/image-sources.md - Image discovery guide (registry APIs, priority order, selection criteria)
references/mount-requirements.md - Vendor mount specifications

Example Usage

User: /gpu-container-setup
User: setup a pytorch container
User: start container with ascend GPU
User: /gpu-container-setup --image nvcr.io/nvidia/pytorch:24.01-py3
User: /gpu-container-setup --image harbor.baai.ac.cn/flagrelease-public/ngctorch:2601

Usage Guidance

Treat this as an incomplete review: the command runner failed before metadata.json or artifact files could be inspected, so installation should wait for a successful artifact review.

Capability Assessment

ℹ Purpose & Capability

Artifact review was blocked by sandbox execution failure, so purpose and capabilities could not be confirmed from metadata.json or artifact files.

ℹ Instruction Scope

Instruction scope could not be evaluated from artifact text because local inspection commands failed before file contents could be read.

ℹ Install Mechanism

Install mechanism could not be evaluated because artifact files were not accessible through the available command runner.

ℹ Credentials

Environment access could not be assessed from evidence; no artifact-backed mismatch was available.

ℹ Persistence & Privilege

Persistence or privilege behavior could not be assessed from artifact evidence; no concrete risky behavior was observed.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gpu-container-setup-flagos
After installation, invoke the skill by name or use /gpu-container-setup-flagos
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

gpu-container-setup-flagos v1.0.0

Initial release: automates setup of PyTorch containers, auto-detecting GPU vendor and configuring appropriate images, mounts, and validations.
Supports NVIDIA, AMD/ROCm, Ascend, Metax, and Iluvatar GPUs.
Multi-step workflow: argument parsing, vendor/disk/image detection, container launch, and GPU validation.
Robust image source priority: primary vendor registry → BAAI Harbor → web search → local images → ask user.
Features self-updating mechanism to improve image sources discovered via web search.
Includes detailed error handling and vendor-specific container requirements.

Metadata

Slug gpu-container-setup-flagos

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Gpu Container Setup Flagos?

Automatically detect GPU vendor, find appropriate PyTorch container image, launch with correct mounts, and validate GPU functionality. Supports NVIDIA, Ascen... It is an AI Agent Skill for Claude Code / OpenClaw, with 74 downloads so far.

How do I install Gpu Container Setup Flagos?

Run "/install gpu-container-setup-flagos" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gpu Container Setup Flagos free?

Yes, Gpu Container Setup Flagos is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Gpu Container Setup Flagos support?

Gpu Container Setup Flagos is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gpu Container Setup Flagos?

It is built and maintained by Flagos (@wbavon); the current version is v1.0.0.
