
Vllm Plugin Fl Setup Flagos

Name: Vllm Plugin Fl Setup Flagos
Author: wbavon

By Flagos · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install vllm-plugin-fl-setup-flagos

Description

Install and configure vLLM-Plugin-FL for multiple hardware backends, including NVIDIA and Ascend. Use when setting up vllm-plugin-fl, configuring the env...

Usage (SKILL.md)

vLLM-Plugin-FL Setup

Overview

vLLM-Plugin-FL extends vLLM to support model inference/serving across diverse hardware backends (NVIDIA, Ascend, MetaX, Iluvatar, etc.) via FlagOS's unified operator library FlagGems and communication library FlagCX. This skill covers installation, hardware-specific environment configuration, and dependency setup.

Prerequisites

Linux OS (Ubuntu 20.04+ recommended)
Python 3.10+
vLLM v0.13.0 — install from the official v0.13.0 release or the fork vllm-FL
GPU with appropriate drivers (NVIDIA CUDA, Huawei Ascend, etc.)
pip package manager
Git

Verify vLLM version before proceeding:

python -c "import vllm; print(vllm.__version__)"
# Expected output: 0.13.0
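
The one-line check above can also be scripted so that a mismatch is reported explicitly. The following is a minimal sketch, assuming the package is installed under the distribution name `vllm` (the vllm-FL fork may differ); the helper names `release_tuple`, `matches_expected`, and `installed_vllm_version` are hypothetical, and build suffixes such as `+cu124` are deliberately ignored in the comparison.

```python
import re
from importlib import metadata

EXPECTED_VLLM = "0.13.0"  # version named in the prerequisites above


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release, e.g. 'v0.13.0+cu124' -> (0, 13, 0)."""
    text = version.strip()
    if text.startswith("v"):
        text = text[1:]
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def matches_expected(installed: str, expected: str = EXPECTED_VLLM) -> bool:
    """True when the numeric release parts are equal (suffixes are ignored)."""
    return release_tuple(installed) == release_tuple(expected)


def installed_vllm_version():
    """Installed vLLM version string, or None when vLLM is not installed.

    Assumption: the distribution is registered under the name 'vllm'.
    """
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


found = installed_vllm_version()
if found is None:
    print("vLLM is not installed")
elif matches_expected(found):
    print(f"vLLM {found} matches {EXPECTED_VLLM}")
else:
    print(f"vLLM {found} found, expected {EXPECTED_VLLM}")
```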

Installation Workflow

Step 1: Identify Hardware Backend

# NVIDIA GPU
nvidia-smi

# Huawei NPU
npu-smi info

# Moore Threads GPU
mthreads-gmi

# Iluvatar GPU
ixsmi
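
The manual checks above can be approximated with a small probe that looks for each management CLI on `PATH`. This is only a sketch: a tool being present does not prove that a working device or driver is attached, and the function name `detect_backends` is a hypothetical helper, not part of any FlagOS package.

```python
import shutil

# Management CLI -> backend label, taken from the commands listed above.
BACKEND_TOOLS = {
    "nvidia-smi": "NVIDIA GPU",
    "npu-smi": "Huawei NPU",
    "mthreads-gmi": "Moore Threads GPU",
    "ixsmi": "Iluvatar GPU",
}


def detect_backends(which=shutil.which) -> list:
    """Return labels for every backend whose management CLI is on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [label for tool, label in BACKEND_TOOLS.items() if which(tool)]


print(detect_backends() or "no known management CLI found on PATH")
```

If more than one label is returned, run the matching command by hand to confirm which device is actually usable.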

Step 2: Install vLLM-Plugin-FL

First create a workspace directory and try cloning the source code:

mkdir -p ~/flagos-workspace && cd ~/flagos-workspace
git clone https://github.com/flagos-ai/vllm-plugin-FL

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.
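
One way to apply the proxy settings without changing the user's shell is to pass them only to the `git clone` process. The sketch below assumes the user supplies a proxy URL; `env_with_proxy` and `clone` are hypothetical helper names, and the proxy address in the usage note is a placeholder.

```python
import os
import subprocess


def env_with_proxy(http_proxy: str, https_proxy: str = "", base=None) -> dict:
    """Copy an environment and set http_proxy / https_proxy on the copy."""
    env = dict(os.environ if base is None else base)
    env["http_proxy"] = http_proxy
    # Fall back to the HTTP proxy when no separate HTTPS proxy is given.
    env["https_proxy"] = https_proxy or http_proxy
    return env


def clone(url: str, dest: str, env: dict) -> bool:
    """Run `git clone` with the given environment; True on success."""
    result = subprocess.run(["git", "clone", url, dest], env=env)
    return result.returncode == 0
```

For example, `clone("https://github.com/flagos-ai/vllm-plugin-FL", "vllm-plugin-FL", env_with_proxy("http://proxy.example:8080"))` retries the clone through the placeholder proxy.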

Then install from the source directory:

cd vllm-plugin-FL
pip install -r requirements.txt
pip install --no-build-isolation .
# Required to enable vLLM-Plugin-FL when running vLLM
export VLLM_PLUGINS='fl'

Verify vLLM-Plugin-FL installation:

python -c "import vllm_fl; print('vllm-plugin-FL installed successfully')"

Step 3: Install FlagGems

Ascend NPU users: Before installing FlagGems, you must first install FlagTree. See references/npu.md and complete the FlagTree installation step there before proceeding. Otherwise the FlagGems verification will fail repeatedly and keep reinstalling Triton.

# Install build dependencies
pip install -U scikit-build-core==0.11 pybind11 ninja cmake

# Clone FlagGems source code
cd ~/flagos-workspace
git clone https://github.com/flagos-ai/FlagGems

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.

Then install from the source directory:

cd FlagGems
pip install --no-build-isolation .

Verify FlagGems installation:

python -c "import flag_gems; print('FlagGems installed successfully')"

Step 4: (Optional) Install FlagCX

FlagCX is a unified communication library for multi-device distributed inference, supporting both homogeneous and heterogeneous setups. Skip this step if running on a single device.

Note: Ascend NPU does not need FlagCX — skip this step for Ascend backends.

cd ~/flagos-workspace
git clone https://github.com/flagos-ai/FlagCX.git

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.

Then build and install from the source directory:

cd FlagCX

git submodule update --init --recursive

# Build for your platform (e.g. USE_NVIDIA=1 for NVIDIA)
make USE_NVIDIA=1

export FLAGCX_PATH="$PWD"

# Install Python binding (replace [xxx] with your platform: nvidia, ascend, etc.)
cd plugin/torch/
FLAGCX_ADAPTOR=[xxx] pip install --no-build-isolation .

Verify FlagCX installation:

python -c "import flagcx; print('FlagCX installed successfully')"
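
The three verification commands from Steps 2 to 4 can be combined into one report. This sketch only checks that each module can be located; it does not import them, so a module that is present but fails at import time would still show as found. `module_status` is a hypothetical helper name.

```python
import importlib.util

# Import names used by the verification commands in Steps 2 to 4.
FLAGOS_MODULES = ["vllm_fl", "flag_gems", "flagcx"]


def module_status(names, finder=importlib.util.find_spec) -> dict:
    """Map each top-level module name to True if it can be located."""
    status = {}
    for name in names:
        try:
            status[name] = finder(name) is not None
        except (ImportError, ValueError):
            status[name] = False
    return status


for name, ok in module_status(FLAGOS_MODULES).items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `flagcx` is expected on single-device and Ascend setups, where Step 4 is skipped.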

Step 5: Backend-Specific Setup

Some hardware backends require additional setup. See the corresponding reference document:

Backend	Chip Vendor	Reference
Ascend NPU	Huawei	references/npu.md
MetaX GPU	MetaX	TBD
Iluvatar GPU (BI-V150)	Iluvatar	references/iluvatar_gpu.md
Pingtouge-Zhenwu	Pingtouge	TBD
Tsingmicro	Tsingmicro	TBD
Moore Threads GPU	Moore Threads	references/mthreads_gpu.md
Hygon DCU	Hygon	TBD
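
For scripted setups, the table above can be expressed as a lookup. This is an illustrative sketch: the dictionary mirrors the table rows exactly, `None` stands for the rows marked TBD, and `reference_for` is a hypothetical helper name.

```python
# Backend -> reference document, copied from the table above.
# None marks rows the table lists as TBD.
BACKEND_REFERENCES = {
    "Ascend NPU": "references/npu.md",
    "MetaX GPU": None,
    "Iluvatar GPU (BI-V150)": "references/iluvatar_gpu.md",
    "Pingtouge-Zhenwu": None,
    "Tsingmicro": None,
    "Moore Threads GPU": "references/mthreads_gpu.md",
    "Hygon DCU": None,
}


def reference_for(backend: str):
    """Return the reference path for a backend, or None if it is TBD."""
    if backend not in BACKEND_REFERENCES:
        raise KeyError(f"backend not listed in the table: {backend!r}")
    return BACKEND_REFERENCES[backend]


print(reference_for("Ascend NPU"))
```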

Quick Test

Ask the user for the model name they want to test (e.g. Qwen3-4B, DeepSeek-R1).

Search the machine for a local copy of that model:

find / -maxdepth 5 -type d -name "<user_provided_model_name>" 2>/dev/null

If found, use the discovered path. If not found, tell the user and ask them to provide a different model name or a full local path, then repeat the search. If after 3 attempts no valid model is found, skip the quick test and inform the user to prepare a model before retrying.
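
The search-and-retry procedure above can be sketched in Python as follows. It is an approximation of the `find` command rather than a replacement: `find_model_dirs` and `resolve_model` are hypothetical helper names, and the candidate names or paths are assumed to come from the user one attempt at a time.

```python
import os


def find_model_dirs(root: str, name: str, max_depth: int = 5) -> list:
    """Directories called `name` at most `max_depth` levels below `root`."""
    root = os.path.abspath(root)
    matches = []
    # os.walk skips unreadable directories silently, like `2>/dev/null`.
    for dirpath, dirnames, _files in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        for child in dirnames:
            path = os.path.join(dirpath, child)
            if child == name and not os.path.islink(path):
                matches.append(path)
        if depth + 1 >= max_depth:
            dirnames[:] = []  # children sit at max_depth: do not descend
    return sorted(matches)


def resolve_model(candidates, root: str = "/", max_attempts: int = 3):
    """Try up to `max_attempts` names or paths; return a directory or None."""
    for attempt, candidate in enumerate(candidates):
        if attempt >= max_attempts:
            break
        if os.path.isdir(candidate):  # the user gave an existing local path
            return os.path.abspath(candidate)
        found = find_model_dirs(root, candidate)
        if found:
            return found[0]
    return None
```

A return value of `None` corresponds to the case where the quick test is skipped and the user is asked to prepare a model first.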

Ensure the FL plugin is enabled before running inference:

export VLLM_PLUGINS='fl'

For Moore Threads GPU, also set:

export USE_FLAGGEMS=1
export FLAGCX_PATH=/workspace/FlagCX  # MUST point to the actual FlagCX installation directory; this is only an example
export VLLM_MUSA_ENABLE_MOE_TRITON=1
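
When the quick test is launched from a Python script instead of a shell, the same variables can be set programmatically. This sketch assumes they should be in place before `vllm` is imported; `moore_threads_env` is a hypothetical helper name, and the FlagCX path must be the real installation directory.

```python
import os


def moore_threads_env(flagcx_path: str) -> dict:
    """Environment variables listed above for a Moore Threads GPU run."""
    if not os.path.isdir(flagcx_path):
        raise FileNotFoundError(f"FLAGCX_PATH is not a directory: {flagcx_path}")
    return {
        "VLLM_PLUGINS": "fl",
        "USE_FLAGGEMS": "1",
        "FLAGCX_PATH": os.path.abspath(flagcx_path),
        "VLLM_MUSA_ENABLE_MOE_TRITON": "1",
    }


# Example (path is illustrative only), to run before importing vllm:
# os.environ.update(moore_threads_env("/workspace/FlagCX"))
```

Failing early on a missing directory makes a wrong `FLAGCX_PATH` visible before any model is loaded.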

Once a valid model path is resolved, run offline batched inference to verify the full stack:

from vllm import LLM, SamplingParams

model_path = "<resolved_model_path>"
prompts = [
    "Hello, my name is",
]
sampling_params = SamplingParams(max_tokens=10, temperature=0.0)

# For Moore Threads GPU, add: enforce_eager=True, block_size=64, attention_config={"backend": "TORCH_SDPA"}
# For Iluvatar BI-V150, add: enforce_eager=True
llm = LLM(model=model_path, max_num_batched_tokens=16384, max_num_seqs=2048)
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
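
The backend-specific arguments mentioned in the comments above can be kept in one place and merged into the `LLM(...)` call. This is a sketch under the assumption that the arguments are passed as keyword arguments exactly as the comments describe; the backend keys `moore_threads` and `iluvatar_bi_v150` and the function `llm_kwargs` are hypothetical names chosen for illustration.

```python
import copy

# Arguments used by the quick test above for every backend.
BASE_KWARGS = {"max_num_batched_tokens": 16384, "max_num_seqs": 2048}

# Extra arguments from the comments in the quick test.
BACKEND_KWARGS = {
    "moore_threads": {
        "enforce_eager": True,
        "block_size": 64,
        "attention_config": {"backend": "TORCH_SDPA"},
    },
    "iluvatar_bi_v150": {"enforce_eager": True},
}


def llm_kwargs(model_path: str, backend: str = "") -> dict:
    """Build keyword arguments for vllm.LLM for the given backend key."""
    kwargs = {"model": model_path, **BASE_KWARGS}
    # Deep copy so callers cannot mutate the shared templates.
    kwargs.update(copy.deepcopy(BACKEND_KWARGS.get(backend, {})))
    return kwargs


print(llm_kwargs("<resolved_model_path>", "iluvatar_bi_v150"))
```

The result can then be used as `LLM(**llm_kwargs(model_path, backend))`.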

Troubleshooting

Out of memory on model load: Use the gpu_memory_utilization parameter to limit memory. Start with 0.8 and adjust:

from vllm import LLM
llm = LLM(model="...", gpu_memory_utilization=0.8)

FlagGems build failures: Ensure build dependencies are installed (scikit-build-core, pybind11, ninja, cmake). Check that your compiler supports C++17.
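
A quick way to see which of the build dependencies named above are missing from the current Python environment is sketched below. It only inspects pip-installed distributions, so a `cmake` or `ninja` installed system-wide would be reported as missing even though the build may still find it, and it does not check C++17 support. `installed_version` and `missing_build_deps` are hypothetical helper names.

```python
from importlib import metadata

# Distribution names from the pip install command in Step 3.
BUILD_DEPS = ["scikit-build-core", "pybind11", "ninja", "cmake"]


def installed_version(dist: str):
    """Version of a pip-installed distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def missing_build_deps(lookup=installed_version) -> list:
    """Names from BUILD_DEPS that `lookup` cannot find."""
    return [dist for dist in BUILD_DEPS if lookup(dist) is None]


print("missing:", missing_build_deps() or "none")
```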

Plugin not loaded: If vLLM does not use the FL plugin, verify that VLLM_PLUGINS='fl' is set in your environment.
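
The environment check can be made explicit with a short helper. This sketch assumes `VLLM_PLUGINS` holds a comma-separated list of plugin names; `fl_plugin_state` is a hypothetical helper name, and it only inspects the variable, not whether vLLM actually loaded the plugin.

```python
import os


def fl_plugin_state(env=None) -> str:
    """Return 'unset', 'listed' or 'not listed' for the 'fl' plugin."""
    env = os.environ if env is None else env
    raw = env.get("VLLM_PLUGINS")
    if raw is None:
        return "unset"
    # Assumption: the value is a comma-separated list of plugin names.
    names = [item.strip() for item in raw.split(",") if item.strip()]
    return "listed" if "fl" in names else "not listed"


print("VLLM_PLUGINS state for 'fl':", fl_plugin_state())
```

Anything other than `listed` suggests re-running `export VLLM_PLUGINS='fl'` in the shell that launches vLLM.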

FlagCX communication errors: Ensure FLAGCX_PATH is correctly set and the library was built for your platform. For NVIDIA, that means building with make USE_NVIDIA=1.

Ascend-specific issues: See references/npu.md for Ascend NPU troubleshooting, including FlagTree setup and eager execution requirements.

Cannot connect to GitHub: Ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the git clone command.

References

vLLM-Plugin-FL GitHub
FlagGems GitHub
FlagCX GitHub
For non-NVIDIA chips, refer to the references directory for hardware-specific configurations and setup instructions

Security Recommendations

Treat this as an incomplete low-confidence review, not a clearance. Re-run ClawScan in an environment where metadata.json and artifact/ can be read before installing or publishing this skill.

Capability Tags

crypto
can-make-purchases

Capability Assessment

ℹ Purpose & Capability

The requested metadata.json and artifact/ contents could not be read, so purpose and capability coherence could not be confirmed from artifacts.

ℹ Instruction Scope

Runtime instructions could not be inspected; no instruction-scope concern is supported by artifact evidence available to this review.

ℹ Install Mechanism

Install specifications could not be inspected; no install-mechanism concern is supported by artifact evidence available to this review.

ℹ Credentials

Environment access and proportionality could not be assessed from artifacts because local file reads failed before shell execution.

ℹ Persistence & Privilege

Persistence or privilege behavior could not be confirmed; no artifact-backed persistence or privilege concern was available.

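How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install vllm-plugin-fl-setup-flagos
After installation, invoke the Skill by name or trigger it with /vllm-plugin-fl-setup-flagos
Provide the inputs described in the Skill's parameter notes to get structured output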

Version History

v1.0.0

Initial release of vllm-plugin-fl-setup-flagos.
- Provides guided installation and configuration of vLLM-Plugin-FL, FlagGems, and FlagCX for multiple hardware backends (NVIDIA, Ascend, Moore Threads, etc.).
- Suggests specific backend workflows and highlights situations such as network proxy setup and Ascend-specific requirements.
- Includes troubleshooting for installation, build errors, environment variables, and backend-specific issues.
- Offers quick model inference test steps to verify successful setup.
- Lists reference documents for further backend and troubleshooting guidance.

Metadata

Slug vllm-plugin-fl-setup-flagos

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

FAQ