Vllm Plugin Fl Setup Flagos

Author: Flagos (@wbavon) · GitHub · v1.0.0 · MIT-0
cross-platform · ✓ Security check passed
Total downloads: 69 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install vllm-plugin-fl-setup-flagos
Description
Install and configure vLLM-Plugin-FL for multiple hardware backends including NVIDIA, Ascend, etc. Use when setting up vllm-plugin-fl, configuring the env...
Usage Instructions (SKILL.md)

vLLM-Plugin-FL Setup

Overview

vLLM-Plugin-FL extends vLLM to support model inference/serving across diverse hardware backends (NVIDIA, Ascend, MetaX, Iluvatar, etc.) via FlagOS's unified operator library FlagGems and communication library FlagCX. This skill covers installation, hardware-specific environment configuration, and dependency setup.

Prerequisites

  • Linux OS (Ubuntu 20.04+ recommended)
  • Python 3.10+
  • vLLM v0.13.0 — install from the official v0.13.0 release or the fork vllm-FL
  • GPU with appropriate drivers (NVIDIA CUDA, Huawei Ascend, etc.)
  • pip package manager
  • Git

Verify vLLM version before proceeding:

python -c "import vllm; print(vllm.__version__)"
# Expected output: 0.13.0
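The same check can be scripted so that it reports a mismatch instead of only printing the version. This is a minimal sketch: the helper names are illustrative, and it assumes that a local build suffix such as `+cu121` should be ignored when comparing against 0.13.0.

```python
from importlib import metadata
from typing import Optional

EXPECTED_VLLM = "0.13.0"  # version named in the prerequisites


def release_part(version: str) -> str:
    """Drop a local build suffix such as '+cu121' (assumption: it can be ignored)."""
    return version.split("+", 1)[0]


def is_expected_version(installed: str, expected: str = EXPECTED_VLLM) -> bool:
    """Compare the release part of the installed version with the expected one."""
    return release_part(installed) == expected


def installed_vllm_version() -> Optional[str]:
    """Return the installed vLLM version, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vLLM is not installed")
    elif is_expected_version(found):
        print(f"vLLM {found} matches {EXPECTED_VLLM}")
    else:
        print(f"vLLM {found} found, expected {EXPECTED_VLLM}")
```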

Installation Workflow

Step 1: Identify Hardware Backend

# NVIDIA GPU
nvidia-smi

# Huawei NPU
npu-smi info

# Moore Threads GPU
mthreads-gmi

# Iluvatar GPU
ixsmi
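If the backend should be detected from a script rather than by hand, one option is to look for the management tools above on PATH. The sketch below uses only the four tool names listed in this step; the backend labels and the function name are illustrative. A tool being on PATH does not prove that a working device is attached, so it is still worth running the matching command.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Management CLIs listed in Step 1, keyed by an illustrative backend label.
BACKEND_TOOLS: Dict[str, str] = {
    "nvidia": "nvidia-smi",
    "ascend": "npu-smi",
    "mthreads": "mthreads-gmi",
    "iluvatar": "ixsmi",
}


def detect_backends(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return labels of backends whose management CLI is found on PATH."""
    return [name for name, tool in BACKEND_TOOLS.items() if which(tool)]


if __name__ == "__main__":
    found = detect_backends()
    print("Detected backends:", ", ".join(found) if found else "none")
```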

Step 2: Install vLLM-Plugin-FL

First create a workspace directory and try cloning the source code:

mkdir -p ~/flagos-workspace && cd ~/flagos-workspace
git clone https://github.com/flagos-ai/vllm-plugin-FL

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.
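The proxy retry can be sketched as follows. The function names are illustrative and the proxy URL must come from the user. The sketch relies on git honoring the `http_proxy` and `https_proxy` environment variables, and it sets them only for the clone process.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def proxy_env(proxy_url: str, base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and point http_proxy/https_proxy at proxy_url."""
    env = dict(os.environ if base is None else base)
    env["http_proxy"] = proxy_url
    env["https_proxy"] = proxy_url
    return env


def clone_command(repo_url: str) -> List[str]:
    """Build the git clone command as an argument list (no shell involved)."""
    return ["git", "clone", repo_url]


def clone_with_proxy(repo_url: str, proxy_url: str, cwd: str) -> bool:
    """Retry a clone through the proxy; True when git exits with status 0."""
    result = subprocess.run(clone_command(repo_url), cwd=cwd, env=proxy_env(proxy_url))
    return result.returncode == 0
```

A call would look like `clone_with_proxy("https://github.com/flagos-ai/vllm-plugin-FL", user_proxy, workspace)`, where `user_proxy` and `workspace` are placeholders for the proxy URL the user supplied and the workspace directory.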

Then install from the source directory:

cd vllm-plugin-FL
pip install -r requirements.txt
pip install --no-build-isolation .
# Required to enable vLLM-Plugin-FL when running vLLM
export VLLM_PLUGINS='fl'

Verify vLLM-Plugin-FL installation:

python -c "import vllm_fl; print('vllm-plugin-FL installed successfully')"

Step 3: Install FlagGems

Ascend NPU users: Install FlagTree before installing FlagGems. See references/npu.md and complete the FlagTree installation step there first. Otherwise, the FlagGems verification will fail repeatedly and keep reinstalling Triton.

# Install build dependencies
pip install -U scikit-build-core==0.11 pybind11 ninja cmake

# Clone FlagGems source code
cd ~/flagos-workspace
git clone https://github.com/flagos-ai/FlagGems

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.

Then install from the source directory:

cd FlagGems
pip install --no-build-isolation .

Verify FlagGems installation:

python -c "import flag_gems; print('FlagGems installed successfully')"

Step 4: (Optional) Install FlagCX

FlagCX is a unified communication library for multi-device distributed inference, supporting both homogeneous and heterogeneous setups. Skip this step if running on a single device.

Note: Ascend NPU does not need FlagCX — skip this step for Ascend backends.

cd ~/flagos-workspace
git clone https://github.com/flagos-ai/FlagCX.git

If git clone fails due to network issues, ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the clone.

Then build and install from the source directory:

cd FlagCX

git submodule update --init --recursive

# Build for your platform (e.g. USE_NVIDIA=1 for NVIDIA)
make USE_NVIDIA=1

export FLAGCX_PATH="$PWD"

# Install Python binding (replace [xxx] with your platform: nvidia, ascend, etc.)
cd plugin/torch/
FLAGCX_ADAPTOR=[xxx] pip install --no-build-isolation .

Verify FlagCX installation:

python -c "import flagcx; print('FlagCX installed successfully')"
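The three verification commands in Steps 2 to 4 can be combined into one report. This sketch only checks whether Python can locate each module, using the import names from those commands. It does not import the modules, so it will not catch errors that occur at import time. The function name is illustrative.

```python
import importlib.util
from typing import Dict, Iterable

# Import names used by the verification commands in Steps 2 to 4.
FLAGOS_MODULES = ("vllm_fl", "flag_gems", "flagcx")


def find_modules(names: Iterable[str] = FLAGOS_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for module, present in find_modules().items():
        print(f"{module}: {'found' if present else 'missing'}")
```

A missing `flagcx` is expected on single-device and Ascend setups, where Step 4 is skipped.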

Step 5: Backend-Specific Setup

Some hardware backends require additional setup. See the corresponding reference document:

Backend                  Chip Vendor     Reference
Ascend NPU               Huawei          references/npu.md
MetaX GPU                MetaX           TBD
Iluvatar GPU (BI-V150)   Iluvatar        references/iluvatar_gpu.md
Pingtouge-Zhenwu         Pingtouge       TBD
Tsingmicro               Tsingmicro      TBD
Moore Threads GPU        Moore Threads   references/mthreads_gpu.md
Hygon DCU                Hygon           TBD

Quick Test

  1. Ask the user for the model name they want to test (e.g. Qwen3-4B, DeepSeek-R1).
  2. Search the machine for a local copy of that model:
    find / -maxdepth 5 -type d -name "<user_provided_model_name>" 2>/dev/null
    
  3. If found, use the discovered path. If not found, tell the user and ask them to provide a different model name or a full local path, then repeat the search. If after 3 attempts no valid model is found, skip the quick test and inform the user to prepare a model before retrying.
  4. Ensure the FL plugin is enabled before running inference:
    export VLLM_PLUGINS='fl'
    
    For Moore Threads GPU, also set:
    export USE_FLAGGEMS=1
    export FLAGCX_PATH=/workspace/FlagCX  # MUST point to the actual FlagCX installation directory; this is only an example
    export VLLM_MUSA_ENABLE_MOE_TRITON=1
    
  5. Once a valid model path is resolved, run offline batched inference to verify the full stack:
from vllm import LLM, SamplingParams

model_path = "<resolved_model_path>"
prompts = [
    "Hello, my name is",
]
sampling_params = SamplingParams(max_tokens=10, temperature=0.0)

# For Moore Threads GPU, add: enforce_eager=True, block_size=64, attention_config={"backend": "TORCH_SDPA"}
# For Iluvatar BI-V150, add: enforce_eager=True
llm = LLM(model=model_path, max_num_batched_tokens=16384, max_num_seqs=2048)
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
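The model lookup in steps 1 to 3 above (search up to five levels deep, and give up after three attempts) could be sketched in Python as below. The function names and the `ask` callback are illustrative. Unlike `find -name`, which accepts glob patterns, this sketch matches the directory name exactly.

```python
import os
from typing import Callable, List, Optional


def find_model_dirs(root: str, name: str, max_depth: int = 5) -> List[str]:
    """Return directories called `name` at most `max_depth` levels below `root`."""
    root = os.path.abspath(root)
    matches: List[str] = []
    for current, dirs, _files in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirs[:] = []  # children would be deeper than max_depth, so stop here
            continue
        matches.extend(os.path.join(current, d) for d in dirs if d == name)
    return matches


def resolve_model_path(
    ask: Callable[[], str], root: str = "/", attempts: int = 3
) -> Optional[str]:
    """Ask for a model name or full path up to `attempts` times.

    Returns the resolved directory, or None when the quick test should be skipped.
    """
    for _ in range(attempts):
        answer = ask().strip()
        if os.path.isabs(answer) and os.path.isdir(answer):
            return answer  # the user supplied a full local path
        found = find_model_dirs(root, answer)
        if found:
            return found[0]
    return None
```

In an interactive session, `ask` could be something like `lambda: input("Model name or path: ")`. Unreadable directories are skipped silently, which mirrors the `2>/dev/null` in the `find` command.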

Troubleshooting

Out of memory on model load: Use the gpu_memory_utilization parameter to limit memory use. Start with 0.8 and adjust:

from vllm import LLM
llm = LLM(model="...", gpu_memory_utilization=0.8)

FlagGems build failures: Ensure build dependencies are installed (scikit-build-core, pybind11, ninja, cmake). Check that your compiler supports C++17.

Plugin not loaded: If vLLM does not use the FL plugin, verify that VLLM_PLUGINS='fl' is set in your environment.
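A quick way to inspect the environment from Python is sketched below. It assumes that `VLLM_PLUGINS` is a comma-separated list of plugin names, and it takes the Moore Threads variables from the Quick Test section. The function names are illustrative.

```python
import os
from typing import List, Mapping, Optional

# Extra variables the Quick Test section lists for Moore Threads GPUs.
MTHREADS_VARS = ("USE_FLAGGEMS", "FLAGCX_PATH", "VLLM_MUSA_ENABLE_MOE_TRITON")


def fl_plugin_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when 'fl' appears in the comma-separated VLLM_PLUGINS value."""
    env = os.environ if env is None else env
    names = [part.strip() for part in env.get("VLLM_PLUGINS", "").split(",")]
    return "fl" in names


def missing_mthreads_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the Moore Threads variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in MTHREADS_VARS if not env.get(name)]


if __name__ == "__main__":
    print("VLLM_PLUGINS includes 'fl':", fl_plugin_enabled())
```

Run the check in the same shell or process that launches vLLM, since an `export` in one shell session does not carry over to another.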

FlagCX communication errors: Ensure FLAGCX_PATH is correctly set and the library was built for your platform. For NVIDIA, rebuild with make USE_NVIDIA=1.
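As a first diagnostic, the `FLAGCX_PATH` setting can be checked with a short script. This sketch only confirms that the variable is set and points to an existing directory. It does not verify that the library inside was built for the right platform. The function name is illustrative.

```python
import os
from typing import Mapping, Optional


def flagcx_path_problem(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Describe what is wrong with FLAGCX_PATH, or return None if it looks usable."""
    env = os.environ if env is None else env
    path = env.get("FLAGCX_PATH", "")
    if not path:
        return "FLAGCX_PATH is not set"
    if not os.path.isdir(path):
        return f"FLAGCX_PATH points to a missing directory: {path}"
    return None


if __name__ == "__main__":
    problem = flagcx_path_problem()
    print(problem or "FLAGCX_PATH points to an existing directory")
```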

Ascend-specific issues: See references/npu.md for Ascend NPU troubleshooting, including FlagTree setup and eager execution requirements.

Cannot connect to GitHub: Ask the user for their network proxy settings (e.g. http_proxy / https_proxy), configure the proxy, then retry the git clone command.

References

Security Recommendations
Treat this as an incomplete low-confidence review, not a clearance. Re-run ClawScan in an environment where metadata.json and artifact/ can be read before installing or publishing this skill.
Capability Tags
cryptocan-make-purchases
Capability Assessment
Purpose & Capability
The requested metadata.json and artifact/ contents could not be read, so purpose and capability coherence could not be confirmed from artifacts.
Instruction Scope
Runtime instructions could not be inspected; no instruction-scope concern is supported by artifact evidence available to this review.
Install Mechanism
Install specifications could not be inspected; no install-mechanism concern is supported by artifact evidence available to this review.
Credentials
Environment access and proportionality could not be assessed from artifacts because local file reads failed before shell execution.
Persistence & Privilege
Persistence or privilege behavior could not be confirmed; no artifact-backed persistence or privilege concern was available.
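How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install vllm-plugin-fl-setup-flagos
  3. Once installed, invoke the Skill by name or trigger it with /vllm-plugin-fl-setup-flagos
  4. Provide the inputs described in the Skill's parameter notes to get structured output.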
Version History
v1.0.0
Initial release of vllm-plugin-fl-setup-flagos.
  • Provides guided installation and configuration of vLLM-Plugin-FL, FlagGems, and FlagCX for multiple hardware backends (NVIDIA, Ascend, Moore Threads, etc.).
  • Suggests specific backend workflows and highlights situations such as network proxy setup and Ascend-specific requirements.
  • Includes troubleshooting for installation, build errors, environment variables, and backend-specific issues.
  • Offers quick model inference test steps to verify successful setup.
  • Lists reference documents for further backend and troubleshooting guidance.
Metadata
Slug vllm-plugin-fl-setup-flagos
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
FAQ

What is Vllm Plugin Fl Setup Flagos?

Install and configure vLLM-Plugin-FL for multiple hardware backends including NVIDIA, Ascend, etc. Use when setting up vllm-plugin-fl, configuring the env... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 69 total downloads so far.

How do I install Vllm Plugin Fl Setup Flagos?

Run the command "/install vllm-plugin-fl-setup-flagos" in the OpenClaw or Claude Code chat box to install it in one step, with no extra configuration.

Is Vllm Plugin Fl Setup Flagos free?

Yes. Vllm Plugin Fl Setup Flagos is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Vllm Plugin Fl Setup Flagos support?

Vllm Plugin Fl Setup Flagos is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Vllm Plugin Fl Setup Flagos?

It is developed and maintained by Flagos (@wbavon). The current version is v1.0.0.
