
Gpu Deploy

Name: Gpu Deploy
Author: wang-junjian

by 军舰 · v0.1.0

cross-platform ⚠ suspicious

Downloads: 512

Install in OpenClaw

/install gpu-deploy

Description

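Deploy vLLM model services on GPU servers. Supports multiple server configurations, automatically checks GPU status and port usage, and deploys popular open-source models with a single command.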

README (SKILL.md)

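🚀 GPU Deploy Skill

Quickly deploy vLLM model services on GPU servers.

✨ Features

🖥️ Multi-server support - configure several GPU servers and choose between them
🔍 Automatic checks - check GPU status and port usage with one command
🤖 Model library - preset configurations for popular models
⚡ Fast deployment - start a service with a simple command

📋 Quick Start

1. Configure servers

Create ~/.config/gpu-deploy/servers.json: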

{
  "servers": {
    "gpu1": {
      "host": "gpu1",
      "user": "lnsoft",
      "gpu_count": 4,
      "model_path": "/data/models/llm"
    },
    "my-gpu": {
      "host": "192.168.1.100",
      "user": "ubuntu",
      "gpu_count": 2,
      "model_path": "/home/ubuntu/models"
    }
  },
  "default_server": "gpu1"
}
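The `gpu-deploy` script is not included with this skill, so how it reads this file is unknown. As a minimal sketch, assuming the layout shown above, a server entry could be resolved as follows. The function names `load_config` and `resolve_server` and the derived `ssh_target` field are hypothetical, not part of the skill.

```python
import json
from pathlib import Path
from typing import Optional

# Path taken from the README; everything else here is an illustrative assumption.
CONFIG_PATH = Path.home() / ".config" / "gpu-deploy" / "servers.json"

# Same content as the servers.json example above.
SAMPLE = {
    "servers": {
        "gpu1": {"host": "gpu1", "user": "lnsoft", "gpu_count": 4,
                 "model_path": "/data/models/llm"},
        "my-gpu": {"host": "192.168.1.100", "user": "ubuntu", "gpu_count": 2,
                   "model_path": "/home/ubuntu/models"},
    },
    "default_server": "gpu1",
}


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Read servers.json from disk."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def resolve_server(config: dict, name: Optional[str] = None) -> dict:
    """Return the named server entry, falling back to default_server."""
    servers = config.get("servers", {})
    key = name or config.get("default_server")
    if key not in servers:
        raise KeyError(f"unknown server {key!r}; known: {sorted(servers)}")
    entry = dict(servers[key])  # copy so the loaded config is not mutated
    entry["ssh_target"] = f"{entry['user']}@{entry['host']}"
    return entry


print(resolve_server(SAMPLE)["ssh_target"])
print(resolve_server(SAMPLE, "my-gpu")["ssh_target"])
```

Failing loudly on an unknown server name keeps a typo in `--server` from silently falling back to a different host.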

2. Check server status

# Use the default server
gpu-deploy check

# Specify a server
gpu-deploy check --server gpu1

3. Deploy a model

# Deploy a preset model
gpu-deploy deploy deepseek-r1-32b

# Specify a port
gpu-deploy deploy deepseek-r1-32b --port 8112

🎛️ Available commands

`check` - check server status

Checks GPU memory and port usage.

gpu-deploy check [--server NAME] [--port PORT]

Example output:

✅ GPU status OK
- 4 × Tesla T4 (15GB)
- Memory used: 12.6GB per card
- Temperature: 51-55°C

✅ Port 8111 is available
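How `check` gathers these numbers is not documented. One plausible approach, sketched here as an assumption, is to run `nvidia-smi` in CSV query mode over SSH and parse the result. The query string uses standard `nvidia-smi` options; the function names and the sample text are made up for illustration and are not real output from the author's servers.

```python
import csv
import io

# Standard nvidia-smi query options; running this over ssh is left to the caller.
QUERY = ("nvidia-smi --query-gpu=name,memory.used,memory.total,temperature.gpu "
         "--format=csv,noheader,nounits")

# Illustrative text in the format the query above produces (values are invented).
SAMPLE_OUTPUT = """\
Tesla T4, 12902, 15360, 51
Tesla T4, 12902, 15360, 55
"""


def parse_gpu_csv(text: str) -> list:
    """Turn CSV rows of name, used MiB, total MiB, temperature into dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        name, used, total, temp = (field.strip() for field in row)
        gpus.append({"name": name, "used_mib": int(used),
                     "total_mib": int(total), "temp_c": int(temp)})
    return gpus


def free_mib(gpus: list) -> list:
    """Free memory per GPU in MiB."""
    return [g["total_mib"] - g["used_mib"] for g in gpus]


gpus = parse_gpu_csv(SAMPLE_OUTPUT)
print(len(gpus), "GPUs, free MiB per card:", free_mib(gpus))
```

Requesting `noheader,nounits` keeps the parser to plain integers, which is less fragile than scraping the default `nvidia-smi` table.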

`deploy` - deploy a model

Starts a vLLM model service.

gpu-deploy deploy <MODEL_NAME> [--server NAME] [--port PORT]

Supported models:

deepseek-r1-32b - DeepSeek-R1-Distill-Qwen-32B-AWQ
llama-3-8b - Llama 3 8B
qwen-7b - Qwen 7B
mistral-7b - Mistral 7B

`list` - list available models

gpu-deploy list

`ps` - show running services

gpu-deploy ps [--server NAME]

`stop` - stop a service

gpu-deploy stop [--server NAME] [--port PORT]

🔧 Manual usage (no script)

If you prefer not to use the wrapper script, you can run the underlying commands directly:

Check GPUs

ssh <user>@<host> nvidia-smi

Check a port

ssh <user>@<host> "lsof -i :<port> 2>/dev/null || echo 'port available'"
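The `lsof` check above reports the port as free whenever `lsof` exits non-zero, which also happens if `lsof` is missing or cannot see another user's processes. As a complementary check, and not something the skill itself does, the client can test whether anything accepts TCP connections on the port. This is a sketch using only the Python standard library; the function name is hypothetical, and a firewall between client and server can make an occupied port look closed.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_accepts_connections("192.168.1.100", 8111)` returning True means a service is already listening on that port, so a different `--port` should be chosen.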

Deploy a model (DeepSeek R1 32B)

ssh <user>@<host> "tmux new-session -d -s vllm '
source /data/miniconda3/etc/profile.d/conda.sh && \
conda activate vllm && \
cd /data/models/llm && \
vllm serve /data/models/llm/deepseek/DeepSeek-R1-Distill-Qwen-32B-AWQ/ \
  --tensor-parallel-size 4 \
  --max-model-len 102400 \
  --dtype half \
  --port 8111 \
  --served-model-name gpt-4o-mini
'"
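The command above nests three layers of quoting: the local shell, the remote shell started by `ssh`, and the shell that `tmux` uses to run its command. A script that builds this string by hand can easily get a layer wrong. The following is a sketch, under the assumption that the wrapper is written in Python, of composing the same layers with `shlex`; `wrap_for_ssh` is a hypothetical helper and the shortened flag list is only for readability.

```python
import shlex


def wrap_for_ssh(target, vllm_argv, conda_sh, env_name, session="vllm"):
    """Build an argv list for subprocess.run that starts vllm in a detached tmux session."""
    # Innermost layer: the command line that tmux hands to its shell.
    inner = " && ".join([
        f"source {shlex.quote(conda_sh)}",
        f"conda activate {shlex.quote(env_name)}",
        shlex.join(vllm_argv),
    ])
    # Middle layer: the string the remote shell parses; inner becomes one argument.
    remote = f"tmux new-session -d -s {shlex.quote(session)} {shlex.quote(inner)}"
    # Outer layer: a list, so no local shell is involved at all.
    return ["ssh", target, remote]


argv = wrap_for_ssh(
    "lnsoft@gpu1",
    ["vllm", "serve", "/data/models/llm/deepseek/DeepSeek-R1-Distill-Qwen-32B-AWQ/",
     "--tensor-parallel-size", "4", "--port", "8111"],
    conda_sh="/data/miniconda3/etc/profile.d/conda.sh",
    env_name="vllm",
)
print(argv[2])
```

Passing the resulting list to `subprocess.run(argv, check=True)` avoids `shell=True` locally, so only the remote side ever interprets the string.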

📦 Adding a custom model

Add an entry to ~/.config/gpu-deploy/models.json:

{
  "my-model": {
    "name": "My Awesome Model",
    "path": "/path/to/model",
    "tensor_parallel_size": 2,
    "max_model_len": 8192,
    "dtype": "half",
    "port": 8111,
    "served_model_name": "my-model"
  }
}
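The keys in this entry line up with the `vllm serve` flags used in the manual deployment command. How the missing `gpu-deploy` script maps them is not documented; the sketch below shows one straightforward mapping as an assumption. `FLAG_FOR_KEY` and `vllm_args` are hypothetical names, and treating every key except `path` as optional is also an assumption.

```python
# Config key -> vllm serve flag, matching the flags in the manual command.
FLAG_FOR_KEY = {
    "tensor_parallel_size": "--tensor-parallel-size",
    "max_model_len": "--max-model-len",
    "dtype": "--dtype",
    "port": "--port",
    "served_model_name": "--served-model-name",
}

# The "my-model" entry from the models.json example.
ENTRY = {
    "name": "My Awesome Model",
    "path": "/path/to/model",
    "tensor_parallel_size": 2,
    "max_model_len": 8192,
    "dtype": "half",
    "port": 8111,
    "served_model_name": "my-model",
}


def vllm_args(entry: dict) -> list:
    """Translate one models.json entry into an argument list for vllm serve."""
    if "path" not in entry:
        raise ValueError("model entry needs a 'path'")
    args = ["vllm", "serve", entry["path"]]
    for key, flag in FLAG_FOR_KEY.items():
        if key in entry:  # keys that are absent are simply left to vLLM's defaults
            args += [flag, str(entry[key])]
    return args


print(" ".join(vllm_args(ENTRY)))
```

The display-only `name` field is deliberately ignored, since it has no counterpart among the flags.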

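⚠️ Notes

Check before deploying - always run check first to confirm resources are available
Run in the background - use tmux or screen to keep the service running
Port management - use a different port for each model
VRAM estimate - a 7B model needs roughly 8-10GB and a 32B model roughly 10-14GB per card (rough figures; actual usage depends on quantization, context length, and tensor-parallel size)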
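For a rough sanity check on VRAM, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter, divided across tensor-parallel ranks. This sketch is not the author's method and does not reproduce the figures above, which also reflect KV cache, activations, and runtime overhead; quantized formats such as AWQ carry extra scale data that is ignored here.

```python
def weight_gib(params_billion: float, bits_per_param: int, tensor_parallel: int = 1) -> float:
    """Approximate weight memory per GPU in GiB (weights only, a lower bound)."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / tensor_parallel / 2**30


# 7B parameters in 16-bit precision on a single GPU.
print(f"7B fp16, 1 GPU:    {weight_gib(7, 16):.2f} GiB")
# 32B parameters at 4 bits, split over 4 GPUs as in the deployment example.
print(f"32B 4-bit, 4 GPUs: {weight_gib(32, 4, 4):.2f} GiB per card")
```

Whatever remains on each card after the weights is what vLLM can use for the KV cache, which is why a large `--max-model-len` may not fit even when the weights do.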

🔗 Related links

vLLM documentation: https://docs.vllm.ai
Model downloads: https://huggingface.co/models
Issue tracker: https://github.com/your-username/gpu-deploy-skill

Contributed by the OpenClaw community 🦞

Usage Guidance

This skill appears to be what it says: a set of instructions for deploying vLLM via SSH. Before using it, verify the following: (1) There is no provided 'gpu-deploy' script — either create/obtain a trusted script or run the shown SSH commands manually. (2) Confirm remote paths (conda path, /data/models/llm) and the user account used for SSH have the necessary permissions. (3) Inspect any commands you copy/paste, especially the tmux/conda/vllm serve line, to ensure the model path and port are correct. (4) Use SSH keys and least-privilege accounts; do not run unknown commands on hosts you don't control. (5) Verify model binaries/download sources (Hugging Face links) independently and ensure vLLM and dependencies on the host are from trusted sources. If you need the convenience script, request a packaged implementation from the maintainer or review its content before adding it to your PATH.

Capability Analysis

Type: OpenClaw Skill
Name: gpu-deploy
Version: 0.1.0

The skill is classified as suspicious because its core functionality is remote command execution via SSH, as detailed in `SKILL.md`. The explicit use of `ssh` to deploy services on remote GPU servers matches the skill's stated purpose, but the actual `gpu-deploy` script, which would construct and execute these commands from user input, is not provided. This creates a significant risk of shell injection if user inputs (e.g., model names, server details, ports) are not rigorously sanitized before being interpolated into the complex `ssh` commands shown in the 'Manual usage (no script)' section (手动使用（无脚本）) of `SKILL.md`. The provided files show no evidence of intentional malicious behavior such as data exfiltration or prompt injection, but the high-risk nature of remote execution and the potential for vulnerabilities in the missing implementation warrant a 'suspicious' classification.
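To make the injection concern concrete: any implementation of the missing script would need to validate names and ports before placing them in a remote command. The following is a minimal sketch of such checks, assuming a Python implementation; the allow-list pattern and the function names are illustrative choices, not part of the skill.

```python
import re

# Conservative allow-list: letters, digits, dot, underscore, hyphen; up to 64 chars.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,63}")


def validate_name(value: str) -> str:
    """Accept server or model names only if they match the allow-list."""
    if not NAME_RE.fullmatch(value):
        raise ValueError(f"unsafe name: {value!r}")
    return value


def validate_port(value) -> int:
    """Accept only integers in the valid TCP port range."""
    port = int(value)  # raises ValueError for input such as "8111; id"
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def port_check_command(port) -> str:
    """Build the remote lsof command from a validated port."""
    return f"lsof -i :{validate_port(port)}"


print(port_check_command("8111"))
for bad in ("8111; rm -rf /", "70000"):
    try:
        port_check_command(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting unexpected input outright is simpler to reason about than escaping it, and `shlex.quote` remains available for values such as filesystem paths that cannot be restricted to an allow-list.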

Capability Assessment

ℹ Purpose & Capability

The name/description (deploy vLLM to GPU servers) matches the instructions: SSH into hosts, check GPUs/ports, and run vllm serve. Requiring ssh is appropriate. Minor inconsistency: the README and examples reference a local 'gpu-deploy' script to put on PATH, but no such script is bundled in this package (skill is instruction-only).

ℹ Instruction Scope

Runtime instructions are narrowly scoped to remote operations over SSH (nvidia-smi, lsof, tmux + conda + vllm serve). They do not attempt to read unrelated local files or exfiltrate data. Note that many commands assume specific paths (e.g., /data/miniconda3, /data/models/llm) and elevated access on remote hosts; users should verify and adapt these before running.

ℹ Install Mechanism

There is no install spec (instruction-only), which reduces install-time risk. However, documentation suggests copying a 'gpu-deploy' script into ~/.local/bin, yet no script is provided in the files — the skill will not install a helper binary for you.

✓ Credentials

No environment variables, secrets, or config paths are requested. SSH-based access is implied (user/host in servers.json) which is appropriate for remote deployment; no unrelated credentials are asked for.

✓ Persistence & Privilege

The skill sets always:false and has no install spec that writes to system-wide configs. It does not request persistent elevated privileges or attempt to modify other skills' configurations.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gpu-deploy
After installation, invoke the skill by name or use /gpu-deploy
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.1.0

Initial release of the gpu-deploy skill.
- Deploy vLLM model services on GPU servers with multi-server support.
- Automated GPU status and port availability checks.
- Preset configurations for popular open-source models.
- One-command deployment and management (check, deploy, list, ps, stop).
- Custom model configuration supported via JSON files.

Metadata

Slug gpu-deploy

Version 0.1.0

License —

All-time Installs 2

Active Installs 2

Total Versions 1

Frequently Asked Questions

What is Gpu Deploy?

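Gpu Deploy deploys vLLM model services on GPU servers. It supports multiple server configurations, automatically checks GPU status and port usage, and deploys popular open-source models with a single command. It is an AI Agent Skill for Claude Code / OpenClaw, with 512 downloads so far.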

How do I install Gpu Deploy?

Run "/install gpu-deploy" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gpu Deploy free?

Yes, Gpu Deploy is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Gpu Deploy support?

Gpu Deploy is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gpu Deploy?

It is built and maintained by 军舰 (@wang-junjian); the current version is v0.1.0.
