
GPU Keepalive with KeepGPU

Author: Wangmerlyn · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 364

Install in OpenClaw

/install gpu-keepalive-with-keepgpu

Description

Install and operate KeepGPU for GPU keep-alive with both blocking CLI and non-blocking service workflows. Use when users ask for keep-gpu command constructio...

Usage instructions (SKILL.md)

KeepGPU CLI Operator

Use this workflow to run keep-gpu safely and effectively.

Prerequisites

Confirm at least one GPU is visible (python -c "import torch; print(torch.cuda.device_count())").
Run commands in a shell where CUDA/ROCm drivers are already available.
Use Ctrl+C to stop KeepGPU and release memory cleanly.
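The one-line check above fails with an ImportError when PyTorch has not been installed yet. A minimal sketch of a more forgiving check follows; the helper name visible_gpu_count is hypothetical and not part of KeepGPU.

```python
def visible_gpu_count() -> int:
    """Return the number of GPUs PyTorch can see, or 0 if PyTorch is missing."""
    try:
        import torch  # imported lazily so a missing install is not fatal
    except ImportError:
        return 0
    return torch.cuda.device_count()


print(f"Visible GPUs: {visible_gpu_count()}")
```

A result of 0 means either that PyTorch is not installed or that no GPU is visible to it, so install PyTorch first and re-run before investigating drivers.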

Install KeepGPU

Install PyTorch first for your platform, then install KeepGPU.

Option A: Install from package index

# CUDA example (change cu121 to your CUDA version)
pip install --index-url https://download.pytorch.org/whl/cu121 torch
pip install keep-gpu

# ROCm example (change rocm6.1 to your ROCm version)
pip install --index-url https://download.pytorch.org/whl/rocm6.1 torch
# Quote the extra so shells such as zsh do not treat [rocm] as a glob pattern
pip install "keep-gpu[rocm]"

Option B: Install directly from Git URL (no local clone)

Prefer this option when users only need the CLI and do not need local source edits. This avoids checkout directory and cleanup overhead.

pip install "git+https://github.com/Wangmerlyn/KeepGPU.git"

If SSH access is configured:

pip install "git+ssh://git@github.com/Wangmerlyn/KeepGPU.git"

ROCm variant from Git URL:

pip install "keep_gpu[rocm] @ git+https://github.com/Wangmerlyn/KeepGPU.git"

Option C: Install from a local source checkout (explicit path)

Use this option only when users already have a local checkout or plan to edit source.

git clone https://github.com/Wangmerlyn/KeepGPU.git
cd KeepGPU
pip install -e .

If the checkout already exists somewhere else, install by absolute path:

pip install -e /absolute/path/to/KeepGPU

For ROCm users from local checkout:

pip install -e ".[rocm]"

Verify installation:

keep-gpu --help
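For scripted setups, the same verification can be done without parsing help output. The sketch below assumes the distribution is registered under the name keep-gpu, as in the pip commands above; the helper name keep_gpu_install_info is hypothetical.

```python
import shutil
from importlib import metadata


def keep_gpu_install_info():
    """Return (cli_path, package_version); either value is None if not found."""
    # Location of the keep-gpu executable on PATH, if any.
    cli_path = shutil.which("keep-gpu")
    try:
        version = metadata.version("keep-gpu")
    except metadata.PackageNotFoundError:
        version = None
    return cli_path, version


print(keep_gpu_install_info())
```

If the version is found but the CLI path is None, the package is probably installed in an environment whose scripts directory is not on PATH.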

Command model

KeepGPU supports two execution modes.

Blocking mode (compatibility)

keep-gpu --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25

Use when users intentionally want one foreground process and manual Ctrl+C stop.

Non-blocking mode (recommended for agents)

keep-gpu start --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25
keep-gpu status
keep-gpu stop --all
keep-gpu service-stop

keep-gpu start launches the local service automatically if it is not already running.

Ctrl+C stops only foreground blocking runs. For service mode sessions started by keep-gpu start, use keep-gpu status, keep-gpu stop, and keep-gpu service-stop.

CLI options to tune:

--gpu-ids: comma-separated IDs (0, 0,1). If omitted, KeepGPU uses all visible GPUs.
--vram: VRAM to hold per GPU (512MB, 1GiB, or raw bytes).
--interval: seconds between keep-alive cycles.
--busy-threshold (--util-threshold alias): if utilization is above this percent, KeepGPU backs off.
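To estimate how many bytes a --vram value reserves per GPU before running anything, a small converter can help. This is only a sketch: it assumes decimal units (MB, GB) are powers of 1000 and binary units (MiB, GiB) are powers of 1024, and KeepGPU's own parser may interpret them differently. The function name parse_vram is hypothetical.

```python
import re

# Assumed unit table; KeepGPU's actual interpretation may differ.
_UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30,
}


def parse_vram(value: str) -> int:
    """Convert strings such as '512MB', '1GiB', or '1024' to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # a bare number is treated as raw bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in {value!r}")
    return int(float(number) * _UNITS[unit])


print(parse_vram("1GiB"))
```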

Legacy compatibility:

--threshold is deprecated but still accepted.
Numeric --threshold maps to busy threshold.
String --threshold maps to VRAM.
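When migrating old scripts, the mapping above can be applied mechanically. The sketch below assumes that "numeric" means the value parses as a plain number; how KeepGPU itself makes that distinction is not stated here, and the helper name translate_legacy_threshold is hypothetical.

```python
def translate_legacy_threshold(value: str) -> list[str]:
    """Map a deprecated --threshold value to the modern flag and value."""
    try:
        float(value)
    except ValueError:
        # Not a plain number (for example "1GiB"): treat it as a VRAM size.
        return ["--vram", value]
    # A plain number is treated as the busy threshold in percent.
    return ["--busy-threshold", value]


print(translate_legacy_threshold("25"))
print(translate_legacy_threshold("1GiB"))
```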

Agent workflow

Collect workload intent: target GPUs, hold duration, and whether node is shared.
Choose mode:
- blocking mode for manual shell sessions,
- non-blocking mode for agent pipelines (default recommendation).
Choose safe defaults when unspecified: --vram 1GiB, --interval 60-120, --busy-threshold 25.
Provide command sequence with verification and stop command.
For non-blocking mode, include status, stop, and daemon shutdown (service-stop).
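The mode choice and safe defaults above can be sketched as a small command builder. Only flags that appear in this document are used; the function name build_keep_gpu_command and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def build_keep_gpu_command(
    gpu_ids: Optional[Sequence[int]] = None,
    vram: str = "1GiB",
    interval: int = 60,
    busy_threshold: int = 25,
    blocking: bool = False,
) -> list[str]:
    """Build an argument list for blocking or non-blocking (start) mode."""
    command = ["keep-gpu"] if blocking else ["keep-gpu", "start"]
    if gpu_ids:
        # Omitting --gpu-ids lets KeepGPU use all visible GPUs.
        command += ["--gpu-ids", ",".join(str(i) for i in gpu_ids)]
    command += [
        "--vram", vram,
        "--interval", str(interval),
        "--busy-threshold", str(busy_threshold),
    ]
    return command


print(" ".join(build_keep_gpu_command([0])))
```

Returning a list rather than a string keeps the result safe to pass to subprocess.run without shell quoting.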

Command templates

Single GPU while preprocessing (blocking):

keep-gpu --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25

All visible GPUs with lighter load (blocking):

keep-gpu --vram 512MB --interval 180

Agent-friendly non-blocking sequence:

keep-gpu start --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25
keep-gpu status
keep-gpu stop --job-id <job_id>
keep-gpu service-stop
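In an agent pipeline, the main risk with this sequence is forgetting the stop commands when the work in between fails. A minimal sketch of a wrapper that always cleans up follows. It uses keep-gpu stop --all instead of --job-id, because how to read the job ID from the command output is not documented here; note that --all also stops any other sessions on the same service. The names hold_gpu_during and run are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def _run_checked(command: Sequence[str]) -> None:
    # Raise CalledProcessError if the command exits with a non-zero status.
    subprocess.run(list(command), check=True)


def hold_gpu_during(
    work: Callable[[], None],
    start_args: Sequence[str],
    run: Callable[[Sequence[str]], None] = _run_checked,
) -> None:
    """Start a keep-alive session, run work(), then stop session and service."""
    run(["keep-gpu", "start", *start_args])
    try:
        run(["keep-gpu", "status"])
        work()
    finally:
        # Always release the GPU and shut the service down, even if work() fails.
        run(["keep-gpu", "stop", "--all"])
        run(["keep-gpu", "service-stop"])
```

A call such as hold_gpu_during(preprocess, ["--gpu-ids", "0", "--vram", "1GiB"]) would wrap a hypothetical preprocess function; the run parameter exists so the sequence can be exercised without KeepGPU installed.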

Open dashboard:

http://127.0.0.1:8765/
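Before opening the dashboard, or to confirm that nothing is left listening after service-stop, a plain TCP probe is enough. This sketch only checks that something accepts connections on the port; it does not confirm that the listener is KeepGPU. The helper name dashboard_reachable is hypothetical.

```python
import socket


def dashboard_reachable(
    host: str = "127.0.0.1", port: int = 8765, timeout: float = 0.5
) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


print(dashboard_reachable())
```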

Remote sessions (preferred: tmux for visibility and control):

tmux new -s keepgpu
keep-gpu --gpu-ids 0 --vram 1GiB --interval 300
# Detach with Ctrl+b then d; reattach with: tmux attach -t keepgpu

Fallback when tmux is unavailable:

nohup keep-gpu --gpu-ids 0 --vram 1GiB --interval 300 > keepgpu.log 2>&1 &
echo $! > keepgpu.pid
# Monitor: tail -f keepgpu.log
# Stop: kill "$(cat keepgpu.pid)"

Troubleshooting

Invalid --gpu-ids: ensure comma-separated integers only.
Allocation failure / OOM: reduce --vram or free memory first.
No utilization telemetry: ensure nvidia-ml-py works and nvidia-smi is available.
No GPUs detected: verify drivers, CUDA/ROCm runtime, and torch.cuda.device_count().
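The first item can be caught before KeepGPU is invoked at all. The sketch below is a pre-flight check that accepts only comma-separated non-negative integers; it is deliberately strict (no spaces, no empty entries) and may reject input that KeepGPU itself would tolerate. The function name parse_gpu_ids is hypothetical.

```python
def parse_gpu_ids(value: str) -> list[int]:
    """Parse a --gpu-ids string such as '0' or '0,1' into a list of ints."""
    ids = []
    for part in value.split(","):
        # Reject empty entries, signs, spaces, and non-ASCII digits.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid GPU id {part!r} in {value!r}")
        ids.append(int(part))
    return ids


print(parse_gpu_ids("0,1"))
```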

Example

User request: "Install KeepGPU from GitHub and keep GPU 0 alive while I preprocess."

Suggested response shape:

Install: pip install "git+https://github.com/Wangmerlyn/KeepGPU.git"
Run: keep-gpu start --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25
Verify: keep-gpu status or dashboard http://127.0.0.1:8765/; stop session with keep-gpu stop --job-id <job_id> and daemon with keep-gpu service-stop.

Limitations

KeepGPU is not a scheduler; it only keeps already accessible GPUs active.
KeepGPU behavior depends on cluster policy; some schedulers require higher VRAM or tighter intervals.

Security recommendations

This skill appears to do what it says: install and run KeepGPU. Before installing, consider:
(1) Prefer the published PyPI release if available.
(2) If you use pip install from the GitHub URL, review the repository (setup scripts and entry points), because pip install from a remote repo runs code on your machine.
(3) Run installs in a virtualenv or container if you are unsure.
(4) Be aware that service mode spawns persistent background processes and exposes a local dashboard on port 8765; ensure this fits your environment and cluster policies.
(5) Verify the repository owner/maintainer and check for recent activity or issues if you will install on a production node.

Functional analysis

Type: OpenClaw Skill
Name: gpu-keepalive-with-keepgpu
Version: 1.0.0

The skill bundle provides instructions for the agent to install and execute the KeepGPU utility, which involves high-risk capabilities such as shell command execution and installing software directly from a third-party GitHub repository (Wangmerlyn/KeepGPU). It also guides the agent in setting up background processes using tmux or nohup and accessing a local web dashboard on port 8765. While these actions are plausibly necessary for the stated purpose of GPU keep-alive management, the combination of external code installation and persistent background execution represents a significant security risk without clear evidence of intentional malice.

Capability assessment

✓ Purpose & Capability

Name and description match the instructions: the SKILL.md explains installing KeepGPU, checking for GPUs, and running blocking or service modes. Required resources (CUDA/ROCm, PyTorch) are appropriate for a GPU keep-alive tool.

✓ Instruction Scope

Runtime instructions are narrowly scoped to installing, starting, inspecting, and stopping KeepGPU; referenced commands and files (torch.cuda.device_count(), nvidia-smi, nohup/tmux, keepgpu.log, keepgpu.pid) are relevant to the stated task. There are no instructions to read unrelated user files or exfiltrate data.

ℹ Install Mechanism

The skill recommends pip installs from PyTorch's wheel index and either PyPI or a GitHub repo. These are expected for Python tooling, but pip installing directly from a Git URL will execute the package's install scripts on the machine — the user should trust the repository or prefer an official PyPI release or review the source before installing.

✓ Credentials

No environment variables, credentials, or config paths are requested. The instructions only require local GPU drivers/runtimes and typical command-line tools, which are proportional to the functionality.

ℹ Persistence & Privilege

The skill does not request elevated platform privileges and 'always' is false. However, service/non-blocking usage will create background processes and may open a local dashboard port (127.0.0.1:8765); users should be aware these processes persist until stopped and may conflict with cluster policies.

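How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install gpu-keepalive-with-keepgpu
After installation, invoke the Skill by name or trigger it with /gpu-keepalive-with-keepgpu
Provide the inputs described in the Skill's parameter documentation to get structured output.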

Version history

v1.0.0

Initial release from KeepGPU repository

Metadata

Slug: gpu-keepalive-with-keepgpu

Version: 1.0.0

License: none listed

Total installs: 0

Current installs: 0

Number of versions: 1

FAQ