Description

Train high-performance medical LLMs on consumer GPUs using parameter-efficient fine-tuning

Usage Instructions (SKILL.md)

Skill: Low-Resource AI Researcher

Name: Low-Resource AI Researcher
Author: aipoch-ai

ID: 215
Category: AI/ML Research
Language: Python
Framework: PyTorch + PEFT (LoRA/QLoRA) + Transformers

Overview

Uses Parameter-Efficient Fine-Tuning (PEFT) to train high-performance medical-domain large language models on consumer-grade GPUs or a single A100. Supports fine-tuning methods such as LoRA and QLoRA, with defaults tuned for medical text understanding and generation tasks.

Features

🚀 Parameter-Efficient Fine-Tuning: LoRA, QLoRA, DoRA support
🏥 Medical Domain Optimized: Pre-configured for medical QA, diagnosis, clinical notes
💻 Low-Resource Ready: Optimized for consumer GPUs (RTX 3090/4090) and single A100
📊 Quantization: 4-bit/8-bit quantization with bitsandbytes
🔄 Multi-Task: Supports SFT, DPO, and medical instruction tuning
📝 Medical Datasets: Built-in support for PubMedQA, MedQA, MIMIC-III

Installation

# Core dependencies
pip install torch transformers datasets accelerate peft bitsandbytes

# Optional for training optimization
pip install flash-attn --no-build-isolation
pip install wandb tensorboard

# Medical NLP utilities
pip install scispacy scikit-learn

Quick Start

from skills.low_resource_ai_researcher.scripts.main import MedicalPEFTTrainer

# Initialize trainer
trainer = MedicalPEFTTrainer(
    model_name="meta-llama/Llama-2-7b-hf",
    task="medical_qa"
)

# Train with QLoRA (LoRA adapters on a 4-bit quantized base model)
trainer.train(
    output_dir="./medical_lora_model",
    num_epochs=3,
    batch_size=4,
    use_qlora=True  # 4-bit quantization
)

Configuration

Hardware Profiles

Profile	GPU Memory	Quantization	Max Model Size	Batch Size
consumer-24g	24GB (RTX 3090/4090)	QLoRA 4-bit	70B	1-2
a100-40g	40GB (A100)	LoRA 8-bit	70B	4-8
a100-80g	80GB (A100)	LoRA 16-bit	70B	8-16
multi-gpu	2x A100	LoRA 16-bit	70B+	16+
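
A rough weights-only estimate can help sanity-check which profile a given model fits. The sketch below is an illustrative assumption, not the skill's own logic: the function names and the 80% headroom factor are hypothetical, and the estimate ignores activations, optimizer state, LoRA adapters, and quantization overhead. By this estimate, the 4-bit weights of a 70B model alone come to roughly 35 GB, which is more than a single 24 GB card holds, so the Max Model Size column is worth checking against your own hardware.

```python
# Approximate bytes needed per parameter at each precision.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params_billion: float, precision: str) -> float:
    """Approximate GB (10^9 bytes) needed to hold the base model weights."""
    return num_params_billion * BYTES_PER_PARAM[precision]


def fits(num_params_billion: float, precision: str,
         gpu_memory_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit within `headroom` of the GPU memory.

    The remaining fraction is a hypothetical allowance for activations,
    adapters, and optimizer state.
    """
    return weight_memory_gb(num_params_billion, precision) <= gpu_memory_gb * headroom


if __name__ == "__main__":
    for size in (7, 13, 70):
        for precision in ("4-bit", "8-bit", "16-bit"):
            gb = weight_memory_gb(size, precision)
            print(f"{size}B @ {precision}: ~{gb:.1f} GB of weights, "
                  f"fits 24 GB card: {fits(size, precision, 24)}")
```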

LoRA Config

lora:
  r: 64              # LoRA rank
  lora_alpha: 128    # Scaling factor
  target_modules:    # Modules to apply LoRA
    - q_proj
    - v_proj
    - k_proj
    - o_proj
    - gate_proj
    - up_proj
    - down_proj
  lora_dropout: 0.05
  bias: "none"
  task_type: "CAUSAL_LM"
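
To get a feel for what the rank and target-module choices above cost, the number of trainable LoRA parameters can be counted by hand: each adapted linear layer with shape (in, out) adds r * (in + out) parameters. The sketch below assumes LLaMA-2-7B shapes (hidden size 4096, MLP size 11008, 32 layers); the helper name `lora_params` is hypothetical, and other models need their own shapes.

```python
# Assumed LLaMA-2-7B dimensions; adjust for other base models.
HIDDEN, INTERMEDIATE, NUM_LAYERS = 4096, 11008, 32

# (in_features, out_features) for each projection named in target_modules.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(r: int, target_modules: list) -> int:
    """Trainable parameters added by LoRA: r * (in + out) per adapted layer."""
    per_layer = sum(
        r * (MODULE_SHAPES[name][0] + MODULE_SHAPES[name][1])
        for name in target_modules
    )
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    every_module = list(MODULE_SHAPES)
    print("all seven modules, r=64:", lora_params(64, every_module))
    print("q_proj + v_proj only, r=64:", lora_params(64, ["q_proj", "v_proj"]))
    # Standard LoRA scales the adapter update by lora_alpha / r.
    print("scaling factor:", 128 / 64)
```

With r = 64 on all seven modules this comes to about 160M trainable parameters, versus about 34M when only q_proj and v_proj are adapted, which is one way to trade adapter capacity against memory.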

CLI Usage

# Basic training
python scripts/main.py \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset medical_qa \
    --output_dir ./output \
    --use_qlora \
    --per_device_train_batch_size 4

# With custom config
python scripts/main.py --config configs/medical_qlora.yaml

# Resume training
python scripts/main.py --resume_from_checkpoint ./output/checkpoint-1000

API Reference

MedicalPEFTTrainer

MedicalPEFTTrainer(
    model_name: str,              # Base model name/path
    task: str,                    # Task type: medical_qa, diagnosis, clinical_note
    lora_r: int = 64,             # LoRA rank
    lora_alpha: int = 128,        # LoRA alpha
    use_qlora: bool = False,      # Use 4-bit quantization
    target_modules: Optional[List[str]] = None,  # Modules to apply LoRA to
    device_map: str = "auto",     # Device placement strategy
    trust_remote_code: bool = True  # Runs code from the model repo; set False for untrusted models
)

Methods

Method	Description
`train()`	Start fine-tuning with configured parameters
`evaluate()`	Evaluate on medical benchmark datasets
`merge_and_save()`	Merge LoRA weights and save full model
`load_model()`	Load a trained model for inference
`generate()`	Generate medical text/responses

Supported Models

LLaMA 2/3 (7B, 13B, 70B)
Mistral (7B, 8x7B)
Yi (6B, 34B)
Qwen (7B, 14B, 72B)
Baichuan (7B, 13B)
ChatGLM (6B)

Medical Datasets

Dataset	Description	Size
PubMedQA	Biomedical QA	1k QA pairs
MedQA	USMLE-style questions	61k
MedMCQA	Medical entrance exam QA	194k
MIMIC-III	Clinical notes	De-identified
CMeEE	Chinese medical NER	15k
Huatuo-26M	Chinese medical corpus	26M samples

Performance Benchmarks

Model	Method	GPU	Training Time	MedQA Acc
LLaMA-2-7B	LoRA	A100-40G	2h	58.2%
LLaMA-2-7B	QLoRA	RTX 4090	3h	57.8%
LLaMA-2-13B	QLoRA	A100-40G	4h	62.5%
Mistral-7B	LoRA	A100-40G	2.5h	61.3%

Best Practices

Gradient Accumulation: Use for effective larger batch sizes
Learning Rate: Start with 2e-4 for LoRA, 1e-4 for full fine-tuning
Warmup Steps: 100 steps for medical domain adaptation
Max Length: 2048-4096 for clinical notes, 512-1024 for QA
Data Quality: Filter out low-quality medical data carefully
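
The gradient accumulation and warmup advice above can be made concrete with a little arithmetic. This is a minimal sketch under stated assumptions: both helper names are hypothetical, and the schedule shown is a plain linear warmup followed by a constant rate, with any later decay omitted.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_gpus: int = 1) -> int:
    """Samples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_gpus


def warmup_lr(step: int, peak_lr: float = 2e-4, warmup_steps: int = 100) -> float:
    """Linear warmup to peak_lr over warmup_steps, then constant."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr


if __name__ == "__main__":
    # A batch of 4 on one GPU with 8 accumulation steps behaves like a batch of 32.
    print(effective_batch_size(per_device_batch=4, grad_accum_steps=8))
    for step in (0, 49, 99, 500):
        print(step, warmup_lr(step))
```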

Troubleshooting

Out of Memory

# Enable gradient checkpointing
trainer.train(gradient_checkpointing=True)

# Reduce sequence length
trainer.train(max_seq_length=1024)

# Use DeepSpeed ZeRO-3 for large models

Slow Training

# Enable Flash Attention
trainer.train(use_flash_attention=True)

# Use bf16 on Ampere or newer GPUs (e.g. A100, RTX 3090/4090)
trainer.train(bf16=True)

License

This skill follows the license of the underlying models used. Applications that handle patient data must comply with the applicable regulations, such as HIPAA and GDPR.

References

Hu et al. (2021) - LoRA: Low-Rank Adaptation of Large Language Models
Dettmers et al. (2023) - QLoRA: Efficient Finetuning of Quantized LLMs
Singhal et al. (2023) - Large Language Models Encode Clinical Knowledge

Risk Assessment

Risk Indicator	Assessment	Level
Code Execution	Python scripts executed locally	Medium
Network Access	Downloads models and datasets from remote hubs; optional wandb reporting	Medium
File System Access	Read input files, write output files	Medium
Instruction Tampering	Standard prompt guidelines	Low
Data Exposure	Output files saved to workspace	Low

Security Checklist

No hardcoded credentials or API keys
No unauthorized file system access (../)
Output does not expose sensitive information
Prompt injection protections in place
Input file paths validated (no ../ traversal)
Output directory restricted to workspace
Script execution in sandboxed environment
Error messages sanitized (no stack traces exposed)
Dependencies audited
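
The path-related checklist items (no ../ traversal, output restricted to the workspace) could be enforced along the following lines. This is a sketch under assumptions, not the skill's actual validation code; `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(workspace: str, user_path: str) -> Path:
    """Resolve user_path against workspace and reject anything that escapes it.

    Absolute paths and ../ segments both end up outside the workspace root
    after resolution, so a single containment check covers them.
    """
    root = Path(workspace).resolve()
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes workspace: {user_path}")
    return candidate


if __name__ == "__main__":
    print(resolve_inside("workspace", "output/checkpoint-1000"))
    try:
        resolve_inside("workspace", "../secrets.txt")
    except ValueError as err:
        print("rejected:", err)
```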

Prerequisites

# Python dependencies
pip install -r requirements.txt

Evaluation Criteria

Success Metrics

Successfully executes main functionality
Output meets quality standards
Handles edge cases gracefully
Performance is acceptable

Test Cases

Basic Functionality: Standard input → Expected output
Edge Case: Invalid input → Graceful error handling
Performance: Large dataset → Acceptable processing time

Lifecycle Status

Current Stage: Draft
Next Review Date: 2026-03-06
Known Issues: None
Planned Improvements:
- Performance optimization
- Additional feature support

Security Recommendations

This skill appears to implement what it claims (PEFT training for medical models), but it carries several implicit risks you should consider before installing or running it:

- trust_remote_code=True: By default, the trainer allows arbitrary Python stored in remote model repositories to execute when you load a model. Only load models from fully trusted sources, or set trust_remote_code=False and use vetted model code.
- Implicit network access and credentials: The skill downloads models and datasets and may require Hugging Face tokens or wandb API keys for private models or telemetry; these are not declared. Do not provide secrets unless you understand where they are sent.
- Telemetry and third-party services: The docs recommend installing wandb/tensorboard; if you enable reporting, metrics may be transmitted off-host. Review the report_to settings and disable telemetry if you do not want external data flows.
- Run in a sandbox: Because model loading and some dependencies (flash_attn, bitsandbytes) execute native code, run this in an isolated environment (container or VM) with limited network access until you have audited the model sources and code.
- HIPAA/data risk: The skill targets medical data. Ensure training data is de-identified and permitted for this use, and confirm compliance (HIPAA/GDPR) before using it on real patient data.

To proceed safely: review the complete scripts/main.py, set trust_remote_code=False, pin explicit vetted model repo URLs, avoid private model downloads unless necessary, and run the code on an isolated machine or in a container.
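
Several of these recommendations map onto standard settings in the Hugging Face and wandb tooling. The sketch below collects them in one place; it is an assumption about how one might wire them up, not part of the skill. The dictionary names are hypothetical and the revision value is a placeholder rather than a real commit hash.

```python
import os

# Turn off wandb reporting entirely.
os.environ["WANDB_MODE"] = "disabled"
# Make Hugging Face libraries use only the local cache. Set this before
# importing them; models must already have been downloaded.
os.environ["HF_HUB_OFFLINE"] = "1"

# Keyword arguments one might pass to a transformers from_pretrained() call.
SAFE_LOAD_KWARGS = {
    "trust_remote_code": False,          # never run code shipped in the model repo
    "revision": "<pinned-commit-sha>",   # placeholder: pin a vetted commit
}

# Matching transformers TrainingArguments setting: no reporting integrations.
SAFE_TRAINING_KWARGS = {"report_to": "none"}

if __name__ == "__main__":
    print(os.environ["WANDB_MODE"], os.environ["HF_HUB_OFFLINE"])
    print(SAFE_LOAD_KWARGS, SAFE_TRAINING_KWARGS)
```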

Functional Analysis

Type: OpenClaw Skill
Name: low-resource-ai-researcher
Version: 1.0.0

The skill provides a functional framework for fine-tuning medical LLMs but contains high-risk security configurations. Specifically, `scripts/main.py` defaults to `trust_remote_code=True` and utilizes `datasets.load_dataset`, both of which are known vectors for Remote Code Execution (RCE) when interacting with untrusted models or datasets. Additionally, `requirements.txt` includes a dependency on a generic `skills` package from PyPI, which appears unnecessary for the stated logic and could lead to the execution of unintended third-party code during installation.

Capability Assessment

ℹ Purpose & Capability

The name/description (medical PEFT training on consumer GPUs) aligns with the included code and instructions: it downloads models/datasets, applies LoRA/QLoRA, and runs training. However some declared defaults (trust_remote_code=True, inclusion of wandb/flash_attn in install suggestions) are broader than strictly necessary for 'low-resource' training and increase the attack surface.

⚠ Instruction Scope

SKILL.md and scripts instruct the agent to download models and datasets from the network and to load model code with trust_remote_code enabled. Loading remote model repositories with trust_remote_code=True allows arbitrary Python in the model repo to run locally — this is a scope expansion beyond simply training a model and is potentially dangerous if untrusted model names are used. The skill also suggests installing wandb (telemetry) and can load local files specified by train_file/validation_file, which is expected but increases data-access scope.

ℹ Install Mechanism

There is no registry install spec (instruction-only), which lowers install-time risk. However requirements.txt lists heavy ML packages (flash_attn, bitsandbytes implied in docs) and an odd 'skills' package; installing these can be complex and may compile native extensions. No external arbitrary download URLs are present in the manifest.

⚠ Credentials

The skill declares no required environment variables or primary credential, yet the code will fetch models/datasets from remote hubs and references wandb/Transformers features that commonly require API tokens (Hugging Face token for private models, WANDB_API_KEY). These credentials are not documented/declared; the default trust_remote_code=True increases the need for cautious credential handling. In short: network access and potential credential use are implicit but not represented.
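
Since the credentials are implicit, it can help to check which ones are present in the environment before a run. A minimal sketch, assuming the two variables most likely to matter here (HF_TOKEN, which huggingface_hub reads, and WANDB_API_KEY); the helper name is hypothetical, and it reports only presence, never the values.

```python
import os

# Environment variables that would grant access to external services.
CREDENTIAL_VARS = ("HF_TOKEN", "WANDB_API_KEY")


def credential_report(env=None) -> dict:
    """Map each credential variable to whether it is set (values are not exposed)."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in CREDENTIAL_VARS}


if __name__ == "__main__":
    for name, present in credential_report().items():
        print(f"{name}: {'set' if present else 'not set'}")
```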

✓ Persistence & Privilege

The skill is not forced-always, does not request system-wide config paths, and does not declare any special persistent privileges. Autonomous invocation is allowed (platform default), which is expected for a tool-style skill.

Version History

v1.0.0

- Initial release of Low-Resource AI Researcher skill (v1.0.0)
- Enables training of high-performance medical large language models on consumer GPUs using parameter-efficient fine-tuning (LoRA, QLoRA)
- Supports popular medical datasets (PubMedQA, MedQA, MIMIC-III) and multiple LLM architectures (LLaMA, Mistral, Yi, Qwen, Baichuan, ChatGLM)
- Provides configurable hardware profiles, quantization options, and CLI/API usage for flexible training and evaluation
- Includes security, troubleshooting, and best practice guidelines for safe and efficient usage

Metadata

Slug low-resource-ai-researcher

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

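FAQ

What is Low-Resource AI Researcher?

Train high-performance medical LLMs on consumer GPUs using parameter-efficient fine-tuning. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 104 total downloads to date.

How do I install Low-Resource AI Researcher?

Run the command "/install low-resource-ai-researcher" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Low-Resource AI Researcher free?

Yes. Low-Resource AI Researcher is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Low-Resource AI Researcher support?

Low-Resource AI Researcher is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Low-Resource AI Researcher?

It is developed and maintained by AIpoch (@aipoch-ai); the current version is v1.0.0.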
