Downloads: 112
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install github-experiment-accuracy
Description
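Accuracy validation experiment for GitHub repository projects. Given a GitHub repository URL and a data file, the skill runs the project, verifies its prediction accuracy, and generates a detailed workflow report and an accuracy report. Use it when: 1) a user supplies a GitHub repository plus a data file for an experiment; 2) you need to verify an algorithm's prediction accuracy on target data; 3) you need a complete experiment report covering both the workflow and the accuracy.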
README (SKILL.md)
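GitHub Repository Project Accuracy Validation Experiment
Quick Start
The user provides a GitHub repository URL and a data file; the skill runs the project, validates its accuracy, and generates a report.
# Input parameters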
github_url: https://github.com/username/repo
data_file: D:/path/to/data.xls
Full Workflow
Step 1: Fetch the Repository
- Try git clone first:
git clone {github_url} {project_dir} --depth 1
- If that fails, download the ZIP archive over HTTP (this URL assumes the default branch is named main):
Invoke-WebRequest -Uri "{github_url}/archive/refs/heads/main.zip" -OutFile "outputs/repo.zip"
Expand-Archive -Path "outputs/repo.zip" -DestinationPath "outputs/projects"
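The clone-then-ZIP fallback above can also be sketched in Python. This is a minimal sketch, not the skill's own implementation: the helper names `archive_zip_url` and `fetch_repo`, the 300-second timeout, and the ZIP file location are assumptions made for illustration.

```python
import subprocess
import urllib.request
import zipfile
from pathlib import Path


def archive_zip_url(github_url: str, branch: str = "main") -> str:
    """Build the branch ZIP URL used by the fallback download."""
    base = github_url.rstrip("/")
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return f"{base}/archive/refs/heads/{branch}.zip"


def fetch_repo(github_url: str, project_dir: str, branch: str = "main") -> str:
    """Try a shallow git clone first; fall back to the ZIP archive.

    Returns "git" or "zip" to record which path succeeded.
    """
    try:
        subprocess.run(
            ["git", "clone", "--depth", "1", github_url, project_dir],
            check=True,
            timeout=300,  # assumed limit; tune for large repositories
        )
        return "git"
    except (subprocess.SubprocessError, OSError):
        # git is missing, timed out, or the clone failed: download the ZIP.
        target = Path(project_dir)
        zip_path = target.parent / f"{target.name}.zip"
        zip_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(archive_zip_url(github_url, branch), str(zip_path))
        with zipfile.ZipFile(zip_path) as archive:
            # GitHub archives unpack into a top-level <repo>-<branch> folder.
            archive.extractall(target)
        return "zip"
```

Because the ZIP route nests the code one directory deeper than a clone does, later steps that search for files are safer when they search recursively.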
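Step 2: Prepare the Data
# 1. Copy the data file into the project (PowerShell)
Copy-Item {data_file} -Destination "{project_dir}/data/"
# 2. Read the data (Python)
import pandas as pd
df = pd.read_excel(data_file)
# 3. Clean the data: drop rows with NaN in any column after the first
valid_idx = ~df.iloc[:, 1:].isna().any(axis=1)
df_valid = df[valid_idx]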
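To make the cleaning rule concrete, here is a small sketch on an in-memory table. The column names and values are hypothetical stand-ins for the spreadsheet; only the masking expression comes from the step above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: a date column plus two value columns.
df = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-02", None, "2024-01-04"],
        "load": [10.0, np.nan, 12.0, 13.0],
        "temp": [1.5, 2.0, 2.5, 3.0],
    }
)

# Keep rows whose value columns (everything after the first) are all present.
valid_idx = ~df.iloc[:, 1:].isna().any(axis=1)
df_valid = df[valid_idx].reset_index(drop=True)

print(len(df), "->", len(df_valid))  # prints: 4 -> 3
```

Note that the row with a missing date survives, because the mask skips the first column. If the first column must also be complete, check the whole frame with `df.isna()` instead.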
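Step 3: Locate and Load the Model
import glob
import torch
# Locate the model file (raises IndexError if no model.pt exists under the project)
model_file = glob.glob(f"{project_dir}/**/model.pt", recursive=True)[0]
# Load the model (weights_only=False unpickles arbitrary objects; use only with trusted files)
net = torch.load(model_file, map_location='cpu', weights_only=False)
net.eval()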
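Indexing the glob result with `[0]` fails with a bare `IndexError` when the repository ships no `model.pt`. A minimal sketch of a lookup that reports the problem clearly follows; the helper names `find_model_files` and `pick_model_file` are hypothetical, and the default file name simply mirrors the step above.

```python
from pathlib import Path


def find_model_files(project_dir: str, pattern: str = "model.pt") -> list[Path]:
    """Return every file under project_dir whose name matches pattern, sorted."""
    return sorted(p for p in Path(project_dir).rglob(pattern) if p.is_file())


def pick_model_file(project_dir: str, pattern: str = "model.pt") -> Path:
    """Pick the first match, failing with a clear message when there is none."""
    matches = find_model_files(project_dir, pattern)
    if not matches:
        raise FileNotFoundError(f"no {pattern!r} found under {project_dir!r}")
    return matches[0]
```

Sorting makes the choice deterministic when several checkpoints match, which keeps repeated experiments comparable.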
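Step 4: Feature Engineering
Read the project's README or source code, then build the features:
# Typical features (adjust per project)
- Historical features: previous N days × dimensions = X dims
- Weather features: N dims
- Time features: day of week (7) + holiday flag (2) = 9 dims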
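As an illustration of the 9-dimensional time features, the sketch below one-hot encodes the weekday and a holiday flag. The function name `time_features`, the holiday set, and the ordering of the two holiday slots (non-holiday first) are assumptions; a real project may encode these differently.

```python
from datetime import date


def time_features(day: date, holidays: set[date]) -> list[float]:
    """One-hot weekday (7 values) followed by a one-hot holiday flag (2 values)."""
    weekday = [0.0] * 7
    weekday[day.weekday()] = 1.0  # Monday is index 0
    # Assumed layout: [not a holiday, holiday]
    holiday = [0.0, 1.0] if day in holidays else [1.0, 0.0]
    return weekday + holiday


holidays = {date(2024, 1, 1)}
print(time_features(date(2024, 1, 1), holidays))
# prints: [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
```

The historical and weather features would be concatenated onto this vector in whatever order the target project's model expects.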
Step 5: Predict and Validate
# Test set: the last 30 days
test_days = 30
predictions = []
actuals = []
for i in range(len(df_valid) - test_days, len(df_valid)):
    # Build the features (build_features is project-specific; see Step 4)
    features = build_features(df_valid.iloc[i])
    # Predict
    pred = net(features)
    predictions.append(pred)
    actuals.append(df_valid.iloc[i].values)
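The holdout loop can be sketched independently of any particular model. The version below assumes a univariate series and a hypothetical `predict` callable standing in for `net` plus `build_features`; it passes only the rows before day `i` to the predictor, so the value being predicted never leaks into its own features.

```python
from typing import Callable, Sequence


def evaluate_holdout(
    series: Sequence[float],
    predict: Callable[[Sequence[float]], float],
    test_days: int = 30,
) -> tuple[list[float], list[float]]:
    """Predict each of the last test_days values from the history before it."""
    if not 0 < test_days < len(series):
        raise ValueError("test_days must be at least 1 and less than len(series)")
    predictions: list[float] = []
    actuals: list[float] = []
    for i in range(len(series) - test_days, len(series)):
        history = series[:i]  # everything strictly before day i
        predictions.append(float(predict(history)))
        actuals.append(float(series[i]))
    return predictions, actuals


# Stand-in predictor: repeat the most recent observed value.
preds, acts = evaluate_holdout([1, 2, 3, 4, 5], lambda history: history[-1], test_days=2)
print(preds, acts)  # prints: [3.0, 4.0] [4.0, 5.0]
```

With a PyTorch model, the callable would typically build a tensor from the history, run the network under `torch.no_grad()`, and convert the output to a plain float.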
Step 6: Compute Accuracy
mae = mean(|predictions - actuals|)
rmse = sqrt(mean((predictions - actuals)²))
mape = mean(|predictions - actuals| / |actuals|) * 100
accuracy = 100 - mape
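A worked example of the four metrics, written without third-party dependencies. The function name `accuracy_metrics` is hypothetical, and skipping points whose actual value is zero is one possible way to handle the division in MAPE, not something the skill specifies.

```python
import math
from typing import Sequence


def accuracy_metrics(predictions: Sequence[float], actuals: Sequence[float]) -> dict[str, float]:
    """Compute MAE, RMSE, MAPE (in percent), and accuracy = 100 - MAPE."""
    if len(predictions) != len(actuals) or len(predictions) == 0:
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    errors = [p - a for p, a in zip(predictions, actuals)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # MAPE is undefined where the actual value is zero, so skip those points.
    ratios = [abs(e) / abs(a) for e, a in zip(errors, actuals) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape, "accuracy": 100.0 - mape}


metrics = accuracy_metrics([110.0, 190.0, 300.0], [100.0, 200.0, 300.0])
print({name: round(value, 2) for name, value in metrics.items()})
# prints: {'mae': 6.67, 'rmse': 8.16, 'mape': 5.0, 'accuracy': 95.0}
```

Here the errors are 10, -10, and 0, so MAE is 20/3, RMSE is the square root of 200/3, and MAPE averages 10%, 5%, and 0%. Note that "accuracy" defined this way can go negative when MAPE exceeds 100%.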
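Step 7: Generate the Report
# Experiment Report
## 1. Experiment Information
- GitHub repository: {url}
- Data file: {file}
- Test set: {N} days
## 2. Workflow (detailed steps)
### Step 1: Fetch the Repository
...
### Step 2: Prepare the Data
...
### Step 3: Load the Model
...
### Step 4: Feature Engineering
...
### Step 5: Predict and Validate
...
## 3. Accuracy Results
| Metric | Value |
|------|-----|
| MAE | {mae:.2f} |
| RMSE | {rmse:.2f} |
| MAPE | {mape:.2f}% |
| Accuracy | {accuracy:.2f}% |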
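The results table in the template maps directly onto Python format strings. A minimal sketch, assuming the metrics are already computed; the function name `render_accuracy_table` is hypothetical.

```python
def render_accuracy_table(mae: float, rmse: float, mape: float) -> str:
    """Fill in the accuracy table from the report template."""
    accuracy = 100.0 - mape
    lines = [
        "| Metric | Value |",
        "|------|-----|",
        f"| MAE | {mae:.2f} |",
        f"| RMSE | {rmse:.2f} |",
        f"| MAPE | {mape:.2f}% |",
        f"| Accuracy | {accuracy:.2f}% |",
    ]
    return "\n".join(lines)


print(render_accuracy_table(mae=6.5, rmse=8.2, mape=5.0))
```

The returned string can be appended to the rest of the report text before it is written to disk.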
Key Code Snippets
Model loading
from torch import load, Tensor, no_grad
import torch.nn as nn
# Load (recent PyTorch releases default to weights_only=True, so a fully pickled model needs weights_only=False)
net = load(model_file, map_location='cpu', weights_only=False)
net.eval()
Data processing
import pandas as pd
import numpy as np
df = pd.read_excel(data_file, sheet_name=0, header=None)
# Drop rows containing NaN
valid = ~df.isna().any(axis=1)
df = df[valid]
Accuracy calculation
mae = np.mean(np.abs(pred - actual))
rmse = np.sqrt(np.mean((pred - actual)**2))
mape = np.mean(np.abs((pred - actual) / actual)) * 100
Report Location
Reports are saved to:
- {project_dir}/accuracy_report.md
- {project_dir}/experiment_report.md
- {project_dir}/outputs/daily_results.json
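Common Issues
- Network timeout: fall back to downloading the ZIP archive over HTTP
- Model loading error: pass weights_only=False
- Data not found: copy the file into the project's data/ directory
- Feature construction: consult the project's README or source code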
Usage Guidance
This skill does what it says (clone a GitHub repo and evaluate a model), but it instructs you to load and run artifacts from untrusted repositories and model files without any safety checks. That is dangerous: torch.load(model.pt, weights_only=False) can execute arbitrary code embedded in the model, and cloned repositories may contain scripts that read or exfiltrate files (including the data you copy into them). Before using this skill: 1) only run it on repos/models you trust; 2) run the workflow in an isolated environment (ephemeral VM, container, or sandbox) with no access to secrets or sensitive files; 3) avoid copying sensitive data into the project directory — use sanitized or synthetic test data instead; 4) prefer safer model-loading patterns (e.g., load state_dict into a known architecture or validate model files/signatures) rather than arbitrary pickle-based loads; 5) inspect the repository contents and model files first, and restrict network access from the sandbox to prevent exfiltration. If you cannot run it in a sandbox or verify artifacts, do not use this skill on sensitive data.
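One way to act on the advice to validate model files is to compare a checkpoint's SHA-256 digest against a value obtained from a source you trust before loading it. The sketch below assumes such a digest is available; the helper names `sha256_of` and `verify_model_file` are hypothetical. A matching checksum only shows the file is the one that was vetted; it does not make unpickling an unvetted file safe.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path: str, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the trusted digest."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

Calling `verify_model_file(model_file, trusted_digest)` immediately before the load step turns a silently swapped checkpoint into a hard failure.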
Capability Assessment
Purpose & Capability
The name/description (validate a GitHub repo's predictive accuracy given a data file) aligns with the instructions: clone/download a repo, copy the data, find model files, load the model, run predictions, compute metrics, and produce a report. Requiring git/HTTP downloads and reading a local data file is expected. The only notable omission is explicit guidance to run these steps in an isolated/sandboxed environment or to verify/trust the repository and model artifacts before executing them.
Instruction Scope
The SKILL.md instructs the agent to clone arbitrary GitHub repos, copy user data into the project directory, locate model files (model.pt) and call torch.load(..., weights_only=False) and then run predictions. torch.load (which uses pickle semantics) can execute arbitrary code embedded in model files; cloning arbitrary repos and placing user data inside them can enable those repos or models to exfiltrate data or execute malicious actions. The instructions provide no checks, verification, or sandboxing, nor do they warn about executing untrusted model artifacts or repository code.
Install Mechanism
This is an instruction-only skill with no install spec or downloaded third-party installer; nothing is written to disk by the skill itself beyond the user's normal interactions (git clone / unzip the repo) — lowest install risk. The risk arises from the runtime actions the instructions recommend, not from any package install step in the skill bundle.
Credentials
The skill requests no environment variables, credentials, or config paths — this is proportionate to its stated purpose. However, because the instructions cause the agent to place local user data into a cloned repository and to execute/deserialize model artifacts, there is an implicit risk to any sensitive data present in the provided data_file or in the working environment.
Persistence & Privilege
The skill is not always-enabled and does not request persistent privileges. It does not modify other skills or request system-wide configuration. Autonomous invocation is enabled by default on the platform but is not combined here with other privileged requests.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install github-experiment-accuracy
- After installation, invoke the skill by name or use
/github-experiment-accuracy
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of github-experiment-accuracy skill.
- Allows users to validate prediction accuracy of any GitHub project using a provided dataset.
- Automates cloning/downloading repositories, preparing data, loading models, and running predictions.
- Calculates and reports key accuracy metrics (MAE, RMSE, MAPE, and overall accuracy).
- Generates a comprehensive experiment report, including step-by-step workflow and results.
- Provides troubleshooting guidance for common issues.
Frequently Asked Questions
What is Github Experiment Accuracy?
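Github Experiment Accuracy is an accuracy validation experiment for GitHub repository projects. Given a GitHub repository URL and a data file, it runs the project, verifies its prediction accuracy, and generates a detailed workflow report and an accuracy report. Use it when: 1) a user supplies a GitHub repository plus a data file for an experiment; 2) you need to verify an algorithm's prediction accuracy on target data; 3) you need a complete experiment report covering both the workflow and the accuracy. It is an AI Agent Skill for Claude Code / OpenClaw, with 112 downloads so far.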
How do I install Github Experiment Accuracy?
Run "/install github-experiment-accuracy" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Github Experiment Accuracy free?
Yes, Github Experiment Accuracy is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Github Experiment Accuracy support?
Github Experiment Accuracy is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Github Experiment Accuracy?
It is built and maintained by Kevinyyc (@kevinyyc); the current version is v1.0.0.