jpengcheng523-netizen

feedback-loop-fine-tuner

Author: jpengcheng523-netizen · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 147
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install jpeng-feedback-loop-fine-tuner
Description
Provides tools for implementing feedback loops to fine-tune LLM agents using user feedback for continuous personalization and improvement, including training...
Usage instructions (SKILL.md)

Feedback Loop Fine-Tuner

Implement feedback loops to fine-tune LLM agents using user feedback for continuous personalization and improvement.

When to Use

  • Collecting user feedback from agent interactions
  • Generating training datasets for fine-tuning
  • Optimizing prompts based on feedback
  • Tracking improvement metrics over time
  • Running A/B tests for prompt variants
  • Implementing RLHF preference learning

Usage

const fineTuner = require('./skills/feedback-loop-fine-tuner');

// Collect feedback
const feedback = fineTuner.collectFeedback({
  conversationId: 'conv_123',
  messageId: 'msg_456',
  query: 'What is machine learning?',
  response: 'Machine learning is...',
  rating: 'positive',
  model: 'llama-3',
  temperature: 0.7
});

// Generate training data (feedbackHistory: an array of collected feedback records)
const dataset = fineTuner.generateTrainingData(feedbackHistory, {
  format: 'openai',
  includeCorrections: true
});

// Optimize prompts
const optimization = fineTuner.optimizePrompts(feedbackHistory, {
  'default': 'You are a helpful assistant.',
  'detailed': 'You are a detailed, thorough assistant.'
});

// Track improvement (beforeMetrics/afterMetrics: metric snapshots, e.g. { qualityScore, positiveRate })
const improvement = fineTuner.trackImprovement(beforeMetrics, afterMetrics);

// Create A/B test
const experiment = fineTuner.createABTest('prompt_test', [
  { name: 'control', template: 'You are helpful.' },
  { name: 'variant', template: 'You are a detailed, helpful assistant.' }
]);

API

collectFeedback(interaction)

Collect feedback from a user interaction.

const feedback = collectFeedback({
  conversationId: 'conv_123',
  messageId: 'msg_456',
  query: 'Explain quantum computing',
  response: 'Quantum computing uses...',
  rating: 'positive', // 'positive', 'negative', 'neutral', 'correction'
  userCorrection: null, // Optional corrected response
  model: 'llama-3-70b',
  temperature: 0.7,
  promptTemplate: 'default',
  responseTime: 1500,
  tokensUsed: 256
});

generateTrainingData(feedbackHistory, options)

Generate fine-tuning dataset from feedback history.

const dataset = generateTrainingData(feedbackHistory, {
  includeNegative: false,
  includeCorrections: true,
  minRating: 'neutral',
  format: 'jsonl' // 'jsonl', 'openai', 'llama', 'alpaca'
});

Supported formats:

  • jsonl: { "prompt": "...", "completion": "..." }
  • openai: { "messages": [{ "role": "user", "content": "..." }, ...] }
  • llama: Llama 3 chat format with special tokens
  • alpaca: { "instruction": "...", "input": "", "output": "..." }
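
To make the four layouts concrete, the sketch below renders one feedback record in each of them. The `toFormat` helper is hypothetical and is not the skill's own code. It assumes a user correction, when present, replaces the original response, and it uses Meta's published Llama 3 chat template. The skill's actual output (system prompts, exact whitespace) may differ.

```javascript
// Hypothetical helper: renders one feedback record in each documented layout.
// Not the skill's implementation; the field handling here is an assumption.
function toFormat(record, format) {
  // Assumption: prefer the user's correction when one was supplied.
  const answer = record.userCorrection || record.response;
  switch (format) {
    case 'jsonl':
      return JSON.stringify({ prompt: record.query, completion: answer });
    case 'openai':
      return JSON.stringify({
        messages: [
          { role: 'user', content: record.query },
          { role: 'assistant', content: answer }
        ]
      });
    case 'alpaca':
      return JSON.stringify({ instruction: record.query, input: '', output: answer });
    case 'llama':
      // Llama 3 chat template with its special tokens.
      return '<|begin_of_text|>' +
        '<|start_header_id|>user<|end_header_id|>\n\n' + record.query + '<|eot_id|>' +
        '<|start_header_id|>assistant<|end_header_id|>\n\n' + answer + '<|eot_id|>';
    default:
      throw new Error(`Unsupported format: ${format}`);
  }
}

const sampleRecord = { query: 'What is DL?', response: 'DL is...', userCorrection: null };
for (const format of ['jsonl', 'openai', 'alpaca', 'llama']) {
  console.log(format, '=>', toFormat(sampleRecord, format));
}
```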

optimizePrompts(feedbackHistory, templates)

Optimize prompts based on feedback analysis.

const result = optimizePrompts(feedbackHistory, {
  'concise': 'Answer briefly.',
  'detailed': 'Answer with full details.',
  'friendly': 'Answer in a friendly tone.'
});

console.log(result.bestTemplate); // 'detailed'
console.log(result.suggestions); // [{ type: 'length', suggestion: '...' }]
console.log(result.optimizedVariant); // Optimized prompt template

trackImprovement(before, after)

Track improvement metrics between two snapshots.

const improvement = trackImprovement(
  { qualityScore: 0.65, positiveRate: 0.70 },
  { qualityScore: 0.82, positiveRate: 0.85 }
);

console.log(improvement.qualityScore);
// { baseline: 0.65, current: 0.82, change: 0.17, percentChange: 26.15, improved: true }
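
The fields in that result follow from simple arithmetic on the two snapshots. The sketch below reproduces it for a single metric; `compareMetric` is a hypothetical name, and the two-decimal rounding and the "higher is better" rule are assumptions made for illustration rather than confirmed behavior of `trackImprovement`.

```javascript
// Hypothetical per-metric comparison, for illustration only.
function compareMetric(baseline, current) {
  const round2 = (x) => Math.round(x * 100) / 100;
  const change = current - baseline;
  // Guard against division by zero when the baseline is 0.
  const percentChange = baseline === 0 ? null : round2((change / baseline) * 100);
  return {
    baseline,
    current,
    change: round2(change),
    percentChange,
    // Assumption: higher is better. That would be wrong for avgResponseTime.
    improved: change > 0
  };
}

console.log(compareMetric(0.65, 0.82));
// { baseline: 0.65, current: 0.82, change: 0.17, percentChange: 26.15, improved: true }
```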

generateImprovementReport(metricsHistory)

Generate comprehensive improvement report.

const report = generateImprovementReport([
  { qualityScore: 0.65, positiveRate: 0.70 },
  { qualityScore: 0.72, positiveRate: 0.75 },
  { qualityScore: 0.82, positiveRate: 0.85 }
]);

console.log(report.trends.qualityScore.direction); // 'improving'
console.log(report.summary.latestQualityScore); // 0.82
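
One plausible way to derive a trend direction from a metric history is the slope of a least-squares line through the snapshots. The sketch below is an assumption about how such a label could be computed, not the skill's algorithm; the `trendDirection` name, the `tolerance` parameter, and the 'declining' and 'stable' labels are all hypothetical (only 'improving' appears in the example above).

```javascript
// Hypothetical trend classifier based on a least-squares slope.
// Snapshot index (0, 1, 2, ...) is used as the x-axis.
function trendDirection(values, tolerance = 0.01) {
  const n = values.length;
  if (n < 2) return 'stable'; // Not enough points to fit a line
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, v) => sum + v, 0) / n;
  let numerator = 0;
  let denominator = 0;
  values.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  const slope = numerator / denominator;
  if (slope > tolerance) return 'improving';
  if (slope < -tolerance) return 'declining';
  return 'stable';
}

console.log(trendDirection([0.65, 0.72, 0.82])); // 'improving'
```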

createABTest(name, variants, config)

Create an A/B test experiment for prompt variants.

const experiment = createABTest('tone_test', [
  { name: 'formal', template: 'You are a formal assistant.' },
  { name: 'casual', template: 'You are a friendly, casual assistant.' }
], {
  trafficSplit: [0.5, 0.5],
  minSamples: 100,
  confidenceLevel: 0.95
});
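
A `trafficSplit` such as `[0.5, 0.5]` is typically applied as a weighted random choice over the variants. The sketch below shows one way to do that; `pickVariant` is a hypothetical helper, and the skill's own assignment logic may differ. The random source is injectable so the behavior can be checked deterministically.

```javascript
// Hypothetical weighted assignment over a traffic split.
// Returns the index of the chosen variant.
function pickVariant(trafficSplit, rand = Math.random) {
  const total = trafficSplit.reduce((sum, weight) => sum + weight, 0);
  let threshold = rand() * total;
  for (let i = 0; i < trafficSplit.length; i++) {
    threshold -= trafficSplit[i];
    if (threshold < 0) return i;
  }
  return trafficSplit.length - 1; // Fallback for floating-point edge cases
}

// Tally 10,000 simulated assignments for a 50/50 split.
const counts = [0, 0];
for (let i = 0; i < 10000; i++) counts[pickVariant([0.5, 0.5])]++;
console.log('Assignments per variant:', counts);
```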

Classes

FeedbackCollector

Collect and aggregate user feedback.

const collector = new FeedbackCollector();

// Collect individual feedback
const fb = collector.collectFeedback(interaction);

// Batch collect
collector.batchCollect(interactions);

// Aggregate by category
const aggregation = collector.aggregateByCategory({
  start: Date.now() - 7 * 24 * 60 * 60 * 1000, // Last 7 days
  end: Date.now()
});

// Export for analysis
const csv = collector.exportFeedback('csv');

TrainingDatasetGenerator

Generate fine-tuning datasets from feedback.

const generator = new TrainingDatasetGenerator();

// Generate training data
const dataset = generator.generateTrainingData(feedbackHistory, { format: 'openai' });

// Generate preference pairs for RLHF
const pairs = generator.generatePreferencePairs(feedbackHistory);

// Split into train/validation
const { train, validation } = generator.splitDataset(examples, 0.8);
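
For readers unfamiliar with the train/validation step, a split at ratio 0.8 usually means shuffling the examples and cutting at 80%. The sketch below illustrates that idea with a Fisher-Yates shuffle; `splitExamples` is a hypothetical stand-in, and whether the skill's `splitDataset` shuffles at all is not stated in this document.

```javascript
// Hypothetical train/validation split: shuffle a copy, then cut at trainRatio.
function splitExamples(examples, trainRatio = 0.8, rand = Math.random) {
  const shuffled = examples.slice(); // Leave the caller's array untouched
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const cut = Math.floor(shuffled.length * trainRatio);
  return { train: shuffled.slice(0, cut), validation: shuffled.slice(cut) };
}

const sampleExamples = ['a', 'b', 'c', 'd', 'e'];
const sampleSplit = splitExamples(sampleExamples, 0.8);
console.log(sampleSplit.train.length, sampleSplit.validation.length); // 4 1
```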

PromptOptimizer

Optimize prompts based on feedback.

const optimizer = new PromptOptimizer();

// Register templates
optimizer.registerTemplate('default', 'You are helpful.');
optimizer.registerTemplate('detailed', 'You are detailed and thorough.');

// Update performance
optimizer.updatePerformance('default', feedback);

// Get best template
const best = optimizer.getBestTemplate();

// Get improvement suggestions
const suggestions = optimizer.suggestImprovements(feedbackHistory, 'default');

// Generate optimized variant
const variant = optimizer.generateVariant('default', suggestions);

ImprovementTracker

Track improvement metrics over time.

const tracker = new ImprovementTracker();

// Set baseline
tracker.setBaseline('initial', { qualityScore: 0.5 });

// Record snapshots
tracker.recordSnapshot({ qualityScore: 0.6 });
tracker.recordSnapshot({ qualityScore: 0.7 });

// Calculate improvement
const improvement = tracker.calculateImprovement({ qualityScore: 0.8 }, 'initial');

// Get trend
const trend = tracker.getTrend('qualityScore', 10);

// Generate report
const report = tracker.generateReport();

ABTester

Run A/B tests for prompt variants.

const tester = new ABTester();

// Create experiment
tester.createExperiment('tone_test', [
  { name: 'formal', template: 'Be formal.' },
  { name: 'casual', template: 'Be casual.' }
]);

// Assign variant
const variant = tester.assignVariant('tone_test');

// Record result
tester.recordResult('tone_test', variant.variantIndex, {
  rating: 'positive',
  responseTime: 1200
});

// Analyze results
const analysis = tester.analyzeResults('tone_test');

// Stop experiment
tester.stopExperiment('tone_test');
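
The Notes describe the built-in significance check as simplified. For comparison, a standard two-proportion z-test on positive-rating rates could look like the sketch below. The function names are hypothetical and this is not what `analyzeResults` is confirmed to do; 1.96 is the two-sided critical value matching a `confidenceLevel` of 0.95.

```javascript
// Two-proportion z-test comparing the positive-rating rate of two variants.
// Illustrative only; use a statistics library for production analysis.
function twoProportionZ(successA, totalA, successB, totalB) {
  const rateA = successA / totalA;
  const rateB = successB / totalB;
  const pooled = (successA + successB) / (totalA + totalB);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  // With no variance (all successes or all failures) there is nothing to test.
  return standardError === 0 ? 0 : (rateB - rateA) / standardError;
}

// 1.96 is the two-sided critical value for a 95% confidence level.
function isSignificant(z, critical = 1.96) {
  return Math.abs(z) >= critical;
}

// 50/100 positive for the control, 65/100 positive for the variant.
const z = twoProportionZ(50, 100, 65, 100);
console.log(z.toFixed(2), isSignificant(z));
```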

Example: Complete Feedback Loop

const fineTuner = require('./skills/feedback-loop-fine-tuner');

// 1. Initialize components
const collector = new fineTuner.FeedbackCollector();
const generator = new fineTuner.TrainingDatasetGenerator();
const optimizer = new fineTuner.PromptOptimizer();
const tracker = new fineTuner.ImprovementTracker();

// 2. Register prompt templates
optimizer.registerTemplate('v1', 'You are a helpful assistant.');
optimizer.registerTemplate('v2', 'You are a detailed, helpful assistant.');

// 3. Set baseline
tracker.setBaseline('initial', {
  qualityScore: 0.5,
  positiveRate: 0.5,
  avgResponseTime: 2000
});

// 4. Collect feedback (simulated)
const interactions = [
  { conversationId: 'c1', query: 'What is AI?', response: 'AI is...', rating: 'positive' },
  { conversationId: 'c2', query: 'Explain ML', response: 'ML is...', rating: 'negative' },
  { conversationId: 'c3', query: 'What is DL?', response: 'DL is...', rating: 'positive', userCorrection: 'Deep learning is a subset of ML that uses neural networks...' }
];

for (const interaction of interactions) {
  const feedback = collector.collectFeedback(interaction);
  optimizer.updatePerformance(interaction.promptTemplate || 'v1', feedback);
}

// 5. Generate training data
const feedbackHistory = collector.feedbackStore;
const trainingData = generator.generateTrainingData(feedbackHistory, {
  format: 'openai',
  includeCorrections: true
});

console.log('Training examples:', trainingData.split('\n').length);

// 6. Optimize prompts
const optimization = fineTuner.optimizePrompts(feedbackHistory, {
  'v1': 'You are a helpful assistant.',
  'v2': 'You are a detailed, helpful assistant.'
});

console.log('Best template:', optimization.bestTemplate);
console.log('Suggestions:', optimization.suggestions);

// 7. Track improvement
const aggregation = collector.aggregateByCategory();
tracker.recordSnapshot({
  qualityScore: aggregation.qualityScore,
  positiveRate: aggregation.byRating.positive?.length / aggregation.total || 0,
  avgResponseTime: aggregation.avgResponseTime,
  totalFeedback: aggregation.total
});

const report = tracker.generateReport();
console.log('Improvement trend:', report.trends.qualityScore?.direction);

Example: RLHF Preference Learning

const fineTuner = require('./skills/feedback-loop-fine-tuner');
const generator = new fineTuner.TrainingDatasetGenerator();

// Collect feedback with comparisons
const feedbackHistory = [
  { query: 'Explain AI', rating: 'positive', response: 'AI is artificial intelligence...' },
  { query: 'Explain AI', rating: 'negative', response: 'AI means artificial intelligence.' }
];

// Generate preference pairs
const pairs = generator.generatePreferencePairs(feedbackHistory);

console.log('Preference pairs:');
for (const pair of pairs) {
  console.log(`Prompt: ${pair.prompt}`);
  console.log(`Chosen: ${pair.chosen.substring(0, 50)}...`);
  console.log(`Rejected: ${pair.rejected.substring(0, 50)}...`);
}
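
To show what pairing positive and negative feedback involves, the sketch below groups records by query and emits one pair per (positive, negative) combination. `buildPreferencePairs` is a hypothetical helper, and it treats queries as "similar" only when they match after trimming and lowercasing; the skill's actual similarity rule is not documented here.

```javascript
// Hypothetical pairing logic, not the skill's implementation.
// Assumption: queries are "similar" only if equal after trim + lowercase.
function buildPreferencePairs(feedbackHistory) {
  const groups = new Map();
  for (const fb of feedbackHistory) {
    const key = fb.query.trim().toLowerCase();
    if (!groups.has(key)) {
      groups.set(key, { prompt: fb.query, positive: [], negative: [] });
    }
    const group = groups.get(key);
    if (fb.rating === 'positive') group.positive.push(fb.response);
    if (fb.rating === 'negative') group.negative.push(fb.response);
  }

  const pairs = [];
  for (const { prompt, positive, negative } of groups.values()) {
    for (const chosen of positive) {
      for (const rejected of negative) {
        pairs.push({ prompt, chosen, rejected });
      }
    }
  }
  return pairs;
}

const sampleHistory = [
  { query: 'Explain AI', rating: 'positive', response: 'AI is artificial intelligence...' },
  { query: 'Explain AI', rating: 'negative', response: 'AI means artificial intelligence.' },
  { query: 'Explain ML', rating: 'positive', response: 'ML is...' } // No negative: no pair
];
console.log(buildPreferencePairs(sampleHistory));
```

Queries with only positive or only negative feedback yield no pairs in this sketch, which is why a preference dataset is often much smaller than the raw feedback history.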

Notes

  • Feedback ratings: 'positive', 'negative', 'neutral', 'correction'
  • User corrections are treated as high-quality training examples
  • Preference pairs are generated from positive/negative feedback on similar queries
  • A/B testing uses simplified statistical significance (use proper libraries for production)
  • Training data formats support OpenAI, Llama 3, and Alpaca fine-tuning
  • All metrics are calculated locally without external dependencies
Security recommendations
This skill appears to do what it says: local collection, analysis, and formatting of user feedback for dataset preparation. Before installing or using it, consider: (1) Privacy — the skill will aggregate user interactions and can export datasets (JSON/CSV/jsonl) that may include PII or sensitive conversation content; ensure you filter or redact data before training or sharing. (2) Scope — the module prepares data but does not perform model training or upload to external services, so plan how/where you'll run fine-tuning or RLHF steps. (3) Code review — although included code shows no network calls or secret access, review the full (non-truncated) index.js to confirm there are no hidden endpoints or telemetry. (4) Test in a sandboxed environment and enforce policies about what feedback may be captured (e.g., do not collect credentials). If you need automatic cloud training integrations, prefer a skill that explicitly requests and documents the required credentials and endpoints.
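As a starting point for the privacy recommendation, feedback records can be passed through a redaction step before they are exported or turned into training data. The sketch below is a minimal illustration under stated assumptions: `redact` and `redactFeedback` are hypothetical helpers, and the two patterns are deliberately simple, so they will miss many kinds of PII and may also mask harmless digit sequences such as dates.

```javascript
// Hypothetical redaction pass; the patterns are simplistic by design.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

function redact(text) {
  return text.replace(EMAIL_PATTERN, '[EMAIL]').replace(PHONE_PATTERN, '[PHONE]');
}

// Returns a redacted copy of a feedback record without mutating the original.
function redactFeedback(fb) {
  return {
    ...fb,
    query: redact(fb.query),
    response: redact(fb.response),
    userCorrection: fb.userCorrection ? redact(fb.userCorrection) : fb.userCorrection
  };
}

console.log(redact('Mail me at jane.doe@example.com or call +1 (555) 123-4567.'));
// Mail me at [EMAIL] or call [PHONE].
```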
Functional analysis
Type: OpenClaw Skill
Name: jpeng-feedback-loop-fine-tuner
Version: 1.0.0
The skill provides a comprehensive set of tools for managing LLM feedback loops, including feedback collection, training dataset generation (JSONL, OpenAI, Llama formats), and prompt optimization. Analysis of index.js and SKILL.md reveals no network activity, filesystem access, or use of dangerous functions like eval or exec. The code logic is transparent, aligns with the stated purpose, and contains no indicators of malicious intent or prompt injection vulnerabilities.
Capability assessment
Purpose & Capability
The name/description (feedback-loop fine-tuner) matches the included SKILL.md and index.js: the code implements feedback collection, aggregation, dataset generation (jsonl/openai/llama/alpaca), preference-pair generation, prompt optimization, and metrics tracking. One note: the skill describes 'fine-tuning' and 'RLHF' workflows but the implementation focuses on data preparation and analysis (no built-in training calls or cloud upload). That is a legitimate design choice for a local library, but users expecting automated model training integrations should not assume those are present.
Instruction Scope
SKILL.md instructions are narrowly scoped to collecting feedback, generating datasets, optimizing prompts, tracking metrics, and running A/B tests. They do not instruct reading arbitrary system files, contacting external endpoints, or accessing environment variables beyond what the module exposes. The example usage assumes requiring the module from a local path, which is normal for a Node library.
Install Mechanism
No install spec is provided (instruction-only plus a local index.js), so nothing will be downloaded or installed by the platform. The package.json is minimal and the code is included in the bundle. This is low-risk from an install/execution vector perspective.
Credentials
The skill declares no required environment variables, credentials, or config paths and the code does not reference process.env or external secrets. That matches the stated purpose (local data processing) and is proportionate.
Persistence & Privilege
The skill does not request always:true or other privileged persistent presence. It keeps feedback in an in-memory store (feedbackStore) and provides export functions; it does not modify other skills or system-wide agent settings. Autonomous invocation is allowed by platform default but there's no additional persistence or privilege escalation requested by the skill.
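How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install jpeng-feedback-loop-fine-tuner
  3. Once installation finishes, invoke the Skill by name or trigger it with /jpeng-feedback-loop-fine-tuner
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.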
Version history
v1.0.0
Initial release of Feedback Loop Fine-Tuner.
  • Introduces tools for collecting and aggregating user feedback to improve LLM agents.
  • Supports generation of fine-tuning datasets from feedback history in multiple formats.
  • Enables prompt optimization using feedback data and analysis.
  • Provides improvement tracking and reporting functionality over time.
  • Adds A/B testing for prompt template variants with experiment management.
  • Includes modular classes: FeedbackCollector, TrainingDatasetGenerator, PromptOptimizer, ImprovementTracker, and ABTester.
Metadata
Slug: jpeng-feedback-loop-fine-tuner
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
FAQ

What is feedback-loop-fine-tuner?

Provides tools for implementing feedback loops to fine-tune LLM agents using user feedback for continuous personalization and improvement, including training... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 147 total downloads to date.

How do I install feedback-loop-fine-tuner?

Run the command "/install jpeng-feedback-loop-fine-tuner" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is feedback-loop-fine-tuner free?

Yes. feedback-loop-fine-tuner is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does feedback-loop-fine-tuner support?

feedback-loop-fine-tuner is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops feedback-loop-fine-tuner?

It is developed and maintained by jpengcheng523-netizen (@jpengcheng523-netizen); the current version is v1.0.0.
