← 返回 Skills 市场
jpengcheng523-netizen

agent-metrics-monitor

作者 jpengcheng523-netizen · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ 安全检测通过
168
总下载
0
收藏
1
当前安装
1
版本数
在 OpenClaw 中安装
/install jpeng-agent-metrics-monitor
功能描述
Provides monitoring and alerting for agent abnormal behavior metrics with Prometheus and Grafana support, including P99 latency, error rates, anomaly detecti...
使用说明 (SKILL.md)

Agent Metrics Monitor

Monitor and alert agent abnormal behavior metrics with Prometheus and Grafana support.

When to Use

  • Monitoring agent operation latencies (P50, P95, P99)
  • Tracking error rates and success rates
  • Detecting anomalies in agent behavior
  • Generating Prometheus-compatible metrics
  • Creating Grafana dashboard configurations
  • Setting up alert rules for abnormal behavior

Usage

const monitor = require('./skills/agent-metrics-monitor');

// Create metrics collector
const collector = monitor.createMetricsCollector({ serviceName: 'my-agent' });

// Record latency
collector.recordLatency('tool_call', 150);
collector.recordLatency('tool_call', 250);

// Record errors and successes
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

// Get error rate
console.log('Error rate:', collector.getErrorRate('tool_call'));

// Export Prometheus format
console.log(collector.exportPrometheus());

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard({ title: 'My Agent' });

API

createMetricsCollector(options)

Create a metrics collector instance.

const collector = monitor.createMetricsCollector({
  serviceName: 'my-agent',
  prefix: 'agent',
  timeSeries: {
    maxPoints: 10000,
    retentionMs: 86400000 // 24 hours
  }
});

createHistogram(options)

Create a histogram for latency tracking.

const hist = monitor.createHistogram({
  buckets: [1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000],
  maxValues: 10000
});

hist.observe(150);
console.log('P99:', hist.p99());
console.log('P95:', hist.p95());
console.log('P50:', hist.p50());

createCounter(name, labels)

Create a counter for tracking occurrences.

const counter = monitor.createCounter('requests_total', { service: 'api' });
counter.inc();
counter.inc(5);
console.log(counter.get()); // 6

createGauge(name, labels)

Create a gauge for point-in-time values.

const gauge = monitor.createGauge('active_connections', { host: 'localhost' });
gauge.set(10);
gauge.inc();
gauge.dec();
console.log(gauge.get()); // 10

createAlertRule(options)

Create an alert rule.

const rule = monitor.createAlertRule({
  name: 'high_error_rate',
  metric: 'error_rate',
  condition: 'gt', // 'gt', 'lt', 'eq', 'gte', 'lte'
  threshold: 0.05,
  duration: 60000, // 1 minute
  severity: 'warning', // 'info', 'warning', 'critical'
  message: 'Error rate exceeds 5%'
});

createAnomalyDetector(options)

Create an anomaly detector.

const detector = monitor.createAnomalyDetector({
  windowSize: 100,
  zScoreThreshold: 3
});

const result = detector.check('latency', 500);
console.log(result.anomaly); // true/false
console.log(result.zScore); // z-score value

quickMonitor(serviceName, operations)

Create a simple monitoring setup with default alert rules.

const collector = monitor.quickMonitor('my-agent');
// Pre-configured with high_error_rate and high_p99_latency alerts

Classes

Histogram

Track latency percentiles.

const hist = new monitor.Histogram({ buckets: [10, 50, 100, 500, 1000] });

hist.observe(150);
hist.observe(250);
hist.observe(350);

const stats = hist.getStats();
// {
//   count: 3,
//   sum: 750,
//   mean: 250,
//   p50: 250,
//   p95: 350,
//   p99: 350,
//   buckets: [...]
// }

Counter

Monotonically increasing value.

const counter = new monitor.Counter('requests', { service: 'api' });
counter.inc();
counter.inc(10);
console.log(counter.get()); // 11
counter.reset();
console.log(counter.get()); // 0

Gauge

Point-in-time value.

const gauge = new monitor.Gauge('temperature');
gauge.set(25);
gauge.inc(2);
gauge.dec(1);
console.log(gauge.get()); // 26

AlertRule

Define alert conditions.

const rule = new monitor.AlertRule({
  name: 'high_latency',
  metric: 'latency_p99',
  condition: 'gt',
  threshold: 1000,
  duration: 60000,
  severity: 'warning',
  message: 'P99 latency exceeds 1 second'
});

const result = rule.evaluate(1500);
// {
//   name: 'high_latency',
//   state: 'firing', // 'inactive', 'pending', 'firing'
//   value: 1500,
//   threshold: 1000,
//   severity: 'warning',
//   message: 'P99 latency exceeds 1 second'
// }

MetricsCollector

Collect and aggregate metrics.

const collector = new monitor.MetricsCollector({ serviceName: 'agent' });

// Record operations
collector.recordLatency('tool_call', 150);
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

// Get rates
const errorRate = collector.getErrorRate('tool_call');
const successRate = collector.getSuccessRate('tool_call');

// Add alert rules
collector.addAlertRule({
  name: 'high_error_rate',
  metric: 'tool_call_errors_total',
  condition: 'gt',
  threshold: 10,
  severity: 'warning'
});

// Evaluate alerts
const alerts = collector.evaluateAlerts();

// Export Prometheus format
const prometheus = collector.exportPrometheus();

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard();

// Get summary
const summary = collector.getSummary();

AnomalyDetector

Detect anomalies using z-score.

const detector = new monitor.AnomalyDetector({
  windowSize: 100,
  zScoreThreshold: 3
});

// Feed values
for (let i = 0; i \x3C 50; i++) {
  detector.check('latency', 100 + Math.random() * 50);
}

// Check for anomaly
const result = detector.check('latency', 500); // Unusual value
console.log(result.anomaly); // true if z-score > 3

// Get baseline
const baseline = detector.getBaseline('latency');
// { mean: 125, stdDev: 14.4, min: 100, max: 150, count: 51 }

Example: Complete Monitoring Setup

const monitor = require('./skills/agent-metrics-monitor');

// Create collector
const collector = monitor.createMetricsCollector({
  serviceName: 'production-agent',
  prefix: 'agent'
});

// Add alert rules
collector.addAlertRule({
  name: 'high_p99_latency',
  metric: 'tool_call_latency',
  condition: 'gt',
  threshold: 2000,
  duration: 60000,
  severity: 'critical',
  message: 'P99 latency exceeds 2 seconds'
});

collector.addAlertRule({
  name: 'high_error_rate',
  metric: 'tool_call_errors_total',
  condition: 'gt',
  threshold: 100,
  duration: 300000, // 5 minutes
  severity: 'warning',
  message: 'More than 100 errors in 5 minutes'
});

// Simulate operations
const operations = ['tool_call', 'llm_request', 'memory_access'];

for (let i = 0; i \x3C 100; i++) {
  const op = operations[i % 3];
  const latency = 50 + Math.random() * 200;
  
  collector.recordLatency(op, latency);
  
  if (Math.random() \x3C 0.05) {
    collector.recordError(op, 'timeout');
  } else {
    collector.recordSuccess(op);
  }
}

// Get metrics
console.log('Tool call error rate:', collector.getErrorRate('tool_call'));
console.log('LLM request P99:', collector.histogram('llm_request_latency').p99());

// Evaluate alerts
const alerts = collector.evaluateAlerts();
for (const alert of alerts) {
  if (alert.state === 'firing') {
    console.log(`ALERT: ${alert.name} - ${alert.message}`);
  }
}

// Export Prometheus format
console.log('\
--- Prometheus Metrics ---');
console.log(collector.exportPrometheus());

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard({
  title: 'Production Agent Dashboard'
});
console.log('\
--- Grafana Dashboard ---');
console.log(JSON.stringify(dashboard, null, 2));

Example: Anomaly Detection

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({ serviceName: 'agent' });
const detector = monitor.createAnomalyDetector({ zScoreThreshold: 2.5 });

// Train with normal values
console.log('Training with normal values...');
for (let i = 0; i \x3C 100; i++) {
  const latency = 100 + Math.random() * 50; // 100-150ms
  collector.recordLatency('api_call', latency);
  detector.check('api_call_latency', latency);
}

// Get baseline
const baseline = detector.getBaseline('api_call_latency');
console.log('Baseline:', baseline);

// Test with anomalies
console.log('\
Testing for anomalies...');
const testValues = [120, 135, 500, 1000, 125];

for (const value of testValues) {
  const result = detector.check('api_call_latency', value);
  console.log(`Value: ${value}ms, Anomaly: ${result.anomaly}, Z-Score: ${result.zScore?.toFixed(2)}`);
}

Example: Prometheus Export

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({
  serviceName: 'my-agent',
  prefix: 'agent'
});

// Record some metrics
collector.recordLatency('tool_call', 150);
collector.recordLatency('tool_call', 250);
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

const gauge = collector.gauge('active_sessions', { region: 'us-east' });
gauge.set(42);

// Export Prometheus format
const prometheus = collector.exportPrometheus();
console.log(prometheus);

// Output:
// # TYPE agent_tool_call_errors_total counter
// agent_tool_call_errors_total{error_type="timeout"} 1
// # TYPE agent_tool_call_success_total counter
// agent_tool_call_success_total 1
// # TYPE agent_tool_call_total counter
// agent_tool_call_total 2
// # TYPE agent_active_sessions gauge
// agent_active_sessions{region="us-east"} 42
// # TYPE agent_tool_call_latency histogram
// agent_tool_call_latency_bucket{le="1"} 0
// ...

Example: Grafana Dashboard Generation

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({ serviceName: 'api-agent' });

// Generate dashboard configuration
const dashboard = collector.generateGrafanaDashboard({
  title: 'API Agent Metrics',
  uid: 'api-agent-metrics'
});

// Save to file for Grafana provisioning
const fs = require('fs');
fs.writeFileSync('grafana-dashboard.json', JSON.stringify(dashboard, null, 2));

console.log('Dashboard generated with panels:');
for (const panel of dashboard.dashboard.panels) {
  console.log(`  - ${panel.title} (${panel.type})`);
}

Alert Rule Conditions

  • gt - Greater than
  • lt - Less than
  • eq - Equal to
  • gte - Greater than or equal
  • lte - Less than or equal

Alert Severities

  • info - Informational
  • warning - Warning condition
  • critical - Critical condition requiring immediate attention

Notes

  • Histograms use bucket-based storage for Prometheus compatibility
  • Percentiles are calculated from stored values for accuracy
  • Time series data has configurable retention
  • Anomaly detection uses z-score method
  • Grafana dashboards are generated in JSON format for provisioning
  • Prometheus export follows standard exposition format
安全使用建议
This skill appears internally consistent with its stated purpose, but it includes executable JavaScript (index.js). Before installing: (1) inspect the complete index.js yourself (or ask the author) to confirm there are no network calls, telemetry, shell execs, or file writes you don't expect; (2) run it in a sandbox or test agent with no sensitive credentials; (3) prefer code from a known publisher or pin a vetted package version; (4) if you need stronger assurance, request a full code review or ask the publisher for explanation of any non-obvious behavior. My confidence is medium because the index.js content provided in the prompt was truncated near the end, so the final sections of the file were not visible for review.
功能分析
Type: OpenClaw Skill Name: jpeng-agent-metrics-monitor Version: 1.0.0 The agent-metrics-monitor skill is a legitimate utility for tracking performance metrics like latency and error rates. The code (index.js) is self-contained, using in-memory data structures to calculate percentiles and detect anomalies without any external network calls, file system access, or risky execution patterns. The SKILL.md documentation is purely descriptive and contains no prompt-injection attempts or malicious instructions.
能力评估
Purpose & Capability
Name and description match the code and SKILL.md: the project contains APIs for histograms, counters, gauges, time-series storage, alert rules, and Prometheus/Grafana export. No unrelated binaries, credentials, or config paths are requested.
Instruction Scope
SKILL.md instructions are scoped to creating and using the monitoring APIs, generating Prometheus output and Grafana dashboards, and configuring alert rules. The instructions do not ask the agent to read arbitrary files, access secrets, or send data to external endpoints.
Install Mechanism
No install spec is declared (instruction-only style), but the package includes index.js and package.json — the skill will execute bundled JavaScript when required by the agent. No external downloads, installers, or archives are used.
Credentials
The skill declares no required environment variables, credentials, or config paths. Its behavior appears to be in-memory metrics collection and local export; there are no disproportionate secret requests.
Persistence & Privilege
Flags show always:false and normal agent invocation. The code appears to use in-memory data structures and does not declare writing to agent configs or requesting persistent system privileges in the provided portion.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install jpeng-agent-metrics-monitor
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /jpeng-agent-metrics-monitor 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Initial release of agent-metrics-monitor. - Provides agent behavior metrics monitoring and alerting with Prometheus and Grafana support. - Supports latency tracking (P50, P95, P99), error rates, anomaly detection, and custom alert rules. - Includes utilities for creating counters, gauges, histograms, alert rules, and anomaly detectors. - Offers API for exporting Prometheus metrics and generating Grafana dashboards. - Example usage and class APIs for rapid instrumentation and visualization.
元数据
Slug jpeng-agent-metrics-monitor
版本 1.0.0
许可证 MIT-0
累计安装 1
当前安装数 1
历史版本数 1
常见问题

agent-metrics-monitor 是什么?

Provides monitoring and alerting for agent abnormal behavior metrics with Prometheus and Grafana support, including P99 latency, error rates, anomaly detecti... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 168 次。

如何安装 agent-metrics-monitor?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install jpeng-agent-metrics-monitor」即可一键安装,无需额外配置。

agent-metrics-monitor 是免费的吗?

是的,agent-metrics-monitor 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

agent-metrics-monitor 支持哪些平台?

agent-metrics-monitor 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 agent-metrics-monitor?

由 jpengcheng523-netizen(@jpengcheng523-netizen)开发并维护,当前版本 v1.0.0。

💬 留言讨论