← Back to Skills Marketplace
jpengcheng523-netizen

agent-metrics-monitor

by jpengcheng523-netizen · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
168
Downloads
0
Stars
1
Active Installs
1
Versions
Install in OpenClaw
/install jpeng-agent-metrics-monitor
Description
Provides monitoring and alerting for agent abnormal behavior metrics with Prometheus and Grafana support, including P99 latency, error rates, anomaly detecti...
README (SKILL.md)

Agent Metrics Monitor

Monitor and alert agent abnormal behavior metrics with Prometheus and Grafana support.

When to Use

  • Monitoring agent operation latencies (P50, P95, P99)
  • Tracking error rates and success rates
  • Detecting anomalies in agent behavior
  • Generating Prometheus-compatible metrics
  • Creating Grafana dashboard configurations
  • Setting up alert rules for abnormal behavior

Usage

const monitor = require('./skills/agent-metrics-monitor');

// Create metrics collector
const collector = monitor.createMetricsCollector({ serviceName: 'my-agent' });

// Record latency
collector.recordLatency('tool_call', 150);
collector.recordLatency('tool_call', 250);

// Record errors and successes
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

// Get error rate
console.log('Error rate:', collector.getErrorRate('tool_call'));

// Export Prometheus format
console.log(collector.exportPrometheus());

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard({ title: 'My Agent' });

API

createMetricsCollector(options)

Create a metrics collector instance.

const collector = monitor.createMetricsCollector({
  serviceName: 'my-agent',
  prefix: 'agent',
  timeSeries: {
    maxPoints: 10000,
    retentionMs: 86400000 // 24 hours
  }
});

createHistogram(options)

Create a histogram for latency tracking.

const hist = monitor.createHistogram({
  buckets: [1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000],
  maxValues: 10000
});

hist.observe(150);
console.log('P99:', hist.p99());
console.log('P95:', hist.p95());
console.log('P50:', hist.p50());

createCounter(name, labels)

Create a counter for tracking occurrences.

const counter = monitor.createCounter('requests_total', { service: 'api' });
counter.inc();
counter.inc(5);
console.log(counter.get()); // 6

createGauge(name, labels)

Create a gauge for point-in-time values.

const gauge = monitor.createGauge('active_connections', { host: 'localhost' });
gauge.set(10);
gauge.inc();
gauge.dec();
console.log(gauge.get()); // 10

createAlertRule(options)

Create an alert rule.

const rule = monitor.createAlertRule({
  name: 'high_error_rate',
  metric: 'error_rate',
  condition: 'gt', // 'gt', 'lt', 'eq', 'gte', 'lte'
  threshold: 0.05,
  duration: 60000, // 1 minute
  severity: 'warning', // 'info', 'warning', 'critical'
  message: 'Error rate exceeds 5%'
});

createAnomalyDetector(options)

Create an anomaly detector.

const detector = monitor.createAnomalyDetector({
  windowSize: 100,
  zScoreThreshold: 3
});

const result = detector.check('latency', 500);
console.log(result.anomaly); // true/false
console.log(result.zScore); // z-score value

quickMonitor(serviceName, operations)

Create a simple monitoring setup with default alert rules.

const collector = monitor.quickMonitor('my-agent');
// Pre-configured with high_error_rate and high_p99_latency alerts

Classes

Histogram

Track latency percentiles.

const hist = new monitor.Histogram({ buckets: [10, 50, 100, 500, 1000] });

hist.observe(150);
hist.observe(250);
hist.observe(350);

const stats = hist.getStats();
// {
//   count: 3,
//   sum: 750,
//   mean: 250,
//   p50: 250,
//   p95: 350,
//   p99: 350,
//   buckets: [...]
// }

Counter

Monotonically increasing value.

const counter = new monitor.Counter('requests', { service: 'api' });
counter.inc();
counter.inc(10);
console.log(counter.get()); // 11
counter.reset();
console.log(counter.get()); // 0

Gauge

Point-in-time value.

const gauge = new monitor.Gauge('temperature');
gauge.set(25);
gauge.inc(2);
gauge.dec(1);
console.log(gauge.get()); // 26

AlertRule

Define alert conditions.

const rule = new monitor.AlertRule({
  name: 'high_latency',
  metric: 'latency_p99',
  condition: 'gt',
  threshold: 1000,
  duration: 60000,
  severity: 'warning',
  message: 'P99 latency exceeds 1 second'
});

const result = rule.evaluate(1500);
// {
//   name: 'high_latency',
//   state: 'firing', // 'inactive', 'pending', 'firing'
//   value: 1500,
//   threshold: 1000,
//   severity: 'warning',
//   message: 'P99 latency exceeds 1 second'
// }

MetricsCollector

Collect and aggregate metrics.

const collector = new monitor.MetricsCollector({ serviceName: 'agent' });

// Record operations
collector.recordLatency('tool_call', 150);
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

// Get rates
const errorRate = collector.getErrorRate('tool_call');
const successRate = collector.getSuccessRate('tool_call');

// Add alert rules
collector.addAlertRule({
  name: 'high_error_rate',
  metric: 'tool_call_errors_total',
  condition: 'gt',
  threshold: 10,
  severity: 'warning'
});

// Evaluate alerts
const alerts = collector.evaluateAlerts();

// Export Prometheus format
const prometheus = collector.exportPrometheus();

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard();

// Get summary
const summary = collector.getSummary();

AnomalyDetector

Detect anomalies using z-score.

const detector = new monitor.AnomalyDetector({
  windowSize: 100,
  zScoreThreshold: 3
});

// Feed values
for (let i = 0; i \x3C 50; i++) {
  detector.check('latency', 100 + Math.random() * 50);
}

// Check for anomaly
const result = detector.check('latency', 500); // Unusual value
console.log(result.anomaly); // true if z-score > 3

// Get baseline
const baseline = detector.getBaseline('latency');
// { mean: 125, stdDev: 14.4, min: 100, max: 150, count: 51 }

Example: Complete Monitoring Setup

const monitor = require('./skills/agent-metrics-monitor');

// Create collector
const collector = monitor.createMetricsCollector({
  serviceName: 'production-agent',
  prefix: 'agent'
});

// Add alert rules
collector.addAlertRule({
  name: 'high_p99_latency',
  metric: 'tool_call_latency',
  condition: 'gt',
  threshold: 2000,
  duration: 60000,
  severity: 'critical',
  message: 'P99 latency exceeds 2 seconds'
});

collector.addAlertRule({
  name: 'high_error_rate',
  metric: 'tool_call_errors_total',
  condition: 'gt',
  threshold: 100,
  duration: 300000, // 5 minutes
  severity: 'warning',
  message: 'More than 100 errors in 5 minutes'
});

// Simulate operations
const operations = ['tool_call', 'llm_request', 'memory_access'];

for (let i = 0; i \x3C 100; i++) {
  const op = operations[i % 3];
  const latency = 50 + Math.random() * 200;
  
  collector.recordLatency(op, latency);
  
  if (Math.random() \x3C 0.05) {
    collector.recordError(op, 'timeout');
  } else {
    collector.recordSuccess(op);
  }
}

// Get metrics
console.log('Tool call error rate:', collector.getErrorRate('tool_call'));
console.log('LLM request P99:', collector.histogram('llm_request_latency').p99());

// Evaluate alerts
const alerts = collector.evaluateAlerts();
for (const alert of alerts) {
  if (alert.state === 'firing') {
    console.log(`ALERT: ${alert.name} - ${alert.message}`);
  }
}

// Export Prometheus format
console.log('\
--- Prometheus Metrics ---');
console.log(collector.exportPrometheus());

// Generate Grafana dashboard
const dashboard = collector.generateGrafanaDashboard({
  title: 'Production Agent Dashboard'
});
console.log('\
--- Grafana Dashboard ---');
console.log(JSON.stringify(dashboard, null, 2));

Example: Anomaly Detection

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({ serviceName: 'agent' });
const detector = monitor.createAnomalyDetector({ zScoreThreshold: 2.5 });

// Train with normal values
console.log('Training with normal values...');
for (let i = 0; i \x3C 100; i++) {
  const latency = 100 + Math.random() * 50; // 100-150ms
  collector.recordLatency('api_call', latency);
  detector.check('api_call_latency', latency);
}

// Get baseline
const baseline = detector.getBaseline('api_call_latency');
console.log('Baseline:', baseline);

// Test with anomalies
console.log('\
Testing for anomalies...');
const testValues = [120, 135, 500, 1000, 125];

for (const value of testValues) {
  const result = detector.check('api_call_latency', value);
  console.log(`Value: ${value}ms, Anomaly: ${result.anomaly}, Z-Score: ${result.zScore?.toFixed(2)}`);
}

Example: Prometheus Export

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({
  serviceName: 'my-agent',
  prefix: 'agent'
});

// Record some metrics
collector.recordLatency('tool_call', 150);
collector.recordLatency('tool_call', 250);
collector.recordError('tool_call', 'timeout');
collector.recordSuccess('tool_call');

const gauge = collector.gauge('active_sessions', { region: 'us-east' });
gauge.set(42);

// Export Prometheus format
const prometheus = collector.exportPrometheus();
console.log(prometheus);

// Output:
// # TYPE agent_tool_call_errors_total counter
// agent_tool_call_errors_total{error_type="timeout"} 1
// # TYPE agent_tool_call_success_total counter
// agent_tool_call_success_total 1
// # TYPE agent_tool_call_total counter
// agent_tool_call_total 2
// # TYPE agent_active_sessions gauge
// agent_active_sessions{region="us-east"} 42
// # TYPE agent_tool_call_latency histogram
// agent_tool_call_latency_bucket{le="1"} 0
// ...

Example: Grafana Dashboard Generation

const monitor = require('./skills/agent-metrics-monitor');

const collector = monitor.createMetricsCollector({ serviceName: 'api-agent' });

// Generate dashboard configuration
const dashboard = collector.generateGrafanaDashboard({
  title: 'API Agent Metrics',
  uid: 'api-agent-metrics'
});

// Save to file for Grafana provisioning
const fs = require('fs');
fs.writeFileSync('grafana-dashboard.json', JSON.stringify(dashboard, null, 2));

console.log('Dashboard generated with panels:');
for (const panel of dashboard.dashboard.panels) {
  console.log(`  - ${panel.title} (${panel.type})`);
}

Alert Rule Conditions

  • gt - Greater than
  • lt - Less than
  • eq - Equal to
  • gte - Greater than or equal
  • lte - Less than or equal

Alert Severities

  • info - Informational
  • warning - Warning condition
  • critical - Critical condition requiring immediate attention

Notes

  • Histograms use bucket-based storage for Prometheus compatibility
  • Percentiles are calculated from stored values for accuracy
  • Time series data has configurable retention
  • Anomaly detection uses z-score method
  • Grafana dashboards are generated in JSON format for provisioning
  • Prometheus export follows standard exposition format
Usage Guidance
This skill appears internally consistent with its stated purpose, but it includes executable JavaScript (index.js). Before installing: (1) inspect the complete index.js yourself (or ask the author) to confirm there are no network calls, telemetry, shell execs, or file writes you don't expect; (2) run it in a sandbox or test agent with no sensitive credentials; (3) prefer code from a known publisher or pin a vetted package version; (4) if you need stronger assurance, request a full code review or ask the publisher for explanation of any non-obvious behavior. My confidence is medium because the index.js content provided in the prompt was truncated near the end, so the final sections of the file were not visible for review.
Capability Analysis
Type: OpenClaw Skill Name: jpeng-agent-metrics-monitor Version: 1.0.0 The agent-metrics-monitor skill is a legitimate utility for tracking performance metrics like latency and error rates. The code (index.js) is self-contained, using in-memory data structures to calculate percentiles and detect anomalies without any external network calls, file system access, or risky execution patterns. The SKILL.md documentation is purely descriptive and contains no prompt-injection attempts or malicious instructions.
Capability Assessment
Purpose & Capability
Name and description match the code and SKILL.md: the project contains APIs for histograms, counters, gauges, time-series storage, alert rules, and Prometheus/Grafana export. No unrelated binaries, credentials, or config paths are requested.
Instruction Scope
SKILL.md instructions are scoped to creating and using the monitoring APIs, generating Prometheus output and Grafana dashboards, and configuring alert rules. The instructions do not ask the agent to read arbitrary files, access secrets, or send data to external endpoints.
Install Mechanism
No install spec is declared (instruction-only style), but the package includes index.js and package.json — the skill will execute bundled JavaScript when required by the agent. No external downloads, installers, or archives are used.
Credentials
The skill declares no required environment variables, credentials, or config paths. Its behavior appears to be in-memory metrics collection and local export; there are no disproportionate secret requests.
Persistence & Privilege
Flags show always:false and normal agent invocation. The code appears to use in-memory data structures and does not declare writing to agent configs or requesting persistent system privileges in the provided portion.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install jpeng-agent-metrics-monitor
  3. After installation, invoke the skill by name or use /jpeng-agent-metrics-monitor
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of agent-metrics-monitor. - Provides agent behavior metrics monitoring and alerting with Prometheus and Grafana support. - Supports latency tracking (P50, P95, P99), error rates, anomaly detection, and custom alert rules. - Includes utilities for creating counters, gauges, histograms, alert rules, and anomaly detectors. - Offers API for exporting Prometheus metrics and generating Grafana dashboards. - Example usage and class APIs for rapid instrumentation and visualization.
Metadata
Slug jpeng-agent-metrics-monitor
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is agent-metrics-monitor?

Provides monitoring and alerting for agent abnormal behavior metrics with Prometheus and Grafana support, including P99 latency, error rates, anomaly detecti... It is an AI Agent Skill for Claude Code / OpenClaw, with 168 downloads so far.

How do I install agent-metrics-monitor?

Run "/install jpeng-agent-metrics-monitor" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is agent-metrics-monitor free?

Yes, agent-metrics-monitor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does agent-metrics-monitor support?

agent-metrics-monitor is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created agent-metrics-monitor?

It is built and maintained by jpengcheng523-netizen (@jpengcheng523-netizen); the current version is v1.0.0.

💬 Comments