Prometheus Monitoring
Metric Types
| Type | Description | Example |
|---|---|---|
| Counter | Monotonically increasing; resets to zero only when the process restarts | http_requests_total, errors_total |
| Gauge | Can go up and down | memory_usage_bytes, active_connections |
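| Histogram | Counts observations in configurable buckets; quantiles are estimated at query time with histogram_quantile() | http_request_duration_seconds |
| Summary | Quantiles calculated client-side over a sliding time window; cannot be aggregated across instances | rpc_duration_seconds |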
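
The histogram row is easier to follow with a small model of what the buckets hold. The sketch below is a toy written in plain Python, not the Prometheus client library: `ToyHistogram` and `estimate_quantile` are hypothetical names, and the interpolation only approximates what `histogram_quantile()` does (it assumes the lowest bucket starts at 0 and that observations are spread evenly within a bucket).

```python
import bisect
import math


class ToyHistogram:
    """Toy model of a histogram: buckets keyed by upper bound (the `le` label)."""

    def __init__(self, upper_bounds):
        # Finite upper bounds, sorted; a +Inf bucket is always appended.
        self.bounds = sorted(upper_bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)  # per-bucket, non-cumulative
        self.sum = 0.0

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        idx = bisect.bisect_left(self.bounds, value)
        self.counts[idx] += 1
        self.sum += value

    def cumulative(self):
        # Exposed form: each bucket counts every observation <= its bound.
        total, out = 0, []
        for bound, count in zip(self.bounds, self.counts):
            total += count
            out.append((bound, total))
        return out


def estimate_quantile(q, cumulative_buckets):
    """Estimate quantile q by linear interpolation inside the matching bucket."""
    total = cumulative_buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in cumulative_buckets:
        if count >= rank and count > prev_count:
            if math.isinf(bound):
                # Rank lands in the +Inf bucket: report the highest finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


h = ToyHistogram([0.1, 0.5, 1.0])
for latency in (0.05, 0.2, 0.3, 0.7, 2.0):
    h.observe(latency)
p50 = estimate_quantile(0.5, h.cumulative())
```

Because the estimate depends only on bucket counts, its accuracy is bounded by the bucket layout: a quantile that lands in the +Inf bucket can only be reported as the highest finite bound.
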
PromQL Examples
# Instant vector - current value
http_requests_total
# With label filter
http_requests_total{job="api", status="200"}
# Range vector - last 5 minutes
http_requests_total[5m]
# Per-second average rate of increase over the last 5 minutes
rate(http_requests_total[5m])
# Error ratio (fraction of requests that failed; both metrics must carry matching labels)
rate(http_requests_errors_total[5m]) / rate(http_requests_total[5m])
# 95th percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
# Sum by label
sum(rate(http_requests_total[5m])) by (service)
# Average memory usage per pod
avg(container_memory_usage_bytes) by (pod)
# CPU usage percentage (averaged across all cores and instances)
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
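
To make the `rate()` queries above less opaque, here is a rough Python sketch of the idea behind them: sum the increases between consecutive samples, treat any drop as a counter reset, and divide by the elapsed time. `simple_rate` is a hypothetical helper, and it deliberately skips the extrapolation to the window boundaries that the real `rate()` performs, so its numbers will differ slightly from Prometheus.

```python
def simple_rate(samples):
    """Per-second increase of a counter over a window (simplified).

    samples: list of (timestamp_seconds, value) pairs, sorted by time.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a rate
    increase = 0.0
    prev_value = samples[0][1]
    for _, value in samples[1:]:
        if value < prev_value:
            # Counter reset (for example a restart): assume it restarted from 0.
            increase += value
        else:
            increase += value - prev_value
        prev_value = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrapes every 60s; the counter resets between t=120 and t=180.
window = [(0, 100), (60, 160), (120, 220), (180, 10), (240, 70)]
per_second = simple_rate(window)
```

The reset handling is why `rate()` should only be applied to counters: on a gauge, every legitimate decrease would be misread as a restart.
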
Alerting Rules
# alerts.yml
groups:
  - name: api-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          rate(http_requests_errors_total[5m])
            / rate(http_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on {{ $labels.service }}"
          description: "Error rate is {{ $value | humanizePercentage }} (threshold: 5%)"
      - alert: HighLatency
        expr: |
          histogram_quantile(0.95,
            rate(http_request_duration_seconds_bucket[5m])
          ) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High p95 latency: {{ $value }}s"
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
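
The `for:` clause in the rules above delays firing until the expression has been true continuously for the given duration. The Python sketch below models that behaviour as a small state machine. `alert_states` and its parameters are hypothetical, and it assumes evaluations happen at a fixed interval starting at time 0; it is an illustration of the semantics, not Prometheus code.

```python
def alert_states(condition_by_eval, eval_interval_s, for_s):
    """Return 'inactive', 'pending', or 'firing' for each rule evaluation."""
    states = []
    active_since = None  # time at which the condition most recently became true
    for i, condition in enumerate(condition_by_eval):
        now = i * eval_interval_s
        if not condition:
            # Any false evaluation resets the pending timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held_for = now - active_since
        states.append("firing" if held_for >= for_s else "pending")
    return states


# Evaluated every 60s with `for: 2m`; the condition briefly clears at t=120.
timeline = alert_states(
    [True, True, False, True, True, True, True],
    eval_interval_s=60,
    for_s=120,
)
```

In this timeline the single false evaluation discards the first two pending evaluations, so the alert only fires at the sixth evaluation. A longer `for:` value trades detection speed for resilience against brief spikes.
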