Prometheus & Grafana

Metric Types & Exposition

Prometheus defines four fundamental metric types, each modeling a different aspect of system behavior. Choosing the wrong type is one of the most common mistakes new users make: a metric declared with the wrong type produces subtly wrong query results that are hard to debug under production pressure. This lesson covers what each type is, when to use it, how its wire format works, and what breaks when you get it wrong.

Counter

A counter is a monotonically increasing integer or float that only goes up (or resets to zero on process restart). It is the right choice for any total-since-start measurement: requests served, bytes sent, errors encountered, tasks completed.

The critical rule: never use a counter for something that can decrease. Current queue depth is not a counter. CPU temperature is not a counter. If what you want is how much the value grew between two points in time, you are using the type correctly; rate() and increase() are designed for exactly that.

Why counters reset: When a process restarts, its counters start again from zero. Prometheus detects the reset by observing a sample lower than the previous one. Functions like rate() and increase() handle this automatically: any decrease is treated as a reset, and the samples after it are counted as growth from zero rather than as a negative change. This is why you must never use delta() or raw subtraction on a counter; neither accounts for resets.
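The reset rule can be sketched in a few lines of Python. This is a simplified illustration, not Prometheus's implementation: the real rate() and increase() also extrapolate to the edges of the query window, which is left out here. The function name and the sample values are made up for the example.

```python
def increase_with_resets(samples):
    """Total growth across consecutive counter samples, treating any drop as a reset."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The counter went backwards: assume the process restarted from
            # zero, so everything in `curr` accumulated after the reset.
            total += curr
    return total


# Hypothetical scrapes; the process restarts between the 3rd and 4th sample.
samples = [100, 130, 160, 10, 40]

naive = samples[-1] - samples[0]          # raw subtraction ignores the reset
corrected = increase_with_resets(samples)

print("raw subtraction:", naive)
print("reset-aware increase:", corrected)
```

Raw subtraction reports -60 for these samples, while the reset-aware version reports 100, which is the growth that actually happened.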

Naming convention: counters always end in _total. For example: http_requests_total, process_cpu_seconds_total, rpc_errors_total.

Gauge

A gauge is a value that can go up and down arbitrarily. It represents a snapshot of a current state. Examples: memory in use, number of goroutines, active connections, current temperature, queue depth, cache hit ratio, disk usage percentage.

Gauges are used with functions like avg_over_time(), max_over_time(), and plain arithmetic — no rate needed. They are also the right type for configuration values, version numbers (via label tricks), and boolean state encoded as 0/1.

Production pattern — saturation signals: Google SRE's Four Golden Signals names saturation as a key signal. Every resource you own (CPU, memory, thread pool, connection pool, disk) should expose a gauge showing current utilization as a fraction of capacity. Threshold alerting (gauge > 0.85) on these gauges is the most reliable early-warning mechanism before a resource exhausts.

Histogram

A histogram samples observations and counts them into configurable buckets. It also tracks the running count and sum of all observed values, which allows percentile estimation and average computation even after aggregating across time series.

A single histogram named http_request_duration_seconds automatically exposes three kinds of series:

  • http_request_duration_seconds_bucket{le="0.1"} — cumulative count of requests that finished in ≤ 100ms (one such series per bucket boundary)
  • http_request_duration_seconds_count — total number of observations
  • http_request_duration_seconds_sum — sum of all observed durations

The bucket label le (less-than-or-equal) is mandatory, and buckets are cumulative: each one counts every observation at or below its bound. Client libraries always add an le="+Inf" bucket, whose value equals _count. Percentile estimation uses histogram_quantile(), which interpolates linearly within the bucket that contains the target rank, so accuracy depends entirely on bucket placement relative to where your data actually falls.
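To make the interpolation concrete, here is a minimal Python sketch of the idea behind histogram_quantile() for classic cumulative buckets. It is an approximation written for this lesson, not the Prometheus source; the function signature is an assumption, and the bucket counts are taken from the sample exposition output shown in this lesson.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with (math.inf, total).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank lies beyond the last finite bucket, so the best
                # available answer is that bucket's upper bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Cumulative counts for handler="/api/v1/query" from the sample exposition.
BUCKETS = [(0.05, 24054), (0.1, 33444), (0.25, 100392), (math.inf, 144320)]

p50 = histogram_quantile(0.5, BUCKETS)
p99 = histogram_quantile(0.99, BUCKETS)
print("estimated p50:", round(p50, 4))
print("estimated p99:", p99)
```

With these counts the median is interpolated inside the 0.1 to 0.25 bucket. The 99th percentile falls in the +Inf bucket, so the estimate is capped at 0.25 and says nothing about how slow the slowest requests really were.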

Bucket misconfiguration causes misleading SLOs: If your SLO is "99% of requests under 200ms" but your histogram buckets are [0.1, 0.5, 1.0, 5.0], there is no boundary at 200ms; the threshold sits inside the 100ms to 500ms bucket. Whenever the 99th percentile lands in that bucket, histogram_quantile(0.99, ...) returns an interpolated value somewhere between 100ms and 500ms, not a measurement of where your traffic actually is. A team can believe it is meeting a latency SLO while actually violating it. Always place a bucket boundary at the SLO threshold itself, and ideally one more close to it on each side.
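The following Python sketch illustrates why this matters, using the bucket bounds from the example above. The two latency lists are invented for illustration: one service meets the 200ms SLO and the other does not, yet both produce exactly the same bucket counts, so no query over the histogram can tell them apart.

```python
from bisect import bisect_left

BOUNDS = [0.1, 0.5, 1.0, 5.0]  # bucket upper bounds in seconds


def cumulative_buckets(latencies, bounds=BOUNDS):
    """Return cumulative counts per bound, with the +Inf total as the last entry."""
    counts = [0] * (len(bounds) + 1)
    for value in latencies:
        # bisect_left finds the first bound >= value, which matches `le`.
        counts[bisect_left(bounds, value)] += 1
    running, result = 0, []
    for c in counts:
        running += c
        result.append(running)
    return result


def share_under(latencies, threshold):
    """Fraction of requests at or below the threshold."""
    return sum(1 for v in latencies if v <= threshold) / len(latencies)


# Hypothetical services: the slow 5% differ, but stay inside the same bucket.
service_a = [0.05] * 95 + [0.15] * 5   # every request under 200ms
service_b = [0.05] * 95 + [0.45] * 5   # 5% of requests over 200ms

print(cumulative_buckets(service_a), share_under(service_a, 0.2))
print(cumulative_buckets(service_b), share_under(service_b, 0.2))
```

Adding 0.2 to BOUNDS separates the two services immediately, which is the practical argument for a boundary at the SLO threshold.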

The client-library default buckets (.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10 seconds) suit the latency of typical web services. For internal RPC or database calls, shift the buckets an order of magnitude lower. Define buckets in your instrumentation before first deployment: changing them later makes quantiles inconsistent across the change and breaks dashboards and recording rules that reference specific le values.

Summary

A summary also samples observations but computes configurable quantiles on the client side over a sliding time window. It exposes pre-computed quantile labels like {quantile="0.99"} alongside _count and _sum.

The key tradeoff versus histograms:

  • Summaries are accurate at the quantile value but cannot be aggregated across instances. sum(rate(...)) works on _count and _sum, but you cannot meaningfully average the {quantile="0.99"} series from 50 pods — the result is statistically nonsensical.
  • Histograms are approximate (bucket-interpolated) but fully aggregatable. You can call histogram_quantile() across the merged buckets from all instances to get a fleet-wide percentile.

Industry consensus: For any microservice running multiple replicas, which covers nearly every production service, use histograms over summaries. Summaries are most appropriate for single-process tools (CLI utilities, batch jobs) where accurate client-side percentiles are needed and no cross-instance aggregation will ever happen. The Prometheus documentation gives the same guidance: if you need to aggregate, choose histograms.
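A small Python sketch shows why averaging per-instance quantiles misleads. The two pods and their latencies are hypothetical, and the nearest-rank quantile helper is a stand-in for whatever a client library computes; the point is only the gap between the averaged value and the quantile of the combined traffic.

```python
def quantile(values, percent):
    """Nearest-rank quantile of raw observations; `percent` is an integer 1-100."""
    ordered = sorted(values)
    # Ceiling division with integers avoids floating-point rank errors.
    rank = max(1, -(-percent * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical latencies in seconds: a busy fast pod and a quiet slow pod.
fast_pod = [0.01] * 990 + [0.02] * 10
slow_pod = [1.0] * 100

# What averaging the two {quantile="0.99"} series would report.
avg_of_p99 = (quantile(fast_pod, 99) + quantile(slow_pod, 99)) / 2

# The actual 99th percentile across all requests from both pods.
fleet_p99 = quantile(fast_pod + slow_pod, 99)

print("average of per-pod p99:", avg_of_p99)
print("true fleet-wide p99:", fleet_p99)
```

Here the average of the per-pod values comes out at about 0.505 seconds, while the real fleet-wide p99 is 1.0 second. Merged histogram buckets avoid this because bucket counts can be summed before the quantile is estimated.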

The Exposition Format

Prometheus scrapes metrics over HTTP, by default from the /metrics endpoint. The wire format is the plain-text, line-based Prometheus exposition format; OpenMetrics is a later standard that builds on it. Each metric family is normally preceded by a HELP line (human-readable description) and a TYPE declaration, followed by one data line per time series.

    # HELP http_requests_total Total HTTP requests received since process start.
    # TYPE http_requests_total counter
    http_requests_total{method="GET",status="200"} 1027453
    http_requests_total{method="GET",status="500"} 312
    http_requests_total{method="POST",status="200"} 88921

    # HELP http_request_duration_seconds HTTP request latency histogram.
    # TYPE http_request_duration_seconds histogram
    http_request_duration_seconds_bucket{handler="/api/v1/query",le="0.05"} 24054
    http_request_duration_seconds_bucket{handler="/api/v1/query",le="0.1"} 33444
    http_request_duration_seconds_bucket{handler="/api/v1/query",le="0.25"} 100392
    http_request_duration_seconds_bucket{handler="/api/v1/query",le="+Inf"} 144320
    http_request_duration_seconds_sum{handler="/api/v1/query"} 53423.147
    http_request_duration_seconds_count{handler="/api/v1/query"} 144320

    # HELP go_goroutines Number of goroutines currently running.
    # TYPE go_goroutines gauge
    go_goroutines 231

Key format rules: metric names must match [a-zA-Z_:][a-zA-Z0-9_:]*. Label values are UTF-8 strings in double quotes; backslashes, double quotes, and line feeds inside them are escaped as \\, \", and \n. A timestamp (Unix milliseconds) may be appended as a third field but is rarely used, because Prometheus records its own scrape timestamp. Lines starting with # are comments and are ignored by the parser, unless the first token after # is HELP or TYPE.
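The naming and escaping rules can be exercised with a short Python sketch that renders a single sample line. This is a hand-rolled illustration using only the standard library; the helper names are invented for the example, and a real service would normally rely on an official client library to produce this output.

```python
import re

# Metric name pattern quoted in the format rules above.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def escape_label_value(value):
    """Escape backslash, double quote, and line feed inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name, labels, value):
    """Render one sample line: name{label="value",...} value."""
    if not METRIC_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid metric name: {name!r}")
    pairs = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels.items())
    label_part = "{" + pairs + "}" if pairs else ""
    return f"{name}{label_part} {value}"


print(render_sample("http_requests_total", {"method": "GET", "status": "200"}, 1027453))
print(render_sample("go_goroutines", {}, 231))
print(render_sample("job_info", {"path": 'C:\\tmp\\"x"'}, 1))
```

A name such as 2xx_total is rejected by the pattern because it starts with a digit, which is the kind of mistake that otherwise only surfaces when a scrape fails.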

To inspect any target's raw exposition output, curl it directly. This is your first debugging step whenever a metric is missing from a dashboard:

    curl -s http://localhost:9090/metrics | grep -E '^(#|http_)'

    # For a remote target via port-forward (Kubernetes).
    # port-forward blocks, so run it in the background or in a second terminal:
    kubectl port-forward svc/my-app 8080:8080 &
    curl -s http://localhost:8080/metrics | grep -E 'TYPE|HELP' | head -30

[Figure: Prometheus Metric Types, Shapes Over Time. Panels: Counter (monotonic, resets on restart); Gauge (arbitrary up/down, snapshot); Histogram (le buckets, aggregatable, p99); Summary (client-side quantiles p50/p90/p99, single instance); Text Exposition Format (metric name, label set of key=value pairs, sample value, optional timestamp in ms).]
The four Prometheus metric types and the text exposition format they share on the /metrics endpoint.

Instrumentation Best Practices at Scale

At Google, Lyft, and similar companies, the instrumentation contract is part of the service's API. A few hard-won rules:

  • Always use base units: seconds, bytes, ratios (0–1). Never milliseconds or megabytes. Prometheus naming conventions and Grafana's unit formatting both expect base units, and mixing units creates dashboard scaling errors that waste on-call hours.
  • Label cardinality kills the TSDB: never use a user ID, IP address, request URL, or any other unbounded value as a label. Each unique label set creates a new time series in Prometheus's TSDB, and the counts multiply across labels. A service with 100k users exposing {user_id=...} on even a few metrics can create millions of series and exhaust a Prometheus server's memory. Keep each label to at most a few hundred distinct values.
  • Prefix metric names with the subsystem: myservice_http_requests_total not requests_total. The prefix survives federation and remote-write pipelines where metrics from many services merge.
  • Expose a build-info gauge: A myservice_build_info{version="1.4.2", commit="a3f91c"} 1 gauge (always value 1) enables version-aware alerting and rollback correlation in dashboards without requiring log scraping.
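The cardinality rule in the list above comes down to multiplication, which a few lines of Python make visible. The label names and the number of distinct values per label are hypothetical, and the product is a worst-case upper bound, since not every combination of values necessarily occurs in practice.

```python
from math import prod


def worst_case_series(label_cardinalities):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Hypothetical label sets for a single metric.
bounded = {"method": 5, "status": 8, "handler": 40}
unbounded = dict(bounded, user_id=100_000)

print("bounded labels:", worst_case_series(bounded))
print("with user_id:", worst_case_series(unbounded))
```

With these numbers the bounded label set tops out at 1,600 series, while adding user_id raises the ceiling to 160 million for that one metric.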

Test your exposition format locally: Run promtool check metrics < /path/to/metrics.txt before deploying any new instrumentation. It parses the output and lints it against naming conventions, flagging problems such as counters without a _total suffix, missing HELP text, and non-base units, before they can skew query results in production.