The OpenTelemetry Collector

Every service you instrument eventually needs to get its telemetry somewhere useful — a tracing backend, a metrics store, a log aggregation system. You could wire each SDK directly to each backend, but that approach collapses under operational reality: credentials scattered across every pod, no way to enrich or filter data before it leaves the application, and a re-deploy every time you change backends. The OpenTelemetry Collector solves all three problems at once. It is a vendor-neutral, production-grade telemetry pipeline that receives signals from your applications, transforms them, and routes them to one or more backends — all without touching application code.

At Google-scale organisations, the Collector is not optional. It is the central nervous system of the observability stack: the single plane where data governance (sampling, PII scrubbing, cost control) is enforced before anything reaches a paid backend.

Architecture: Receivers, Processors, Exporters

The Collector is a composable pipeline. Every pipeline has three stages in order:

Receivers: ingest telemetry from sources. The otlp receiver accepts OTLP over gRPC (port 4317) and HTTP (port 4318). Other receivers scrape Prometheus endpoints, accept Jaeger or Zipkin spans, consume from Kafka, or receive Fluent Forward traffic from Fluent Bit. A Collector instance can run many receivers simultaneously.
Processors: transform, filter, batch, and enrich data in flight. Processors are the operational muscle of the pipeline: they drop spans you do not need, add Kubernetes metadata, cap attribute counts, and batch exports for throughput efficiency.
Exporters: push transformed data to backends. The OTLP exporter speaks to Grafana Tempo, Honeycomb, and any OTel-native backend. The Prometheus exporter exposes a scrape endpoint. The debug exporter writes telemetry to the Collector's own log output, which is invaluable during development.

Pipelines are declared per signal type (traces, metrics, logs) and can fan out to multiple exporters simultaneously. Referencing the same processor definition from several pipelines applies one normalisation rule across all signal types. Each pipeline gets its own instance of the processor, so only the configuration is shared, not state.

Extensions are a fourth top-level concept: they add auxiliary capabilities such as health checks (health_check), a pprof profiler endpoint (pprof), and a zPages debug UI (zpages). Extensions run alongside pipelines but are not part of the data flow.

Figure: The Collector pipeline. Sources feed receivers, processors enrich and filter in-flight data, and exporters deliver to multiple backends simultaneously.
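
To make the three stages concrete, here is a minimal sketch of a single traces pipeline that fans out to two exporters. The exporter name otlp/backend and its endpoint are placeholders invented for illustration, not part of any real setup; everything else uses components named in this lesson.

```yaml
# Minimal sketch: one traces pipeline, three stages, fan-out to two exporters.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}              # default batching settings

exporters:
  debug:
    verbosity: basic     # short summary per batch in the Collector's log
  otlp/backend:          # hypothetical name and endpoint; point at your own backend
    endpoint: backend.example.internal:4317

service:
  pipelines:
    traces:
      receivers:  [otlp]
      processors: [batch]
      exporters:  [debug, otlp/backend]   # each batch is sent to both exporters
```

A component only takes effect once it is listed under service.pipelines; defining it in the top-level receivers, processors, or exporters section is not enough.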

A Production-Grade Collector Configuration

The following is a realistic otelcol-config.yaml that you would deploy as a DaemonSet or sidecar in Kubernetes. It covers the most important processors and a multi-backend export setup.

# otelcol-config.yaml — production baseline
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: 0.0.0.0:1777
  zpages:
    endpoint: 0.0.0.0:55679

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  # Scrape the Collector's own internal metrics
  prometheus:
    config:
      scrape_configs:
        - job_name: otel-collector
          scrape_interval: 15s
          static_configs:
            - targets: [0.0.0.0:8888]

processors:
  # CRITICAL: always first — prevents OOM if a burst overwhelms the queue
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 15

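  # Batch for throughput: flush at 8192 items (spans, data points, or log
  # records, not bytes) or after 5s, with a hard cap of 16384 items per batch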
  batch:
    send_batch_size: 8192
    timeout: 5s
    send_batch_max_size: 16384

  # Attach Kubernetes pod/namespace/node metadata by querying the Kubernetes API
  # (needs RBAC read access for the Collector's service account)
  k8sattributes:
    auth_type: serviceAccount
    passthrough: false
    extract:
      metadata: [k8s.pod.name, k8s.namespace.name, k8s.node.name,
                 k8s.deployment.name, k8s.container.name]
      labels:
        - tag_name: app.version
          key: app.kubernetes.io/version
          from: pod

  # Drop health-check and liveness probe noise
  filter/drop_health:
    traces:
      span:
        - 'attributes["http.route"] == "/healthz"'
        - 'attributes["http.route"] == "/readyz"'

  # Scrub PII from span attributes before they leave the cluster
  attributes/scrub_pii:
    actions:
      - key: user.email
        action: delete
      - key: http.request.header.authorization
        action: delete

  # Normalise resource attributes for consistent dashboards
  resource:
    attributes:
      - key: cloud.provider
        value: aws
        action: insert

exporters:
  # Traces -> Grafana Tempo via OTLP
  otlp/tempo:
    endpoint: tempo.monitoring.svc.cluster.local:4317
    tls:
      insecure: false
      ca_file: /var/run/secrets/tls/ca.crt
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

  # Metrics -> Prometheus remote-write
  prometheusremotewrite:
    endpoint: http://prometheus.monitoring.svc.cluster.local:9090/api/v1/write
    tls:
      insecure: true

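  # Logs -> Loki. Recent contrib releases deprecate the loki exporter in favour
  # of Loki's native OTLP ingestion via otlphttp; check your Collector and Loki versions.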
  loki:
    endpoint: http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push
    default_labels_enabled:
      exporter: false
      job: true

  # Dev/debug only: never use detailed verbosity in production.
  # Defined here but not referenced by any pipeline, so it is inactive.
  debug:
    verbosity: basic

service:
  extensions: [health_check, pprof, zpages]

  pipelines:
    traces:
      receivers:  [otlp]
      processors: [memory_limiter, k8sattributes, filter/drop_health,
                   attributes/scrub_pii, batch]
      exporters:  [otlp/tempo]

    metrics:
      receivers:  [otlp, prometheus]
      processors: [memory_limiter, k8sattributes, resource, batch]
      exporters:  [prometheusremotewrite]

    logs:
      receivers:  [otlp]
      processors: [memory_limiter, k8sattributes, attributes/scrub_pii, batch]
      exporters:  [loki]
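
The k8sattributes processor in the configuration above reads pod metadata from the Kubernetes API, so the Collector's service account needs read access. The manifest below is a sketch of that RBAC, assuming a service account called otel-collector in the monitoring namespace (both names are illustrative). The replicasets rule is what lets the processor resolve k8s.deployment.name.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector            # hypothetical name
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-k8sattributes
rules:
  # Core resources the processor watches to map telemetry to workloads
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "list", "watch"]
  # Needed to walk pod -> ReplicaSet -> Deployment
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector-k8sattributes
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector-k8sattributes
```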

Always put memory_limiter first in every pipeline. When a traffic spike or a slow backend makes data pile up in the Collector's queues, memory use climbs. Without memory_limiter, the process grows until the kernel or kubelet OOM-kills it, and everything buffered in memory is lost. With it, the Collector starts refusing new data once it crosses the soft limit (the limit minus the spike limit), returning a retryable error to the sender, and forces garbage collection at the hard limit. Omitting this processor is one of the most common production Collector misconfigurations.
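
As a worked example of what the percentage settings mean, assume the Collector container has a 2 GiB (2048 MiB) memory limit; that figure is an assumption chosen for illustration. The percentage-based settings shown earlier then correspond roughly to the fixed values below.

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    # Hard limit: 75% of 2048 MiB = 1536 MiB
    limit_mib: 1536
    # Spike allowance: 15% of 2048 MiB, about 307 MiB
    # Soft limit = 1536 - 307 = 1229 MiB; new data is refused above this point
    spike_limit_mib: 307
```

The percentage form is usually easier to maintain, because it follows the container's memory limit when that changes. The fixed form is helpful when you want the thresholds to be explicit in the config.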

Deployment Patterns

How you deploy the Collector determines its operational characteristics. Three patterns dominate production environments:

DaemonSet (Agent mode): one Collector pod per node. Each pod receives telemetry from all applications on that node. Few network hops, can enrich spans with node-level metadata, and a Collector restart affects only one node. The recommended default in Kubernetes. Managed by the OpenTelemetry Operator via the OpenTelemetryCollector CRD with mode: daemonset.
Sidecar mode: one Collector container per application pod. Maximum isolation; ideal for multi-tenant clusters where teams must not share a pipeline. Higher resource overhead. Use for security-sensitive workloads or when you need per-service sampling policies.
Gateway (Deployment) mode: a central, horizontally scaled Collector fleet. All node-level Collectors forward to it via OTLP. The gateway enforces cluster-wide sampling, PII scrubbing, and fan-out to multiple backends. It enables stateful processors like tail_sampling that need to see all spans of a trace before making a sampling decision. Because the fleet has several replicas, the agents must route by trace ID (typically with the loadbalancing exporter) so that every span of a trace reaches the same gateway instance. In large clusters (100+ nodes), this two-tier topology of agent plus gateway is standard.

Figure: Two-tier topology. Lightweight DaemonSet agents collect per node, and the central Gateway applies cluster-wide sampling and PII scrubbing before fan-out to backends.
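
The agent side of the two-tier topology might look like the sketch below. It uses the contrib loadbalancing exporter to route by trace ID, so that all spans of a trace land on the same gateway replica. The headless Service name otel-gateway-headless is an assumption made for this example; the exporter resolves it to the individual gateway pod IPs.

```yaml
# Agent (DaemonSet) config sketch: forward traces to the gateway tier by trace ID.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 15
  batch:
    timeout: 5s

exporters:
  loadbalancing:
    routing_key: traceID        # keep every span of a trace on one gateway
    protocol:
      otlp:
        tls:
          insecure: true        # plaintext in-cluster; enable TLS where required
    resolver:
      dns:
        # Hypothetical headless Service in front of the gateway pods
        hostname: otel-gateway-headless.monitoring.svc.cluster.local

service:
  pipelines:
    traces:
      receivers:  [otlp]
      processors: [memory_limiter, batch]
      exporters:  [loadbalancing]
```

A plain ClusterIP Service in front of the gateways would spread spans of one trace across replicas, which defeats tail sampling. Routing on trace ID avoids that.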

Key Processors You Must Know

Beyond the basics, three components define production-quality pipelines:

k8sattributes: auto-enriches every span and log with k8s.pod.name, k8s.namespace.name, k8s.deployment.name, and labels like app.version. Requires a ClusterRole with get/list/watch on pods and namespaces, plus replicasets when you extract k8s.deployment.name. Without this, correlating a Tempo trace to the Kubernetes workload that produced it requires painful manual cross-referencing.
tail_sampling: makes sampling decisions after seeing the complete trace (unlike head-based sampling, which decides when the trace starts). Policy types include latency (keep any trace over 200 ms), status_code (keep all traces with at least one span in ERROR status), and probabilistic (keep 1% of healthy fast traces). Must run in Gateway mode so the Collector can buffer all spans of a trace before deciding. This is the most operationally powerful processor: it lets you sample intelligently without losing the traces you actually need.
spanmetrics (a connector, not a processor): derives RED metrics (rate, errors, duration histogram) directly from trace spans, without extra application instrumentation. It sits between pipelines, acting as an exporter on the traces pipeline and as a receiver on the metrics pipeline. In Prometheus the output appears as a call counter and a duration histogram, for example traces_spanmetrics_calls_total and traces_spanmetrics_duration_milliseconds; the exact names depend on the connector's namespace setting and the Collector version. This is how large teams get service-level metrics from the tracing pipeline at no extra instrumentation cost.
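
A gateway configuration combining tail sampling and span metrics could be sketched as follows. The policy names, the histogram buckets, and the decision_wait and num_traces values are illustrative assumptions, not tuned recommendations. Note that the span metrics pipeline reads from the receiver directly, so RED metrics are computed from all spans rather than only the sampled ones.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 15
  tail_sampling:
    decision_wait: 10s         # buffer spans this long before deciding
    num_traces: 50000          # traces held in memory while waiting
    policies:                  # a trace is kept if any policy matches
      - name: keep-slow
        type: latency
        latency:
          threshold_ms: 200
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 1
  batch:
    timeout: 5s

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [10ms, 50ms, 100ms, 250ms, 500ms, 1s, 5s]

exporters:
  otlp/tempo:
    endpoint: tempo.monitoring.svc.cluster.local:4317
    tls:
      insecure: false
      ca_file: /var/run/secrets/tls/ca.crt
  prometheusremotewrite:
    endpoint: http://prometheus.monitoring.svc.cluster.local:9090/api/v1/write

service:
  pipelines:
    traces:                    # sampled traces go to Tempo
      receivers:  [otlp]
      processors: [memory_limiter, tail_sampling, batch]
      exporters:  [otlp/tempo]
    traces/spanmetrics:        # unsampled copy feeds the connector
      receivers:  [otlp]
      processors: [memory_limiter]
      exporters:  [spanmetrics]
    metrics:                   # connector output becomes a metrics stream
      receivers:  [spanmetrics]
      processors: [batch]
      exporters:  [prometheusremotewrite]
```

Both tail_sampling and spanmetrics ship in the contrib distribution, so this sketch assumes a contrib-based gateway image.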

Validate your Collector config before deploying. Run otelcol validate --config otelcol-config.yaml locally or in CI, using a binary from the same distribution you deploy (the configuration above needs otelcol-contrib, because k8sattributes and loki are contrib components). The command exits with a clear error message for typos, unknown component names, and pipeline wiring mistakes. Adding this as a CI step prevents rolling out a broken pipeline to production: when a Collector crash-loops on a bad config, SDKs quietly drop their telemetry, which you may not discover until an incident sends you looking for traces.

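# Install a Collector binary locally (macOS / Linux)
brew install opentelemetry-collector   # macOS
# or download from https://github.com/open-telemetry/opentelemetry-collector-releases
# The configuration above uses contrib components (k8sattributes, loki), so it
# needs the contrib distribution (otelcol-contrib); core otelcol reports them as unknown.

# Validate config syntax and component names
otelcol-contrib validate --config otelcol-config.yaml

# Run locally for development. To see telemetry in the Collector's log output,
# add the debug exporter to a pipeline; outside a cluster, also remove
# k8sattributes from the pipelines, since it expects in-cluster credentials.
otelcol-contrib --config otelcol-config.yaml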

# Kubernetes: deploy via the OTel Operator CRD
kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent
  namespace: monitoring
spec:
  mode: daemonset
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    processors:
      memory_limiter:
        limit_percentage: 75
        spike_limit_percentage: 15
        check_interval: 1s
      batch:
        timeout: 5s
    exporters:
      otlp:
        endpoint: otel-gateway.monitoring.svc.cluster.local:4317
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp]
EOF

# Check Collector's own health
curl -s http://localhost:13133/   # {"status":"Server available","upSince":"..."}

# Monitor Collector internal metrics (scrape at :8888)
curl -s http://localhost:8888/metrics | grep otelcol_exporter_sent_spans
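
The validation step can also run in CI. The workflow below is a sketch that assumes GitHub Actions and a runner with Docker available; the workflow name, trigger paths, and image tag are illustrative choices. It relies on the contrib image using the Collector binary as its entrypoint, so the arguments after the image name are passed straight to it.

```yaml
# .github/workflows/validate-otelcol.yaml (hypothetical path)
name: validate-otelcol-config
on:
  pull_request:
    paths:
      - "otelcol-config.yaml"

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate Collector config
        run: |
          # Pin the tag to the Collector version you actually deploy;
          # "latest" is used here only to keep the sketch short.
          docker run --rm \
            -v "${PWD}/otelcol-config.yaml:/etc/otelcol/config.yaml:ro" \
            otel/opentelemetry-collector-contrib:latest \
            validate --config=/etc/otelcol/config.yaml
```

Validating with the same image version that runs in the cluster matters, because component names and config fields change between releases.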

The OpenTelemetry Collector is deceptively simple to start with — a single binary, a YAML file — and extremely powerful to operate at scale. Mastering its processor chain and deployment topology is a core DevOps skill: it is the difference between an observability stack that degrades under load and one that remains the last reliable source of truth exactly when you need it most.