Spans, Traces & Context Propagation
A distributed trace is not a single monolithic record — it is a directed acyclic graph of spans, stitched together by identifiers that travel with every request across every service boundary. To use distributed tracing effectively in production, you need to understand the data model precisely: what a span contains, how spans form a tree, and how trace context is propagated across network hops so that spans created in completely different processes can be assembled into a coherent picture.
Anatomy of a Span
A span is the atomic unit of distributed tracing. It represents a single unit of work: an inbound HTTP handler, an outbound gRPC call, a database query, a cache lookup, a background job step. Every span carries a fixed set of fields defined by the OpenTelemetry specification:
- Trace ID: a 128-bit (16-byte) globally unique identifier for the entire request journey. All spans belonging to the same request share this ID. Typically encoded as a 32-character lowercase hex string: 4bf92f3577b34da6a3ce929d0e0e4736.
- Span ID: a 64-bit (8-byte) unique identifier for this specific span within its trace. Encoded as 16 hex characters: 00f067aa0ba902b7.
- Parent Span ID: the span ID of the immediate parent. The root span (entry point) has no parent (or an all-zero parent ID). This field is what creates the parent-child tree.
- Operation Name: a human-readable name describing the work, such as HTTP GET /api/orders, db.query SELECT orders, or redis.get order:8821.
- Start Time: high-resolution timestamp (nanoseconds since Unix epoch).
- Duration: elapsed wall-clock time from start to end in nanoseconds.
- Status: one of UNSET, OK, or ERROR. Setting ERROR on a span is what makes it surfaceable in backend UIs and tail-based sampling policies.
- Attributes (formerly "tags"): key-value pairs of structured metadata. OTel defines semantic conventions for common attributes: http.method, http.status_code, db.system, db.statement, net.peer.name. Add your own: order.id, user.tier, feature.flag.
- Events (formerly "logs"): timestamped annotations within a span's duration, such as exception stack traces, cache misses, and retry attempts. Not separate records; they live inside the span.
- Links: references to spans in other traces, used for message queues and async workflows where a consumer span is causally related to a producer span but not a direct child.
- Kind: role classification. One of SERVER (handles an inbound call), CLIENT (makes an outbound call), PRODUCER / CONSUMER (message queue), or INTERNAL (in-process work).
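The field list above can be sketched as a plain Python data class. This is an illustrative model only, not the OpenTelemetry SDK's own span type; the class and helper names (Span, new_root_span, new_child_span) are hypothetical.

```python
import secrets
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class SpanKind(Enum):
    INTERNAL = "INTERNAL"
    SERVER = "SERVER"
    CLIENT = "CLIENT"
    PRODUCER = "PRODUCER"
    CONSUMER = "CONSUMER"


class Status(Enum):
    UNSET = "UNSET"
    OK = "OK"
    ERROR = "ERROR"


@dataclass
class Span:
    """Hypothetical in-memory model of the span fields listed above."""
    name: str
    trace_id: str                          # 32 hex chars (128 bits)
    span_id: str                           # 16 hex chars (64 bits)
    parent_span_id: Optional[str] = None   # None for the root span
    kind: SpanKind = SpanKind.INTERNAL
    start_ns: int = field(default_factory=time.time_ns)
    end_ns: Optional[int] = None
    status: Status = Status.UNSET
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)
    links: list = field(default_factory=list)

    @property
    def duration_ns(self) -> Optional[int]:
        # Duration is derived from the start and end timestamps.
        return None if self.end_ns is None else self.end_ns - self.start_ns


def new_root_span(name: str, kind: SpanKind = SpanKind.SERVER) -> Span:
    # A root span gets a fresh trace ID and has no parent.
    return Span(name, secrets.token_hex(16), secrets.token_hex(8), kind=kind)


def new_child_span(parent: Span, name: str, kind: SpanKind = SpanKind.INTERNAL) -> Span:
    # A child shares the parent's trace ID and points back at the parent's span ID.
    return Span(name, parent.trace_id, secrets.token_hex(8), parent.span_id, kind)
```

A child created with new_child_span(root, "db.query SELECT orders", SpanKind.CLIENT) carries the root's trace ID and the root's span ID as its parent, which is all a backend needs to rebuild the tree.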
Parent-Child Relationships and the Trace Tree
Spans form a tree rooted at a single entry-point span. Every span except the root has exactly one parent. This structure gives you the waterfall view you see in Jaeger and Tempo: a visual timeline showing which spans ran sequentially and which ran in parallel, and exactly how much of the total request latency each span contributed.
Consider a checkout request flowing through four services. The API gateway creates the root span. It calls two downstream services concurrently — order-service and inventory-service — each creating a child span. Order-service then calls the payments database, creating a grandchild span. The resulting tree has four nodes, and the total request duration is determined by the critical path: the longest chain of sequential spans from root to leaf.
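As a rough illustration of the checkout example, the sketch below builds the four-node tree and walks a simplified critical path by following, at each level, the child that finishes last. The timings, the Node class, and the critical_path helper are all hypothetical, and real critical-path analysis in a tracing backend is more involved than this heuristic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    start_ms: int
    end_ms: int
    children: list = field(default_factory=list)


def critical_path(node: Node) -> list:
    """Simplified heuristic: descend into the child that ends last."""
    path = [node.name]
    while node.children:
        node = max(node.children, key=lambda child: child.end_ms)
        path.append(node.name)
    return path


# Hypothetical timings (milliseconds since the request arrived).
payments_db = Node("payments-db query", 30, 110)
order = Node("order-service", 10, 120, [payments_db])
inventory = Node("inventory-service", 10, 60)   # runs concurrently with order-service
gateway = Node("api-gateway", 0, 130, [order, inventory])

print(critical_path(gateway))
# ['api-gateway', 'order-service', 'payments-db query']
```

With these numbers, speeding up inventory-service would not shorten the request at all, because it finishes well before order-service does.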
W3C Trace Context: The traceparent Header
For traces to work across service boundaries, the trace context must travel with the request. If Service A creates a root span and makes an HTTP call to Service B, Service B must receive the trace ID and the parent span ID so that the span it creates is correctly linked to Service A's span in the same trace. Without this propagation, you get disconnected islands of spans — useless for root cause analysis.
The W3C Trace Context standard (a W3C Recommendation, supported by OTel, Jaeger, Zipkin, Datadog, and most major APM vendors) defines two HTTP headers for this purpose:
- traceparent: carries the core context: version, trace ID, parent span ID, and trace flags.
- tracestate: optional vendor-specific key-value pairs (Datadog sampling priority, B3 flags, etc.) that travel alongside without conflicting with the standard.
The traceparent header has a precisely defined format: version-traceId-parentSpanId-flags. Using the example IDs from the span anatomy above, it looks like this:
00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
The 01 flag in traceparent signals "I sampled this trace; downstream services, please also sample and report spans." A downstream service is free to ignore it (for example, if it is overloaded), but in practice production systems respect the flag so that all spans for a sampled trace are collected. When the flag is 00, downstream services typically do not report spans, keeping overhead near zero for unsampled traffic. The OTel SDK handles all of this automatically when you use its W3C Trace Context propagator.
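To make the header layout concrete, here is a minimal parser and formatter for version-00 traceparent values. It is a sketch, not the OTel SDK's implementation: the TraceParent type and both function names are hypothetical, and it deliberately ignores the forward-compatibility rules for future header versions.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace ID (32 hex) - parent span ID (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_span_id: str
    flags: int

    @property
    def sampled(self) -> bool:
        # The lowest bit of the flags byte is the "sampled" flag.
        return bool(self.flags & 0x01)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, span_id, int(flags, 16))


def format_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


tp = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(tp.trace_id, tp.parent_span_id, tp.sampled)
# 4bf92f3577b34da6a3ce929d0e0e4736 00f067aa0ba902b7 True
```

Returning None for malformed input mirrors the usual behavior of starting a fresh trace when the incoming context cannot be trusted.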
Context Propagation in Practice
Context propagation is the mechanism by which trace context is injected into outgoing requests and extracted from incoming ones. The OTel SDK provides a propagator API that handles injection and extraction for different transport formats. The W3C Trace Context propagator is the default for HTTP. For message queues (Kafka, RabbitMQ, SQS), the same IDs are placed into message headers or attributes.
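The inject and extract steps can be sketched with a plain dictionary standing in for the HTTP headers. This is a stdlib-only illustration under simplifying assumptions; the function names and the dictionary-based span are hypothetical, and in a real service the OTel propagator API would do this work for you.

```python
import secrets
from typing import Optional


def inject(carrier: dict, trace_id: str, span_id: str, sampled: bool) -> None:
    """Client side: write the current span's context into outgoing headers."""
    flags = "01" if sampled else "00"
    carrier["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(carrier: dict) -> Optional[dict]:
    """Server side: read the caller's context from incoming headers."""
    # HTTP header names are case-insensitive.
    value = next((v for k, v in carrier.items() if k.lower() == "traceparent"), None)
    if value is None:
        return None
    parts = value.strip().split("-")
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return None
    try:
        sampled = bool(int(parts[3], 16) & 0x01)
    except ValueError:
        return None
    return {"trace_id": parts[1], "parent_span_id": parts[2], "sampled": sampled}


def start_server_span(carrier: dict, name: str) -> dict:
    """Continue the caller's trace if context arrived, else start a new one."""
    span = {"name": name, "span_id": secrets.token_hex(8)}
    ctx = extract(carrier)
    if ctx is None:
        # Assumption for this sketch: new traces are always sampled.
        span.update(trace_id=secrets.token_hex(16), parent_span_id=None, sampled=True)
    else:
        span.update(ctx)
    return span


# Service A injects its context; Service B extracts it and creates a child span.
outgoing_headers = {}
inject(outgoing_headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", True)
span_b = start_server_span(outgoing_headers, "order-service handler")
print(span_b["trace_id"], span_b["parent_span_id"])
# 4bf92f3577b34da6a3ce929d0e0e4736 00f067aa0ba902b7
```

The same two functions would work for a message queue if the carrier were the message's header or attribute map instead of HTTP headers.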
Span Events and Attributes: Production Patterns
Two span features that are consistently underused but critical in production: span events and carefully chosen attributes.
A span event is a timestamped annotation attached to a span. Rather than emitting a separate log line for "cache miss, falling back to database," record it as an event on the active span. This keeps the data collocated — when you are looking at a slow span in the trace UI, you see exactly what happened and when, without having to pivot to the log store and correlate by timestamp.
Attributes should capture the business context that turns "this database query was slow" into "this database query was slow for Premium users in Germany requesting more than 50 items." As a rule of thumb, keep to roughly 20-30 attributes per span: every attribute adds storage cost, and many backends index attributes as well. Avoid very large or high-cardinality values (raw SQL query bodies, full HTTP response bodies) as span attributes; truncate or omit them if needed.
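A small sketch of both patterns, using a plain dictionary as a stand-in for a span. The helper names, the 256-character limit, and the attribute values are assumptions made for illustration, not OTel SDK defaults.

```python
import time

MAX_ATTR_LEN = 256  # hypothetical limit chosen for this sketch


def add_event(span: dict, name: str, attributes: dict = None) -> None:
    """Record a timestamped annotation inside the span itself."""
    span.setdefault("events", []).append(
        {"name": name, "time_ns": time.time_ns(), "attributes": dict(attributes or {})}
    )


def set_attribute(span: dict, key: str, value) -> None:
    """Store an attribute, truncating oversized string values."""
    if isinstance(value, str) and len(value) > MAX_ATTR_LEN:
        value = value[:MAX_ATTR_LEN] + "...[truncated]"
    span.setdefault("attributes", {})[key] = value


span = {"name": "order lookup"}
set_attribute(span, "user.tier", "premium")
set_attribute(span, "order.id", 8821)
set_attribute(span, "db.statement", "SELECT * FROM orders WHERE " + "x" * 1000)
add_event(span, "cache.miss", {"fallback": "database"})
```

The event lands on the same record as the attributes, so a reader of the slow span sees the cache miss without leaving the trace view.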
Pass context.Context (Go), Context (Java), or contextvars.Context (Python) explicitly through async boundaries. If you start a goroutine or a thread pool task, capture the current span context before the async boundary and restore it inside. The OTel SDK generally cannot do this for you, and it is one of the most common causes of "broken traces" in production where parent-child links are missing.
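In Python, the capture-and-restore step can be sketched with the standard contextvars module and a thread pool. The context variable below is a hypothetical stand-in for the SDK's notion of "current span"; submit_with_context is likewise an illustrative helper, not an OTel API.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the tracer's "current span" slot.
current_span_id = contextvars.ContextVar("current_span_id", default=None)


def read_span_id():
    return current_span_id.get()


def submit_with_context(executor, fn, *args):
    """Capture the caller's context now and run fn inside it on the worker."""
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, fn, *args)


with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(lambda: None).result()        # worker thread already exists
    current_span_id.set("00f067aa0ba902b7")   # a span becomes current in the caller

    lost = pool.submit(read_span_id).result()               # context not carried over
    kept = submit_with_context(pool, read_span_id).result() # context restored on worker

print(lost, kept)
# None 00f067aa0ba902b7
```

The first task sees no current span and would start a new root; the second sees the caller's span and can create a proper child.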
What Breaks Traces in Production
Understanding the failure modes is as important as understanding the happy path. Common causes of broken or incomplete traces:
- Missing propagation at a single hop: One service (often a legacy system, a load balancer, or an API gateway) strips or ignores traceparent. Downstream services then receive no context and start a new trace with a fresh trace ID, so one request splits into disconnected traces. If a hop forwards the header but never exports its own spans, downstream spans instead point to a parent that the backend never received, creating a disconnected subtree. Fix: audit every service boundary and every HTTP proxy configuration.
- Async context loss: A span is started in thread A, work is queued to thread B, and thread B creates child spans. Without the context being passed across the async boundary, thread B creates a new root span instead of a child, and the trace splits into two unrelated trees.
- Clock skew: Span timestamps come from the host where the SDK runs. If hosts have clock drift (NTP not configured), child spans can appear to start before their parent starts, a physically impossible state. Production fix: run chrony or ntpd on all nodes; some backends (Jaeger, for example) can also apply a clock skew adjustment heuristic when displaying traces.
- Sampling mismatch: With independent head-based sampling at different rates per service (say Service A samples 10% and Service B samples 5%), a trace sampled at A may not be sampled at B, creating an incomplete trace. Use tail-based sampling at the Collector layer to make the keep/drop decision once, centrally, for the entire trace.
- Span batch dropped under load: The OTel SDK batches spans in memory before exporting. Under a traffic spike, if the queue fills faster than the exporter can drain it, spans are dropped. Monitor otelcol_exporter_send_failed_spans_total in the Collector and size the relevant queues for your peak load (the SDK batch span processor's maximum queue size and the Collector exporter's sending queue queue_size). A dead Collector or network partition silently drops all spans, so build alerting on export failure metrics.
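To illustrate why a central, tail-based decision avoids the sampling mismatch described above, the sketch below groups finished spans by trace ID and keeps or drops each trace as a whole. The policy (keep on any error or any slow span), the 500 ms threshold, and the dictionary span shape are assumptions for this example; a production Collector would also need a time window for deciding when a trace is complete.

```python
from collections import defaultdict


def tail_sample(spans: list, latency_threshold_ms: int = 500) -> list:
    """Keep every span of a trace if any span in it errored or was slow."""
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span["trace_id"]].append(span)

    kept = []
    for trace_spans in by_trace.values():
        has_error = any(s["status"] == "ERROR" for s in trace_spans)
        is_slow = any(s["duration_ms"] >= latency_threshold_ms for s in trace_spans)
        if has_error or is_slow:
            kept.extend(trace_spans)   # one decision for the whole trace
    return kept


finished = [
    {"trace_id": "t1", "name": "api-gateway", "status": "UNSET", "duration_ms": 40},
    {"trace_id": "t1", "name": "order-service", "status": "ERROR", "duration_ms": 25},
    {"trace_id": "t2", "name": "api-gateway", "status": "UNSET", "duration_ms": 30},
    {"trace_id": "t3", "name": "api-gateway", "status": "UNSET", "duration_ms": 900},
]
print(sorted({s["trace_id"] for s in tail_sample(finished)}))
# ['t1', 't3']
```

Because the decision is made per trace rather than per service, the healthy gateway span in t1 is kept alongside the failing order-service span.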
In the next lesson we move to the OpenTelemetry standard itself — its component model (SDK, API, Collector, semantic conventions), how it achieved vendor neutrality, and how to evaluate it against proprietary agents like the Datadog tracer or Dynatrace OneAgent for a greenfield service or a migration.