Project: A Resilient, Observable Service
This capstone lesson pulls together every pattern from the tutorial into a single, production-ready Spring Boot 3 service. You will build an Order Processing Service that calls a downstream Inventory Service over HTTP, publishes domain events to Kafka, exposes Micrometer metrics, protects the downstream call with Resilience4j circuit breakers and retries, and emits distributed traces that Zipkin can collect. By the end you will have a service you can run, break, and watch recover, which is an essential skill for operating microservices at scale.
Project Architecture
The service has three concerns wired together:
- Inbound REST API: accepts POST /orders requests from clients.
- Resilient downstream call: queries inventory-service with a circuit breaker, retry, and timeout.
- Event publishing: emits an OrderPlaced event to a Kafka topic after a successful reservation.
Observability is not bolted on at the end — it is built in from line one via Micrometer, Spring Boot Actuator, and Micrometer Tracing.
Dependencies (pom.xml)
<dependencies>
    <!-- Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Bean Validation, required by @Valid in the controller -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-validation</artifactId>
    </dependency>
    <!-- AOP, required by the Resilience4j and @Observed annotations -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <!-- Resilience4j via Spring Cloud (version comes from the spring-cloud-dependencies BOM) -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
    </dependency>
    <!-- Kafka -->
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka</artifactId>
    </dependency>
    <!-- Actuator + Micrometer Prometheus -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>
    <!-- Distributed tracing (Brave / Zipkin) -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-tracing-bridge-brave</artifactId>
    </dependency>
    <dependency>
        <groupId>io.zipkin.reporter2</groupId>
        <artifactId>zipkin-reporter-brave</artifactId>
    </dependency>
</dependencies>
application.yml — Central Configuration
Keeping resilience policy, Kafka bootstrap, and observability sampling in one file makes the service's behaviour self-documenting.
spring:
  application:
    name: order-service
  kafka:
    bootstrap-servers: localhost:9092
    template:
      observation-enabled: true   # required for Kafka send spans and header propagation
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus,circuitbreakers
  tracing:
    sampling:
      probability: 1.0   # 100 % in dev; set 0.1 in prod

resilience4j:
  circuitbreaker:
    instances:
      inventory:
        slidingWindowSize: 10
        failureRateThreshold: 50
        waitDurationInOpenState: 10s
        permittedNumberOfCallsInHalfOpenState: 3
  retry:
    instances:
      inventory:
        maxAttempts: 3            # includes the initial call
        waitDuration: 500ms
        retryExceptions:
          - java.io.IOException
          - org.springframework.web.client.ResourceAccessException
  timelimiter:
    instances:
      inventory:
        timeoutDuration: 2s       # applies only to methods annotated with @TimeLimiter

inventory:
  base-url: http://localhost:8081
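To make the circuit-breaker settings concrete, here is a minimal sketch of a count-based sliding window in plain Java. It is a simplified, single-threaded model and not Resilience4j's implementation: the class name SlidingWindowBreaker is hypothetical, and the half-open state and waitDurationInOpenState are left out on purpose.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified, single-threaded model of a count-based circuit breaker (illustration only). */
final class SlidingWindowBreaker {
    enum State { CLOSED, OPEN }

    private final int windowSize;          // plays the role of slidingWindowSize
    private final double failureThreshold; // plays the role of failureRateThreshold, in percent
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;

    SlidingWindowBreaker(int windowSize, double failureThreshold) {
        this.windowSize = windowSize;
        this.failureThreshold = failureThreshold;
    }

    /** Records one call outcome and re-evaluates the state once the window is full. */
    void record(boolean failed) {
        if (state == State.OPEN) {
            return; // an open breaker rejects calls, so there is nothing new to record
        }
        outcomes.addLast(failed);
        if (outcomes.size() > windowSize) {
            outcomes.removeFirst(); // forget the oldest outcome
        }
        if (outcomes.size() == windowSize && failureRate() >= failureThreshold) {
            state = State.OPEN;
        }
    }

    /** Percentage of failed calls among the outcomes currently in the window. */
    double failureRate() {
        if (outcomes.isEmpty()) {
            return 0.0;
        }
        long failures = outcomes.stream().filter(failed -> failed).count();
        return 100.0 * failures / outcomes.size();
    }

    State state() {
        return state;
    }
}
```

With a window of 10 and a threshold of 50, five successes followed by five failures open the breaker on the tenth call; nine recorded calls are not enough to evaluate the rate.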
Domain Model
public record OrderRequest(String productId, int quantity) {}
public record OrderResult(String orderId, String status, String message) {}
public record InventoryResponse(String productId, boolean available, int stock) {}
public record OrderPlacedEvent(String orderId, String productId, int quantity, long timestamp) {}
InventoryClient — Resilient HTTP Call
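The client wraps RestTemplate with Resilience4j annotations. @CircuitBreaker and @Retry are applied at the method level. With Resilience4j's default aspect order, Retry wraps CircuitBreaker, so the fallback is declared on @Retry: a fallback on @CircuitBreaker would turn every failure into a normal return value, and the retry would never fire. The timelimiter settings in application.yml only take effect on methods annotated with @TimeLimiter, which must return a CompletionStage; for this synchronous call, bound the wait with connect and read timeouts on the RestTemplateBuilder instead. The fallback method must have the same return type and parameters as the protected method, plus a trailing Throwable parameter.
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class InventoryClient {

    private final RestTemplate restTemplate;

    @Value("${inventory.base-url}")
    private String baseUrl;

    public InventoryClient(RestTemplateBuilder builder) {
        // Building from the injected builder keeps the client instrumented for tracing
        this.restTemplate = builder.build();
    }

    // Retry is the outer aspect, so the fallback lives here and runs only after retrying gives up
    @Retry(name = "inventory", fallbackMethod = "inventoryFallback")
    @CircuitBreaker(name = "inventory")
    public InventoryResponse checkStock(String productId) {
        String url = baseUrl + "/inventory/" + productId;
        return restTemplate.getForObject(url, InventoryResponse.class);
    }

    // Called when the circuit is open, retries are exhausted, or the error is not retryable
    public InventoryResponse inventoryFallback(String productId, Throwable ex) {
        // Degrade gracefully: assume unavailable rather than crashing
        return new InventoryResponse(productId, false, 0);
    }
}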
Fallback must not do I/O. If your fallback itself calls a database or another service, you risk cascading failures. Return a cached value, a safe default, or a circuit-open error response — never another network hop.
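The interplay of retry and fallback can be sketched in plain Java. This is an illustration under simplifying assumptions, not Resilience4j's code: the helper name RetryWithFallback is hypothetical, only IOException is treated as retryable, and the wait between attempts is fixed. As with the maxAttempts setting, the attempt count includes the initial call.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Function;

/** Simplified retry-then-fallback helper (illustration only). */
final class RetryWithFallback {
    private RetryWithFallback() {}

    static <T> T call(Callable<T> action, int maxAttempts, long waitMillis,
                      Function<Throwable, T> fallback) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e; // retryable: treat as a transient I/O failure
                if (attempt < maxAttempts) {
                    sleep(waitMillis);
                }
            } catch (Exception e) {
                last = e; // not retryable: stop immediately
                break;
            }
        }
        // The fallback only computes a local default; it performs no I/O
        return fallback.apply(last);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }
}
```

A call that always throws IOException is attempted three times with maxAttempts set to 3 before the fallback value is returned, while a non-retryable exception reaches the fallback after a single attempt.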
OrderService — Business Logic with Metrics
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.observation.annotation.Observed;
import java.util.UUID;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final InventoryClient inventoryClient;
    private final KafkaTemplate<String, OrderPlacedEvent> kafkaTemplate;
    private final MeterRegistry meterRegistry;
    private final Counter ordersPlaced;
    private final Counter ordersRejected;

    public OrderService(InventoryClient inventoryClient,
                        KafkaTemplate<String, OrderPlacedEvent> kafkaTemplate,
                        MeterRegistry meterRegistry) {
        this.inventoryClient = inventoryClient;
        this.kafkaTemplate = kafkaTemplate;
        this.meterRegistry = meterRegistry;
        this.ordersPlaced = Counter.builder("orders.placed")
                .description("Successfully placed orders")
                .register(meterRegistry);
        this.ordersRejected = Counter.builder("orders.rejected")
                .description("Orders rejected due to stock")
                .register(meterRegistry);
    }

    @Observed(name = "order.place", contextualName = "placing-order")
    public OrderResult placeOrder(OrderRequest request) {
        InventoryResponse stock = inventoryClient.checkStock(request.productId());
        if (!stock.available() || stock.stock() < request.quantity()) {
            ordersRejected.increment();
            return new OrderResult(null, "REJECTED", "Insufficient stock");
        }
        String orderId = UUID.randomUUID().toString();
        OrderPlacedEvent event = new OrderPlacedEvent(
                orderId, request.productId(), request.quantity(),
                System.currentTimeMillis());
        // send() is asynchronous; inspect the returned CompletableFuture to detect publish failures
        kafkaTemplate.send("orders.placed", orderId, event);
        ordersPlaced.increment();
        // Record a distribution summary for order size
        meterRegistry.summary("order.quantity").record(request.quantity());
        return new OrderResult(orderId, "PLACED", "Order accepted");
    }
}
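@Observed creates a span automatically. Micrometer's ObservedAspect (an AOP aspect) intercepts the annotated method and wraps it in an observation, which Micrometer Tracing turns into a trace span. The aspect is not active on its own: it needs Spring AOP on the classpath and an ObservedAspect bean in the application context, which you declare yourself unless your Spring Boot version registers it through auto-configuration. Once the aspect is active, you get distributed traces without writing Tracer boilerplate in every method.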
OrderController
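import jakarta.validation.Valid;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/orders")
public class OrderController {

    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    // @Valid needs spring-boot-starter-validation and checks any constraints declared on OrderRequest
    @PostMapping
    public ResponseEntity<OrderResult> place(@RequestBody @Valid OrderRequest request) {
        OrderResult result = orderService.placeOrder(request);
        HttpStatus status = "PLACED".equals(result.status())
                ? HttpStatus.CREATED : HttpStatus.UNPROCESSABLE_ENTITY;
        return ResponseEntity.status(status).body(result);
    }
}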
Observing the Circuit Breaker in Action
Resilience4j contributes a dedicated Actuator endpoint, circuitbreakers, which the configuration above exposes over HTTP. While the downstream service is down, send several requests and watch the circuit open:
# Check circuit breaker state via Actuator
curl http://localhost:8080/actuator/circuitbreakers
# Abbreviated sample response (OPEN state after threshold exceeded):
# {
#   "circuitBreakers": {
#     "inventory": {
#       "failureRate": "60.0%",
#       "state": "OPEN",
#       "bufferedCalls": 10,
#       "failedCalls": 6
#     }
#   }
# }
Prometheus Metrics Endpoint
Scrape /actuator/prometheus to see all Micrometer meters, including the custom counters and the Resilience4j integration metrics:
# Custom application counters
orders_placed_total 42.0
orders_rejected_total 7.0
# Resilience4j circuit-breaker metrics (auto-registered)
resilience4j_circuitbreaker_calls_seconds_count{kind="successful",name="inventory"} 38.0
resilience4j_circuitbreaker_calls_seconds_count{kind="failed",name="inventory"} 4.0
resilience4j_circuitbreaker_state{name="inventory",state="closed"} 1.0
# Order quantity distribution
order_quantity_count 42.0
order_quantity_sum 187.0
order_quantity_max 15.0
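The sample output uses different names than the Java code because the Prometheus registry rewrites meter names when it exports them. The sketch below mimics only the part of that convention visible here (dots become underscores, counters gain a _total suffix, summaries fan out into _count, _sum, and _max); the class PrometheusNames is hypothetical, and the real convention also sanitizes other characters and appends unit suffixes.

```java
/** Simplified model of how dotted Micrometer names appear in Prometheus output (illustration only). */
final class PrometheusNames {
    private PrometheusNames() {}

    /** Counter names are snake-cased and suffixed with _total. */
    static String counter(String meterName) {
        String base = meterName.replace('.', '_');
        return base.endsWith("_total") ? base : base + "_total";
    }

    /** A distribution summary is exported as three series. */
    static String[] summary(String meterName) {
        String base = meterName.replace('.', '_');
        return new String[] { base + "_count", base + "_sum", base + "_max" };
    }
}
```

Under these assumptions, the counter registered as orders.placed is scraped as orders_placed_total, and the order.quantity summary yields order_quantity_count, order_quantity_sum, and order_quantity_max.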
Distributed Trace Flow
Every inbound HTTP request is assigned a trace ID by Micrometer Tracing, or joins the trace named in its incoming headers. The trace context propagates into the downstream HTTP call via the instrumented RestTemplate (which is why the client is built from the injected RestTemplateBuilder) and into the Kafka record headers once observation is enabled on the KafkaTemplate (spring.kafka.template.observation-enabled=true). In Zipkin you see:
- Span 1: POST /orders, the root span.
- Span 2: placing-order, the @Observed span inside OrderService.
- Span 3: GET inventory/{productId}, the outbound call; each Resilience4j retry attempt shows up as its own client span.
- Span 4: orders.placed send, the Kafka publish span.
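What travels between services is a small header that carries the trace ID. The sketch below parses and rebuilds a W3C traceparent header in plain Java, assuming W3C Trace Context propagation is in use (B3 is the other common format with Brave). The record TraceParent is hypothetical and handles only version 00; in the real service the tracing library does this for you.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal model of a W3C traceparent value: version-traceId-parentSpanId-flags. */
record TraceParent(String traceId, String spanId, boolean sampled) {

    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    /** Parses a version-00 header such as the one received on POST /orders. */
    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a version-00 traceparent: " + header);
        }
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 1; // lowest bit = sampled
        return new TraceParent(m.group(1), m.group(2), sampled);
    }

    /** Context for an outgoing call: same trace ID and sampling decision, new span ID. */
    TraceParent child(String newSpanId) {
        return new TraceParent(traceId, newSpanId, sampled);
    }

    /** Serializes back to the header value sent downstream. */
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }
}
```

Because every hop keeps the trace ID and changes only the span ID, Zipkin can stitch the HTTP and Kafka spans into one trace.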
Testing Resilience
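The cleanest way to test the circuit breaker is with @SpringBootTest plus a WireMock stub that returns 500 errors. @AutoConfigureWireMock comes from the spring-cloud-contract-wiremock test dependency, which is not part of the pom.xml shown earlier. The 500 responses are not listed in retryExceptions, so they are not retried, but the circuit breaker still records each one as a failure:
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.serverError;
import static com.github.tomakehurst.wiremock.client.WireMock.stubFor;
import static com.github.tomakehurst.wiremock.client.WireMock.urlPathMatching;
import static org.assertj.core.api.Assertions.assertThat;
import static org.springframework.boot.test.context.SpringBootTest.WebEnvironment.RANDOM_PORT;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.cloud.contract.wiremock.AutoConfigureWireMock;

@SpringBootTest(webEnvironment = RANDOM_PORT)
@AutoConfigureWireMock(port = 8081)
class OrderServiceResilienceTest {

    @Autowired TestRestTemplate client;
    @Autowired CircuitBreakerRegistry cbRegistry;

    @Test
    void circuitOpensAfterRepeatedFailures() {
        // Stub inventory to always fail
        stubFor(get(urlPathMatching("/inventory/.*"))
                .willReturn(serverError()));
        // Drive 10 calls through the service (matches slidingWindowSize)
        for (int i = 0; i < 10; i++) {
            client.postForEntity("/orders",
                    new OrderRequest("PROD-1", 1), OrderResult.class);
        }
        CircuitBreaker cb = cbRegistry.circuitBreaker("inventory");
        assertThat(cb.getState()).isEqualTo(CircuitBreaker.State.OPEN);
    }
}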
Test failure modes, not just happy paths. A service that passes all green-path tests but has never been tested under circuit-open conditions, timeout expiry, or Kafka broker unavailability is not production-ready. Chaos-test your service in CI, not in prod.
Production Checklist
- Set management.tracing.sampling.probability to 0.05–0.10 in production; 100 % sampling creates significant overhead and storage cost.
- Never expose /actuator/* to the public internet. Place it on an internal port or protect it with Spring Security.
- Alert on resilience4j_circuitbreaker_state transitions; an open circuit is a production incident, not just a metric blip.
- Handle the CompletableFuture returned by KafkaTemplate.send() (log the failure, retry, or store the event for re-delivery) so no event is silently dropped. DeadLetterPublishingRecoverer covers the consumer side: it republishes records that a listener failed to process.
- Tag all custom metrics with meaningful labels (region, env) so dashboards can slice by deployment.
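To see what the sampling probability in the first item means in practice, here is a minimal sketch of head-based probability sampling in plain Java. The class ProbabilitySampler is hypothetical and is not how Brave implements its sampler; it only illustrates that a probability of 0.1 keeps roughly one trace in ten, decided once when the trace starts.

```java
import java.util.Random;

/** Head-based sampler: each new trace is kept with the configured probability (illustration only). */
final class ProbabilitySampler {
    private final double probability;
    private final Random random;

    ProbabilitySampler(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    /** Decides once, at the start of a trace, whether its spans will be exported. */
    boolean shouldSample() {
        // nextDouble() is in [0.0, 1.0), so 1.0 always samples and 0.0 never does
        return random.nextDouble() < probability;
    }
}
```

Downstream services follow the decision carried in the propagated trace context rather than rolling the dice again, which keeps traces complete.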
Summary
You have assembled a service that tolerates downstream failures gracefully via circuit breakers and retries, decouples side effects via Kafka events, and surfaces its internal state to operations teams via Prometheus metrics, Actuator endpoints, and Zipkin traces. Each pattern from the earlier lessons has a concrete, testable role here. This is what production-grade microservice code looks like: not individually clever, but collectively robust and transparent.