The Circuit Breaker Pattern
In a distributed system, remote calls can fail. A downstream service might be slow, unreachable, or returning errors. Without a protection mechanism, a single misbehaving dependency can cascade: threads pile up waiting for a timeout, the thread pool is exhausted, and your service dies too — even though your code was perfectly fine. The circuit breaker is the standard solution to this problem.
How a Circuit Breaker Works
A circuit breaker wraps a remote call and tracks its outcomes. It operates in three states:
- CLOSED: The normal operating state. Calls pass through. The breaker counts failures within a sliding window.
- OPEN: The failure threshold has been breached. The breaker immediately rejects all calls with a CallNotPermittedException, without even trying the remote service. This gives the downstream system time to recover.
- HALF_OPEN: After a configurable wait duration the breaker allows a limited number of probe calls through. If they succeed, it transitions back to CLOSED. If they fail, it returns to OPEN.
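The three states can be sketched as a small state machine in plain Java. This is an illustration only, not how Resilience4j is implemented. The class name SketchCircuitBreaker, its RejectedException, and the injectable clock are all hypothetical, and the sketch simplifies in two places: it uses a count-based window that is only evaluated once full, and it lets a single probe call decide the HALF_OPEN outcome.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative three-state breaker; not the Resilience4j implementation. */
final class SketchCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    /** Hypothetical stand-in for the rejection thrown while OPEN. */
    static final class RejectedException extends RuntimeException {
        RejectedException() { super("circuit breaker is OPEN"); }
    }

    private final boolean[] window;     // ring buffer of recent outcomes, true = failure
    private final double failureRatio;  // e.g. 0.5 opens the breaker at 50% failures
    private final long waitMillis;      // how long to stay OPEN before probing
    private final LongSupplier clock;   // injectable millisecond clock, handy for tests
    private int recorded;               // outcomes stored so far, capped at window.length
    private int next;                   // ring buffer slot to overwrite next
    private int failures;               // failures currently inside the window
    private State state = State.CLOSED;
    private long openedAt;

    SketchCircuitBreaker(int windowSize, double failureRatio, long waitMillis, LongSupplier clock) {
        this.window = new boolean[windowSize];
        this.failureRatio = failureRatio;
        this.waitMillis = waitMillis;
        this.clock = clock;
    }

    /** Current state; an OPEN breaker becomes HALF_OPEN once the wait has elapsed. */
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= waitMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Runs the call unless the breaker is OPEN, then records the outcome. */
    <T> T call(Supplier<T> remoteCall) {
        if (state() == State.OPEN) {
            throw new RejectedException(); // fail fast, the remote service is never touched
        }
        try {
            T result = remoteCall.get();
            recordOutcome(false);
            return result;
        } catch (RuntimeException e) {
            recordOutcome(true);
            throw e;
        }
    }

    private synchronized void recordOutcome(boolean failed) {
        if (state == State.OPEN) {
            return; // late result from a call that started before the breaker opened
        }
        if (state == State.HALF_OPEN) {
            // Single probe in this sketch: one result decides the next state.
            if (failed) { open(); } else { close(); }
            return;
        }
        if (recorded == window.length) {
            if (window[next]) { failures--; } // evict the oldest outcome
        } else {
            recorded++;
        }
        window[next] = failed;
        if (failed) { failures++; }
        next = (next + 1) % window.length;
        if (recorded == window.length && (double) failures / window.length >= failureRatio) {
            open();
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
    }

    private void close() {
        state = State.CLOSED;
        Arrays.fill(window, false);
        recorded = 0;
        next = 0;
        failures = 0;
    }
}
```

Passing the clock in as a LongSupplier keeps the wait duration testable without sleeping. A production breaker would also bound the number of concurrent probe calls in HALF_OPEN, which this sketch does not attempt.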
Resilience4j: The Modern Java Choice
Netflix Hystrix, once the standard, is in maintenance mode. Resilience4j is a lightweight fault-tolerance library inspired by Hystrix and built around Java's functional interfaces; its 2.x line, which provides the Spring Boot 3 starter, requires Java 17. Spring Cloud CircuitBreaker provides an abstraction layer, but it is common and acceptable to use the Resilience4j annotations and configuration directly in a Spring Boot 3 application, and that is what we will do here.
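Add the Resilience4j Spring Boot 3 starter (io.github.resilience4j:resilience4j-spring-boot3) to your pom.xml. The annotations also rely on spring-boot-starter-aop being on the classpath.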
Configuring the Circuit Breaker
Breakers are configured by name in application.yml. Each name corresponds to one or more annotated methods in your code.
Choose slidingWindowSize with care. With a 50% threshold, a window of 5 opens the breaker after just 3 failures, which is too trigger-happy for a service that has bursts of legitimate errors. A window of 100 makes the breaker slow to react. For most services, 10–20 with a 50% threshold is a sensible starting point.
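The window-size figures can be checked with a short calculation. The helper below is a hypothetical illustration. It assumes a count-based window that is full, and that the breaker opens once the failure rate is equal to or greater than the threshold.

```java
/** Back-of-the-envelope helper for sizing a count-based sliding window. */
final class WindowMath {
    /** Smallest number of failures in a full window that reaches the threshold. */
    static int failuresToOpen(int windowSize, double thresholdPercent) {
        return (int) Math.ceil(windowSize * thresholdPercent / 100.0);
    }

    public static void main(String[] args) {
        // Compare candidate window sizes at a 50% failure-rate threshold.
        for (int size : new int[] {5, 10, 20, 100}) {
            System.out.printf("window=%d -> opens at %d failures%n",
                    size, failuresToOpen(size, 50.0));
        }
    }
}
```

At 50%, a window of 5 trips on 3 failures, while a window of 100 needs 50. That is the trade-off between reacting quickly and tolerating bursts.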
Applying the Annotation
Annotate the method that makes the remote call. The name must match an instance key in your YAML. The fallbackMethod is called when the breaker is OPEN and rejects the call, or when the call itself throws an exception that the fallback's Throwable parameter can accept.
The fallback method must have the same parameters and return type as the original method, with a Throwable appended as the final argument. If the signatures do not match, Resilience4j silently ignores the fallback and rethrows the exception, which can be baffling to debug.
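Because a mismatched fallback is easy to miss, it can help to check the signature rule in a unit test. The sketch below is one way to do that, using plain reflection. It is not Resilience4j's own lookup code: it only encodes the rule stated above (same leading parameters, a trailing Throwable, a compatible return type) and assumes the original method is not overloaded. FallbackSignatureCheck and QuoteService are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Optional;

/** Checks the "same arguments plus a trailing Throwable" rule by reflection. */
final class FallbackSignatureCheck {
    static Optional<Method> findFallback(Class<?> type, String originalName, String fallbackName) {
        Method original = firstMethodNamed(type, originalName)
                .orElseThrow(() -> new IllegalArgumentException("no method " + originalName));
        Class<?>[] expected = original.getParameterTypes();
        for (Method candidate : type.getDeclaredMethods()) {
            if (!candidate.getName().equals(fallbackName)) {
                continue;
            }
            Class<?>[] actual = candidate.getParameterTypes();
            if (actual.length != expected.length + 1) {
                continue; // must have exactly one extra parameter
            }
            boolean sameLeadingArgs =
                    Arrays.equals(Arrays.copyOf(actual, expected.length), expected);
            boolean endsWithThrowable =
                    Throwable.class.isAssignableFrom(actual[actual.length - 1]);
            boolean compatibleReturn =
                    original.getReturnType().isAssignableFrom(candidate.getReturnType());
            if (sameLeadingArgs && endsWithThrowable && compatibleReturn) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    private static Optional<Method> firstMethodNamed(Class<?> type, String name) {
        return Arrays.stream(type.getDeclaredMethods())
                .filter(m -> m.getName().equals(name))
                .findFirst();
    }
}

/** Hypothetical service used only to exercise the check. */
final class QuoteService {
    String fetchQuote(String symbol) {
        throw new IllegalStateException("remote call failed");
    }

    // Matches: same argument as fetchQuote, plus the Throwable.
    String quoteFallback(String symbol, Throwable cause) {
        return "unavailable";
    }

    // Does not match: the trailing Throwable is missing.
    String noThrowableFallback(String symbol) {
        return "unavailable";
    }
}
```

Here findFallback(QuoteService.class, "fetchQuote", "quoteFallback") finds a match, while the same lookup for "noThrowableFallback" comes back empty.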
State Transitions in Practice
Resilience4j publishes circuit breaker events you can subscribe to for logging or metrics. In Spring Boot, when spring-boot-starter-actuator is present and the endpoints are exposed, the current breaker state is available on /actuator/circuitbreakers and recent events on /actuator/circuitbreakerevents.
Security and Distributed-Systems Trade-offs
Circuit breakers have important security implications that are easy to overlook:
- Authentication bypass risk: A fallback that returns a permissive or empty response must never bypass authorization checks. If your fallback silently returns "payment accepted" when the real service is down, an attacker can exploit the outage. Fallbacks should return safe, explicit failure states — never silent success.
- Information leakage: The CallNotPermittedException and its message should not be propagated to the API client. Translate it at the controller layer into a generic 503 response.
- Consistency: An OPEN breaker that rejects writes while a downstream service recovers can leave your database in a partially committed state. Design your fallback with idempotency and compensating transactions in mind.
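The first two points can be sketched in plain Java. Every type here is a hypothetical stand-in: BreakerOpenException plays the role of CallNotPermittedException, and HttpReply stands in for whatever response type your controller layer returns. The sketch shows a fallback that reports an explicit failure state, and an error translation that keeps internal details out of the response body.

```java
/** Sketch of a safe fallback and a generic 503 translation; all names are illustrative. */
final class SafeFallbackSketch {
    enum Status { ACCEPTED, DECLINED, UNAVAILABLE }

    static final class PaymentResult {
        final Status status;
        final String message;

        PaymentResult(Status status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    static final class HttpReply {
        final int statusCode;
        final String body;

        HttpReply(int statusCode, String body) {
            this.statusCode = statusCode;
            this.body = body;
        }
    }

    /** Hypothetical stand-in for the breaker's rejection exception. */
    static final class BreakerOpenException extends RuntimeException {
        BreakerOpenException(String breakerName) {
            super("CircuitBreaker '" + breakerName + "' is OPEN");
        }
    }

    /** Fallback: same argument as the real call plus the cause, and never reports success. */
    static PaymentResult chargeFallback(String orderId, Throwable cause) {
        return new PaymentResult(Status.UNAVAILABLE, "Payment was not processed");
    }

    /** Edge translation: the internal message belongs in the logs, not in the response. */
    static HttpReply toReply(Throwable error) {
        if (error instanceof BreakerOpenException) {
            return new HttpReply(503, "Service temporarily unavailable");
        }
        return new HttpReply(500, "Internal error");
    }
}
```

The caller can tell UNAVAILABLE apart from ACCEPTED, so an outage never turns into an implied success, and the breaker name never reaches the client.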
Use both: a timeout on your RestClient or Feign client to bound how long a single call waits, and a circuit breaker to stop making calls once a threshold of failures accumulates. They operate at different time scales and complement each other.
Summary
The circuit breaker pattern protects your service from cascading failures by failing fast when a downstream dependency is unhealthy. Resilience4j implements it via a simple annotation (@CircuitBreaker), a named YAML configuration, and an optional fallback method. Key decisions are: sliding window size, failure rate threshold, wait duration, and what constitutes a recordable failure. Pair a circuit breaker with a meaningful fallback, wire up the event publisher for observability, and expose the state via Actuator so your operations team can see real-time health. The next lesson adds retries, timeouts, and bulkheads to complete the resilience toolkit.