The Need for Service Discovery
When you split a monolith into microservices you gain independent deployability, independent scaling, and team autonomy. You also gain a new class of problem that the monolith never had: how does Service A find Service B at runtime? In a monolith the answer is a local method call. In a distributed system the answer must account for the fact that every service can be restarted, scaled, moved to a different host, or replaced at any moment — often without any human intervention.
The Hard-Coded Address Trap
The most natural first instinct is to put a URL directly in a configuration file:
Then, in the calling service:
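As a minimal sketch of this pattern, the following uses only the Java standard library. The property key `inventory.service.url`, the `/inventory/{sku}` path, and the class name are hypothetical and chosen for illustration; the address reuses the 192.168.1.42 example discussed in this lesson. The configuration is inlined as a string so the example runs without an external file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.util.Properties;

final class HardCodedInventoryClient {

    // Stand-in for a properties file; key and address are illustrative assumptions.
    static final String CONFIG = "inventory.service.url=http://192.168.1.42:8081\n";

    // Reads the fixed base URL exactly as written in the configuration.
    static String baseUrl() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(CONFIG));
        } catch (IOException e) {
            throw new IllegalStateException("could not read configuration", e);
        }
        return props.getProperty("inventory.service.url");
    }

    // Builds the request URI for a stock lookup against that one fixed address.
    static URI stockUri(String sku) {
        return URI.create(baseUrl() + "/inventory/" + sku);
    }

    public static void main(String[] args) {
        System.out.println(stockUri("sku-123"));
    }
}
```

Every request this client builds targets the same host and port, whatever happens to the instance behind that address.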
This works perfectly in a lab. It falls apart the moment you operate in anything resembling a real environment.
Why Hard-Coded Addresses Break in the Cloud
Cloud and container platforms are designed around ephemerality: instances come and go, IP addresses are assigned dynamically, and no single host is considered permanent. Hard-coded addresses violate every one of these assumptions.
Dynamic IP Assignment
Every time a Docker container or Kubernetes pod is recreated, the platform typically assigns it a new IP address from a pool. The address 192.168.1.42 that was valid this morning may belong to a completely different service, or to no service at all, by afternoon. Your hard-coded URL now points at nothing, or worse, at the wrong thing.
Horizontal Scaling
Suppose traffic spikes and your platform auto-scales the Inventory service from one instance to three. You now have three valid addresses — :8081, :8082, :8083 — but your configuration file still mentions only the original one. The extra capacity is completely invisible to your callers. You pay for three instances and use one.
Rolling Deployments and Zero-Downtime Updates
During a rolling deployment, old and new instances coexist temporarily. If a load balancer or the caller itself tracks only one address, requests pile up against a single instance during the transition. Other instances — some running the old version, some the new — are again invisible.
Multi-Environment Configuration Drift
With hard-coded addresses, every environment (dev, staging, production) needs its own copy of every property file, and each copy must be kept in sync by hand. It is easy to miss one and deploy a staging build that still calls a production address, or vice versa. Because every request that build makes goes to the wrong place, the blast radius of that mistake is wide.
What Service Discovery Solves
Service discovery replaces hard-coded addresses with a registry — a shared, always-current directory of running service instances. Instead of asking "what is the fixed address of Inventory?", a caller asks "give me the address of any healthy Inventory instance right now."
The core contract has two sides:
- Registration: when a service starts, it publishes its own address, port, and health-check URL to the registry. When it shuts down cleanly, it de-registers. When it crashes, the registry evicts it after a configurable timeout.
- Discovery (lookup): when a caller needs to reach a service, it queries the registry by logical name (e.g., inventory-service) and receives back a list of healthy addresses. It can then pick one (round-robin, random, or by latency) without knowing anything about the underlying infrastructure.
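The two sides of this contract can be sketched as a small in-memory registry. This is an illustration under simplifying assumptions, not how any particular registry product is implemented: the class and method names are hypothetical, health is approximated by the time of the last heartbeat rather than a health-check URL, and the current time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class InMemoryRegistry {

    private final long timeoutMillis;
    // service name -> (instance address -> time of last heartbeat, in millis)
    private final Map<String, Map<String, Long>> services = new ConcurrentHashMap<>();

    InMemoryRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Registration and heartbeat renewal are the same operation in this sketch.
    void register(String service, String address, long nowMillis) {
        services.computeIfAbsent(service, k -> new ConcurrentHashMap<>())
                .put(address, nowMillis);
    }

    // Clean shutdown: the instance removes its own entry.
    void deregister(String service, String address) {
        Map<String, Long> instances = services.get(service);
        if (instances != null) {
            instances.remove(address);
        }
    }

    // Lookup by logical name. Instances whose last heartbeat is older than
    // the timeout are evicted first, which covers the crash case.
    List<String> lookup(String service, long nowMillis) {
        Map<String, Long> instances = services.get(service);
        if (instances == null) {
            return List.of();
        }
        instances.entrySet().removeIf(e -> nowMillis - e.getValue() > timeoutMillis);
        List<String> result = new ArrayList<>(instances.keySet());
        Collections.sort(result); // stable order makes the output predictable
        return result;
    }
}
```

With a 30-second timeout, an instance that last sent a heartbeat at time 0 is still returned by a lookup at 25 seconds and is evicted by a lookup at 40 seconds.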
Client-Side vs. Server-Side Discovery
There are two broad patterns for using a registry, and it matters which one your framework implements:
- Server-side discovery: the caller sends a request to a load balancer or gateway, which queries the registry and forwards the request. The caller does not know about the registry at all. AWS Application Load Balancer and Kubernetes Services work this way.
- Client-side discovery: the caller itself queries the registry, selects an instance, and makes the HTTP call directly. Spring Cloud LoadBalancer (the replacement for Ribbon) implements this pattern. It is more flexible but puts registry awareness inside every service.
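The instance-selection step of client-side discovery can be sketched as a round-robin chooser. This is a standalone illustration, not the Spring Cloud LoadBalancer API: the class name is hypothetical, and the list of addresses is assumed to have come from a registry lookup.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinChooser {

    // Shared counter so concurrent callers still rotate through instances.
    private final AtomicInteger next = new AtomicInteger();

    // Picks the next instance in rotation from the list the registry returned.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no healthy instances available");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Because the list is passed in on every call, the chooser automatically spreads traffic across whatever instances the registry currently reports.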
Spring Cloud Gateway (covered later in this tutorial) often blends both: the gateway acts as a server-side entry point for external traffic but uses client-side discovery internally to route to backend services.
The Distributed Systems Trade-Off
A service registry is itself a distributed component, which means it must be highly available. If the registry goes down, services that rely on it for routing decisions can no longer discover new peers. Eureka (used through Spring Cloud Netflix) addresses this with a local cache: each client caches the last-known registry snapshot and continues routing from that cache even when the registry server is temporarily unreachable. The cached list may be stale, so a caller can occasionally pick an instance that no longer exists. This is a deliberate trade-off between consistency (always seeing the latest list) and availability (being able to route at all), which places Eureka on the AP side of the CAP theorem.
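The cache-and-fall-back idea can be sketched in a few lines. This is a simplified illustration, not the Eureka client itself: the class name is hypothetical, the registry is modelled as a `Supplier` that throws when unreachable, and the snapshot is refreshed on every call rather than on a background schedule.

```java
import java.util.List;
import java.util.function.Supplier;

final class CachingRegistryClient {

    // Stand-in for a remote registry call; assumed to throw when unreachable.
    private final Supplier<List<String>> registryFetch;
    // Last snapshot that was fetched successfully.
    private volatile List<String> lastKnown = List.of();

    CachingRegistryClient(Supplier<List<String>> registryFetch) {
        this.registryFetch = registryFetch;
    }

    // Returns fresh data when the registry answers, the stale snapshot otherwise.
    List<String> instances() {
        try {
            lastKnown = List.copyOf(registryFetch.get());
        } catch (RuntimeException e) {
            // Registry unreachable: favour availability and keep the old list.
        }
        return lastKnown;
    }
}
```

The caller keeps routing during a registry outage, at the cost of possibly using addresses that are no longer valid.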
A Concrete Before-and-After
Without service discovery — what a caller's code looks like:
With service discovery and Spring Cloud LoadBalancer — what the same call looks like:
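The contrast can be sketched with the standard library alone. The `/inventory/{sku}` path, the class name, and the `lookup` function are hypothetical; `lookup` stands in for whatever resolves a logical name to live addresses. With Spring Cloud LoadBalancer, that resolution happens inside a `RestTemplate` bean annotated with `@LoadBalanced`, so the caller writes the logical name directly as the URL host (for example `http://inventory-service/...`).

```java
import java.net.URI;
import java.util.List;
import java.util.function.Function;

final class BeforeAndAfter {

    // Before: the caller embeds a fixed host and port.
    static URI hardCoded(String sku) {
        return URI.create("http://192.168.1.42:8081/inventory/" + sku);
    }

    // After: the caller names the service; the lookup supplies a live address.
    static URI discovered(String sku, Function<String, List<String>> lookup) {
        List<String> instances = lookup.apply("inventory-service");
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instance of inventory-service");
        }
        // Taking the first entry keeps the sketch short; a load balancer would rotate.
        String address = instances.get(0);
        return URI.create("http://" + address + "/inventory/" + sku);
    }

    public static void main(String[] args) {
        System.out.println(hardCoded("sku-123"));
        System.out.println(discovered("sku-123", name -> List.of("10.0.0.7:8081")));
    }
}
```

Only the second method keeps working unchanged when the set of instances changes, because the address is obtained at call time instead of being written into the code.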
The calling code never knows the real IP. It never changes when instances are added, removed, or replaced. The registry and the load-balancing layer handle all of that transparently.
Summary
Hard-coded service addresses are a deceptively simple solution that breaks as soon as your infrastructure becomes dynamic — which is the default in any cloud or container environment. The core problems are dynamic IP assignment, invisible horizontal scaling, rolling-deployment complexity, and multi-environment drift. Service discovery solves these by introducing a registry that maps logical service names to live instance addresses, shifts infrastructure knowledge out of application code, and enables the load-balancing and resilience patterns the following lessons build on. In the next lesson you will implement this registry using Spring Cloud Netflix Eureka.