Architecture Patterns

Strangler Fig: Migrating a Monolith

You have a working monolith. Real users depend on it. Revenue flows through it every hour. One day the pain becomes undeniable: deployments take four hours, a bug in the billing module takes down the entire platform, and five teams are stepping on each other in the same repository. You need to move to a modern, decomposed architecture — but you cannot stop the world to do it.

The Strangler Fig pattern, named by Martin Fowler after the tropical strangler fig tree that grows around a host tree and eventually replaces it, is the industry-standard technique for incrementally migrating a live monolith to microservices (or any newer architecture) without a risky, big-bang rewrite.

The "big-bang rewrite" failure mode: In 1998, Netscape decided to rewrite its browser from scratch. Almost three years passed between the 4.0 release and the first public beta of Netscape 6 in 2000, and that gap is widely credited with costing the company its market lead. Joel Spolsky called the decision "the single worst strategic mistake that any software company can make." The Strangler Fig pattern exists precisely to avoid this fate.

The Core Idea

Rather than rewriting the monolith all at once, you introduce a proxy or facade layer in front of it. Requests for new features, and for capabilities that have already been migrated, are routed to new services. The monolith continues to handle everything else. Over time, piece by piece, you strangle the monolith, until it can be decommissioned or handles only a residual slice of traffic.

Three phases structure every Strangler Fig migration:

  1. Transform — build the new service in parallel, without touching the monolith.
  2. Co-exist — deploy the proxy in front of both, routing a fraction of traffic to the new service.
  3. Eliminate — once the new service handles 100% of a capability, remove that code from the monolith.
[Figure: Strangler Fig migration phases. Phase 1 (Transform): the client calls the monolith (Orders, Users, Billing, Notifications, Catalog) while a new Catalog Service is built but not yet live. Phase 2 (Co-exist): a proxy/facade sends 80% of catalog traffic to the monolith and 20% to the Catalog Service. Phase 3 (Eliminate): the proxy fronts a shrinking monolith (Orders) alongside the extracted Catalog, User, and Notification services. The migration progresses over months or years.]
The three phases of the Strangler Fig pattern: build new service, co-exist behind a proxy, then eliminate from the monolith.

The Proxy: Your Migration Control Plane

The proxy (sometimes called a facade or strangler facade) is the linchpin of the pattern. It sits between clients and the backend, inspecting each request and deciding whether to route it to the old monolith or the new service. In practice this is often implemented as:

  • An API Gateway (AWS API Gateway, Kong, Nginx) with path-based routing rules.
  • A reverse proxy that is part of your existing infrastructure and can be updated without code changes.
  • Feature flags embedded in the proxy to enable percentage-based canary rollouts (5% → 25% → 100%).

A critical rule: the proxy must add no business logic. It only routes. Business logic in the proxy becomes a new monolith of its own.
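
The routing decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a production gateway: the `ROLLOUT` table, the backend names, and the idea of bucketing on a stable `sticky_key` (such as a user or session ID) are assumptions made for the example. In practice the same rules would usually live in the gateway's or reverse proxy's own configuration.

```python
import hashlib

# Hypothetical rollout table: path prefix -> percentage of traffic for the new service.
ROLLOUT = {"/catalog": 20, "/notifications": 100}


def bucket(key: str) -> int:
    """Map a stable key (e.g. a user or session ID) to a bucket in 0..99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(path: str, sticky_key: str, rollout: dict[str, int] = ROLLOUT) -> str:
    """Return "new-service" or "monolith" for a request. Routing only, no business logic."""
    for prefix, percent in rollout.items():
        # Match the prefix itself or anything below it, but not "/catalogue".
        if path == prefix or path.startswith(prefix + "/"):
            return "new-service" if bucket(sticky_key) < percent else "monolith"
    # Anything without a rollout rule stays on the monolith.
    return "monolith"
```

Hashing the key, rather than picking a random number per request, keeps each user on the same backend for the whole rollout step, which makes canary problems easier to reproduce. Raising a percentage from 20 to 100 completes the cut-over for that path without touching either backend.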

Data Migration: The Hard Part

Extracting a service is primarily a data problem. As long as the new service and the monolith share a database, you have not truly decoupled them — a schema change by one team breaks the other. The recommended approach is the Database Decomposition sequence:

  1. Duplicate — while co-existing, write all changes to both the monolith DB and the new service's DB (dual-write or event-driven sync).
  2. Verify — run a reconciliation job that confirms both stores are consistent for a period (days or weeks).
  3. Cut over — stop the monolith from writing to that table; the new service owns the data exclusively.
  4. Remove — drop the table from the monolith schema.
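
The Verify step can be sketched as a comparison of two snapshots keyed by primary key. The function name `reconcile` and the snapshot shape (a dict from ID to a tuple of column values) are assumptions for illustration; a real job would read both stores in batches and would have to tolerate rows that are still in flight.

```python
def reconcile(monolith_rows: dict[int, tuple], service_rows: dict[int, tuple]) -> dict[str, list[int]]:
    """Compare two snapshots keyed by primary key and report every kind of drift."""
    missing = sorted(k for k in monolith_rows if k not in service_rows)
    extra = sorted(k for k in service_rows if k not in monolith_rows)
    mismatched = sorted(
        k for k in monolith_rows
        if k in service_rows and monolith_rows[k] != service_rows[k]
    )
    return {
        "missing_in_service": missing,   # written to the monolith, never synced
        "extra_in_service": extra,       # present only in the new store
        "mismatched": mismatched,        # same key, different contents
    }
```

An empty report over the agreed observation period is one reasonable signal that the cut-over step is safe to attempt.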
Dual-write is dangerous if not handled carefully. If a write succeeds in the monolith DB but fails in the new service DB, you have a consistency split. Use an outbox pattern or event sourcing to make the dual-write reliable: the monolith writes an event to an outbox table in the same transaction, and a relay process publishes it to a message broker that the new service consumes.
[Figure: Safe data migration using the outbox pattern. Requests pass through the proxy/facade to the monolith, which writes the order and an outbox event to the monolith DB in one transaction (step 1). An outbox relay publishes the event to a message broker (Kafka); the new service consumes it and updates its own DB (step 2).]
Safe dual-write via the outbox pattern: the monolith writes to its own DB and an outbox table atomically; a relay then publishes events consumed by the new service.
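
A minimal sketch of the outbox flow, using Python's built-in `sqlite3` module as a stand-in for the monolith database and a plain callback as a stand-in for the broker. The table layout, the `OrderPlaced` event name, and the function names are assumptions for illustration only.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create a business table and the outbox table in the same database."""
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT NOT NULL, qty INTEGER NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn: sqlite3.Connection, order_id: int, sku: str, qty: int) -> None:
    """Write the order and its outbox event in one transaction."""
    with conn:  # commits on success, rolls back if either INSERT raises
        conn.execute("INSERT INTO orders (id, sku, qty) VALUES (?, ?, ?)", (order_id, sku, qty))
        payload = json.dumps({"id": order_id, "sku": sku, "qty": qty})
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", payload),
        )


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending events in order; mark each one only after publish() returns."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

Because an event is marked as published only after `publish` returns, a crash between the two steps causes the event to be sent again on the next run. That gives at-least-once delivery, so the consuming service should apply events idempotently, for example by upserting on the order ID.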

A Real-World Example: Shopify

Shopify ran on a single Rails monolith for over a decade. By 2016, 600+ engineers were working in the same codebase and deployments had become the team's biggest bottleneck. Rather than a big-bang rewrite, they executed a Strangler Fig migration over several years:

  • First, they modularised the monolith internally — enforcing strict module boundaries and removing cross-module database calls. This was their "Transform" phase.
  • They extracted the highest-traffic, most-isolated capabilities first: the storefront rendering engine, payment processing, and their API platform.
  • An internal proxy layer routed specific URL namespaces to new services while the monolith handled everything else.
  • Each extraction took a dedicated team 3–6 months including data migration, traffic migration, and the reconciliation period.

By 2020 Shopify had extracted dozens of services but the monolith — now a vastly smaller, modular core — still handled a significant portion of their traffic. The migration was not "finished" and may never be. That is intentional: the pattern is about continuous, risk-managed progress, not a destination.

Choosing What to Extract First

Not all modules are equal candidates for early extraction. Use this prioritisation framework:

  • High value, low coupling. A module that has few inbound dependencies from other modules but is causing significant pain (deployment bottlenecks, frequent outages) is the ideal first extraction. Notification systems and file-storage modules often fit this profile.
  • Seam availability. Michael Feathers' concept of a "seam" (a place where you can alter behaviour without editing the code at that place) is critical. Extract along existing module boundaries, not arbitrary slices.
  • Data ownership clarity. Start with modules that own their data clearly and do not participate in complex multi-table transactions with other modules. Billing and payments, despite being high value, often have deeply tangled data dependencies and should not be first.
  • Team ownership alignment. Extract a module when a single team can take full ownership. A module owned by three different teams is a coordination nightmare to extract.
Run the "Death Star" diagram first. Before migrating, draw every module as a node and every inter-module call as a directed edge. The result is usually a dense tangle (engineers call it the "Death Star diagram"). Modules with few incoming edges are your safest early extractions; heavily connected core modules should be last.
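
The ranking step of that diagram can be sketched as an in-degree count over the call graph. The input format (a list of caller/callee pairs) and the module names in the comment are hypothetical; in a real codebase the edges would come from static analysis or tracing data.

```python
from collections import defaultdict


def rank_extraction_candidates(calls):
    """Rank modules by ascending number of distinct inbound callers.

    calls: iterable of (caller, callee) module-name pairs.
    Returns a list of (module, inbound_count), ties broken by module name.
    """
    inbound = defaultdict(set)
    modules = set()
    for caller, callee in calls:
        modules.update((caller, callee))
        if caller != callee:  # ignore calls a module makes to itself
            inbound[callee].add(caller)
    return sorted(
        ((module, len(inbound[module])) for module in modules),
        key=lambda pair: (pair[1], pair[0]),
    )


# Hypothetical edges: ("orders", "billing") means orders calls billing.
# rank_extraction_candidates([("orders", "billing"), ("billing", "users")])
```

Inbound count is only a first filter. A module with few callers can still be hard to extract if it shares tables with its neighbours, so the data-ownership and team-ownership criteria above still apply to whatever ranks first.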

Risks and Mitigations

  • Proxy becomes a bottleneck or single point of failure. Deploy the proxy in a cluster with auto-scaling; add health checks and circuit breakers on both routes.
  • Indefinite co-existence. Teams sometimes lose momentum after the first extraction and never complete the migration. Establish a written decommission date for each migrated capability and hold the team accountable to it.
  • Increased latency during co-existence. A synchronous call that was in-process in the monolith now travels the network. Profile the hot paths first; consider whether async communication is acceptable for the extracted module.
  • Distributed testing complexity. Integration tests that hit the proxy must now handle two different backends. Invest in contract testing (e.g., Pact) to verify the new service's API matches what the proxy and other callers expect.
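
The circuit breaker mentioned in the first mitigation can be sketched as a small state holder that the proxy consults before forwarding to a backend. The class name, thresholds, and injectable `clock` are assumptions for illustration; most gateways and service meshes ship an equivalent feature that is preferable to hand-rolled code.

```python
import time


class CircuitBreaker:
    """Stop sending traffic to a backend after repeated failures, then retry after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to wait before allowing a trial request
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (healthy)

    def allow(self) -> bool:
        """Return True if a request may be sent to the backend right now."""
        if self.opened_at is None:
            return True
        # Open circuit: allow a trial request only once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # (re)open and restart the cool-down
```

During co-existence the monolith usually still contains the old code path, so one possible policy is to route a capability back to the monolith while the breaker for its new service is open. Whether that fallback is safe depends on which side currently owns the data.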

Summary

The Strangler Fig pattern is not glamorous: it is slow, methodical work that spans months or years. But it is one of the lowest-risk ways to migrate a live monolith, because the system keeps working at every step. The key principles are: always maintain a working system, use a proxy to control the traffic split, move data ownership before decommissioning code, extract along seams with clear ownership, and keep momentum with written decommission milestones. Many well-documented decompositions of large monoliths have followed some variation of this pattern.