From Monolith to Microservices: A Migration Story
How we safely migrated a 10-year-old monolith without downtime.
In 2024, we were brought in to help a logistics company migrate their 10-year-old Java monolith to a microservices architecture. The monolith was a 2-million-line codebase, handling everything from order management to route optimization to invoicing. It worked, but deployments took 4 hours, a bug in the invoicing module could bring down the entire system, and scaling meant provisioning ever-larger servers.
This is the story of how we migrated that monolith to 14 microservices over 18 months without a single minute of downtime. Not because we're exceptionally clever, but because we followed a disciplined, incremental approach that minimized risk at every step.
Phase 1: Understand Before You Act
The worst thing you can do is start extracting services on day one. We spent the first 6 weeks doing nothing but understanding the monolith: mapping dependencies between modules, identifying database coupling, documenting the implicit contracts between components, and — critically — understanding the business domains.
We used a combination of static analysis (dependency graphs from code), runtime analysis (distributed tracing to map actual call patterns), and domain expert interviews (talking to the team who built and maintained the monolith). The output was a domain map showing 14 distinct bounded contexts and a coupling matrix showing which modules were tightly intertwined.
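A coupling matrix can be computed mechanically from traced call edges. The sketch below is illustrative only, not our actual tooling; the `CallEdge` shape and module names are hypothetical.

```typescript
// Hypothetical sketch: build a module-level coupling matrix from traced
// call edges (caller module -> callee module). The edge shape and module
// names are illustrative, not the actual analysis tooling we used.
type CallEdge = { caller: string; callee: string };

function couplingMatrix(edges: CallEdge[]): Map<string, Map<string, number>> {
  const matrix = new Map<string, Map<string, number>>();
  for (const { caller, callee } of edges) {
    if (caller === callee) continue; // intra-module calls are not coupling
    const row = matrix.get(caller) ?? new Map<string, number>();
    row.set(callee, (row.get(callee) ?? 0) + 1);
    matrix.set(caller, row);
  }
  return matrix;
}
```

Modules with high mutual call counts are poor candidates for early extraction; start with modules whose rows and columns in the matrix are sparse.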
Draw the dependency graph before drawing the service boundaries. Services should be cut along natural seams in the code, not along organizational boundaries or wishful-thinking architectures.
Phase 2: The Strangler Fig Pattern
We used the Strangler Fig pattern: instead of rewriting the monolith from scratch, we incrementally built new services around it, routing traffic to the new services as they became ready. The monolith continued serving traffic throughout the entire migration — there was never a "big bang" cutover.
The key infrastructure for this pattern is an API gateway that sits in front of both the monolith and the new services. Initially, 100% of traffic goes to the monolith. As each service is extracted and validated, the gateway routes that specific traffic to the new service. The monolith shrinks over time until it's no longer needed.
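Internally, weighted routing amounts to a weighted random choice among backends. This is a hypothetical sketch of how a gateway might implement it, not the actual gateway code; the `Backend` type is an assumption for illustration.

```typescript
// Hypothetical sketch of weight-based backend selection. A real gateway
// implements this internally; the Backend type here is illustrative.
type Backend = { service: string; weight: number };

function pickBackend(backends: Backend[], roll: number): string {
  // roll is a number in [0, 1), e.g. from Math.random()
  const total = backends.reduce((sum, b) => sum + b.weight, 0);
  let threshold = roll * total;
  for (const b of backends) {
    threshold -= b.weight;
    if (threshold < 0) return b.service;
  }
  return backends[backends.length - 1].service; // guard against rounding error
}
```

With weights 10/90, roughly one request in ten exercises the new service, which is what makes the canary step low-risk: a bad release degrades a small slice of traffic, and setting the weight back to zero is an instant rollback.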
# API Gateway routing — gradual migration
routes:
  # Fully migrated — 100% to new service
  - path: /api/v1/notifications/*
    service: notification-service
    weight: 100
  # In progress — canary at 10%
  - path: /api/v1/invoicing/*
    backends:
      - service: invoicing-service
        weight: 10
      - service: monolith
        weight: 90
  # Not yet migrated — still on monolith
  - path: /api/v1/orders/*
    service: monolith
    weight: 100
  # Default fallback
  - path: /api/v1/*
    service: monolith
    weight: 100

Phase 3: Database Decomposition
The hardest part of any monolith migration isn't extracting the code — it's decomposing the shared database. Our monolith had 340 tables in a single PostgreSQL database, with joins spanning what should have been separate domains. An order query would join across orders, customers, products, inventory, and shipping tables in a single SQL statement.
We used a three-step approach for each service extraction: (1) create a read-only API in the new service that reads from the shared database, (2) implement Change Data Capture (CDC) with Debezium to replicate relevant tables to the service's own database, (3) once the service's database is in sync and validated, switch reads to the local database and writes to the new service's API.
- Identify tables owned by the bounded context being extracted
- Create database views to abstract the query patterns used by other modules
- Set up CDC replication from shared DB to service-specific DB
- Validate data consistency between shared and service databases (automated nightly diff)
- Switch service reads to local database, maintain dual-write for safety
- After 2 weeks of dual-write validation, cut over writes to service API
- Deprecate direct database access from the monolith — route through service API
- Drop CDC replication once the monolith no longer accesses those tables
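The dual-write step above can be sketched as a thin wrapper that writes to both stores, with the shared database remaining the source of truth until cutover. The `InvoiceStore` interface and names are hypothetical, chosen for illustration; they are not our production code.

```typescript
// Hypothetical sketch of the dual-write step: writes go to both the
// shared database and the new service API. The shared DB stays the
// source of truth until cutover; the interface names are illustrative.
interface InvoiceStore {
  saveInvoice(id: string, payload: object): Promise<void>;
}

class DualWriter implements InvoiceStore {
  constructor(
    private primary: InvoiceStore,   // shared DB (source of truth)
    private secondary: InvoiceStore, // new invoicing-service API
    private onMismatch: (id: string, err: unknown) => void,
  ) {}

  async saveInvoice(id: string, payload: object): Promise<void> {
    await this.primary.saveInvoice(id, payload); // must succeed
    try {
      await this.secondary.saveInvoice(id, payload); // best effort
    } catch (err) {
      // Never fail the request on the secondary path; record the
      // divergence so the nightly diff job can reconcile it.
      this.onMismatch(id, err);
    }
  }
}
```

The asymmetry is deliberate: a failure on the new service's path must not break the request, but it must be recorded, because a silent divergence is exactly what the validation window is there to catch.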
Database decomposition is where migrations fail. Plan for 60% of your total migration effort to be spent on data. Don't underestimate the complexity of untangling cross-domain joins and ensuring data consistency during the transition.
Phase 4: Observability First
Before extracting the first service, we invested heavily in observability. Distributed tracing (Jaeger), centralized logging (ELK stack), metrics (Prometheus + Grafana), and synthetic monitoring. This wasn't optional — it was a prerequisite. In a distributed system, you can't debug problems by reading a single log file. Without observability, you're flying blind.
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { JaegerExporter } from '@opentelemetry/exporter-jaeger';
import { Resource } from '@opentelemetry/resources';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';
import { PgInstrumentation } from '@opentelemetry/instrumentation-pg';

// Auto-instrument HTTP, Express, and PostgreSQL
export function initTracing(serviceName: string) {
  const provider = new NodeTracerProvider({
    // The provider expects a Resource instance, not a plain object
    resource: new Resource({ 'service.name': serviceName }),
  });
  provider.addSpanProcessor(new BatchSpanProcessor(new JaegerExporter()));
  provider.register();
  registerInstrumentations({
    instrumentations: [
      new HttpInstrumentation(),
      new ExpressInstrumentation(),
      new PgInstrumentation(),
    ],
  });
}

Results and Lessons Learned
After 18 months, the monolith was fully decomposed into 14 microservices. The results were transformative:
- Deployment frequency: 1 deployment per week → 15+ deployments per day (per service)
- Deployment time: 4 hours → 8 minutes (per service)
- Incident blast radius: Full system outage → isolated service degradation
- Team autonomy: 1 team blocked by a shared codebase → 5 teams shipping independently
- Scaling: Vertical only (bigger servers) → Horizontal per-service (route optimization scales independently from invoicing)
- Zero downtime during the entire 18-month migration
The biggest lesson: the migration succeeded not because of any technical cleverness, but because of discipline. We resisted the urge to rewrite everything from scratch. We invested in understanding before acting. We migrated incrementally, validating each step before moving to the next. And we invested in observability that let us catch problems before users noticed them.
“The strangler fig pattern isn't the fastest path to microservices. But it's the safest. And in a system that processes $50M in logistics transactions daily, safety isn't optional — it's the entire point.”
— Thomas Weber, Vaarak Architecture
Should You Migrate?
Not every monolith needs to become microservices. If your monolith is well-structured, your team is small, and your scaling needs are modest, a monolith might be the right architecture for years to come. Microservices add operational complexity — distributed transactions, network failures, data consistency challenges, deployment orchestration. Only migrate when the pain of the monolith (deployment speed, team coupling, scaling limitations) outweighs the complexity of distribution.
If you do decide to migrate, start with the approach we've outlined: understand first, strangle incrementally, decompose data carefully, and observe everything. The migration is a marathon, not a sprint. Plan accordingly.
Thomas Weber
Principal Software Architect