From Monolith to Microservices: A Migration Story
How we safely migrated a 10-year-old monolith without downtime.
In 2024, we were brought in to help a logistics company migrate their 10-year-old Java monolith to a microservices architecture. The monolith was a 2-million-line codebase, handling everything from order management to route optimization to invoicing. It worked, but deployments took 4 hours, a bug in the invoicing module could bring down the entire system, and scaling meant provisioning ever-larger servers.
This is the story of how we migrated that monolith to 14 microservices over 18 months without a single minute of downtime. Not because we're exceptionally clever, but because we followed a disciplined, incremental approach that minimized risk at every step.
Phase 1: Understand Before You Act
The worst thing you can do is start extracting services on day one. We spent the first 6 weeks doing nothing but understanding the monolith: mapping dependencies between modules, identifying database coupling, documenting the implicit contracts between components, and — critically — understanding the business domains.
We used a combination of static analysis (dependency graphs from code), runtime analysis (distributed tracing to map actual call patterns), and domain expert interviews (talking to the team who built and maintained the monolith). The output was a domain map showing 14 distinct bounded contexts and a coupling matrix showing which modules were tightly intertwined.
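A coupling matrix can be computed mechanically from traced call edges. The sketch below is illustrative only, not our actual tooling; the `CallEdge` shape and module names are hypothetical.

```typescript
// Hypothetical sketch: build a module-level coupling matrix from traced
// call edges (caller module -> callee module). The edge shape and module
// names are illustrative, not the actual analysis tooling we used.
type CallEdge = { caller: string; callee: string };

function couplingMatrix(edges: CallEdge[]): Map<string, Map<string, number>> {
  const matrix = new Map<string, Map<string, number>>();
  for (const { caller, callee } of edges) {
    if (caller === callee) continue; // intra-module calls are not coupling
    const row = matrix.get(caller) ?? new Map<string, number>();
    row.set(callee, (row.get(callee) ?? 0) + 1);
    matrix.set(caller, row);
  }
  return matrix;
}
```

Modules with high mutual call counts are poor candidates for early extraction; start with modules whose rows and columns in the matrix are sparse.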
Draw the dependency graph before drawing the service boundaries. Services should be cut along natural seams in the code, not along organizational boundaries or wishful-thinking architectures.
Phase 2: The Strangler Fig Pattern
We used the Strangler Fig pattern: instead of rewriting the monolith from scratch, we incrementally built new services around it, routing traffic to the new services as they became ready. The monolith continued serving traffic throughout the entire migration — there was never a "big bang" cutover.
The key infrastructure for this pattern is an API gateway that sits in front of both the monolith and the new services. Initially, 100% of traffic goes to the monolith. As each service is extracted and validated, the gateway routes that specific traffic to the new service. The monolith shrinks over time until it's no longer needed.
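Internally, weighted routing amounts to a weighted random choice among backends. This is a hypothetical sketch of how a gateway might implement it, not the actual gateway code; the `Backend` type is an assumption for illustration.

```typescript
// Hypothetical sketch of weight-based backend selection. A real gateway
// implements this internally; the Backend type here is illustrative.
type Backend = { service: string; weight: number };

function pickBackend(backends: Backend[], roll: number): string {
  // roll is a number in [0, 1), e.g. from Math.random()
  const total = backends.reduce((sum, b) => sum + b.weight, 0);
  let threshold = roll * total;
  for (const b of backends) {
    threshold -= b.weight;
    if (threshold < 0) return b.service;
  }
  return backends[backends.length - 1].service; // guard against rounding error
}
```

With weights 10/90, roughly one request in ten exercises the new service, which is what makes the canary step low-risk: a bad release degrades a small slice of traffic, and setting the weight back to zero is an instant rollback.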
# API Gateway routing — gradual migration
routes:
  # Fully migrated — 100% to new service
  - path: /api/v1/notifications/*
    service: notification-service
    weight: 100
  # In progress — canary at 10%
  - path: /api/v1/invoicing/*
    backends:
      - service: invoicing-service
        weight: 10
      - service: monolith
        weight: 90
  # Not yet migrated — still on monolith
  - path: /api/v1/orders/*
    service: monolith
    weight: 100
  # Default fallback
  - path: /api/v1/*
    service: monolith
    weight: 100

Phase 3: Database Decomposition
The hardest part of any monolith migration isn't extracting the code — it's decomposing the shared database. Our monolith had 340 tables in a single PostgreSQL database, with joins spanning what should have been separate domains. An order query would join across orders, customers, products, inventory, and shipping tables in a single SQL statement.
We used a three-step approach for each service extraction: (1) create a read-only API in the new service that reads from the shared database, (2) implement Change Data Capture (CDC) with Debezium to replicate relevant tables to the service's own database, (3) once the service's database is in sync and validated, switch reads to the local database and writes to the new service's API.
- Identify tables owned by the bounded context being extracted
- Create database views to abstract the query patterns used by other modules
- Set up CDC replication from shared DB to service-specific DB
- Validate data consistency between shared and service databases (automated nightly diff)
- Switch service reads to local database, maintain dual-write for safety
- After 2 weeks of dual-write validation, cut over writes to service API
- Deprecate direct database access from the monolith — route through service API
- Drop CDC replication once the monolith no longer accesses those tables
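The dual-write step above can be sketched as a thin wrapper that writes to both stores, with the shared database remaining the source of truth until cutover. The `InvoiceStore` interface and names are hypothetical, chosen for illustration; they are not our production code.

```typescript
// Hypothetical sketch of the dual-write step: writes go to both the
// shared database and the new service API. The shared DB stays the
// source of truth until cutover; the interface names are illustrative.
interface InvoiceStore {
  saveInvoice(id: string, payload: object): Promise<void>;
}

class DualWriter implements InvoiceStore {
  constructor(
    private primary: InvoiceStore,   // shared DB (source of truth)
    private secondary: InvoiceStore, // new invoicing-service API
    private onMismatch: (id: string, err: unknown) => void,
  ) {}

  async saveInvoice(id: string, payload: object): Promise<void> {
    await this.primary.saveInvoice(id, payload); // must succeed
    try {
      await this.secondary.saveInvoice(id, payload); // best effort
    } catch (err) {
      // Never fail the request on the secondary path; record the
      // divergence so the nightly diff job can reconcile it.
      this.onMismatch(id, err);
    }
  }
}
```

The asymmetry is deliberate: a failure on the new service's path must not break the request, but it must be recorded, because a silent divergence is exactly what the validation window is there to catch.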
Database decomposition is where migrations fail. Plan for 60% of your total migration effort to be spent on data. Don't underestimate the complexity of untangling cross-domain joins and ensuring data consistency during the transition.
Phase 4: Observability First
Before extracting the first service, we invested heavily in observability. Distributed tracing (Jaeger), centralized logging (ELK stack), metrics (Prometheus + Grafana), and synthetic monitoring. This wasn't optional — it was a prerequisite. In a distributed system, you can't debug problems by reading a single log file. Without observability, you're flying blind.
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { JaegerExporter } from '@opentelemetry/exporter-jaeger';
import { Resource } from '@opentelemetry/resources';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';
import { PgInstrumentation } from '@opentelemetry/instrumentation-pg';

// Auto-instrument HTTP, Express, and PostgreSQL
export function initTracing(serviceName: string) {
  const provider = new NodeTracerProvider({
    // The provider expects a Resource instance, not a plain object
    resource: new Resource({ 'service.name': serviceName }),
  });
  provider.addSpanProcessor(new BatchSpanProcessor(new JaegerExporter()));
  provider.register();
  registerInstrumentations({
    instrumentations: [
      new HttpInstrumentation(),
      new ExpressInstrumentation(),
      new PgInstrumentation(),
    ],
  });
}

Results and Lessons Learned
After 18 months, the monolith was fully decomposed into 14 microservices. The results were transformative:
- Deployment frequency: 1 deployment per week → 15+ deployments per day (per service)
- Deployment time: 4 hours → 8 minutes (per service)
- Incident blast radius: Full system outage → isolated service degradation
- Team autonomy: 1 team blocked by a shared codebase → 5 teams shipping independently
- Scaling: Vertical only (bigger servers) → Horizontal per-service (route optimization scales independently from invoicing)
- Zero downtime during the entire 18-month migration
The biggest lesson: the migration succeeded not because of any technical cleverness, but because of discipline. We resisted the urge to rewrite everything from scratch. We invested in understanding before acting. We migrated incrementally, validating each step before moving to the next. And we invested in observability that let us catch problems before users noticed them.
“The strangler fig pattern isn't the fastest path to microservices. But it's the safest. And in a system that processes $50M in logistics transactions daily, safety isn't optional — it's the entire point.”
— Thomas Weber, Vaarak Architecture
Should You Migrate?
Not every monolith needs to become microservices. If your monolith is well-structured, your team is small, and your scaling needs are modest, a monolith might be the right architecture for years to come. Microservices add operational complexity — distributed transactions, network failures, data consistency challenges, deployment orchestration. Only migrate when the pain of the monolith (deployment speed, team coupling, scaling limitations) outweighs the complexity of distribution.
If you do decide to migrate, start with the approach we've outlined: understand first, strangle incrementally, decompose data carefully, and observe everything. The migration is a marathon, not a sprint. Plan accordingly.
Thomas Weber
Principal Software Architect