Don’t Build a Distributed Monolith

Definition: The Distributed Monolith

An architecture that combines all the network latency, failure modes, and operational complexity of a distributed system with the tight coupling and deployment bottlenecks of a monolith. It is the worst of both worlds.
It is one of the most common failure modes in modern system design. Teams, intending to build resilient and scalable microservices, accidentally create a system that is more brittle and more complex than the monolith they set out to replace.

This note captures the core definitions and common mistakes that lead to this anti-pattern, providing clear concepts to help articulate and avoid these architectural pitfalls.


The Four Architectures: Good vs. Bad

The quality of an architecture depends on two axes: how it's logically structured (coupling) and how it's physically deployed (distribution).

  1. Ball of Mud Monolith (Bad): Physically monolithic, logically monolithic. The classic spaghetti-code monolith where everything is tightly coupled. Changes in one area create bugs in another.
  2. Modular Monolith (Good): Physically monolithic, logically modular. A single, deployable application with well-defined internal boundaries. This is often the best starting point for most systems.
  3. True Microservices (Good, if you need it): Physically distributed, logically modular. Independent services that can be developed, deployed, and scaled separately. This is the ideal, but it comes at a high cost.
  4. Distributed Monolith (The Monster 👹): Physically distributed, logically monolithic. The architecture we must avoid. It looks like microservices on the surface, but a change in one service requires changes and redeployments across many others.

The Great Trade-Off: Consistency vs. Availability

The fundamental reason to choose a microservice architecture is to make a trade-off: you sacrifice Immediate Consistency (settling for eventual consistency) to gain High Availability.


If your team or business problem does not require this specific trade-off, you likely do not need microservices.

How Did We End Up With a Distributed Monolith?

1. The Shared Database
This is the most common and destructive mistake. Multiple "microservices" all read from and write to the same database. The schema becomes a shared contract: a change made for one service silently breaks the others, and every deployment must be coordinated around it.

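To make the failure mode concrete, here is a minimal sketch (using in-memory SQLite and invented service and table names, purely for illustration): two "services" share one database, and when the owning team refactors its schema, an unrelated service breaks at runtime.

```python
import sqlite3

# Anti-pattern sketch: two "services" reach into the same database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT, total REAL)")
db.execute("INSERT INTO orders VALUES (1, 'Ada', 99.9)")

def billing_service_total(conn):
    # Billing reads a table it does not own.
    return conn.execute("SELECT SUM(total) FROM orders").fetchone()[0]

def shipping_service_names(conn):
    # Shipping also reads it, coupling itself to the same schema.
    return [row[0] for row in conn.execute("SELECT customer_name FROM orders")]

# The orders team "privately" refactors its schema...
db.execute("ALTER TABLE orders RENAME COLUMN customer_name TO customer")

# ...and a service it never talks to breaks at runtime.
try:
    shipping_service_names(db)
    shipping_broken = False
except sqlite3.OperationalError:
    shipping_broken = True
```

With a real service boundary, shipping would hold its own copy of the data it needs, and the rename would be invisible to it.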

2. Chatty, Synchronous Communication
Services that make frequent, blocking (synchronous) calls to each other are a classic sign of incorrect boundaries. Every hop adds latency and adds another way for the whole request to fail.

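A back-of-the-envelope sketch of why chattiness hurts. The numbers here are illustrative assumptions, not measurements: with 99.9% availability and 50 ms per blocking call, failure probability and latency both compound with every hop.

```python
# Illustrative numbers, not measurements.
PER_CALL_AVAILABILITY = 0.999
PER_CALL_LATENCY_MS = 50

def request_path(hops: int) -> tuple[float, int]:
    """Availability and latency of a request that crosses `hops` services in series."""
    return PER_CALL_AVAILABILITY ** hops, PER_CALL_LATENCY_MS * hops

one_hop = request_path(1)    # (0.999, 50)
ten_hops = request_path(10)  # roughly (0.990, 500): 10x the latency, 10x the error budget
```

A request that fans through ten chatty services is only as reliable as the product of all ten availabilities, which is why poorly drawn boundaries show up directly in error rates.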

3. Starting with a Greenfield Microservices Project
It is far harder to get service boundaries right when starting from scratch ("greenfield") than when migrating an existing system ("brownfield"). An existing system shows you where the real seams are; a greenfield team is guessing at boundaries before it understands the domain.

What About Our Event-Driven Services on Kubernetes?

As an organization that relies heavily on stream processors (pulling from Pub/Sub, processing, and publishing new events), you might think we're safe from the classic distributed monolith. We don't have chatty synchronous calls, right?

The danger is still very real. The anti-pattern just takes a different form. The tight coupling simply shifts from synchronous API calls to the data schemas, processing chains, and shared infrastructure that connect our services.

Here are the common pitfalls for our kind of architecture:

1. Coupling Through a Shared Event Schema

This is the event-driven equivalent of the Shared Database. One service publishes an event, and five other services consume it. The event's data structure (its schema) becomes a rigid, shared contract: the moment the producer needs to change a field, every consumer must be updated and redeployed in lockstep.

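One way to loosen this coupling is the "tolerant reader" pattern. A hedged sketch with an invented OrderPlaced event (the field names and the v1/v2 shapes are made up for illustration): the brittle consumer assumes one exact shape and breaks when the producer evolves it, while the tolerant consumer accepts both shapes and ignores fields it does not use.

```python
import json

# Invented v2 event: "amount" grew from a flat float into a nested object.
event_v2 = json.dumps({
    "order_id": "o-42",
    "amount": {"value": 99.9, "currency": "EUR"},
})

def brittle_consumer(raw: str) -> float:
    event = json.loads(raw)
    return event["amount"] * 1.2  # assumes the old flat shape; raises on v2

def tolerant_consumer(raw: str) -> float:
    event = json.loads(raw)
    amount = event.get("amount", 0.0)
    if isinstance(amount, dict):  # accept the new nested shape too
        amount = amount.get("value", 0.0)
    return amount * 1.2
```

Tolerant readers buy you time, but they are a mitigation, not a cure: if five consumers all depend on the fine structure of one event, the boundary is probably in the wrong place.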

2. Coupling Through Long, Brittle Processing Chains

We often see a business process implemented as a long, sequential chain of services: Service A -> Service B -> Service C -> Service D. Each service pulls from the previous one's output topic, adds a little value, and publishes to the next topic. The whole process is now only as available as its least reliable link, and its end-to-end latency is the sum of every stage.
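With some illustrative stage latencies (the stage names and numbers are invented), the cost of the sequential chain is easy to see: end-to-end latency is the sum of every stage, whereas independent consumers fanning out from the source topic pay only the slowest stage.

```python
# Invented stage names and latencies, for illustration only.
stage_latency_ms = {"enrich": 40, "score": 60, "audit": 30, "notify": 20}

# A -> B -> C -> D: each stage waits on the previous stage's output topic.
sequential_chain_ms = sum(stage_latency_ms.values())  # every stage adds up

# Fan-out: every stage consumes the original event independently.
fan_out_ms = max(stage_latency_ms.values())  # only the slowest stage matters
```

Fan-out only applies where stages do not genuinely need each other's output; where a stage truly depends on an upstream result, that dependency is the real coupling worth examining.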

3. Coupling Through The "God Topic"

This is the temptation to create a single, central Pub/Sub topic, like all-company-events, where every service publishes everything. Every consumer now depends on every producer: it must filter out all the traffic it doesn't care about, and any producer's format change can break it.
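A small sketch (with invented message types) of the tax every subscriber pays on a god topic: it must receive, deserialize, and discard everything it does not care about, no matter how little of the stream is relevant to it.

```python
# Invented message types on a hypothetical all-company-events topic.
stream = [{"type": t} for t in ["order", "login", "click", "click", "order"]]

def consume(msg_type: str, messages: list[dict]) -> tuple[list[dict], int]:
    """Return the messages a consumer wants, plus how many it had to discard."""
    relevant = [m for m in messages if m["type"] == msg_type]
    discarded = len(messages) - len(relevant)
    return relevant, discarded

orders, wasted = consume("order", stream)  # 2 kept, 3 received only to be thrown away
```

Per-domain topics (or subscription-level filters where the broker supports them) keep each consumer's contract limited to the events it actually needs.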

Actionable Advice

Resources