Disaster Recovery and Microservices:
The BAC Theorem

Cesare Pautasso
http://www.pautasso.info
c.pautasso@ieee.org
@pautasso

University of Lugano (USI)

Faculty of Informatics

Architecture, Design and Web Information Systems Engineering

http://design.inf.usi.ch

Microservices follow the polyglot persistence principle, where every microservice manages its own persistence independently. In this talk we illustrate the ultimate consequences of this assumption, which can be summarized using the BAC theorem: only two out of the following three are possible: 1) a backed up microservice architecture; 2) full availability during normal operations; and 3) consistency after recovery. In other words, we will show that only a Microservice Architecture running without a Backup can be both Available and Consistent after disaster strikes. We will present and compare several coping strategies to deal with this limitation and discuss how it affects the monolith decomposition process at design time and the operational coupling between different microservices at run time.

Abstractions

[Figure: Components, Objects, Services, Resources]

Microservices

Will this component
always terminate?

function f() {
     ...
     return 42;
}

Development

Will this service
run forever?

while (true) {
     on f { return f() };
}

Operations

Will this microservice continuously change?

while (true) {
     on f {
-         return f()
+         //return f()
+         return f2()
     };
}

DevOps

Microservices

The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.

Martin Fowler and James Lewis

[Figure: in the Monolith, Customer and Order share a single Plan, Code, Build, Test, Release, Deploy, Operate, Monitor cycle; as Microservices, Customer and Order each have their own cycle]

For us service orientation means encapsulating the data with the business logic that operates on the data, with the only access through a published service interface. No direct database access is allowed from outside the service, and there’s no data sharing among the services.

Werner Vogels, interview: "Learning from the Amazon technology platform", ACM Queue, 4(4), June 30, 2006

How small is a Microservice?

Monolith

Macro

Micro

Nano

How small is a Microservice?

One team has full control of the entire DevOps cycle (code, build, test, release, deploy and operate)

Iterate fast: many small, frequent releases are better than a few large ones

[Figure: Cost vs. Size (Small to Large); labels: Release Delay, Speed, Runtime Overhead, Efficiency]
[Figure: Cost vs. Coupling (Loose to Tight); labels: Release Delay, Speed, Runtime Overhead, Efficiency]

Loosely Coupled Microservices

Avoid dependencies: if you have to hold a release until some other team is ready, you do not have two separate microservices

Avoid cascading failures: A failed microservice should not bring down the whole system

Do you:

Operate more than one microservice?

Use polyglot persistence?

Avoid storing everything in the same database?

Assume eventual consistency?

Microservices

Microservices prefer letting each service manage its own database, either different instances of the same database technology, or entirely different database systems - an approach called Polyglot Persistence.

M. Fowler, J. Lewis https://www.martinfowler.com/articles/microservices.html

Eventual Inconsistency

Microservice architectures are doomed to become inconsistent after disaster strikes

DevOps meets Disaster Recovery

[Figure: the Plan, Code, Build, Test, Release, Deploy, Operate, Monitor cycle extended with Backup and Recover]

How do you back up a monolith?

[Figure: Monolith, Database, Backup]

How do you back up one microservice?

[Figure: microservice, Database, Backup]

How do you back up
an entire microservice architecture?

[Figure: MySQL, MongoDB, Neo4J, Redis]

Are you sure?

Example

[Figure: Customer, Product, Shipment, Order]

Data relationships across microservices = Hypermedia

Independent Backup

[Figure: timelines (steps 1 to 6) of the Customer and Order microservices. Customer events: new C/1, C/1/name, new C/2, C/2/name, new C/3, C/3/name. Order events: new O/1, O/1 → C/1, new O/2, O/2 → C/2, new O/3, O/3 → C/3.]

Backups taken independently at different times

Disaster Strikes

[Figure: timelines (steps 1 to 6) of the Customer and Order microservices when disaster strikes. Customer events: new C/1, C/1/name, new C/2, C/2/name, new C/3, C/3/name. Order events: new O/1, O/1 → C/1, new O/2, O/2 → C/2, new O/3, O/3 → C/3.]

One microservice is lost

Recovery from Backup

[Figure: timelines of the Customer and Order microservices after recovery. The recovered Customer contains new C/1, C/1/name, new C/2, C/2/name. Order contains new O/1, O/1 → C/1, new O/2, O/2 → C/2, new O/3, O/3 → C/3.]

Broken link after recovery
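
The scenario on the last three slides can be replayed with a minimal sketch. The `Service` class, its in-memory `Map` store and the `brokenLinks` helper are hypothetical stand-ins introduced here for illustration; they are not part of any real backup tool.

```javascript
// Hypothetical in-memory stand-in for a microservice that owns its store.
class Service {
  constructor(name) {
    this.name = name;
    this.state = new Map();  // live data: key -> value
    this.backup = new Map(); // most recent snapshot
  }
  put(key, value) { this.state.set(key, value); }
  has(key) { return this.state.has(key); }
  takeBackup() { this.backup = new Map(this.state); }
  crash() { this.state = new Map(); }
  restore() { this.state = new Map(this.backup); }
}

const customer = new Service("Customer");
const order = new Service("Order");

customer.put("C/1", { name: "first" });
order.put("O/1", { customer: "C/1" });
customer.put("C/2", { name: "second" });
order.put("O/2", { customer: "C/2" });

customer.takeBackup(); // Customer is backed up early: it knows C/1 and C/2

customer.put("C/3", { name: "third" });
order.put("O/3", { customer: "C/3" });

order.takeBackup();    // Order is backed up later: it knows O/1, O/2, O/3

// Disaster strikes: Customer is lost and restored from its older backup.
customer.crash();
customer.restore();

// List every order whose customer link no longer resolves.
function brokenLinks(orders, customers) {
  const broken = [];
  for (const [id, o] of orders.state) {
    if (!customers.has(o.customer)) broken.push(`${id} -> ${o.customer}`);
  }
  return broken;
}

const broken = brokenLinks(order, customer);
console.log(broken); // [ 'O/3 -> C/3' ]
```

Both services were available for updates the whole time, and both were backed up, yet the recovered system contains a link to a customer that no longer exists.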

Eventual Inconsistency

Synchronized Backups

[Figure: timelines (steps 1 to 6) of the Customer and Order microservices with synchronized backups. Customer events: new C/1, C/1/name, new C/2, C/2/name. Order events: new O/1, O/1 → C/1, new O/2, O/2 → C/2.]

Backups of all microservices taken at the same time.

Limited Availability

[Figure: timelines (steps 1 to 6) of the Customer and Order microservices while the synchronized backup is taken. Customer events: new C/1, C/1/name, new C/2, C/2/name. Order events: new O/1, O/1 → C/1, new O/2, O/2 → C/2.]

No updates allowed anywhere while backing up the microservices
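
A minimal sketch of what a synchronized backup costs, assuming a hypothetical `BackedUpSystem` coordinator that owns one in-memory store per microservice. The class and its method names are illustrative only.

```javascript
// Hypothetical coordinator: rejects every update while a synchronized
// backup of all microservices is in progress.
class BackedUpSystem {
  constructor(serviceNames) {
    this.stores = new Map(serviceNames.map((n) => [n, new Map()]));
    this.snapshot = null;
    this.backingUp = false;
  }
  update(service, key, value) {
    if (this.backingUp) return false; // not available for updates
    this.stores.get(service).set(key, value);
    return true;
  }
  beginBackup() { this.backingUp = true; }
  endBackup() {
    // Nothing changed since beginBackup, so all copies describe the same state.
    this.snapshot = new Map(
      [...this.stores].map(([name, store]) => [name, new Map(store)])
    );
    this.backingUp = false;
  }
}

const system = new BackedUpSystem(["Customer", "Order"]);
system.update("Customer", "C/1", { name: "first" });
system.update("Order", "O/1", { customer: "C/1" });

system.beginBackup();
const rejected = system.update("Customer", "C/2", { name: "second" });
system.endBackup();

const acceptedAfter = system.update("Customer", "C/2", { name: "second" });
console.log(rejected, acceptedAfter); // false true
```

The snapshot is consistent across both stores, but only because every writer was turned away for the duration of the backup.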

The BAC theorem

When Backing up a microservice architecture,
it is not possible to have both
Consistency and Availability

Consistency

During normal operations, each microservice will eventually reach a consistent state

Referential integrity: links across microservice boundaries are guaranteed not to be broken

Availability

It is possible to both read and update the state of any microservice at any time

Backup

While backing up the system, is it possible to take a consistent snapshot of all microservices without affecting their availability?

No.

Backup + Availability

Backing up each microservice independently will eventually lead to inconsistency after recovering from backups taken at different times

Backup + Consistency

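Taking a consistent backup requires synchronizing the backups of all microservices and disallowing updates anywhere while they are taken, which limits availability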

Shared Database

[Figure: Customer, Product, Shipment, Order]

A centralized, shared database would require only one backup

Is this still a microservice architecture?

Shared Database, Split Schema

[Figure: Customer, Product, Shipment, Order, with separate schemas C, O, S, P]

A centralized, shared database would require only one backup

Each microservice must use a logically separate schema

What happened to polyglot persistence?

Orphan State

[Figure: timelines of the Customer and Order microservices. Customer events: new C/1, C/1/name, new C/2, C/2/name, new C/3, C/3/name. The recovered Order contains new O/1, O/1 → C/1, new O/2, O/2 → C/2; new O/3 and O/3 → C/3 are lost.]

Orphan state is no longer referenced after recovery
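
Orphan state can be spotted by comparing which customers were referenced before the disaster with which are referenced after recovery. This is a minimal sketch; `orphanedByRecovery` and the shape of the order records are assumptions made for illustration.

```javascript
// Returns the customer ids that some order referenced before the disaster
// but that no order references after recovery.
function orphanedByRecovery(ordersBefore, ordersAfter) {
  const stillReferenced = new Set(ordersAfter.map((o) => o.customer));
  const referencedBefore = new Set(ordersBefore.map((o) => o.customer));
  return [...referencedBefore].filter((id) => !stillReferenced.has(id));
}

const ordersBefore = [
  { id: "O/1", customer: "C/1" },
  { id: "O/2", customer: "C/2" },
  { id: "O/3", customer: "C/3" },
];

// Order was restored from a backup taken before O/3 was created.
const ordersAfter = ordersBefore.slice(0, 2);

const orphans = orphanedByRecovery(ordersBefore, ordersAfter);
console.log(orphans); // [ 'C/3' ]
```

In practice the pre-disaster list is exactly what was lost, so a real system would need some other record of past references (for example a log kept by the Customer service) to detect orphans.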

Unstoppable System

[Figure: Customer, Product, Shipment, Order]

An expensive, replicated, highly available database for every microservice

Unstoppable System

How do you restart an unstoppable system?

[Figure: states Eventual Consistency, Consistency, Eventual Inconsistency; labels: Backup, Disaster Strikes, Recovery]

Eventual Consistency

Retries are enough to deal with temporary failures of read operations: eventually the missing data will be found

Eventual Inconsistency

Retries are useless against permanent failures of read operations that used to work just fine before disaster recovery
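
The difference between the two cases can be sketched with a bounded retry loop. `readWithRetries`, `laggingReplica` and `lostInRecovery` are hypothetical helpers; a read that returns `undefined` stands for "not found".

```javascript
// Retry a read up to maxAttempts times; undefined means "not found".
function readWithRetries(read, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const value = read();
    if (value !== undefined) return { value, attempts: attempt };
  }
  return { value: undefined, attempts: maxAttempts };
}

// Temporary failure: the data becomes visible after `lag` failed reads.
function laggingReplica(value, lag) {
  let reads = 0;
  return () => (++reads > lag ? value : undefined);
}

// Permanent failure: the data was dropped by recovery and never comes back.
const lostInRecovery = () => undefined;

const eventual = readWithRetries(laggingReplica("C/3", 2), 5);
const permanent = readWithRetries(lostInRecovery, 5);

console.log(eventual);  // { value: 'C/3', attempts: 3 }
console.log(permanent); // { value: undefined, attempts: 5 }
```

From the caller's side the first failed read looks identical in both cases, which is why a client written for eventual consistency keeps retrying a read that can no longer succeed.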

Distributed Transactions

[Figure: timelines (steps 1 to 6) of the Customer and Order microservices with events new C/1, C/1/name, new O/1, O/1 → C/1, new C/2, C/2/name, new O/2, O/2 → C/2]

Take snapshots only when all microservices are consistent

Avoid eventual consistency

Microservices

Distributed transactions are notoriously difficult to implement and as a consequence microservice architectures emphasize transactionless coordination between services, with explicit recognition that consistency may only be eventual consistency and problems are dealt with by compensating operations.

M. Fowler, J. Lewis https://www.martinfowler.com/articles/microservices.html

Splitting the Monolith

[Figure: Customer, Product, Shipment, Order]

Keep data together for microservices that cannot tolerate eventual inconsistency

Does it apply to you?

More than one stateful microservice

Polyglot persistence

Eventual Consistency

(Cross-microservice references)

Disaster recovery based on backup/restore

Independent backups

Eventual inconsistency (after disaster recovery)

Does it apply to you?

More than one stateful microservice

Polyglot persistence

Eventual Consistency

(Cross-microservice references)

Disaster recovery based on backup/restore

Synchronized backups (limited availability/autonomy)

Consistent Disaster Recovery

The BAC Theorem

[Figure: Consistency, Availability, Backup. CA: Not Backed Up. CB: Not Available for updates. AB: Not Consistent.]

No Backup

[Figure: timelines of the Customer and Order microservices. Customer contains new C/1, C/1/name, new C/2, C/2/name. Order contains new O/1, O/1 → C/1, new O/2, O/2 → C/2, new O/3, O/3 → C/3.]

No Backup

[Figure: timelines of the Customer and Order microservices after trimming. Customer contains new C/1, C/1/name, new C/2, C/2/name. Order contains new O/1, O/1 → C/1, new O/2, O/2 → C/2; new O/3 and O/3 → C/3 are discarded.]

Trim to the oldest backup

Lose even more data!
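
Trimming to the oldest backup can be sketched as follows, assuming each microservice keeps a list of timestamped events and records when its last backup was taken. The data layout, the timestamps and `trimToOldestBackup` are hypothetical.

```javascript
// Roll every microservice back to the time of the oldest backup in the system.
function trimToOldestBackup(services) {
  const cutoff = Math.min(...services.map((s) => s.backupTime));
  return services.map((s) => ({
    name: s.name,
    events: s.events.filter((e) => e.time <= cutoff).map((e) => e.op),
  }));
}

const services = [
  {
    name: "Customer",
    backupTime: 4, // oldest backup
    events: [
      { time: 1, op: "new C/1" },
      { time: 3, op: "new C/2" },
      { time: 5, op: "new C/3" },
    ],
  },
  {
    name: "Order",
    backupTime: 6,
    events: [
      { time: 2, op: "new O/1" },
      { time: 4, op: "new O/2" },
      { time: 6, op: "new O/3" },
    ],
  },
];

const trimmed = trimToOldestBackup(services);
console.log(trimmed[0].events); // [ 'new C/1', 'new C/2' ]
console.log(trimmed[1].events); // [ 'new O/1', 'new O/2' ]
```

The result is consistent again, but O/3 is thrown away even though the Order microservice never failed and its own backup contained it.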

The BAC Theorem

When Backing up a whole microservice architecture, it is not possible to have both Consistency and Availability

Corollaries

  1. Microservice architectures eventually become inconsistent after disaster strikes when recovering from independent backups
  2. Achieving consistent backups can be attempted by limiting the full availability of the system and synchronizing the backups

Dealing with the Consequences of BAC

  1. Eventual Consistency breeds Eventual Inconsistency
  2. Trade off: Cost of Recovery vs. Prevention
  3. Cluster microservices to be backed up together
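
One possible way to find candidate clusters is to group microservices that are connected by cross-microservice references, so that each group shares one synchronized backup. This is only a sketch of that idea: `backupClusters` is hypothetical, and the reference edges used below (Order to Customer, Shipment to Order) are assumed for illustration, not taken from the example architecture.

```javascript
// Group services into connected components of the reference graph.
// Each component is a candidate set to be backed up together.
function backupClusters(services, references) {
  const neighbours = new Map(services.map((s) => [s, new Set()]));
  for (const [from, to] of references) {
    neighbours.get(from).add(to);
    neighbours.get(to).add(from);
  }
  const seen = new Set();
  const clusters = [];
  for (const start of services) {
    if (seen.has(start)) continue;
    const cluster = [];
    const stack = [start];
    seen.add(start);
    while (stack.length > 0) {
      const current = stack.pop();
      cluster.push(current);
      for (const next of neighbours.get(current)) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    clusters.push(cluster.sort());
  }
  return clusters;
}

const clusters = backupClusters(
  ["Customer", "Product", "Shipment", "Order"],
  [["Order", "Customer"], ["Shipment", "Order"]] // assumed references
);
console.log(clusters); // [ [ 'Customer', 'Order', 'Shipment' ], [ 'Product' ] ]
```

With these assumed references, Product can keep its own independent backup schedule, while the other three trade some availability for a consistent joint backup.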

Made with

http://asq.inf.usi.ch

Acknowledgements

Guy Pardon, Olaf Zimmermann, Florian Haupt, Silvia Schreier, Ana Ivanchikj, Mathias Weske, Adriatik Nikaj, Sankalita Mandal, Hagen Overdick, Jesus Bellido, Rosa Alarcón, Alessio Gambi, Daniele Bonetta, Achille Peternier, Erik Wilde, Mike Amundsen, Stefan Tilkov, James Lewis

[Figure: propagating a state change s→s' with multiple receivers: Representational State Transfer (PUT, GET, GET, GET) and Message Bus, Multicast (send, receive, receive, receive)]
[Figure: propagating a state change s→s': Message Bus (send, receive), Representational State Transfer (PUT, GET), Remote Procedure Call, Poll (call(), s'), Remote Procedure Callback, Push (callback(s'))]
