Managing Data in Microservices

Randy Shoup, VP Engineering at Stitch Fix, spoke on Monday at the QCon New York 2017 conference about managing data and isolated persistence in microservices-based applications. He also described how the Stitch Fix team applies machine learning to every part of the business, including buying, inventory management, and styling recommendations.

To produce personalized recommendations, the inventory is first run through machine learning to generate algorithmic recommendations. Stylists located around the country then curate these algorithmic recommendations to arrive at the personalized styling recommendations.

Microservices architectures are evolutionary. Organizations like eBay, Twitter, and Amazon have all gone through several architecture iterations on the way from monolithic applications to microservices.

In addition to having a single purpose and a well-defined interface, and being modular and independent, microservices should also be responsible for isolated persistence. Shoup discussed two approaches to microservices persistence: operating your own data store or using a persistence service. In the first approach, you store data in your own database instances, owned and operated by the service team. With the persistence service option, you store data in a separate schema in a database that is operated as a service by another team or by a third-party provider. The data should be isolated from all other consumers of that service.

Events are a first-class construct in a microservices architecture. They represent how the real world works and have applications in domains such as finance. Events are a critical part of a service's interface, which should include all the events the service produces and all the events it consumes.

Extracting microservices from a monolithic shared database involves the following steps:

Create a service: The service boundaries should include the service itself and the database it fronts.
Applications use the service: Decouple the apps from the shared database by using the newly created service.
Move data to private database: Move the data from the shared database to a new private database. The client apps are not affected, because they no longer depend directly on the database.
Repeat: Follow the same process for other business functions in the application that need to be their own microservices.
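
The extraction steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the talk: `InventoryService`, its `migrate_to` method, and the `inventory` table are hypothetical names, and two in-memory SQLite databases stand in for the shared and private databases. A real cutover would also need to handle writes that arrive during the copy.

```python
import sqlite3


class InventoryService:
    """Step 1: the service fronts the data; callers never see the connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def quantity(self, sku: str) -> int:
        row = self._conn.execute(
            "SELECT quantity FROM inventory WHERE sku = ?", (sku,)
        ).fetchone()
        return row[0] if row else 0

    def migrate_to(self, private: sqlite3.Connection) -> None:
        """Step 3: copy the table into a private database, then switch over."""
        private.execute(
            "CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER)"
        )
        rows = self._conn.execute("SELECT sku, quantity FROM inventory").fetchall()
        private.executemany("INSERT INTO inventory VALUES (?, ?)", rows)
        private.commit()
        self._conn = private


# Hypothetical shared database that the monolith used to query directly.
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER)")
shared_db.execute("INSERT INTO inventory VALUES ('dress-1', 7)")
shared_db.commit()

service = InventoryService(shared_db)
before = service.quantity("dress-1")  # Step 2: the app calls the service instead

service.migrate_to(sqlite3.connect(":memory:"))
shared_db.close()  # the service no longer reads from the shared database
after = service.quantity("dress-1")  # same call, same answer for the client
```

Because the client only ever calls `quantity`, the move to the private database is invisible to it, which is the point of decoupling the apps before moving the data.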

Shoup also discussed microservices techniques for use cases involving shared data, joins, and transactions.

Shared Data: Make a single service the System of Record (SoR) for each piece of shared data, so that every piece of data has exactly one owner. Every other copy of the data is a read-only, non-authoritative cache. To access shared data, you can use one of three options: a synchronous lookup (one service calls the other service for the data), an asynchronous event with a cache, or a shared metadata library.
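
As a rough sketch of the asynchronous-event-with-cache option, the snippet below has the System of Record publish an event on every write while a consumer keeps a read-only copy. All names (`EventBus`, `CustomerService`, `ShippingService`, `CustomerUpdated`) are hypothetical, and the in-process bus is only a stand-in for a real message broker.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class CustomerUpdated:
    """Hypothetical event published by the System of Record."""
    customer_id: str
    email: str


class EventBus:
    """Tiny in-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[CustomerUpdated], None]] = []

    def subscribe(self, handler: Callable[[CustomerUpdated], None]) -> None:
        self._handlers.append(handler)

    def publish(self, event: CustomerUpdated) -> None:
        for handler in self._handlers:
            handler(event)


class CustomerService:
    """System of Record: the only place customer data is written."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._emails: Dict[str, str] = {}

    def update_email(self, customer_id: str, email: str) -> None:
        self._emails[customer_id] = email
        self._bus.publish(CustomerUpdated(customer_id, email))

    def get_email(self, customer_id: str) -> str:
        # This method is what the synchronous-lookup option would call.
        return self._emails[customer_id]


class ShippingService:
    """Consumer that keeps a read-only, non-authoritative cache."""

    def __init__(self, bus: EventBus) -> None:
        self._email_cache: Dict[str, str] = {}
        bus.subscribe(self._on_customer_updated)

    def _on_customer_updated(self, event: CustomerUpdated) -> None:
        self._email_cache[event.customer_id] = event.email

    def notification_address(self, customer_id: str) -> Optional[str]:
        return self._email_cache.get(customer_id)


bus = EventBus()
customers = CustomerService(bus)
shipping = ShippingService(bus)
customers.update_email("c1", "ada@example.com")
```

The shipping service never writes customer data; it only reacts to events, so the customer service remains the single owner.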

Joins: Splitting the data into separate services makes joins very hard. In the microservices world you can join data either in the client application or by creating a "Materialized View": a service listens to events from two other services and maintains a denormalized join of the two data sets in its local storage.
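
A materialized view of this kind might look roughly like the following. The class and event-handler names (`OrderCustomerView`, `on_order_placed`, `on_customer_renamed`) are assumptions made for illustration, and a plain dictionary stands in for the view's local storage.

```python
from typing import Dict, List, Optional


class OrderCustomerView:
    """Denormalized join of orders and customers, kept up to date from events."""

    def __init__(self) -> None:
        self._names: Dict[str, str] = {}   # customer_id -> latest known name
        self._rows: Dict[str, dict] = {}   # order_id -> joined row

    def on_customer_renamed(self, customer_id: str, name: str) -> None:
        # Event from the (hypothetical) customer service.
        self._names[customer_id] = name
        for row in self._rows.values():
            if row["customer_id"] == customer_id:
                row["customer_name"] = name

    def on_order_placed(self, order_id: str, customer_id: str, total: int) -> None:
        # Event from the (hypothetical) order service.
        name: Optional[str] = self._names.get(customer_id)
        self._rows[order_id] = {
            "order_id": order_id,
            "customer_id": customer_id,
            "customer_name": name,  # None until a customer event arrives
            "total": total,
        }

    def orders_for(self, customer_name: str) -> List[dict]:
        # Answered entirely from local storage, with no cross-service call.
        return [r for r in self._rows.values() if r["customer_name"] == customer_name]


view = OrderCustomerView()
view.on_order_placed("o1", "c1", 40)     # order event arrives first
view.on_customer_renamed("c1", "Ada")    # customer event fills in the name
view.on_order_placed("o2", "c1", 15)
```

The sketch tolerates events arriving in either order, which matters because the two source services publish independently.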

Transactions: Transaction management is easy in a monolithic database but very hard in a microservices architecture, because the data is split across different services. Instead, implement a transaction as a workflow of steps, and roll it back by applying compensating operations in reverse order. Many real-world systems, such as payment processing and expense approval, already work this way. These workflows are also ideal candidates for Functions as a Service (serverless architecture).
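
The workflow-with-compensation idea can be illustrated with a minimal sketch. The `run_workflow` helper and the step names below are hypothetical, and the steps are plain in-process functions rather than calls to real services.

```python
from typing import Callable, List, Tuple

# Each step is (name, action, compensating action).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_workflow(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"undone:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append(step)
    return True


def charge_card() -> None:
    # Simulated failure in the last step of the workflow.
    raise RuntimeError("card declined")


log: List[str] = []
ok = run_workflow(
    [
        ("reserve_inventory", lambda: None, lambda: None),
        ("create_shipment", lambda: None, lambda: None),
        ("charge_card", charge_card, lambda: None),
    ],
    log,
)
```

Here the failed charge triggers the compensations for the shipment and then the inventory reservation, the reverse of the order in which they ran. A production workflow would also need to persist its progress and cope with compensations that fail, which this sketch leaves out.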
