Managing Data in Microservices

by Srini Penchikala on Jun 28, 2017

Randy Shoup, VP Engineering at Stitch Fix, spoke on Monday at the QCon New York 2017 conference about managing data and isolated persistence in microservices-based applications. He described how Stitch Fix applies machine learning techniques to every part of the business, including buying, inventory management, and styling recommendations.

Personalized recommendations are generated by running the inventory through machine learning to create algorithmic recommendations. Stylists located all around the country then curate these algorithmic recommendations to produce the personalized styling recommendations.

Microservices architectures are evolutionary. Organizations like eBay, Twitter, and Amazon have all gone through a few architecture iterations in their transition from monolithic applications to microservices.

In addition to having a single purpose and a well-defined interface, and being modular and independent, microservices should also be responsible for isolated persistence. Shoup discussed two approaches: operating your own data store or using a persistence service. In the first approach, you store data in your own database instances, owned and operated by the service team. With the persistence service option, you store data in a separate schema in a database that is operated as a service by another team or by a third-party provider. In either case, the data should be isolated from all other consumers of the service.

Events are a first-class construct in a microservices architecture. They represent how the real world works and have applications in domains such as finance. Events are a critical part of a service's interface, which should include all the events the service produces and all the events it consumes.

Extracting microservices from a monolithic shared database involves the following steps:

  • Create a service: The service boundaries should include the service itself and the database it fronts.
  • Applications use the service: Decouple the apps from the shared database by using the newly created service.
  • Move data to private database: Then move the data from the shared database to a new private database. This has no impact on the client apps because they no longer depend directly on the database.
  • Repeat: Follow the same process for other business functions in the application that need to be their own microservices.
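The extraction steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not Shoup's implementation: `CustomerService`, `welcome_message`, and the `customers` table are hypothetical names, and in-memory SQLite databases stand in for the shared and private databases.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Hypothetical service that fronts customer data for all client apps."""

    def __init__(self, db: sqlite3.Connection) -> None:
        # Step 1: the new service initially fronts the shared database.
        self._db = db

    def get_email(self, customer_id: int) -> Optional[str]:
        row = self._db.execute(
            "SELECT email FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None

    def migrate_to(self, private_db: sqlite3.Connection) -> None:
        # Step 3: copy the rows into a private database, then switch over.
        private_db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)"
        )
        rows = self._db.execute("SELECT id, email FROM customers").fetchall()
        private_db.executemany("INSERT INTO customers VALUES (?, ?)", rows)
        private_db.commit()
        self._db = private_db


def welcome_message(service: CustomerService, customer_id: int) -> str:
    # Step 2: the client app depends on the service, not on the database.
    return f"Welcome, {service.get_email(customer_id)}"


# Stand-in for the monolith's shared database.
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
shared_db.execute("INSERT INTO customers VALUES (1, 'ada@example.com')")
shared_db.commit()

service = CustomerService(shared_db)
before = welcome_message(service, 1)
service.migrate_to(sqlite3.connect(":memory:"))
after = welcome_message(service, 1)  # same result, different database
```

The client function is identical before and after the migration, which is the point of introducing the service first: the data move becomes an internal detail of the service.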

Shoup also discussed the microservices techniques for use cases involving shared data, joins, and transactions.

Shared Data: Every piece of shared data should be owned by exactly one service, which is its single System of Record (SoR). Every other copy of the data is a read-only, non-authoritative cache. To access shared data, you can use one of three options: a synchronous lookup (one service calls the other service for data), an asynchronous event with a local cache, or a shared metadata library.
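The first two access options can be contrasted in a minimal sketch. All names here (`CustomerService`, `OrderService`, `set_email`) are hypothetical, and in-process callbacks stand in for whatever message bus a real system would use to deliver events.

```python
from typing import Callable, Dict, List, Optional


class CustomerService:
    """System of record: the only place customer emails are written."""

    def __init__(self) -> None:
        self._emails: Dict[int, str] = {}
        self._subscribers: List[Callable[[int, str], None]] = []

    def subscribe(self, handler: Callable[[int, str], None]) -> None:
        self._subscribers.append(handler)

    def set_email(self, customer_id: int, email: str) -> None:
        self._emails[customer_id] = email
        # Publish a change event to every interested service.
        for handler in self._subscribers:
            handler(customer_id, email)

    def get_email(self, customer_id: int) -> Optional[str]:
        return self._emails.get(customer_id)


class OrderService:
    """Consumer that needs customer emails but does not own them."""

    def __init__(self, customers: CustomerService) -> None:
        self._customers = customers
        self._email_cache: Dict[int, str] = {}  # read-only, non-authoritative
        customers.subscribe(self._on_email_changed)

    def _on_email_changed(self, customer_id: int, email: str) -> None:
        self._email_cache[customer_id] = email

    def email_via_lookup(self, customer_id: int) -> Optional[str]:
        # Option 1: synchronous lookup against the system of record.
        return self._customers.get_email(customer_id)

    def email_via_cache(self, customer_id: int) -> Optional[str]:
        # Option 2: answer from the cache kept fresh by events.
        return self._email_cache.get(customer_id)
```

The lookup always reflects the owner's current data but couples the caller to the owner's availability. The cache keeps working when the owner is unreachable, at the cost of possibly serving stale data.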

Joins: Splitting the data into separate services makes joins very hard. You can still join data in the microservices world either by performing the join in the client application or by creating a "materialized view": a service listens to events from two other services and maintains a denormalized join of the two data sets in its own local storage.
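A materialized view of this kind might look like the following sketch. The event names and fields (`on_customer_updated`, `on_order_placed`, `customer_name`, `total`) are assumptions made for illustration, and plain dictionaries stand in for the view's local storage.

```python
from typing import Dict, List, Optional


class OrderHistoryView:
    """Hypothetical read model that joins customer and order events."""

    def __init__(self) -> None:
        self._names: Dict[int, str] = {}    # customer_id -> latest name
        self._orders: Dict[int, dict] = {}  # order_id -> denormalized row

    def on_customer_updated(self, customer_id: int, name: str) -> None:
        # Event from the (hypothetical) customer service.
        self._names[customer_id] = name
        # Refresh rows that were already joined with older customer data.
        for row in self._orders.values():
            if row["customer_id"] == customer_id:
                row["customer_name"] = name

    def on_order_placed(self, order_id: int, customer_id: int, total: float) -> None:
        # Event from the (hypothetical) order service.
        name: Optional[str] = self._names.get(customer_id)
        self._orders[order_id] = {
            "order_id": order_id,
            "customer_id": customer_id,
            "customer_name": name,  # None until a customer event arrives
            "total": total,
        }

    def orders_for(self, customer_id: int) -> List[dict]:
        # Served entirely from local storage: no cross-service call.
        return [
            dict(row)
            for row in self._orders.values()
            if row["customer_id"] == customer_id
        ]
```

Because events from the two services can arrive in either order, the view fills in the customer name later if the order event comes first.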

Transactions: Transaction management is easy in a monolithic database but very hard in a microservices architecture because the data is split across different services. Instead, implement transactions as workflows, with rollback achieved by applying compensating operations in reverse order. Many real-world systems, such as payment processing and expense approval, already work this way. These workflows are also ideal candidates for Functions as a Service (serverless architecture).
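As a rough sketch of such a workflow, the function below runs steps in order and, when one fails, applies the compensating operations of the completed steps in reverse. The step names and the `run_workflow` helper are hypothetical, and a real workflow would also need durable state and retries, which are omitted here.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (name, action, compensating operation).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_workflow(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed {name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"undid {done_name}")
            return False
        log.append(f"did {name}")
        completed.append(step)
    return True


# Hypothetical order workflow in which the last step fails.
state: Dict[str, bool] = {"reserved": False, "charged": False}


def reserve() -> None:
    state["reserved"] = True


def release() -> None:
    state["reserved"] = False


def charge() -> None:
    state["charged"] = True


def refund() -> None:
    state["charged"] = False


def ship() -> None:
    raise RuntimeError("carrier unavailable")


def cancel_shipment() -> None:
    pass


log: List[str] = []
ok = run_workflow(
    [
        ("reserve", reserve, release),
        ("charge", charge, refund),
        ("ship", ship, cancel_shipment),
    ],
    log,
)
```

The failed step itself is not compensated, only the steps that completed before it, so each action should leave nothing behind when it raises.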


Shared Data? by Jose Asilis

I have a question:
By Shared Data, does he mean that there will be a service that calls other services, or will that service contain duplicated data? Or maybe it's a service where all the shared data will reside? (If it's the latter, how would one model it without violating a bounded context?)

Re: Shared Data? by Randy Shoup

See the slides here: What I was trying to do was to give some tools in your toolbox for solving the problem of sharing data. You mention 2 of the techniques already -- Synchronous Lookup; and Async event + Local cache. Each has its tradeoffs. I'd definitely not recommend a single service for all shared data, though; that's a monolith by another name.
