Managing Data in Microservices

Randy Shoup, VP Engineering at Stitch Fix, spoke on Monday at the QCon New York 2017 conference about managing data and isolating persistence in microservices-based applications. He talked about how his team applies machine learning techniques to every part of the business, such as buying, inventory management, and styling recommendations.

Personalized recommendations start by running the inventory through machine learning to create algorithmic recommendations. Stylists located all around the country then curate these algorithmic recommendations to come up with the personalized styling recommendations.

Microservices architectures are evolutionary. Organizations like eBay, Twitter, and Amazon have all gone through several architecture iterations in their transition from monolithic applications to microservices.

In addition to having a single purpose and a well-defined interface, and being modular and independent, microservices should also be responsible for their own isolated persistence. Shoup discussed two approaches: operating your own data store or using a persistence service. In the first approach, you store data in your own database instances, owned and operated by the service team. With the persistence service option, you store data in a separate schema in a database that is operated as a service by another team or by a third-party provider. The data should be isolated from all other consumers of the service.

Events are a first-class construct in a microservices architecture. They represent how the real world works and have applications in domains such as finance. Events are a critical part of a service's interface, which should include all the events the service produces and all the events it consumes.
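As a rough illustration of treating events as part of the interface, the sketch below declares the events a service produces and consumes next to its code. The `EventBus`, `ShippingService`, and the event names are hypothetical and not taken from the talk; the in-process bus merely stands in for a real message broker.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """An immutable record of something that happened."""
    name: str
    payload: dict


class EventBus:
    """Toy in-process bus standing in for a real message broker (assumption)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, name: str, handler: Callable[[Event], None]) -> None:
        self._handlers[name].append(handler)

    def publish(self, event: Event) -> None:
        # Deliver the event to every handler registered for its name.
        for handler in list(self._handlers[event.name]):
            handler(event)


class ShippingService:
    """Hypothetical service whose interface lists its events explicitly."""

    PRODUCES = ("ShipmentCreated",)
    CONSUMES = ("OrderPlaced",)

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        for name in self.CONSUMES:
            bus.subscribe(name, self.handle)

    def handle(self, event: Event) -> None:
        # React to a consumed event by producing one of our own.
        self._bus.publish(
            Event("ShipmentCreated", {"order_id": event.payload["order_id"]})
        )


bus = EventBus()
seen: List[Event] = []
bus.subscribe("ShipmentCreated", seen.append)
ShippingService(bus)
bus.publish(Event("OrderPlaced", {"order_id": 42}))
```

Listing `PRODUCES` and `CONSUMES` on the class is only one possible convention; the point is that both directions are documented alongside the synchronous API.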

Extracting microservices from a monolithic shared database involves the following steps:

  • Create a service: The service boundaries should include the service itself and the database it fronts.
  • Applications use the service: Decouple the apps from the shared database by using the newly created service.
  • Move data to private database: Then move the data from the shared database to a new private database. This has no impact on the client apps because they no longer depend directly on the database.
  • Repeat: Follow the same process for other business functions in the application that need to be their own microservices.
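The steps above can be sketched in a few lines. This is a simplified illustration, not the procedure from the talk: `InMemoryStore` and `CustomerService` are hypothetical names, and a real migration would also have to handle writes that arrive while the data is being copied.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryStore:
    """Stand-in for a real database; one instance per 'database' (assumption)."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}

    def get(self, customer_id: int) -> Optional[dict]:
        return self._rows.get(customer_id)

    def put(self, customer_id: int, record: dict) -> None:
        self._rows[customer_id] = dict(record)

    def items(self) -> List[Tuple[int, dict]]:
        return list(self._rows.items())


class CustomerService:
    """Steps 1 and 2: the only path applications use to reach customer data."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def get_customer(self, customer_id: int) -> Optional[dict]:
        return self._store.get(customer_id)

    def migrate_to(self, new_store: InMemoryStore) -> None:
        """Step 3: copy the data, then switch to the private store."""
        for customer_id, record in self._store.items():
            new_store.put(customer_id, record)
        self._store = new_store


# The service initially fronts the shared database.
shared_db = InMemoryStore()
shared_db.put(1, {"name": "Ada"})
service = CustomerService(shared_db)
before = service.get_customer(1)

# Moving the data is invisible to callers of get_customer().
private_db = InMemoryStore()
service.migrate_to(private_db)
after = service.get_customer(1)
```

Because clients only ever call `get_customer`, swapping the backing store does not change anything they can observe.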

Shoup also discussed the microservices techniques for use cases involving shared data, joins, and transactions.

Shared Data: Every piece of data should be owned by exactly one service, which acts as its single System of Record (SoR). Every other copy of the data is a read-only, non-authoritative cache. To access shared data, you can use one of three options: a synchronous lookup (one service calls the other service for data), an asynchronous event with a local cache, or a shared metadata library.
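A minimal sketch of the first two options follows, assuming a hypothetical `CustomerAddressService` as the System of Record and two hypothetical fulfillment clients. Plain callbacks stand in for both the remote call and the event stream.

```python
from typing import Callable, Dict, List


class CustomerAddressService:
    """System of Record for customer addresses (hypothetical example)."""

    def __init__(self) -> None:
        self._addresses: Dict[int, str] = {}
        self._listeners: List[Callable[[int, str], None]] = []

    def on_address_changed(self, listener: Callable[[int, str], None]) -> None:
        self._listeners.append(listener)

    def set_address(self, customer_id: int, address: str) -> None:
        # Only the owning service writes; everyone else is notified.
        self._addresses[customer_id] = address
        for listener in self._listeners:
            listener(customer_id, address)

    def get_address(self, customer_id: int) -> str:
        return self._addresses[customer_id]


class FulfillmentSync:
    """Option 1: synchronous lookup, asking the owner on every request."""

    def __init__(self, customers: CustomerAddressService) -> None:
        self._customers = customers

    def shipping_address(self, customer_id: int) -> str:
        return self._customers.get_address(customer_id)


class FulfillmentCached:
    """Option 2: a read-only local cache kept current by change events."""

    def __init__(self, customers: CustomerAddressService) -> None:
        self._cache: Dict[int, str] = {}
        customers.on_address_changed(self._apply)

    def _apply(self, customer_id: int, address: str) -> None:
        self._cache[customer_id] = address

    def shipping_address(self, customer_id: int) -> str:
        return self._cache[customer_id]


customers = CustomerAddressService()
sync_client = FulfillmentSync(customers)
cached_client = FulfillmentCached(customers)
customers.set_address(7, "1 Main St")
customers.set_address(7, "2 Oak Ave")
```

The synchronous client always sees the owner's current value but depends on the owner being available; the cached client keeps working on its own copy, which in a real system may briefly lag behind the owner.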

Joins: Splitting the data into separate services makes joins very hard. In the microservices world, you can perform joins either in the client application or by creating "Materialized Views": a service listens to events from two other services and maintains a denormalized join of the two data sets in its local storage.
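The materialized-view option might look roughly like the sketch below. The event shapes and the `OrderHistoryView` class are assumptions made for illustration; dictionaries stand in for the view's local storage.

```python
from typing import Dict, List, Tuple


class OrderHistoryView:
    """Denormalized join of customers and orders, kept in local storage."""

    def __init__(self) -> None:
        self._customer_names: Dict[int, str] = {}
        # order_id -> (customer_id, item)
        self._orders: Dict[int, Tuple[int, str]] = {}

    def on_customer_event(self, event: dict) -> None:
        # Fed by events from a (hypothetical) customer service.
        self._customer_names[event["customer_id"]] = event["name"]

    def on_order_event(self, event: dict) -> None:
        # Fed by events from a (hypothetical) order service.
        self._orders[event["order_id"]] = (event["customer_id"], event["item"])

    def rows(self) -> List[Tuple[int, str, str]]:
        """Return (order_id, customer_name, item) tuples, joined locally."""
        return [
            (order_id, self._customer_names.get(customer_id, "<unknown>"), item)
            for order_id, (customer_id, item) in sorted(self._orders.items())
        ]


view = OrderHistoryView()
# Events may arrive in any order; the join is resolved when rows() is read.
view.on_order_event({"order_id": 100, "customer_id": 1, "item": "scarf"})
view.on_customer_event({"customer_id": 1, "name": "Ada"})
view.on_order_event({"order_id": 101, "customer_id": 1, "item": "jacket"})
joined = view.rows()
```

Queries against the view never call the two source services, at the cost of the view being only as fresh as the last event it processed.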

Transactions: Transaction management is easy in a monolithic database but very hard in a microservices architecture because the data is split across different services. Implement transactions as workflows, with a rollback mechanism that applies compensating operations in reverse order. Many real-world systems, such as payment processing and expense approval, already work this way. These workflows are also ideal candidates for Functions as a Service (serverless architecture).
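A workflow with compensating operations can be sketched as follows. The step names and the `run_workflow` helper are hypothetical; the sketch assumes each step supplies its own compensation and that compensations themselves do not fail.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensating action).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_workflow(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Apply compensating operations newest-first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"undo:{done_name}")
            return False
        completed.append(step)
        log.append(f"done:{name}")
    return True


def noop() -> None:
    pass


def fail() -> None:
    raise RuntimeError("shipping unavailable")


log: List[str] = []
ok = run_workflow(
    [
        ("reserve_stock", noop, noop),
        ("charge_card", noop, noop),
        ("ship", fail, noop),
    ],
    log,
)
```

Here the third step fails, so the card charge is compensated first and the stock reservation second, mirroring the reverse order described above.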

 


Community comments

  • Shared Data?

    by Jose Asilis /

    I have a question:
    By Shared Data, does he mean that there will be a service that will call other services, or will it contain duplicated data in that service? Or maybe it's a service where all the shared data will reside? (If it's the latter, how would one model it without violating a bounded context?)

  • Re: Shared Data?

    by Randy Shoup /

    See the slides here: qconnewyork.com/system/files/presentation-slide.... What I was trying to do was to give some tools in your toolbox for solving the problem of sharing data. You mention 2 of the techniques already -- Synchronous Lookup; and Async event + Local cache. Each has its tradeoffs. I'd definitely not recommend a single service for all shared data, though; that's a monolith by another name.
