Cosmos DB - A Globally Distributed Database

Today, on Day 2 of the PASS Summit, Microsoft Group Product Manager for Cosmos DB Rimma Nehme gave the morning keynote on Azure Cosmos DB. In an informative, fast-paced talk, Nehme covered Microsoft’s approach to designing and building Cosmos DB.

Before introducing Cosmos DB, Nehme discussed the market trends that influenced the design team’s thinking. According to Nehme, 90% of the world’s data was created in the last 2 years. Over the 10-year period that began in 2010, data worldwide is expected to grow 50 to 100 times. Given the need to accommodate this growth and the idea that computation should sit close to the data, traditional “Earthly” databases are not suited to this shift in technology.

Project Florence, started by Dharma Shukla, is Microsoft’s answer to these trends. By way of background, Florence was chosen as the project name because Shukla was on vacation in that city when he committed the first code. On a broader scale, the name also fits because Florence was the epicenter of the European Renaissance, a parallel to the predicted explosion in data needs in today’s computing world.

Florence has since become Azure Cosmos DB. At the beginning of the project, its goals were dictated by the needs of Microsoft’s internal clients (Bing, Xbox Live, etc.). In 2010, the requirements looked like this:

  1. Turnkey global distribution
  2. Guaranteed low latency at the 99th percentile, worldwide
  3. Guaranteed high availability within region and globally
  4. Guaranteed consistency
  5. Elastically scale throughput and storage, anytime on-demand, globally
  6. Comprehensive SLAs (availability, latency, throughput, consistency)
  7. Operate at low cost
  8. Iterate and query without worrying about schemas and index management
  9. Provide a variety of data model and API choices

Put simply, the Cosmos DB team was tasked with determining how to build a globally distributed database for the cloud while serving the needs of internal Microsoft customers. The success of Cosmos DB has led to it being considered a Ring 0 service within Azure, which means that when a new Azure geographic region is established, it is among the first services offered in that region. Interestingly, from a developer standpoint, Cosmos DB is primarily written in C++.

The five consistency models offered in Azure Cosmos DB are Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual. Of those, Session is the most popular, with Bounded Staleness a distant second.
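To make the ordering of these models concrete, here is a minimal Python sketch that ranks the five levels from weakest to strongest guarantee. It is an illustration only: the `ConsistencyLevel` enum and the helpers `is_at_least` and `levels_satisfying` are hypothetical names invented for this example and are not part of any Cosmos DB SDK.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """Hypothetical enum for illustration; a higher value means a stronger guarantee."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def is_at_least(level: ConsistencyLevel, required: ConsistencyLevel) -> bool:
    """Return True if `level` is at least as strong as `required`."""
    return level >= required


def levels_satisfying(required: ConsistencyLevel) -> list[ConsistencyLevel]:
    """List every level that meets `required`, strongest first."""
    return sorted(
        (level for level in ConsistencyLevel if level >= required),
        reverse=True,
    )


if __name__ == "__main__":
    # Session, the most popular level per the keynote, is stronger than Consistent Prefix.
    print(is_at_least(ConsistencyLevel.SESSION, ConsistencyLevel.CONSISTENT_PREFIX))
    print([level.name for level in levels_satisfying(ConsistencyLevel.SESSION)])
```

The sketch captures only the relative strength of the levels. It does not model the latency, availability, and throughput trade-offs that come with choosing a stronger level.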

Nehme’s keynote was fast-paced and packed with information.  Those looking for more details on the technology behind Cosmos DB can begin with an article written this spring by Dharma Shukla.
