
Microsoft Directly Challenges MongoDB and Cassandra with Cosmos DB



The phrase "embrace, extend, and extinguish" is often thrown about whenever someone is upset with Microsoft. Superficially it describes any attempt by a technology company to attract users of a competitor's product, but the actual strategy is more complicated than that. In this report we will use Azure Cosmos DB to illustrate the concept.


The first step is to embrace the competitor's standards. In the 1980s and 1990s this meant having the ability to read and write their file formats. For example, MS Word needed to be able to open, modify, and save WordPerfect documents flawlessly. Otherwise users of the then-dominant WordPerfect would not even consider trying Word.

In the world of NoSQL databases, the standard to be embraced is the API. Unlike relational databases, which at least nominally support the ANSI SQL standard, each NoSQL database has its own set of APIs and matching drivers. So theoretically one is locked into a specific product and cannot switch to any other without a costly rewrite.

Microsoft's Cosmos DB addresses this vendor lock-in by embracing the APIs and drivers that already exist for the more popular databases. And we mean "embrace" in a very literal fashion.

When you provision a Cosmos DB instance, you must select an API type. Options include:

  • SQL (actually the old Azure DocumentDB)
  • Gremlin, the graph traversal API from Apache TinkerPop
  • MongoDB
  • Azure Table
  • Cassandra

If you choose MongoDB as your API, you can then use the existing MongoDB drivers. Not a driver that looks somewhat like the one for MongoDB. Rather, Microsoft's documentation points you directly to the official MongoDB drivers for Node.js, .NET, Java, etc. Likewise, for Gremlin and Cassandra you are expected to use their respective drivers when communicating with Cosmos DB in Gremlin or Cassandra mode.

In theory this means that Azure Cosmos DB is a drop-in replacement for these other NoSQL databases.
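
As a rough illustration of what "drop-in" would mean in practice, the sketch below builds two standard `mongodb://` connection strings, one for a self-hosted server and one for a Cosmos DB account in MongoDB mode, and hands either one to the same driver code. The account name, key, host names, port, and query options are placeholders assumed for the example; the real connection string should be copied from the Azure portal. The `count_orders` helper, along with its `shop` database and `orders` collection, is hypothetical, and it needs the PyMongo driver installed and a reachable server.

```python
from urllib.parse import quote_plus


def mongo_uri(user, password, host, port=27017, **options):
    """Build a standard mongodb:// URI, percent-encoding the credentials."""
    query = "&".join(f"{key}={value}" for key, value in options.items())
    uri = f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/"
    return uri + (f"?{query}" if query else "")


# Hypothetical self-hosted MongoDB server.
SELF_HOSTED = mongo_uri("app", "s3cret", "mongo.example.internal")

# Hypothetical Cosmos DB account in MongoDB mode. Host, port, and options
# are assumptions for illustration; use the values shown in the Azure portal.
COSMOS = mongo_uri(
    "my-account",
    "bXkta2V5Lw==",  # placeholder key; real keys often contain '=', '/', '+'
    "my-account.mongo.cosmos.azure.com",
    port=10255,
    ssl="true",
    replicaSet="globaldb",
)


def count_orders(uri):
    """Hypothetical application code: identical for both back ends."""
    # Imported here so the URI helpers above work without the driver installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client["shop"]["orders"].count_documents({})
```

If the compatibility claim holds for the operations an application actually uses, switching back ends amounts to passing `COSMOS` instead of `SELF_HOSTED` to `count_orders`. Whether every query, index type, and aggregation stage behaves identically is exactly what would need to be tested.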


Given that all of the third-party databases listed above are free/open source, Microsoft has to offer something more than just hosting. Otherwise customers will switch back as soon as someone else offers a compatible cloud solution with better performance and/or lower prices.

This is where Microsoft's other Azure products come into play. Cosmos DB can be integrated with open source products such as Apache Spark or Apache Kafka as well as proprietary products such as Azure Search, Azure Data Factory, and HDInsight. Rather than extending the file format, Microsoft is attempting to extend what you can do with the database.

While switching from MongoDB's cloud hosting to Cosmos DB is mostly a QA and operations question, the use of other Azure products can put significant limitations on your future architectural options. The convenience and capabilities offered today need to be carefully weighed against long-term plans.


It is hard to predict where the NoSQL sector will go in the long run. One possibility is that a standard query language, much like ANSI SQL in the 1980s, will be developed and shared across all major NoSQL databases. Another is that ANSI SQL itself will continue to evolve until it is capable of serving that role.

Or perhaps the existing APIs, such as those found in MongoDB, will become de facto standards, informally agreed upon by major vendors but never formally approved by a standards body.

In the meantime, it is unlikely that any one NoSQL database will stay in a dominant position so long as competitors can easily copy its APIs. Even if Cosmos DB manages to unseat MongoDB or Cassandra, another database or cloud vendor such as Amazon or Google can do the same to Microsoft.
