Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ


Choose your language

InfoQ Homepage News Distributed PostgreSQL Benchmarks: Azure Cosmos DB, CockroachDB, and YugabyteDB

Distributed PostgreSQL Benchmarks: Azure Cosmos DB, CockroachDB, and YugabyteDB


Microsoft recently discussed the results of distributed PostgreSQL benchmarks, comparing transaction processing and price performance for Azure Cosmos DB for PostgreSQL, CockroachDB, and Yugabyte. With different implementation trade-offs, the results show a higher throughput for Azure Cosmos DB but highlight the challenges of benchmarking distributed databases.

According to the GigaOm benchmark, Azure Cosmos DB for PostgreSQL with Citus distributed tables provides better transaction performance and price performance than CockroachDB Dedicated and Yugabyte Managed, two alternative fully managed distributed databases.

As previously reported on InfoQ, PostgreSQL is becoming the new standard for cloud-distributed databases, with different vendors extending, reimplementing, or forking the popular open-source relational database. The Microsoft-sponsored benchmark has been performed using HammerDB, the open-source tool for benchmarking relational databases managed by the Transaction Performance Council (TPC). Marco Slot, principal software engineer at Microsoft, writes:

GigaOM used HammerDB TPROC-C to benchmark Azure Cosmos DB for PostgreSQL and two comparable managed service offerings (...) GigaOM initially used 1,000 warehouses for their benchmarks, which results in ~100GB of data. However, both CockroachDB and Yugabyte gave surprisingly low throughput. They could get better performance by increasing the number of warehouses for both, without changing the number of connections.


The critical and contentious difference is the usage of Citus, the open-source extension to distribute tables in PostgreSQL that requires developers to specify a distribution column, the shard key:

A core tenet of Citus has always been that Distributed PostgreSQL is about performance at scale because PostgreSQL is good enough for everything else.

The other distributed databases tested do not rely on the definition of a distribution column. On Reddit, Slot acknowledges the differences:

The performance difference is almost a little awkward. I wanted to highlight that using Citus does require some additional steps (e.g. create_distributed_table) to define distribution columns and co-location (otherwise, you're just using a single node). Our experience is that without co-locating related data your typical transactional PostgreSQL workload will perform much worse than a single server.

Tweeting about the benchmark, Franck Pachot, developer advocate at YugabyteDB, questions:

Is this comparing Citus (sharding over SQL databases with two-phase commit protocol) with YugabyteDB & CockroachDB (SQL over distributed storage with global ACID transactions)? Not the same level of resilience, global consistency, elasticity, auto-splitting/rebalancing. They have different use cases.

The report acknowledges that different distributed databases might excel for different characteristics, including response time, concurrency, fault-tolerance, functionality, consistency, or durability, targeting different deployments. Slot concludes:

Distributed systems, and distributed databases especially, are all about trade-offs at every level. CockroachDB and Yugabyte make different trade-offs and do not require a distribution column (...) The decision to extend Postgres (as Citus did), fork Postgres (as Yugabyte did), or reimplement Postgres (as CockroachDB did) is also a trade-off with major implications on the end user experience, some good, some bad.

To encourage customers to run benchmarks that match their workloads, Microsoft shared helper scripts to run HammerDB benchmarks on Azure Cosmos DB. Jelte Fennema, senior software engineer at Microsoft, shows how to run a benchmark automatically, including cluster setup and tear down.

According to GigaOm, Google Spanner Postgres Interface was not included in the comparison as the service does not provide the level of Postgres compatibility required to run the benchmark.

About the Author

Rate this Article


Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Community comments

  • Awkward performance difference

    by Frits Hoogland,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Please note: I do work for YugabyteDB as a developer advocate.

    I would like to emphasise that the comparison is not equal nor fair. This is mentioned by my colleague Franck Pachot, but let me highlight one of the critical differences: the Microsoft Cosmos database has high availability turned off, which is according to the available documentation means one copy of the data is maintained, which means a single change results in a write at a single location, whilst with YugabyteDB the replication factor ("high availability") is set at three for our managed offering, meaning a single change results in writes at three locations. I can't talk for Cockroach, but this might be true for them too.

    The reason for explicitly pointing this out is that this sheds a totally different light on the "awkward" performance difference.

  • Re: Awkward performance difference

    by Renato Losio,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Thank you Frits for highlighting the difference.

    Very good point, I agree that selecting the "minimal fault tolerance option" was a questionable choice for the comparison even if I have doubts that alone this difference would cover the gap in throughput.

    And it is one more difference that makes a comparison between the 3 different options hard to judge, as I pointed out in the piece and both Slot and Pachot highlighted too.

    GigaOm had to acknowledge what you wrote: "Also, the services offered by these platforms have several high availability (HA) options that are not directly comparable".

    About CockroachDB, I think you are correct. According to the documentation that applies to CockroachDB Dedicated for Single Region: "Updates (writes) must reach a majority of replicas (2 out of 3 by default) before they are considered committed."

    This again highlights the challenges of benchmarking different distributed databases.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p