Andrew Pavlo’s annual retrospective on the database world has recently been released, covering trends and innovations from the past year. The opinionated report, "Databases in 2024: A Year in Review," highlights that while we may indeed be in the "golden era of databases," last year brought significant license changes, the rapid growth of DuckDB, and some surprising new releases.
Discussing the "turbulent" year for open source databases, including major license changes for Redis and Elasticsearch, Pavlo writes:
Notice that Redis and Elasticsearch are receiving more backlash compared to other systems that made similar moves. (...) It cannot be because the Redis and Elasticsearch install base is so much larger than these other systems (...) since the number of MongoDB and Kafka installations was equally as large when they switched their licenses. In the case of Redis, I can only think that people perceive Redis Ltd. as unfairly profiting off others' work since the company's founders were not the system's original creators.
Similar to how Postgres has emerged as the default choice for operational databases in recent years, DuckDB has, according to the 2024 article, "entered the zeitgeist as the default choice for running analytical queries on data." This trend helps explain the recent release of four different extensions integrating the OLAP database with Postgres, as Pavlo explains:
Most OLAP queries do not access that much data. Fivetran analyzed traces from Snowflake and Redshift and showed that the median amount of data scanned by queries is only 100 MB. Such a small amount of data means a single DuckDB instance is enough for most to handle most queries.
An associate professor at Carnegie Mellon University and co-founder of OtterTune, Pavlo began publishing his annual analysis in 2021, when he focused on the dominance of PostgreSQL. In 2022, the main topic was "Blockchain databases are still a stupid idea," while in 2023 he turned to the rise of vector databases.
The 2024 retrospective by the database management systems expert garnered significant attention, sparking popular threads on Reddit and Hacker News. An opinionated statement ("I don't care for Redis. It is slow, it has fake transactions, and its query syntax is a freakshow.") ignited a heated exchange on Hacker News with Redis creator Salvatore Sanfilippo, better known as "antirez," about the importance and design of the once open-source key-value store.
Focusing on major releases, the retrospective highlights Microsoft Garnet and Valkey as notable key-value stores. Pavlo, however, expresses disappointment with the feature list in MySQL v9 and views the retirement of Amazon QLDB as significant:
If Amazon can't figure out how to make money on a blockchain database, then nobody can.
Discussing the recent Aurora DSQL preview, he adds:
This announcement shows you how much brand recognition the name "Aurora" carries in the database world because AWS used it for this new DBMS that seemingly shares no code with their flagship Aurora Postgres RDS offering.
Following Pavlo's lead, the team at Bytebase recently released "Database Tools in 2024: A Year in Review." Many of Pavlo's university lectures, along with his recent talk "What Goes Around Comes Around... And Around...," are available on YouTube.