Jabez Eliezer Manuel, Senior Principal Engineer at Booking.com, presented Behind Booking.com's AI Evolution: The Unpolished Story at QCon London 2026, where he discussed how Booking.com has evolved over the past 20 years and the challenges they faced on their journey to incorporate AI.
Paying tribute to the 20th edition of QCon London, Manuel kicked off his presentation with a look at technology from 2005. In particular: the Motorola Razr V3 was a popular cell phone; Web 2.0 had started to emerge; and Booking.com was nine years old.
In February 2005, Booking.com started A/B testing, an effort that has involved more than 1,000 experiments running in parallel and 150,000 experiments in total. However, fewer than 25% of those experiments succeeded. Manuel stated that the goal wasn't to be right; it was to learn fast. These experiments ultimately built their Data-Driven DNA.
Manuel's presentation covered three layers: Data Management, Machine Learning Engineering and Domain Intelligence.
Data Management
Booking.com's original tech stack was built on Perl libraries and MySQL, which offered asynchronous replication and commercial support. They had only one master database in 2005, which had grown into approximately 6800 database instances by 2020. Their MySQL setup is also unusual because it uses no specialized hardware, stored procedures, User-Defined Functions (UDFs), database views or cache layer.
Their "secret sauce," as Manuel characterized it, consisted of smaller databases (with a 2TB limit) that fit on Non-Volatile Memory Express (NVMe) solid-state drives. They observed point queries that completed in less than 350 microseconds.
This model was successful until their data grew too large. To remedy this, Booking.com added Apache Hadoop for distributed storage and processing at scale. By 2011, they had two on-premises Hadoop clusters that each contained approximately 60,000 cores and 200 PB of hard disk space.
Hadoop powered their machine learning pipeline for many years until they discovered cracks in the system. From a machine learning scientist's perspective, these cracks included: noisy neighbors, where one bad query clogged a cluster; no GPU support; and capacity issues that caused overloads and outages at peak times. By 2018, Booking.com had decided to sunset Hadoop, but the process to upgrade and migrate away from Hadoop took approximately seven years.
There were five phases in Booking.com's migration strategy:
- Map their entire ecosystem.
- Analyze usage to reduce scope.
- Apply the Google Search PageRank algorithm.
- Migrate in waves.
- Phase out Hadoop.
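The talk summary does not say which graph PageRank was applied to or how the scores were used. As a minimal sketch, assuming the algorithm was run over a dependency graph to find the most heavily depended-upon assets, the idea could look like the following. The graph, the node names and the `pagerank` helper are all hypothetical and written for illustration only.

```python
from typing import Dict, List


def pagerank(graph: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a directed graph.

    graph maps each node to the nodes it points at. Here an edge
    "consumer -> dataset" means the consumer depends on the dataset,
    so heavily depended-upon datasets accumulate a higher score.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical dependency graph; every name here is invented.
dependencies = {
    "pricing_report": ["bookings_table", "hotels_table"],
    "ranking_features": ["bookings_table", "clicks_table"],
    "fraud_model": ["bookings_table"],
    "bookings_table": [],
    "hotels_table": [],
    "clicks_table": [],
}

scores = pagerank(dependencies)
by_importance = sorted(scores, key=scores.get, reverse=True)
print(by_importance[0])  # the node with the most inbound dependency weight
```

In this toy graph the table that three consumers depend on ranks first. Whether high-scoring assets should be migrated early or late is a planning decision that the sketch does not address.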
Manuel stated that the key to their success was a unified command center.
Machine Learning Engineering
Booking.com's machine learning stack evolved from Perl libraries and MySQL in 2005 to agentic systems in 2025. In between, there was Apache Oozie with Python, Apache Spark with MLlib, H2O.ai, deep learning and GenAI.
Manuel maintained that 2015 was a pivotal year for Booking.com as they solved two core problems: real-time predictions using online inference at scale, and feature engineering for training and inference.

As of 2024, their machine learning inference platform serves more than 480 machine learning models, handles 400 billion predictions per day and has a latency of less than 20 milliseconds.
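To put the daily volume in perspective, the figures above can be converted into average rates. This is back-of-the-envelope arithmetic only: it assumes load is spread evenly across the day and across models, which the talk did not claim and which real traffic with peak periods will not match.

```python
# Figures quoted in the talk summary.
PREDICTIONS_PER_DAY = 400_000_000_000
MODELS = 480

SECONDS_PER_DAY = 24 * 60 * 60

# Averages under the (unrealistic) assumption of perfectly uniform load.
avg_per_second = PREDICTIONS_PER_DAY / SECONDS_PER_DAY
avg_per_model_per_second = avg_per_second / MODELS

print(f"{avg_per_second:,.0f} predictions per second on average")
print(f"{avg_per_model_per_second:,.0f} predictions per second per model on average")
```

This works out to roughly 4.6 million predictions per second overall, or about 9,600 per second per model if the load were shared equally.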
Domain Intelligence
Manuel discussed four domain-specific machine learning platforms and their respective use cases. The first three are: GenAI, with use cases that include trip planning, smart filters and review summaries; Content Intelligence, a machine learning content hub for image analysis, review analysis and text generation, with use cases such as detailed content for hotels; and Recommendations, with use cases for displaying personalized content to customers.
Ranking, the fourth domain-specific machine learning platform for personalized real-time ordering, was a more complex task. Booking.com's three-way optimization challenge included: choice and value; exposure and growth; and efficiency and revenue.
Their 2005 ranking formula was simply a function that included parameters such as bookings and the number of views plus a random number function. They tried to evolve their formula with factors such as cancellations, distance-based ranking, room availability and hotel impressions. As they attempted to replace the ranking formula with machine learning, they discovered that the formula was "undefeatable," as characterized by Manuel, due to infrastructure limitations.
Their experiments typically ran for two to four weeks, and they looked for ways to improve on this. They adapted their A/B testing experiments to include a technique called interleaving, where 50% of each set of experiments was essentially interwoven into a single experiment. This allowed for more variants with less traffic. They therefore decided to preselect with interleaving and validate with A/B testing.
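The summary does not name the interleaving method Booking.com used. As an illustration of the general idea, the sketch below uses team-draft interleaving, one well-known variant, in which two rankers take turns contributing results to a single list and clicks are credited to whichever ranker placed the clicked item. The function names, hotel identifiers and click data are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def team_draft_interleave(ranking_a: List[str], ranking_b: List[str],
                          rng: random.Random) -> Tuple[List[str], Dict[str, str]]:
    """Merge two rankings into one list, recording which ranker placed each item."""
    interleaved: List[str] = []
    team: Dict[str, str] = {}
    picks = {"A": 0, "B": 0}
    rankings = {"A": ranking_a, "B": ranking_b}
    total = len(set(ranking_a) | set(ranking_b))

    while len(interleaved) < total:
        # The ranker with fewer picks goes next; ties are broken at random.
        if picks["A"] < picks["B"]:
            order = ["A", "B"]
        elif picks["B"] < picks["A"]:
            order = ["B", "A"]
        else:
            order = rng.sample(["A", "B"], 2)
        for name in order:
            # Take this ranker's best item that has not been placed yet.
            candidate = next((x for x in rankings[name] if x not in team), None)
            if candidate is not None:
                interleaved.append(candidate)
                team[candidate] = name
                picks[name] += 1
                break
    return interleaved, team


def credit_clicks(clicked: List[str], team: Dict[str, str]) -> Dict[str, int]:
    """Count clicks per ranker based on which ranker placed the clicked item."""
    wins = {"A": 0, "B": 0}
    for item in clicked:
        if item in team:
            wins[team[item]] += 1
    return wins


rng = random.Random(0)
current = ["hotel_1", "hotel_2", "hotel_3", "hotel_4"]    # ranker A
candidate = ["hotel_3", "hotel_1", "hotel_5", "hotel_2"]  # ranker B
merged, team = team_draft_interleave(current, candidate, rng)
wins = credit_clicks(["hotel_3"], team)
print(merged)
print(wins)
```

Because every user sees results from both rankers in the same list, each session yields a direct preference signal, which is why interleaving can compare variants with less traffic than a split A/B test. An A/B test is still needed afterwards to measure the effect on business metrics.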

Manuel concluded his presentation with how the domain-specific platforms are now unified for their orchestration layer.