How Booking.com Uses Kubernetes for Machine Learning

by Manuel Pais on Apr 01, 2018. Estimated reading time: 2 minutes

At this year's QCon London conference, Sahil Dua, a developer at Booking.com, explained how the company uses Kubernetes to scale the machine learning (ML) models that recommend destinations and accommodation to its customers (PDF of slides). In particular, he stressed how two properties of a Kubernetes cluster, elasticity and avoidance of resource starvation for containers, help them run computationally (and data) intensive, hard-to-parallelize machine learning models.

Dua provided more details on how the properties of the Kubernetes platform benefited his team and are key for Booking.com to utilise many ML models at large scale; around 1.5 million room nights are booked daily, and the site receives 400 million monthly visitors:

  • Kubernetes isolation - processes that run within Linux containers (and Kubernetes pods) can be isolated at the operating system level, and therefore can be orchestrated to not directly compete for resources
  • Elasticity - pods running ML models can auto-scale up or down based on resource consumption
  • Flexibility - the self-service nature of Kubernetes, and the rapid deployment of containers allows the team to quickly try out new libraries or frameworks
  • GPU support - Kubernetes offers support for NVIDIA GPUs (although this is still in alpha), which allows 20x to 50x speed improvements

The declarative syntax of Kubernetes deployment descriptors is easy for non-operationally focused engineers to understand. Specifying that a pod requires a GPU resource tells Kubernetes to schedule it on a node with a GPU unit:

resources:
  limits:
    alpha.kubernetes.io/nvidia-gpu: 1

Each pre-trained ML model runs as a stateless app inside a container. The container image does not include the model itself; instead, the model is retrieved from Hadoop at startup time. This keeps image sizes small and avoids having to create a new image every time there is a new model, thus speeding up deployments. Once deployed, the model is exposed via a REST API, and Kubernetes probes the container for readiness to receive prediction requests. Only when the container reports ready does traffic start to be directed to it.
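The startup sequence described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Booking.com's code: `ModelService`, `fake_fetch_model`, and the `/ready` and `/predict` paths are hypothetical names, and the download from Hadoop is replaced by a stand-in function. The point it shows is that the service answers a readiness check with 503 until the model has been fetched, which is the signal a readiness probe would use to hold traffic back.

```python
import json
import threading


class ModelService:
    """Holds a model that is fetched at startup instead of baked into the image."""

    def __init__(self, fetch_model):
        # fetch_model is a hypothetical callable standing in for the download
        # from storage (Hadoop in the article); it returns a predict function.
        self._fetch_model = fetch_model
        self._predict = None
        self._lock = threading.Lock()

    def load(self):
        """Fetch the model, then publish it so readiness checks start passing."""
        predict = self._fetch_model()
        with self._lock:
            self._predict = predict

    def is_ready(self):
        with self._lock:
            return self._predict is not None

    def handle(self, method, path, body=b""):
        """Return (status_code, json_text) for one request."""
        if method == "GET" and path == "/ready":
            # A readiness probe would poll this path; 503 keeps traffic away.
            ready = self.is_ready()
            return (200 if ready else 503), json.dumps({"ready": ready})
        if method == "POST" and path == "/predict":
            if not self.is_ready():
                return 503, json.dumps({"error": "model not loaded"})
            features = json.loads(body)["features"]
            return 200, json.dumps({"prediction": self._predict(features)})
        return 404, json.dumps({"error": "not found"})


def fake_fetch_model():
    # Stand-in model for illustration only: it sums the input features.
    return lambda features: sum(features)
```

In a real deployment, `handle` would sit behind an HTTP server and `load` would run once when the container starts. Keeping the model out of the image means a new model version only requires restarting the pods, not rebuilding the image.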

Besides Kubernetes' auto-scaling and load balancing, Dua revealed some other techniques used at Booking.com for optimizing the latency of the models, namely keeping the model loaded in the container's memory and warming it up after startup (by issuing an initial request to TensorFlow, Google's ML framework, where the first run is typically slower than the rest). Not all requests come from a live system; in some cases predictions can be precomputed and stored for later use. For these precomputed predictions, optimizing for throughput (the amount of work done per unit of time) matters more than latency. Batching requests and parallelizing those that are issued asynchronously helped reduce the networking overhead and improve throughput, said Dua.
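As a rough sketch of the batching and parallelization idea, the Python below groups inputs into batches and sends several batches concurrently with `asyncio`. Everything here is an assumption made for illustration: `fake_predict_batch` stands in for a single HTTP call to a model's REST endpoint, and the names `make_batches`, `predict_all`, `batch_size`, and `max_in_flight` are hypothetical, not taken from the talk.

```python
import asyncio


def make_batches(items, batch_size):
    """Split items into consecutive lists of at most batch_size elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def fake_predict_batch(batch):
    # Hypothetical stand-in for one HTTP call to a model's REST endpoint.
    await asyncio.sleep(0.01)  # simulated network round trip
    return [x * 2 for x in batch]


async def predict_all(items, batch_size, max_in_flight,
                      predict_batch=fake_predict_batch):
    """Send batches concurrently, with at most max_in_flight requests open."""
    limiter = asyncio.Semaphore(max_in_flight)

    async def send(batch):
        async with limiter:
            return await predict_batch(batch)

    batches = make_batches(items, batch_size)
    results = await asyncio.gather(*(send(batch) for batch in batches))
    # gather returns results in submission order, so predictions line up
    # with the original items.
    return [prediction for batch_result in results for prediction in batch_result]


def run_example():
    return asyncio.run(
        predict_all(list(range(10)), batch_size=4, max_in_flight=2)
    )
```

Batching amortizes the per-request network cost over many inputs, while the semaphore caps how many requests are open at once so the model containers are not flooded.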

ML models need to be trained with pre-selected data sets before they are ready to provide the kind of predictions Booking.com needs. The training part of the process also runs on Kubernetes infrastructure. Base images for the containers where training takes place contain only the required frameworks (such as TensorFlow and Torch) and fetch the actual training code from a Git repository. Again, this keeps container images small and avoids a proliferation of new images for each new version of the code. Training data is fetched from Hadoop clusters. Once the model is ready (the training workload has finished), it is exported back to Hadoop.

Additional information on the talk can be found on the QCon London website, and the video of the talk will be made available on InfoQ over the coming months.
