Q&A on Machine Learning and Kubernetes with David Aronchick of Google from Kubecon 2017

At the recently concluded Kubecon in Austin, TX, attended by over 4000 engineers, Kubernetes was front and center. Because training workloads are typically compute-heavy, machine learning and its synergy with Kubernetes were discussed in many sessions.

Kubeflow is a platform for making Machine Learning on Kubernetes easy, portable and scalable by providing manifests for creating:

A JupyterHub to create and manage Jupyter notebooks
A TensorFlow training controller that can be configured for both CPUs and GPUs, and
A TensorFlow Serving container
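
To make the idea of a "manifest" concrete, the following is a minimal sketch, in Python, of the kind of Kubernetes Deployment and Service definitions a model-serving container needs. It is an illustration only, not Kubeflow's actual manifests: the function name `serving_manifests`, the image `example.com/model-server:latest`, and the port 9000 are hypothetical placeholders.

```python
import json


def serving_manifests(name, image, port, replicas=1):
    """Build a Deployment and a Service manifest as plain dicts.

    All arguments are illustrative; nothing here is taken from Kubeflow.
    """
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # The Service routes traffic to pods carrying the same labels.
        "spec": {
            "selector": labels,
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return [deployment, service]


if __name__ == "__main__":
    # Hypothetical name, image, and port, chosen only for the example.
    for manifest in serving_manifests(
        "model-server", "example.com/model-server:latest", 9000
    ):
        print(json.dumps(manifest, indent=2))
```

Each printed JSON document could be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML. A platform that ships such manifests ready-made saves users from writing them by hand.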

InfoQ caught up with David Aronchick, product manager at Google and contributor to Kubeflow. He delivered a presentation highlighting the synergy between Kubernetes and Machine Learning at Kubecon 2017.

InfoQ: ML seemed to get a lot of attention at Kubecon. Any particular reason(s) for this?

Aronchick: There’s no question that ML is changing the way business is getting done in nearly every industry. Whenever you get that big of an audience together, with as many smart people in a room talking about the future of technology, they’re going to cover the latest trends and new advancements. The more than 4,000 attendees at Kubecon wanted to see what everyone else was doing in cutting edge Machine Learning and how they might improve their own processes with this new technology.

That said, a lot of new projects got out the door with a fair bit of activity in the space. There were, in fact, so many announcements that there was an entire machine learning track. I’d say the number one takeaway was that machine learning on Kubernetes is not just the future: it’s here today.

InfoQ: Can you describe the overall synergy between ML and Kubernetes, if any?

Aronchick: ML is a new way to use the enormous amount of data now available and answer business questions with more accurate answers faster than ever before. However, the infrastructure support for ML solutions, which are often quite complicated, is still very nascent, requiring a lot of custom scripts and dependency analysis, and raising compatibility issues. And because ML stacks are often deployed in multiple locations (for development, training and production), keeping everything in sync makes the challenge exponentially harder.

Kubernetes offers a common platform for helping to deploy and run these ML platforms at scale. With a rich orchestration that works in multiple clouds, Kubernetes gives the data scientist, developer and IT professional a straightforward way to deploy, run and manage even complicated, multi-service ML workloads.

InfoQ: What’s the difference between installing ML tools on Kubernetes via Helm charts, for instance, and using Kubeflow?

Aronchick: The actual installation of ML tools is done through a packaging system. Although Kubeflow currently uses ksonnet, we fully expect to support a variety of different deployment techniques. The value of Kubeflow is more about giving simple ways for the multiple tools to work well together. We’re still evaluating all the different options (Helm, ksonnet, etc.), but we hope to provide a set of richer objects on top of just the installation to make sure the many packages involved work well together and out of the box.

InfoQ: If I am an ML/Data Scientist, how will Kubeflow simplify my daily life rather than complicate it by adding the Kubernetes layer?

Aronchick: Because Kubernetes provides deployment objects and service endpoints, if you’re a data scientist, it means you get to focus on JUST what matters to you - solving data problems. We would not expect, nor do we require, data scientists to install complicated Kubernetes setups in order to use Kubeflow. On your laptop, you might use minikube. On your on-premises cluster, you might use a Kubernetes installation provided by your organization. And in the cloud, you can use a hosted Kubernetes provider. In each case, you just have one command to install Kubeflow, and then you see the TensorFlow service and Jupyter notebooks you’re familiar with.

InfoQ: Can you provide more technical details about how support for other ML toolkits can be integrated into Kubeflow?

Aronchick: Because we’re using native Kubernetes tooling, it should be a fairly straightforward integration to the existing deployment packages. We’re coming together as a community to offer a wide variety of options, but would love direct engagement from other ML toolkits, since they know their platforms best (we’re in talks with most other groups right now). We also have a few discussions in our GitHub repo about which toolkit we’ll be taking on next, but we’d love help!

InfoQ: How’s the community support for Kubeflow today and what is the roadmap for Kubeflow including perhaps similar support for platforms like Cloud Foundry, OpenShift and so on?

Aronchick: Kubeflow is Kubernetes native and we are committed to ensuring this remains the case. That means we will always plan to support any platform that is Kubernetes Conformant. This includes Cloud Foundry and OpenShift native (Red Hat is already contributing to our project). We also already announced contributions with Canonical/Ubuntu, Weaveworks, Caicloud and many other platform providers. We take the value of a ubiquitous ML stack very seriously and want to ensure that, if you’re a data scientist, Kubeflow will work wherever you need it to.

Keynote sessions and other recordings are available via the schedule for Kubecon.
