
Real Time ML Pipelines Using Quix with Tomáš Neubauer

 

Live from the venue of the QCon London conference, we are talking with Tomáš Neubauer. He will talk about Quix Streams, an open-source Python library that simplifies real-time machine learning pipelines. Tomáš will discuss various architecture designs, their pros and cons, and demonstrate a real use case of detecting a cyclist crash using Quix Streams and a TensorFlow model.

Key Takeaways

 

  • Quix Streams is an open-source Python library that simplifies real-time machine learning (ML) pipelines, allowing data teams to contribute directly to products and to test and develop in real time.
  • Quix Streams can use Kafka as a message broker, and offers a higher level of abstraction for streaming concepts, making it easier to use than Kafka alone.
  • The library supports a range of data types, including time series, numerical, string, geospatial, audio, and video data, as well as metadata and events.
  • Quix Streams can be used with any Python package, allowing for greater flexibility in ML model implementation and deployment, with no restrictions on the frameworks used.
  • Tutorials and documentation are available for those interested in getting started with real-time ML using Quix Streams, covering everything from installing Python and Kafka to working through use cases and understanding streaming concepts.
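The pattern these takeaways describe — consume a record from a topic, run arbitrary Python on it, publish the result — can be sketched with standard-library queues standing in for Kafka topics. All names and the threshold "model" are illustrative; this is not the Quix Streams API.

```python
from queue import Queue

# In-memory stand-ins for Kafka input and output topics.
input_topic: Queue = Queue()
output_topic: Queue = Queue()

def transform(record: dict) -> dict:
    """Toy stand-in for an ML model: flag a crash when the g-force
    reading exceeds a threshold."""
    return {"ts": record["ts"], "crash": record["g_force"] > 3.0}

def run_once() -> None:
    """One iteration of the consume -> transform -> produce loop."""
    record = input_topic.get()
    output_topic.put(transform(record))

input_topic.put({"ts": 0, "g_force": 1.1})
input_topic.put({"ts": 1, "g_force": 4.2})
run_once()
run_once()
print(output_topic.get())  # {'ts': 0, 'crash': False}
print(output_topic.get())  # {'ts': 1, 'crash': True}
```

A stream-processing library's job is to provide this loop against a real broker (partitioning, offsets, retries) so the data team only writes the `transform` function.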

Transcript

Introduction [00:44]

Roland Meertens: Welcome everyone to the InfoQ podcast. My name is Roland Meertens, your host today, and I will be interviewing Tomáš Neubauer. He is the CTO and founder of Quix. We are talking to each other in person at the QCon London conference, where he gave the presentation "Simplifying Real-Time ML Pipelines with Quix Streams, an Open-Source Python Library for ML Engineers". Make sure to watch his presentation, as it delivers tremendous insights into real-time ML pipelines and how to get started with Quix Streams yourself.

During today's interview, we will dive deeper into the topic of real-time ML. I hope you enjoy it and I hope you can learn something from it.

Tomas. Welcome to the InfoQ podcast.

Tomáš Neubauer: Thank you for having me.

Roland Meertens: You are giving your talk tomorrow here at QCon London. Can you maybe give a short summary of your talk?

About Quix Streams [01:37]

Tomáš Neubauer: Sure, yeah. So I'm talking about the open-source library Quix Streams. It's a Python stream-processing library for data workloads on top of Kafka. And I'm talking about how to use this library in projects that involve real-time machine learning. I'll talk about the landscape, the different architecture designs to solve this problem, and the pros and cons of each. And then I put this against a real use case, which I develop on stage from scratch at the end of the presentation. In this case, it's detecting a cyclist crash. So imagine a fitness app running on your handlebars: you crash and you want to inform your relatives or emergency services.

Roland Meertens: So then you are programming this demo live on stage. Which programming language are you using for this?

Tomáš Neubauer: Yes, I'm using the open-source library Quix Streams. So I'm using Python, and I'm basically starting with having just data from the app: telemetry data like the g-force sensor, GPS-based location, speed, et cetera. And I use a machine learning model that has been trained on historical data to detect that the cyclist crashed.

Roland Meertens: And what kind of machine learning model is this?

Tomáš Neubauer: It's a TensorFlow model, and we basically train it beforehand, so that's not done on stage; we label the data correctly and train it in Google Colab. And I'm going to talk about how to get that model from Colab to production.

What is Real Time ML? [02:40]

Roland Meertens: And so if you're talking about real-time machine learning, what do you mean by real time? How fast is real time? When can you really say this is real-time ML?

Tomáš Neubauer: Well, real time in this case will be five times per second: we will receive telemetry data points from the cyclist, so all of the parameters that I mentioned will be streamed to the cloud five times per second. And with, I would say, 50 milliseconds of delay, we will inform either the emergency services or a consuming application that there was a crash. There's no one-hour, one-day, one-minute delay.

Roland Meertens: Okay. So you get this data from your smart device and you are cutting it up into chunks which are then sent to your API or to your application?

Tomáš Neubauer: So we're streaming this data directly through the pipeline without batching anything. Basically, it's coming piece by piece, and we are not waiting for anything either. So every 200 milliseconds we do this detection and either say this is not a crash or this is a crash. And at the end of the presentation, I will have a simple front-end application with a map and an alert, because obviously I'm not going to crash a bike on the stage. I'm going to have a similar model that will detect shaking of the phone, and I'm going to show everyone that the shake is detected.
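The no-batching approach Tomáš describes can be sketched in plain Python: a detector that runs on every incoming sample (every 200 ms at 5 Hz), keeping only a small rolling window. The threshold rule here is a hypothetical stand-in for the actual TensorFlow model; none of these names come from Quix Streams.

```python
from collections import deque

WINDOW = 5  # one second of samples at 5 Hz

class CrashDetector:
    """Run a detection on every incoming sample -- no batching.

    Only a small rolling window of recent samples is kept, so each
    arriving data point produces an answer within one 200 ms tick.
    """

    def __init__(self) -> None:
        self.window: deque = deque(maxlen=WINDOW)

    def on_sample(self, g_force: float) -> bool:
        self.window.append(g_force)
        # Stand-in for the trained model: mean g-force over the window.
        return sum(self.window) / len(self.window) > 2.5

detector = CrashDetector()
stream = [1.0, 1.2, 0.9, 6.0, 7.5, 8.0]  # a shake/crash near the end
results = [detector.on_sample(g) for g in stream]
print(results)  # [False, False, False, False, True, True]
```

A real deployment would call `on_sample` from the broker's consume loop instead of iterating a list, but the latency story is the same: one sample in, one verdict out.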

Data for Real Time ML [04:19]

Roland Meertens: And where does this come from? How did you get started with this?

Tomáš Neubauer: The roots of this idea, for this open-source library, come from my previous job, where I was working at McLaren and leading a team that was connecting F1 cars to the cloud, and therefore to the factory, so people don't have to travel around the world every second weekend to build real-time decision insight. What I mean by that is basically deciding in a split second that the car needs different tires, different settings for the wing, et cetera. It was a challenging use case: lots of data, around 30 million numbers per minute from each car. So we couldn't use any of the database technology that I'm going to talk about in the presentation, and we had to adopt streaming technology. But the biggest problem we faced was actually getting this technology into the hands of our functional teams, which were made up of mechanical engineers, ML engineers and data scientists. They all use Python and really struggled to use this new tech that we gave them.

Roland Meertens: And how should I see this? So you have this car going over circuits, generating a lot of data, this sends it back to some kind of ground station and then do you have humans making decisions real time or is this also ML models which are making decisions?

Tomáš Neubauer: The way it works is that in a car there are sensors collecting data, some of them at more than a kilohertz, more than a thousand numbers per second. That is streamed over the radio to the garage, where there's a direct connection to the cloud. Then, through the cloud infrastructure, it's consumed in the factory, where people build new models during the week. And on race day there are plenty of screens in the garage with dashboards and different waveforms that visualize the results of these models, so the people in the garage can immediately decide that the car needs something else.

Roland Meertens: And so this is all part of the race strategy where people need to make decisions in split seconds and this needs the data to be available and the models to run in split seconds?

Tomáš Neubauer: Yes, exactly. And during my time at McLaren, we took that know-how from racing and actually applied it outside. So in the end we ended up doing the same thing for the high-speed railway in Singapore, where we were basically using machine learning to detect brake and suspension deterioration based on historical data. There are certain vibration signatures that will lead to a deterioration of the object.

Programming languages for Real Time ML [06:45]

Roland Meertens: And you were talking about different programming languages like either Java or Python. How does this integrate with what you're working on?

Tomáš Neubauer: Basically, the whole streaming world is traditionally Java-based. Most of the brokers are built in Java or Scala, and as a result, most of the tools around them and most of the libraries and frameworks are built in Java. There are some ports and some libraries that let you use these from Python, but there are just a few of them, and it's quite painful because this connection doesn't really work well. Therefore it's quite difficult for Python people, especially people from data teams, to leverage this stack. As a result, most projects really don't work that way: most people work in Jupyter Notebooks and silos, and then software engineers take these models into production.

Roland Meertens: So what do you do to improve this?

Tomáš Neubauer: What I believe is that unless the data team works directly on the product, it's never going to work really well, because people don't see the results of their work immediately and they are dependent on other teams. And every time one team is dependent on another, it just kills innovation and efficiency. So the idea is that a data team contributes directly to the product and can test and develop straight away. The code doesn't run in a Jupyter Notebook and stay there; it actually goes to real-time pipelines, so people can immediately see the result of their work on a physical thing.

Roland Meertens: And you mentioned that there are different ways people can orchestrate something like this, different ML architectures you could use for such an approach. Which ones are there?

Tomáš Neubauer: There are many options to choose from, along all the different dimensions from which you can look at the architecture of such a system. One of them is obviously whether you're going to go for batch or streaming. So are you going to use technology like Spark and react to data in batches, or do you need a real-time system where you use something like Kafka or Pulsar or other streaming technologies? And the second thing is how you're actually going to use your ML models.

So you can deploy them behind an API, or you can actually embed them into a streaming transformation, and I discuss the pros and cons of each solution.
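The batch-versus-streaming choice Tomáš mentions can be contrasted with a toy sketch in plain Python (no real frameworks; the threshold check is an illustrative stand-in for a model). Both designs compute the same answers; what differs is when each answer becomes available.

```python
# Toy g-force readings; in reality these would arrive over time.
data = [1.0, 1.2, 6.0, 0.9]

# Batch style (Spark-like, simplified): wait for the whole dataset,
# then run the computation once over all of it.
batch_result = [g > 3.0 for g in data]

# Streaming style (Kafka-like, simplified): decide as each record
# arrives, so the latency is one record, not one batch.
stream_result = []
for g in data:
    stream_result.append(g > 3.0)  # result available immediately

print(batch_result == stream_result)  # True: same answers, different latency
```

For the crash-detection use case, the streaming variant is the only viable one: a verdict an hour after the ride ends is useless.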

Roland Meertens: And what do you mean with a streaming transformation?

Tomáš Neubauer: This is a fundamental concept of what I'm going to talk about, which is pub/sub. So basically our model is going to subscribe to a topic where we're going to get input data from the phone, and we're going to output the result: is there a crash or not? And this is the major architectural cornerstone of this approach.
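The pub/sub cornerstone Tomáš describes can be sketched with a tiny in-process broker. Everything here is illustrative (this is neither Kafka nor the Quix Streams API): the model subscribes to the phone-data topic and publishes its verdict to a crash-events topic.

```python
from collections import defaultdict
from typing import Callable

# A tiny in-process pub/sub broker (stand-in for Kafka topics).
subscribers: dict = defaultdict(list)

def subscribe(topic: str, handler: Callable) -> None:
    subscribers[topic].append(handler)

def publish(topic: str, message: dict) -> None:
    for handler in subscribers[topic]:
        handler(message)

results = []

def model(message: dict) -> None:
    # The "model" consumes phone data and publishes crash / no-crash.
    # The threshold stands in for the trained TensorFlow model.
    publish("crash-events", {"crash": message["g_force"] > 3.0})

subscribe("phone-data", model)
subscribe("crash-events", results.append)  # e.g. the front-end app

publish("phone-data", {"g_force": 0.9})
publish("phone-data", {"g_force": 4.5})
print(results)  # [{'crash': False}, {'crash': True}]
```

The key property is decoupling: the phone never calls the model directly, and any number of consumers (alerting, storage, a map UI) can subscribe to `crash-events` without the model knowing about them.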

The tools needed [09:22]

Roland Meertens: Okay. And you mentioned for example, Kafka and you mentioned some other tools. How does your work then relate to this?

Tomáš Neubauer: Well, what we found out is that Kafka, although it's powerful, is quite difficult to use. So we have built a level of abstraction on top of it. Then we found that that's not enough, actually, because streaming in itself introduces complexities and different approaches to common problems; I have a nice example of that tomorrow. So we are building an abstraction on top of the streaming concepts as well, which means that you would operate and develop your code in Python as you would in a Jupyter Notebook. So what you are used to when working with static data would apply to building a streaming transformation.

Roland Meertens: And how do you do this? How can people test this, with a pre-recorded stream which they then replay? And can you still use a Jupyter Notebook, or do you, as a machine learning engineer or data scientist, then lose part of your tooling?

Tomáš Neubauer: Quix Streams is an open-source library that you can just download from PyPI, use, and connect to your broker. If you don't have a broker, you can set one up; it's open-source software as well. Or, if you don't want to, you can use our managed broker; it doesn't matter, it works the same. And then we have some open-source simulators of data that you can use if you don't have your own. For example, we have an F1 simulator which will give you high-resolution data, so that's quite cool. You can also, for example, subscribe to Reddit and get messages from Reddit, or you can use the app I'm going to show you tomorrow. It's also open source, so you can install it from the app store, or you can even clone it, change it to suit your needs, and deploy it yourself.

Different modalities [11:06]

Roland Meertens: So Quix handles text messages, but also audio? What kinds of data do you handle?

Tomáš Neubauer: Yeah, so we handle time-series data, which involves numerical and string values. Then we handle binary data, which is audio, video, geospatial, et cetera, where we allow developers to just attach this in a column. Then we have metadata, so you don't have to repeat, for example, that this bike has firmware 1.5; you just send it once and the stateful pipeline will persist that information. And finally you can also send events. A crash is a good example of an event: it doesn't carry any continuous information.
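The send-metadata-once idea can be sketched as a small stateful pipeline: metadata arrives one time per stream and is then attached to every subsequent time-series row. All class and field names here are hypothetical, chosen only to illustrate the concept.

```python
class StatefulPipeline:
    """Sketch of per-stream state: metadata is sent once and then
    attached to every later time-series row of that stream."""

    def __init__(self) -> None:
        self.metadata: dict = {}  # stream id -> persisted metadata

    def on_metadata(self, stream_id: str, meta: dict) -> None:
        # Metadata is sent once (or rarely) and persisted.
        self.metadata.setdefault(stream_id, {}).update(meta)

    def on_row(self, stream_id: str, row: dict) -> dict:
        # Each high-frequency row is enriched with the stored metadata,
        # so the device never has to repeat it.
        return {**row, "meta": self.metadata.get(stream_id, {})}

pipeline = StatefulPipeline()
pipeline.on_metadata("bike-1", {"firmware": "1.5"})    # sent once
enriched = pipeline.on_row("bike-1", {"speed": 27.4})  # repeated at 5 Hz
print(enriched)  # {'speed': 27.4, 'meta': {'firmware': '1.5'}}
```

Events would be a third message kind alongside metadata and rows: discrete, timestamped facts ("crash at t=12.4s") with no continuous value attached.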

Roland Meertens: Okay. So can you also connect these pipelines, such that one pipeline, for example, gets all the information from your sensor and then sends events to another pipeline? Is that something that's supported?

Tomáš Neubauer: Yes. The whole idea of building systems with this approach is to build pipelines. Each node in your architecture is a container that connects to one or more input topics and outputs results to one or more output topics. You create a pipeline that has multiple branches; sometimes they join back together, sometimes they end, and when they end they usually go either to a database or back to the product. The same goes for the sources: they could be your product, or could be CDC from a database. So you have multiple sources, multiple destinations, and in the middle you have one or more transformations.
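The node-and-topic wiring Tomáš describes can be sketched with lists standing in for topics and functions standing in for containerized nodes. Topic names and the g-force normalization are illustrative; a branch is simply two nodes reading the same topic.

```python
# Named "topics" (lists stand in for Kafka topics).
topics: dict = {"raw": [], "clean": [], "alerts": [], "db": []}

def normalize(msg: dict) -> dict:
    """Node 1: raw -> clean (convert raw sensor units to g)."""
    return {"g": msg["g_force"] / 9.81}

def detect(msg: dict) -> dict:
    """Node 2: clean -> alerts (one branch of the pipeline)."""
    return {"crash": msg["g"] > 3.0}

def archive(msg: dict) -> dict:
    """Node 3: clean -> db (a second branch off the same topic)."""
    return msg

def run(node, src: str, dst: str) -> None:
    """Drain one topic through a node into another topic."""
    for msg in topics[src]:
        topics[dst].append(node(msg))

topics["raw"] = [{"g_force": 50.0}, {"g_force": 9.81}]
run(normalize, "raw", "clean")
run(detect, "clean", "alerts")   # branch 1: detection
run(archive, "clean", "db")      # branch 2: storage
print(topics["alerts"])  # [{'crash': True}, {'crash': False}]
```

Because nodes only agree on topic names, each one can be developed, deployed, and scaled independently, which is the point of the container-per-node design.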

Roland Meertens: And is there some kind of limit to the number of inputs, or the number of consumers you can have for a pipeline?

Tomáš Neubauer: There isn't really a limit to the number of transformations or sources. One thing is that Kafka is designed to be one-to-one, or one to a small number of consumers and producers. So if you have a use case like the one we're going to do today with the phones, where you can possibly have thousands or millions of users, you need to put some gateway between your devices and Kafka, which we'll do. In our case it'll be a WebSocket gateway collecting data and then funneling it to a topic.
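The gateway role can be sketched as a single function that many device connections call, funneling everything into one topic (a `Queue` stands in for the Kafka topic; the device-tagging scheme is an assumption for illustration, not the talk's actual implementation).

```python
from queue import Queue

# Stand-in for the single Kafka topic behind the gateway.
topic: Queue = Queue()

def gateway(device_id: str, payload: dict) -> None:
    """Called once per WebSocket message, from any of thousands of
    devices; tags each payload so downstream consumers can tell the
    individual streams apart despite sharing one topic."""
    topic.put({"device": device_id, **payload})

# Two of potentially millions of phones sending telemetry.
gateway("phone-42", {"g_force": 1.2})
gateway("phone-7", {"g_force": 4.8})

messages = [topic.get(), topic.get()]
print(messages)
```

This keeps Kafka's producer count small and fixed (the gateway instances) no matter how many end-user devices connect, which is exactly the mismatch Tomáš describes.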

Roland Meertens: Okay. So do you still have some queue in between?

Tomáš Neubauer: There isn't really any queue in between, but there's a queue, obviously, in Kafka. So as the data flows to the gateway, it's put on the queue in a topic, and then the services listening to it will just consume and process the data from that queue.

Use cases for Real Time ML [13:33]

Roland Meertens: Do you already have some users who are using this in creative or interesting ways? What are the most interesting use cases you've seen?

Tomáš Neubauer: Yes, one really cool use case is from healthcare, where there are sensors on your lungs listening to your breathing, and the data is sent to the cloud. Machine learning is used to detect different illnesses you might have, and that all goes into the company's app. So it's quite similar to what we are going to do here. A second quite interesting use case is in public transport, where Wi-Fi sensors detect the occupancy of underground stations, automatically closing and opening doors and sending people to a less occupied part of the station.

Roland Meertens: Oh, interesting. So then you have some signal which tells you how many people are in certain parts of the station?

Tomáš Neubauer: Yes, correct. So you have the routers all around the station, and then in real time you know that in the north part of the station there are more people than in the south, and therefore it would be better if people come in from the south. And you can do this in a split second.

The implementation [14:33]

Roland Meertens: Oh, interesting. And then in terms of the implementation, if we, for example, want to have some machine learning model act on the data, are there specific limitations or specific frameworks you have to use?

Tomáš Neubauer: Basically, the beauty of this approach, and I think that's why it's so suited to machine learning, is that it's just Python at the end, where all the magic happens. So you read data from Kafka into Python, and then in that code you are free to do whatever you want. That could be using any pip package out there; you can use a library like OpenCV for image processing, and really anything that is possible in Python is possible with this approach. And then you just output it again with the Python interface. So there's no black-box operation, and there's no domain-specific language like you will find in Flink.

Roland Meertens: Do I basically just say, "Whenever you have a new piece of data, call this Python function with these arguments?"

Tomáš Neubauer: Correct. And even more than Python functions: you can build Python classes with all the structure that you are used to in Python. You can also try it in a Jupyter Notebook; the library will work in a cell in a Jupyter Notebook. So again, there's basically freedom of deployment, running this code anywhere. It's just Python.
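The "call this function on every new piece of data" model, with both plain functions and stateful classes as handlers, can be sketched as follows. The registration mechanism (`on_data`) and all names are hypothetical; they illustrate the callback pattern, not the Quix Streams API.

```python
from typing import Callable

# Hypothetical registry: handlers to invoke on each new data point.
handlers: list = []

def on_data(handler: Callable) -> Callable:
    """Register a callable (function or class instance) as a handler."""
    handlers.append(handler)
    return handler

@on_data
def log_sample(sample: dict) -> None:
    print("got", sample)

class ThresholdAlert:
    """Class-based handler carrying its own configuration and state."""

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.alerts: list = []

    def __call__(self, sample: dict) -> None:
        if sample["g_force"] > self.limit:
            self.alerts.append(sample)

alert = ThresholdAlert(limit=3.0)
on_data(alert)

# The library would drive this dispatch loop as data arrives.
for sample in [{"g_force": 1.0}, {"g_force": 5.0}]:
    for handler in handlers:
        handler(sample)

print(len(alert.alerts))  # 1
```

Because handlers are ordinary callables, the same class can be exercised in a notebook cell with a list of recorded samples and then deployed unchanged against a live stream.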

Roland Meertens: If people are listening and they're beginners in realtime machine learning, how would you get started? What would you recommend to people?

Tomáš Neubauer: Well, first of all, what I'm doing here today is available as a tutorial; all the code is open source, so you can basically try it by yourself. There are other tutorials that we have published that go through different use cases, step by step, from literally installing Python and installing Kafka to getting everything going from the start. So I recommend people go to the docs that we have for the library. There are tutorials, and there are concepts described in detail. So yeah, that would be the best place to start.

Roland Meertens: Are there specific concepts which are difficult to grasp or is it relatively straightforward?

Tomáš Neubauer: What is really complicated is stateful processing, which we are trying to solve and abstract away. If you are interested in learning more about stateful processing, we have it explained in the docs. It's a very interesting concept, and it will open up the intricacies of stream processing. But the goal of the library really is to make it simpler. Obviously it's a journey, but I'm confident that we have already done a great job in making it a bit easier than it was.
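A minimal sketch of why stateful processing is the hard part: any per-stream aggregate (here, a running average keyed by stream) must survive a restart, so the framework has to checkpoint state between messages. The dict-plus-JSON "state store" below is an illustrative stand-in, not how Quix Streams actually persists state.

```python
import json

def update(state: dict, key: str, value: float) -> float:
    """Update the running average for one stream and return it.

    The caller owns `state`; in a real system the framework would
    checkpoint it durably so a restarted worker resumes correctly.
    """
    s = state.setdefault(key, {"count": 0, "total": 0.0})
    s["count"] += 1
    s["total"] += value
    return s["total"] / s["count"]

state: dict = {}
update(state, "bike-1", 10.0)
update(state, "bike-1", 20.0)

checkpoint = json.dumps(state)            # persisted by the framework
restored = json.loads(checkpoint)         # worker restarts, reloads state
avg = update(restored, "bike-1", 30.0)    # continues where it left off
print(avg)  # 20.0
```

Everything that makes this tricky in production (when to checkpoint, exactly-once semantics, state that outgrows memory) is exactly what a stream-processing library tries to hide.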

Roland Meertens: Thank you very much. Thank you for joining the podcast and good luck with your talk tomorrow and hopefully people can watch the recording online.

Tomáš Neubauer: Thank you for having me.
