
John DesJardins on In-Memory Data Grids, Stream Processing, and App Modernization

In this podcast, John DesJardins, field CTO and VP solution architecture at Hazelcast, sat down with InfoQ podcast co-host Daniel Bryant. Topics discussed included: how in-memory data grids have evolved, use cases at the edge (IoT, ML inference), integration of stream processing APIs and techniques, and how data grids can be used within application modernization.

Key Takeaways

  • An in-memory data grid (IMDG) is a set of clustered computers that pool together their memory to let applications share data with other applications running in the cluster. 
  • Data stored within an IMDG can be accessed with low latency and can also be continuously analyzed using stream processing APIs and techniques (e.g. using Apache Beam).
  • An IMDG can be deployed on virtual machines (VMs), cloud platforms, and container orchestration frameworks such as Kubernetes. Many cloud vendors and IMDG vendors provide a hosted “as-a-service” offering.
  • IMDGs have seen a resurgence in usage for dealing with edge (IoT) and machine learning workloads.
  • Organizations that are modernizing their applications can use an IMDG as the data storage and processing “glue” between the old and new worlds.

Transcript

00:20 Daniel Bryant: Hello, and welcome to the InfoQ Podcast. I'm Daniel Bryant, news manager here at InfoQ and product architect at Datawire, and I recently had the pleasure of sitting down with John DesJardins, Field CTO and VP Solution Architecture at Hazelcast. 

Having used in-memory data grids like Hazelcast in my previous roles five to 10 years ago now, I've recently been hearing a lot more about this technology again and its integration with other modern systems. I was keen to pick John's brains about the current state of the art in in-memory data grid technology and learn about modern problems, architectures and solutions in this space.

Three particular areas of interest for me included: app modernization, and how data grid technology can provide a stepping stone from the old world of mainframes to something like cloud and microservices; use cases at what folks are calling the edge, both from a local point of presence (PoP) view and from an IoT perspective; and finally, streaming and machine learning use cases in this space. So welcome to the InfoQ Podcast, John, thanks for joining us today.

01:14 John DesJardins: Yeah. Happy to be here. Thanks for having me.

01:16 Introductions

01:16 Daniel Bryant: So could you briefly introduce yourself to the listeners please?

01:19 John DesJardins: Absolutely. I'm field CTO at Hazelcast; we're an in-memory computing platform. I've been here about two years. My background is, I came over from a company called Cloudera in the big data and machine learning space, and I've been in software about 25 years. I originally studied economics, actually, so I did a lot of predictive model development, as far back as the late 1980s. It wasn't so cool then. So I'm excited about what's going on in computing today.

01:48 What are the common problems that engineers address with in-memory data grid (IMDG) solutions?

01:48 Daniel Bryant: Very nice. As a Java developer, I've been a long-time user, on and off, of Hazelcast throughout my career, and I'm sure many of the listeners will have heard of it. But if people aren't familiar with the in-memory computing space, could you set out some common problems engineers might be dealing with, and how the technology applies to those problems?

02:04 John DesJardins: Yeah, absolutely. It's a great question, and an important topic: what is in-memory computing, and what are its components? The origins of in-memory computing were really around just making applications more resilient. People would put a small amount of data in memory, in a distributed cache: things like web session information. Historically you wanted every node of a cluster of servers to be able to share some basic information, so that if you reconnected to a different server, it wouldn't miss a beat; it would know who you were as a client. That really evolved as memory became cheaper and people started realizing you can put a lot more stuff in there. Then people started realizing, hey, there's all this data that we're just putting into memory. If we actually do the compute where the data is, we can have even faster compute.

02:57 John DesJardins: And so this evolved into more and more capabilities, like querying the data that was in this data grid. So it went from being just a distributed cache to more of a data grid capability. At the same time, there are technologies that focus on real time processing of continuously changing data. That's another aspect of in-memory computing: stream processing, or event stream processing. Many years ago it used to be called complex event processing. I worked with a company called Software AG, and we had a product called Apama that was considered a CEP technology.

03:33 John DesJardins: All of these things are about keeping a lot of data in memory and either analyzing it continuously or just keeping it in there to make it available, to make applications faster, more resilient, and easier to scale. If you're starting a compute instance, making the disks available, and creating more instances of compute that have disk available to them, it takes longer to get that application up and running if it's based on disk. By having an in-memory architecture, you're able to easily scale up, scale down, and/or recover a downed node, et cetera. This whole in-memory computing capability not only makes applications faster, not only allows you to do lower latency computations, but it actually makes things more resilient. That has become a very big driver for us, in terms of seeing adoption of these technologies.

04:29 How do engineers typically interact with IMDGs? Do they write against an API or an SDK?

04:29 Daniel Bryant: Yeah, very nice overview. How do engineers typically interact with these data grids? Do they write against an API or an SDK?

04:37 John DesJardins: Yeah, that's a good question. Basically, most data grids are based on working with objects or data structures that are commonly used by engineers and developers to build applications. That's another difference between your data grid and your typical database: you have a shopping cart, or you have a customer, or an address. You have a collection of different things that come together to make a meaningful chunk of data. Those objects can be stored in the grid very easily, typically using simple key-value APIs. A key-value store is just a really common data store for a lot of developers, and it's supported in most major languages. You can basically interact with a data grid very easily using these APIs. One of the things that sets apart some of the more innovative data grids is that we also offer an asynchronous write-through to the database.
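To make that concrete, here's a minimal sketch of the key-value style of interaction using the Hazelcast Java API, assuming a recent (4.x/5.x) release on the classpath; the map name and cart payload are made up for illustration:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class GridSketch {
    public static void main(String[] args) {
        // Start (or join) an embedded cluster member.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map: the classic key-value view onto the grid.
        IMap<String, String> carts = hz.getMap("shopping-carts");

        // Simple put/get, transparently partitioned across the cluster.
        carts.put("customer-42", "{\"items\":[\"pizza\",\"cola\"]}");
        System.out.println(carts.get("customer-42"));

        hz.shutdown();
    }
}
```

The equivalent map operations are available from Hazelcast's non-Java clients as well.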

05:34 John DesJardins: You don't have to separately put the data in the grid and the database. You can just put the data into the data grid, and then you can have confidence that it's going to end up in the database, eventually. That's another nice thing for developers: if they're using a data grid that supports the write-through capabilities, then your write is instantaneously successful, even though the database could be overloaded and might take a little while to catch up, and databases are typically more expensive to scale. Putting that data grid between your application and microservices layer and the database layer allows you to scale the whole solution better. It also allows you to handle spikes better. A lot of customers use these data grids for very busy websites. A lot of online stores are using this technology to do things like storing and retrieving frequently changing data, such as available-to-promise calculations, shipping calculations, or other things like that.
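As a sketch of what that asynchronous write-behind wiring can look like: the CartStore class and its persistence details below are hypothetical, while MapStore and MapStoreConfig are the real Hazelcast hooks, assuming a recent (4.x/5.x) release:

```java
import java.util.Collection;
import java.util.Map;

import com.hazelcast.config.Config;
import com.hazelcast.config.MapStoreConfig;
import com.hazelcast.map.MapStore;

// Hypothetical store: persists grid entries to a relational database.
public class CartStore implements MapStore<String, String> {
    @Override public void store(String key, String value) { /* INSERT or UPDATE a row */ }
    @Override public void storeAll(Map<String, String> entries) { entries.forEach(this::store); }
    @Override public void delete(String key) { /* DELETE the row */ }
    @Override public void deleteAll(Collection<String> keys) { keys.forEach(this::delete); }
    @Override public String load(String key) { return null; /* SELECT the row */ }
    @Override public Map<String, String> loadAll(Collection<String> keys) { return Map.of(); }
    @Override public Iterable<String> loadAllKeys() { return null; } // null = no eager pre-load

    public static Config config() {
        MapStoreConfig storeConfig = new MapStoreConfig()
                .setImplementation(new CartStore())
                .setWriteDelaySeconds(5); // > 0 makes writes asynchronous (write-behind)
        Config config = new Config();
        config.getMapConfig("shopping-carts").setMapStoreConfig(storeConfig);
        return config;
    }
}
```

With setWriteDelaySeconds(0), the same wiring becomes a synchronous write-through instead.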

06:35 John DesJardins: They need to be able to do those calculations even on their busiest day of the year. If you're a pizza delivery company, which one of our customers is, Super Bowl Sunday is going to be your busiest day.

06:47 Daniel Bryant: Yeah.

06:48 John DesJardins: Yeah, right. If you're an online retailer, Black Friday might be your busiest day of the year. If you're a company that makes phones, when you introduce a new phone might be your busiest day of the year. So whatever that day is, having a data grid in place gives you more resilience and ability to handle those spikes.

07:06 How would something like an IMDG be deployed and operationalized?

07:06 Daniel Bryant: I like it. Just taking an operational step back for a moment, because you mentioned data stores there. I'm sure a lot of our listeners are familiar with relational databases, and increasingly they're being consumed as a cloud service now. Take Amazon, for example, with RDS: I deploy my apps onto a VM or a container, and then I consume the database effectively as a service. How would something like an in-memory data grid be run? Would I spin up a cluster of machines and deploy the product across them, or is it typically a cloud service? How is it often offered?

07:38 John DesJardins: That's a great question. We're definitely seeing a shift towards more interest in having it as a cloud service. Developers are like, "Yeah, I still want this capability, but I don't want to care about how many nodes I need, or whether I'm going to install using a Helm chart with Kubernetes, or with an Operator on OpenShift, or a PCF tile." They just don't want to think about that; they just want to have an API that is fast and responsive. We have seen interest in the managed service side of things, and we launched our own managed service, Hazelcast Cloud. So that is an option for developers who are looking for that flexibility. On the other hand, we have a lot of customers who, even though they're moving to the cloud, really care about that performance and really care about being able to tune and optimize.

08:24 John DesJardins: Also, typically they want to use more advanced APIs, doing computations on the grid. Those customers may even want to incorporate both stream processing and real time analytics, together with putting data in and out of the grid. Those customers will typically leverage something like OpenShift or Kubernetes on the cloud, so that they still have the elasticity and flexibility of the cloud, but they're managing the containers. And then we have some customers who are actually running directly on VMs, such as EC2 instances at Amazon.
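Whichever topology is chosen, the application side looks much the same: a client is pointed at the cluster. A minimal sketch with placeholder addresses:

```java
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class ClientSketch {
    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        // Placeholder addresses: on Kubernetes this is typically a service name,
        // on VMs the instance IPs, and for a managed service the vendor-supplied endpoint.
        config.getNetworkConfig().addAddress("10.0.1.10:5701", "10.0.1.11:5701");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
        System.out.println(client.getMap("shopping-carts").size());
        client.shutdown();
    }
}
```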

09:01 Has the move toward edge computing (IoT, etc.) changed the way engineers are using things like in-memory data grids?

09:01 Daniel Bryant: I'm guessing over the years the problem space itself has shifted. We're definitely pushing more stuff out to the edge now, I think, and the edge means many things to many people: data center edge, physical points of presence. You even have mobile being the edge, I guess, and IoT and so forth. Has that changed the way folks have used things like in-memory data grids?

09:19 John DesJardins: Yeah, absolutely. What we're seeing at Hazelcast is that people have often wanted to feed in data. For example, telemetry: you've got automakers and insurance companies putting vehicle telemetry data into the grid. We've got people connecting data grids to rail infrastructure. And then often, maybe they want to also deploy the stream processing out in those environments, when you start talking about these use cases where the data that's created is decentralized. In-memory computing is all about making things faster and lower latency, and also being easy to deploy and stand up. That means you can deploy this technology, particularly Hazelcast, in smaller form factors: both the stream processing capability of our platform and the data grid capability. We've got customers running it inside of factories. We've got some customers running it on large vehicles, like heavy industrial vehicles.

10:17 John DesJardins: There's a customer of ours, a company called SigmaStream, that actually runs Hazelcast. They have a two node cluster that they'll run on a heavy truck that drives out to monitor oil infrastructure. It then connects to the local network out there, monitors the sensors, and is able to do adjustments and things. They have all kinds of secret sauce they've built on top of the technology that I don't really understand. But ultimately, things like drilling heads are expensive, and you want to be as close to those systems as possible when your application is trying to monitor them, because the latency to try and get to the cloud is higher. In those kinds of scenarios, often the edge might not be connected to the data center or cloud at all. On the other hand, we have people just wanting that seamless connectivity from edge to cloud to data center.

11:11 John DesJardins: Hybrid cloud often comes into the picture with edge deployments, and then of course there are the telecoms with 5G. They're deploying more and more infrastructure that is expanding the edge, as well as providing lower latency capabilities out to the edge. That's creating a lot of innovation in terms of what telecoms can do and how they can manage that. 5G is, by the way, the first end-to-end software defined networking technology that's ever been developed. That means they can deploy infrastructure all over the place and simultaneously create private networks and public networks, and it's all software defined, which gives them a lot of cool capabilities. 5G also allows orders of magnitude more devices to be connected. Some of those things are going to accelerate the pace of edge, and in-memory technologies can be critical to being able to run in those environments.

12:05 John DesJardins: Then, to bring that data back to the data center, you need a way to analyze, filter, and aggregate, because the amount of data created at the edge is just exploding: you have exploding numbers of devices and sensors, and the newer sensors have much higher sampling rates. What you're typically going to do is try to compress or sample that data down to provide it to the business, essentially to have a common operational view, or for data scientists to try to build good machine learning algorithms or do other analysis.

12:41 How does an edge device synchronize with a cloud-based IMDG?

12:41 Daniel Bryant: Very interesting. With something like Hazelcast, are there actually primitives for syncing up these various things you talked about there, John? I can imagine I have my IoT sensors at the edge, I collect data maybe locally, and then I want to connect up to a staging ground or maybe even the cloud. Are there primitives to do that kind of operation, maybe incorporating some of the sampling and other things you mentioned?

13:04 John DesJardins: Yeah. With our stream processing capability, it's very easy to do either compression or sampling, with time window based or time series based processing. You can easily take a look at that data and do a simple aggregation and say: what is the average over this period, or what was the high value over the period? Sometimes sensor readings don't change. You might just say, if the sensor reading is the same, I'm just going to keep tracking how long it's been the same, and I'll report back: this sensor has had the same reading for the past 10 minutes. It samples at 200 Hz, and when you get that one event back every 10 minutes in the data center, you can say, okay, I want to build a time series view of that that is continuous.

13:50 John DesJardins: Well, I can take that information, that it was a 200 Hz sensor and for 10 minutes the reading hasn't changed, and I can say, okay, I'll put that into my data model to do my machine learning. Because what you're looking for is anomalies and changes and those kinds of things.

14:04 John DesJardins: But most industrial systems, once they warm up, kind of stay at the same temperature, whether that's air temperature, lubricant temperature, or coolant temperature, or the RPMs they're turning at. Your car's RPMs are constantly changing, but any kind of large piece of industrial equipment, or even a small piece like a conveyor belt, is different: every low voltage motor on that conveyor belt is running at a continuous speed most of the time, unless something jams the conveyor belt.

14:34 Daniel Bryant: You want to know about it, right?

14:36 John DesJardins: If the speed of that motor changes, then you know that's probably indicating there's a problem. That kind of data is easy to compress, and then you save a lot of bandwidth in how you send it. But you then have to have that intelligence and the ability to do the time series calculations at the edge.
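As a rough illustration of the windowed aggregation John describes, here is a sketch using the Jet pipeline API that ships with Hazelcast, with a built-in test source standing in for a 200 Hz sensor (Jet 4.x style shown; in Hazelcast 5.x the same pipeline is submitted via hz.getJet() instead):

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.WindowDefinition;
import com.hazelcast.jet.pipeline.test.TestSources;

public class EdgeAggregationSketch {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create();
        pipeline.readFrom(TestSources.itemStream(200))     // ~200 events/sec, like a 200 Hz sensor
                .withIngestionTimestamps()
                .window(WindowDefinition.tumbling(10_000)) // 10-second tumbling windows
                .aggregate(AggregateOperations.averagingLong(event -> event.sequence()))
                .writeTo(Sinks.logger());                  // one averaged value per window

        JetInstance jet = Jet.newJetInstance();
        jet.newJob(pipeline).join(); // unbounded stream: runs until cancelled
    }
}
```

Swapping the averaging step for a custom aggregation is how you would express "only emit when the reading changes" style compression.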

14:53 How do engineers interact with the stream processing capabilities of modern data grid technologies? What kind of APIs do they use?

14:53 Daniel Bryant: That's very nice. I'll confess that's something I haven't done with any of the Hazelcast tech; I've mainly used the in-memory data grid for session management, the classic enterprise use case, I guess. What kind of APIs do engineers use to interact with the technology? There's Flink and a bunch of other options; it's not really my area of expertise, but I know there are some standardized APIs in this space where folks can set up stream processing.

15:16 John DesJardins: So what we've done is, we've taken the API for our stream processing and made it very similar to the java.util.stream API that was introduced in Java 8. For Java developers it's a very easy technology to use, if you're familiar with the things that were introduced in Java 8 in that area. If you want another layer of abstraction, we've also integrated with Apache Beam. Apache Beam has a standard API that spans stream processing engines, and we support it. You could build everything using Apache Beam logic, and then we become the runner that executes underneath Apache Beam. So we can support either model. What we're also going to be introducing on the roadmap is a SQL API. We're on a standards committee across streaming vendors for standardizing the SQL logic of stream processing, so there'll be consistency in how you use SQL with continuously changing time series data, in a way that is portable between products. And that's coming in an upcoming release.
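For the Beam route, the pipeline itself stays engine-agnostic. A tiny Java sketch; running it on Hazelcast is then a matter of selecting the Jet runner and adding its dependency, details worth checking against the Beam documentation:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Create;

public class BeamSketch {
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        // Selecting Hazelcast as the engine is done via the runner, e.g.
        // --runner=JetRunner plus the beam-runners-jet artifact (assumption: verify in the Beam docs).

        Pipeline pipeline = Pipeline.create(options);
        pipeline.apply(Create.of("grid", "stream", "grid"))
                .apply(Count.perElement()); // portable logic, independent of the engine underneath
        pipeline.run().waitUntilFinish();
    }
}
```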

16:23 Do IMDGs get compared to projects such as Apache Kafka?

16:23 Daniel Bryant: That's super interesting. I've chatted to the Confluent folks a bit about ksqlDB, a product they've talked about a lot. And now we've mentioned streaming and the other things you've covered in the conversation: do in-memory data grids get compared to things like Apache Kafka?

16:38 John DesJardins: Kafka's really hot right now, but it's really two different components. There's Kafka messaging, which is everywhere; it's incredibly good at moving data around, and it's also good at storing logs or other kinds of time-based data and being able to replay it and load it later. We work very well with Kafka messaging and are used together with it, both with the data grid and the stream processing, in many places. So we view that as very complementary, and in fact we have a Kafka connector that is certified by Confluent. We've partnered; we're on their marketplace. On the flip side, we do compete with Confluent on stream processing, so with Kafka Streams; we also compete with Apache Flink and Apache Spark. Those technologies all have their strengths and weaknesses, but I will say that what's unique about Hazelcast is that our stream processing was built on top of a data grid, by people who live and breathe low latency.

17:39 John DesJardins: Even when we're checkpointing state at a certain point within our directed acyclic graph, which is the representation of your computational logic that is common in a lot of these technologies, we can checkpoint it so that it can be restarted on any node in the cluster, or on different threads, and you get a guarantee of exactly-once execution of your job and your logic. That state checkpointing done by Hazelcast is something we could checkpoint to something like Kafka or a NoSQL database like Cassandra, but we typically checkpoint it to our data grid, and that gives us the ability to do zero downtime restarts. We also support zero downtime job upgrades, and a lot of other cool things that are in part enabled by the in-memory architecture. We're very resilient as a result of that, as well.

18:34 John DesJardins: Then of course, we're very low latency. When I was working at Cloudera, we would say: yeah, we do real time streaming, and you can get your answers from analyzing the data continuously; every second you'll get a new result, a new computation, with about a one second lag. To most people one second doesn't seem that long. But think about credit card fraud detection, and the window of time that you have to do the fraud calculations before you approve. In other words, you tap or swipe your card, and then they're going to look at your history, your pattern of everything, and decide: are we going to approve this $5,000 TV purchase at Best Buy, or are we going to decline it? There's a 40 millisecond, or at most 50 millisecond, window to do that authorization.

19:24 John DesJardins: We've got banks doing real time authorization, and a lot of other people doing real time low-latency processing where every millisecond matters. If you can execute algorithms in less than a millisecond, or in one or two milliseconds, your ability to execute more algorithms and logic and improve the fraud detection can prevent that TV from walking out the door. When I was building real time fraud calculations on Spark, the result would be that I'd send someone a text message saying, hey, did you just buy this TV at Best Buy? And they'd say, no. Guess what happens when you get that text message and you say, no, please cancel that. If the TV already walked out the door, Best Buy has an authorization code saying that they're getting paid for that TV, right? So who buys the TV? You don't, because you responded within seconds of your alert and said, no, that's not me. The bank or other issuing institution has to eat the cost of that TV. If you can execute things in milliseconds, there are use cases where that really does matter.

20:28 Can data grids help with app modernization?

20:28 Daniel Bryant: Oh yeah, completely. It reminds me that I'm having a lot of interesting conversations at the moment around API performance, whether it be at the edge or even service-to-service in microservice systems. I'm guessing in-memory data grids and some of the other things you mentioned there, John, do play into app modernization, as we're all lifting and shifting into the cloud or doing greenfield in the cloud. I'm guessing this technology can play a pivotal role.

20:49 John DesJardins: Yeah, absolutely. One of the things that is critical about modernization is that you still typically have systems of record that have a lot of business logic built into them. They're also often built into the operational cost model of whatever that company is. You go to your line of business and you say: oh, we're going to move this mission critical application that's worked perfectly for 10 or more years, and we're going to move it over here and completely rewrite it. And the line of business will say: yeah, that's paid for, it works, and I've never had any customer complain about it. They don't necessarily want that. What they want with modernization is: how do I make that system, which is working perfectly well, a first class citizen with microservices in the cloud, where I want greater agility? I want to be able to change my applications and enhance them, add new customer engagement features into my applications, all of that stuff.

21:47 John DesJardins: You don't want that to somehow impact and load down the systems of record, and even the mainframes, with things like that. If you put a data grid between your microservices and cloud applications and the systems of record, then suddenly you have all this data available, and you can write through the data grid and it will update the system of record asynchronously, so you have this zero downtime view of the mainframe. But MIPS are expensive. Just because you want to introduce a new, fun way of playing with charts and analyzing your banking information, that doesn't necessarily translate into more revenue for the bank. It increases customer loyalty, but it's not generating more revenue, and it's not built into the cost model of that mainframe. Whereas if you can easily access all the data that's in the mainframe 24/7 and still keep it in sync with the mainframe, there's tremendous value in that, particularly when you talk about digital transformation, right?

22:45 John DesJardins: We want to move to this ability to roll out new capabilities on a very frequent basis. A lot of our online retailers are telling us that they need to be able to make changes to the code in their online applications at any point in time, because they're getting disrupted, and we don't need to talk about who's doing the disrupting; everyone knows. It used to be that retailers froze everything during the period from, say, the beginning of October until the day after Christmas. In the U.S. that was a really big deal. And now those guys are like: no, we can't afford to freeze, because our competitors are not freezing their applications and they're changing what they do. They'll even notice what's working on the Monday before Thanksgiving, roll out a code change before Friday, and somehow gain some advantage.

23:36 John DesJardins: Agility is critical in so many different businesses, including banking. Our banking customers want greater agility because they're getting disrupted as well. You want to do that in a cost effective way, which means you need to put in some layers of abstraction that provide data to these applications. And one of the things that Hazelcast brings is also very strong multi data center replication, which works from traditional data centers, as well as inside a cloud or inside platforms like Kubernetes.

24:10 John DesJardins: If you're talking about a hybrid cloud architecture, which is often the case when you talk about modernization, we can have the data replicated from the data center to the cloud, and that's configurable. You can say, all right, I want this object to have two-way replication, because the online store is making changes that need to get replicated back to the data center and eventually synced into the mainframe or any other systems there. Whereas there might be some data that's only allowed to be changed in one place, like your address; maybe there's some governance around that. You can configure, data type by data type, whether it's one-way or two-way replication, and you can even build in additional filtering logic. And of course, with the real time stream processing, you can further route the data in any direction that you need it to flow. So that's another common pattern.

25:01 If folks are combining existing and new technologies like data grids, how do they approach testing their systems?

25:01 Daniel Bryant: Yeah, very interesting. The one thing that I've struggled with in app modernization is testing. You've got the collision of the old world and the new world, and in the new world of Kubernetes and microservices it's quite easy to virtualize some of those services. How do folks generally go about testing in some of the scenarios you mentioned, John, using data grids as an adapter or facade, or a write-through cache, for example?

25:27 John DesJardins: That's a good question. I think one of the ways that we actually make that easier is that you're not necessarily having to do an end-to-end test, right? Because, hey, I'm writing to the data grid, and then there's separate code that is writing through to a mainframe. The developers who are writing to the data grid can be testing their part and know that's working. Then the people who are testing the part that writes to the mainframe, or the code that synchronizes with the mainframe (depending on how you're synchronizing the data, there are a number of different ways that could work), can test that separately, and then maybe you do a little lightweight end-to-end testing. You can separate out your load testing or various other more intensive testing, both in terms of what happens at the same time and in terms of when it happens in a project, and a lot of other aspects of that. This also makes it easier to do more continuous integration testing, particularly on the microservice side of things, or even, now, how people want to do chaos engineering, right?

26:31 John DesJardins: You're not going to do chaos engineering against the mainframe. If you combine chaos engineering and mainframes, I think some people's heads will explode.
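To make the "test each part separately" idea concrete, here is a hypothetical JUnit 5 sketch that exercises a write-through store class in isolation, against an in-memory stub, with no grid and no mainframe involved. It assumes a variant of the earlier CartStore sketch that delegates to an injected DAO:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.junit.jupiter.api.Test;

class CartStoreTest {

    // Hypothetical stub standing in for the mainframe-facing DAO.
    static class InMemoryCartDao {
        final Map<String, String> rows = new HashMap<>();
        void upsert(String key, String value) { rows.put(key, value); }
        String get(String key) { return rows.get(key); }
    }

    @Test
    void storeWritesThroughToDao() {
        InMemoryCartDao dao = new InMemoryCartDao();
        // Assumption: CartStore refactored to take a DAO rather than talking to a database directly.
        CartStore store = new CartStore(dao);

        store.store("customer-42", "{\"items\":[\"pizza\"]}");

        assertEquals("{\"items\":[\"pizza\"]}", dao.get("customer-42"));
    }
}
```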

26:43 Can IMDGs be useful for processing machine learning workloads?

26:43 Daniel Bryant: I agree, yeah. I did like the mention of mainframes there. Integrating with systems, and with what we sometimes refer to as the past, is key to practically any organization that's been in business for some time, right? Looking forward to the future now: how does all this technology fit into something like machine learning? Are there any challenges that in-memory data grids can solve in this space?

26:59 John DesJardins: One of the challenges with machine learning is how to get the inference easily embedded into applications, and how to make inference scale and perform well. Traditional machine learning frameworks have often taken an approach of, "Well, we'll just throw it into a container and put a REST API in front of it."

27:22 Daniel Bryant: Yeah.

27:23 John DesJardins: "Oh yeah, and then you just have more containers and problem solved." Well, anybody who's tried to build high performance applications knows that, rest is first of all, not exactly a very high performance protocol. And second of all, it's not very granular, scaling things with containers. What we do is, we're able to actually execute the machine learning within Hazelcast's in-memory computing platform, either on the grid or within the stream processing. We actually have an inference runner within the stream processing that will support Python or Java of course, but it also supports inference that can be done with any language that can support gRPC.

28:07 John DesJardins: What we can do is basically say, okay, you've got whatever your language is: C++, C#, Go, it could be anything. We can wrap the code that executes the machine learning, and then we can actually scale it. We can take single threaded machine learning code and make it multithreaded, and take advantage of all the resources within one pod of your Kubernetes environment. Then we can also distribute it across all the pods of our cluster. So now you have the ability to do distributed machine learning. But the other thing that you may need is to be able to share information, and/or aggregate or prepare information for that machine learning. That's again where being able to do things like continuous aggregation, or other types of data logic on the grid or inside of our stream processing engine, comes in: we can look at windows of time, aggregate and prepare the data, and feed it to the machine learning algorithm.

29:07 John DesJardins: We can then execute, take the results, and publish them back out to wherever they need to go, or hand them off to some other logic for further processing. What would you do with this? Well, obviously low latency fraud detection, or maybe low latency trade monitoring, could be some use cases: we look at financial transactions continuously and plug machine learning in to take actions on that changing data.
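A sketch of what plugging an external model into a pipeline stage can look like, using Jet's generic service hooks; the FraudScorer client, its connect endpoint, and its score method are hypothetical placeholders for, say, a generated gRPC stub:

```java
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.ServiceFactories;
import com.hazelcast.jet.pipeline.ServiceFactory;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;

public class InferenceSketch {

    // Hypothetical wrapper around a generated gRPC stub to a model server.
    interface FraudScorer extends AutoCloseable {
        static FraudScorer connect(String endpoint) {
            throw new UnsupportedOperationException("sketch only: dial the model server here");
        }
        double score(Object transaction);
        @Override void close();
    }

    public static void main(String[] args) {
        // One shared client per cluster member, closed when the job ends.
        ServiceFactory<?, FraudScorer> scorer = ServiceFactories.sharedService(
                ctx -> FraudScorer.connect("model-host:50051"), // placeholder endpoint
                FraudScorer::close);

        Pipeline pipeline = Pipeline.create();
        pipeline.readFrom(TestSources.itemStream(100))          // stand-in for a transaction stream
                .withIngestionTimestamps()
                .mapUsingService(scorer, (svc, txn) -> svc.score(txn))
                .writeTo(Sinks.logger());
        // Submitted like any other Jet job, as in the earlier pipeline sketch.
    }
}
```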

29:32 John DesJardins: But we could also use this to look at, for example, data related to the behavior of users and customers, in order to make applications smarter. Say you have an online store and you want to make real time offers. Well, you need to know what people are putting into their shopping cart, and also what they're not putting into their shopping cart but are looking at. Because maybe what you need to do is make that real time offer so fast that, as they're looking at a couple of different items, you make one of them blink and go: hey, there's a flash discount for the next 10 minutes on one item that you showed some interest in, and which we think we can still make a good margin on if we get you to buy it right now.

30:15 John DesJardins: And we've got a lot of inventory; if we only had a few in inventory, maybe we wouldn't want to discount them as much. You could build this real time logic to make an application smarter in, say, retail, or you could do the same thing in media or other verticals when trying to do recommendations. Embedding personalization and recommendation into applications is another area where this technology can shine. And then finally, there's pushing machine learning out to the edge, whether that edge is a retail store, a factory, or within a 5G mobile network; wherever the edge may be, being able to distribute the machine learning out there is important. One of the things that I also want to talk a little bit about is our strong focus on the ecosystem, and making sure that our technology works with other technologies.

31:04 John DesJardins: I mentioned earlier that we've certified our connector with Kafka. We've got a partnership with IBM, and we've worked with both IBM and Red Hat to certify against a lot of different technologies, like OpenShift, IBM's Multicloud Manager, and Cloud Paks. We've also certified on things like Quarkus and Open Liberty, to make sure our technologies work well there. And we've worked with Pivotal on Spring and on PCF. So we're very committed to an ecosystem play, and that also goes for cloud vendors, making sure that our technology works with machine learning, and the work we've done on Beam. As you're trying to make microservices work, to make machine learning work in your applications, or to deal with going into the cloud, I think it's important that your technology really has the ability to run in a lot of different heterogeneous environments.

32:01 John DesJardins: We do have a lot of capabilities, because we have the data grid and the real time stream processing integrated, and we're adding more and more, like the SQL support that's coming soon. We get a lot of input from customers, but we also get a lot of input from the open source community and developers directly, as well as from our partners, on what's going on and what the trends are, and that informs what we'll be doing to continue to make it easier for everyone to develop in this kind of environment.

32:30 Daniel Bryant: It's been a fascinating tour de force of all things in-memory data grid, John. I really appreciate your time, and thanks for joining us today.

32:36 John DesJardins: Well, thank you very much. I appreciate your time and always good chatting with you. So thanks a lot.

More about our podcasts

You can keep up-to-date with the podcasts via our RSS feed, and they are available via SoundCloud, Apple Podcasts, Spotify, Overcast, and Google Podcasts. From this page you also have access to our recorded show notes, which all have clickable links that will take you directly to that part of the audio.
