
Examining the Past to Try to Predict a Future for Building Distributed Applications


Summary

Mark Little looks at some core concepts, components and techniques in reliable distributed systems and application building over the years and tries to predict what that might mean for the future.

Bio

Mark Little works for Red Hat, where he leads JBoss technical direction and research and development. He has experience with two successful startup companies, and was Chief Architect as well as co-founder at Arjuna Technologies. Over the years he has been a Research Fellow at the University and is now a Professor. He is also a Visiting Professor at INSA Lyon.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Little: I'm Mark Little, VP of engineering at Red Hat. I've been at Red Hat since 2006, when they acquired JBoss. I'm here to present to you on whether we can examine the past to try and predict the future for building distributed applications. I hear from some people in my organization, and from customers, various concerns about how building applications these days seems to be getting more complex. They need to be experts in so much more than just building the application. That's not actually wrong. I want to look back and show that that's always been the case to one degree or another, and why. As a side effect, I also hope to show that application developers have been helping to drive the evolution of distributed systems for years, even if they don't always know it.

This is not a Java or enterprise Java story. I'll use Java and Java EE, or Jakarta EE if you're up to date, as examples, since many people will know about them, and if you don't just know about them, you may have even used them directly. This is not meant to be, A, a picture of me, and B, me complaining in this talk about complexity and its impact on application developers. I'm trying to be very objective in what I'll present to you. Hopefully, there are some lessons that we can all learn.

Where Are We Today?

Where are we today? Hopefully, there's nothing in this slide that's that contentious. Hybrid cloud is a reality. Developers these days are getting more used to building applications that use one or more public clouds, and maybe even private clouds, in combination. Event driven architectures are used more often. Node.js is probably the framework that really started to popularize this. Then when cloud came along with serverless approaches like AWS Lambda, or Knative, for instance, it became accessible to developers who didn't want to use JavaScript. Then there are other popular approaches; I put one here, Eclipse Vert.x. It's in the Eclipse Foundation. It's a polyglot approach, even though it's based on the JVM, similar in some ways to Node.js. Linux containers have come along, and I think they've simplified things. Kubernetes has simplified things, or complicated them. Cloud services, again: developers today are far more comfortable than maybe they were even five years ago with building applications that consume the cloud services that are out there, like Amazon S3, for instance.

IoT and edge are very important now, as well as cloud. Lots of people are working in that space, connecting the two together. I think open source is dominating in these new environments, which is a good thing. Developers are really the new kingmakers; you don't have to go too far to have heard that before. Is this really a developer's paradise? It sounds like there's lots of choice, lots of things going on. There are complexities: which cloud do you choose? Which Kubernetes do you choose? What about storage types and security implementations and distributed transactions? Which one of those do you take? Which model? It does seem like to build apps today, the developer needs to know far more than they did in the past. In fact, if you've been in the industry long enough, it's like it was in the early '90s with CORBA, as we'll see. Developers are having to know more about the non-functional aspects of their applications and the environment in which they run, but why?

What Application Developers Typically Want

I'm going to make some sweeping statements here, just to try and set some context. What do application developers typically want? I think, typically, they do want to focus on the business needs. They want to focus on the functional aspects of the application or the service or the unit of work that they're trying to provide. What is it that they have to develop to provide a solution to the problem, an answer to the question? Yes, all good application developers want to test things; that's an interesting thing for them, something they need to understand. They'll be thinking about their CI/CD pipelines, probably wanting to use their favorite language, and probably wanting to do all of this in their favorite framework and IDE. What do they not want? Again, a sweeping statement: I think, generally, they don't want to become experts in the non-functional aspects of building or deploying their application. For example, they probably don't want to become clustering gurus. They don't want to become an SQL king or queen. They don't want to become transaction experts, although speaking as a transaction expert, I do think that's a great career to get into. They probably don't want to be hardware sages, either. They don't necessarily want to know the subtleties of ensuring that their application runs better on this architecture versus that, particularly if one has GPUs involved, or one does hyper-threading better. These are things that they would really like to have abstracted away.

The Complexity Cycle

I'm trying to show you what I'm calling the complexity cycle. Hopefully, as I explain it, you get the idea. This is a damped sine wave. It's only natural that when a new technology comes on the scene, it isn't necessarily aimed at simplification, and the developers behind it aren't necessarily thinking about how to make it really easy to use. In fact, simplifying something too early can be the worst thing to do. Often, complex things require complex solutions initially, as you probe the problem and figure out whether this thing or that thing works better. As we'll see in a moment, if you look at the growth of distributed systems, you'll see that they were a lot more complex initially, and then did get simplified over the years. Part of that simplification includes standardization. Frameworks come on the scene, and abstractions, to make whatever it is simpler, maybe a bit more prescriptive. All this leads to simplification, and future developers therefore needing to know less about the underlying technology that they're using, and perhaps even taking it for granted.

Reality

It's hard enough with just one damped sine wave. We all know the reality is that today's application developer has to juggle a lot of things in their mind, because there is so much going on, as I showed earlier. I think application developers go through cycles of working on non-application development things as a result, including infrastructure. When they start down a path to provide a solution, they might suddenly find that there is no solution for something that gets in the way of providing the next banking app, for instance (a really bad example), or the next travel booking system. That takes them down a detour where, maybe by themselves or with others, they provide a solution to the problem they ran into, and then they get back on the main road and finally build their application. In fact, in many cases the developer pool, the pool of developers who have built distributed systems and provided solutions for clustering and SQL and transactions, is seeded by application developers, some of whom then decide to stay on in that area, and probably never get back to their application.

Where Are We Heading?

I hear a lot from developers in my org, and from many of our customers, that they are spending more time on non-functional aspects than they want. It worries some of them because they're not experts in Kubernetes, or immutable architectures, or metrics, or observability, or site reliability engineering. Like, do I really need to carry a pager for my application now? For those of you in the audience who've been in this industry long enough, you'll probably remember similar thoughts and discussions from years ago, maybe around REST versus WS-*, or even J2EE versus the Spring Framework. I also think as an industry, we often focus too much on the piece parts of the infrastructure: I've got the best high performance messaging approach, or I've got the best thread pool. We do need to consider the end user, the application developer, more. We should start thinking maybe a little bit earlier about what this does for the application developer. Am I asking that developer to know too much about this?

Sometimes, particularly in open source, rapid feedback, which we often suggest is a great reason for using open source, doesn't necessarily help, because the feedback you often get in many upstream communities is from the same audience that is developing that high performance messaging. It just keeps exacerbating the problem. If you look at this from a different industry's point of view, this is not what we expect from the car industry, for instance. As far as I know, you don't hear about them having conferences purely about the engines, or their seatbelts, and how they've got the best seatbelts. They talk about the end result, how you make the whole thing come together and look beautiful, and perform extremely well.

Distributed Systems Archeology

Before we look forward and try, hopefully, to predict what's going to happen in the future, it's important that we look back. Distributed systems began in the '70s, probably the '60s really, when developers wanted to connect multiple computers together. They built systems that could talk between heterogeneous environments, send messages, receive messages. Between the '70s and the '90s, though, these were typically single threaded applications. Your unit of concurrency was the operating system process. Many languages at the time had no concept of threading. If you were lucky and you were programming in C or C++ (maybe other languages had something similar), you could do threading with setjmp and longjmp and build your own thread package. In fact, I've done that a couple of times. I did it as an application developer. I was building something, a simulation package. I needed threads, there were no thread packages in C++ at the time, so we built a thread package. Then we went back to the application.

That's the complexity that you had to get involved in, because you came across a problem that got in your way and you had to solve it. Then, if you look at the core services and capabilities that started to emerge back then, you can see that what was being laid down then is influencing where we are today. They were talking about transactions and messaging and storage, things we take for granted; they had their birth in the '70s to the '90s. We also need to remember the fallacies of distributed computing, and A Note on Distributed Computing by Waldo and colleagues, where they showed that distributed systems shouldn't be made to look like local environments, local systems. In the '70s to the '90s, we spent a lot of time trying to abstract distributed systems to look like local environments. That was not a good thing. Go and look at that work. In some ways, it's an example of oversimplification, and of frameworks trying to hide things which they probably shouldn't have.

CORBA (Other Architectures Were Available)

One of the first standards efforts around distributed systems was CORBA, from the OMG. Here's a high level representation where we have these core services, like transactions, and storage, and messaging, and concurrency control. Initially, everything in CORBA was still typically single threaded: COBOL, C, C++, no threads. POSIX threads were evolving at the time, and eventually came along and were more widely adopted. Developers still needed to know a lot more than just how to write their business logic. Yes, they didn't need to know the low level things like network byte ordering anymore, because CORBA was doing a really good job of isolating you from that. It was getting simpler. Abstractions were getting more solid, but you still had independent services. At least they conformed to a standard for interoperability and some portability, but you still had to think: which ORB, which language?

CORBA, and systems that came before it and many that came after it, was also pushing a very closely coupled approach. What I mean by that is that in CORBA, for instance, you didn't typically define your service endpoints in the language they were going to be implemented in. You used something called IDL, the Interface Definition Language, which could then generate the client and server stubs in the right language. You wrote the IDL once, and you could make that service available to end users who wanted to interact with it through C, C++, Java, or COBOL. You didn't need to worry about that as a developer; the IDL generator would do all of that work for you. That was a good simplification. However, because the IDL is essentially the signatures of the methods you're going to invoke, it meant that, as in any closely coupled environment, if you change a signature, say an int to a double, you have to regenerate all the client and server stubs in all of the languages being used, or existing clients will suddenly not be able to use that service. Essentially, you change one signature, you have to change everybody. There were some advantages. It was certainly a simplification over having to know host and port pairs and network byte order and big-endian versus little-endian from the '70s, absolutely. The disadvantage was that, like I said, it could make the distributed system quite brittle.
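To make that coupling concrete, here is a minimal sketch, written in Java rather than IDL, of the kind of generated contract every client stub is compiled against. The Account name and its operations are purely illustrative assumptions, not something from the talk.

```java
// A hypothetical remote contract, standing in for what an IDL compiler would
// generate from an interface definition. Every client, in every language,
// is built against exactly these signatures.
public interface Account {
    int balance();            // change this int to a double...
    void credit(int amount);  // ...or widen this parameter,
    void debit(int amount);   // and every generated stub in every language
                              // must be regenerated and every client rebuilt.
}
```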

In the late '90s to early 2000s, we saw the chip explosion, and the performance of chips went exponential. You had multi-core, hyper-threading, and RAM sizes exploding from megabytes to gigabytes; obviously, we've gone much further than that since. Interestingly, network speeds were not improving as quickly. One of the things that pushed was the colocation of services, to get out of having to pay the network penalty.

The Application Container

Colocation became an obvious approach to improving performance and reducing your memory footprint. If you could take your multiple services, each consuming its own memory, and stick them into one process, then hopefully the overall memory size wouldn't be n times m, where n is the footprint of a service and m is the number of services. Your application container, and I'm using that term so that we don't get confused with Linux containers or operating system containers, does much of the work. It was also believed to be a more reliable approach. In this era, colocation becomes the norm. In Java, it took only a few years; other languages took quite a while. In fact, in CORBA, if you were still involved with that back in the day, the component model was essentially a retrofit of the J2EE component model. Abstractions at this point had become a key aspect of building distributed systems. J2EE and the Spring Framework all tried to hide the non-functional aspects of the development environment, to enable the developer to build more complex applications much more quickly. You abstract clustering, more or less, and you abstract transactions. Developers typically only have to write annotations. It's not perfect, but developers, I think, started to feel much more productive in this new era with these application containers than they were in the CORBA era.

J2EE Architecture

This is an example of a typical application server with the application container. Obviously, I'm using JBoss here because of my background, but there were many other implementations at the time and there are still quite a few today. You can see the core services, the thread pool, and connection pooling, and the developer sees very little, as the container is handling it all. Like I said, the developer typically annotates the POJO, says this method should be transactional, this one should have these security credentials, and the container takes care of it all. Really simplified: we're at the tail end of that sine wave there.
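As a rough illustration of that container-managed style, here is a minimal sketch assuming an EJB 3-era Java EE container; the PaymentService class and its method are hypothetical, not taken from the talk.

```java
import javax.annotation.security.RolesAllowed;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;

// The developer writes a plain class plus annotations; the application
// container supplies pooling, security checks, and transaction demarcation.
@Stateless
public class PaymentService {

    // The container starts a transaction if one isn't already active and
    // commits or rolls it back around this call; only callers in the
    // "teller" role are allowed through.
    @TransactionAttribute(TransactionAttributeType.REQUIRED)
    @RolesAllowed("teller")
    public void transfer(String fromAccount, String toAccount, long amount) {
        // business logic only: move the money, no infrastructure code here
    }
}
```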

Application Container Backlash

Frameworks such as Spring did a much better job of abstracting this and making the lives of developers easier than some other application container approaches, and I have to include J2EE there, even though my background is very much in that, and even in CORBA. Credit where credit is due. The Spring developers at the time recognized that there was still much more complexity in J2EE than there needed to be. It had got to the point where it really wasn't simple to develop even the easy stuff in some of those application containers. Because we started to add more capabilities into these things, which not every developer wanted, they became more bloated. They took up more memory footprint. They took longer to start up. There was this all or nothing approach: you had to have everything in the J2EE spec, for instance. Until the Web Profile came in towards the latter half of J2EE's life, you had no choice. Yes, you could tune things, you could remove things in some implementations, but it was a non-standard way; it might work with JBoss, but probably it wouldn't work with WebLogic, and vice versa. Also at the time, related in some regards to the bloat and the slow startup, there was a big push towards long running applications, with dynamic aspects pushed into the application container itself, so that you didn't have to stop the application server if you wanted to make a change to the messaging service implementation, for instance. OSGi came along to try and help with that, along with rather bespoke module systems. All of this brought more bloat and started to not be a good experience for many application developers.

Services Became the Unit (SOA)

Then, services became the unit of coding again. I think a lot of that was based on problems with building systems from closely coupled approaches. Teams couldn't be independent enough: change your method signature, and you had to let everybody know. People wanted to get away from application containers for some of the reasons we mentioned earlier. We saw the re-rise of services and service oriented architecture. Like I said, CORBA had this originally; you could argue that it's similar. SOA was meant to focus on loosely coupled environments, and that's where the similarity with CORBA starts to break down, because CORBA is very much more closely coupled. In SOA, you have a service as a unit of work. It's not prescriptive about what happens behind the service endpoint. In some ways that's similar to an IDL, but the idea was that you didn't have to inform end users if you made a change to a signature. You could do that dynamically, or you could let them know this signature changed, maybe try this version instead.

Web services, or WS-*, came on the scene at about this time. They often got equated to service oriented architecture, but they weren't the same thing. You could write SOA applications in CORBA if you wanted to. You could write them in J2EE. You could write them in web services, but you could just as easily write non-SOA applications in these same technologies. Generally, web services should not be equated with SOA. ESBs came on the scene at about the same time, and it was the same sort of thing. There were lots of arguments about whether ESBs were SOA, whether ESBs were web services, and whether you should put the intelligence in the endpoints versus the pipes. The answer is the endpoints, I think. REST has been around since the birth of the web, and there was a lot of debate at the time. You only have to look at where we are today: REST won. The REST approach, I think, is much more conducive to SOA and loose coupling than WS-* was, and probably than ESBs.

Now of course, it was Einstein who said to make everything as simple as possible, but not simpler. Perhaps the rush to simplification and improving performance led to more complication at different levels in the architecture, in this SOA architecture essentially. Despite what I said earlier about web services not being equated with SOA, they did tend to get used a lot in SOA. In some ways, as we'll see later on, maybe that gave SOA a bad rep.

Web Services Standards Overview

This is a really old screenshot of some standards from 2008, I think 2006 to 2008. This is the WS-* stack. You can see that complexity has started to come back in here. There was a lot of effort put into individual capabilities in WS-*, the enterprise capabilities that various vendors and consultants and individuals said were needed for building enterprise applications. It doesn't matter whether you're building them with J2EE or with SOAP, you need things like messaging, and eventing, and all these things. My concern, looking back now, is that there was not enough thought placed on how this plethora of options and implementations would affect the application developer. There was lots of backlash as a result. Frameworks and abstractions didn't manage to keep up initially. Developers were forced to hand code a lot of this, and that led to the inevitable demise, I think.

Monoliths Are Bad?

At that turning point, I think, for SOA and web services, we started to hear about how monoliths are bad. I don't buy into this. Like pretty much everything, there's no black or white, there's no silver bullet. If you've got a monolith, and your monolith is working well for you and has been working well for you for years or decades, it doesn't matter if it's written in COBOL. If it's working, don't refactor it, just keep using it. There's nothing wrong with a good monolith. Not all monoliths are bad. That doesn't mean you shouldn't refactor any of your monoliths, but don't think that because you've got a monolith, you've necessarily got something bad that you have to put engineering effort towards rewriting in the latest cool language or the latest cool framework. I would absolutely recommend, if you haven't seen it, going and looking at The Majestic Monolith. It's a very pragmatic approach. It's a relatively old article now, but it's still online, and you should definitely go and look at it. It says things a lot better than I was able to.

Microservices

At that point, when monoliths were said to be bad, we suddenly saw this discussion about microservices, which are still around today; hopefully, everybody knows that. It's almost a decade ago that Adrian Cockcroft and others started to talk about pragmatic SOA, and that became microservices. We sometimes forget about the heritage, but it's there: they are related to SOA. Despite the fact that many people at the time were trying to point this out, unfortunately, I think we failed to learn quickly enough from history in this case, that this approach, in many ways, was going to turn back the clock and add some complexity. There were absolutely benefits, but there was a lot of complexity that came in very early on that I think perhaps we could have avoided.

I think the reasons for this complexity are fairly straightforward. For many years, as an industry and even in academia, we'd been pushing developers to be less interested in knowing they were building on distributed systems, with coarse-grained services and three-tier architectures. Our tools, our processes, even education were optimized for this almost localized development. Yes, there were distributed systems, but there weren't many components, maybe two or three. Then, suddenly, with microservices, we're turning back the clock to the '70s. We're telling developers that they need to know about the distribution. They need to understand the fallacies of distributed computing. They need to understand CAP, and monitoring, and fine-grained services. I'm not saying they don't need to understand those things. I think that they don't necessarily need to understand all of them, and certainly not in as much depth as we were asking them to almost a decade ago.

Linux Containers

There has definitely been one great simplification, in my view, in recent years for application developers, and that's operating system containers. Yes, I know the concept predates Linux, with roots in the '60s and '70s, but for the purposes of this talk, we'll focus on Linux. Linux containers came on the scene around the time of microservices, so some people started to tie them together. The reality is that they can be and are used independently. You don't have to use containers for microservices, and likewise, your containers don't have to be microservices. Unless you were dealing with actual services, which tended to sit at a static URL, for instance, where you only needed to worry about how your clients would connect to them, distributing your applications has always been a bit of a hit and miss affair. Statically linked binaries come pretty close, assuming you just want portability within an operating system. Even Java, which originally touted write once, run anywhere, fails to make this a reality as well. If you're a Java user, hopefully you'll understand what I mean: just look at Maven hell, for instance.

Kubernetes

Clearly, one container is never enough, especially when you're building a distributed system. How many do you need to achieve specific reliability and availability metrics? Where do you place those containers? What about network congestion, or load balancing, or failover? It is hard. Kubernetes came on the scene as a Linux container orchestration approach from Google. There were other approaches at the time, and some of them are still around, like Apache Mesos or Docker Swarm, but Kube has pretty much won. Developers do need to know a bit about Kubernetes, and specifically immutability, but they shouldn't have to be experts in it. I think we're still at that cusp where Kube is too complex in some areas and for some developers. The analogy I use: you probably know about and use a VPN, but you don't necessarily know how it works, and you don't need to. Back to immutability. Immutability is the biggest change that Kubernetes imposes. Because of the way it works, it assumes that when it wants to fire up a container image somewhere, it can just pull that image out of a repository, place it on a machine, and start it up. If that's a replica image, it might take down another image, or that other image might have failed. It doesn't need to worry about state transfer; it's all immutable. If you want to change that image, like the example we were using earlier with application containers, and you want to change the messaging implementation that's in that container, you create a new image and put that new image in the repository. You do not change the image as it is running, because if you do and Kube takes it down, you've lost that change.

Why is this important? Because for almost two decades, especially in the Java community, we've been telling application developers and application container developers to focus on making dynamic changes to running applications, whether through OSGi or equivalents, as we discussed earlier. Leaving runtime decisions to the last minute, or even the last second, is the approach we've asked developers to take, yet immutability goes against this. Decisions like that should be made at build time, not runtime. We're only now starting to understand the impact that this is having on application developers, and some frameworks are now starting to catch up, but not all of them.

Enterprise Capabilities

What about enterprise capabilities? We've seen this with web services, and with J2EE and others. We've been focusing a lot, of course, on containers and trying to simplify that, but what goes around those containers and between them? Things like messaging and transactions. Applications still need them; it doesn't matter whether they're deployed into the cloud, public or private. You probably still need to have some consistency between your data. You want to be able to invoke those applications very fast. We're actually seeing application containers, which if you remember evolved in many ways from CORBA-like approaches by colocating those services into a single address space, now breaking apart. Those monoliths are becoming microservices, so you're having independently consumable containers and services. Those services are going to be available in different languages, and it's not just REST over HTTP; there are other protocols you can invoke them through, like JMS or Kafka. Lots of work is going on in this space. We're going back a bit: they're no longer residing in the same address space, and with that come the problems that we perhaps saw with CORBA, where the network becomes the bottleneck. What does that mean from a performance point of view?

The inevitable consequence of this concern about what goes around the containers, how you glue them together, and how you provide consistency, is complexity. Whether you bought into web services or not, and that complex diagram we saw earlier, you do need these things. Perhaps web services could have provided them in a different way, maybe through REST, for instance. There were different approaches that REST [inaudible 00:35:09] had at the time that were perhaps simpler. You do need these capabilities, or at least many of them. I want to show you this. This is the CNCF's webpage. The CNCF, the Cloud Native Computing Foundation, is a group that is attempting to standardize and ratify a number of different approaches to building in cloud environments. Various projects are there, like Knative and Kubernetes. They want to be the place that helps application developers choose the right tool for the right job. The thing to know here is that a lot of these smaller icons are not necessarily competing specifications for one of the [inaudible 00:35:59], they are actually different implementations of the same thing, so different CI/CD pipelines, for instance. That is still complex. Which group do you choose? Which pipeline do you choose? Does that pipeline work with the group you've just chosen? There's still a lot of complexity in this environment, and it's evolving.

There Are Areas Where Complexity Remains

There are areas where complexity remains. I think that's because we haven't quite figured out the solution domain yet, so abstracting for application developers is a little early. I think we're doing the right thing in a number of areas: we're innovating, and trying to innovate rapidly, rather than standardizing too early and providing frameworks that abstract too early. We need real world feedback to get these things right. Unfortunately, that complexity is there, and it has to be there until that sine wave decays and we get to the point where we feel we understand the problem and have good enough solutions. I think application developers have been at the forefront of distributed systems development for decades. It's inevitable. It's how this whole industry evolves. If we didn't have people building applications, then we wouldn't have Kubernetes, and we probably wouldn't have Android or other equivalents.

Areas Still To Be Figured Out for the Application Developer - Consensus

I did want to spend the last few minutes on my thoughts on areas that are still to be figured out, areas where there still is complexity, and where we probably have a few more years to go before we can get to that simplification. They include consensus in an asynchronous environment, large scale replication, and accountability. I think complexity will unfortunately continue in these areas, or fortunately, if you are of the mindset to want to get involved and help fix things. Consensus, having the participants in an application or in a group agree on what the outcome of some unit of work is, is incredibly important in a distributed environment. Two-phase commit in transactions, for instance, is a consensus protocol. Getting consistency with multiple participants in the same transaction is very important.
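As a reminder of what that kind of consensus looks like in code, here is a minimal sketch of a two-phase commit coordinator, under the usual simplifying assumptions (no timeouts, no coordinator failure, no durable log); the Participant and Coordinator types are illustrative only, not any particular transaction manager's API.

```java
import java.util.List;

// A resource that can take part in the protocol.
interface Participant {
    boolean prepare();   // phase one: vote yes (can commit) or no (must abort)
    void commit();       // phase two: make the prepared work durable
    void rollback();     // phase two: undo the prepared work
}

// The coordinator drives both phases so every participant reaches the same outcome.
class Coordinator {
    boolean complete(List<Participant> participants) {
        boolean allVotedYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) {      // a single "no" vote forces an abort
                allVotedYes = false;
                break;
            }
        }
        for (Participant p : participants) {
            if (allVotedYes) {
                p.commit();          // everyone agreed, so everyone commits
            } else {
                p.rollback();        // otherwise everyone rolls back
            }
        }
        return allVotedYes;
    }
}
```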

We have a growing need, though, in the world that we now live in, whether it's cloud, or cloud and edge, or just edge, to have consensus in loosely coupled environments. Two-phase commit is a synchronous protocol, very suited to closely coupled systems, like local area networks, for instance. As we grow our applications into the cloud, and the cloud gets larger, and you have multi-clouds, it's inevitable that your applications become much more loosely coupled. You still want consistency in your data and in your conversations. Multi-party interactions are becoming more conversation-like, with gossip protocols: I tell you, you tell somebody else, I don't tell somebody else directly, and then, at some point, we come together and decide on the answer, not that but this. How do we do that, though, in an environment where we have failures or slow networks? Like I said, it's an active area of research and development, but I hope this is an area where we can actually learn from the past. You don't have to go back that far: a couple of decades, and you can find lots of work by the RosettaNet team and ebXML, where they've done this. No central coordinators, lots of autonomy, and yet with guarantees.

Replication at Scale

Then there is replication at scale. Transactions, which I've mentioned a few times, or extended transactions, where you loosen the ACID capabilities. Consistency is important for many applications, not just in the cloud but outside it too, and that concern predates the cloud by a long time. Relaxing ACID properties may help, and there's been a lot of work going on there over the last 20-odd years. We have to explore the tradeoffs between strong and weak consistency. Strong consistency is what application developers intimately understand; it's really what they want. I haven't spoken at all about NoSQL so far, but one of the things about NoSQL is that it typically goes hand in hand with weak consistency. While there are use cases where that is really important, it's harder for most developers to think about and to build their applications around. Frameworks have often not been that good at helping them understand the implications of weak consistency.

You only have to look at Google, for instance. They did an about-face: they were going down the weak consistency avenue for a long time, and then they came out with Spanner. Spanner offers strongly consistent, distributed transactions with ACID guarantees at global scale, not just a few machines running in the same city or on the same continent. They're trying to do this globally, and they're doing it successfully. CockroachDB is essentially an open source equivalent of Spanner, and there's FoundationDB too. Many efforts now are pushing us back towards strong consistency. I don't think it'll be one or the other. I think we do need to see this coming together of weak consistency and strong consistency, and caching plays into this as well. Frameworks need to evolve to make this simpler for application developers, because you will be building applications where you have different levels of consistency, particularly as we grow more modular in nature.

Accountability

Finally, accountability. The cloud does raise issues of trust, and legal implications: third parties doing things that you didn't want them to do. What happens if something goes wrong? Who was responsible for making that change in the ledger? In the absence of solid evidence, disputes may be impossible to settle. Fortunately, we're seeing things like blockchain come on the scene, because accountability is fundamental to developing trust, and audit trails are important. In fact, I think we should have the equivalent of a flight recorder for the cloud, one that records logs of service interactions, events, and state changes, guarantees fairness and non-repudiation, and enables tracing back of incidents. If you've ever seen a hardware TPM, or Trusted Platform Module, you'll understand where I'm coming from here. That's a component that's usually soldered onto a motherboard, and you can't get rid of it without breaking the motherboard. It provides a tamper-resistant record of what runs on your machine, in a way that can be used in a court of law. I think that's the kind of thing we will see more of in the cloud.

Conclusions (Predictions)

I actually think Kubernetes will eventually disappear into the background. I'm not saying it's going to go away. I think it is important, and I think it's going to remain the standard until something better comes along. Even if that does happen, my expectation is that it will sink into the infrastructure, and you as an application developer won't need to know about it. Just as, over the last decade or so, you've hardly ever had to know about Java EE application clustering: you fire up another WildFly instance, and it just clusters automatically, you don't need to configure anything. Compare that with what you had to do with JBoss AS back around 2004; it's chalk and cheese. Linux containers will be the unit of fault tolerance and replication; I think we're pretty much there anyway. Application containers, or something like them, will return. By that I mean we'll start to see these disparate services get colocated in the same address space, or something that makes them closer, such that you don't have to pay the network overhead as much. That could be with caching, for instance, or it could be with lighter weight implementations, in some use cases.

Event driven will continue to grow in adoption. I do think frameworks need to improve here and abstract a bit more. It's definitely a different mindset to think about asynchronous, event driven applications; it's not the same as your synchronous approach, for instance. My only product or project reference here would be Quarkus. If you haven't looked at Quarkus, please do. The team is trying to make event driven very simple. It uses Vert.x under the covers, and it's trying to hide some of that. If you want to know that Vert.x is there, you can get down into the details, but generally, they're trying to make it simpler for application developers. I think immutability from Kubernetes is actually a pretty good approach. It's a pretty good architectural style, and I think it's probably going to grow because of Kubernetes. I think generally, it's also useful to consider as a mindset. The application developer will continue to help evolve distributed systems. It's happened for the last 50 years, and I don't see it changing for the next 50.
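To give a flavor of how simple that event driven style can look, here is a minimal sketch assuming Quarkus with its reactive messaging support (SmallRye Reactive Messaging). The OrderProcessor class and the channel names are hypothetical, and depending on the Quarkus version the CDI import may be javax or jakarta.

```java
import javax.enterprise.context.ApplicationScoped; // jakarta.* on newer Quarkus versions
import org.eclipse.microprofile.reactive.messaging.Incoming;
import org.eclipse.microprofile.reactive.messaging.Outgoing;

@ApplicationScoped
public class OrderProcessor {

    // Consume events from the "orders" channel and publish results on
    // "shipments"; configuration decides whether those channels are backed
    // by Kafka, AMQP, or in-memory streams, so the business code never
    // touches the event loop or the broker API directly.
    @Incoming("orders")
    @Outgoing("shipments")
    public String process(String order) {
        return "shipped:" + order;
    }
}
```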

 


 

Recorded at:

Dec 30, 2022
