
The Commoditization of the Software Stack: How Application-First Cloud Services are Changing the Game


Summary

Bilgin Ibryam discusses the intersection of cloud-native technologies such as Dapr with developer-focused cloud services.

Bio

Bilgin Ibryam is a technical product manager at Diagrid, where he focuses on developing tools to enhance developer productivity. Prior to this role, Ibryam served as a consultant and architect at Red Hat. He is a member of the Apache Software Foundation and has co-authored the books Kubernetes Patterns and Camel Design Patterns.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Ibryam: My name is Bilgin Ibryam. I'm here to tell you how I see cloud services evolving, and how that is likely to influence the distributed applications we are building. This will be a fast-paced talk where we won't have time to dig deep into each technology and pattern, but rather have a quick overview of how architectures evolve, and try to analyze what that might mean for the future of cloud native applications. I'm a product manager at Diagrid, where we are building APIs for developers. Before that, I was a consultant, architect, and product manager at Red Hat, where I used projects such as Apache Camel, which is an integration framework, contributed to it, and wrote a book about it. Similarly, I have used Kubernetes as a developer and wrote a book about it, which you can get as a sponsored free copy from https://k8spatterns.io. What we'll do is look at how applications have been changing from monoliths to microservices, functions, and whatever is coming next. We will also look at how infrastructure is evolving in the form of cloud services, and how it is shaping the application architecture.

Early-Cloud Era

Let's get started by looking at the pre-cloud and early-cloud era. In the timeline I looked at, this is the time before the microservices movement, before the cloud became mainstream. This is mainly the on-prem and early-cloud era, such as early EC2 instances. If we look at a representative application architecture of that time, such as an application built on top of an ESB platform, we would see that developers had to implement everything themselves or use it from the ESB. That includes anything to do with application packaging, deployment of apps on the SOA platform, even placement of the application on the VMs, and the deployment and release process for new updates. It also includes handling the configuration and scaling aspects of the application. For synchronous interactions based on SOAP and web services, or even RPC, everything would still be controlled from within the application. At the time, we didn't have things such as the circuit breaker pattern or mTLS. Developers were responsible for service discovery, retries, timeouts, authentication, and authorization. Under the asynchronous interactions category, I include any kind of interaction through messaging: responsibilities for message transformation, converting from one protocol to another, connecting to various systems, dead letter queues. Finally, under stateful patterns and workflows, which typically require persistent state, I include things such as business workflow orchestration, or the simpler saga pattern implementations. Idempotent consumer, shared distributed lock, timers, cron jobs: all of these patterns require persistent storage behind the scenes, and all of them would be implemented by developers from within the application layer. In this era, all of these responsibilities were handled by the application.
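As a rough illustration of what that meant in practice, here is a minimal sketch in Python, with a hypothetical service address and retry policy, of the kind of hand-rolled retry and timeout logic that lived inside application code in that era:

# Hand-rolled resilience inside the application itself: a hypothetical static IP,
# naive retries with backoff, and a fixed timeout -- no circuit breaker, no mTLS.
import time
import requests

ORDER_SERVICE_URL = "http://10.0.0.12:8080/orders"  # static address handed over by the ops team

def fetch_orders(max_retries: int = 3, timeout_seconds: float = 2.0):
    for attempt in range(1, max_retries + 1):
        try:
            response = requests.get(ORDER_SERVICE_URL, timeout=timeout_seconds)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == max_retries:
                raise  # out of retries, propagate the failure
            time.sleep(2 ** attempt)  # naive exponential backoff between attempts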

What about the infrastructure? As a developer at the time, I would see the infra as a thin layer, and I wouldn't have really big expectations from it. The infrastructure would give developers compute in the form of a VM with a certain capacity, networking, and storage in the form of maybe a relational database and a message broker. Even these, in some cases, would be part of the application layer provided by the ESB platform. What about the interface and the boundary between these two layers and teams? It was in the form of an operating system abstraction. The ops team would give you a VM with a specific capacity, maybe with Java or some application server installed, and that would be it. There would be a list of static IPs and port numbers to refer to other services and databases. That was pretty much all there was in the pre-cloud and early-cloud era. There were no universally accepted application formats and APIs that could be used from different languages and environments. This is the era of the smart monolith and dumb static infrastructure, from an application developer's point of view.

Internal Architecture

While that was the state of the art before 2010, there was a revival and renewed interest in application development, with a few major software development trends that happened next. They are still influential today. Let's see them. There are different aspects of a software architecture and different ways to visualize them. Among the popular ways, there is one called the 4+1 architectural view model, which includes the logical view, physical view, process view, and development view, plus scenarios. This technique pretty much relies on UML diagrams for visualizing these different views. Later, there was also another popular way of visualizing software architecture, called C4, created by Simon Brown, which takes a somewhat more simplistic approach and looks at software hierarchically: the system context, the containers where the app is running, the components that make up the application, and the code level, such as classes and packages.

For this talk, I want to take an even simpler approach and talk about application architecture in just two levels, which I call internal and external architecture. Internal architecture is everything that is created by and fully in developers' control: classes, functions, packages, the different layers within the application, even abstractions of the external systems. We could say that internal architecture is everything that goes inside the container image, and is treated as a black box from an ops and platform point of view. External architecture is the collection of everything that the application interacts with. That's the other services that make up the whole system, databases, message brokers, maybe cloud services. As an ops person, you have to be aware of these external interactions and ensure the connections are reliable, secure, and observable. Compared to the C4 model, the internal architecture is levels 3 and 4, the components and code, whereas the external architecture is basically levels 1 and 2, the system context and containers.

With this disclaimer in place, let's see some of the memorable influences that happened in the monolithic application and how they changed the internal architecture of applications. The first one is domain-driven design, a term coined by Eric Evans in his book, "Domain-Driven Design." Domain-driven design is a collection of principles and patterns that help developers encapsulate complex business logic and close the gap between the business reality and the code. While this book was written nearly a decade before microservices, it set the foundation and later became a cornerstone of microservices by helping developers break down a monolithic application into smaller, loosely coupled modules that represent different business domains, expressed as bounded contexts. The next significant shift, in my opinion, is hexagonal architecture, coined by Alistair Cockburn in an attempt to avoid the structural pitfalls of object-oriented software design: pitfalls such as undesirable dependencies between the layers, and contamination of the user interface with the business logic, and vice versa. Basically, this approach improves the flexibility and maintainability of 3-tier applications by decoupling components and providing a standardized approach for interacting with external dependencies. There were also a few related ideas such as onion architecture and clean architecture by Uncle Bob. These design patterns emphasize the separation of concerns within the application layer, and they organize the application code base into different layers with specific responsibilities. All of these architectural styles help with separating the application's business logic from the infrastructure, and allow developers to make changes to the infrastructure without affecting the business logic, and vice versa.
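To make the ports-and-adapters idea concrete, here is a minimal sketch in Python; the repository names and the order-placing function are illustrative, not taken from the talk:

# The business logic depends only on the abstract "port"; infrastructure-specific
# "adapters" are plugged in from the outside.
from abc import ABC, abstractmethod

class OrderRepository(ABC):  # the port, owned by the business logic
    @abstractmethod
    def save(self, order: dict) -> None: ...

class PostgresOrderRepository(OrderRepository):  # an infrastructure adapter
    def save(self, order: dict) -> None:
        print(f"INSERT INTO orders ... {order}")  # real SQL would go here

class InMemoryOrderRepository(OrderRepository):  # a development/test adapter
    def __init__(self):
        self.orders = []
    def save(self, order: dict) -> None:
        self.orders.append(order)

def place_order(order: dict, repository: OrderRepository) -> None:
    repository.save(order)  # unaware of which adapter it is talking to

place_order({"id": 1, "item": "book"}, InMemoryOrderRepository())

Swapping PostgresOrderRepository for InMemoryOrderRepository changes the infrastructure without touching place_order, which is exactly the separation these styles aim for.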

Then came microservices and 12-factor apps. Built on top of ideas from domain-driven design, such as bounded contexts and aggregates, and on hexagonal architecture for isolating external dependencies, microservices basically allow each service to be independently released and scaled to meet the demands of changing business requirements. The 12-factor app methodology represents a set of best practices for developing microservices-based applications and modern, scalable cloud applications. Each of these ideas built on top of the previous one, but maybe also slightly altered it. As a result of these application development trends, in a decade, the applications' internal architecture changed significantly. The monolith architecture we saw earlier almost turned into a taboo and an antipattern, and applications started transitioning towards microservices and functions.

Compute-First Cloud Era

While application developers were busy transitioning from monoliths to microservices, let's see what happened with the infrastructure layer at the time. Due to all the changes in the applications' internal architecture and the cloud migration, we started seeing the emergence of standalone middleware for microservices, whether that is integration middleware, such as Apache Camel and Spring Integration, Kafka for event-driven architecture, or projects for workflow orchestration, such as Conductor, Cadence, and Camunda. These specialized frameworks started to deliver some of the needs of microservices, and whether deployed on-prem or in the cloud, they still remained within the developer's realm. This shift represents a split of integration responsibilities from the ESB and the monolith into a separate component, but one still managed by developers. What's more interesting here is what happened in the compute and low-level networking layers. Docker was announced in 2013, which unlocked a huge wave of innovation in the compute abstraction layer and basically started the compute-first cloud era for the ops teams. Kubernetes and Lambda were announced, and service meshes a bit later. All of that meant that runtime responsibilities started shifting to the underlying platform and became a responsibility of ops teams and cloud services. They were no longer concerns for developers.

The container-based packaging meant that applications written in any language could be orchestrated uniformly: performing things such as placement, deployment, scaling, and configuration management moved from developer responsibility to ops responsibility, and turned into a declarative and executable format instead of being written as documentation that may contain errors. Also, networking became more dynamic and application focused. Some of the reliability, service discovery, failover, observability, and routing responsibilities shifted to the platform level, to ops teams. I think one of the main reasons for such a rapid transformation was the fact that we had, for the first time, polyglot, application-specific formats such as Docker and Kubernetes. These are basically represented by the red boxes on the diagram. These technologies bridged the gap between developers and ops teams, and enabled practices such as DevOps and GitOps using a common language, common patterns, abstractions, and tools used by both teams.

I want to dig a little bit deeper into the contract between the application and the compute in this case. Today, whether you're running a microservice as a container on Kubernetes or on a pure container service, or running it as a serverless function, there is a certain contract between the application and the runtime platform. To distinguish it from other kinds of platforms that we'll see later, and to emphasize the fact that this is typically a managed service or SaaS offering, I call this the compute cloud on the diagrams. This contract between the application and the compute cloud is in the form of API interactions, configurations, and even practices such as rolling deployments. All of that we'll refer to as compute bindings. Let's see how we bind an application to the compute layer, and what those APIs are.

Let's say we have a microservice with some application logic within a container. That application has its own database, its internal state, and may talk to some other systems and services. These are the external dependencies. When we run such an application on a compute platform, there are certain contracts between the two. First, through a configuration interface, whether that's a YAML file or some other format, we pass certain resource demands to the compute platform: the CPU and memory requests and limits in Kubernetes, or the memory configuration in the case of AWS Lambda. We will be using these two runtime platforms as a comparison. We use other configuration policies to define where the application should run. On Lambda, that is the region selection or the option to deploy the function at the edge. On Kubernetes, there is a richer set of configuration options such as taints, tolerations, affinity, anti-affinity, and so on. There can also be other metadata such as labels and environment variables that we pass to the compute platform, so that it knows how to configure our application. There is even a certain contract for how this configuration is passed from the platform to the application: usually that's through environment variables, but it can also be through files mounted in a specific location in a specific format. There are lifecycle hooks too. The platform knows how to start and stop our application and trigger certain events during startup, before shutdown, or at other significant lifecycle phases that the application can then react to. For Kubernetes, there are the post-start and pre-stop events. For Lambda, there are similar extension APIs that allow the application to intercept the init, invoke, and shutdown phases. Then there are APIs for the platform to check the health of the application: APIs to check whether the application has started, and whether it needs any rectifying actions from the platform. For Kubernetes, these are basically the various health probes the platform is performing. For Lambda, because a Lambda invocation is such a short-lived process, the health is basically determined by the response status, which dictates whether the platform should retry the request or not.
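As a small, hedged example of the configuration and lifecycle side of this contract, here is a Python sketch of an application that receives its configuration through environment variables (the variable names are illustrative) and reacts to the SIGTERM signal the platform sends before stopping it:

# Configuration is injected by the platform as environment variables; the
# platform announces shutdown (e.g. pod termination after a pre-stop hook)
# by sending SIGTERM, which the application handles to exit gracefully.
import os
import signal
import sys
import time

DB_URL = os.environ.get("DATABASE_URL", "postgres://localhost/dev")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "info")

def on_shutdown(signum, frame):
    print("received SIGTERM, finishing in-flight work and shutting down")
    sys.exit(0)

signal.signal(signal.SIGTERM, on_shutdown)

if __name__ == "__main__":
    print(f"starting with db={DB_URL} log_level={LOG_LEVEL}")
    while True:
        time.sleep(1)  # the application's real work loop would go here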

Then every compute platform also offers ways to collect logs, metrics, and traces, and these are now mainly based around structured log formats, or Prometheus and OpenTelemetry based metrics and traces. Whether you are aware of this or not, these are some of the written and unwritten contracts, conventions, and practices that I call the compute bindings between the application and the compute cloud, regardless of which one you are using. In the compute cloud I also include various service meshes and transparent mTLS, resiliency, and observability concerns. If you look at all of these APIs between the application and the platform, we as developers have to do very little. Maybe we have to implement the health probe APIs and make sure the application is containerized properly and can start up and shut down. That's pretty much it. Most of these bindings are used by ops teams to operate the apps at scale at runtime. The nice thing about all of these bindings is that most of them today are heavily influenced by containers, Kubernetes, and other open source projects and formats. They are pretty universal across most compute platforms, cloud providers, and even different application architectures. Cloud native as a whole is focused around compute and compute bindings, and its reach goes this far.
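To show how little the developer-facing part of these bindings can be, here is a minimal Python sketch of HTTP endpoints a liveness or readiness probe could call; the paths and port are assumptions of this example, not something any platform mandates:

# A tiny HTTP server exposing health endpoints the platform can probe.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ready = True  # would be flipped once caches are warm, connections are open, etc.

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":      # liveness: is the process alive?
            self._respond(200, {"status": "ok"})
        elif self.path == "/ready":      # readiness: can it accept traffic?
            self._respond(200 if ready else 503, {"ready": ready})
        else:
            self._respond(404, {"error": "not found"})

    def _respond(self, code, body):
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()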

External Architecture

To sum up, we looked at how applications' internal architecture has been evolving from monoliths to microservices, and how compute-centric application services were born, creating a certain binding between the app and the compute platform, mainly used by ops teams today. What we'll see next is how the applications' external architecture started changing and moving to the cloud too. We will look at the integration bindings; in this talk, these are the collection of interactions of the application with other applications, cloud services, and the storage layer. These are bindings used primarily in the application's external architecture. In contrast to compute bindings, which are used by ops teams, the integration bindings are used by developers while implementing the applications. Here, again, we have our containerized application with some internal state. Notice that when we look at an application's external architecture, there can also be other third-party systems and services the application is interacting with. In addition, if you are familiar with the idea of data on the outside, there can also be state that has not reached the application yet. That state can be in a workflow engine, in a DLQ, or in a retry in progress. It is designated for this service, but it hasn't been accepted yet. That's the external state on the diagram, which is as important as the internal state when we look at the end-to-end request flow. These integration bindings can be in the form of connectors to external systems. They can be messaging and eventing logic, such as message retries, filters, dead letter queues, message delays, content-based routing, and handling of poison messages. They can also be service orchestration and workflows, webhooks, and triggers that poke your application at a specific time to do something. Even something such as a distributed lock used for singleton components. Basically, I put in this category all the distributed system patterns that developers have to use while implementing a distributed application. These are a collection of undifferentiated technical features you have to use to implement your application's bespoke business logic.
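As a hedged illustration of how much work these bindings represent when done by hand, here is a minimal, in-memory Python sketch of two of them, an idempotent consumer and a dead letter queue, implemented inside the application; all names and limits are illustrative:

# Hand-implemented idempotent consumer and dead letter queue.
processed_ids = set()   # state the idempotent consumer needs (persistent in real life)
dead_letter_queue = []  # where poison messages are parked after repeated failures
MAX_ATTEMPTS = 3

def process_business_logic(message: dict) -> None:
    print("processing", message["id"])  # the actual business logic would go here

def handle(message: dict) -> None:
    if message["id"] in processed_ids:
        return  # duplicate delivery, safely ignored
    process_business_logic(message)
    processed_ids.add(message["id"])

def consume(message: dict) -> None:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            handle(message)
            return
        except Exception:
            if attempt == MAX_ATTEMPTS:
                dead_letter_queue.append(message)  # give up and park it for inspection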

What I find more interesting is where these integration capabilities can live. The same way we saw that, thanks to containers, Kubernetes, ubiquitous formats, and even Lambda, compute and runtime responsibilities moved from the application and the ESB into the compute layer, which is managed by ops or a cloud provider, we can see that some of the integration responsibilities are moving from the application layer into their own layer, as standalone middleware, or even to serverless cloud services. In terms of deployment options for these integration capabilities, the traditional approach is to have all of this integration logic within the application layer. For example, that is the case with projects such as Apache Camel and Spring Integration. Camel provides an implementation of tens of messaging patterns and connectors, which are beautifully wrapped in a nice Java DSL. This approach offers the most flexibility, but it is not available in all popular languages, and it tightly couples your application with the integration logic and its lifecycle. The other extreme is to offload all the integration needs to something like AWS EventBridge or some other cloud provider's framework, and couple your application with it. Frameworks such as EventBridge are basically a modern serverless, cloud-based ESB. If you use it, you basically couple your application with the whole ecosystem and tools of that provider.
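As a sketch of that second extreme, here is what publishing an event directly to AWS EventBridge through the vendor SDK (boto3) might look like; the bus and event names are hypothetical, and the point is that the call site is now tied to AWS-specific APIs and tooling:

# Publishing an application event straight to EventBridge via boto3.
import json
import boto3

events = boto3.client("events")

def publish_order_created(order_id: str) -> None:
    events.put_events(
        Entries=[{
            "EventBusName": "orders-bus",        # hypothetical event bus
            "Source": "com.example.orders",
            "DetailType": "OrderCreated",
            "Detail": json.dumps({"orderId": order_id}),
        }]
    )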

There is a third option, and that is to use de facto standards, open source projects, and APIs to bind your application with the integration logic, similarly to how containers and Kubernetes are used for compute bindings. Examples of these open APIs are things such as Apache Kafka, whose API is used for stream processing, Redis for caching, even AWS S3 for file access, and Dapr for distributed systems. These can be deployed on-premise, because there is usually an open source implementation local to the app, or they can be consumed as a cloud service, and you can even change your mind and move back and forth. One limitation with some of these de facto standards is that they lack the higher level abstractions I've been describing. They're mainly focused on the storage access layer. For example, Kafka is for message access, Redis is just key-value access, and S3 is for file access. The integration bindings I've been describing in this talk are more than storage access only. They cover high-level developer concerns, such as the ones offered by Dapr.
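Here is a minimal Python sketch of binding to one such de facto standard, the Redis API; the host, key names, and helper function are illustrative, and the backing endpoint could equally be a local container or a managed cloud cache, chosen purely through configuration:

# Cache-aside over the Redis protocol; the actual Redis endpoint is configuration.
import os
import redis

cache = redis.Redis(
    host=os.environ.get("CACHE_HOST", "localhost"),
    port=int(os.environ.get("CACHE_PORT", "6379")),
)

def load_profile_from_db(user_id: str) -> bytes:
    return f"profile-{user_id}".encode()  # hypothetical slow path

def get_profile(user_id: str) -> bytes:
    cached = cache.get(f"profile:{user_id}")
    if cached is None:
        cached = load_profile_from_db(user_id)
        cache.set(f"profile:{user_id}", cached, ex=300)  # cache for 5 minutes
    return cached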

I'll cover briefly what Dapr is. Dapr was created by Microsoft and donated to the CNCF in 2021. In essence, Dapr is a set of distributed system primitives exposed as APIs and deployed as a sidecar. These capabilities in Dapr are called building blocks. They are nothing more than an API with multiple implementations. For example, there is a state management building block, which is similar to the Redis API, but it can have different state store implementations. There is a Pub/Sub API similar to Kafka, but it has multiple implementations, such as ones based on Kafka, Redis, Amazon SQS, GCP Pub/Sub, and RabbitMQ. Not only that, the Pub/Sub API, for example, can have high level features that some of the messaging systems around don't offer, but Dapr implements them: for example, DLQs, filtering, and delayed message delivery. Basically, Dapr implements most of the integration bindings I have been talking about so far: the stateful orchestration patterns, which is a new Dapr API called Workflows; the asynchronous interactions I've been describing, which is the Pub/Sub API in Dapr; the synchronous interactions, which is the service invocation API in Dapr; and more. In terms of architecture, Dapr is typically deployed as a sidecar, but we are directing our work into making it available as a SaaS offering too. You consume Dapr APIs through well-defined HTTP and gRPC APIs, unaware of the backing implementation of these APIs, which can be provided by a cloud service, deployed on-prem, or be an in-memory implementation for development purposes.
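As a hedged sketch of what consuming two of these building blocks looks like over Dapr's HTTP API, here is a Python example assuming a sidecar listening on localhost:3500 and components named statestore and pubsub, both of which are assumptions of this example rather than guaranteed defaults:

# Talking to the Dapr sidecar over HTTP: the backing store and broker are
# decided by component configuration, not by the application code.
import requests

DAPR = "http://localhost:3500/v1.0"

# State management building block: save and read a key.
requests.post(f"{DAPR}/state/statestore",
              json=[{"key": "order-1", "value": {"status": "created"}}])
order = requests.get(f"{DAPR}/state/statestore/order-1").json()

# Pub/Sub building block: publish an event to a topic.
requests.post(f"{DAPR}/publish/pubsub/orders",
              json={"orderId": "order-1", "status": "created"})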

If I were to compare Dapr with Camel and EventBridge, there are many differences. In terms of coupling, Camel is the one that is cloud agnostic, but very much language specific. EventBridge, on the other hand, is specific to AWS only. Dapr can be used by multiple languages on different cloud providers. It can be co-deployed with your application as a sidecar and eventually consumed as a SaaS offering. Whether something is language and cloud specific is not only about application portability. A cloud and language agnostic framework allows portability of patterns, tools, practices, knowledge, and even developers across different projects, teams, and clouds, so it eventually becomes universal knowledge and a de facto standard.

Application-First Cloud

Lastly, let's see what those application-first cloud services are and how they might influence the apps we are building. We saw how the compute cloud took over the responsibilities of runtime management and networking, moving them from developers to ops, and even to managed cloud services. What's interesting is that the new compute services are all about individual applications. Here I have put a few from AWS, but other cloud providers have similar services. This list is growing and expanding even to the edge. The networking services are also becoming application focused. They are able to understand HTTP, gRPC, and even application protocols, and give you application-level controls. In a similar way, I see the birth of the integration cloud. The integration cloud is basically a collection of managed services that take the integration responsibilities away from developers and offer them as serverless capabilities. In addition to pure storage services, such as Postgres, MySQL, Kafka, Redis, and file storage, I see services for processing events, such as AWS EventBridge, Google Eventarc, Azure Event Grid, and all the other variations out there. There are even more services for stateful orchestration, such as Step Functions and Temporal Cloud. There are services for Pub/Sub, such as Ably, cron services, webhooks, data virtualization services, GraphQL services, and the list goes on. All of those basically represent the integration cloud.

All of these services are created for developers first and not for ops, and they are typically fully managed and serverless. In the resulting architecture, the core application logic can be bound to one or more cloud services over compute and integration APIs. Some of these APIs are based on open source projects, and some today are vendor specific. Developers are still responsible for exposing certain APIs in their application, calling the APIs of the integration cloud, and connecting their application's business logic with this cloud. Ideally, this should be done following the principles of hexagonal architecture, but looking at it from a more modern point of view, using open APIs and formats rather than the original in-process method calls and interfaces. In this architecture, compute and integration capabilities are consumed as SaaS, delegated to a trusted third-party company. The role of ops becomes more about governing, securing, and configuring these cloud services. Developers are responsible for implementing the differentiating business logic and reusing the undifferentiated integration capabilities as a service.

Here we see more specialized application-first cloud services which the application can be bound to: different serverless compute services, serverless traffic routing services, event processing services, and stateful orchestration services. In an ideal world, an application wouldn't be bound to all of them. The application would use a few provided within the same cloud region and cloud provider, and consume them over open APIs, which leaves the flexibility to move somewhere else later, if needed. In my opinion, we will see more applications running on the cloud that are not only bound to the compute layer or just the storage layer, but bound to the integration layer too. If you trust a cloud provider with your data and compute, why not trust it for the integration layer too, as long as it has standardized boundaries enabling portability of apps and developers?

Key Takeaways

Lastly, why does all of this matter, and what are the key takeaways from this talk? One of the goals of this talk was to give an overview of how applications and infrastructure evolved over the last two decades. Maybe that's an indication of the direction of change going forward. In terms of key takeaways: first, you should encapsulate an application's internal architecture, whether that's microservices, functions, a modular monolith, whatever, using open compute bindings, which is basically containers as of today. If you understand containers, their lifecycle events, resource constraints, and health checks, you can understand many compute platforms, use them quickly, and benefit from the whole ecosystem of tools and knowledge without reinventing the wheel. Second, focus on implementing the differentiating business features in your application and try to reuse the undifferentiated, repetitive distributed features over APIs, the same way we do for compute and storage today. Ultimately, this comes down to portability. It is rarely about application portability from one cloud to another. It's more about people and tools portability: portability of patterns, practices, experiences, and knowledge from one project to another, from one cloud to another, from one employer to another. We have that portability for the compute layer, and I think we need the same for the integration layer too.

 


 

Recorded at:

Oct 13, 2023
