Key Takeaways
- Think of microservices interaction in terms of commands and queries: Commands are external requests that create, update or delete data. Queries are external requests that simply read data.
- Service interaction mechanisms include Messaging, RPC and REST.
- Service orchestration gives visibility into the processes and data flow involved but it's inflexible as you have to describe every interaction.
- Service choreography works by exchanging messages between different services, which translates into decentralized service composition.
- Spring Cloud Stream framework lets the developers focus on writing messaging-centric services whereas Spring Cloud Data Flow lets them orchestrate those messaging-based microservices.
Introduction
The recent trend in application architectures is to transition from monolithic applications to a microservices model. Software development teams are trying to develop a unified architecture that will work for most of the use cases in real world applications: online transactional, batch, IoT and big data processing pipelines.
This transition without a good service interaction model will most likely result in chaos and a service landscape that's hard to govern and maintain. The teams would have hundreds of microservices communicating with each other without any governance to allow only the authorized microservices to call the other microservices.
The focus of this virtual panel article is to discuss the pros and cons of service orchestration vs. service choreography and best practices on how to implement business process services that require calling multiple microservices.
Panelists:
- Chris Richardson - developer, architect, Java Champion and author of POJOs in Action
- Daniel Bryant - Independent Tech Consultant, CTO of SpectoLabs & InfoQ editor
- Glenn Engstrand - Software Architect at Adobe Systems, Inc.
- Josh Long - Spring Developer Advocate at Pivotal and Java Champion
- Alex Silva - Principal Architect, Data Platform, Pluralsight
InfoQ: What are the different options for managing microservices interaction? Please discuss any design considerations for how microservices should communicate with each other.
Chris Richardson: One simple way to think about microservice interaction is in terms of commands and queries. Commands are external requests (i.e. from outside the application) that create, update or delete data. Queries are, as the name suggests, external requests that simply read data. The challenge in a microservice architecture is that in order for services to be loosely coupled each service has its own database (the Database per service pattern). As a result, some queries need to retrieve data from multiple services and some commands must update data in multiple services.
Queries are usually the easiest to implement. Simpler queries retrieve data from a single service but more complex queries must retrieve data from multiple services. For example, the order details might be scattered across multiple services including the order service and the delivery service. One way to implement this kind of query is through the API Composition pattern: call each service that has the data and join the results together. Another approach is the CQRS pattern, which maintains an easily queried view that pre-joins the data from multiple services. The view is kept up to date by subscribing to the domain events that are published by services when their data changes.
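As a rough illustration of the API Composition pattern described above, the following Python sketch joins data from two services in memory. The functions `fetch_order` and `fetch_delivery` and all field names are hypothetical stand-ins for remote calls; they are not taken from any real service.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for remote calls. Each service owns part of the
# order details; a real implementation would make network requests here.
def fetch_order(order_id: str) -> Dict[str, object]:
    return {"order_id": order_id, "items": ["book"], "total": 25.0}


def fetch_delivery(order_id: str) -> Dict[str, object]:
    return {"order_id": order_id, "status": "SHIPPED", "eta_days": 2}


def get_order_details(
    order_id: str,
    order_api: Callable[[str], Dict[str, object]] = fetch_order,
    delivery_api: Callable[[str], Dict[str, object]] = fetch_delivery,
) -> Dict[str, object]:
    """API composition: query each owning service, then join the results."""
    order = order_api(order_id)
    delivery = delivery_api(order_id)
    return {
        "order_id": order_id,
        "items": order["items"],
        "total": order["total"],
        "delivery_status": delivery["status"],
        "eta_days": delivery["eta_days"],
    }
```

The composer has to decide what to do when one of the calls fails or is slow, which is one reason a pre-joined CQRS view can be attractive for frequently used queries.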
Commands are more challenging. Complex commands need to update data in multiple services. For example, when the user places an order, the application must create an Order in the OrderService, and redeem a Coupon in the CouponService. Distributed transactions are not an option. Instead, an application must use the Saga pattern. A saga is a sequence of local transactions in the participating services that are coordinated using messages or events.
Daniel Bryant: The two most popular methods of communication for microservices are Remote Procedure Calls (RPC), typically via HTTP and JSON or something like gRPC, and messaging/eventing, often using something like RabbitMQ or Apache Kafka.
The key difference in interaction patterns within any kind of distributed system is in regards to coupling via time and space. For example, RPC tends to be synchronous and point-to-point. Messaging is typically asynchronous, delivered via topic queues or pub/sub, and often there are multiple consumers for a message. In general, synchronous interaction is easier to reason about, design and debug, but is less flexible, more costly to ensure fault tolerance, and more challenging to operationally scale in comparison with asynchronous interaction.
An additional coordination layer is often added on top of the communication mechanism, which deals with things like service discovery or orchestration.
Glenn Engstrand: In order to understand the options for managing microservice interaction, we should first study its history. Let’s look back to a time that is almost a decade before microservices really took off. In the early 2000s, the book Enterprise Integration Patterns was published. The corresponding web site for EIP remains an important reference for service interaction even to this day.
Workflow engines were a popular option back in the days of Service Oriented Architecture, Business Process Management, and the Enterprise Service Bus. The promise at that time was that you could create orchestration style APIs without needing to employ a fully trained engineer. They are still around but there isn’t much interest in this option for microservice interaction anymore, primarily because they could not deliver on that promise.
The most natural way to break a monolith up into microservices is to carve out the data access parts from the code base and replace them with calls to RESTful data centric APIs. This was the concept of the original API gateway but that term now refers to something else (see below). Though this has been going on for quite some time, the technology focused media started to recognize this trend as BfFs (Backends for Frontends) a couple of years ago. A slight variation with BfFs is that you can have a different orchestration service for desktop and mobile experiences.
The first article on Staged Event Driven Architecture was originally published December 2001. SEDA started gaining in popularity at about the same time as workflow engines. Since then SEDA has been eclipsed by reactive programming which has become very popular in the past few years. It is possible to write a single program in a reactive style but the most common use cases for reactive programming are distributed and involve a message broker.
BfFs were originally homegrown but the more framework oriented parts, such as authentication, request logging, and rate limiting, became a vendor category known as API Gateways.
Blockchains are most famous for their cryptocurrency capability but IBM and the Linux Foundation believe that the consensus and smart contracts parts could become a popular option for microservice interaction.
There are a lot of design considerations to take into account when formulating architectures for microservice interaction: dependency management, data integrity, eventual consistency, clustering, service definition, objects vs aspects, authentication, monitoring, and resiliency.
When you split a monolithic application up into microservices, dependency management becomes a thing. There is a growing trend in IT right now, called monorepo, which basically is a response to this issue.
In the time of the monolith with a relational database, data integrity was taken for granted because you ran every change needed to maintain data integrity in a single transaction which either failed or succeeded atomically. That is not really a viable option in the world of microservices. How can you maintain consistent state in the case of partial failure?
Reactive systems are popular now because they have a lot of advantages but they do come with a price which is eventual consistency. What users want emotionally and what is easiest for GUI developers to code is that, when you call an API that mutates state and the API returns a success status, then you can count on that change as being immediately in effect. That is not the case with eventual consistency. In my personal experience, eventual consistency is not really an issue with social applications but is a deal breaker with financial applications.
In this 12 factor world, no service is run by a single host. Rather a cluster of hosts, each running an instance of the service, sits behind a load balancer which proxies each request to a different node in the pool. Some technologies, such as memcached and Cassandra, require client side load balancing. Compound this with the ephemeral nature of the cloud, and you realize why technologies such as Kubernetes are quickly gaining in popularity.
When you have to manage over a dozen rapidly evolving APIs, you come to appreciate any mechanism that can keep the clients (usually written in different programming languages) and the server code in sync. That is the problem that service definition technologies, such as Swagger, Apache Thrift, and gRPC, are attempting to solve.
Study modern enterprise microservices written in Java and you will find a mixture of both Object Oriented Programming and Aspect Oriented Programming. Conventional wisdom dictates that AOP is best for cross cutting concerns. It is not clear yet whether or not microservice interaction is best considered as a cross cutting concern. Many technologies that facilitate microservice interaction offer both. You have to choose which one is right for your team. Right now, there is better support for OOP than AOP in most IDEs (Integrated Development Environment).
Authentication is always an important design consideration. For microservices, you will most likely want to adopt the OAuth 2 standard. Many API gateways make it easy to integrate with OAuth 2. There are a few situations where that doesn't really work such as when a web browser needs to launch a spreadsheet.
Monitoring becomes a very important design consideration in a microservice world. Application and access logs need to be aggregated and surfaced in a way that is easily searchable by both developers and user ops folks. Performance related metrics, such as per minute latency (average and percentiles), throughput, error rate, and utilization, need to be aggregated and surfaced in a way that is easily searchable by developers, Quality Engineers, and technical operations support. An alerting system needs to be in place where uncharacteristic patterns and Service Level Agreement violations in logging and metrics are detected with corresponding notifications sent to the right personnel in a way that minimizes alert fatigue.
The last design consideration is resiliency. Study the code from most junior developers and you will find lots of assumptions in the code that dependent services are always available. That is not necessarily the case and it would be best if your service did not destabilize when dependent services become degraded or unresponsive. Techniques such as rate limiting, circuit breaking, and bulkheading become relevant here.
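Of the resiliency techniques named above, circuit breaking is the easiest to show in a few lines. The following is a minimal sketch only, not the API of Hystrix or any other library; the class name, parameters and thresholds are all assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of waiting on a degraded dependency.
                raise CircuitOpenError("dependency marked unavailable")
            # Cool-down elapsed: allow a trial call (half-open). A single
            # failure from here re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

A caller would wrap each call to a dependent service, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and treat `CircuitOpenError` as a signal to fall back to a default or cached response.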
Josh Long: There are a lot of different ways for services to interact. It's useful to think about interaction patterns and what qualities you want in those interactions. What level of robustness? What levels of decoupling?
Messaging - where systems communicate through a messaging fabric like Apache Kafka or RabbitMQ - supports a higher degree of service decoupling while introducing another moving part into a system. Messaging forms the backbone of a number of other patterns like CQRS, the Saga pattern and eventual consistency. Messaging gives you locational decoupling - it doesn't matter if the consumer knows where the producer is, so long as it knows where the broker is. It gives you temporal decoupling - the consumer can process requests at whatever time or pace. A producer and a consumer are coupled by the payload (its schema) of the message. They're also, technically, constrained by the broker itself but it is a well-understood recipe to stand up highly available message brokers or fabrics that cluster so that were any node to fail the system would correctly endure.
RPC - where clients and services interact in terms of function or procedure or method invocations. In RPC, remote services are made to appear like local invocations of functions or methods on objects. There are countless options for building RPC-centric applications including gRPC, RMI, XML-RPC and RPC-style SOAP (which is different from document literal-style SOAP).
REST - where clients and services interact in terms of HTTP requests and replies, using the qualities of HTTP (like request parameters, headers, and content-negotiation) to build services. REST has some of the benefits and constraints of both messaging and RPC. Its main benefit is that it's suitable for the open web, where all languages and platforms have support for talking to HTTP services.
There are a lot of other ways for nodes running in separate processes to communicate, but I usually limit the discussion to these three options when talking about microservices as other approaches are more readily described as integration.
Microservices come in systems. It's a natural consequence of moving to the architecture: you need to address the complexity implied in building distributed systems. The complexity arises in their interactions, their connections, which may be imperiled for any number of reasons including latency, overwhelming demand, or network partitions. There is no perfect solution, only solutions that support the trade-offs that you are comfortable making. I talk about some of the trade-offs for different types of communication styles in the first question. Consider levels of reliability.
Alex Silva: A few options are:
- Transport data over HTTP, using a serialization format such as JSON, Avro or protobuf.
- Use some type of message broker software, such as RabbitMQ.
- Use a distributed log that supports data replication and streaming semantics, such as Kafka.
InfoQ: What are the pros and cons of service orchestration approach? What type of use cases are better candidates for service interaction?
Chris Richardson: Orchestration-based sagas use an orchestration object, which tells the participants what actions to perform. For example, a CreateOrderSaga object would tell the OrderService to create an order and tell the CouponService to redeem the coupon. If one of those steps fails, the CreateOrderSaga would tell the participants to execute compensating transactions to undo the work that they had already committed.
There are several benefits of this approach. The logic of the saga is centralized and so it is easy to understand. It also simplifies the participants. Another benefit is that it eliminates cyclic design-time dependencies. The orchestrator depends on the participants but not vice versa.
One potential drawback of orchestration is that there is a risk of the participants being anemic because the orchestrator implements too much logic. Care must be taken to ensure that the orchestrator is as simple as possible. Another drawback is that you need to implement the orchestrator. As I describe below, some event-based, choreography-based sagas can be very simple to implement.
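The CreateOrderSaga example can be sketched in Python as follows. This is an in-process illustration under strong assumptions: the method names (`create_order`, `approve_order`, `reject_order`, `redeem`) and the `CouponUnavailable` exception are hypothetical, and a real orchestrator would send messages to remote services and persist its own state between steps.

```python
class CouponUnavailable(Exception):
    """Hypothetical failure raised when a coupon cannot be redeemed."""


class OrderService:
    def __init__(self):
        self.orders = {}

    def create_order(self, order_id, coupon_code):
        # Local transaction 1: the order is committed in a pending state.
        self.orders[order_id] = {"coupon": coupon_code, "state": "PENDING"}

    def approve_order(self, order_id):
        self.orders[order_id]["state"] = "APPROVED"

    def reject_order(self, order_id):
        # Compensating transaction: undoes the effect of create_order.
        self.orders[order_id]["state"] = "REJECTED"


class CouponService:
    def __init__(self, valid_codes):
        self.available = set(valid_codes)

    def redeem(self, code):
        # Local transaction 2: fails if the coupon is unknown or used.
        if code not in self.available:
            raise CouponUnavailable(code)
        self.available.remove(code)


class CreateOrderSaga:
    """Orchestrator: tells each participant what to do, and what to undo."""

    def __init__(self, order_service, coupon_service):
        self.order_service = order_service
        self.coupon_service = coupon_service

    def run(self, order_id, coupon_code):
        self.order_service.create_order(order_id, coupon_code)
        try:
            self.coupon_service.redeem(coupon_code)
        except CouponUnavailable:
            self.order_service.reject_order(order_id)
            return False
        self.order_service.approve_order(order_id)
        return True
```

Note that the participants know nothing about each other or about the saga, which is the one-way dependency described above.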
Daniel Bryant: Service orchestration is generally easier to implement, and often gives better visibility into the processes and data flow involved, both at design time and runtime. Platforms that support orchestration also tend to offer cross-cutting concerns as part of the deployment fabric or framework, such as service discovery, flow control and fault tolerance.
On the flip side, orchestration can be somewhat inflexible as you have to describe every interaction, and it doesn’t always scale well when dealing with large and complex processes. Depending on the implementation, it can be the case that adding a new task into an existing process can result in having to deploy the entire system (i.e. monolithic deploys). This can be problematic if the process is continually changing and evolving.
Glenn Engstrand: To understand why these pros and cons are intrinsic to service orchestration, let’s imagine an illustrative yet typical sample service interaction. There are three services, B, C, and D. Service B needs something from service C in order to successfully finish an API call. Service C needs something from service D in order to successfully finish an API call. In the orchestration approach, there would be a fourth service, called A, that first calls D and takes the response from that call and includes it in the call to service C then takes the response from that call and includes it in the call to service B. Service B doesn’t know how to call C nor does it depend on C directly. Service C doesn’t know how to call D nor does it depend on D directly.
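The data flow of this sample interaction can be sketched with plain functions standing in for the four services. The payload shapes are invented for illustration; only the call order (A calls D, then C, then B) comes from the description above.

```python
# Each function stands in for one service's API. None of B, C or D calls
# another service; only the orchestrator A knows the order of the calls.
def service_d(request):
    return {"d": f"d-data-for-{request}"}


def service_c(request, d_result):
    return {"c": f"c-data-for-{request}", "from_d": d_result["d"]}


def service_b(request, c_result):
    return {"b": f"b-data-for-{request}", "from_c": c_result["c"]}


def service_a(request):
    """Orchestrator: threads each response into the next call."""
    d_result = service_d(request)
    c_result = service_c(request, d_result)
    return service_b(request, c_result)
```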
The biggest advantage to service orchestration is less code complexity. Of the four services described above, only service A has to concern itself with resiliency. If you want to understand how service B depends on service C or how service C depends on service D, then all you have to do is study service A. This is a concept known as clear separation of concerns.
The biggest disadvantage to service orchestration is greater release complexity. Sometimes a new feature will require changes to multiple services. Each service has its own independent release schedule. You have to coordinate the changes in such a way that services that are depended on are released prior to services that depend on them. No release can be backwards breaking. The orchestration service may have to go through multiple interim releases in order for correct behavior to occur at all times. Imagine a new feature that requires changes to all four services of our sample service interaction. The worst case would be that you would have to release D first, then A, then C, then A, then B, then A.
The most common use case for service orchestration is the BfF.
Josh Long: Orchestration refers to integrating disparate services towards a desired result. The idea is that there is a single actor that involves other services. These services may not be aware of the desired goal. The actor guards the process state. In choreography, all actors in the system are aware of the global outcome desired and play their role towards that outcome. If something goes wrong in an orchestration, the orchestrator (a business process management engine like Activiti or a saga coordinator, for example) is responsible for recovering. In choreography, individual actors are aware of what must be done to recover. Individual actors work more to be robust because there is nothing else - no other actor - that will compensate and recover.
Orchestration works well if you need to be explicit about process state because the individual actors in the process aren't aware of the encompassing process. The drawback, of course, is that you need to involve another actor which in theory introduces another moving part which may fail. In orchestration, each individual component may be blissfully focused on doing one thing, ignorant of the encompassing process or desired result. Choreography, on the other hand, works well when you have full control over the actors involved in the process and they all work towards a common goal; you don't need an extra moving part in the system and that moving part doesn't need to be so robust as to ensure that every other actor in the system works. In choreography, services may not abdicate responsibility for the encompassing process or desired result.
Alex Silva: Often times, services act together to create a distributed, asynchronous workflow, which typically follows a request/response pattern. In situations like these, an orchestrator is a natural architectural choice. Orchestrators can help with:
- Tracking and managing workflows.
- Managing the process lifecycle: pausing, resuming and restarting processes.
- Synchronously (or serially) processing all the tasks when there is a greater need for control.
- Process elasticity.
The trade offs in using service orchestrators include:
- Introduction of tight coupling between the services.
- Introduction of a single point of failure (the orchestrator itself).
- Synchronous processing can block requests, which can lead to resource starvation.
InfoQ: What are the pros and cons of service choreography approach?
Chris Richardson: In choreography-based sagas, participants simply emit domain events. Other participants receive these events and perform an update. For example, the OrderService could emit an OrderCreated event, which causes the CouponService to attempt to redeem the Coupon. The CouponService then emits an event indicating the outcome of redeeming the Coupon.
They are often remarkably simple to implement, especially when using event sourcing. The Event-Sourcing CQRS example application, which I described in my QCONSF 2014 talk, is an example of a choreography-based saga that is built using event sourcing.
However, there are a few drawbacks. First, the implementation of a saga is distributed amongst the participants. It can be difficult to understand the interactions. Second, participants are listening to each other's events and so there are cyclic dependencies, which is a design smell.
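The OrderCreated flow can be sketched with a small in-process event bus. This is only an illustration: the `EventBus` class is a synchronous stand-in for a real message broker, and the event names `CouponRedeemed` and `CouponRedemptionFailed` are assumptions, since the text only says that an outcome event is emitted.

```python
from collections import defaultdict


class EventBus:
    """Synchronous in-process stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.log = []  # event types in publication order

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.log.append(event_type)
        for handler in list(self.handlers[event_type]):
            handler(payload)


class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}
        # The order service listens for the coupon service's outcome events.
        bus.subscribe("CouponRedeemed", self.on_coupon_redeemed)
        bus.subscribe("CouponRedemptionFailed", self.on_coupon_failed)

    def create_order(self, order_id, coupon):
        self.orders[order_id] = "PENDING"
        self.bus.publish("OrderCreated", {"order_id": order_id, "coupon": coupon})

    def on_coupon_redeemed(self, event):
        self.orders[event["order_id"]] = "APPROVED"

    def on_coupon_failed(self, event):
        self.orders[event["order_id"]] = "REJECTED"


class CouponService:
    def __init__(self, bus, valid_codes):
        self.bus = bus
        self.available = set(valid_codes)
        # The coupon service in turn listens to the order service's events.
        bus.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        outcome = {"order_id": event["order_id"]}
        if event["coupon"] in self.available:
            self.available.remove(event["coupon"])
            self.bus.publish("CouponRedeemed", outcome)
        else:
            self.bus.publish("CouponRedemptionFailed", outcome)
```

There is no saga object anywhere in this sketch, and each service subscribes to the other's events, which shows both the simplicity and the cyclic dependency mentioned above.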
Daniel Bryant: The big benefit of service choreography is that it is in general more loosely coupled. However, as with asynchronous communication and design paradigms, it can be more difficult to design, understand, and operate. Additionally, a developer implementing the choreography approach often has to also build functionality to manage the process state (or data flow) and implement reliability and quality of service mechanisms along with the handling of fault tolerance. This functionality also typically has to be built into each service, as each component within a choreographed system is acting somewhat in isolation (and autonomously). This can lead to increased component size and complexity, and managing these dependencies also adds overhead to implementation and operation (i.e. upgrading libraries across all services).
Glenn Engstrand: Let’s revisit our sample service interaction again. This time using choreography instead of orchestration. There is no service A. Service B calls service C (either synchronously or asynchronously via a message bus) which, in turn, calls service D.
The biggest advantage to service choreography is fewer moving parts, one less service to support. Another big advantage, in the case of reactive programming, is stronger resiliency.
The biggest disadvantages to service choreography are cyclic dependencies and QA automation. Using our sample interaction, let’s say that a new feature is required. The developer ends up adding a call to service B from service D. Now you have an eternal loop. If the system is reactive (i.e. asynchronous messages instead of synchronous API calls), then you may not even detect the eternal loop for a while. Debugging reactive systems can be quite a challenge for junior developers.
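One way to catch such a loop before it reaches production is to keep a declared map of which service calls which and check it for cycles. The sketch below assumes such a map exists as a plain dictionary, which is a hypothetical convention rather than a feature of any particular tool; the check itself is a standard depth-first search.

```python
def find_cycle(calls):
    """Return one cycle as a list of service names, or None if acyclic.

    `calls` maps each service to the services it invokes.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in calls.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we found a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(calls):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

With the sample interaction, `find_cycle({"B": ["C"], "C": ["D"], "D": []})` reports no cycle, while adding the new call from D to B makes it report the loop through B, C and D.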
Usually QA automation is used to regress or ensure that recent changes do not break existing functionality. This is done by setting up a known data store state, calling the API, then checking the data store state to ensure that it was changed correctly. More complexity is added for reactive systems because the QA automation also has to consume the messages from the message broker to ensure that the correct messages were sent. Even if you decide that QA automation doesn’t have to validate the messages, there is still more complexity with validating the changes to the underlying datastores since consistency is now eventual.
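For the eventual consistency part, test code commonly polls for the expected state instead of asserting immediately after the API call. The helper below is a minimal sketch of that idea; the name `eventually` and its default timings are assumptions, not part of any specific test framework.

```python
import time


def eventually(condition, timeout=5.0, interval=0.05,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition held in time, False otherwise. The clock
    and sleep functions are injectable so the helper itself can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would then write something like `assert eventually(lambda: read_order_state("o-1") == "APPROVED")`, where `read_order_state` is a hypothetical query against the data store being verified.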
Alex Silva: Service choreography can be defined as the exchange of messages, interactions and agreements between two or more services, which translates into decentralized service composition.
Some of the benefits include:
- Faster processing as services can be run asynchronously.
- Adding or updating services is a much simpler operation since they can be simply plugged in or out of the event stream.
- Since control is distributed, a single orchestrator is no longer needed.
- Service choreography enables the usage of several patterns, such as event sourcing and command query responsibility segregation (CQRS), to name a few.
Some of the trade offs include:
- Asynchronous programming is a challenge and a paradigm shift for many engineers.
- Even though the architecture may look simpler, the complexity has just shifted from a single orchestrator to a flow control that is now broken up and spread across all the individual services.
InfoQ: In Service Choreography model, how can the developers ensure the communication between microservices doesn’t get out of control with pretty much any service being able to invoke any other service?
Chris Richardson: If the saga is simple, then use choreography. If it becomes too difficult to understand, then use orchestration.
Daniel Bryant: Some people may argue that this is one of the benefits of microservices i.e. the ability to compose new functionality from existing components (the holy grail of reuse!). This also shouldn’t be a problem if each microservice is cohesive, offers a loosely coupled API, and operates and evolves in a controlled manner under a Service Level Objective (SLO) and Agreement (SLA) (for example, significant functionality changes should result in a new service being created).
Having said this, I take your point about the potential challenges with every service being able to call every other (and let’s be honest, the capability provided in the majority of popular programming languages of every module being able to call every other module has led to the spaghetti code within some monoliths!). There are also regulatory reasons why you would want to isolate services, e.g. PCI-DSS.
Possible solutions to these challenges include:
- Creating multiple environments, and deploying services accordingly. Environments can be completely isolated (as we often see with the classic development, QA and production environments), or an environment can expose a small API to other environments.
- Assuming there is a network boundary in place between each service (even if only a virtual boundary), then network policies or Access Control Lists (ACLs) can be used to create segmentation. For example, Kubernetes offers Network Policies, as does Cloud Foundry, and all of the major cloud vendors offer some form of ACL, Security Group or Firewall Rules to control network access.
- Lastly, the applications themselves (or associated frameworks and libraries) can enforce some form of access control.
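The last option, access control enforced by the application itself, can be as simple as a default-deny allow-list keyed by the service being called. The service names and the shape of the policy below are purely hypothetical, and in practice the caller identity would have to come from something verifiable, such as a validated token.

```python
# Hypothetical policy: for each callee, the set of callers allowed to invoke it.
ALLOWED_CALLERS = {
    "coupon-service": {"order-service"},
    "payment-service": {"order-service", "refund-service"},
}


def is_call_allowed(caller, callee, policy=ALLOWED_CALLERS):
    """Default deny: a call is allowed only if explicitly listed."""
    return caller in policy.get(callee, set())
```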
As with any choice within software development, there is always a tradeoff, and developers should actively seek to make informed and measured decisions.
Glenn Engstrand: This assurance against getting out of control is with regards to managing complexity. As an architect, my job is to manage that complexity. Though particular technologies or tool chains can help with this important yet never ending task, ultimately the recipe for managing complexity comes down to lots of engineering discipline and skillful avoidance of burnout. These are not technological problems, they are cultural ones.
Adhere to SOLID (Single responsibility, Open / closed, Liskov substitution, Interface segregation, Dependency inversion) principles in your design reviews and code reviews. There has been a lot of online discussion recently regarding psychological safety. Perhaps that is also a good place to start.
Josh Long: Service choreography supports service composition and the same rules apply for service composition as do for object composition: lower layers don't know about higher layers.
Alex Silva: A good documentation layer is key. These can take a few different shapes; I’ll list a few:
- Understand what the potential contract-breaking changes are and communicate those across the organization. When in doubt, it is better to version your messages.
- Adopt some type of Wiki page that defines the message shape, its contract, and provide overall definition of fields for all versions supported by the service.
- Some type of self-documenting format that enforces a schema, such as Avro.
- Make data discovery part of all microservices that participate in choreography. This can be as simple as adding another endpoint to the application.
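The advice to version your messages can be sketched as a consumer that upgrades older message versions to the current one before processing, and rejects versions it does not know. The message shape, the `version` field and the `currency` default are all hypothetical; a schema-enforcing format such as Avro would handle much of this through its own schema evolution rules.

```python
CURRENT_VERSION = 2


def upgrade_v1(message):
    """Hypothetical change: version 2 added an explicit currency field."""
    body = dict(message)
    body["version"] = 2
    body["currency"] = "USD"  # assumed default for messages that predate v2
    return body


# Maps a version number to the function that upgrades it by one version.
UPGRADERS = {1: upgrade_v1}


def normalize(message):
    """Bring a message up to CURRENT_VERSION, or fail loudly."""
    version = message.get("version", 1)  # unversioned messages count as v1
    body = message
    while version < CURRENT_VERSION:
        body = UPGRADERS[version](body)
        version = body["version"]
    if version != CURRENT_VERSION:
        raise ValueError(f"unsupported message version: {version}")
    return body
```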
InfoQ: Can you talk about frameworks for service orchestration, such as Spring Integration or Spring Cloud Data Flow? Are there any similar frameworks or tools useful for service choreography?
Chris Richardson: There are a couple of issues with implementing sagas and business logic in a microservice architecture. The first is a design problem: how to write business logic that works within the constraints of the microservice architecture. I find that the Domain-Driven Design concepts of Aggregates and Domain Events are particularly useful. See this article.
The second problem is technical: how to atomically update the state of a domain object or saga orchestrator and publish a message. One solution is to use event sourcing, which I describe here. Another solution, which works well with traditional (e.g. JPA or MyBatis-based) business logic, is to use some form of transactional messaging. In my book, Microservices Patterns, I describe a framework called Tram (transactional messaging). It implements a mechanism for reliably sending and receiving messages as part of a database transaction without using 2PC/JTA.
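One widely described way to send a message as part of a database transaction without 2PC is an outbox table: the business row and the outgoing message are written in the same local transaction, and a separate relay publishes the stored messages afterwards. The sketch below illustrates that general idea with SQLite; it is not the Tram framework's API, and the table and function names are assumptions.

```python
import json
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " type TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def create_order(db, order_id):
    # `with db` commits both inserts together, or rolls both back if either
    # statement raises, so state and message can never diverge.
    with db:
        db.execute(
            "INSERT INTO orders (id, state) VALUES (?, ?)", (order_id, "CREATED")
        )
        db.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("OrderCreated", json.dumps({"order_id": order_id})),
        )


def relay_outbox(db, publish):
    """Publish unpublished outbox rows in order; return how many were sent."""
    rows = db.execute(
        "SELECT seq, type, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        # Marked only after a successful publish: delivery is at-least-once.
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Because a crash between publishing and marking a row would cause the message to be sent again, consumers in this design need to tolerate duplicates.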
Daniel Bryant: Spring Cloud Data Flow to me looks much like the modern variant of Spring Integration (which along with Apache Camel was the Java-flavoured realisation of many of Gregor Hohpe’s classic Enterprise Integration Patterns). As a Java developer, I really enjoy working with the Spring framework, as I can often become productive very quickly. The drawbacks to the Spring approach are that it is obviously Java-centric, and you have to subscribe to the Spring way of doing things. This can lead to some degree of lock-in, and also dealing with somewhat leaky abstractions as the Spring team add adapters and mediators over the top of low-level external APIs.
Other tools that I have explored include AWS Step Functions (essentially the combination of AWS Lambda and Simple Workflow Service), and also Netflix Conductor. I also keep bumping into Apache NiFi, and I know a few other teams are working on similar tooling. The orchestration and coordination of microservices is very much an area of innovation at the moment.
Glenn Engstrand: Though quite old school, Mule or Apache Camel are well suited for EIP. In the domain of API gateways, take a look at Kong, especially the http-log plugin. A great library for resiliency is the Hystrix project. As mentioned previously, one option for choreography is reactive programming which depends on a message broker. Some leading message brokers are Kafka, Kinesis, and RabbitMQ. If you don't mind being an early adopter, then follow the work being done on Hyperledger Fabric.
Every analytical technology seems to have a Kafka connector these days. I am not a fan of Spark Streaming but the on-boarding is easy. The problem comes when the time it takes to run the Spark job exceeds the window size. Other data pipeline technologies with a time tested Kafka connector include Apache Storm and Akka Streams.
Josh Long: Spring Integration is a framework for building event driven architectures. It supports the classical enterprise application integration (messaging) patterns as defined in Gregor Hohpe and Bobby Woolf's seminal Enterprise Integration Patterns. Spring Integration flows imply process state from the messages (or events) coming off of, or going into, different services. Spring Integration lets us talk about a larger process without a formalized process definition. In Spring Integration, the state is kept in the messages between services, not in the flow. If the message gets lost, the process has failed.
Spring Integration serves a similar use case as Apache Camel.
Spring Integration supports a pipes-and-filters model for distributed systems. Individual actors in a Spring Integration flow are ignorant of each other. The individual actors have pre-conditions and post-conditions and don't care about what happens upstream or downstream; as long as the incoming message looks like it's supposed to, everything works fine. Spring Integration flows are built in such a way that they can be asynchronous or synchronous with almost no code change. Messages flow from one actor to another in a canonicalized form - a Message<T>. Spring Integration supports integration of disparate services and data with a set of adapters - components that originate (by adapting an event or message from other systems into a Spring Integration Message<T>) or terminate (by adapting a Spring Integration Message<T> into an event or message for other systems) the flow of a message. Spring Integration also supports gateways, which provide request-response semantics instead of the unidirectional semantics of an adapter. There are adapters for all sorts of systems out there!
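To make the pipes-and-filters idea concrete, here is a minimal sketch in plain Python. It does not use Spring Integration or mirror its API; the Message class, the adapter functions and the filter names are all hypothetical stand-ins chosen for illustration. Each filter only sees a message in the canonical form and knows nothing about its neighbours, while the adapters at either end translate to and from the outside world.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Message:
    """Canonical envelope, loosely inspired by the idea of Message<T>."""
    payload: Any
    headers: Dict[str, Any] = field(default_factory=dict)


Filter = Callable[[Message], Message]


def inbound_adapter(raw_line: str) -> Message:
    """Originates the flow: adapts an external record into a Message."""
    return Message(payload=raw_line.strip(), headers={"source": "file"})


def to_upper(msg: Message) -> Message:
    """Filter: precondition is a string payload; unaware of other filters."""
    return Message(msg.payload.upper(), msg.headers)


def add_length(msg: Message) -> Message:
    """Filter: enriches the headers and leaves the payload untouched."""
    return Message(msg.payload, {**msg.headers, "length": len(msg.payload)})


def outbound_adapter(msg: Message) -> str:
    """Terminates the flow: adapts a Message into an external format."""
    return f"{msg.payload}|{msg.headers['length']}"


def run_flow(raw_line: str, filters: List[Filter]) -> str:
    """The 'pipe': hands each message from one filter to the next."""
    msg = inbound_adapter(raw_line)
    for apply_filter in filters:
        msg = apply_filter(msg)
    return outbound_adapter(msg)
```

Because the filters agree only on the envelope, they can be reordered, replaced or moved behind a broker without changing their code, which is the property the pipes-and-filters model is meant to provide.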
In a microservices system, however, we can probably take for granted that we're not going to connect microservices using, say, SFTP servers. We can probably take for granted that we'll use highly scalable, robust messaging technologies like RabbitMQ, Apache Kafka, Redis, or Kinesis to connect services. Spring Cloud Stream builds on Spring Integration, preserving the pipes-and-filters processing model expressed as MessageChannels and components. It makes a matter of convention and (declarative, external) configuration what would've been the inbound and outbound adapters in Spring Integration. The effect is that we write less code concerned with connecting actors, and focus on business logic.
Spring Cloud Data Flow builds upon Spring Cloud Stream. Where Spring Cloud Stream lets me focus on writing messaging-centric services, Spring Cloud Data Flow lets me orchestrate those messaging-based microservices. It features a design-time tool to describe how Spring Cloud Stream-based services should be stitched together and it can deploy the orchestration into a distribution fabric like Cloud Foundry, Kubernetes, Mesos or YARN.
Application integration has always been about connecting disparate data and services. Spring Cloud Data Flow makes this integration trivial, and lets developers focus on data processing and integration.
Alex Silva: One of the interesting things Spring Integration brings to the table is the ability to integrate with endpoints external to your application. Spring Integration uses Enterprise Integration Patterns as a blueprint for its implementation, adopting the familiar pipes and filters approach to implement service integration. Just like the rest of the Spring framework, Spring Integration offers a dynamic way to configure our service integration tiers, allowing teams to express the dependency between services in a loosely-coupled fashion, making it easy to update and extend these dependencies as the business needs grow.
Spring Cloud Data Flow is a framework aimed at bringing data integration and/or processing pipelines to microservices. Cloud Data Flow positions itself as a platform that abstracts both stream and batch processing for data; it does so by exposing a DSL-based approach that enables developers to build, expose, and integrate data pipelines using a single programming model. The library provides constructs to handle aspects such as data ingestion, data export, and even real-time analytics.
Both of these choices make a lot of sense if you are familiar with Spring and are developing services using the Spring framework.
InfoQ: Can the software development teams use both the orchestration and choreography techniques together for microservices communication in their applications? What are the pros and cons of this combo approach?
Chris Richardson: Yes. Different sagas can use the approach that is best suited. Simple sagas can use choreography; more complex sagas can use orchestration.
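As a rough illustration of the two styles, the sketch below implements the same tiny order and credit check twice in plain Python: once with a coordinator that runs steps and compensations, and once with services that only react to events on an in-memory bus. Everything here is an assumption made for the example (the order and credit domain, the event names, the EventBus class, the function names); it is not taken from any saga framework.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A saga step: (name, action, compensating action). All names are illustrative.
Step = Tuple[str, Callable[[dict], None], Callable[[dict], None]]


def create_order(order: dict) -> None:
    order["status"] = "PENDING"


def reject_order(order: dict) -> None:
    order["status"] = "REJECTED"


def reserve_credit(order: dict) -> None:
    if order["amount"] > order["credit_limit"]:
        raise ValueError("credit limit exceeded")
    order["credit_reserved"] = True


def release_credit(order: dict) -> None:
    order["credit_reserved"] = False


ORDER_SAGA: List[Step] = [
    ("create_order", create_order, reject_order),
    ("reserve_credit", reserve_credit, release_credit),
]


def run_orchestrated_saga(steps: List[Step], order: dict) -> List[str]:
    """Orchestration: one coordinator calls each step and, on failure,
    runs the compensations of the completed steps in reverse order."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action(order)
        except ValueError:
            log.append(f"{name}:failed")
            for done_name, _, compensate in reversed(completed):
                compensate(order)
                log.append(f"{done_name}:compensated")
            break
        completed.append(step)
        log.append(f"{name}:ok")
    return log


class EventBus:
    """In-memory stand-in for a message broker."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self.handlers[event_type].append(handler)

    def publish(self, event_type: str, data: dict) -> None:
        self.history.append(event_type)
        for handler in list(self.handlers[event_type]):
            handler(data)


def wire_choreography(bus: EventBus) -> None:
    """Choreography: no coordinator; each service reacts to events."""

    def credit_service(order: dict) -> None:
        if order["amount"] <= order["credit_limit"]:
            bus.publish("CreditReserved", order)
        else:
            bus.publish("CreditLimitExceeded", order)

    def approve_order(order: dict) -> None:
        order["status"] = "APPROVED"

    bus.subscribe("OrderCreated", credit_service)
    bus.subscribe("CreditReserved", approve_order)
    bus.subscribe("CreditLimitExceeded", reject_order)
```

In the orchestrated version the whole flow is readable in one place (ORDER_SAGA), whereas in the choreographed version the flow only emerges from the subscriptions. That difference is small for two steps and grows with the number of participants.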
Daniel Bryant: Yes, I believe so. The benefits are that the team can use the “right tool for the job”. The challenges include having more stuff to learn and manage, and the potential operational burden of running multiple platforms and frameworks.
Glenn Engstrand: The combo approach is actually the most common for larger companies. The pro is increased flexibility and the con is potential inconsistency. Instead of insisting that there is only one (potentially large) service to rule them all, it might make sense to have a small number of orchestration services for the different bounded contexts of a large enterprise system. This is especially appropriate if many of the separate services came from acquisition. Everyone wants to shy away from the notion that microservice definition is dependent on size. The ugly truth is that, once a microservice gets big enough, it exhibits the same disruptive release cycle as a monolith. As your orchestration service gets bigger, you will want to break it up into multiple microservices just like any other service that gets too big.
There is a pattern called CQRS which is short for Command Query Responsibility Segregation. It is usually easier to develop reactive systems for commands than for queries. That would be another way to combine orchestration with choreography.
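A minimal sketch of that split, in plain Python, might look like the following. The shopping-cart domain, the class names and the in-process subscriber list are assumptions made for illustration; in a real system the subscribers would typically sit behind a message broker. The command side validates a request, records an event and notifies subscribers, while the query side answers reads from its own projection.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ItemAdded:
    """Event emitted by the command side (hypothetical cart domain)."""
    cart_id: str
    sku: str
    qty: int


class CartCommands:
    """Command side: validates writes, records events, notifies subscribers."""

    def __init__(self) -> None:
        self.event_log: List[ItemAdded] = []
        self.subscribers: List[Callable[[ItemAdded], None]] = []

    def add_item(self, cart_id: str, sku: str, qty: int) -> None:
        if qty <= 0:
            raise ValueError("qty must be positive")
        event = ItemAdded(cart_id, sku, qty)
        self.event_log.append(event)
        # Stand-in for publishing to a broker; here delivery is in-process.
        for notify in self.subscribers:
            notify(event)


class CartSummaryView:
    """Query side: a read model built only from events."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = {}

    def apply(self, event: ItemAdded) -> None:
        self.totals[event.cart_id] = self.totals.get(event.cart_id, 0) + event.qty

    def total_items(self, cart_id: str) -> int:
        return self.totals.get(cart_id, 0)
```

Wiring the two together is a single subscription, for example `commands.subscribers.append(view.apply)`, after which queries never touch the command side's state.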
Any time you allow more than one way to accomplish a task in a system, you run the risk that developers will pick an inappropriate option for their particular use case. If different teams choose orchestration or choreography without some form of consensual guidance, then the overall system will appear to be inconsistent, which makes it harder to comprehend and reason about.
Josh Long: They absolutely can be used together. Orchestration enables us to compose services in such a way that they act as a refinery for data.
Alex Silva: Absolutely. I for one believe that a single approach is not always the best approach in software architecture. With microservices, for instance, we may have a mix of synchronous and asynchronous processing.
The pros of this approach include:
- Services are decoupled
- Asynchronous processing can be leveraged when needed
- Overall flow remains distributed
The cons include:
- Increased architectural complexity
- There is coupling between the coordinator and the services. Moreover, if the coordinator goes down, the entire system can be compromised.
InfoQ: Container technologies like Pivotal Cloud Foundry (PCF) offer container to container networking capability which helps minimize the latency incurred in invoking services remotely. Can you discuss this more and how it can help with service orchestration?
Chris Richardson: Latency is always a potential issue in a distributed architecture. If you decompose your application in a way that creates excessive latency, then perhaps you should combine services and eliminate the need for inter-process communication.
I am not familiar with PCF container-to-container communication and how that is different than the kind of container-to-container communication that is an integral part of Docker-based orchestration frameworks.
Daniel Bryant: There are many other offerings in this space too, and the topic of networking within the container space is very hot. There is some great work being done by Weaveworks, Project Calico, AWS and many others (often affiliated with the Cloud Native Computing Foundation).
In general I think this type of technology is orthogonal to the challenges of microservices interaction, and developers would be best placed to evaluate and choose their required interaction paradigm, and then map the technologies on to this (and not the other way around!).
One final caution I would like to offer is that remote procedure calls must not be treated like local (in-process) calls. I’ve seen (and made) this mistake with technologies like Java EE EJBs, where the abstraction hides from developers the realities described by the Eight Fallacies of Distributed Computing.
Josh Long: In a Pivotal Cloud Foundry installation, communication between nodes routes out through the platform router and then back in again to addressable nodes. Container-to-container networking supports bypassing the router, allowing calls from one container to another, and it supports access policy, limiting or permitting requests to a given container.
Alex Silva: Container-to-container networking can be used to build faster service-to-service interactions. These protocols allow engineers to build communication routes between containers that leverage better security and provide lower latency. Container-to-container networking usually involves some type of custom protocol that will be faster than a generic protocol, such as HTTP. The downside is that it may introduce vendor-specific and/or protocol coupling across the application stack.
InfoQ: Please add any other comments or thoughts on microservices interaction best practices.
Daniel Bryant: I would recommend that developers read about the basics of service choreography and orchestration, and also about technologies like gRPC and Kafka. One thing I have learned in my 10+ years working as a consultant is that having a broad knowledge of foundational paradigms and skills - and constantly keeping this knowledge up to date with the basics of new approaches and technologies - is vital for making informed architectural decisions.
Glenn Engstrand: Microservice interaction best practices depend on the best practices of the lower levels of the network stack. To that end, container orchestration and devops maturity are also very important.
Josh Long: There is no single approach that will work well for everybody. Google uses RPC. Netflix uses REST. Others use messaging. Consider your requirements and proceed from there.
Alex Silva: The choice of microservice interaction should always optimize these broad areas:
- Team autonomy: It pays off to have autonomous teams where coordination with external teams is not part of the development process
- Resiliency: Software systems fail; working with distributed microservices greatly exacerbates these failure scenarios. Make resiliency a design construct from the beginning of any system design.
- Maintainability: This is the other side of flexibility. Managing many codebases requires having some guidelines and tools in place to ensure devops and C.I. consistency.
- Automation: Automate all the things!
- Flexibility: Empower teams with the ability to make choices that match what’s best for their environment and use cases.
- Development speed: You can save a lot of money by empowering teams to build and deploy services as quickly as possible!
Conclusions
In this virtual panel article, we asked the experts to discuss the microservices interaction models and best practices. The most common options are service orchestration and service choreography. Both have pros and cons so you need to pick the solution based on the use case context. You can also use the techniques together in specific use cases to get the best of both options.
About the Panelists
Chris Richardson is a developer and architect. He is a Java Champion and the author of POJOs in Action, which describes how to build enterprise Java applications with frameworks such as Spring and Hibernate. Chris was also the founder of the original CloudFoundry.com. He consults with organizations to improve how they develop and deploy applications and is working on his third startup. You can find Chris on Twitter @crichardson and on Eventuate.
Daniel Bryant is leading change within organisations and technology. His current work includes enabling agility within organisations by introducing better requirement gathering and planning techniques, focusing on the relevance of architecture within agile development, and facilitating continuous integration/delivery. Daniel’s current technical expertise focuses on ‘DevOps’ tooling, cloud/container platforms and microservice implementations. He is also a leader within the London Java Community (LJC), contributes to several open source projects, writes for well-known technical websites such as InfoQ, DZone and Voxxed, and regularly presents at international conferences such as QCon, JavaOne and Devoxx.
Glenn Engstrand is Software Architect at Adobe Systems, Inc. His focus is working with engineers in order to deliver scalable, server side, 12 factor compliant application architectures. Glenn was a breakout speaker at Adobe's internal Advertising Cloud developer's conference in 2017 and at the 2012 Lucene Revolution conference in Boston. He specializes in breaking monolithic applications up into micro-services and in deep integration with Real-Time Communications infrastructure.
Josh Long is Spring Developer Advocate at Pivotal. He is a Java Champion, author of 5 books and 3 best-selling video trainings (including "Building Microservices with Spring Boot Livelessons" w/ Phil Webb), and an open-source contributor (Spring Boot, Spring Integration, Spring Cloud, Activiti and Vaadin).
Alex Silva is a chief data architect at Pluralsight, where he leads the development of the company's data infrastructure and services. He has been instrumental in establishing Pluralsight's data initiative by architecting a platform that is used to capture valuable insights on real-time video analytics while integrating several data sources within the business. Before joining Pluralsight, Alex was a principal data engineer at Rackspace, leading a team of developers responsible for building the company's data initiative.