The Potential for Using a Service Mesh for Event-Driven Messaging


Key Takeaways

  • The current popular implementations of service meshes (Istio, Linkerd, Consul Connect, etc.) only cater to the request-response style synchronous communication between microservices
  • For the advancement and adoption of service meshes, we believe that it is critical that they support event-driven or messaging-based communication
  • There are two main architectural patterns for implementing messaging support within a service mesh; the protocol proxy sidecar, which is a proxy for all the inbound and outbound events from the consumer and producer; and the HTTP bridge sidecar which translates or transforms event-driven communication protocol to HTTP or similar protocol
  • Regardless of the bridging pattern that is used, the sidecar can facilitate the implementation (and correct abstraction) of cross-functional features such as observability, throttling, and tracing.
Service Mesh

Service meshes are increasingly becoming popular as an essential technology and an architectural pattern on which to base microservices and cloud-native architecture. A service mesh is primarily a networking infrastructure component that allows you to offload the network communication logic from your microservices-based applications so that you can fully focus on the business logic of your service.

A service mesh is built around the concept of a proxy, which is colocated with the service as a sidecar. Although a service mesh is often advertised as a platform for any cloud-native application, popular implementations of service meshes (Istio/Envoy, Linkerd, etc.) currently only cater to the request/response style of synchronous communication between microservices. However, in most pragmatic microservices use cases, interservice communication takes place over a diverse set of patterns, such as request/response (HTTP, gRPC, GraphQL) and event-driven messaging (NATS, Kafka, AMQP). Since service-mesh implementations do not support event-driven communication, most of the commodity features that service meshes offer are only available for synchronous request/response services. Event-driven microservices must support those features as part of the service code itself, which contradicts the very objective of service-mesh architecture.

It is critical that a service mesh supports event-driven communication. This article looks at the key aspects of supporting event-driven architecture in a service mesh and how existing service-mesh technologies are trying to address these concerns.

Implementing event-driven messaging

In a typical request/response synchronous messaging scenario, you will find a service (server) and a consumer (client) that invokes the service. The service-mesh data plane acts as the intermediary between the client and the service. In event-driven communication, the communication pattern is drastically different. An event producer asynchronously sends the events to an event broker, with no direct communication channel between the producer and consumer. The communication style can either be pub-sub (multiple consumers) or queue-based (single consumer), and depending on the style, the producer can send messages to either a topic or queue respectively.

The consumer decides to subscribe to a topic or a queue that resides in the event broker, which is fully decoupled from the producer. When new messages are available for that topic or queue, the broker delivers them to the consumer (pushed by the broker or pulled by the consumer, depending on the protocol).
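The difference between the two communication styles can be illustrated with a toy in-memory broker. This is only a sketch: the `InMemoryBroker` class and its method names are hypothetical and do not correspond to any real broker API, and round-robin dispatch is just one possible way to pick the single consumer of a queue.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker (hypothetical): topics fan out, queues deliver to one consumer."""

    def __init__(self):
        self._topic_subscribers = defaultdict(list)  # topic name -> callbacks
        self._queue_consumers = defaultdict(deque)   # queue name -> callbacks

    def subscribe_topic(self, topic, callback):
        self._topic_subscribers[topic].append(callback)

    def subscribe_queue(self, queue, callback):
        self._queue_consumers[queue].append(callback)

    def publish_topic(self, topic, message):
        # Pub-sub: every current subscriber receives the message.
        for callback in self._topic_subscribers[topic]:
            callback(message)

    def publish_queue(self, queue, message):
        # Queue: exactly one consumer receives the message (round-robin here).
        consumers = self._queue_consumers[queue]
        if not consumers:
            return False  # no consumer yet; a real broker would buffer the message
        callback = consumers[0]
        consumers.rotate(-1)  # move the chosen consumer to the back of the line
        callback(message)
        return True
```

The producer only ever names a topic or a queue; it never holds a reference to a consumer, which is the decoupling described above.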

There are a couple of ways to use service-mesh abstraction for event-driven messaging.

Protocol-proxy sidecar

The protocol-proxy pattern is built around the concept that all the event-driven communication channels should go through the service-mesh data plane (i.e., the sidecar proxy). To support event-driven messaging protocols such as NATS, Kafka, or AMQP, you need to build a protocol handler/filter specific to the communication protocol and add that to the sidecar proxy. Figure 1 shows the typical communication pattern for event-driven messaging with a service mesh.

Figure 1: Event-driven messaging with a service mesh

As most event-driven communication protocols are implemented on top of TCP, the sidecar proxy can have protocol handlers/filters built on top of TCP to specifically handle the abstractions required to support each of the various messaging protocols.

The producer microservice (Microservice A) has to send messages to the sidecar via the underlying messaging protocol (Kafka, NATS, AMQP, etc.), using the simplest possible code for the producer client while the sidecar handles most of the complexities related to the protocol. Similarly, the logic of the consumer service (Microservice B) is also quite simple while the complexity resides at the sidecar. The abstractions provided by the service mesh may change from protocol to protocol.
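To make the idea of a TCP-level protocol filter more concrete, here is a minimal sketch of a filter that a sidecar could run over the byte stream passing through it. It assumes a hypothetical wire format in which every message is a 4-byte big-endian length followed by the payload; real protocols differ in their framing details, and the class and function names are invented for illustration. The filter only splits the stream into messages and counts them, which is the kind of hook on which metrics or tracing could be built.

```python
import struct


class FrameCountingFilter:
    """Splits a TCP byte stream into length-prefixed frames and counts them.

    Assumed (hypothetical) wire format: 4-byte big-endian length, then payload.
    """

    def __init__(self):
        self._buffer = bytearray()
        self.frames_seen = 0
        self.bytes_seen = 0

    def on_data(self, chunk):
        """Feed bytes read from the socket; return the frames completed by this chunk."""
        self._buffer.extend(chunk)
        frames = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from(">I", self._buffer, 0)
            if len(self._buffer) < 4 + length:
                break  # the frame is incomplete; wait for more bytes
            frames.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
            self.frames_seen += 1
            self.bytes_seen += length
        return frames


def encode_frame(payload):
    """Helper for the sketch: wrap a payload in the assumed length-prefixed frame."""
    return struct.pack(">I", len(payload)) + payload
```

Because TCP delivers a stream rather than discrete messages, the filter has to buffer partial frames; a proxy using it would still forward the original bytes to the broker unchanged.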

The Envoy team is currently working on implementing Kafka support for the Envoy proxy based on the above pattern. It is still a work in progress, but you can track the progress on GitHub.

HTTP-bridge sidecar

Rather than using a proxy for the event-driven messaging protocol, we can build an HTTP bridge that can translate messages to/from the required messaging protocol. One of the key motivations for building this bridging pattern is that most of the event brokers offer REST APIs (e.g., the Kafka REST API) to consume and produce messages. As shown in figure 2, the existing microservices can transparently consume the underlying event broker’s messaging system simply by controlling the sidecar that bridges the two protocols. The sidecar proxy is primarily responsible for receiving HTTP requests and translating them into Kafka/NATS/AMQP/etc. messages and vice versa.

Figure 2: The HTTP bridge allows the service to communicate with the event broker via HTTP
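The HTTP-to-message direction of such a bridge could be sketched roughly as follows. Everything here is an assumption made for illustration: the `/topics/<name>` path layout, the JSON-only bodies, the status codes, and the `RecordingProducer` stand-in, which merely records what a real broker client would send.

```python
import json


class RecordingProducer:
    """Stand-in for a real broker client; it only records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, topic, payload):
        self.sent.append((topic, payload))


def handle_http_publish(method, path, body, producer):
    """Translate one HTTP request into a broker publish.

    Returns (status_code, response_body). The /topics/<name> path layout is
    an assumption of this sketch, not a requirement of any broker.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "topics" or not parts[1]:
        return 404, b'{"error": "unknown path"}'
    if method != "POST":
        return 405, b'{"error": "use POST to publish"}'
    try:
        event = json.loads(body)
    except ValueError:
        return 400, b'{"error": "body must be JSON"}'
    # Re-serialize so the broker always receives compact, valid JSON.
    payload = json.dumps(event, separators=(",", ":")).encode("utf-8")
    producer.send(parts[1], payload)
    return 202, b'{"status": "accepted"}'
```

Returning 202 (accepted) rather than 200 reflects that the request has only been handed to the broker; whether and when a consumer processes the event is not known to the HTTP caller.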

Similarly, you can use the HTTP bridge to allow microservices based on Kafka/NATS/AMQP to communicate directly with HTTP (or other request/response messaging protocols) microservices as in figure 3. In this case, the sidecar receives Kafka/NATS/AMQP requests, forwards them as HTTP, and translates HTTP responses back to Kafka/NATS/AMQP. There are some ongoing efforts to add support for this pattern on Envoy and NATS (e.g., AMQP/HTTP Bridge and a NATS/HTTP bridge, both for Envoy).

Figure 3: The HTTP Bridge allows services based on event-driven messaging protocols to consume HTTP services

Although the HTTP-bridge pattern works for certain use cases, it is not robust enough to serve as the standard way of handling event-driven messaging in a service-mesh architecture, because bridging an event-driven messaging protocol to a request/response protocol always has limits. It is more or less a workaround.

Key capabilities of an event-driven service mesh

The capabilities of a conventional service mesh based on request/response-style messaging are somewhat different from the capabilities of a service mesh that supports messaging paradigms. Here are some of the unique capabilities a service mesh that supports event-driven messaging will offer:

  • Consumer and producer abstractions - With most messaging systems, such as Kafka, the broker itself is quite abstract and simple (a dumb pipe in a microservices context) and your services are smart endpoints (most of the smarts live in the producer or consumer code). This means that the producers or consumers must carry a lot of messaging-protocol code alongside the business logic. With the introduction of a service mesh, you can offload such commodity features related to the messaging protocol (e.g., partition rebalancing in Kafka) to the sidecar and fully focus on the business logic in your microservice code.
  • Message-delivery semantics - There are many message-delivery semantics such as "at most once", "at least once", "exactly once", etc. Depending on what the underlying messaging system supports, you can offload those tasks to the service mesh (this is analogous to supporting circuit breakers, timeouts, etc. in the request/response paradigm).
  • Subscription semantics - You can also use the service-mesh layer to handle subscription semantics, such as durable subscriptions, on behalf of the consumer-side logic.
  • Throttling - You can control and govern the message consumption limits (rate limiting) based on various parameters such as the number of messages, message size, etc.
  • Service discovery (broker, topics, and queue discovery) - The service-mesh sidecar allows you to discover the broker location, topic, or queue name during message production and consumption. This involves handling different topic hierarchies and wildcards.
  • Message validation - Validating the messages used in event-driven messaging is becoming important because most messaging protocols, such as Kafka and NATS, are agnostic to the message format. Hence, message validation is part of the consumer or producer implementation. The service mesh can provide this abstraction so that a consumer or producer can offload message validation. For example, if you use Kafka along with Avro for schema validation, you can use the sidecar to do the validation (i.e., fetch the schema from an external schema registry such as Confluent and validate the message against that schema). You can also use this to check messages for malicious content.
  • Message compression - Certain event-based messaging protocols, such as Kafka, allow the data to be compressed by the producer, written in the compressed format to the server, and decompressed by the consumer. You can easily implement such capabilities at the sidecar-proxy level and control them at the service-mesh control plane.
  • Security - You can secure the communication between the broker and consumers/producers by enabling TLS at the service-mesh sidecar level so that your producer and consumer implementations do not need to worry about secured communication and can communicate with the sidecar in plain text.
  • Observability - As all communications take place over the service-mesh data plane, you can deploy metrics, tracing, and logging out of the box for all event-driven messaging systems.
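
As one illustration of the throttling capability in the list above, a sidecar could gate message consumption with a token bucket. This is a sketch of that general technique, not of how any particular service mesh implements rate limiting; the `TokenBucket` class, its parameters, and the injectable `clock` are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter (illustrative sketch, not a mesh API)."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = float(rate_per_second)  # tokens added per second
        self.capacity = float(burst)        # maximum tokens that can accumulate
        self._clock = clock                 # injectable so tests can control time
        self._tokens = float(burst)         # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the message may pass."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never beyond the burst capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Passing `cost=1` per message limits the message rate, while passing the message size as `cost` limits throughput in bytes, which covers both of the parameters mentioned in the throttling item.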

About the Author  

Kasun Indrasiri is the director of Integration Architecture at WSO2 and is an author/evangelist on microservices architecture and enterprise-integration architecture. He wrote the books Microservices for Enterprise (Apress) and Beginning WSO2 ESB (Apress). He is an Apache committer and has worked as the product manager and an architect of WSO2 Enterprise Integrator. He has presented at the O'Reilly Software Architecture Conference, GOTO Chicago 2019, and most WSO2 conferences. He attends most of the San Francisco Bay Area microservices meetups. He founded the Silicon Valley Microservice, APIs, and Integration meetup, a vendor-neutral microservices meetup in the Bay Area. 

Community comments

  • A "Service Mesh" with (Event) "Broker" or a (Event) "Bus" is not a mesh anymore!

    by Patrice Krakow,

    A "Service Mesh" has been called a mesh because, by moving the “component” handling the cross-cutting concerns of integration from a central proxy (such as Message Oriented Middleware or an Enterprise Service Bus) to each and every service as sidecar proxies, the network topology has changed from a “star” diagram to a “mesh” diagram.

    If you use a central event broker or event bus, you are back to a “star” diagram with a network bottleneck in the middle of your topology :-(

    In my opinion, if you want to complement your Service Mesh with an asynchronous pub-sub event-driven style, you would better not lose the benefit of the mesh over a “star” along the way.

    You wrote: << The consumer decides to subscribe to a topic or a queue that resides in the event broker, which is fully decoupled from the producer. >> I would simply challenge the fact that you want to decouple the event “broker” from the producer, just keep it with the producer - but, if we do that it is correct that we would better not call it a “broker” anymore ;-)

    In the early days of HTTP, the HTTP server was also another piece of software, but then it became part of the service as a library! Let’s do the same for our Kafka!!! Today, we do not even need to transform it into a library; we can just add it as an extra container within the pod containing the service, thanks to Kubernetes.

    Then, you will have both the advantage of a “mesh” and the asynchronous pub-sub event-driven communication. That’s what I would call a Genuine Event Mesh ;-)

  • Re: A "Service Mesh" with (Event) "Broker" or a (Event) "Bus" is not a mesh

    by Miroslav Gula,

    I think the star topology of MoM or ESB is much simpler and easier to handle, set up, and maintain. With clustering, it is capable of handling thousands of messages per second, so there is really no bottleneck. The mesh is needed in some edge cases, when you need to handle tens of thousands of messages per second or more. Then the advantage of local proxies kicks in.

    I think service mesh is another modern buzzword, like microservices. I have read a lot of articles lately advocating against microservices in favor of the monolith. And I think they are right.
    As with every enterprise piece of software, at the beginning you need a good and deep analysis of the client's requirements and near-future plans, so you can choose the right tools and architecture for the job.

  • Sidecars

    by Dawid Nowak,

    In this concept, I like that the sidecar provides an abstraction layer so my application doesn't need to know which EventBus/Broker is in use. I have implemented such an approach in the SWIR platform.

    From the productivity perspective, it is awesome to see how quickly an application can be developed.

    On top of that, the sidecar takes care of ensuring that the behaviour is the same for all supported brokers, and the sidecar implementation deals with the subtle differences between Kafka, NATS, and AWS Kinesis.

  • Will I be in a Service Mesh mess for thinking it can be used for Async comm?

    by John Conors,

    I read the following from the co-creator of Service-Mesh.
    "Furthermore, many of the service mesh's features are designed to help in the case of synchronous communication like HTTP or gRPC calls between services. If your application is a monolith, or communicates purely via Kafka or another distributed queue, then the service mesh will not provide a lot of value."
