Distributed Schedulers with Microservice Architectures

by Andrew Morgan on Aug 19, 2017. Estimated reading time: 2 minutes

Martin Campbell, microservices scalability expert at DigitalOcean, talked about running a microservice-based architecture with a distributed scheduler at MicroXchg Berlin 2017. He focused primarily on the problems encountered along the way, and on the tradeoffs between offerings like Kubernetes, Nomad, and Mesos.

Some key takeaways included:

  • Distributed schedulers allow you to reason about a cluster as if it is a single physical machine.
  • They make DevOps easier, reducing much of the operational complexity that is particularly prevalent in microservice-based architectures.
  • None of the offerings out there run stateful services particularly well, so it is best not to run these types of services on one.
  • In the event of a network partition, a fault-tolerant distributed scheduler should leave all processes running even if their host nodes cannot talk to the master.

Campbell first explained that an operating system kernel is a centralized scheduler, because it manages multiple processes on a single machine. He then explained that a distributed scheduler is conceptually the same, except that it works across a cluster of machines as opposed to a single one: "You are taking your entire datacenter and treating it like it is a single physical machine."

Distributed schedulers are particularly beneficial in a microservice architecture. Campbell believes that this is because of the extra operational overhead caused by multiple services which are constantly being scaled and deployed.

Comparing various distributed schedulers, Campbell first spoke about his experience with Mesos. With it, you don’t need to care about which physical machine a process ends up on, as it handles deployment based on resource quotas such as CPU and RAM. It also provides a dashboard that makes it easy to visualize the data center as if it’s a single physical machine.
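To make the idea of placement by resource quota concrete, here is a minimal Python sketch of a first-fit placement loop. It is an illustration only and not how Mesos itself is implemented; the `Node`, `Task`, and `place` names, and the first-fit policy, are assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A machine in the cluster with some free capacity (hypothetical model)."""
    name: str
    cpu_free: float
    ram_free_mb: int
    tasks: List[str] = field(default_factory=list)


@dataclass
class Task:
    """A process to run, described only by its resource quota."""
    name: str
    cpu: float
    ram_mb: int


def place(task: Task, nodes: List[Node]) -> Optional[Node]:
    """First-fit placement: pick the first node with enough free CPU and RAM.

    Returns the chosen node, or None if no node can hold the task.
    """
    for node in nodes:
        if node.cpu_free >= task.cpu and node.ram_free_mb >= task.ram_mb:
            # Reserve the resources so later placements see the reduced capacity.
            node.cpu_free -= task.cpu
            node.ram_free_mb -= task.ram_mb
            node.tasks.append(task.name)
            return node
    return None
```

The caller only states what a task needs, for example `place(Task("web", 2.0, 2048), nodes)`, and never names a machine, which is the sense in which the cluster behaves like one large computer.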

Campbell's main issue with Mesos was how it handles network partitions. If a process can no longer communicate with the Mesos master, it’s killed. Campbell believes this is a poor design choice: because network partitions are commonplace in distributed systems, applications should keep running when one occurs. He gave Kafka as an example of where this behavior leads to data loss. Although it’s a distributed message bus designed to be resilient, a partition could lead to losing nearly every single node, and with them the data.
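The kill-on-partition behavior described above, and the keep-running alternative Campbell argues for, can be contrasted in a small Python sketch. This is a hypothetical model, not code from Mesos or any other scheduler: the `Agent` class, the `PartitionPolicy` enum, and the heartbeat timeout are all assumptions made for illustration.

```python
from enum import Enum


class PartitionPolicy(Enum):
    KILL = "kill"                  # stop local tasks when the master is unreachable
    KEEP_RUNNING = "keep_running"  # leave local tasks alone until contact resumes


class Agent:
    """A hypothetical per-node agent that tracks heartbeats from a master."""

    def __init__(self, policy: PartitionPolicy, timeout_s: float = 10.0):
        self.policy = policy
        self.timeout_s = timeout_s
        self.last_heartbeat = 0.0
        self.running = set()

    def start(self, task: str) -> None:
        self.running.add(task)

    def heartbeat(self, now: float) -> None:
        """Record that the master was reachable at time `now`."""
        self.last_heartbeat = now

    def tick(self, now: float) -> bool:
        """Check for a suspected partition and apply the policy.

        Returns True if the master is considered unreachable.
        """
        partitioned = (now - self.last_heartbeat) > self.timeout_s
        if partitioned and self.policy is PartitionPolicy.KILL:
            # Every task on this node is lost, even though the node is healthy.
            self.running.clear()
        return partitioned
```

Under the `KILL` policy a single missed heartbeat window empties `running` on every isolated node at once, which is how a resilient system such as a replicated message bus could lose most of its members in one partition.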

Ultimately, Campbell abandoned Mesos, first seeking out Nomad as an alternative. The advantage of Nomad was that it had its own gossip protocol, allowing servers to communicate both within the same data center and across data centers. In the event of a partition, services within the same partition could still function and communicate, and once the partition was resolved, they would become eventually consistent. However, because Campbell did not know of any applications running Nomad in production, he did not want to risk the migration.
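How two sides of a healed partition can converge is easier to see with a toy example. The Python sketch below is not Nomad's gossip protocol; it assumes a simplified model in which each server keeps a `{service: (version, status)}` view and, when two servers exchange views, the higher version wins for each service. The `merge` function and the data layout are hypothetical.

```python
from typing import Dict, Tuple

# A server's view of the cluster: service name -> (version, status).
View = Dict[str, Tuple[int, str]]


def merge(local: View, remote: View) -> View:
    """Combine two views, keeping the entry with the higher version per service."""
    merged = dict(local)
    for service, (version, status) in remote.items():
        if service not in merged or version > merged[service][0]:
            merged[service] = (version, status)
    return merged


# Views that drifted apart while the two sides could not talk to each other.
side_a: View = {"web": (1, "up"), "db": (3, "down")}
side_b: View = {"web": (2, "down"), "cache": (1, "up")}

# After the partition heals, each side merges in what the other side knows.
healed_a = merge(side_a, side_b)
healed_b = merge(side_b, side_a)
```

Because the merge keeps the newest entry regardless of which side it came from, `healed_a` and `healed_b` end up identical, which is the property the phrase "eventually consistent" refers to in this simplified setting.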

The tool Campbell finally chose was Kubernetes. Although it is similar to Mesos, he found a few advantages. Mainly, it handled network partitions differently and did not kill instances in the event of one. It also had a dashboard that made it easy to reason about the cluster, and fewer levels of abstraction when dealing with the applications.

The full talk is available to watch online, with more detail about the application architecture that Campbell dealt with, and more information about the various schedulers.
 
