Microservices Resiliency and Fault Tolerance Using Istio and Kubernetes

by Srini Penchikala on Jan 15, 2018. Estimated reading time: 2 minutes

Animesh Singh and Tommy Li from IBM spoke at the recent KubeCon + CloudNativeCon North America 2017 Conference about microservices resiliency and fault tolerance leveraging the Istio framework.

Microservices architectures offer highly scalable, distributed services, but they need failure management at several layers to stay resilient and fault tolerant. We also need to enforce policy decisions such as fine-grained access control and rate limits on services. A service-mesh architecture helps with these requirements by moving the common resiliency features that microservices need out of the applications and frameworks.

Singh talked about the container stack, in which Docker and Kubernetes sit at Layer 5 of the OSI model as part of the orchestration and scheduling service model. He also discussed the IBM Cloud Container Service, which can be used to manage services based on Kubernetes technology. Kubernetes is a good choice for microservices: its container orchestration capabilities, including scheduling, cluster management, and discovery, make it easy to deploy and manage microservices.

Singh and Li talked about some of the requirements to build reactive and resilient microservices:

  • Fault avoidance
  • Fault isolation
  • Fault detection
  • Recovery

A service mesh, which is basically a network for services, complements Kubernetes by providing the resiliency to the microservices without each service having to worry about it. Lightweight sidecars help by managing traffic between services.

Istio, an implementation of service mesh, can be used to deploy resilient microservices. The speakers talked about Istio concepts like Pilot, Mixer, and Proxy as well as the Control Plane and Data Plane components. Istio adds fault tolerance to your application without any changes to code. Resilience features include timeouts, retries with timeouts, circuit breakers, health checks, AZ-aware load balancing, and systematic fault injection.
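
To make one of these features concrete, here is a minimal Python sketch of the circuit-breaker pattern. It is an illustration only, not how Istio or its proxy implements the feature; with Istio this behavior lives in the sidecar and is configured rather than written into the service. The class name, parameters, and thresholds are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical, simplified circuit breaker (single-threaded sketch)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of piling load on an unhealthy service.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow a trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_reviews)`, where `fetch_reviews` is a stand-in for any function that calls a downstream service. Moving this logic into a sidecar means each service does not have to carry its own copy.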

They discussed how to make microservices resilient with Istio, using patterns such as traffic splitting and content-based traffic steering. The presentation included a demonstration of the resiliency of a sample app using Istio. The sample application includes an auto-generated control panel for manually recreating problem scenarios with Istio fault injection. It also showed simulated microservice failures and how to observe the live responses from the microservice mesh.
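
The two routing patterns can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not Istio's routing code or configuration format: the function name, the rule layout, the header name `x-user`, and the version labels are all hypothetical. Content-based steering is checked first; if no rule matches, the request falls back to a weighted split.

```python
import random


def pick_version(headers, steering_rules, weights, rng=random):
    """Return the service version that should receive a request.

    steering_rules: list of (header_name, header_value, version) tuples,
        checked in order (content-based traffic steering).
    weights: dict mapping version -> relative weight, used when no
        steering rule matches (traffic splitting).
    """
    for name, value, version in steering_rules:
        if headers.get(name) == value:
            return version
    versions = list(weights)
    # Weighted random choice among the versions.
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Hypothetical setup: testers always reach v2, everyone else gets a 90/10 split.
rules = [("x-user", "tester", "v2")]
split = {"v1": 90, "v2": 10}

print(pick_version({"x-user": "tester"}, rules, split))
print(pick_version({}, rules, split))
```

The first call always selects `v2` because the steering rule matches. The second call selects `v1` roughly nine times out of ten, which is the kind of split used for a gradual rollout.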

Singh and Li also discussed developer patterns based on Kubernetes and microservices.

If you are interested in learning more about the developer patterns on container orchestration, you can find more details on their website.
 


Failure as a Use Case by David Pitt

We've had luck using OpenShift as a service mesh architecture on top of Kubernetes. Also, we have applied fault injection using the Trouble Maker open source framework.

github.com/in-the-keyhole/khs-trouble-maker

Layer it on, baby! by Cameron Purdy

There's no problem that yet another layer of complexity won't solve ... at least until it too has a problem that demands yet another layer of complexity ...

Re: Layer it on, baby! by Daniel Bryant

Haha, I think this is a fair comment Cameron!

However, people have recently dealt with the inherent complexity of operating a (microservices-based) distributed system by either pulling up operational concerns into the app framework (à la Netflix OSS and Spring Cloud) or by pushing down development concerns into the infrastructure (à la Kubernetes and/or Envoy).

This often caused challenges with dev not being familiar with ops concepts and vice versa (and yes, I appreciate this is what approaches like DevOps aim to address, but the tooling can assist here :-) )

I believe that a service mesh layer -- with a well-defined control plane and data plane -- could be the right abstraction for managing this.

Re: Layer it on, baby! by Srini Penchikala

Thanks Cameron & Daniel for the comments on the post.

Daniel captured it very well in his reply. Capabilities like service retry, fault management, etc. are typically handled by the application developers in each of their apps. Ideally, we want these capabilities at the platform level so all apps and services can leverage them w/o having to implement one-off solutions.

Service Mesh (with Sidecar proxy) nicely abstracts all these features away from both the applications and the containers.
