Microservices Resiliency and Fault Tolerance Using Istio and Kubernetes

Animesh Singh and Tommy Li from IBM spoke at the recent KubeCon + CloudNativeCon North America 2017 Conference about microservices resiliency and fault tolerance leveraging the Istio framework.

Microservices architectures offer highly scalable, distributed services, but they need failure management at several layers to stay resilient and fault tolerant. We also need to enforce policy decisions such as fine-grained access control and rate limits on services. A service-mesh architecture helps with these requirements by extracting the common resiliency features needed by a microservice framework away from the applications and frameworks.

Singh talked about the container stack where Docker and Kubernetes are part of Layer 5 of the OSI model as part of the orchestration and scheduling service model. He also discussed the IBM Cloud Container Service, which can be used to manage services based on Kubernetes technology. Kubernetes is a good choice for microservices: its container orchestration capabilities, including scheduling, cluster management, and discovery, make it easy to deploy and manage microservices.

Singh and Li talked about some of the requirements to build reactive and resilient microservices:

  • Fault avoidance
  • Fault isolation
  • Fault detection
  • Recovery

A service mesh, which is basically a network for services, complements Kubernetes by providing the resiliency to the microservices without each service having to worry about it. Lightweight sidecars help by managing traffic between services.

Istio, an implementation of service mesh, can be used to deploy resilient microservices. The speakers talked about Istio concepts like Pilot, Mixer, and Proxy as well as the Control Plane and Data Plane components. Istio adds fault tolerance to your application without any changes to code. Resilience features include timeouts, retries with timeouts, circuit breakers, health checks, AZ-aware load balancing, and systematic fault injection.
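To make two of these features concrete, here is a minimal Python sketch of retries and a circuit breaker. It is not Istio code or configuration: in Istio this behavior is applied by the proxy on the service's behalf, and the names used here (`CircuitBreaker`, `call_with_retries`, `max_failures`, `reset_after`) are hypothetical, chosen only for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Illustrative breaker: opens after consecutive failures, then cools down."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so it can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retries(func, attempts=3, backoff=0.0, sleep=time.sleep):
    """Retry func up to `attempts` times with exponential backoff between tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open circuit is pointless
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(backoff * (2 ** attempt))
    raise last_error
```

The breaker fails fast while it is open, which is what isolates a struggling downstream service from its callers. The retry helper covers transient errors and deliberately does not retry once the circuit has opened.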

They discussed how to make microservices resilient with Istio, which can be accomplished using patterns like traffic splitting and content-based traffic steering. The presentation included a demonstration of the resiliency of a sample app using Istio. The sample application includes an auto-generated control panel for manually recreating problem scenarios using Istio fault injection. The demo also simulated failed microservices and showed how to observe the live responses from the microservice mesh.
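The routing and fault-injection ideas can be sketched in a few lines of Python. This is a conceptual illustration only, not how Istio implements them, and it is not taken from the talk's sample app. The function names (`steer`, `split`, `inject_fault`), the `x-user` header, and the version labels in the usage note are all assumptions made for the example.

```python
import random


def steer(headers, rules, default):
    """Content-based steering: the first rule whose header value matches wins."""
    for header, value, version in rules:
        if headers.get(header) == value:
            return version
    return default


def split(weights, rng=random):
    """Traffic splitting: pick a version in proportion to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def inject_fault(handler, abort_percent, status=503, rng=random):
    """Wrap a handler so a percentage of calls return an error status instead."""
    def wrapped(request):
        if rng.random() * 100 < abort_percent:
            return status, "injected fault"
        return handler(request)
    return wrapped
```

For example, `steer({"x-user": "tester"}, [("x-user", "tester", "v2")], "v1")` sends a test user to `v2` while everyone else stays on `v1`, and `split({"v1": 90, "v2": 10})` sends roughly one request in ten to `v2`. Wrapping a handler with `inject_fault(handler, 50)` makes about half of the calls fail, which is one way to rehearse the failure scenarios the demo recreated.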

Singh and Li also discussed Kubernetes- and microservices-based developer patterns.

If you are interested in learning more about the developer patterns on container orchestration, you can find more details on their website.


Community comments

  • Failure as a Use Case

    by David Pitt,

    We've had luck using OpenShift as a service mesh architecture on top of Kubernetes. Also, we have applied fault injection using the Trouble Maker open source framework.

  • Layer it on, baby!

    by Cameron Purdy,

    There's no problem that yet another layer of complexity won't solve ... at least until it too has a problem that demands yet another layer of complexity ...

  • Re: Layer it on, baby!

    by Daniel Bryant,

    Haha, I think this is a fair comment Cameron!

    However, people have recently dealt with the inherent complexity of operating a (microservices-based) distributed system by either pulling up operational concerns into the app framework (à la Netflix OSS and Spring Cloud) or by pushing down development concerns into the infrastructure (à la Kubernetes and/or Envoy).

    This often caused challenges with dev not being familiar with ops concepts and vice versa (and yes, I appreciate this is what approaches like DevOps aim to address, but the tooling can assist here :-) )

    I believe that a service mesh layer -- with a well-defined control plane and data plane -- could be the right abstraction for managing this.

  • Re: Layer it on, baby!

    by Srini Penchikala,

    Thanks Cameron & Daniel for the comments on the post.

    Daniel captured it very well in his reply. Capabilities like service retry, fault management, etc. are typically handled by the application developers in each of their apps. Ideally, we want these capabilities at the platform level so all apps and services can leverage them w/o having to implement one-off solutions.

    Service Mesh (with Sidecar proxy) nicely abstracts all these features away from both the applications as well as the containers.
