Google Delivers Comprehensive Cloud Infrastructure Reliability Guide

Google recently published a cloud infrastructure reliability guide that combines best practices and expertise from its engineers for the benefit of its customers.

The guide is intended for customers who need to make sound design decisions about the cloud infrastructure that will host their workloads. In a Google Cloud blog post, Nir Tarcic, senior staff engineer, and Kumar Dhanagopal, cross-product solutions developer at Google, explain:

The Google Cloud infrastructure reliability guide walks you through the building blocks of reliability in Google Cloud and how these building blocks affect the availability of your cloud resources. You’ll get a deeper understanding of regions, zones, and platform-level availability targets for applications deployed in a single zone, in multiple zones, or across regions.

The guide describes deployment architectures that customers can choose from to distribute resources across locations and deploy redundant resources:

  • A single-zone architecture might suffice for workloads that can tolerate downtime or for applications that enterprises can deploy quickly at another location when necessary.
  • A multi-zone architecture is suitable for workloads that need resilience against zone outages yet can tolerate some downtime caused by region outages.
  • A multi-region deployment architecture is ideal for business-critical workloads where high availability is essential, such as retail and social media applications.
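
To make the trade-off between these architectures concrete, the following is a minimal sketch of the standard availability arithmetic for redundant deployments. It assumes that locations fail independently and that the service stays up as long as at least one location is up. The function names and the 99.9% per-location figure are illustrative assumptions, not values taken from Google's guide.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def parallel_availability(per_location: float, locations: int) -> float:
    """Availability of a service that survives while any one location is up.

    Assumes failures are independent across locations, which real zones and
    regions only approximate.
    """
    if not 0.0 <= per_location <= 1.0:
        raise ValueError("per_location must be between 0 and 1")
    if locations < 1:
        raise ValueError("locations must be at least 1")
    # The service is down only when every location is down at the same time.
    return 1.0 - (1.0 - per_location) ** locations


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per (non-leap) year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    assumed = 0.999  # hypothetical per-location availability, for illustration only
    for count in (1, 2, 3):
        combined = parallel_availability(assumed, count)
        minutes = downtime_minutes_per_year(combined)
        print(f"{count} location(s): {combined:.9f} (~{minutes:.3f} min/year)")
```

Under these assumptions, each added location multiplies the probability of a full outage by the per-location failure probability, which is why redundancy across zones or regions can raise availability so sharply. Correlated failures, failover time, and shared dependencies reduce the benefit in practice.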


The guide also covers traffic- and load-management techniques such as capacity planning and autoscaling, along with change-management guidelines, to reduce reliability risk for infrastructure resources.
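
As a rough illustration of capacity planning for a multi-zone deployment, the sketch below sizes each zone so that peak load can still be served after losing one or more zones. The function name, parameters, and traffic figures are hypothetical and are not drawn from Google's guide; real planning would also account for autoscaling lag and uneven traffic distribution.

```python
import math


def instances_per_zone(peak_rps: float, rps_per_instance: float,
                       zones: int, zones_lost: int = 1) -> int:
    """Instances to run in each zone so the surviving zones cover peak load.

    All inputs are illustrative assumptions supplied by the caller.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    surviving = zones - zones_lost
    if surviving < 1:
        raise ValueError("at least one zone must survive")
    # Total instances required to serve peak traffic.
    needed = math.ceil(peak_rps / rps_per_instance)
    # Spread that total over only the zones expected to remain available.
    return math.ceil(needed / surviving)


if __name__ == "__main__":
    # Hypothetical workload: 9,000 requests/s at peak, 500 requests/s per instance.
    print(instances_per_zone(9000, 500, zones=3))                # tolerate one zone outage
    print(instances_per_zone(9000, 500, zones=3, zones_lost=0))  # no outage headroom
```

Comparing the two calls shows the cost of resilience: tolerating a zone outage means running more instances per zone than steady-state traffic alone would require.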

Other public cloud providers also offer reliability guides and resources. For example, Microsoft has a dedicated site providing an overview of products, training, and documentation on Azure reliability, and AWS offers a paper (the Reliability Pillar) as part of its Well-Architected Framework.

Richard Seroter, director of developer relations and outbound product management at Google, stated in a LinkedIn post:

There are many resilience features in a public cloud that you don't even have to think about. Some things just work better without you doing anything! But overall, systems resilience is a matter of architecture. It's intentional work on your part. This new Google Cloud guide can help you build more reliable infrastructure wherever your apps run.

Lastly, Google provides more guidance with patterns and best practices for building scalable and resilient applications.
