DevOps

Why the World Needs More Resilient Systems: Tammy Butow Discusses Chaos Engineering at QCon London

by Daniel Bryant on Mar 18, 2018

At QCon London, Tammy Butow explained why the world needs more resilient systems and how this can be achieved with the practice of chaos engineering. Three primary prerequisites for chaos engineering were provided -- high severity “SEV” incident management, monitoring, and measuring the impact -- and a series of guidelines, tools and practices were presented.

Cloud

Microsoft Introduces Azure Availability Zones, Completes MAREA Transatlantic Connection

by Kent Weare on Sep 29, 2017

In a recent blog post, Microsoft announced the expansion of High Availability (HA) and resiliency options for customers. The update comes in the form of Azure Availability Zones, which increase the availability of certain Azure services within a specific region by providing complete redundancy and isolation of the infrastructure. Azure Availability Zones include a financially backed SLA of 99.99%.

Cloud

Public Preview of Azure IaaS Disaster Recovery Announced

by Kent Weare on Aug 07, 2017

In a recent announcement, Microsoft released details about its public preview for Infrastructure-as-a-Service (IaaS) disaster recovery using Azure Site Recovery (ASR). Using the ASR service, organizations can protect IaaS workloads in one Azure region and have them replicated to a different Azure region within a geographical cluster.

Development

GitLab.com Postmortem Digs into Root Causes of 18 Hour Outage

by David Iffland on Feb 21, 2017

GitLab's postmortem into the root cause of its 18-hour site outage is a detailed look at how the incident began, how it got worse before it got better, and how the company plans to learn from the mistakes and improve the service.

Development

BitBucket Introduces Disaster Recovery and Merge Strategies

by Sergio De Simone on Sep 11, 2016

The recently released BitBucket Server and BitBucket Data Center 4.9 make it possible to define a disaster recovery strategy, set a preferred merge strategy, and more.

Too Big To Fail: Lessons Learnt from Google and HealthCare.gov

by Daniel Bryant on Jun 14, 2015

At QCon New York 2015, Nori Heikkinen shared stories of failure and lessons learnt during her time working as a site reliability engineer (SRE) at Google and HealthCare.gov. The discussion of managing large-scale outages included recommendations for preparation, response, analysis and prevention.

CenturyLink Acquires DataGardens to Offer DR as a Service

by Janakiram MSV on Dec 11, 2014

CenturyLink, one of the largest telecommunications and cloud providers, has announced the acquisition of the Canada-based disaster recovery software company DataGardens.
