Disaster Recovery Content on InfoQ

News

DevOps

Disaster Recovery Across a Million Pieces: Michelle Brush at QCon San Francisco

On the second day of QCon San Francisco 2023, Michelle Brush, an engineering director, SRE at Google, discussed challenges, patterns, and practices for disaster recovery in massively distributed systems. The session was part of the "Designing for Resilience" track.

Steef-Jan Wiggers
on Oct 04, 2023

Cloud

Google Introduces Cloud Backup and Disaster Recovery

Google recently introduced Cloud Backup and Disaster Recovery (DR), allowing customers to enable centralized backup management directly from the Google Cloud console. The new backup and recovery service is designed to work with cloud storage repositories, databases, and applications.

Steef-Jan Wiggers
on Sep 18, 2022

Development

How to Prepare for the Unexpected: an InfluxData Outage Story Told at KubeCon EU 22

Cloud applications promise high availability and accessibility to their users, but achieving that requires a disaster recovery plan. At KubeCon EU 22, the team behind InfluxDB shared the lessons they learned from battle-testing their disaster recovery strategy on the day they deleted production.

Olimpiu Pop
on May 19, 2022

Cloud

Amazon Introduces S3 Batch Replication to Replicate Existing Objects

Amazon recently introduced Batch Replication for S3, an option to replicate existing objects and synchronize buckets. The new feature is designed for use cases such as setting up disaster recovery, reducing latency, or transferring ownership of existing data.

Renato Losio
on Feb 20, 2022

Cloud

Amazon Announces Elastic File System Replication for Multi-Region Deployments

Amazon recently announced Elastic File System Replication to keep an up-to-date copy of a network file system in a second AWS region or within the same region.

Renato Losio
on Feb 05, 2022

Cloud

AWS Announced General Availability of Elastic Disaster Recovery

Recently AWS announced the general availability (GA) of AWS Elastic Disaster Recovery (AWS DRS). With this new service, organizations can minimize downtime and data loss through the fast, reliable recovery of on-premises and cloud-based applications.

Steef-Jan Wiggers
on Nov 29, 2021

Cloud

Amazon Introduces AWS Resilience Hub to Monitor and Improve RPO and RTO

Amazon recently announced the availability of AWS Resilience Hub, a service designed to help customers define, measure, and manage the resilience of their applications on the cloud.

Renato Losio
on Nov 17, 2021

Cloud

AWS Releases Amazon Route 53 Application Recovery Controller into General Availability

Recently, AWS announced the general availability (GA) of Amazon Route 53 Application Recovery Controller, a new set of capabilities in Amazon Route 53. These capabilities make it easier for customers to continuously monitor their applications’ ability to recover from failures and to control recovery across AWS Regions, Availability Zones, and on-premises infrastructure.

Steef-Jan Wiggers
on Aug 10, 2021

Cloud

Microsoft Announces the Public Preview of Disk Pool for Azure VMware Solution

Microsoft recently announced the public preview of disk pool, which enables Azure Disk Storage as a persistent storage option for Azure VMware Solution, a vSAN hyper-converged vSphere cluster. With this persistent storage option, customers have another choice for running VMware workloads on Azure.

Steef-Jan Wiggers
on Jul 20, 2021

Architecture & Design

Uber Implements Disaster Recovery for Multi-Region Kafka

In a recent blog post, Uber engineers described how they use a replication platform to implement disaster recovery at scale with a multi-region Kafka deployment. Uber has a large deployment of Apache Kafka, processing trillions of messages and multiple petabytes of data per day. The goal is to provide business resilience and continuity in the face of natural and human-made disasters.

Eran Stiller
on Jan 04, 2021

Cloud

Amazon Introduces a New Feature for ElastiCache for Redis: Global Datastore

Recently Amazon announced Global Datastore, a new feature of Amazon ElastiCache for Redis that provides fully managed, fast, reliable and secure cross-region replication.

Steef-Jan Wiggers
on Mar 28, 2020

DevOps

Summary of Chaos Community Day v4.0: Resilience, Observability, and Gamedays

Earlier in the year, the fourth edition of “Chaos Community Day” was held at Work-Bench in New York City. Key takeaways from the day included: chaos engineering draws heavily from other domains, from which software engineers can also learn; and understanding systems, and communicating and exchanging the related mental models, is vital for establishing resilience.

Daniel Bryant
on Jun 07, 2019

DevOps

Building Production-Ready Applications: Michael Kehoe Shares Lessons Learned from LinkedIn

At QCon San Francisco, Michael Kehoe presented “Building Production-Ready Applications”. Drawing on his experience with site reliability engineering (SRE), he introduced the tenets of “production-readiness” that all engineers across the organisation should focus on: stability and reliability; scalability and performance; fault tolerance and disaster recovery; monitoring; and documentation.

Daniel Bryant
on Nov 12, 2018

DevOps

Why the World Needs More Resilient Systems: Tammy Butow Discusses Chaos Engineering at QCon London

At QCon London, Tammy Butow explained why the world needs more resilient systems, and how this can be achieved with the practice of chaos engineering. She outlined three primary prerequisites for chaos engineering (high-severity “SEV” incident management, monitoring, and measuring the impact) and presented a series of guidelines, tools and practices.

Daniel Bryant
on Mar 18, 2018

Cloud

Microsoft Introduces Azure Availability Zones, Completes MAREA Transatlantic Connection

In a recent blog post, Microsoft announced the expansion of High Availability (HA) and resiliency options for customers. The update comes in the form of Azure Availability Zones, which increase the availability of certain Azure services within a specific region by providing complete redundancy and isolation of the infrastructure. Azure Availability Zones include a financially backed SLA of 99.99%.

Kent Weare
on Sep 29, 2017