Availability Content on InfoQ
-
Cloudflare Global Outage Traced to Internal Database Change
Cloudflare’s recent global outage, linked to a database update, caused widespread disruption and highlighted the risks of single-vendor reliance. While service was restored, the incident sparked discussions on the importance of multi-vendor strategies in tech. Cloudflare's CEO vowed to enhance system resilience, emphasizing that outages can impact even the largest providers.
-
AWS Simplifies Multi-Region Failover with ARC Region Switch
AWS's Amazon Application Recovery Controller Region Switch revolutionizes multi-region failover with a fully-managed, centralized solution. Simplifying disaster recovery, it automates and coordinates essential tasks across AWS services. With proactive validation and a global dashboard, it transforms complex processes into confident, push-button drills, enhancing reliability and cost efficiency.
-
Azure Event Hubs Geo-Replication Reaches General Availability
Microsoft has launched the General Availability of Geo-replication for Azure Event Hubs, enhancing data availability and redundancy. This feature allows seamless cross-region data replication, ensuring business continuity during outages. With synchronous and asynchronous options, users can choose their preferred data consistency, backed by increased health metrics for better monitoring.
-
Microsoft's Customer Managed Planned Failover Type for Azure Storage Available in Public Preview
Microsoft’s new customer-managed planned failover for Azure Storage enhances disaster recovery by enabling geo-redundancy without data loss or reconfiguration. This proactive solution supports business continuity during outages and large-scale disasters, aligning with competitive offerings from AWS and Google Cloud.
-
Azure Advisor Well-Architected Assessment in Public Preview to Optimize Cloud Infrastructure
Microsoft Azure recently announced the public preview of the Advisor Well-Architected assessment. This self-guided questionnaire aims to provide tailored, actionable recommendations to optimize Azure resources while aligning with the Azure Well-Architected Framework (WAF) principles.
-
Canva Opts for Amazon KDS over SNS+SQS to Save 85% with 25 Billion Events per Day
Canva evaluated different data messaging solutions for its Product Analytics Platform, including the combination of AWS SNS and SQS, Amazon MSK, and Amazon KDS, and eventually chose KDS, primarily because of its much lower cost. The company compared many aspects of these solutions, such as performance, maintenance effort, and cost.
-
Google Cloud Enhances Spanner with Dual-Region Configuration
Google Cloud has introduced a significant update to its fully-managed distributed SQL database service, Spanner, which now offers a dual-region configuration option. With this enhancement, the company aims to help enterprises comply with data residency requirements in countries with limited cloud support while ensuring high availability.
-
Erlang-Runtime Statically-Typed Functional Language Gleam Reaches 1.0
Gleam, an actor-based highly-concurrent functional language running on the Erlang virtual machine (BEAM), has reached version 1.0, which means it is now ready to be used in production systems with a guarantee of backward compatibility based on semantic versioning.
-
Uber Improves Resiliency of Microservices with Adaptive Load Shedding
Uber created a new load-shedding library for its microservice platform, serving over 130 million customers and handling aggregated peaks of millions of requests per second (RPS). The company replaced the QALM-based solution with the Cinnamon library, which, in addition to graceful degradation, can dynamically and continuously adjust the capacity of the service and the amount of load shedding.
-
How RevenueCat Manages Caching for Handling over 1.2 Billion Daily API Requests
RevenueCat extensively uses caching to improve the availability and performance of its product API while ensuring consistency. The company shared its techniques to deliver the platform, which can handle over 1.2 billion daily API requests. The team at RevenueCat created an open-source memcache client that provides several advanced features.
-
Slack Migrates to Cell-Based Architecture on AWS to Mitigate Gray Failures
Slack migrated most of the critical user-facing services from a monolithic to a cell-based architecture over the last 1.5 years. The move was triggered by the impact of networking outages affecting a single availability zone, causing user-impacting service degradation. The new architecture allows incrementally draining all the traffic away from the affected availability zone within 5 minutes.
-
Automated Horizontal Scaling with Amazon Aurora Limitless Database
AWS recently announced the preview of Amazon Aurora Limitless Database, a new capability supporting automated horizontal scaling to process millions of write transactions per second and manage petabytes of data in a single Aurora database.
-
Monzo Employs Targeted Traffic Shedding against Stampeding Herd Effect from the Mobile App
Monzo developed a solution for shedding traffic in case its platform comes under intense and unexpected load that could lead to an outage. Traffic spikes can be generated by the mobile app and triggered by push notifications or other bursts in user activity. The solution can reduce the read traffic by almost 50% with 90% overall accuracy without noticeable customer impact.
-
How Amazon Prime Video Delivers 99.999% Availability While Reducing Costs
Amazon Prime Video created a highly available live video streaming architecture by combining redundant components to achieve the five nines of availability required for its platform. The company optimized the deployment topology and video encoding to reduce costs while ensuring optimal video quality for users.
-
AWS Introduces Amazon Route 53 Resolver on AWS Outposts Rack
AWS recently announced that Amazon Route 53 Resolver is now available on AWS Outposts rack, providing on-premises services and applications with local Domain Name System (DNS) resolution directly from Outposts. In addition, local Route 53 Resolver endpoints enable DNS resolution between Outposts and on-premises DNS servers.