
Adaptive Availability for Quality of Service

Posted by Theo Schlossnagle on Aug 14, 2016

Theo Schlossnagle talks about lessons learned in building an always-on distributed time-series database with aggressive quality of service guarantees, and techniques for dealing with bad machines.


An Erlang-Based Philosophy for Service Reliability

Posted by Jamshid Mahdavi on Jun 03, 2016

Jamshid Mahdavi explains how WhatsApp has developed its server components and deployment processes, and how it monitors, alerts on, and repairs the inevitable failures in a service with a billion users.


A Brief History of Chain Replication

Posted by Christopher Meiklejohn on Mar 13, 2016

Christopher Meiklejohn talks through a history of chain replication, starting with the original work from 2004 by van Renesse and Schneider up to new and unique designs of chain replication.


Architecting Distributed Databases for Failure

Posted by Fangjin Yang on Feb 27, 2016

Fangjin Yang covers common problems and failures seen with distributed systems, and discusses design patterns that can be used to maintain data integrity and availability when everything goes wrong.


Logging Makes Perfect - Real-world Monitoring and Visualizations

Posted by Itamar Syn-Hershko on Feb 17, 2016

Itamar Syn-Hershko shows how various technologies (Storm, Node.js, Riemann, collectd, D3.js, ELK, PagerDuty, Slack) are used to power Forter’s service and keep it highly available and under control.


How Netflix Leverages Multiple Regions to Increase Availability: An Active-Active Case Study

Posted by Ruslan Meshenberg on Sep 21, 2014

Ruslan Meshenberg discusses the challenges Netflix faced, and the operational tools and best practices needed, in providing high availability across multiple regions.


How Facebook Scales Big Data Systems

Posted by Jeff Johnson on Aug 10, 2014

Jeff Johnson introduces Apollo, a hierarchical NoSQL data system meant to deal with Facebook's distributed storage needs.


Wix Architecture at Scale

Posted by Aviran Mordo on Aug 10, 2014

Aviran Mordo introduces Wix's architecture, a highly available, eventually consistent system, along with patterns for rendering many websites with a relatively small number of servers.


Exploiting Loopholes in CAP

Posted by Michael Nygard on Jun 11, 2014

Michael Nygard discusses several loopholes in the CAP theorem that can be used to engineer practical, real-world systems with desirable features.


High Availability at Braintree

Posted by Paul Gross on Mar 17, 2014

Paul Gross explains how Braintree deals with high availability for their Ruby application.


Summly: An Award Winning Mobile App's Journey to the Cloud with Five-9s Availability on a Shoestring Budget

Posted by Eugene Ciurana on Mar 11, 2014

Eugene Ciurana describes the architectural choices, server configuration, database, and caching systems that enabled Summly to achieve Five-9s availability with cross-continental deployments.


Architecting for High Availability

Posted by Attila Narin on May 14, 2013

Attila Narin discusses AWS concepts, including Availability Zones, RDS Multi-AZ deployments, SQS, Auto Scaling, Elastic IP, load balancing, DNS, DynamoDB, and Amazon S3, along with EC2 best practices.
