Performance & Scalability Content on InfoQ
-
How Project Cyclop Enabled GitHub to Reduce Push Failures to Nearly Zero
GitHub launched Project Cyclop several months ago to identify the causes of occasional push failures and to fix them. There turned out to be no single culprit; careful analysis identified a number of changes that improved push traffic by at least an order of magnitude, according to GitHub.
-
Microsoft Announces Azure Monitor SQL Insights for Azure SQL in Public Preview
Microsoft recently announced Azure Monitor SQL Insights for Azure SQL in public preview. With the preview, customers get a flexible canvas for telemetry collection, analysis, and rich custom visualization.
-
Dropbox Improves Sync Performance Using a Modified Brotli
After analyzing the performance of several common lossless compression algorithms, Dropbox engineers slightly modified Google's Brotli encoder to improve the performance of their sync engine. This reduced median latency and data transfer by more than 30%, according to Dropbox engineers Rishabh Jain and Daniel Reiter Horn.
-
GitHub Was Down Multiple Times Last February: Here's Why
GitHub has completed its internal investigation into the multiple service interruptions that affected its service last February for over eight hours. The root cause was a combination of unexpected database load variation and database configuration issues.
-
Network Automation at Fastly
Ryan Landry, senior director for TechOps at Fastly, has shared how network automation enables the company to manage traffic peaks during popular live-streamed events such as Super Bowl LIV. Fastly is directly connected to numerous ISPs across the US and tries to keep its live video traffic on these direct paths with its partners, so that video streams are delivered as close to the end user as possible.
-
TCMalloc, Google's Customized Memory Allocator for C and C++, Now Open Source
Google's TCMalloc can be used as a replacement for C and C++ default memory allocators to provide greater efficiency at scale and better support for parallelism, says Google.
-
Dynein – an Asynchronous Background Job Service from Airbnb
Airbnb moves time-consuming, resource-intensive tasks to asynchronous background jobs to improve scalability. The job scheduling system has become a very important component, so the company built Dynein, a distributed delayed job queueing service and scheduler. In a blog post, Andy Fang from Airbnb describes the background and the challenges in designing and building the service.
-
HAProxy EBtree: Design for a Scheduler, and Use (Almost) Everywhere
At QCon New York 2019, Andjelko Iharos presented how CTO Willy Tarreau and the HAProxy team implemented a scheduler using an EBtree data structure to optimize the performance and memory usage of the HAProxy load balancer.
-
Microsoft Introduces Azure Front Door, a Scalable Service for Protecting Web Applications
In a recent blog post, Microsoft announced the general availability (GA) of Azure Front Door (AFD), a scalable and secure entry point for web applications. The underlying technology in Azure Front Door has been in place inside Microsoft for the past five years, where it has enabled scaling and protection for many popular Microsoft services including Office 365, Xbox, and Microsoft Teams.
-
Scaling Graphite at Booking.com
Booking.com's engineering team scaled their Graphite deployment from a small cluster to one that handles millions of metrics per second. Along the way, they modified and optimized Graphite's core components: carbon-relay, carbon-cache, and the rendering API.
-
Scaling Apache Kafka at Pinterest
Pinterest uses Apache Kafka to transport data for real-time streaming applications, logging, and visibility metrics for monitoring. Hosted on AWS, Pinterest’s Kafka installation uses the MirrorMaker and DoctorKafka tools for replication and high availability.
-
The Evolution of Uber’s 100+ Petabyte Big Data Platform
Uber’s engineering team wrote about how their big data platform evolved from traditional ETL jobs with relational databases to one based on Hadoop and Spark. A scalable ingestion model, standard transfer format and a custom library for incremental updates are the key components of the platform.
-
Scaling Global Traffic at Dropbox with Edge Locations and GSLB
The Dropbox engineering team shared their experience of architecting and scaling their global network of edge locations. Located around the globe, these run a custom stack of nginx and IPVS and connect to the Dropbox backend servers over their backbone network. A combination of GeoDNS and BGP Anycast ensures availability and low latency for end users.
-
Supercharging Marketo's Campaign Engine at Reactive Summit
Marketo is marketing automation software that executes over 20 billion customer-defined actions per month. Apurva Pawar, Daniel Pugliese, Dennis Bronnikov and Pei-Chiang Ma from Marketo’s engineering team explained at Reactive Summit how they rewrote the core of their system with Akka and a reactive approach.
-
Amazon S3 Increases Request Rate Performance and Drops Randomized Prefix Requirement
Amazon Web Services (AWS) recently announced significantly increased S3 request rate performance and the ability to parallelize requests to scale to the desired throughput. Notably, this performance increase also "removes any previous guidance to randomize object prefixes" and enables the use of "logical or sequential naming patterns in S3 object naming without any performance implications".