InfoQ Homepage Performance & Scalability Content on InfoQ
-
How Uber Sped up SQL-based Data Analytics with Presto and Express Queries
Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several data sources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve its performance, Uber engineers explored the advantages of dealing with quick queries, a.k.a. express queries, in a specific way and found they could improve both Presto utilization and response times.
-
Improving the Efficiency of Goku Time-Series Database at Pinterest
Pinterest has modernized and enhanced its Goku time-series database. The recent updates focus on optimizing storage and resource usage without compromising service quality.
-
Software Architecture Tracks at QCon San Francisco 2024 – Navigating Current Challenges and Trends
At QCon San Francisco 2024, software architecture is front and center, with two tracks dedicated to exploring some of the largest and most complex architectures today. Join senior software practitioners as they provide inspiration and practical lessons for architects seeking to tackle issues at a massive scale.
-
Netflix’s Pushy: Evolution of Scalable WebSocket Platform That Handles 100Ms Concurrent Connections
Netflix shared details on the evolution of Pushy, a WebSocket messaging platform that supports push notifications and inter-device communication across many different devices for the company’s products. Netflix’s engineers implemented many improvements across the Pushy ecosystem to ensure the platform's scalability and reliability and support new capabilities.
-
Canva Opts for Amazon KDS over SNS+SQS to Save 85% with 25 Billion Events per Day
Canva evaluated different data massaging solutions for its Product Analytics Platform, including the combination of AWS SNS and SQS, MKS, and Amazon KDS, and eventually chose the latter, primarily based on its much lower costs. The company compared many aspects of these solutions, like performance, maintenance effort, and cost.
-
High HTTP Scaling with Azure Functions Flex Consumption: Q&A with Thiago Almeida and Paul Batum
Microsoft has introduced a significant enhancement to its Azure Functions platform with the Flex Consumption plan, designed to handle high HTTP scale efficiently. This new plan supports customizable per-instance concurrency, allowing users to achieve high throughput while managing costs effectively. In practical tests, Azure Functions Flex demonstrated the ability to scale from zero to 32,000 RPS.
-
Microsoft Introduces the Public Preview of Flex Consumption Plan for Azure Functions at Build
At the annual Build conference, Microsoft announced the flex consumption plan for Azure Functions, which brings users fast and large elastic scale, instance size selection, private networking, availability zones, and higher concurrency control.
-
Allegro Reduces Kafka Producer Latency Outliers by 82% after Switching to XFS
Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.
-
QCon London: Scaling Microservices Architecture and Technology Organization at Trainline
During the recent QCon London conference, Trainline’s CTO spoke about the evolution of the company’s system architecture and organizational structure over the last five years. The company had to adapt to market changes and growing customer expectations by improving the performance and reliability of its technology platform.
-
QCon London: Lessons Learned from Building LinkedIn’s AI/ML Data Platform
At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. The presenter shared the lessons learned from evolving and operating the platform, including cluster management and library versioning.
-
QCon London: How Duolingo Sent 4 Million Push Notifications in 6 Seconds During the Super Bowl Break
As part of the Super Bowl marketing campaign, Duolingo sent out 4 million mobile push notifications when the company’s five-second ad aired during the commercial break. At QCon London, Doulingo’s engineers presented the asynchronous AWS architecture responsible for broadcasting messages to millions of users across seven US cities.
-
QCon London: Meta Used Monolithic Architecture to Ship Threads in Only Five Months
Zahan Malkani talked during QCon London 2024 about Meta’s journey from identifying the opportunity in the market to shipping the Threads application only five months later. The company leveraged Instagram's existing monolithic architecture and quickly iterated to create a new text-first microblogging service in record time.
-
Expedia Speeds up Flights Search with Micro Frontends and GraphQL Optimizations
Expedia made flight search faster by up to 52% (page usable time) by applying a range of optimizations to web and mobile applications. To support these improvements, the company improved the observability of its applications. Expedia Flights web application has been migrated to Micro Frontend Architecture (MFA) to allow flexibility, reusability, and better optimization.
-
Hashnode Creates Scalable Feed Architecture on AWS with Step Functions, EventBridge and Redis
Hashnode created a scalable event-driven architecture (EDA) for composing feed data for thousands of users. The company used serverless services on AWS, including Lambda, Step Functions, EventBridge, and Redis Cache. The solution leverages Step Functions' distributed maps feature that enables high-concurrency processing.
-
Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka
Uber replaced a legacy architecture built using the WAMP protocol with a new solution that takes advantage of GraphQL subscriptions. The main drivers for creating a new architecture were challenges around reliability, scalability, observability/debugibility, as well as technical debt impeding the team’s ability to maintain the existing solution.